Hello, everyone! Your club president's one-stop guide to picking AI tools

I'm the president of the AI club at Jinyi High School International Division. I've spent years hands-on with ChatGPT and Stable Diffusion + ComfyUI, working on text writing, image/video generation control, music generation, and code automation projects, plus research in computer vision, math modeling, and environmental monitoring. Everything below comes from our own hands-on experience and is meant as a practical reference for picking tools for homework, projects, competitions, and content creation.

Last updated: Dec 30, 2025

Text & Research (writing, summarization, literature review, background research)

Models: ChatGPT-5, DeepSeek, Gemini 3, Qwen (Tongyi Qianwen), Wenxin Yiyan, Grok

ChatGPT-5 (OpenAI) | Multimodal LLM

Strong instruction following and steady reasoning; very long context with multimodal integration; Deep Research supports multi-step retrieval and cited reviews.

PRO Wide web coverage (including Google and GitHub); adjustable thinking depth; smooth PDF/image reading; integrates video (Sora), image (DALL·E 3), and code (Codex).

CON Paid (Plus $20/mo); a VPN is usually needed in Mainland China.

Official | Deep Research | Multimodal · Long context

Gemini 3 (Google) | Native Multimodal

Google's latest flagship; natively understands audio, video, images, and text; very large context window; tight integration with the Google ecosystem (Drive/Docs/YouTube).

PRO Very fast responses; in our use, the strongest option for long videos and large document sets.

CON Feature availability varies by region; some advanced reasoning features require a paid plan.

Official | Native Multimodal · 2M+ Context

Grok (xAI) | Web-connected LLM

Well-rounded across chat, coding, and reasoning; Live/Deep Search spans Web, X, News, RSS, and other sources; integrated with the X ecosystem.

PRO Efficient for tracking breaking topics and cross-checking time-sensitive sources.

CON Paid; a VPN is needed in Mainland China.

Official | Live/Deep Search | X ecosystem · Live search

DeepSeek | LLM

Solid thinking mode with stable execution and reasoning; friendly API pricing, well suited to large-batch tasks.

PRO Great value; long-form and batch generation stay cheap.

CON Multimodal is average; weaker integration with some overseas specialist sites.

Official | API Pricing | Low cost · Strong reasoning
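
To make the "well suited to large-batch tasks" point concrete, here is a minimal sketch for estimating what a batch job might cost before you run it. The per-million-token prices are placeholder numbers chosen for easy arithmetic, not DeepSeek's actual rates, and `Pricing` and `estimate_batch_cost` are hypothetical names of our own; check the API Pricing page for current figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Price per one million tokens, in USD (placeholder values, not real rates)."""
    input_per_million: float
    output_per_million: float


def estimate_batch_cost(
    num_requests: int,
    avg_input_tokens: int,
    avg_output_tokens: int,
    pricing: Pricing,
) -> float:
    """Return the estimated total cost in USD for a batch of similar requests."""
    total_input = num_requests * avg_input_tokens
    total_output = num_requests * avg_output_tokens
    cost = (
        total_input / 1_000_000 * pricing.input_per_million
        + total_output / 1_000_000 * pricing.output_per_million
    )
    return round(cost, 4)


# Hypothetical prices, chosen only to make the arithmetic easy to follow.
example_pricing = Pricing(input_per_million=0.50, output_per_million=2.00)

# 1,000 essays, each with about 2,000 prompt tokens and 1,000 generated tokens.
cost = estimate_batch_cost(1_000, 2_000, 1_000, example_pricing)
print(f"Estimated batch cost: ${cost:.2f}")
```

Swapping in the real prices for whichever model you use gives a quick way to compare providers for the same workload.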

Qwen (Tongyi Qianwen) | Open/Closed LLM

From Alibaba; the open-weight release (2.5) performs strongly; math and coding are top-tier among Chinese models; deep understanding of Chinese context.

PRO Open weights can be deployed locally; good API value; strong at STEM problem solving.

CON Multimodal capabilities are slightly behind GPT-5/Gemini.

Official | Hugging Face | Math & code · Top Chinese-language model

Wenxin Yiyan (Baidu) | Chinese LLM

Chinese-centric corpus and strong localization; reliable for Chinese writing and traditional culture material.

PRO Fluent Chinese expression; localized source material.

CON Cross-border material and multimodal depth should be evaluated per project.

Official | Chinese corpus · Localized

Coding & Automation (writing code, refactoring, agent execution)

Tools/models: Cursor (IDE), Cline (open-source plugin), ChatGPT-5 / o3 / DeepSeek (coding in the chat box), Codex

Codex (OpenAI) | Code Model

An early classic code model, now folded into the coding capabilities of current ChatGPT; follows prompts well, with a low error rate.

PRO Handy for rapid prototyping and algorithm drafts.

CON Requires a ChatGPT subscription.

ChatGPT Plus | Code gen · API

Cline | IDE Agent

Open-source, BYOK (bring your own API key); with your permission it runs commands, edits files, opens a browser, and debugs step by step; includes Plan Mode and a context-usage display.

PRO Strong for multi-step, controlled automation.

CON Setup required; capability depends on the API model you choose.

Official | GitHub | BYOK · Controlled execution

Coding in the chat box (ChatGPT-5 / o3 / DeepSeek) | LLM

Best for function-, script-, and small-utility-level tasks; more reliable when paired with unit tests and a minimal reproduction.

ChatGPT | DeepSeek | Lightweight · Small scripts
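
As an illustration of pairing chat-written code with unit tests, here is a minimal sketch. The function `moving_average` stands in for any small utility a chat model might produce; the name, behavior, and test cases are our own example, not output from any particular tool.

```python
def moving_average(values: list[float], window: int) -> list[float]:
    """Return the simple moving average of `values` over a sliding window."""
    if window <= 0:
        raise ValueError("window must be a positive integer")
    if window > len(values):
        return []
    return [
        sum(values[i : i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def test_basic_window() -> None:
    assert moving_average([1, 2, 3, 4], 2) == [1.5, 2.5, 3.5]


def test_window_longer_than_input() -> None:
    # Edge case: not enough data points for even one full window.
    assert moving_average([1, 2], 5) == []


def test_invalid_window() -> None:
    # Minimal reproduction of a bad call: a zero-width window must be rejected.
    try:
        moving_average([1, 2, 3], 0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for window=0")


if __name__ == "__main__":
    test_basic_window()
    test_window_longer_than_input()
    test_invalid_window()
    print("all tests passed")
```

When a test fails, pasting the failing test and its error message back into the chat gives the model a minimal reproduction to work from, which tends to be more effective than describing the bug in words.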

Cursor | AI-Native IDE

Repo-level context, conversational refactoring, automatic diff/apply, test completion, and step-by-step planning and execution (Agent); supports rule-based bulk edits, which suits project-level refactors and migrations.

Official | Features | Repo-scale · Refactor agent

Image Generation & Editing (illustration, concept art, retouching)

Models: Nano Banana Pro (Gemini 3 Image), Stable Diffusion + ComfyUI, ChatGPT-DALL·E, Midjourney, Doubao

Nano Banana Pro (Gemini 3 Image) | High-Fidelity Image Gen

Designed for complex instructions; supports precise object layout, text rendering, and very high-resolution output; in our experience it follows prompts better than traditional diffusion models.

PRO High text-rendering accuracy; supports image editing and inpainting; very fast.

CON Less artistic stylization than Midjourney; leans toward realism and precision.

AI Studio | Precision control · Text rendering

Stable Diffusion + ComfyUI | Open-source Image Gen

For local deployment we recommend a discrete GPU at RTX 4060 level or above; the node-based workflow ecosystem is huge, and txt2img, retouching, frame-by-frame video, and expression/motion extraction can all be assembled from nodes, which suits style customization and batch production.

Stable Diffusion | ComfyUI | Node-based · Highly controllable

ChatGPT - DALL·E 3 | Image Gen

Strong control and high single-image quality; iterate on revisions within the chat; moderation is strict, which suits formal projects.

DALL·E 3 | In ChatGPT | Chat-based editing · High quality

Midjourney | Image Gen

Strong stylistic aesthetics and plenty of community material; quick route to a "premium look".

Official | Stylized · Large community

Doubao | Chinese Image Gen

Friendly to Chinese prompts and easy to pick up; keeps costs under control for everyday illustrations and poster backgrounds.

Official | Chinese-friendly · Low barrier

Video Generation (short videos, motion transfer, frame-by-frame pipelines)

Tools: Sora, Runway, Pika, Hailuo, Kling, Stable Diffusion + ComfyUI (frame-by-frame/interpolation/optical flow), Viggle (motion transfer)

Hailuo / Kling | Video Gen

Prompt control is intuitive; well suited to turning written copy into video quickly.

Hailuo | Showcase | Kling (Kuaishou) | Prompt workflow · Fast output

Stable Diffusion + ComfyUI (frame-by-frame/interpolation/optical flow) | Video Pipeline

Needs a local machine or server; the "frame-by-frame → interpolation → optical-flow stabilization" chain keeps the style consistent and smooths the motion.

ComfyUI | Frame-by-frame · Interpolation · Optical flow
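
The interpolation step in that chain can be illustrated with the simplest possible method, a linear cross-fade between two keyframes. This is only a toy sketch under our own assumptions: frames are tiny grayscale grids stored as nested lists, `blend` and `interpolate_frames` are hypothetical helpers, and it is not how ComfyUI nodes work internally. Motion-aware methods such as optical flow exist precisely because plain blending produces ghosting when objects move.

```python
Frame = list[list[float]]  # grayscale image: rows of pixel intensities in [0, 1]


def blend(a: Frame, b: Frame, t: float) -> Frame:
    """Linearly blend two same-sized frames; t=0 gives a, t=1 gives b."""
    return [
        [(1 - t) * pa + t * pb for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


def interpolate_frames(keyframes: list[Frame], inserts: int) -> list[Frame]:
    """Insert `inserts` blended frames between each pair of consecutive keyframes."""
    if inserts < 0:
        raise ValueError("inserts must be non-negative")
    if not keyframes:
        return []
    result: list[Frame] = []
    for current, following in zip(keyframes, keyframes[1:]):
        result.append(current)
        for k in range(1, inserts + 1):
            result.append(blend(current, following, k / (inserts + 1)))
    result.append(keyframes[-1])
    return result


dark = [[0.0, 0.0], [0.0, 0.0]]
bright = [[1.0, 1.0], [1.0, 1.0]]
frames = interpolate_frames([dark, bright], inserts=3)
print(len(frames))      # 2 keyframes plus 3 in-between frames
print(frames[2][0][0])  # the middle frame is an even mix of dark and bright
```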

Viggle | Motion Transfer

Fast performance and pose transfer; lots of ways to drive characters.

Official | Character-driven · Fast transfer

Pika | Video Gen

Plenty of creative motion effects; handy for social-media shorts.

Official | Creative FX · Social

Runway | A/V Suite

Generation and editing in one place; suits team pipelines.

Official | Generation + editing · Unified

Sora | Video Gen

Sora 2 is more realistic in physics, audio sync, and lip-sync, and is gradually rolling out as an app; defer to the official terms for policy and compliance.

Official | Long shots · Physical consistency

AI Search (cited retrieval, time-sensitive Q&A, deep research)

Tools: Perplexity, Grok Search, DeepSeek Search, ChatGPT Deep Research · Workflows: quiz-style Q&A; research-style multi-step retrieval → evidence cross-checking → cited summary

Perplexity | AI Search

Real-time answers with citations; reliable for getting the overview and tracing sources.

Official | Cited · Real-time

Grok Search (Live/Deep) | AI Search

Connects Web/X/News/RSS to track time-sensitive topics across multiple sources.

Docs | Multi-source · Recency

DeepSeek Search | AI Search

Works together with DeepSeek models; Chinese-language search and low-cost long-chain queries both work well.

Official | Chinese search · Low cost

ChatGPT Deep Research | Research Agent

Automates multi-step retrieval, evidence cross-checking, and cited reports; well suited to systematic reviews.

Intro | Multi-step · Report

Music Generation (BGM, song demos)

Suno | Music Gen

Text to melody and lyrics in one step; efficient for BGM and event music. Mind the copyright and commercial-use terms.

Official | Multi-style · Fast output

Slides & Assets (PPT/mind maps/speech scripts + posters/promo material)

ChatGPT Agent (PPT/mind-map workflow) | Workflow Agent

Topic → outline → charts/assets → layout → multi-format export; can be linked with the text, search, and image-generation tools above.

GPTs | Auto outline · Multi-format export

Canva / Gamma / WPS AI | Layout Tools

Image models (SD/DALL·E/MJ) supply the pictures; the layout tools handle templates and multi-size export.

Canva | Gamma | WPS AI | Template library · Multi-size

Contact Me

Email: gmyls90@gmail.com