Hi there! Your club president's one-stop guide to picking AI tools

I'm the president of the AI club at Jinyi High School International Division. I've spent years working hands-on with ChatGPT and Stable Diffusion + ComfyUI on projects in text writing, controlled image/video generation, music generation, and code automation, and I've also done research in computer vision, mathematical modeling, and environmental monitoring. Everything below comes from our own hands-on experience and is meant as a practical selection guide for homework, projects, competitions, and content creation.

Last updated: Oct 9, 2025

Text & Research (writing, summarization, literature review, background research)

Models: ChatGPT-5 (with Deep Research, multimodal integration), DeepSeek, Gemini, Wenxin Yiyan, Grok (with Live/Deep Search, multimodal integration)

ChatGPT-5 (OpenAI) Multimodal LLM

Strong instruction following and stable reasoning; very long context and multimodal integration; Deep Research supports multi-step retrieval and produces reviews with citations.

PRO Wide web coverage (including Google and GitHub); adjustable thinking depth; smooth PDF and image reading; integrates video (Sora), image (DALL·E 3), and code (Codex).
CON Paid (Plus is $20/month); usually requires a VPN in mainland China.

Grok (xAI) Web-connected LLM

Well-rounded in chat, coding, and reasoning; Live/Deep Search draws on multiple sources, including Web, X, News, and RSS; integrated with the X ecosystem.

PRO Efficient for tracking trending topics and cross-checking time-sensitive sources.
CON Paid; typically requires a VPN in mainland China.

DeepSeek LLM

Solid thinking mode with stable execution and reasoning; budget-friendly API pricing, suited to large batch jobs.

PRO Great value; long-form and batch generation stay cheap.
CON Average multimodal ability; weaker linkage to some overseas specialist sites.
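
To gauge whether a large batch job fits your budget, a rough estimate can be sketched as below. This is only a minimal sketch: the per-million-token prices and the token counts are placeholder values invented for illustration, not DeepSeek's actual pricing, so replace them with the figures from the provider's current price list.

```python
def estimate_batch_cost(num_requests, input_tokens, output_tokens,
                        input_price_per_m, output_price_per_m):
    """Return the estimated cost of a batch job.

    Token counts are per request; prices are per one million tokens,
    in whatever currency the price list uses.
    """
    total_input = num_requests * input_tokens
    total_output = num_requests * output_tokens
    return (total_input * input_price_per_m
            + total_output * output_price_per_m) / 1_000_000


# Placeholder numbers for illustration only (NOT real pricing).
cost = estimate_batch_cost(
    num_requests=1000,
    input_tokens=2000,
    output_tokens=500,
    input_price_per_m=1.0,
    output_price_per_m=2.0,
)
print(f"Estimated cost: {cost:.2f}")
```

Running the same function with each provider's numbers gives a like-for-like comparison before you commit to a batch.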

Gemini (Google) Multimodal LLM

Unified multimodal model; works smoothly with the Google ecosystem (Drive/Docs/Sheets/YouTube/Maps/Photos).

PRO Smooth cross-app workflows.
CON Feature availability depends on region and plan.

Wenxin Yiyan (Baidu) Chinese LLM

Chinese-centric corpus and strong localization; reliable for Chinese writing and traditional-culture material.

PRO Fluent Chinese expression; localized source material.
CON Cross-border material and multimodal depth should be evaluated per project.

Coding & Automation (writing code, refactoring, agent execution)

Tools/Models: Cursor (IDE), Cline (open-source plugin agent), ChatGPT-5 / o3 / DeepSeek (coding in the chat box), Codex

Codex (OpenAI) Code Model

An early classic code model, now folded into the coding capabilities of current ChatGPT; follows prompts well and has a low error rate.

PRO Convenient for rapid prototyping and algorithm drafts.
CON Requires a ChatGPT subscription.

Cline IDE Agent

Open-source and BYOK (bring your own API key); with your permission it runs commands, edits files, opens a browser, and debugs step by step; offers Plan Mode and a context-usage display.

PRO Strong at multi-step, controlled automation.
CON Requires setup; capability depends on the API model you choose.

Coding in the chat box (ChatGPT-5 / o3 / DeepSeek) LLM

Best for tasks at the level of a function, script, or small utility; results are more reliable when paired with unit tests and a minimal reproduction.
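
As one way to apply this advice, the sketch below pairs a small function of the kind a chat model might draft with a few unit tests. The function `word_count` and its test cases are hypothetical examples, not taken from any particular project.

```python
import unittest


def word_count(text):
    """Count whitespace-separated words in a string."""
    return len(text.split())


class WordCountTest(unittest.TestCase):
    def test_simple_sentence(self):
        self.assertEqual(word_count("AI tools save time"), 4)

    def test_extra_whitespace(self):
        self.assertEqual(word_count("  hello   world  "), 2)

    def test_empty_string(self):
        self.assertEqual(word_count(""), 0)


# Run the tests programmatically so the script keeps going afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(WordCountTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("All tests passed:", result.wasSuccessful())
```

If a test fails, pasting the failing case back into the chat as a minimal reproduction usually gets a more targeted fix than describing the bug in words.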

Cursor AI-Native IDE

Repo-level context, conversational refactoring, automatic diff/apply, test completion, and stepwise planning and execution (Agent); supports rule-based bulk edits, suited to project-level refactoring and migration.

Image Generation & Editing (illustration, concept art, restoration)

Models: Stable Diffusion (local/cloud, with LoRA/ControlNet) + ComfyUI (node-based), ChatGPT-DALL·E, Midjourney, Doubao

Stable Diffusion + ComfyUI Open-source Image Gen

For local deployment, a discrete GPU at RTX 4060 level or above is recommended. The node-based workflow ecosystem is huge: text-to-image, image editing, frame-by-frame video, and expression and motion extraction can all be assembled from nodes, which suits style customization and batch production.

ChatGPT - DALL·E 3 Image Gen

Strong control, high single-image quality, and iterative revision within the chat; moderation is strict, which suits formal projects.

Midjourney Image Gen

Strong stylistic aesthetics and plenty of community material; quickly produces a "premium" look.

Doubao Chinese Image Gen

Friendly to Chinese prompts and easy to pick up; keeps costs manageable for everyday illustrations and poster backgrounds.

Video Generation (short videos, motion transfer, frame-by-frame pipelines)

Tools: Sora, Runway, Pika, Hailuo, Kling, Stable Diffusion + ComfyUI (frame-by-frame/frame interpolation/optical flow), Viggle (motion transfer)

Hailuo / Kling Video Gen

Prompt control is intuitive; suited to quickly turning copy into video.

Stable Diffusion + ComfyUI (frame-by-frame/frame interpolation/optical flow) Video Pipeline

Requires a local machine or server; a "frame-by-frame → frame interpolation → optical-flow stabilization" pipeline unifies style and smooths motion.
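
The interpolation step of such a pipeline can be illustrated with a deliberately simple stand-in. The sketch below inserts linearly blended frames between each pair of keyframes. This is plain cross-fading, not optical-flow interpolation, and the tiny grayscale "frames" (lists of pixel values) and the function names are hypothetical; a real pipeline would use a dedicated interpolation model for this step.

```python
def blend(frame_a, frame_b, t):
    """Linearly blend two frames; t=0 gives frame_a, t=1 gives frame_b."""
    return [(1 - t) * a + t * b for a, b in zip(frame_a, frame_b)]


def interpolate(frames, inserts=1):
    """Insert `inserts` blended frames between each consecutive pair."""
    if not frames:
        return []
    result = []
    for current, nxt in zip(frames, frames[1:]):
        result.append(current)
        for i in range(1, inserts + 1):
            result.append(blend(current, nxt, i / (inserts + 1)))
    result.append(frames[-1])
    return result


# Two 3-pixel grayscale keyframes (hypothetical data).
keyframes = [[0.0, 0.0, 0.0], [100.0, 50.0, 200.0]]
smooth = interpolate(keyframes, inserts=1)
print(smooth)
```

Cross-fading like this causes ghosting on fast motion, which is why the pipeline adds optical flow: it estimates where pixels move between frames instead of only averaging their values.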

Viggle Motion Transfer

Fast performance and pose transfer; offers many character-driven possibilities.

Pika Video Gen

Lots of creative motion effects; handy for social-media shorts.

Runway A/V Suite

Generation and editing in one; suited to team pipelines.

Sora Video Gen

Sora 2 is more realistic in physics, audio sync, and lip-sync, and is gradually rolling out as an app; refer to the official terms for policy, compliance, and availability.

Music Generation (BGM, song demos)

Suno Music Gen

Generates melody and lyrics together from text; efficient for BGM and event music. Mind the copyright and commercial-use terms.

Slides & Assets (PPT/mind maps/speech scripts + posters/promotional material)

ChatGPT Agent (PPT/mind-map workflow) Workflow Agent

Topic → outline → charts/assets → layout → multi-format export; can be connected to text, search, and image generation.

Canva / Gamma / WPS AI Layout Tools

Image models (SD/DALL·E/MJ) supply the images; layout tools handle templates and multi-size export.

Contact Me

Email: gmyls90@gmail.com