Translation: AI Agents

Published June 23, 2025

Original: https://leerob.io/blog/ai-agents
Author: Lee Robinson
Translator: Gemini 2.5 Pro

I’m using agents to write a lot of code now.

Just 6 months ago they didn’t really work well, so if you haven’t tried in a while, this is your sign. Let me explain practically how I’ve been using them, where they still kind of suck, and what tools I’ve had success with.

I’m using a combination of Cursor, Claude Code, and v0 for a variety of different tasks. You might first ask: wait, why three tools?

The reality of AI models today (and AI products) is that it’s unlikely a single tool will be able to reliably handle everything. At least that’s what I’ve seen; your mileage may vary.

Let’s talk through each tool.

Cursor (Primary IDE)

I’ve been using Cursor as my primary IDE for about 4 months. Before that, I spent 6 months¹ flipping between Zed and Neovim (after using VS Code for a long time). Cursor is really good.

For me, Cursor brings the familiarity of VS Code with the best AI interface for general programming (reading files, quick edits, tab completion). It sounds silly, but the built-in git diff of VS Code is incredible, and I prefer it over many other tools.

I’ve only started to briefly experiment with background agents, because at about the same time I started trying out Claude Code. Cursor seems to keep getting better on every release, so I’m going to stick with it.

Claude Code (Agentic Loops)

Claude Code is the first CLI agent I’ve been extensively testing. There are others in the space (including OSS versions) which likely have similar properties, so it’s too soon for me to say one is dramatically better than others.

But more than anything, Claude Code has shown me the power of extremely fast loops with agents. It feels a bit faster than Cursor’s agent, but this could also be the UX (it’s really well designed). Claude Code has access to a bunch of tools, including web search, and is able to spin up subtasks to do even more work in parallel.

In practice, I’ve found it to be extremely good when you can control the entire “loop”. Write some code, check if it compiles, and if not, fix it. Then try the tests. If they fail, fix them. Rinse and repeat for linting or other steps. This is where Software 1.0 best practices meet Software 2.0 (AI-era).
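
The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not how Claude Code is implemented: `Check`, `first_failure`, `run_loop`, and the `fix` callback are hypothetical names, and `fix` stands in for whatever agent call edits the code in response to a failure.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Check:
    """A named verification step that returns (passed, output)."""
    name: str
    run: Callable[[], Tuple[bool, str]]


def first_failure(checks: List[Check]) -> Optional[Tuple[str, str]]:
    """Run checks in order and return (name, output) of the first failure, if any."""
    for check in checks:
        passed, output = check.run()
        if not passed:
            return check.name, output
    return None


def run_loop(checks: List[Check], fix: Callable[[str, str], None],
             max_fixes: int = 5) -> bool:
    """Verify, hand the first failure to `fix`, and verify again.

    `fix(name, output)` is a hypothetical stand-in for the agent.
    Returns True once every check passes, or False if checks still
    fail after `max_fixes` repair attempts.
    """
    fixes = 0
    while True:
        failure = first_failure(checks)
        if failure is None:
            return True
        if fixes == max_fixes:
            return False
        fix(*failure)
        fixes += 1


# Toy usage: a "codebase" with two known problems that a fake agent clears.
problems = {"typecheck": "error TS2322", "tests": "1 failed"}


def make_check(name: str) -> Check:
    return Check(name, lambda: (name not in problems, problems.get(name, "")))


def fake_agent(name: str, output: str) -> None:
    problems.pop(name)  # pretend the agent repaired the failing step


checks = [make_check("typecheck"), make_check("tests"), make_check("lint")]
result = run_loop(checks, fake_agent)
print(result)  # True
```

Restarting from the first check after every fix is a deliberate choice in this sketch: a repair for a failing test can reintroduce a type error, so earlier steps are re-verified each round.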

Having deterministic, fast ways to verify correctness in your apps is key for agents. You want tests. And they can’t take 10 minutes to run. You want typed languages and even linters (I begrudgingly accept them now). This way the autonomous agents can “self heal” and fix their own mistakes.
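
One way to keep verification fast is to give each check a hard time budget, so a slow step counts as a failure instead of stalling the agent. The sketch below is an assumption about how that could look, not something from the original post: `shell_check` and its 60-second default are hypothetical, and the commands are stand-ins that use the current Python interpreter so the example runs anywhere.

```python
import subprocess
import sys
from typing import List, Tuple


def shell_check(cmd: List[str], timeout_s: float = 60.0) -> Tuple[bool, str]:
    """Run a command; pass only if it exits 0 within `timeout_s` seconds."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout_s}s: {' '.join(cmd)}"
    return proc.returncode == 0, proc.stdout + proc.stderr


# Stand-in commands; in a real project these would be your type checker,
# test runner, or linter.
ok, out = shell_check([sys.executable, "-c", "print('all good')"])
slow_ok, slow_out = shell_check(
    [sys.executable, "-c", "import time; time.sleep(5)"], timeout_s=0.5
)
```

The returned `(passed, output)` pair gives an agent both a clear pass/fail signal and the text it needs to attempt a fix.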

I sometimes fire off a prompt to Claude Code and see that, along the way, it fixed 2 or 3 issues flagged by TypeScript or the tests. It’s worth really internalizing this point and thinking about how it will impact your tooling choices in the future.

Claude Code still feels expensive, but relative to the value and time saved, it’s likely worth it for many people (again, your mileage may vary). I want to try out some others: OpenCode, Amp, and a few other ones hitting the market soon.

v0 (Web Agent)

I’ve been using v0 for the longest since it’s built at Vercel. The first version (cough, a v0) was pretty basic, and the models at the time really weren’t that great (1.5 years ago).

But v0 has gotten dramatically better since then. At some point, probably 6 months ago or so, it crossed a threshold where quality started to become really good.

It wasn’t one specific thing, but many small things. The underlying model² (a preprocessing / classification step, a regularly updated base model like Claude 4, and a custom trained AutoFix model) helps fix errors other base models would hit generating code, plus it’s able to weave in user preference data and general knowledge of web tools like Next.js / React / etc.

I started using v0 for prototyping and making nice UIs. Then I expanded to animations with tools like framer-motion. And now I’m doing full-stack, backend code on the Next.js side (APIs, talking to databases, etc).

Still, I previously would hit a point where I needed to eject from v0 and go to Cursor. Which sucked because then my time in v0 was basically done, and the models in Cursor wouldn’t be as good at web stuff as v0. But now both of those are fixed.

I can use the v0 model inside of Cursor, and v0 now has two-way git sync. This means that I can push commits locally in Cursor, go back to the v0 UI, and it just automatically pulls in the latest code and keeps on cooking. This is huge because now I can use Cursor and v0 together without it feeling like a duct-tape mess.

Browser-Based Agents

My exploration here is still in progress. The Claude Code GitHub integration didn’t work when I first tried it, and I haven’t revisited it since, so I’ve really only been using it locally.

I have been using OpenAI Codex a bit more on some of my side projects, essentially as yet another agent that can run in the background (in parallel). For example, I’ve asked it to think critically about the app architecture and suggest alternative approaches. Or ask it to explain how it thinks the code works, and then compare that to reality. Or even just say “are there any obvious bugs or red flags”. It’s like a swarm (hehe) of people working for me.
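
Running several of these review prompts side by side can be sketched as below. This is only an illustration of the parallel pattern: `run_reviews`, `REVIEW_PROMPTS`, and `echo_agent` are hypothetical names, and `echo_agent` is a placeholder for a real agent client, not the Codex API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Prompts paraphrased from the examples in the text.
REVIEW_PROMPTS = {
    "architecture": "Think critically about the app architecture and suggest alternative approaches.",
    "explain": "Explain how you think this code works.",
    "bugs": "Are there any obvious bugs or red flags?",
}


def run_reviews(ask_agent: Callable[[str], str],
                prompts: Dict[str, str]) -> Dict[str, str]:
    """Send every prompt to the agent concurrently and collect answers by label."""
    with ThreadPoolExecutor(max_workers=len(prompts)) as pool:
        futures = {label: pool.submit(ask_agent, text) for label, text in prompts.items()}
        return {label: future.result() for label, future in futures.items()}


def echo_agent(prompt: str) -> str:
    """Placeholder agent that echoes the prompt; swap in a real client call."""
    return f"(response to: {prompt})"


answers = run_reviews(echo_agent, REVIEW_PROMPTS)
```

Threads suit this case because each call would mostly wait on a remote agent rather than use the local CPU.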

I’m using Devin at Vercel to merge a ton of small PRs to our docs³. Those things that die off in a Slack thread somewhere, or die in a Linear backlog. I just @ mention Devin in the thread, it makes the PR, and then we ship it.

Funny enough, we also built a custom lil’ GitHub Action that uses the AI SDK to act as an “AI code reviewer”. It checks the output and suggests improvements. More agents in the loop. I haven’t tried CodeRabbit, but it’s a similar idea.

My recommendation to you all is: try out new tools, revisit old workflows. Things have likely gotten much better since you last tried. The state of the art will be redefined again in 6 months, and we’ll have to start this over again. Part of being a great engineer is learning to love the process (and learning to learn).

Further Reading

¹: I have a longer video on Cursor/IDEs if you want more details.

²: More on the v0 model if you’re curious.

³: More on Devin here; it’s been one of my favorites so far.
