译:为不能只凭感觉行事的团队准备的 AI 辅助编码指南

发布于 2025年6月23日

原文: https://blog.nilenso.com/blog/2025/05/29/ai-assisted-coding/
作者: Atharva Raykar
译者: Gemini 2.5 Pro

2025年5月29日

AI should be adopted by serious engineering teams that want to build thoughtful, well-crafted products. This requires skillful usage of these tools. Our obsession with building high-quality software for over a decade has driven us to figure out how this new way of building can result in better products.

那些想要构建深思熟虑、精心打造的产品的严肃工程团队,应该拥抱 AI。但这需要对这些工具有技巧地使用。十多年来,我们对构建高质量软件的执着,驱使我们去探索这种新的构建方式如何能产出更好的产品。

Building with AI is fast. The gains in velocity are important, because when harnessed correctly, it allows teams to tighten feedback loops with users faster and make better products.

用 AI 构建是很快的。速度的提升至关重要,因为如果利用得当,它能让团队更快地缩短与用户的反馈循环,从而做出更好的产品。

Yet, AI tools are tricky to use. Hold them wrong, and you can generate underwhelming results; worse still, you can slow down your velocity by drowning your project in slop and technical debt.

然而,AI 工具的使用也很有技巧。用错了方法,你可能会得到平庸的结果,更糟的是,你的项目可能会陷入垃圾代码和技术债的泥潭,从而拖慢你的开发速度。

This living playbook is based on our experience working with AI tools in the messy trenches of production software, where no one can afford to get away with vibes. I hope other teams can learn and benefit from our findings.

这份动态更新的实践手册,源于我们在生产软件混乱的战壕中使用 AI 工具的经验——在这里,没人能凭感觉蒙混过关。我希望其他团队能从我们的发现中学习并受益。

AI 是一个放大器

To make AI good, get good yourself. AI is a multiplier. If you are a small coefficient, you won’t see much gain. If you are a negative coefficient, expect negative gains.

要想让 AI 发挥好,先要让自己变得优秀。AI 是一个放大器。如果你这个系数本身很小,你不会看到多少增益。如果你是个负系数,那么等着迎接负增长吧。

What I have noticed is that the best and most experienced engineers are able to extract a lot more out of AI tools. There are several reasons for this:

我注意到,最优秀、经验最丰富的工程师能够从 AI 工具中获得多得多的价值。这有几个原因:

  • They are extremely good at communicating technical ideas.
  • 他们极其擅长沟通技术思想。
  • They have a keen calibration and feel for what leads to a good system and can steer LLMs accordingly, i.e., they have what I like to call “the mechanic’s touch”.
  • 他们对于如何构建一个好系统有着敏锐的判断力和感觉,并能相应地引导 LLM,即他们拥有我所说的“机械师的触感”。
  • They have strong fundamentals, so they immediately get up to speed with new tools and systems where knowledge, not skill, is the bottleneck.
  • 他们基础扎实,因此能迅速上手新工具和新系统,在这些领域,知识而非技能才是瓶颈。
  • AI is still sensitive to language and style and will often mirror the tastes and sensibilities of the prompter. Highly skilled engineers have a sharply honed taste and instinct for what works and what doesn’t.
  • AI 对语言和风格仍然敏感,并常常会反映出提问者的品味和感知力。高水平的工程师对于什么可行、什么不可行,已经磨练出了非常敏锐的品味和直觉。

Therefore, embody the care of a craftsperson. At the end of the day, you should produce artifacts you are proud of, even if the AI assisted in making them. This has translated well into the output I am seeing from these systems.

因此,要体现出工匠般的用心。说到底,你应该产出让自己引以为豪的作品,即使 AI 参与了制作过程。这种态度很好地转化为了我从这些系统中看到的产出质量。

Here’s an example. This prompt is not unreasonable but not particularly thoughtful:

举个例子。下面这个 prompt 不算离谱,但也没有经过深思熟虑:

Write a Python rate limiter that limits users to 10 requests per minute.
写一个 Python 的限流器,限制用户每分钟 10 次请求。

I would expect this prompt to give okay results, but also miss some edge cases, good practices and quality standards. This is how you might see someone at nilenso prompt an AI for the same task:

我预料这个 prompt 会给出还算可以的结果,但也会漏掉一些边界情况、良好实践和质量标准。而在 nilenso,你可能会看到有人这样向 AI 提问同样任务:

Implement a token bucket rate limiter in Python with the following requirements:
- 10 requests per minute per user (identified by `user_id` string)
- Thread-safe for concurrent access
- Automatic cleanup of expired entries
- Return tuple of (allowed: bool, retry_after_seconds: int)

Consider:
- Should tokens refill gradually or all at once?
- What happens when the system clock changes?
- How to prevent memory leaks from inactive users?

Prefer simple, readable implementation over premature optimization. Use stdlib only (no Redis/external deps).
用 Python 实现一个令牌桶限流器,要求如下:
- 每个用户(通过 `user_id` 字符串识别)每分钟 10 次请求
- 线程安全,支持并发访问
- 自动清理过期条目
- 返回一个元组 (allowed: bool, retry_after_seconds: int)

请考虑:
- 令牌是应该逐渐补充还是一次性补满?
- 当系统时钟发生变化时会怎样?
- 如何防止因不活跃用户导致的内存泄漏?

优先选择简单、可读的实现,而不是过早优化。仅使用标准库(不要用 Redis 或其他外部依赖)。

Guess which prompt is going to better achieve the program designer’s intent?

猜猜看,哪个 prompt 能更好地实现程序设计者的意图?
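To make the gap concrete, here is a minimal sketch of what the better prompt's spec asks for: a token bucket limiter that is thread-safe, refills gradually, returns `(allowed, retry_after_seconds)`, and can clean up inactive users. It is one plausible implementation of that spec, not the exact code an LLM would produce.

为了让这个差距更直观,下面是一个满足那个“更好的 prompt”所列规格的最小草图:线程安全的令牌桶、逐渐补充令牌、返回 `(allowed, retry_after_seconds)`、并能清理不活跃用户。这只是该规格的一种合理实现,并非 LLM 的确切产出。

```python
import math
import threading
import time


class TokenBucketRateLimiter:
    """Token bucket limiter: `capacity` requests per `window_seconds` per user."""

    def __init__(self, capacity: int = 10, window_seconds: float = 60.0):
        self.capacity = capacity
        self.refill_rate = capacity / window_seconds  # tokens added per second
        self._buckets: dict[str, tuple[float, float]] = {}  # user_id -> (tokens, last_ts)
        self._lock = threading.Lock()  # thread-safety for concurrent access

    def allow(self, user_id: str) -> tuple[bool, int]:
        # Monotonic clock: immune to system clock changes (one of the prompt's questions).
        now = time.monotonic()
        with self._lock:
            tokens, last = self._buckets.get(user_id, (self.capacity, now))
            # Gradual refill rather than all-at-once, capped at capacity.
            tokens = min(self.capacity, tokens + (now - last) * self.refill_rate)
            if tokens >= 1:
                self._buckets[user_id] = (tokens - 1, now)
                return True, 0
            self._buckets[user_id] = (tokens, now)
            # Seconds until the next token becomes available, rounded up.
            return False, math.ceil((1 - tokens) / self.refill_rate)

    def cleanup(self, idle_seconds: float = 300.0) -> None:
        """Drop entries for users idle longer than `idle_seconds` (prevents memory leaks)."""
        cutoff = time.monotonic() - idle_seconds
        with self._lock:
            self._buckets = {u: v for u, v in self._buckets.items() if v[1] >= cutoff}
```

Note how every design question in the prompt (refill strategy, clock changes, memory leaks) maps to a concrete line in the code; the vaguer prompt gives the model no reason to address any of them.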

A technique that has worked well for us is metaprompting. I prompt the model with a simple task and ask it to help surface tradeoffs and edge cases. Then I turn it into a tech spec and hand it off to another LLM agent to execute. Even the “better prompt” I shared above is a result of asking the AI to come up with a good prompt. From my experience, models have become good at prompting themselves.

一个对我们很有效的技巧是元提示(metaprompting)。我先给模型一个简单的任务,然后让它帮我发现其中的权衡和边界情况。接着,我把这些内容整理成一份技术规格,再交给另一个 LLM 代理去执行。就连我上面分享的那个“更好的 prompt”,也是我让 AI 帮我构思一个好 prompt 的结果。根据我的经验,模型已经变得很擅长提示自己了。

The mechanics of what works for these tools are in flux, but one robust principle is to really work on yourself to be a good engineer. Your habits will quickly pass on to the AI systems you work with. The reason this works is because what helps the human helps the AI.

这些工具的有效机制在不断变化,但有一个稳健的原则是:努力提升自己,成为一名优秀的工程师。你的习惯会迅速传递给你合作的 AI 系统。这之所以有效,是因为对人有益的东西,对 AI 同样有益。

对人有益的,对 AI 同样有益

I shall clarify what software engineering is, something I found worth revisiting in the light of disruption from AI advancements.

我想在此阐明一下什么是软件工程,在 AI 进步带来的颠覆之下,我觉得这个问题值得重新审视。

Software engineering is not about writing code. Or at least, that’s not the defining characteristic, much like how writing is not wrist exercises with ink on paper.

软件工程并不仅仅是写代码。至少,这并非它的决定性特征,就像写作不只是在纸上用墨水做手腕运动一样。

To me software engineering is the art and science of maintaining a large body of well-defined mental models that achieve a business or economic need. Much of the work is around crafting and curating these large, complex sociotechnical systems, and code is just one representation of these systems.

对我而言,软件工程是这样一门艺术和科学:维护一个庞大且定义清晰的心智模型集合,以满足商业或经济需求。大部分工作都围绕着构建和维护这些庞大、复杂的社会技术系统,而代码只是这些系统的一种表现形式。

Until AI is good enough to engulf this whole sociotechnical system and expel all the humans cultivating it, it has to participate in and benefit from this very system. In simpler words: AI thrives far, far better in an environment in which a human would also thrive. Which means your team’s software fundamentals should be strong.

在 AI 强大到足以吞噬整个社会技术系统并驱逐所有培育它的人类之前,它必须参与并受益于这个系统本身。简单来说:在一个能让人类茁壮成长的环境中,AI 的成长会好得多得多。这意味着,你团队的软件基础必须扎实。

A system in which AI thrives is one with markers of a high quality team and codebase. These are:

一个能让 AI 蓬勃发展的系统,通常也带有一个高质量团队和代码库的标志。这些标志包括:

  • Good test coverage, with useful assertions
  • 良好的测试覆盖率,以及有用的断言
  • Automated linting, formatting and test checks before code merges
  • 代码合并前的自动化 linting、格式化和测试检查
  • Continuous integration and deployment
  • 持续集成和持续部署
  • Well documented changes, tech specs, ADRs with good commit messages
  • 记录良好的变更、技术规格、ADR(架构决策记录),以及优质的 commit 信息
  • Consistent styles and patterns, enforced through a formatter
  • 通过格式化工具强制执行的一致风格和模式
  • Simple, concise, well-organised code
  • 简单、简洁、组织良好的代码
  • Clearly defined features, broken down into multiple small story cards
  • 清晰定义的功能,并分解为多个小型的故事卡片

Today’s AI can and will make use of all of this to make things “just work”. When I give a coding agent a task, it keeps correcting itself in the agentic loop by running the test cases and static analysis tools. This greatly reduces the hand-holding and intervention needed to get work done.

今天的 AI 能够并且将会利用所有这些东西来让事情“自动搞定”。当我给一个编码代理一项任务时,它会在代理循环中通过运行测试用例和静态分析工具来不断自我修正。这极大地减少了完成工作所需的人工指导和干预。

A rich environment and context helps the AI work better.

一个丰富的环境和上下文有助于 AI 更好地工作。

Here’s an anecdote: when I was working on a project that had two services, one of them had all of the things I described above—good tests, well-documented changes, consistent patterns in the code, lots of checks and guardrails. The other service was messier and had none of the above. Our AI coding assistant struggled to complete a task of equal difficulty on the latter codebase when compared to the former! This is likely because the messier codebase was as confusing for the AI as it would be for a human. There were mixed signals about the right way to do things.

这里有一个轶事:我曾参与一个有两个服务的项目,其中一个服务具备我上面描述的所有优点——良好的测试、文档齐全的变更、一致的代码模式、大量的检查和防护措施。而另一个服务则混乱不堪,上述优点一概没有。我们的 AI 编码助手在后一个代码库上完成一个同等难度的任务时,遇到了极大的困难,远超前一个!这很可能是因为那个混乱的代码库对 AI 来说,就像对人类一样令人困惑。关于做事的正确方式,它收到了太多混杂的信号。

编辑器内的工具和技巧

Now that I have outlined the general strategy, here are some tactics that have helped me.

既然我已经概述了通用策略,下面是一些对我有帮助的具体战术。

使用最顶尖的 AI 模型,不要吝啬。

  • Use the best coding model available. Do not try to save credits and cost by using a worse model. The goodness of a good model compounds. All the tactics that I present ahead will work far better when you have a strong coding model to begin with.
  • 使用市面上最好的编码模型。不要为了节省点数和成本而使用较差的模型。好模型的优势是会复利式增长的。我接下来介绍的所有战术,在一个强大的编码模型基础上,效果会好得多。

擅长提供上下文。

  • The effectiveness of AI-assisted coding is strongly dependent on how skillfully you can provide the right context to the LLM.
  • AI 辅助编码的有效性,在很大程度上取决于你向 LLM 提供正确上下文的技巧。
  • Use an “agentic” coding tool. These are tools that are able to read and analyse files, run shell commands, fetch docs, create plans and execute on those plans, needing no human intervention (except maybe approvals). Our current recommendation for tools that do this are Claude Code, Windsurf, Cursor, Cline.
  • 使用“代理式”编码工具。这类工具能够读取和分析文件、运行 shell 命令、获取文档、创建计划并执行这些计划,几乎不需要人工干预(或许除了审批)。我们目前推荐的这类工具有 Claude Code、Windsurf、Cursor、Cline。
  • LLMs can get distracted and fall into rabbitholes if given irrelevant or a cluttered context. Focus its attention by only @-mentioning files that are relevant and linking only to documentation that helps the task.
  • 如果给 LLM 无关或杂乱的上下文,它可能会分心并钻牛角尖。通过只 @-mention 相关文件,并且只链接对任务有帮助的文档来集中它的注意力。
  • Encode coding standards and practices in a RULES.md file. Symlink this file to agent-specific rules files such as .cursorrules, .windsurfrules, claude.md, agents.md, etc.
    • This file should have information about the tech stack, how to use the dev tooling and run the linter, coding standard and patterns, and cover for common mistakes that the LLMs have made when working with the code. Here’s an example.
  • 将编码标准和实践固化在一个 RULES.md 文件中。然后将此文件软链接到特定代理的规则文件,如 .cursorrules、.windsurfrules、claude.md、agents.md 等。
    • 这个文件应该包含技术栈信息、如何使用开发工具和运行 linter、编码标准和模式,以及 LLM 在处理代码时常犯的错误。这是一个示例
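The symlinking step above can be scripted so every agent tool reads the same canonical rules file. This is a hypothetical setup helper (the filenames follow the ones mentioned above; adjust to the tools your team actually uses):

上面提到的软链接步骤可以写成脚本,让每个代理工具都读取同一份规范文件。下面是一个假想的初始化脚本(文件名沿用上文提到的那些,请按你团队实际使用的工具调整):

```python
from pathlib import Path

# Agent-specific filenames that should all point at the canonical RULES.md.
AGENT_RULE_FILES = [".cursorrules", ".windsurfrules", "claude.md", "agents.md"]


def link_rules(root: str = ".") -> list[str]:
    """Symlink each agent rules file to RULES.md; return the links created."""
    root_path = Path(root)
    (root_path / "RULES.md").touch(exist_ok=True)  # ensure the canonical file exists
    created = []
    for alias in AGENT_RULE_FILES:
        link = root_path / alias
        if not link.exists():
            # Relative target keeps the symlink valid when the repo is moved/cloned.
            link.symlink_to("RULES.md")
            created.append(alias)
    return created
```

One file, many names: editing RULES.md updates the rules every agent sees, so the standards can never drift apart between tools.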

实现新功能或重构

  • Break down the problem. AI works better the more specific you are. Remember, you can also use the AI to reduce the tedium of making your prompts better written and more specific. Reasoning models are great at this!
  • 分解问题。你提的要求越具体,AI 工作得越好。记住,你也可以用 AI 来减少把 prompt 写得更好、更具体的繁琐工作。推理模型很擅长这个!
  • If you are working on a big feature, break it down into small tasks, and feed the tasks one by one, making a commit at the end of each task. If you do these with your stories, the story card description with the task list is often a very helpful description for the AI.
  • 如果你在开发一个大功能,把它分解成小任务,然后逐一提交给 AI,每完成一个任务就进行一次 commit。如果你用 story 的方式管理工作,那么带有任务列表的 story card 描述通常对 AI 来说是非常有用的说明。
  • Supply tech specs and relevant documentation about the product and feature. Don’t just ask it to write code without broader context of the product. Also feed it documentation on how to use the libraries you are using. Pasting links to documentation often works with most tools. Some libraries provide a llms.txt for coding agents to use.
  • 提供关于产品和功能的技术规格和相关文档。不要在没有产品整体背景的情况下就让它写代码。同时,也要给它提供你正在使用的库的文档。对于大多数工具来说,直接粘贴文档链接通常是有效的。有些库会提供一个 llms.txt 文件供编码代理使用。
  • Another pattern that has worked well for us is to break down the feature into “planning” and “execution” stages. Some coding agents already do this kind of a breakdown for you.
  • 另一个对我们很有效的模式是,将功能开发分解为“规划”和“执行”两个阶段。一些编码代理已经为你做了这种分解。
  • Do not take AI suggestions for granted. Ask it to justify its choices, present alternatives and think about advantages and drawbacks.
  • 不要想当然地接受 AI 的建议。要求它解释其选择的理由,提出替代方案,并思考优缺点。

调试

  • Use AI to debug errors in its generation. Always paste the error context most relevant for the LLM to help it understand the issue (I prefer to delineate the error logs or output in a separate XML tag).
  • 用 AI 来调试它自己生成代码中的错误。务必粘贴与 LLM 最相关的错误上下文来帮助它理解问题(我倾向于用一个单独的 XML 标签来界定错误日志或输出)。
  • Explain what you have tried, and additional observations to help the model generate correct hypotheses and eliminate bad ones. Provide lots of context.
  • 说明你已经尝试了什么,并提供额外的观察,以帮助模型生成正确的假设并排除错误的假设。提供尽可能多的上下文。

编辑器外的工具和技巧

用 AI 提升你自己的技能和知识

  • LLMs are an infinitely patient teacher with massive world knowledge (and more recently, ability to research effectively). Aggressively use them to learn things and demystify any new code or stack. Relentlessly dig. Figure out the best practices. Ensure you are learning correctly by getting the LLM to cite high quality sources.
  • LLM 是一个拥有海量知识(最近还具备了有效研究能力)且无限耐心的老师。积极地用它们来学习新事物,解开任何新代码或技术栈的神秘面纱。不懈地深挖。找出最佳实践。通过让 LLM 引用高质量的来源,来确保你的学习是正确的。

创建详尽的文档

  • Create lots of detailed documentation easily by feeding codebases to the LLM. E.g.:
    • Explain functionality, create a knowledge base
    • Summarise all the current metrics being collected
    • Identify missing test cases more intelligently
  • 通过向 LLM 提供代码库,轻松地创建大量详细文档。例如:
    • 解释功能,创建知识库
    • 总结当前收集的所有指标
    • 更智能地识别缺失的测试用例

There’s a good reason to do this—documentation is now cheap to generate and feeds back into making your LLMs (and humans) on the project a lot more effective.

这样做有一个很好的理由——文档的生成成本现在很低,而它又能反哺项目,让你的 LLM(和人类)工作得更高效。

微摩擦润滑剂

LLMs greatly reduce the cost of creating lubricants for all the minor friction points that teams run into on a daily basis.

对于团队日常遇到的各种微小摩擦点,LLM 极大地降低了为其创造“润滑剂”的成本。

  • Use them to create mockservers to coordinate and unblock work between frontend and backend teams. All that is needed is agreeing on a contract.
  • 用它们来创建 mock server,以协调和疏通前端与后端团队之间的工作。所需要的只是就一份契约达成一致。
  • Create runbooks and guides for infra deployments, common types of troubleshooting and more by supplying shell history sessions to the LLM.
  • 通过向 LLM 提供 shell 历史会话,为基础设施部署、常见故障排除等创建操作手册和指南。
  • Feed existing runbooks and guides to an LLM to make them into scripts automating common tasks.
  • 将现有的操作手册和指南喂给 LLM,让它把它们变成自动化常见任务的脚本。
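As a sketch of the first point, a mock server can be stood up from nothing but the agreed contract, using only the standard library. The endpoint and payload here are hypothetical stand-ins for whatever contract your frontend and backend teams agree on:

作为第一点的示意,仅凭双方商定的契约、只用标准库就能搭起一个 mock server。这里的接口路径和返回数据都是假想的占位,实际以你们前后端商定的契约为准:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical contract agreed between frontend and backend:
#   GET /api/users/<id>  ->  {"id": ..., "name": ...}
CANNED_USERS = {"42": {"id": "42", "name": "Ada"}}


class MockAPIHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        parts = self.path.strip("/").split("/")
        if len(parts) == 3 and parts[:2] == ["api", "users"] and parts[2] in CANNED_USERS:
            body = json.dumps(CANNED_USERS[parts[2]]).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # silence per-request logging
        pass


def start_mock_server(port: int = 0) -> HTTPServer:
    """Start the mock server on a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), MockAPIHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The frontend team can develop against this while the real endpoint is still being built; when the real service lands, nothing changes except the base URL.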

代码审查

  • Have a template for Pull Requests, feed the code diff (git log -p <range>) of each feature to the AI to explain the changes and how to deploy them. Some tools can already do this for you.
  • 为 Pull Request 准备一个模板,将每个功能的代码 diff (git log -p <range>) 提供给 AI,让它解释变更内容以及如何部署。一些工具已经可以为你做这件事了。
  • To reduce time to first PR review, use a code reviewing bot for the first part. But do not replace human review!
  • 为了缩短首次 PR 审查的时间,可以使用代码审查机器人在初期介入。但绝不要取代人工审查!
  • Use LLMs to explain a change that you don’t fully understand as a reviewer. Ask it for clarification, and then ask the implementer after gathering the necessary context.
  • 作为审查者,用 LLM 来解释你不太理解的变更。向它寻求澄清,然后在收集了必要的上下文之后,再去问实现者本人。

调试和监控线上应用

  • Use researching capabilities of LLMs to help find solutions to uncommon errors. Follow the advice of debugging in the editor to debug outside it. Provide as much context as you can.
  • 利用 LLM 的研究能力来帮助寻找不常见错误的解决方案。遵循在编辑器内调试的建议来处理编辑器外的调试。提供尽可能多的上下文。
  • LLMs are quite decent at writing queries and alerting rules for observability tools. They also are good at crunching data and performing analyses by writing custom python code.
  • LLM 在为可观测性工具编写查询和告警规则方面表现得相当不错。它们也很擅长通过编写自定义的 python 代码来处理数据和进行分析。

性能优化

  • Use LLMs to help you optimise databases and tune configuration. When doing so provide context on the infrastructure and hardware. Share query plans.
  • 用 LLM 帮助你优化数据库和调整配置。这样做时,要提供关于基础设施和硬件的上下文,并分享查询计划。
  • This is an example of such an interaction I had recently.
  • 这篇是我最近一次类似交互的例子。

AI 如何改变软件工艺的启示

This is a huge shift in how we write software, and I believe it warrants some changes to ideas that were previously considered common wisdom.

这是我们编写软件方式的一次巨大转变,我相信,这值得我们去修正一些过去被视为常识的观念。

Firstly, it’s less valuable to spend too much time looking for and building sophisticated abstractions. DRY is useful for ensuring patterns in the code don’t go out of sync, but there are costs to implementing and maintaining an abstraction to handle changing requirements. LLMs make some repetition palatable and allow you to wait a bit more and avoid premature abstraction.

首先,花费过多时间去寻找和构建复杂的抽象,其价值已经降低了。DRY 原则对于确保代码模式不失同步很有用,但为了应对需求变化而实现和维护一个抽象是有成本的。LLM 让某些重复变得可以接受,并允许你多等一等,以避免过早的抽象。

Redoing work is now extremely cheap. Code in the small is less important than structural patterns and organisation of the code in the large. You can also build lots of prototypes to test an idea out. For this, vibe-coding is great, as long as the prototype is thrown away and rewritten properly later.

返工的成本现在极其低廉。微观层面的代码,其重要性已不如宏观层面的结构模式和代码组织。你也可以构建大量的原型来测试一个想法。为此,凭感觉编码(vibe-coding)是很棒的,只要这个原型用完即弃,之后再被妥善地重写。

Working with LLMs also lets you take advantage of the generator-verifier gap. Often it’s easier to verify and fix things than it is to produce them from scratch. This reduces activation energy to try new things.

与 LLM 合作,还能让你利用“生成者-验证者差距”。通常情况下,验证和修复问题比从头创造要容易得多。这降低了尝试新事物的启动门槛。

Tests are non-negotiable, and AI removes all excuses to not write them because of how fast they can belt them out. But always review the assertions!

测试是不可协商的,而 AI 消除了所有不写测试的借口,因为它们生成测试的速度飞快。但一定要审查那些断言!

随着我们对这些工具了解的深入,本手册未来会增加的内容

  • Deploying autonomous agents like Devin/Jules/Claude Code and using them well
  • 部署像 Devin/Jules/Claude Code 这样的自主代理并善加利用
  • AI tooling for writing queries, performing data analysis
  • 用于编写查询、执行数据分析的 AI 工具
  • Concerns with leaking proprietary code, hosted LLM options, etc
  • 关于泄露专有代码、托管 LLM 选项等的考量
  • Building a culture of sharing prompts, patterns and templates
  • 建立分享 prompt、模式和模板的文化
  • Effective ways of driving AI adoption in teams
  • 在团队中有效推动 AI 采纳的方法

参考文献

Programming with LLMs - David Crawshaw
Here’s how I use LLMs to help me write code - Simon Willison
How I use “AI” - Nicholas Carlini
