
AI Coding Glossary

Key terms in AI-assisted development, explained clearly with practical context.

Agentic Coding

Agentic coding is a style of software development in which an AI agent can autonomously read a codebase, write code, run commands, and iterate on results, without manual copy-pasting. Unlike chat-based AI, an agent acts directly inside your development environment to complete multi-step tasks.

Claude Code

Claude Code is Anthropic's terminal-based AI coding agent that runs directly in your development environment. It can read an entire project, write code across multiple files, run shell commands, manage Git workflows, and iterate autonomously on errors, all from the command line.

Model Context Protocol (MCP)

The Model Context Protocol (MCP) is an open standard created by Anthropic that provides a universal way to connect AI models to external tools, data sources, and APIs. It acts as a standardized interface, like a "USB port for AI," so any MCP-compatible tool can work with any MCP-compatible AI agent.
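As an illustration, Claude Code can load MCP servers from a `.mcp.json` file in the project root. The entry below is a sketch of that shape only; the server name, package, and environment variable are placeholders, so check a given server's own documentation for its exact configuration:

```json
{
  "mcpServers": {
    "example-server": {
      "command": "npx",
      "args": ["-y", "@example/mcp-server"],
      "env": { "EXAMPLE_API_KEY": "${EXAMPLE_API_KEY}" }
    }
  }
}
```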

CLAUDE.md

CLAUDE.md is a Markdown configuration file placed in a project's root directory that gives Claude Code persistent, project-specific instructions. It tells the agent about coding conventions, architecture, common commands, and rules, acting as long-term memory that applies to every session in that project.
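A minimal CLAUDE.md might look like the following; the commands, conventions, and directory names are invented for illustration:

```markdown
# Project Notes for Claude

## Commands
- Build: `npm run build`
- Test: `npm test`

## Conventions
- Use TypeScript strict mode; avoid `any`.
- Prefer named exports over default exports.

## Architecture
- `src/api/` holds route handlers; business logic lives in `src/services/`.
```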

AI Pair Programming

AI pair programming is a development workflow in which a human developer and an AI tool collaborate on code in real time. The developer provides direction, context, and judgment, while the AI contributes code suggestions, catches bugs, and handles repetitive implementation tasks.

Context Window

The context window is the maximum number of tokens (words, code characters, and symbols) an AI model can process in a single interaction. It defines the upper limit on how much information the AI can hold in memory at once, including your prompt, your code, and the model's response.
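Because the window is fixed, tools often budget tokens before sending a request. A minimal sketch, using the common rough heuristic of about four characters per token (real tokenizers vary, and the 200,000-token default here is just an example window size):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def fits_in_window(prompt: str, code: str, window: int = 200_000,
                   reserve_for_reply: int = 4_000) -> bool:
    """Check whether prompt + code still leave room for the model's reply."""
    used = estimate_tokens(prompt) + estimate_tokens(code)
    return used + reserve_for_reply <= window

print(fits_in_window("Explain this function.", "def add(a, b): return a + b"))
```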

Coding Agent

A coding agent is an AI-powered tool that can autonomously read files, write code, execute terminal commands, and iterate on results to complete programming tasks. Unlike passive code-suggestion tools, a coding agent takes independent action in your development environment to achieve a stated goal.

Vibe Coding

Vibe coding is an informal approach to software development in which the developer describes the desired outcome in natural language and lets AI tools handle the implementation details. Rather than writing precise specifications, the developer conveys intent through casual conversation and iterates based on the results.

AI Code Review

AI code review is the process of using artificial intelligence to automatically analyze source code for bugs, security vulnerabilities, style inconsistencies, and quality issues. AI review tools can examine pull requests, suggest improvements, and catch problems that human reviewers might miss due to fatigue or time pressure.

Prompt Engineering for Code

Prompt engineering for code is the practice of crafting clear, specific instructions that help AI coding tools produce accurate, relevant results. It involves structuring your requests with the right context, constraints, and examples so the AI understands both what you want and how you want it done.

Headless AI Agent

A headless AI agent is a coding agent that runs without a user-facing interface or real-time interaction. It executes tasks autonomously in background processes, CI/CD pipelines, or scheduled jobs: reading code, making changes, running tests, and reporting results, all without human input.

Subagent

A subagent is a parallel child process spawned by a primary AI coding agent to work on independent parts of a complex task simultaneously. Instead of handling everything sequentially, the main agent delegates subtasks to specialized subagents that run in parallel, each reporting its results back to the parent when finished.
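The fan-out/fan-in pattern behind subagents can be sketched with a thread pool; `run_subagent` here is a stand-in for spawning a real agent process:

```python
from concurrent.futures import ThreadPoolExecutor

def run_subagent(subtask: str) -> str:
    # Stand-in for a real agent invocation: an actual subagent would read
    # code, make edits, run tests, and summarize what it did.
    return f"done: {subtask}"

def delegate(subtasks: list[str]) -> list[str]:
    # The parent agent fans subtasks out in parallel, then gathers results.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(run_subagent, subtasks))

print(delegate(["fix failing test", "update docs", "refactor utils"]))
```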

AI Code Completion

AI code completion is a feature in development tools that uses machine learning models to predict and suggest code as you type. It ranges from single-line autocomplete to multi-line function generation, analyzing the surrounding code context to offer relevant suggestions that match your intent and coding style.

Large Language Model (LLM)

A large language model (LLM) is a deep learning system with billions of parameters, trained on vast datasets of text and code to understand, generate, and reason about natural language and programming languages. LLMs like Claude, GPT-4, and Gemini are the foundation of modern AI coding tools.

Retrieval-Augmented Generation (RAG)

Retrieval-augmented generation (RAG) is an AI architecture that improves the accuracy of language model responses by retrieving relevant information from external knowledge sources before generating an answer. Instead of relying solely on what the model memorized during training, RAG fetches up-to-date, domain-specific data and includes it in the model's context.
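A toy sketch of the retrieve-then-generate flow, with word overlap standing in for real embedding-based similarity and the model call left abstract (the documents are invented):

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (a stand-in for
    embedding similarity) and return the top k."""
    q = set(query.lower().split())
    ranked = sorted(documents,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_rag_prompt(query: str, documents: list[str]) -> str:
    """Inject retrieved context into the prompt before calling the model."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"

docs = ["The deploy script lives in scripts/deploy.sh",
        "Use Python 3.12 for all services",
        "The staging database resets nightly"]
print(build_rag_prompt("which python version for services", docs))
```

The model then answers from the injected context instead of relying on what it memorized during training.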

Tool Use

Tool use (also called tool calling) is the capability of a large language model to invoke external functions, APIs, or system commands as part of generating a response. Instead of being limited to producing text, a model with tool use can read files, run code, query databases, and interact with services—making it the foundation of agentic AI systems.
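The tool-use loop can be sketched as follows: the model either answers or requests a tool, and the runtime executes the tool and feeds the result back. Here the "model" is a hard-coded stub, and `word_count` is an invented example tool:

```python
def word_count(text: str) -> int:
    """A minimal tool the model can invoke."""
    return len(text.split())

TOOLS = {"word_count": word_count}

def fake_model(messages: list[dict]) -> dict:
    # Stub standing in for a real LLM: on the first turn it requests a
    # tool call, on the next it uses the tool result to answer.
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        return {"tool": "word_count", "args": {"text": messages[0]["content"]}}
    return {"answer": f"That message has {tool_results[0]['content']} words."}

def agent_loop(user_message: str) -> str:
    messages = [{"role": "user", "content": user_message}]
    while True:
        reply = fake_model(messages)
        if "answer" in reply:
            return reply["answer"]
        result = TOOLS[reply["tool"]](**reply["args"])  # runtime runs the tool
        messages.append({"role": "tool", "content": result})

print(agent_loop("count the words in this sentence"))
```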

Function Calling

Function calling is an AI model capability where the model generates structured JSON arguments to invoke external functions instead of producing plain text. This enables LLMs to interact with APIs, databases, file systems, and other tools in a reliable, programmatic way—turning a conversational model into one that can take real-world actions.
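The mechanics can be sketched as the model emitting a function name plus JSON arguments, which a runtime parses and dispatches. The model output below is hard-coded, and `get_weather` is an invented example function:

```python
import json

def get_weather(city: str, unit: str = "celsius") -> str:
    # Illustrative target function; a real one would call a weather API.
    return f"22 degrees {unit} in {city}"

REGISTRY = {"get_weather": get_weather}

# What a model with function calling emits: structured JSON rather than
# free-form text.
model_output = '{"name": "get_weather", "arguments": {"city": "Tokyo"}}'

call = json.loads(model_output)
result = REGISTRY[call["name"]](**call["arguments"])
print(result)  # 22 degrees celsius in Tokyo
```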

System Prompt

A system prompt is a set of instructions provided to an AI model before the user's message that defines the model's behavior, persona, constraints, and capabilities. It acts as a configuration layer that shapes every response the model produces, without the user needing to repeat these instructions in each message.
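In API terms, the system prompt is typically a separate field sent alongside the conversation. This payload follows the general shape of the Anthropic Messages API, but the model name is illustrative and field details vary by provider:

```python
request = {
    "model": "claude-sonnet-4-5",  # illustrative model name
    "system": ("You are a senior Python reviewer. Be terse. "
               "Flag security issues first, style issues last."),
    "messages": [
        {"role": "user", "content": "Review this function: eval(user_input)"}
    ],
    "max_tokens": 1024,
}
# The system prompt shapes every reply without the user restating it each turn.
```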

Temperature

Temperature is a parameter in large language models that controls the randomness of the output. A temperature of 0 makes the model deterministic, always choosing the most probable next token. Higher temperatures (up to 1.0 or 2.0) increase randomness, making less probable tokens more likely to be selected. For coding tasks, lower temperatures generally produce more reliable, consistent code.
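Mechanically, temperature divides the logits before the softmax: lower values sharpen the distribution toward the top token, higher values flatten it. A minimal sketch with toy logits (temperature 0 is handled separately in practice as greedy argmax, since dividing by zero is undefined):

```python
import math

def softmax_with_temperature(logits: list[float],
                             temperature: float) -> list[float]:
    """Convert logits to probabilities, scaled by temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
print(softmax_with_temperature(logits, 0.2))  # sharp: top token dominates
print(softmax_with_temperature(logits, 1.0))  # standard softmax
print(softmax_with_temperature(logits, 2.0))  # flat: rarer tokens more likely
```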

Token

A token is the fundamental unit of text that a large language model processes. Tokenization splits text into chunks—sometimes whole words, sometimes subwords, sometimes individual characters—that the model can work with. In English text, one token is roughly 3-4 characters or 0.75 words. In code, tokens map to keywords, operators, variable names, and whitespace.
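Real tokenizers use learned subword vocabularies (e.g. byte-pair encoding), but a toy splitter illustrates the idea that text becomes discrete units, and that code punctuation tokenizes densely:

```python
import re

def toy_tokenize(text: str) -> list[str]:
    # Toy stand-in for a real BPE tokenizer: split into words, numbers,
    # and individual punctuation characters.
    return re.findall(r"\w+|[^\w\s]", text)

print(toy_tokenize("Tokenization splits text."))
# ['Tokenization', 'splits', 'text', '.']
print(toy_tokenize("x += arr[i];"))
# ['x', '+', '=', 'arr', '[', 'i', ']', ';']
```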

Fine-Tuning

Fine-tuning is the process of further training a pre-trained large language model on a smaller, task-specific dataset to adapt its behavior for a particular use case. The model's weights are updated to specialize in a domain—such as a specific programming language, codebase, or output format—while retaining its general capabilities from pre-training.

Code Generation

AI code generation is the process of using artificial intelligence to produce source code from natural language descriptions, specifications, or existing code context. Modern code generation powered by LLMs can write entire functions, classes, tests, and even full applications from high-level instructions, across virtually any programming language.

AI Refactoring

AI refactoring is the use of artificial intelligence to automatically restructure, simplify, and improve existing source code without changing its external behavior. AI refactoring tools analyze code for complexity, duplication, poor naming, and anti-patterns, then apply transformations that make the code cleaner, more maintainable, and easier to understand.

AI Testing

AI testing is the application of artificial intelligence to software testing workflows—including generating unit tests, integration tests, and end-to-end tests from source code; identifying untested edge cases; analyzing test failures; and suggesting fixes. AI testing tools understand code semantics to write meaningful tests that go beyond basic coverage.

Multi-Modal AI

Multi-modal AI refers to artificial intelligence systems that can process, understand, and generate multiple types of data—text, images, audio, video, and code—within a single model. Unlike single-modal models that only handle text, multi-modal models can analyze a screenshot of a UI, read the associated code, and generate modifications based on both visual and textual understanding.

Chain-of-Thought

Chain-of-thought (CoT) prompting is a technique that encourages a large language model to break down complex problems into intermediate reasoning steps before producing a final answer. Instead of jumping to a conclusion, the model "thinks out loud," explaining each step of its logic. This significantly improves accuracy on tasks that require multi-step reasoning, including debugging, algorithm design, and code architecture decisions.
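A chain-of-thought prompt usually just asks for the intermediate steps explicitly before the answer; the debugging scenario below is invented for illustration:

```python
cot_prompt = """The test below fails intermittently. Before proposing a fix,
reason step by step:
1. List every piece of shared state the test touches.
2. For each, explain how concurrent access could change the outcome.
3. Only then propose the smallest fix.

def test_counter():
    start = counter.value
    increment()
    assert counter.value == start + 1
"""
# Asking for numbered reasoning steps before the answer is the core of CoT.
print(cot_prompt)
```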

Few-Shot Prompting

Few-shot prompting is a technique where you include a small number of example input-output pairs in your prompt to demonstrate the pattern you want the AI to follow. By showing the model 2-5 examples of the desired behavior, it learns the format, style, and logic you expect—without any model training or fine-tuning. This is one of the most effective techniques for getting consistent, formatted output from LLMs.
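A few-shot prompt for, say, generating commit messages embeds the example pairs directly in the prompt; the diff summaries and commit messages here are invented:

```python
few_shot_prompt = """Write a conventional commit message for each diff summary.

Summary: added retry logic to the HTTP client
Commit: feat(http): add retry logic to client

Summary: fixed off-by-one in pagination
Commit: fix(pagination): correct off-by-one in page bounds

Summary: renamed UserSvc to UserService
Commit:"""
# The model completes the final "Commit:" line, following the pattern the
# two worked examples demonstrate.
print(few_shot_prompt)
```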

Zero-Shot Prompting

Zero-shot prompting is a technique where you instruct an AI model to perform a task without providing any examples of the desired input-output format. You describe what you want in natural language, and the model relies entirely on its pre-trained knowledge to produce the output. It is the most natural way to interact with AI—just tell it what to do.

Embeddings

Embeddings are dense numerical vectors (arrays of floating-point numbers) that represent text, code, or other data in a high-dimensional space where semantically similar items are positioned close together. They enable AI systems to measure similarity between pieces of code, search codebases by meaning rather than keywords, and power retrieval-augmented generation (RAG) systems.
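Similarity between embeddings is usually measured with cosine similarity. The three-dimensional vectors below are toy values (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

sort_fn  = [0.9, 0.1, 0.2]  # toy embedding of a sorting function
order_fn = [0.8, 0.2, 0.3]  # toy embedding of a related ordering function
log_stmt = [0.1, 0.9, 0.7]  # toy embedding of a logging statement

print(cosine_similarity(sort_fn, order_fn))  # high: semantically similar code
print(cosine_similarity(sort_fn, log_stmt))  # low: unrelated code
```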

Vector Database

A vector database is a specialized database designed to store, index, and search high-dimensional embedding vectors efficiently. Unlike traditional databases that match exact values or keywords, vector databases find the most similar vectors to a query vector—enabling semantic search, recommendation systems, and the retrieval component of RAG (retrieval-augmented generation) architectures.

Technical Debt

Technical debt is the implied cost of future rework caused by choosing a quick, expedient solution now instead of a better approach that would take longer. Like financial debt, it accumulates interest: the longer it remains unaddressed, the more time and effort future changes require. Common sources include rushed features, skipped tests, outdated dependencies, and inconsistent architecture.