Lilian Weng: Agent = LLM (Brain) + Planning + Memory + Tools

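This is a hardcore and fun topic. **Claude Code** is Anthropic's terminal-based agent tool. Beyond writing code, it can run terminal commands, manage files, and correct its own mistakes.

To understand and replicate it, we need to dig into the core of an **Agent**.

This article has three main parts, followed by a walkthrough of a sample run and a look at what separates the demo from a production tool:
1.  **Deconstructing the Agent**: the concept and the core formula.
2.  **The core pattern**: the ReAct loop and tool calling.
3.  **Hands-on build**: writing a "Mini Claude Code" from scratch in Python.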

---

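### Part 1: The Agent Concept, Deconstructed

Why isn't a chatbot an agent?
*   **Chatbot (a plain chat interface such as the ChatGPT web app)**: passive. You ask, it answers. In its basic form it runs in a vacuum and cannot affect the outside world.
*   **Agent**: active. It has "hands" and "eyes". It can perceive its environment (read files, browse web pages), make a plan, and take actions (run code, send email) that change that environment.

#### The Core Formula
AI researcher Lilian Weng described LLM-powered agents with a decomposition that is often written as the formula below, and most agent designs build on it:

$$ \text{Agent} = \text{LLM (brain)} + \text{Planning} + \text{Memory} + \text{Tools} $$

1.  **LLM (brain)**: does the reasoning. It decides "what should I do next".
2.  **Tools**: extend what the LLM can do. Examples: running shell commands, reading and writing files, searching the web.
3.  **Planning**:
    *   **Decomposition**: break "write a Snake game" into "create the files", "write the logic", and "test it".
    *   **Reflection**: when the code fails, analyze the error log and self-correct.
4.  **Memory**: remember the results of earlier operations (for example, which files an earlier `ls` listed).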

---

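### Part 2: ReAct, the Core Operating Pattern

To build a tool like Claude Code, the key pattern is **ReAct (Reason + Act)**.

Its control flow is a loop that repeats until the task is complete:

1.  **Thought**: the user asked me to fix a bug, so I should look at the code first.
2.  **Action**: call the tool `read_file('main.py')`.
3.  **Observation**: (the tool returned the file contents) there is a syntax error on line 10.
4.  **Thought (again)**: I found the error, so now I need to fix it.
5.  **Action (again)**: call the tool `write_file('main.py', new_content)`.
6.  **Observation**: the write succeeded.
7.  **Final Answer**: the bug is fixed.

**That is the essence of Claude Code: a ReAct loop with file system and terminal access.**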
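
The seven steps above can be sketched without any API at all. The block below is a minimal illustration, not how Claude Code is implemented: `scripted_model` is a hypothetical stand-in for the LLM, and `FILES` is an in-memory dictionary standing in for the file system, so running it has no side effects.

```python
from typing import Callable

# Hypothetical in-memory "file system" so the sketch touches nothing on disk.
FILES = {"main.py": "print('hello'\n"}  # note the missing closing parenthesis

def read_file(path: str) -> str:
    return FILES.get(path, f"error: {path} not found")

def write_file(path: str, content: str) -> str:
    FILES[path] = content
    return f"wrote {len(content)} characters to {path}"

TOOLS: dict[str, Callable[..., str]] = {"read_file": read_file, "write_file": write_file}

def scripted_model(history: list[dict]) -> dict:
    """Stand-in for an LLM: chooses the next step from the number of observations so far."""
    observations = sum(1 for item in history if item["type"] == "observation")
    if observations == 0:
        return {"thought": "I should read the code first.",
                "action": ("read_file", {"path": "main.py"})}
    if observations == 1:
        return {"thought": "The closing parenthesis is missing; rewrite the file.",
                "action": ("write_file", {"path": "main.py", "content": "print('hello')\n"})}
    return {"thought": "The fix is written.", "final": "Bug fixed."}

def react_loop(task: str, max_steps: int = 5) -> str:
    history = [{"type": "task", "content": task}]
    for _ in range(max_steps):
        step = scripted_model(history)                      # Thought
        history.append({"type": "thought", "content": step["thought"]})
        if "final" in step:                                 # Final Answer
            return step["final"]
        name, kwargs = step["action"]                       # Action
        observation = TOOLS[name](**kwargs)
        history.append({"type": "observation", "content": observation})  # Observation
    return "stopped: step budget exhausted"

answer = react_loop("Fix the bug in main.py")
print(answer)
print(FILES["main.py"])
```

Swapping `scripted_model` for a real model call and `FILES` for real file and shell access is the whole difference between this toy and the build in the next part.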

---

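### Part 3: Building a "Mini Claude Code" from Scratch

Instead of a heavyweight framework such as LangChain, we will use **plain Python plus an OpenAI-format API** (any provider that exposes an OpenAI-compatible endpoint, such as OpenAI or DeepSeek), so that every line of the underlying logic stays visible.

#### 1. Define the Tools

Claude Code's core capability is operating on files and the terminal. We need to define two all-powerful "hand of God" functions.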

```python
import subprocess

# Tool 1: run a terminal command (these are the agent's "hands")
def run_shell_command(command):
    print(f"\n[system exec]: {command}")
    try:
        # ⚠️ WARNING: running model-generated commands with shell=True is dangerous.
        # Outside a demo, run this inside a sandbox (e.g. Docker or E2B).
        result = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=10
        )
        output = result.stdout + result.stderr
        return output if output.strip() else "(no output)"
    except subprocess.TimeoutExpired:
        return "Execution failed: command timed out after 10 seconds"
    except Exception as e:
        return f"Execution failed: {e}"

# Tool 2: write a file
def write_to_file(filename, content):
    print(f"\n[system write]: {filename}")
    try:
        with open(filename, 'w', encoding='utf-8') as f:
            f.write(content)
        return f"Successfully wrote file: {filename}"
    except Exception as e:
        return f"Write failed: {e}"

# Tool descriptions (schema) that tell the LLM which tools are available
tools_schema = [
    {
        "type": "function",
        "function": {
            "name": "run_shell_command",
            "description": "Run a shell command in the terminal. Use it to list directories, read files (cat), run code, and so on.",
            "parameters": {
                "type": "object",
                "properties": {
                    "command": {"type": "string", "description": "The command to run, e.g. 'ls -la' or 'python app.py'"}
                },
                "required": ["command"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "write_to_file",
            "description": "Create a file or overwrite its contents.",
            "parameters": {
                "type": "object",
                "properties": {
                    "filename": {"type": "string", "description": "File name"},
                    "content": {"type": "string", "description": "File contents"},
                },
                "required": ["filename", "content"],
            },
        },
    },
]
```
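
Hand-writing a schema entry for every tool gets repetitive as the tool list grows. One possible shortcut, sketched here under stated assumptions, is to derive the entry from the function's signature and docstring. `build_tool_schema`, `JSON_TYPES`, and `list_directory` are hypothetical names introduced for this example, and the type mapping only covers a few simple annotations.

```python
import inspect

# Assumed mapping from Python annotations to JSON Schema type names.
# Anything not listed here falls back to "string".
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def build_tool_schema(func) -> dict:
    """Build a function-calling schema entry from a signature and docstring."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": JSON_TYPES.get(param.annotation, "string")}
        # Parameters without a default value are treated as required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": inspect.getdoc(func) or "",
            "parameters": {"type": "object", "properties": properties, "required": required},
        },
    }

# Hypothetical tool used only to demonstrate the helper.
def list_directory(path: str, show_hidden: bool = False) -> str:
    """List the entries of a directory."""
    return ""

schema = build_tool_schema(list_directory)
print(schema["function"]["parameters"])
```

The trade-off is that per-parameter descriptions are lost, and those descriptions often help the model pick the right arguments, so the explicit schema above remains the clearer choice for a two-tool demo.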

#### 2. Build the Brain

We need a class that manages the conversation history and handles the model's requests to call a tool.

```python
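import json
from openai import OpenAI

# Any provider with an OpenAI-compatible endpoint works here (OpenAI, DeepSeek, ...).
# Replace the placeholders with your own key and base URL.
client = OpenAI(api_key="YOUR_API_KEY", base_url="YOUR_API_BASE_URL")

class MiniClaudeCode:
    def __init__(self):
        # The system prompt gives the agent its persona
        self.messages = [{
            "role": "system",
            "content": "You are a senior programming assistant. You can run terminal commands and manage files. When the user asks you to write code, write it to a file and try to run it. If it fails, read the error and fix it."
        }]

    def chat(self, user_input):
        self.messages.append({"role": "user", "content": user_input})

        # This is a loop because the agent may need several actions in a row,
        # e.g. ls to list files -> cat to read one -> write to change the code
        while True:
            # 1. Call the model
            response = client.chat.completions.create(
                model="gpt-4o",  # or deepseek-chat
                messages=self.messages,
                tools=tools_schema,
                tool_choice="auto"  # let the model decide whether to use a tool
            )

            message = response.choices[0].message
            self.messages.append(message)  # add the model's reply to memory

            # 2. Check whether the model wants to call a tool
            if message.tool_calls:
                # 3. Execute the tools (Action)
                for tool_call in message.tool_calls:
                    func_name = tool_call.function.name
                    try:
                        args = json.loads(tool_call.function.arguments)
                        if func_name == "run_shell_command":
                            result = run_shell_command(args['command'])
                        elif func_name == "write_to_file":
                            result = write_to_file(args['filename'], args['content'])
                        else:
                            result = f"Unknown tool: {func_name}"
                    except (json.JSONDecodeError, KeyError) as e:
                        # Report bad arguments to the model instead of crashing;
                        # every tool call must still receive a tool message.
                        result = f"Invalid tool arguments: {e}"

                    # 4. Feed the result back to the model (Observation)
                    # tool_call_id is required so the model can match result to call
                    self.messages.append({
                        "role": "tool",
                        "tool_call_id": tool_call.id,
                        "content": result
                    })
                    preview = result if len(result) <= 100 else result[:100] + "..."
                    print(f"  └── [observation]: {preview}")

                # The loop continues; the model thinks again with the new results
            else:
                # No tool call means the model considers the task done
                # and has produced its final answer
                return message.content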
```
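
With two tools an `if`/`elif` chain is fine, but it grows with every tool added. A common alternative is a registry that maps tool names to functions. The sketch below is one way to do that under assumptions: `TOOL_REGISTRY`, `dispatch_tool_call`, and the `echo` tool are hypothetical names for this example, and every failure is returned as a string so the model can read it as an observation and retry.

```python
import json

# Hypothetical tool used only to exercise the dispatcher.
def echo(text: str) -> str:
    return text

# Registry mapping the tool name the model sees to the Python function behind it.
TOOL_REGISTRY = {"echo": echo}

def dispatch_tool_call(name: str, raw_arguments: str) -> str:
    """Run one tool call and always return a string observation."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        return f"error: unknown tool '{name}'"
    try:
        arguments = json.loads(raw_arguments)
    except json.JSONDecodeError as exc:
        return f"error: arguments are not valid JSON ({exc.msg})"
    if not isinstance(arguments, dict):
        return "error: arguments must be a JSON object"
    try:
        return str(handler(**arguments))
    except TypeError as exc:
        # Raised for missing or unexpected keyword arguments
        # (and also for a TypeError thrown inside the tool itself).
        return f"error: bad arguments for '{name}': {exc}"

print(dispatch_tool_call("echo", '{"text": "hi"}'))
print(dispatch_tool_call("delete_everything", "{}"))
```

Adding a tool then means registering one function, and the loop body shrinks to a single `dispatch_tool_call(...)` line per tool call.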

#### 3. Start the Engine (The Loop)

Now we put the pieces together and start an interactive command-line interface.

```python
def main():
    agent = MiniClaudeCode()
    print("🤖 Mini Claude Code started (type 'exit' to quit)")
    print("---------------------------------------------")

    while True:
        try:
            user_input = input("\n> ")
            if user_input.lower() in ['exit', 'quit']:
                break

            print("Thinking...", end="", flush=True)
            response = agent.chat(user_input)
            print(f"\n🤖 Agent: {response}")

        except (KeyboardInterrupt, EOFError):
            # Ctrl+C or end of input (Ctrl+D) exits cleanly
            print("\nGoodbye!")
            break
        except Exception as e:
            print(f"Error: {e}")

if __name__ == "__main__":
    main()
```

---

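### Part 4: A Sample Run (Simulation)

Suppose you run the code above and type:
`> Please write a Python script hello.py that prints the current time, then run it.`

**Internally, Mini Claude Code proceeds as follows:**

1.  **Loop 1 (Thought)**:
    *   The model receives the request.
    *   The model decides to call the tool `write_to_file`.
    *   Arguments: `filename="hello.py"`, `content="import datetime; print(datetime.datetime.now())"`
2.  **Loop 1 (Action)**:
    *   The Python code executes `write_to_file`.
    *   The file `hello.py` is created.
    *   Returned result: `"Successfully wrote file: hello.py"`.
3.  **Loop 2 (Thought again)**:
    *   The model receives the "successfully wrote" message.
    *   The model decides on the next step: run the script.
    *   The model decides to call the tool `run_shell_command`.
    *   Arguments: `command="python hello.py"`.
4.  **Loop 2 (Action)**:
    *   The Python code executes `subprocess.run("python hello.py"...)`.
    *   Returned result: `"2025-12-09 18:30:00.123456"`.
5.  **Loop 3 (Wrap-up)**:
    *   The model receives the time output.
    *   The model considers the task finished.
    *   Output text: "I created and ran the script. The current time is 2025-12-09...".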
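
It can help to see what the agent's memory holds after this run. The block below is an illustrative reconstruction of the message list in the OpenAI chat-completions shape. The call IDs (`call_1`, `call_2`) and the result strings are made up for the example, and `unanswered_tool_calls` is a hypothetical helper that checks the invariant the loop relies on: every tool call is followed by a `tool` message carrying the same ID.

```python
import json

# Illustrative history for the run above. IDs and result strings are invented.
history = [
    {"role": "system", "content": "You are a senior programming assistant."},
    {"role": "user", "content": "Write hello.py that prints the current time, then run it."},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {
                "name": "write_to_file",
                # Arguments travel as a JSON string, not as a dict.
                "arguments": json.dumps({
                    "filename": "hello.py",
                    "content": "import datetime; print(datetime.datetime.now())",
                }),
            },
        }],
    },
    {"role": "tool", "tool_call_id": "call_1", "content": "Successfully wrote file: hello.py"},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_2",
            "type": "function",
            "function": {
                "name": "run_shell_command",
                "arguments": json.dumps({"command": "python hello.py"}),
            },
        }],
    },
    {"role": "tool", "tool_call_id": "call_2", "content": "2025-12-09 18:30:00.123456\n"},
    {"role": "assistant", "content": "I created and ran the script."},
]

def unanswered_tool_calls(messages):
    """Return the IDs of tool calls that have no matching 'tool' message."""
    requested = [call["id"] for m in messages for call in m.get("tool_calls", [])]
    answered = {m["tool_call_id"] for m in messages if m["role"] == "tool"}
    return [call_id for call_id in requested if call_id not in answered]

roles = [m["role"] for m in history]
print(roles)
print(unanswered_tool_calls(history))
```

Three model calls produce three assistant messages: two that request a tool and one that carries the final text.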

---

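### Part 5: The Gap Between the Demo and a Production-Grade Claude Code

The code above is under 150 lines. It demonstrates the principle, but turning it into a productivity tool means solving several larger problems:

1.  **Safety**:
    *   *Problem*: if the model decides to run `rm -rf /`, whether because the user asked or because it misread the task, the code above really will execute it.
    *   *Solution*: **isolate and gate execution**. Run the agent in a disposable Docker container or microVM (such as a Firecracker MicroVM) that is destroyed after use, a common setup for hosted agent products. A tool that runs directly on the user's machine, as Claude Code does, needs a permission prompt before it runs commands or edits files.

2.  **Context Management**:
    *   *Problem*: `cat` a 10 MB log file and the output will blow past the model's context window.
    *   *Solution*: add a truncation mechanism. Read only the first 100 lines of a file, or use RAG (vector retrieval) to find the relevant parts instead of reading the whole file.

3.  **Human-in-the-loop**:
    *   *Problem*: the agent can get stuck in an endless loop of errors and retries.
    *   *Solution*: before a high-risk command, or after several consecutive retries, force a pause and ask the user whether to continue.

4.  **Streaming**:
    *   So that users are not left waiting, stream the output token by token, like the typewriter effect in ChatGPT.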
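
The first three items can each start as a few lines of plain Python. The block below is a rough sketch under assumptions, not a complete defense: `RISKY_PATTERNS` is a hypothetical and deliberately short deny list, and `needs_confirmation`, `truncate_output`, and `StepBudget` are names introduced for this example.

```python
import re

# Hypothetical deny list. A real one would need to be far more thorough,
# and pattern matching is no substitute for a sandbox.
RISKY_PATTERNS = [r"\brm\s+-[a-zA-Z]*r", r"\bsudo\b", r"\bmkfs\b"]

def needs_confirmation(command: str) -> bool:
    """Safety: flag commands that should pause for user approval."""
    return any(re.search(pattern, command) for pattern in RISKY_PATTERNS)

def truncate_output(text: str, max_chars: int = 2000) -> str:
    """Context management: keep the head and tail of long tool output."""
    if len(text) <= max_chars:
        return text
    half = max_chars // 2
    omitted = len(text) - 2 * half
    return f"{text[:half]}\n...[{omitted} characters omitted]...\n{text[-half:]}"

class StepBudget:
    """Human-in-the-loop: cap the number of tool calls before pausing."""
    def __init__(self, max_steps: int = 20):
        self.remaining = max_steps

    def spend(self) -> bool:
        """Return True while steps remain, False once the budget is used up."""
        if self.remaining <= 0:
            return False
        self.remaining -= 1
        return True

print(needs_confirmation("rm -rf /"))         # flagged
print(needs_confirmation("python hello.py"))  # not flagged
print(len(truncate_output("a" * 5000, max_chars=100)))
```

In the agent loop, `needs_confirmation` would be checked before `run_shell_command`, `truncate_output` would wrap every tool result before it is appended to the history, and `spend()` returning `False` would be the cue to stop and ask the user whether to continue.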

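### Summary

Building an agent requires no black magic. At its core it is:
**a loop (`while` loop)** + **a capable LLM** + **functions exposed through an API (tools)**.

Once you understand this, you can not only replicate the core of Claude Code but also build agents for your own business: a SQL agent (querying databases), an ops agent (reading logs), or even an office agent (working with Excel and sending email).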
