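Computer Use: Embedded MCP, App Approval, and the Desktop Control Protocol

Claude Code's Computer Use capability marks its evolution from a "code operator" into a "general-purpose computer agent". It is no longer confined to the shell: it can "see" the screen directly and drive the mouse and keyboard. The source code shows that this capability is backed by a tightly controlled embedded MCP architecture.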

What problem it solves

Computer Use is, in essence, a desktop control proxy exposed through an embedded MCP server (Embedded MCP Desktop Proxy).

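It is not implemented as a set of plain local built-in tools. Instead, it is presented as a dynamic MCP server named computer-use, through which the model sees a set of normalized tool names matching mcp__computer-use__*.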
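The naming scheme can be illustrated with a minimal sketch. Only the mcp__computer-use__ prefix comes from the text; the helper names (buildToolName, parseToolName, isComputerUseTool) and the individual tool names are assumptions made up for illustration, not the project's actual API.

```typescript
// Hypothetical sketch: compose and parse tool names of the form
// mcp__<server>__<tool>. The tool names below are illustrative only.
const SERVER_NAME = 'computer-use'

function buildToolName(server: string, tool: string): string {
  return `mcp__${server}__${tool}`
}

function parseToolName(
  fullName: string,
): { server: string; tool: string } | undefined {
  const prefix = 'mcp__'
  if (!fullName.startsWith(prefix)) return undefined
  const rest = fullName.slice(prefix.length)
  // The first double underscore after the prefix separates server from tool.
  const sep = rest.indexOf('__')
  if (sep <= 0 || sep + 2 >= rest.length) return undefined
  return { server: rest.slice(0, sep), tool: rest.slice(sep + 2) }
}

function isComputerUseTool(fullName: string): boolean {
  return parseToolName(fullName)?.server === SERVER_NAME
}

// Assumed tool names, used only to show the resulting shape.
const exampleTools = ['screenshot', 'left_click', 'type'].map(t =>
  buildToolName(SERVER_NAME, t),
)
```

A permission layer could use a check like isComputerUseTool to route desktop tools through app approval while leaving other MCP tools on their normal path.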

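The architecture rests on three layers of defense and execution logic:

Embedded MCP layer: wraps screenshots, clicks, keyboard input, and similar capabilities as standard MCP interfaces.
Session-level approval layer: decides which apps the current session may touch and manages temporary permission flags.
Global mutex and cleanup layer: ensures that only one Claude session controls the desktop at a time and restores the environment automatically once the operation ends.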

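What actually happens at runtime

Startup and execution of this capability are tightly constrained by platform permissions (macOS), feature gates, and a local file lock.

1. Dynamic MCP mounting

In claude-code-opensource/src/main.tsx, the system calls setupComputerUseMCP() only when three conditions hold: the platform is macOS, the session is interactive, and the relevant feature gate is enabled. After that, claude-code-opensource/src/services/mcp/client.ts intercepts requests addressed to the computer-use server and turns them into in-process calls, so the configured command is never spawned. The point of this "disguise" is to let the backend model recognize and use the desktop capability through the standard MCP discovery mechanism.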
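The gating and interception described above can be sketched as two small decisions. This is a simplified illustration under stated assumptions: the type and function names (StartupContext, shouldMountComputerUse, chooseTransport) are hypothetical, and the real checks in main.tsx and client.ts are likely more involved.

```typescript
// Hypothetical sketch: only the three conditions and the intercept-by-name
// behaviour come from the text; all identifiers here are invented.
type StartupContext = {
  platform: string // e.g. 'darwin', 'linux', 'win32'
  isInteractive: boolean
  gateEnabled: boolean
}

// Mount the computer-use server only when every condition holds.
function shouldMountComputerUse(ctx: StartupContext): boolean {
  return ctx.platform === 'darwin' && ctx.isInteractive && ctx.gateEnabled
}

type Transport = 'in-process' | 'stdio-subprocess'

// The server is registered with a stdio-style config so that normal MCP
// discovery finds it, but the client matches on the name and never spawns it.
function chooseTransport(serverName: string): Transport {
  return serverName === 'computer-use' ? 'in-process' : 'stdio-subprocess'
}

const macInteractive: StartupContext = {
  platform: 'darwin',
  isInteractive: true,
  gateEnabled: true,
}
const mountOnMac = shouldMountComputerUse(macInteractive)
const mountOnLinux = shouldMountComputerUse({
  ...macInteractive,
  platform: 'linux',
})
```

Keeping the mount decision as a pure predicate makes it easy to see that a single failing condition (wrong platform, non-interactive run, gate off) leaves the tool set entirely absent rather than present but disabled.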

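2. Session-level app approval (App Allowlist)

The first time you ask Claude to open an application, ComputerUseApproval.tsx is triggered.

System-level permissions: it first checks the macOS Accessibility and Screen Recording permissions; if either is missing, it guides the user straight to System Settings.
Temporary allowlist: apps the user approves are stored in the current session's allowedApps. The grant is temporary, valid only for the current session, and does not give Claude permanent control.
High-risk warning: for apps that can bypass isolation (such as a terminal or System Settings), the system attaches a "Sentinel Warning" to alert the user to the potential risk.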
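The approval rules above can be modelled as a small session-scoped structure. This is a minimal sketch, assuming apps are identified by display name; SessionAllowlist, describeRisk, and the contents of SENTINEL_APPS are hypothetical and do not reflect the actual component's data model.

```typescript
// Hypothetical sketch of a session-scoped allowlist with risk warnings.
// The app names in SENTINEL_APPS are assumed examples.
const SENTINEL_APPS = new Set(['Terminal', 'System Settings'])

// Returns a warning to show in the approval prompt, or undefined if the
// app is not considered able to bypass isolation.
function describeRisk(app: string): string | undefined {
  return SENTINEL_APPS.has(app)
    ? `${app} can bypass app-level isolation`
    : undefined
}

class SessionAllowlist {
  // Lives in memory only, so approvals vanish when the session ends.
  private readonly allowedApps = new Set<string>()

  approve(app: string): void {
    this.allowedApps.add(app)
  }

  isAllowed(app: string): boolean {
    return this.allowedApps.has(app)
  }
}

const sessionA = new SessionAllowlist()
sessionA.approve('Safari')

// A second session starts with an empty allowlist.
const sessionB = new SessionAllowlist()
```

Because the set is owned by the session object rather than written to disk, "temporary" here falls out of the data structure's lifetime instead of requiring an explicit expiry step.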

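3. Global mutex lock and emergency brake

To keep multiple Claude sessions from colliding, claude-code-opensource/src/utils/computerUse/computerUseLock.ts competes for the on-disk lock file ~/.claude/computer-use.lock. Once the lock is acquired, the system registers a global Esc hotkey. This is the user's "emergency brake": double-pressing Esc at any time forcibly stops Claude's desktop actions through a low-level hook.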
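The mutual exclusion can be sketched as follows. The ComputerUseLock shape matches the excerpt from computerUseLock.ts; everything else is an assumption. In particular, LockStore is an in-memory stand-in for the lock file so the sketch runs anywhere, and the stale-lock recovery via a process liveness check is a guess suggested by the pid field, not confirmed behaviour.

```typescript
// Lock shape as shown in the computerUseLock.ts excerpt.
type ComputerUseLock = {
  readonly sessionId: string
  readonly pid: number
  readonly acquiredAt: number
}

// Hypothetical stand-in for the lock file. A real implementation would
// create the file with an exclusive-create flag instead.
interface LockStore {
  createExclusive(contents: string): boolean // false if already present
  read(): string | undefined
  remove(): void
}

class InMemoryLockStore implements LockStore {
  private contents: string | undefined = undefined
  createExclusive(contents: string): boolean {
    if (this.contents !== undefined) return false
    this.contents = contents
    return true
  }
  read(): string | undefined {
    return this.contents
  }
  remove(): void {
    this.contents = undefined
  }
}

type AcquireResult = { kind: 'acquired' } | { kind: 'blocked'; by: string }

function tryAcquire(
  store: LockStore,
  me: ComputerUseLock,
  isProcessAlive: (pid: number) => boolean,
): AcquireResult {
  if (store.createExclusive(JSON.stringify(me))) return { kind: 'acquired' }
  const raw = store.read()
  if (raw !== undefined) {
    const holder = JSON.parse(raw) as ComputerUseLock
    // Re-acquiring our own lock is treated as success.
    if (holder.sessionId === me.sessionId) return { kind: 'acquired' }
    if (isProcessAlive(holder.pid)) {
      return { kind: 'blocked', by: holder.sessionId }
    }
  }
  // Assumed recovery path: the holder is gone, so clear and retry once.
  store.remove()
  return store.createExclusive(JSON.stringify(me))
    ? { kind: 'acquired' }
    : { kind: 'blocked', by: 'unknown' }
}

const lockStore = new InMemoryLockStore()
const sessionOne = { sessionId: 'one', pid: 101, acquiredAt: 0 }
const sessionTwo = { sessionId: 'two', pid: 202, acquiredAt: 1 }
const firstAttempt = tryAcquire(lockStore, sessionOne, () => true)
const secondAttempt = tryAcquire(lockStore, sessionTwo, () => true)
```

Here firstAttempt succeeds and secondAttempt is blocked by session "one", which corresponds to the "already in use" error a second session would see.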

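4. Visual loop and turn cleanup

The actual execution happens in claude-code-opensource/src/utils/computerUse/executor.ts. It captures screenshots through @ant/computer-use-swift and automatically excludes Claude's own terminal window to avoid interference. At the end of every interaction turn, cleanup.ts runs automatically: it unhides the non-allowlisted apps that were temporarily hidden to keep screenshots clean, releases the file lock, and unregisters the Esc hotkey.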
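The "log and continue" character of turn cleanup can be shown with a short sketch. It is deliberately synchronous and simplified: the real cleanup is asynchronous and bounded by a timeout, and the names here (CleanupStep, runCleanup, the step labels) are hypothetical.

```typescript
// Hypothetical sketch of best-effort cleanup: every step runs even if an
// earlier one fails, and failures are collected instead of thrown.
type CleanupStep = { label: string; run: () => void }

function runCleanup(steps: CleanupStep[]): string[] {
  const failures: string[] = []
  for (const step of steps) {
    try {
      step.run()
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err)
      failures.push(`${step.label}: ${reason}`)
    }
  }
  return failures
}

const cleanupLog: string[] = []
const cleanupFailures = runCleanup([
  {
    label: 'unhide-apps',
    run: () => {
      // Simulated failure to show that later steps still run.
      throw new Error('unhide timed out')
    },
  },
  { label: 'release-lock', run: () => cleanupLog.push('lock released') },
  {
    label: 'unregister-esc-hotkey',
    run: () => cleanupLog.push('hotkey unregistered'),
  },
])
```

The design choice worth noting is that a failed unhide must never prevent the lock from being released, otherwise one stuck window could block every later session from using the desktop.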

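Key constraints in use

No physical isolation: Computer Use operates on your real desktop environment, not a sandbox. Its safety depends entirely on the app allowlist and TCC (the macOS system permission controls).
Single-session exclusivity: because there is only one physical desktop, the source code enforces single-session mutual exclusion with a file lock. A second session that tries to start desktop control receives an "already in use" error.
High overhead and latency: every "observe-act" cycle involves uploading a high-resolution screenshot and consuming tokens, so Computer Use is usually positioned as a "last resort" for tasks that only a GUI can handle.
Environment self-healing: while working, Claude hides some apps to reduce visual noise. The cleanup mechanism makes a best-effort attempt to restore their visibility, even when the run is aborted or exits abnormally.

Source anchors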

claude-code-opensource/src/main.tsx: the master switch for Computer Use and the dynamic MCP mounting logic.

Excerpt: src/main.tsx, lines 1479-1482 of 4684

tsx

            isComputerUseMCPServer,
            COMPUTER_USE_MCP_SERVER_NAME
          } = await import('src/utils/computerUse/common.js');
          if (nonSdkConfigNames.some(isComputerUseMCPServer)) {

claude-code-opensource/src/utils/computerUse/setup.ts: wraps the desktop capabilities as the mcp__computer-use__* tool set.

Excerpt: src/utils/computerUse/setup.ts, lines 23-52 of 54

typescript

export function setupComputerUseMCP(): {
  mcpConfig: Record<string, ScopedMcpServerConfig>
  allowedTools: string[]
} {
  const allowedTools = buildComputerUseTools(
    CLI_CU_CAPABILITIES,
    getChicagoCoordinateMode(),
  ).map(t => buildMcpToolName(COMPUTER_USE_MCP_SERVER_NAME, t.name))

  // command/args are never spawned — client.ts intercepts by name and
  // uses the in-process server. The config just needs to exist with
  // type 'stdio' to hit the right branch. Mirrors Chrome's setup.
  const args = isInBundledMode()
    ? ['--computer-use-mcp']
    : [
        join(fileURLToPath(import.meta.url), '..', 'cli.js'),
        '--computer-use-mcp',
      ]

  return {
    mcpConfig: {
      [COMPUTER_USE_MCP_SERVER_NAME]: {
        type: 'stdio',
        command: process.execPath,
        args,
        scope: 'dynamic',
      } as const,
    },
    allowedTools,
  }

claude-code-opensource/src/utils/computerUse/mcpServer.ts: the built-in MCP server implementation, including app name enumeration.

Excerpt: src/utils/computerUse/mcpServer.ts, line 39 of 107

typescript

      `[Computer Use MCP] app enumeration exceeded ${APP_ENUM_TIMEOUT_MS}ms or failed; tool description omits list`,

claude-code-opensource/src/utils/computerUse/computerUseLock.ts: the single-session mutual exclusion mechanism built on an on-disk lock file.

Excerpt: src/utils/computerUse/computerUseLock.ts, lines 10-20 of 216

typescript

const LOCK_FILENAME = 'computer-use.lock'

// Holds the unregister function for the shutdown cleanup handler.
// Set when the lock is acquired, cleared when released.
let unregisterCleanup: (() => void) | undefined

type ComputerUseLock = {
  readonly sessionId: string
  readonly pid: number
  readonly acquiredAt: number
}

claude-code-opensource/src/utils/computerUse/executor.ts: the core hub for visual sampling and input execution, handling coordinate alignment and the terminal exemption.

Excerpt: src/utils/computerUse/executor.ts, lines 57-68 of 659

typescript

const SCREENSHOT_JPEG_QUALITY = 0.75

/** Logical → physical → API target dims. See `targetImageSize` + COORDINATES.md. */
function computeTargetDims(
  logicalW: number,
  logicalH: number,
  scaleFactor: number,
): [number, number] {
  const physW = Math.round(logicalW * scaleFactor)
  const physH = Math.round(logicalH * scaleFactor)
  return targetImageSize(physW, physH, API_RESIZE_PARAMS)
}
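The logical-to-physical-to-target chain in computeTargetDims can be worked through with concrete numbers. This is a sketch under loud assumptions: fitLongEdge is a hypothetical stand-in for targetImageSize (whose real rule and API_RESIZE_PARAMS are not shown here), the 1280-pixel cap is an arbitrary value, and the proportional mapping back to logical points is a guess at how such coordinates are typically aligned.

```typescript
// Hypothetical stand-in for targetImageSize: cap the longer edge and keep
// the aspect ratio. The real resize rule may differ.
function fitLongEdge(
  w: number,
  h: number,
  maxLongEdge: number,
): [number, number] {
  const longEdge = Math.max(w, h)
  if (longEdge <= maxLongEdge) return [w, h]
  const ratio = maxLongEdge / longEdge
  return [Math.round(w * ratio), Math.round(h * ratio)]
}

// Same shape as computeTargetDims, with the resize rule passed explicitly.
function toTargetDims(
  logicalW: number,
  logicalH: number,
  scaleFactor: number,
  maxLongEdge: number,
): [number, number] {
  const physW = Math.round(logicalW * scaleFactor)
  const physH = Math.round(logicalH * scaleFactor)
  return fitLongEdge(physW, physH, maxLongEdge)
}

// Assumed inverse mapping: scale a point in screenshot pixels back to
// logical display points before issuing a click.
function imageToLogical(
  x: number,
  y: number,
  imageDims: [number, number],
  logicalDims: [number, number],
): [number, number] {
  return [
    (x / imageDims[0]) * logicalDims[0],
    (y / imageDims[1]) * logicalDims[1],
  ]
}

// 1440x900 logical points at 2x scale is 2880x1800 physical pixels,
// which the assumed 1280 cap reduces to 1280x800.
const targetDims = toTargetDims(1440, 900, 2, 1280)
// The centre of the screenshot maps back to the centre of the display.
const clickPoint = imageToLogical(640, 400, targetDims, [1440, 900])
```

With these inputs targetDims is [1280, 800] and clickPoint is [720, 450], which shows why the executor has to track all three coordinate spaces: a click taken verbatim from screenshot pixels would land in the wrong place on a Retina display.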

claude-code-opensource/src/utils/computerUse/cleanup.ts: the end-of-turn environment restoration logic (unhide, lock release, hotkey unregistration).

Excerpt: src/utils/computerUse/cleanup.ts, lines 15-44 of 87

typescript

const UNHIDE_TIMEOUT_MS = 5000

/**
 * Turn-end cleanup for the chicago MCP surface: auto-unhide apps that
 * `prepareForAction` hid, then release the file-based lock.
 *
 * Called from three sites: natural turn end (`stopHooks.ts`), abort during
 * streaming (`query.ts` aborted_streaming), abort during tool execution
 * (`query.ts` aborted_tools). All three reach this via dynamic import gated
 * on `feature('CHICAGO_MCP')`. `executor.js` (which pulls both native
 * modules) is dynamic-imported below so non-CU turns don't load native
 * modules just to no-op.
 *
 * No-ops cheaply on non-CU turns: both gate checks are zero-syscall.
 */
export async function cleanupComputerUseAfterTurn(
  ctx: Pick<
    ToolUseContext,
    'getAppState' | 'setAppState' | 'sendOSNotification'
  >,
): Promise<void> {
  const appState = ctx.getAppState()

  const hidden = appState.computerUseMcpState?.hiddenDuringTurn
  if (hidden && hidden.size > 0) {
    const { unhideComputerUseApps } = await import('./executor.js')
    const unhide = unhideComputerUseApps([...hidden]).catch(err =>
      logForDebugging(
        `[Computer Use MCP] auto-unhide failed: ${errorMessage(err)}`,
      ),

claude-code-opensource/src/components/permissions/ComputerUseApproval/: the app approval UI and the TCC permission guidance logic.
