refactor(agent-runner): remove host context windowing

feat(agent-runner): normalize binding config boundaries
fix: enforce agent run API permissions
2026-06-02 12:05:54 +00:00 · 2026-06-02 17:01:45 +08:00 · 2026-06-02 15:40:57 +08:00 · 2026-05-30 20:14:06 +08:00 · 2026-05-30 09:48:27 +08:00 · 2026-05-30 09:10:51 +08:00
275 changed files with 26425 additions and 24410 deletions
--- a/.github/workflows/run-tests.yml
+++ b/.github/workflows/run-tests.yml
@@ -15,10 +15,14 @@ on:
    branches:
      - master
      - develop
-      - 'feat/**'
-    # No path filter on push: every push to the branches above runs the
-    # full unit-test suite. feat/** branches in particular must be tested
-    # on every push (they accumulate large changes before a PR exists).
+    paths:
+      - 'src/langbot/**'
+      - 'tests/**'
+      - '.github/workflows/run-tests.yml'
+      - 'pyproject.toml'
+      - 'uv.lock'
+      - 'run_tests.sh'
+      - 'scripts/test-*.sh'

 jobs:
  test:
--- a/README.md
+++ b/README.md
@@ -47,8 +47,6 @@ LangBot is an **open-source, production-grade platform** for building AI-powered

 [→ Learn more about all features](https://link.langbot.app/en/docs/features)

-📍 Practical guides: [deploy a multi-platform AI bot in 5 minutes](https://blog.langbot.app/en/blog/deploy-ai-bot-in-5-minutes/), [connect DeepSeek to WeChat, Discord, and Telegram](https://blog.langbot.app/en/blog/connect-deepseek-to-wechat/), [run a Dify Agent in Discord, Telegram, and Slack](https://blog.langbot.app/en/blog/dify-agent-discord-telegram-slack/), and [build an n8n-powered chatbot](https://blog.langbot.app/en/blog/n8n-multi-platform-ai-chatbot/).
-
 ---

 ## Quick Start
--- a/README_CN.md
+++ b/README_CN.md
@@ -25,7 +25,7 @@
 <a href="https://link.langbot.app/zh/docs/guide">文档</a> ｜
 <a href="https://link.langbot.app/zh/docs/api">API</a> ｜
 <a href="https://space.langbot.app/cloud">Cloud</a> ｜
-<a href="https://space.langbot.app">扩展市场</a> ｜
+<a href="https://space.langbot.app">插件市场</a> ｜
 <a href="https://langbot.featurebase.app/roadmap">路线图</a>

 </div>
@@ -47,8 +47,6 @@ LangBot 是一个**开源的生产级平台**，用于构建 AI 驱动的即时

 [→ 了解更多功能特性](https://link.langbot.app/zh/docs/features)

-📍 实践指南：[5 分钟部署多平台 AI 机器人](https://blog.langbot.app/zh/blog/deploy-ai-bot-in-5-minutes/)、[将 DeepSeek 接入微信、企业微信与 Discord](https://blog.langbot.app/zh/blog/connect-deepseek-to-wechat/)、[让 Dify Agent 跑在 Discord、Telegram 和 Slack 上](https://blog.langbot.app/zh/blog/dify-agent-discord-telegram-slack/)，以及[用 n8n 构建多平台 AI 聊天机器人](https://blog.langbot.app/zh/blog/n8n-multi-platform-ai-chatbot/)。
-
 ---

 ## 快速开始
--- a/README_ES.md
+++ b/README_ES.md
@@ -46,8 +46,6 @@ LangBot es una **plataforma de código abierto y grado de producción** para con

 [→ Conocer más sobre todas las funcionalidades](https://link.langbot.app/en/docs/features)

-📍 Guías prácticas: [desplegar un bot de IA multiplataforma en 5 minutos](https://blog.langbot.app/en/blog/deploy-ai-bot-in-5-minutes/), [conectar DeepSeek a WeChat, Discord y Telegram](https://blog.langbot.app/en/blog/connect-deepseek-to-wechat/), [ejecutar un Dify Agent en Discord, Telegram y Slack](https://blog.langbot.app/en/blog/dify-agent-discord-telegram-slack/) y [crear un chatbot con n8n](https://blog.langbot.app/en/blog/n8n-multi-platform-ai-chatbot/).
-
 ---

 ## Inicio Rápido
--- a/README_FR.md
+++ b/README_FR.md
@@ -46,8 +46,6 @@ LangBot est une **plateforme open-source de niveau production** pour créer des

 [→ En savoir plus sur toutes les fonctionnalités](https://link.langbot.app/en/docs/features)

-📍 Guides pratiques : [déployer un bot IA multiplateforme en 5 minutes](https://blog.langbot.app/en/blog/deploy-ai-bot-in-5-minutes/), [connecter DeepSeek à WeChat, Discord et Telegram](https://blog.langbot.app/en/blog/connect-deepseek-to-wechat/), [exécuter un Dify Agent dans Discord, Telegram et Slack](https://blog.langbot.app/en/blog/dify-agent-discord-telegram-slack/) et [créer un chatbot avec n8n](https://blog.langbot.app/en/blog/n8n-multi-platform-ai-chatbot/).
-
 ---

 ## Démarrage Rapide
--- a/README_JP.md
+++ b/README_JP.md
@@ -46,8 +46,6 @@ LangBot は、AI搭載のインスタントメッセージングボットを構

 [→ すべての機能について詳しく見る](https://link.langbot.app/ja/docs/features)

-📍 実践ガイド: [5分でマルチプラットフォームAIボットをデプロイ](https://blog.langbot.app/en/blog/deploy-ai-bot-in-5-minutes/)、[DeepSeekをWeChat・Discord・Telegramに接続](https://blog.langbot.app/en/blog/connect-deepseek-to-wechat/)、[Dify AgentをDiscord・Telegram・Slackで動かす](https://blog.langbot.app/en/blog/dify-agent-discord-telegram-slack/)、[n8n連携チャットボットを構築](https://blog.langbot.app/en/blog/n8n-multi-platform-ai-chatbot/)。
-
 ---

 ## クイックスタート
--- a/README_KO.md
+++ b/README_KO.md
@@ -46,8 +46,6 @@ LangBot은 AI 기반 인스턴트 메시징 봇을 구축하기 위한 **오픈

 [→ 모든 기능 자세히 보기](https://link.langbot.app/en/docs/features)

-📍 실전 가이드: [5분 만에 멀티 플랫폼 AI 봇 배포하기](https://blog.langbot.app/en/blog/deploy-ai-bot-in-5-minutes/), [DeepSeek를 WeChat, Discord, Telegram에 연결하기](https://blog.langbot.app/en/blog/connect-deepseek-to-wechat/), [Dify Agent를 Discord, Telegram, Slack에서 실행하기](https://blog.langbot.app/en/blog/dify-agent-discord-telegram-slack/), [n8n 기반 챗봇 만들기](https://blog.langbot.app/en/blog/n8n-multi-platform-ai-chatbot/).
-
 ---

 ## 빠른 시작
--- a/README_RU.md
+++ b/README_RU.md
@@ -46,8 +46,6 @@ LangBot — это **платформа с открытым исходным к

 [→ Подробнее обо всех возможностях](https://link.langbot.app/en/docs/features)

-📍 Практические руководства: [развернуть мультиплатформенного ИИ-бота за 5 минут](https://blog.langbot.app/en/blog/deploy-ai-bot-in-5-minutes/), [подключить DeepSeek к WeChat, Discord и Telegram](https://blog.langbot.app/en/blog/connect-deepseek-to-wechat/), [запустить Dify Agent в Discord, Telegram и Slack](https://blog.langbot.app/en/blog/dify-agent-discord-telegram-slack/) и [создать чат-бота на n8n](https://blog.langbot.app/en/blog/n8n-multi-platform-ai-chatbot/).
-
 ---

 ## Быстрый старт
--- a/README_TW.md
+++ b/README_TW.md
@@ -48,8 +48,6 @@ LangBot 是一個**開源的生產級平台**，用於建構 AI 驅動的即時

 [→ 了解更多功能特性](https://link.langbot.app/zh/docs/features)

-📍 實踐指南：[5 分鐘部署多平台 AI 機器人](https://blog.langbot.app/zh/blog/deploy-ai-bot-in-5-minutes/)、[將 DeepSeek 接入微信、企業微信與 Discord](https://blog.langbot.app/zh/blog/connect-deepseek-to-wechat/)、[讓 Dify Agent 跑在 Discord、Telegram 和 Slack 上](https://blog.langbot.app/zh/blog/dify-agent-discord-telegram-slack/)，以及[用 n8n 建構多平台 AI 聊天機器人](https://blog.langbot.app/zh/blog/n8n-multi-platform-ai-chatbot/)。
-
 ---

 ## 快速開始
--- a/README_VI.md
+++ b/README_VI.md
@@ -46,8 +46,6 @@ LangBot là một **nền tảng mã nguồn mở, cấp sản xuất** để x

 [→ Tìm hiểu thêm về tất cả tính năng](https://link.langbot.app/en/docs/features)

-📍 Hướng dẫn thực hành: [triển khai bot AI đa nền tảng trong 5 phút](https://blog.langbot.app/en/blog/deploy-ai-bot-in-5-minutes/), [kết nối DeepSeek với WeChat, Discord và Telegram](https://blog.langbot.app/en/blog/connect-deepseek-to-wechat/), [chạy Dify Agent trên Discord, Telegram và Slack](https://blog.langbot.app/en/blog/dify-agent-discord-telegram-slack/) và [xây dựng chatbot với n8n](https://blog.langbot.app/en/blog/n8n-multi-platform-ai-chatbot/).
-
 ---

 ## Bắt đầu nhanh
--- a/docker/docker-compose.yaml
+++ b/docker/docker-compose.yaml
@@ -18,40 +18,6 @@ services:
    networks:
      - langbot_network

-  # The Box sandbox runtime is optional. It is only started when you run
-  # ``docker compose --profile box up`` (or ``docker compose --profile all
-  # up``). With Box off, LangBot keeps the dashboard / skills list visible
-  # (read-only) but disables sandbox tools, skill add/edit and stdio MCP —
-  # set ``box.enabled: false`` in ``data/config.yaml`` (or
-  # ``BOX__ENABLED=false`` in the langbot service env below) to match.
-  langbot_box:
-    image: rockchin/langbot:latest
-    container_name: langbot_box
-    profiles: ["box", "all"]
-    volumes:
-      # Keep the source and target path identical because langbot_box uses the
-      # host Docker socket to create sandbox containers. Override
-      # LANGBOT_BOX_ROOT with an absolute path if you do not want the default.
-      - ${LANGBOT_BOX_ROOT:-${PWD}/data/box}:${LANGBOT_BOX_ROOT:-${PWD}/data/box}
-      # Mount container runtime socket for Box sandbox backend.
-      # Uncomment the one that matches your container runtime:
-      # - /var/run/podman/podman.sock:/var/run/podman/podman.sock   # Podman
-      - /var/run/docker.sock:/var/run/docker.sock                   # Docker
-    restart: on-failure
-    environment:
-      - TZ=Asia/Shanghai
-      # The Box runtime does NOT read box.local.* from config.yaml or env; it
-      # receives its configuration from LangBot via the INIT RPC action.
-      # Do not add LANGBOT_BOX_* / BOX__* here — they would be silently ignored.
-    # Launched through the same CLI entry point as the plugin runtime
-    # (`langbot_plugin.cli.__init__ <subcommand>`). WebSocket is the default
-    # control transport — mirrors `rt`, which also runs with no flag. Pass
-    # `-s` / `--stdio-control` only for the stdio mode LangBot uses outside
-    # containers.
-    command: ["uv", "run", "--no-sync", "-m", "langbot_plugin.cli.__init__", "box"]
-    networks:
-      - langbot_network
-
  langbot:
    image: rockchin/langbot:latest
    container_name: langbot
@@ -60,13 +26,6 @@ services:
    restart: on-failure
    environment:
      - TZ=Asia/Shanghai
-      # Unified env-override convention: SECTION__SUBSECTION__KEY overrides the
-      # matching config.yaml field (see LoadConfigStage). These map onto
-      # box.local.* and are forwarded to the Box runtime via INIT RPC.
-      - BOX__LOCAL__HOST_ROOT=${LANGBOT_BOX_ROOT:-${PWD}/data/box}
-      - BOX__LOCAL__DEFAULT_WORKSPACE=default
-      - BOX__LOCAL__SKILLS_ROOT=skills
-      - BOX__LOCAL__ALLOWED_MOUNT_ROOTS=${LANGBOT_BOX_ROOT:-${PWD}/data/box}
    ports:
      - 5300:5300  # For web ui and webhook callback
      - 2280-2285:2280-2285  # For platform reverse connection
@@ -75,4 +34,4 @@ services:

 networks:
  langbot_network:
-    driver: bridge
+    driver: bridge
--- a/docs/agent-runner-pluginization/AGENT_CONTEXT_PROTOCOL.md
+++ b/docs/agent-runner-pluginization/AGENT_CONTEXT_PROTOCOL.md
@@ -0,0 +1,335 @@
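+# Agent-owned Context Protocol Design
+
+This document describes the context boundary for pluginized AgentRunners. The conclusion up front: LangBot should not become the final agentic context manager; LangBot should provide a context substrate, and the AgentRunner, or the agent runtime behind it, decides for itself how to manage history, compaction, recall, and KV cache.
+
+## Current Status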
+
+**Landed on the current branch**:
+
+- ✅ `AgentRunContext`: event-first context model
+- ✅ `ContextAccess`: cursor, inline policy, available APIs
+- ✅ `AgentRunAPIProxy.history`: page/search API
+- ✅ `AgentRunAPIProxy.events`: get/page API
+- ✅ `AgentRunAPIProxy.artifacts`: metadata/read_range API
+- ✅ `AgentRunAPIProxy.state`: get/set/delete API
+- ✅ EventLog / Transcript / ArtifactStore: host source of truth
+- ✅ PersistentStateStore: persistent state storage
+- ✅ `max-round` / host-side history window has been removed from LangBot Host/Pipeline semantics; if a runner still needs a similar parameter, that runner should interpret the configuration itself
+- ✅ External harness context projection has been validated as an MVP with the Claude Code runner: context files, skill projection, MCP configuration, and host-owned resume state
+
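+## 1. Design Principles
+
+### 1.1 The agent owns the context strategy
+
+The runtimes behind different runners vary widely:
+
+- The official local-agent may depend on LangBot's models, tools, knowledge bases, and storage.
+- Claude Code SDK / Codex-style runtimes may have their own session, transcript, tool loop, and context compaction.
+- The Pi Agent SDK or an external agent platform may need only the current event and an external conversation key.
+
+Therefore LangBot should not dictate the history window that is finally passed to the model. The Host provides only:
+
+- Complete structured information about the current event.
+- Stable identity and conversation references.
+- History / event / artifact / state APIs that can be read when authorized.
+- Scoped context, MCP, skill, and resource refs that can be projected to an external harness.
+- A payload hard cap and permission guardrails.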
+
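+### 1.2 `max-round` is no longer a design target
+
+History-window parameters such as `max-round` should no longer be core concepts of the AgentRunner protocol or the Pipeline adapter.
+
+If a runner still needs a policy parameter such as "read at most N rounds of history", the runner should declare it in its own manifest/config schema and store it as binding config in `ctx.config` / `runner_config`. The Host provides only the history pull API, cursors, hard caps, and permission boundaries; the runner decides whether to read, how much to read, and how to truncate and compact.
+
+The current direction for the official local-agent is to pull the transcript through the Host history API and let the runner manage the model context itself. It does not rely on the Pipeline adapter to deliver a history window.
+
+The new protocol should not ask "how many rounds of history does LangBot trim for the agent each turn"; it should ask:
+
+- Does this kind of runner manage its own context?
+- What minimal information should the host inline when an event arrives?
+- Through which API does the agent pull more context when it needs it?
+- How does the host guarantee safety, auditability, and pagination?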
+
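+### 1.3 The Host stores the source of truth; the agent manages working context
+
+Three kinds of data must be kept separate:
+
+- `EventLog`: the Host stores raw events, tool calls, delivery results, errors, and system events.
+- `Transcript`: the conversation view the Host projects from the EventLog, used for UI, auditing, and on-demand history reads.
+- `Working context`: the context the agent actually sends to the model or runtime in this turn, decided by the AgentRunner.
+
+LangBot no longer provides a host-side bootstrap window. A simple runner that needs a history window should pull and trim it inside the runner through the Host history API.
+
+## 2. What to pass when an event arrives
+
+By default, `AgentRunContext` should be as small and stable as possible: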
+
+```python
+class AgentRunContext(BaseModel):
+    run_id: str
+    trigger: AgentTrigger
+    event: AgentEventContext
+    conversation: ConversationContext | None
+    actor: ActorContext | None
+    subject: SubjectContext | None
+    input: AgentInput
+    delivery: DeliveryContext
+    resources: AgentResources
+    context: ContextAccess
+    state: AgentRunState
+    runtime: AgentRuntimeContext
+    config: dict[str, Any]
+```
+
+Default rules:
+
+- Host MUST NOT inline full history by default.
+- Host SHOULD inline only current event / input and context handles.
+- Runner owns working-context assembly.
+- Runner MAY use Host history / event / artifact / state / storage APIs when authorized.
+- Official runners MUST consume Host infrastructure through the same public APIs as third-party runners.
+
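+### 2.1 Content that must be inlined
+
+Every run must inline:
+
+- The current event's stable type, id, time, and source.
+- The current input text and structured content.
+- Metadata and artifact refs for attachments / files / images.
+- actor, subject, conversation, thread, bot, workspace.
+- Delivery capabilities, such as whether streaming is supported, the reply target, and platform limits.
+- The list of authorized resources.
+- Context cursors and available API capabilities.
+- The runner binding config.
+
+This is the minimum information the agent needs to decide its next step.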
+
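+### 2.2 Content not inlined by default
+
+Do not inline by default:
+
+- Full message history.
+- Full text of large files.
+- Large tool results.
+- Entire knowledge base contents.
+- Large objects from raw platform payloads.
+- Long summaries regenerated every turn.
+
+Inlining these inflates cross-process serialization cost, widens the exposure scope, and destabilizes the KV cache; it also forces the host to make context-strategy decisions on the agent's behalf.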
+
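+### 2.3 No Host Bootstrap Window
+
+`AgentRunContext.bootstrap` may be kept as an optional extension field in the protocol, but by default the LangBot Host does not fill in a history window, nor does it decide the window size through Pipeline configuration.
+
+If a runner needs a strategy like `recent_tail`, it should declare the parameters in its own manifest/config schema and read, trim, and compact history inside the runner through `history_page` / `history_search`. The Host is responsible only for permissions, pagination, hard caps, and the source of truth.
+
+## 3. ContextAccess
+
+`ContextAccess` describes the context read entry points the host hands to the agent: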
+
+```python
+class ContextAccess(BaseModel):
+    conversation_id: str | None
+    thread_id: str | None
+    latest_cursor: str | None
+    event_seq: int | None
+    transcript_seq: int | None
+    has_history_before: bool
+    inline_policy: InlineContextPolicy
+    available_apis: ContextAPICapabilities
+```
+
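+It tells the agent:
+
+- Which conversation / thread the current event belongs to.
+- Which cursor to start from when more history is needed.
+- What the host inlined and what it did not.
+- Which context API permissions the current run has.
+
+## 4. How the agent obtains more context
+
+All APIs must go through `AgentRunAPIProxy`, and the host validates them with `run_id`.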
+
+### 4.1 History API
+
+```python
+await api.history.page(
+    conversation_id=ctx.context.conversation_id,
+    before_cursor=ctx.context.latest_cursor,
+    limit=50,
+    direction="backward",
+    include_artifacts=False,
+)
+```
+
+Returns:
+
+```python
+class HistoryPage(BaseModel):
+    items: list[TranscriptItem]
+    next_cursor: str | None
+    prev_cursor: str | None
+    has_more: bool
+```
+
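+Constraints:
+
+- `limit` is subject to a host hard cap.
+- By default only the current conversation / thread can be read.
+- Cross-conversation reads require a manifest permission plus a binding policy.
+- Artifact refs are returned; large file contents are not returned by default.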
+
+### 4.2 Search API
+
+```python
+await api.history.search(
+    query="the database connection details the user mentioned earlier",
+    filters={
+        "conversation_id": ctx.context.conversation_id,
+        "event_types": ["message.received"],
+    },
+    top_k=10,
+)
+```
+
+Search can start with a database full-text index and add embedding recall later. It is a retrieval capability provided by the host, not the agent's long-term memory strategy.
+
+### 4.3 Event API
+
+```python
+await api.events.get(event_id)
+await api.events.page(before_cursor=..., limit=...)
+```
+
+The Event API is for reading non-message events, tool events, and system events. The agent should not treat every event as a user/assistant message.
+
+### 4.4 Artifact API
+
+```python
+await api.artifacts.metadata(artifact_id)
+await api.artifacts.read_range(artifact_id, offset=0, length=65536)
+await api.artifacts.open_stream(artifact_id)
+```
+
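+Constraints:
+
+- Validate the conversation / run / binding the artifact belongs to.
+- Validate MIME type, size, expiry, and permissions.
+- Read large files by range or stream.
+- Large tool results should also be stored as artifacts.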
+
+### 4.5 State API
+
+```python
+await api.state.get(scope="conversation", key="external.session_id")
+await api.state.set(scope="conversation", key="summary.checkpoint", value=...)
+```
+
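+State is an optional hosted capability. A self-managed runtime can skip it entirely; official runners that rely on LangBot can use it.
+
+### 4.6 External harness context projection
+
+Runtimes such as Claude Code, Codex, and Kimi Code usually already have their own session, tool loop, MCP loading, context compaction, and working directory. LangBot should not force such runners into a "host prompt assembler" model; it should instead provide auditable event and resource projections.
+
+Recommended projection forms:
+
+- `agent-context.json`: structured JSON containing `run_id`, `event`, `actor`, `subject`, `input`, `delivery`, `resources`, `context`, `state`, and `runtime`.
+- `LANGBOT_CONTEXT.md`: a human-readable summary that lets a code-agent harness quickly understand the current IM event.
+- `resources`: contains only the model, tool, knowledge base, artifact, and state/storage handles authorized for this run, and does not expose Host-internal private objects.
+- `skills`: the Host or binding projects authorized skills into a directory the target harness can read, for example Claude Code's `.claude/skills/<name>/SKILL.md`.
+- `MCP config`: the Host or binding provides a scoped MCP configuration, which the runner adapter converts into the target harness's config file or CLI arguments.
+- `state pointers`: small JSON state such as the external session id, working directory, and checkpoints is saved through the Host state API, for example `external.session_id` and `external.working_directory`.
+
+The current Claude Code runner MVP uses the schema `langbot.agent_runner.external_harness_context.v1`, and the basic chain of context files, skill files, MCP config, and resume state has been verified through WebUI Debug Chat.
+
+This kind of projection means "handing LangBot's source of truth and authorized resources to the harness", not "letting LangBot decide the final model context". The external harness can keep using its own transcript, tool permissions, and compaction strategy.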
+
+## 5. Context declarations in the runner manifest
+
+Suggested addition:
+
+```yaml
+context:
+  ownership: self_managed | host_bootstrap | hybrid
+  bootstrap: none | current_event | recent_tail | summary_tail
+  max_inline_events: 0
+  max_inline_bytes: 0
+  supports_history_pull: true
+  supports_history_search: true
+  supports_artifact_pull: true
+  owns_compaction: true
+  wants_static_context_refs: true
+```
+
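+Semantics:
+
+- `self_managed`: the Host does not proactively inline history; it provides only the event and handles.
+- `host_bootstrap`: the Host inlines a small window for simple runners.
+- `hybrid`: the Host inlines a summary/tail, and the runner can still pull more on demand.
+- `owns_compaction`: the runner is responsible for compaction; the host does no semantic summarization.
+- `wants_static_context_refs`: the host describes static content by ref/hash to reduce repeated payload.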
+
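+## 6. KV-cache-friendly context management
+
+To support runtimes such as Claude Code SDK, Codex, and Pi Agent SDK, LangBot must avoid reassembling large prompt blocks every turn.
+
+Recommendations:
+
+- Stable session key: `workspace/bot/binding/runner/conversation/thread`.
+- Static content uses `ref + version/hash`: system prompt, resource manifest, tool schema, platform policy.
+- Pass only the delta each turn: the current event, artifact refs, and a small amount of runtime metadata.
+- Append-only history: do not rewrite the same block of history text every turn.
+- Stable summary checkpoints: produce a new checkpoint only when compaction happens; do not tweak it every turn.
+- Store large files and tool results as artifacts.
+- Keep the tool/context API schema stable; data is pulled through APIs rather than stuffed into the prompt.
+- For self-managed runtimes, prefer letting them reuse their own session/cache instead of forcing LangBot to replay the transcript every turn.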
+
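+## 7. Host guardrail
+
+Agent-managed context does not mean unrestricted access. LangBot must still control:
+
+- The active `run_id` of each run.
+- Runner identity.
+- The resource policy of the current binding.
+- conversation / actor / subject scope.
+- Page size, artifact read size, and API rate limit.
+- Cross-conversation read permissions.
+- Data redaction and sensitive-variable filtering.
+- Audit logs.
+
+The Host is not responsible for the "best context strategy", but it is responsible for ensuring that runs do not exceed their permissions, do not exhaust memory, and remain auditable.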
+
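+## 8. Boundary between official runners and business orchestration
+
+Official runner plugins may choose to host their state in LangBot, but they must consume these capabilities through public Host APIs, just like third-party runners.
+
+LangBot core should not build in the official agent's business flow:
+
+- No built-in prompt assembly strategy.
+- No built-in tool loop.
+- No built-in RAG orchestration strategy.
+- No built-in summary / compaction strategy.
+- No built-in "local-agent only" state fields.
+
+The official local-agent should exist as a "reference implementation of a complex runner that relies on LangBot infrastructure":
+
+- transcript / history is read through `api.history.page()` or `api.history.search()`.
+- Summaries, checkpoints, external session ids, and user preferences are saved through `api.state` or `api.storage`.
+- Images, files, and large tool results are read through `api.artifacts`.
+- Models, tools, and knowledge bases are called through `api.models`, `api.tools`, and `api.knowledge`.
+
+This keeps LangBot a general-purpose agent host rather than a built-in agent framework.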
+
+## 9. Adjustments needed in the current implementation
+
+**Completed (current branch)**:
+
+- ✅ `max-round` is no longer a protocol field, nor a general Host / Pipeline semantic
+- ✅ New runners do not receive a history window by default
+- ✅ `AgentRunContext` adds `context` / cursor / access capabilities
+- ✅ `AgentRunAPIProxy` adds history / events / artifacts / state APIs
+- ✅ The Host adds persistent EventLog / Transcript / ArtifactStore / PersistentStateStore
+- ✅ `run_from_query()` delegates to the event-first `run(event, binding)`
+- ✅ Claude Code external harness smoke: context JSON / Markdown, skill, MCP config, `external.session_id` / `external.working_directory`
+
+As a result, LangBot can serve both official runners that rely on host infrastructure and external agent runtimes that bring their own memory/session/cache.
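Reviewer note on AGENT_CONTEXT_PROTOCOL.md (sections 1.2 and 4.1): the runner-owned history window can be illustrated with a minimal sketch. This is not the SDK's implementation. `FakeHistoryAPI`, `collect_recent_history`, `HOST_HARD_CAP`, and the integer-string cursor format are hypothetical stand-ins; only the `page(conversation_id, before_cursor, limit, ...)` call shape and the `items` / `next_cursor` / `has_more` fields follow the document.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Optional

HOST_HARD_CAP = 50  # assumed host-side page-size cap, for illustration only


@dataclass
class HistoryPage:
    items: List[str]
    next_cursor: Optional[str]
    has_more: bool


class FakeHistoryAPI:
    """In-memory stand-in for `api.history`; not the real SDK proxy."""

    def __init__(self, transcript: List[str]) -> None:
        self._transcript = transcript

    async def page(self, conversation_id, before_cursor, limit,
                   direction="backward", include_artifacts=False) -> HistoryPage:
        limit = min(limit, HOST_HARD_CAP)  # the host clamps the requested size
        # Assumption: a cursor is the index of the oldest item already returned;
        # None means "start from the latest item".
        end = len(self._transcript) if before_cursor is None else int(before_cursor)
        start = max(0, end - limit)
        has_more = start > 0
        return HistoryPage(self._transcript[start:end],
                           str(start) if has_more else None, has_more)


async def collect_recent_history(api, conversation_id, cursor, max_items, page_size=20):
    """Runner-owned window: pull pages backward until `max_items` are collected."""
    collected: List[str] = []
    while len(collected) < max_items:
        want = min(page_size, max_items - len(collected))
        page = await api.page(conversation_id, cursor, want)
        collected = page.items + collected  # older items go in front
        if not page.has_more:
            break
        cursor = page.next_cursor
    return collected
```

In this sketch the window size (`max_items`) would come from the runner's own binding config, while the host only clamps each page, which matches the split of responsibilities the document describes.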
--- a/docs/agent-runner-pluginization/EVENT_BASED_AGENT.md
+++ b/docs/agent-runner-pluginization/EVENT_BASED_AGENT.md
@@ -0,0 +1,237 @@
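+# Event Based Agent Reserved Design
+
+> **Note**: This document is a future design note, not part of the current branch's implementation scope.
+>
+> EventGateway, EventRouter, and event subscription/notification are implemented by other branches.
+> This branch only reserves the event-first entry point and the envelope/binding models.
+> The local-agent / Claude Code runner smoke on 2026-05-29 validates only this branch's `run(event, binding)` dispatch boundary; it does not mean the EBA branch has completed integration testing.
+
+This document describes how, once EBA is integrated, events enter LangBot, how they trigger an AgentRunner, and how the pluginized agent infrastructure is reused.
+
+This phase does not implement a full EventBus / EventRouter / Platform API. Its job is to get the protocol boundary right, so that the current message entry point stops being tied to Pipeline and user text messages.
+
+## 1. Design Goals
+
+- Messages, recalls, group joins, friend requests, scheduled tasks, and API calls can all be abstracted as host events.
+- EventRouter can resolve an AgentBinding based on event type, bot, workspace, conversation, actor, and subject.
+- AgentRunners are invoked through the same orchestrator.
+- Non-message events are not disguised as user text messages.
+- Platform action execution is reserved through explicit capability / permission / result type, and is not mixed into ordinary text replies.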
+
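+## 2. Events are not messages
+
+`message.received` is only one kind of event. The protocol should not assume that:
+
+- There is always user text.
+- There is always conversation history.
+- A chat message must always be returned.
+- The actor is always the sender.
+- The subject is always the current message.
+
+For example: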
+
+| event_type | actor | subject | input |
+| --- | --- | --- | --- |
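+| `message.received` | Message sender | Current message | Text, images, files, etc. |
+| `message.recalled` | The user who recalled it; system if unknown | Recalled message | Usually empty |
+| `group.member_joined` | New member or inviter | Group / membership | Usually empty |
+| `friend.request_received` | Requester | Friend request | Verification message or request reason |
+| `schedule.triggered` | System | Scheduled task | Task payload |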
+| `api.invoked` | API caller | API request | request payload |
+
+## 3. Event Envelope
+
+Suggested event envelope:
+
+```python
+class AgentEventEnvelope(BaseModel):
+    event_id: str
+    event_type: str
+    event_time: int | None
+    source: EventSource
+    workspace_id: str | None
+    bot_id: str | None
+    conversation_id: str | None
+    thread_id: str | None
+    actor: ActorRef | None
+    subject: SubjectRef | None
+    input: AgentInput
+    delivery: DeliveryContext
+    raw_ref: RawEventRef | None
+    metadata: dict[str, Any] = {}
+```
+
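+Top-level fields use LangBot's stable protocol names. Platform-native event names and raw payloads go into `metadata` or `raw_ref` and do not become stable dependencies of the runner directly.
+
+## 4. Event Source
+
+Event sources can include:
+
+- `platform_adapter`: IM platforms such as Feishu, QQ, WeChat, and Telegram.
+- `webui`: Debug Chat and console operations.
+- `http_api`: external systems calling LangBot.
+- `scheduler`: scheduled tasks.
+- `system`: runtime, plugin, and maintenance events.
+
+A single event source can produce multiple event types. EventRouter should not hard-code platform adapter class names.
+
+## 5. Event Binding
+
+In EBA, AgentBinding replaces the Pipeline runner configuration as the trigger relationship: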
+
+```python
+class AgentBinding(BaseModel):
+    binding_id: str
+    enabled: bool
+    event_types: list[str]
+    scope: BindingScope
+    filters: list[EventFilter]
+    runner_id: str
+    runner_config: dict[str, Any]
+    resource_policy: ResourcePolicy
+    state_policy: StatePolicy
+    delivery_policy: DeliveryPolicy
+```
+
+Binding scope examples:
+
+- Workspace-wide.
+- Bot level.
+- Platform channel level.
+- conversation / group / thread level.
+- user / actor level.
+
+The legacy Pipeline can be migrated to a binding source for `message.received`, but it is not the only binding source.
+
+## 6. EventRouter call chain
+
+Target call chain:
+
+```text
+Platform Adapter / WebUI / API
+  -> Event Gateway normalize payload
+  -> EventLog append raw event
+  -> EventRouter resolve bindings
+  -> AgentRunOrchestrator.run(event, binding)
+  -> AgentRunContextBuilder.build(event, binding)
+  -> PluginRuntimeConnector.run_agent()
+  -> AgentRunResult stream
+  -> DeliveryController render / platform action
+```
+
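+Constraints:
+
+- `run_from_event()` must reuse the existing orchestrator capabilities.
+- EBA must not implement a separate plugin runner invocation protocol.
+- Non-message events must not bypass resource authorization.
+- Delivery and platform actions must go through a unified permission model.
+- External harness runners should also integrate through the same envelope/binding/context/result protocol; EBA should not invent a separate queue protocol for Claude Code / Codex / Kimi Code.
+
+## 7. Delivery Context
+
+An event does not necessarily reply to the current chat window. Delivery must be explicit: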
+
+```python
+class DeliveryContext(BaseModel):
+    surface: str
+    reply_target: ReplyTarget | None
+    supports_streaming: bool
+    supports_edit: bool
+    supports_reaction: bool
+    max_message_size: int | None
+    platform_capabilities: dict[str, Any] = {}
+```
+
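+Message events usually carry a reply target. System events may have no default reply target; in that case the runner returns `action.requested`, or the binding's delivery policy decides where to deliver.
+
+## 8. AgentRunResult and platform actions
+
+The current message path mainly consumes:
+
+- `message.delta`
+- `message.completed`
+- `run.completed`
+- `run.failed`
+
+After EBA, the following need to be reserved:
+
+- `action.requested`: asks the host to execute a platform action.
+- `artifact.created`: the runner produced a file or large result.
+- `delivery.requested`: asks for delivery to a specific surface.
+
+Example: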
+
+```json
+{
+  "type": "action.requested",
+  "data": {
+    "action": "friend.request.accept",
+    "target": {"platform": "wechat", "request_id": "..."},
+    "reason": "policy matched"
+  }
+}
+```
+
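+The Host must check:
+
+- Whether the runner manifest declares the platform_api capability.
+- Whether the binding authorizes the action.
+- Whether the actor / bot / workspace allows it.
+- Whether manual approval is required.
+
+In this phase, if `action.requested` is received, the host may only record telemetry without executing it.
+
+## 9. Relationship to the Context protocol
+
+When EBA events enter an AgentRunner, the principles of [AGENT_CONTEXT_PROTOCOL.md](./AGENT_CONTEXT_PROTOCOL.md) still apply:
+
+- Inline the current event.
+- Use raw/artifact refs for large payloads.
+- Do not inline full history by default.
+- The agent pulls history/event/artifact/state through APIs on demand.
+- The Host keeps the EventLog and permission guardrails.
+
+Non-message events can be projected into the Transcript, but must not be forced to masquerade as user messages. The AgentRunner can decide, based on event type, whether to include them in the model context.
+
+## 10. Gap between the current implementation and the target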
+
+**Landed on the current branch (event-first infrastructure)**:
+
+- ✅ `AgentRunOrchestrator`: event-first `run(event, binding)` entry point
+- ✅ `AgentRunContextBuilder`: event-first context construction
+- ✅ `AgentEventEnvelope` model
+- ✅ `AgentBinding` model
+- ✅ `AgentRunResult` basic message stream
+- ✅ Minimal message-event wrapper for `ctx.event`
+- ✅ `PipelineAdapter`: Query → Event + Binding conversion
+- ✅ `run_from_query()` → `run(event, binding)` delegation
+- ✅ EventLog / Transcript / ArtifactStore
+- ✅ History / Event / Artifact / State pull APIs
+- ✅ The current message event path has been smoke-tested locally with `local-agent` and the Claude Code external harness runner
+
+**Owned by other branches (out of scope for this branch)**:
+
+- EventGateway implementation
+- EventRouter implementation
+- Event subscription / notification
+- EventLog persistence management UI
+- AgentBinding persistence UI
+- Platform action execution (executor for `action.requested`)
+
+**Needed for the full future EBA rollout**:
+
+- Full EventGateway implementation
+- Integration of EventRouter with BindingResolver
+- AgentBinding persistent model and UI
+- Full DeliveryContext implementation
+- Platform action permission model and executor
+- Real platform event integration
+
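+## 11. Rollout order
+
+1. First adapt the current Pipeline message entry point into a `message.received` event.
+2. Add the `AgentBinding` abstraction, generated from Pipeline config at first.
+3. Change `AgentRunContextBuilder` to build the context from event + binding.
+4. Introduce EventLog / Transcript.
+5. Add protocol tests for non-message events without connecting real platforms.
+6. Then connect the real EventRouter and platform actions.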
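Reviewer note on EVENT_BASED_AGENT.md (sections 5, 6, and 8): binding resolution and the `action.requested` checks can be sketched as below. This is a simplified illustration, not the planned EventRouter. `Binding` keeps only a few fields of the proposed `AgentBinding`, and `allowed_actions`, `resolve_bindings`, and `check_action` are hypothetical names; the document only says that the runner manifest must declare the `platform_api` capability and that the binding must authorize the action.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Binding:
    """Reduced stand-in for AgentBinding (scope, filters, and policies omitted)."""
    binding_id: str
    runner_id: str
    event_types: List[str]
    enabled: bool = True
    allowed_actions: Set[str] = field(default_factory=set)  # hypothetical field


def resolve_bindings(event_type: str, bindings: List[Binding]) -> List[Binding]:
    """Return the enabled bindings that subscribe to this event type."""
    return [b for b in bindings if b.enabled and event_type in b.event_types]


def check_action(action: str, binding: Binding, runner_capabilities: Set[str]) -> bool:
    """Both the runner manifest and the binding must allow a platform action."""
    return "platform_api" in runner_capabilities and action in binding.allowed_actions


bindings = [
    Binding("b1", "plugin:author/name/runner", ["message.received"]),
    Binding("b2", "plugin:author/name/runner", ["friend.request_received"],
            allowed_actions={"friend.request.accept"}),
    Binding("b3", "plugin:author/name/runner", ["message.received"], enabled=False),
]

matched = resolve_bindings("friend.request_received", bindings)
```

Actor / bot / workspace checks and manual approval would sit on top of `check_action`; they are left out here to keep the sketch short.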
--- a/docs/agent-runner-pluginization/HOST_SDK_INFRASTRUCTURE.md
+++ b/docs/agent-runner-pluginization/HOST_SDK_INFRASTRUCTURE.md
@@ -0,0 +1,427 @@
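+# LangBot Host and SDK Infrastructure Design
+
+This document describes the infrastructure that LangBot and the SDK jointly provide for pluginized AgentRunners. It is not centered on Pipeline, nor does it presuppose how the official local-agent is implemented.
+
+## 1. Goals
+
+LangBot is to become an agent host, not a built-in runner container:
+
+- Receive events produced by IM, WebUI, API, and the future EventRouter.
+- Resolve which agent binding to invoke based on event, bot, workspace, and scope.
+- Discover, validate, and invoke AgentRunners provided by plugins.
+- Provide each run with restricted resources, state, storage, context references, and lifecycle control.
+- Receive the event stream returned by the AgentRunner and deliver it to IM, WebUI, or other output surfaces.
+
+The SDK is to provide a stable protocol:
+
+- `AgentRunner` component definition.
+- runner manifest / capabilities / permissions / config schema.
+- `AgentRunContext` input envelope.
+- `AgentRunResult` output event stream.
+- `AgentRunAPIProxy` restricted runtime API.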
+
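+## 2. Non-goals
+
+- Do not treat Pipeline as the long-term architectural center.
+- Do not require every AgentRunner to depend on LangBot's context management.
+- Do not let the legacy behavior of the official local-agent shape the host protocol in reverse.
+- Do not implement a general agentic prompt assembler in the host.
+- Do not force runners to use LangBot state / storage; LangBot only provides optional, controlled hosting capabilities.
+- **Do not implement EventGateway**: EventGateway is a future integration point provided by the external event branch. This branch only defines the host-side envelope/binding models and the `run(event, binding)` entry point.
+
+## 3. Layered architecture
+
+Target structure: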
+
+```text
+IM / WebUI / API / EventRouter (future)
+        |
+        v
+Event Gateway (future - external event branch)
+        |
+        v
+AgentBindingResolver
+        |
+        v
+AgentRunOrchestrator
+        |-- AgentRunnerRegistry
+        |-- AgentResourceBuilder
+        |-- AgentContextBuilder
+        |-- AgentRunSessionRegistry
+        |-- PersistentStateStore / EventLogStore / TranscriptStore / ArtifactStore
+        v
+Plugin Runtime / AgentRunner
+        |
+        v
+AgentRunResult stream
+        |
+        v
+Delivery / Renderer / Platform API
+```
+
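+**Current status**:
+- `PipelineAdapter` serves as the current entry adapter, converting a Pipeline Query into `AgentEventEnvelope` + `AgentBinding`
+- `run_from_query()` delegates internally to `run(event, binding)`
+- EventLog / Transcript / ArtifactStore / PersistentStateStore have landed
+- `local-agent` and the Claude Code runner have passed a local WebUI smoke test, verifying that the same `run(event, binding)` path can serve both host-infra runners and external harness runners
+- EventGateway is implemented by the external event branch
+
+The current Pipeline should plug in only at the Pipeline adapter position. It may continue to produce `message.received`, but it should no longer own the core semantics of runner selection, context trimming, and business agent execution.
+
+## 4. LangBot-side capabilities
+
+### 4.1 Event Gateway (Future Integration Point)
+
+> **Note**: EventGateway is implemented by the external event branch and is out of scope for this branch. This branch only reserves the event-first entry point and the envelope/binding models.
+
+Event Gateway will be responsible for unifying entry points into host events:
+
+- IM platform messages.
+- WebUI debug chat messages.
+- API triggers.
+- Later non-message events, such as group joins, recalls, and friend requests.
+
+The output should be a stable envelope, not a Pipeline Query private structure: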
+
+```python
+class AgentEventEnvelope(BaseModel):
+    event_id: str
+    event_type: str
+    event_time: int | None
+    source: str
+    bot_id: str | None
+    workspace_id: str | None
+    conversation_id: str | None
+    thread_id: str | None
+    actor: ActorRef | None
+    subject: SubjectRef | None
+    input: AgentInput
+    delivery: DeliveryContext
+    raw_ref: RawEventRef | None
+```
+
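+**Current adapter source**: `PipelineAdapter.query_to_event(query)` generates an `AgentEventEnvelope` from a Pipeline Query.
+
+The raw platform payload can be stored as a raw event or an artifact ref; do not spread platform-private fields directly into the AgentRunner top-level protocol.
+
+### 4.2 Agent Binding
+
+An agent binding is the persistent configuration of "which event invokes which runner, with what binding config". It takes over the role that has long depended on the Pipeline runner config.
+
+Suggested model: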
+
+```python
+class AgentBinding(BaseModel):
+    binding_id: str
+    scope: BindingScope
+    event_types: list[str]
+    runner_id: str
+    runner_config: dict[str, Any]
+    resource_policy: ResourcePolicy
+    state_policy: StatePolicy
+    delivery_policy: DeliveryPolicy
+    enabled: bool
+```
+
+**Current adapter source**: `PipelineAdapter.pipeline_config_to_binding(query, runner_id)` generates a temporary `AgentBinding` from the Pipeline config.
+
+The Pipeline can currently be migrated as one kind of binding source:
+
+- Pipeline AI runner config -> `AgentBinding`
+- Pipeline extension preference -> `resource_policy`
+- Pipeline output settings -> `delivery_policy`
+
+However, the new design should no longer name these fields as Pipeline-specific concepts.
+
+### 4.3 AgentRunnerRegistry
+
+The Registry is responsible for collecting runner descriptors:
+
+- `AgentRunner`s provided by the plugin runtime.
+- Host adapter runners that may exist.
+- Local plugin runners during development.
+
+The descriptor must contain:
+
+```python
+class AgentRunnerDescriptor(BaseModel):
+    id: str
+    source: Literal["plugin", "host_adapter"]
+    label: I18nObject
+    description: I18nObject | None = None
+    capabilities: AgentRunnerCapabilities
+    permissions: AgentRunnerPermissions
+    config_schema: list[DynamicFormItemSchema]
+    plugin: PluginRef | None = None
+```
+
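+`plugin:author/name/runner` can still serve as the stable id format. When multiple bindings point to the same runner id, multiple plugin instances are not created.
+
+### 4.4 AgentRunOrchestrator
+
+The Orchestrator is the single run entry point: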
+
+```text
+run(event, binding)
+  -> resolve runner descriptor
+  -> build resources
+  -> build context
+  -> register run session
+  -> call plugin runtime
+  -> normalize result stream
+  -> update state
+  -> unregister run session
+```
+
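+It is responsible for:
+
+- `run_id` generation and lifecycle.
+- timeout / deadline / cancellation.
+- Plugin exception isolation.
+- Result schema validation and size limits.
+- state.updated handling.
+- delivery backpressure and telemetry.
+
+APIs such as `run_from_query()` can be kept as the Pipeline adapter entry point, but internally they should convert to event + binding and go through the unified `run()`.
+
+### 4.5 Resource Authorization
+
+LangBot generates `ctx.resources` before each run. Resources are subject to three layers of constraints:
+
+- Permissions declared in the runner manifest.
+- The resource scope allowed by the binding/resource policy.
+- The actual permissions of the current event / actor / bot / workspace.
+
+Resource types include: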
+
+- models
+- tools
+- knowledge bases
+- files / artifacts
+- storage
+- platform capabilities
+- history / transcript access
+
+运行期 action 必须再次通过 `run_id` 校验。SDK 侧本地校验只用于开发体验，host 侧校验才是安全边界。
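+
+The three-layer rule and the host-side `run_id` check can be sketched as follows. This is a minimal illustration under stated assumptions, not LangBot's implementation: `effective_resources`, `RunSessionRegistry` and its methods are hypothetical names, and resources are reduced to plain sets of tool names.
+
+```python
+from dataclasses import dataclass, field
+
+
+def effective_resources(manifest: set[str], binding: set[str], actor: set[str]) -> set[str]:
+    """A resource is usable only if all three layers allow it."""
+    return manifest & binding & actor
+
+
+@dataclass
+class RunSessionRegistry:
+    """Hypothetical in-memory map of active run_id -> authorized tool names."""
+
+    _sessions: dict[str, set[str]] = field(default_factory=dict)
+
+    def register(self, run_id: str, tools: set[str]) -> None:
+        self._sessions[run_id] = set(tools)
+
+    def unregister(self, run_id: str) -> None:
+        self._sessions.pop(run_id, None)
+
+    def check_tool(self, run_id: str, tool: str) -> None:
+        """Host-side check performed on every runtime action."""
+        allowed = self._sessions.get(run_id)
+        if allowed is None:
+            raise PermissionError(f"no active run session: {run_id}")
+        if tool not in allowed:
+            raise PermissionError(f"tool {tool!r} not authorized for run {run_id}")
+```
+
+In this sketch a call is rejected both when the run session is unknown and when the tool falls outside the intersection, which matches the idea that the host check, not the SDK check, is the boundary.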
+
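+### 4.6 State and Storage
+
+LangBot can provide host-owned state so that an AgentRunner can keep its state inside LangBot:
+
+- conversation state
+- actor state
+- subject state
+- runner/binding state
+- workspace state
+
+This is not mandatory. An external agent runtime can maintain its own sessions and memory. LangBot only needs to provide:
+
+- An authorization switch.
+- A scope key.
+- A get/set/list/delete API.
+- A persistence backend.
+- Audit and cleanup policies.
+
+The current in-process state store is only a transitional implementation and must not be treated as production semantics.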
+
+### 4.7 EventLog / Transcript / Artifact
+
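+LangBot should provide source-of-truth capabilities:
+
+- `EventLog`: stores raw events, system events, tool calls, delivery results and errors.
+- `Transcript`: a message projection for the conversation UI and agent history.
+- `ArtifactStore`: stores large files, multimodal inputs, large tool results and platform attachments.
+
+An AgentRunner may read from these, but must not be forced to use LangBot as its only memory system.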
+
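+### 4.8 Prompt / Instruction Package (placeholder)
+
+The legacy Pipeline entry point can currently place the effective prompt produced by preprocessing into adapter metadata.
+This exists to preserve legacy behavior and is not a long-term protocol. The target shape is that the Host stores or generates a
+run-scoped instruction package that the runner pulls through a Host API:
+
+- The Host records the static binding prompt, the instruction fragments produced by host hooks / user plugins,
+  and their provenance and audit information.
+- `ctx.context.available_apis.prompt_get` only indicates whether the pull capability is available.
+- After pulling the instruction package, the runner still decides how to assemble the final model prompt from history, RAG,
+  tool results, memory and the current input.
+- The Host does not implement a general agentic prompt assembler and does not treat the Pipeline adapter prompt as
+  a long-term business input contract.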
+
+### 4.9 External harness resource projection
+
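+External harness runners such as Claude Code, Codex and Kimi Code may not call LangBot's model/tool loop directly; instead they project LangBot events and authorized resources into their own harness for execution. The Host must still keep a uniform boundary:
+
+- The Host builds the event-first context, resource authorization, state/storage, EventLog/Transcript/ArtifactStore and audit.
+- The Host or the binding policy decides which MCP servers, skills, artifacts and history/state handles may be projected to the runner.
+- The runner plugin converts the scoped projection into a form the target harness can consume, for example context JSON/Markdown, MCP config, skill directories, environment variables or CLI arguments.
+- The external harness owns its native session, tool loop, compaction, permission mode and resume mechanism.
+
+The current Claude Code runner MVP has verified that:
+
+- The LangBot event-first context can be written to `agent-context.json` / `LANGBOT_CONTEXT.md`.
+- Skill / MCP configuration in the binding can be projected into Claude Code's native directories and CLI arguments.
+- `external.session_id` and `external.working_directory` can be saved through Host state and used for resume.
+
+Release-grade path isolation, secret filtering, MCP allowlists, tool allowlists, resource quotas and workspace cleanup are outside the current protocol loop; see [SECURITY_HARDENING.md](./SECURITY_HARDENING.md).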
+
+## 5. SDK-side protocol
+
+### 5.1 AgentRunner component
+
+```python
+class AgentRunner(BaseComponent):
+    __kind__ = "AgentRunner"
+
+    @classmethod
+    def get_capabilities(cls) -> AgentRunnerCapabilities:
+        ...
+
+    @classmethod
+    def get_config_schema(cls) -> list[dict]:
+        ...
+
+    async def run(self, ctx: AgentRunContext) -> AsyncGenerator[AgentRunResult, None]:
+        ...
+```
+
+### 5.2 Capabilities
+
+Suggested capability declaration:
+
+```yaml
+capabilities:
+  streaming: true
+  tool_calling: true
+  knowledge_retrieval: true
+  multimodal_input: true
+  event_context: true
+  platform_api: false
+  interrupt: true
+  stateful_session: true
+  self_managed_context: true
+  host_state: optional
+```
+
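+`self_managed_context` means the runner or the external runtime manages context itself. The Host should not force a history window onto it and should only provide the current event and context handles.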
+
+### 5.3 Permissions
+
+```yaml
+permissions:
+  models: ["invoke", "stream", "rerank"]
+  tools: ["detail", "call"]
+  knowledge_bases: ["list", "retrieve"]
+  history: ["page", "search"]
+  events: ["get", "page"]
+  artifacts: ["metadata", "read"]
+  storage: ["plugin", "workspace", "binding"]
+  files: ["config", "knowledge"]
+  platform_api: []
+```
+
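+The permission declaration is the maximum capability the runner needs; the resources actually available are still trimmed by the binding and the current run context.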
+
+### 5.4 AgentRunContext
+
+The top level of the context should be event-first rather than Query-first:
+
+```python
+class AgentRunContext(BaseModel):
+    run_id: str
+    trigger: AgentTrigger
+    event: AgentEventContext
+    conversation: ConversationContext | None = None
+    actor: ActorContext | None = None
+    subject: SubjectContext | None = None
+    input: AgentInput
+    resources: AgentResources
+    context: ContextAccess
+    state: AgentRunState
+    runtime: AgentRuntimeContext
+    config: dict[str, Any]
+```
+
+`messages` can remain as a compatibility or bootstrap field, but should no longer be the core of the protocol.
+
+### 5.5 AgentRunResult
+
+The output should be an event stream:
+
+```python
+class AgentRunResult(BaseModel):
+    type: Literal[
+        "message.delta",
+        "message.completed",
+        "tool.call.started",
+        "tool.call.completed",
+        "state.updated",
+        "artifact.created",
+        "action.requested",
+        "run.completed",
+        "run.failed",
+    ]
+    data: dict[str, Any] = {}
+```
+
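+Message replies currently consume only `message.delta` / `message.completed` / `run.failed`. Platform action execution will be enabled once EBA and platform API permissions land.
+
+### 5.6 AgentRunAPIProxy
+
+The proxy is the only entry point through which a runner accesses host capabilities: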
+
+- model APIs
+- tool APIs
+- knowledge APIs
+- state / storage APIs
+- history / event APIs
+- artifact APIs
+- platform APIs
+
+Every request must carry a `run_id`; the host verifies runner identity and resource ACLs against the active run session.
+
+## 6. Gap between the current implementation and the target
+
+**Landed (current branch)**:
+
+- ✅ `AgentRunnerRegistry`
+- ✅ `AgentRunOrchestrator`: event-first `run(event, binding)`
+- ✅ `AgentRunContextBuilder`: event-first context
+- ✅ `AgentResourceBuilder`
+- ✅ `AgentRunSessionRegistry`
+- ✅ `AgentRunAPIProxy`: model / tool / knowledge / history / event / artifact / state APIs
+- ✅ `PipelineAdapter`: Query → Event + Binding
+- ✅ `AgentBinding` abstraction
+- ✅ `AgentEventEnvelope` abstraction
+- ✅ `max-round` removed from the target protocol; if a similar history-window parameter is still needed, the specific runner's manifest/config schema should expose it as binding config
+- ✅ `PersistentStateStore`: persistent state storage
+- ✅ `EventLogStore` / `TranscriptStore` / `ArtifactStore`
+- ✅ Restricted pull APIs for history / artifact / event
+- ✅ Claude Code external harness MVP: context/resource projection and host-owned resume state smoke
+
+**Owned by other branches (out of scope for this branch)**:
+
+- EventGateway implementation
+- EventRouter implementation
+- AgentBinding persistence UI
+- Platform API action execution
+- Release-grade security hardening
+
+## 7. Rollout order
+
+**Completed**:
+
+1. ✅ Finalized the README routing and the boundaries between topic documents.
+2. ✅ Abstracted `AgentBinding` in the Host, generated by the Pipeline adapter.
+3. ✅ Changed `AgentRunContextBuilder` to be event-first.
+4. ✅ Added persistent transcript / event log / artifact / state storage models.
+5. ✅ Extended `AgentRunAPIProxy` with history / artifact / state APIs.
+6. ✅ Pushed Pipeline-only fields down into the Pipeline adapter.
+7. ✅ Completed migration of the official runner plugins (7 plugins).
+8. ✅ Claude Code runner MVP smoke: external harness context projection and state handoff.
+
+**Follow-up work (other branches)**:
+
+- EventGateway implementation
+- EventRouter and BindingResolver integration
+- Platform action executor
--- a/docs/agent-runner-pluginization/IMPLEMENTATION_PLAN.md
+++ b/docs/agent-runner-pluginization/IMPLEMENTATION_PLAN.md
@@ -0,0 +1,552 @@
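+# Agent Runner pluginization: current implementation and wrap-up plan
+
+> Status note, 2026-05-29: this document is an implementation plan and historical context, not the only source for the latest acceptance conclusions. For the current design entry point see [README.md](./README.md); for protocol boundaries see [PROTOCOL_V1.md](./PROTOCOL_V1.md); for progress see [PROGRESS.md](./PROGRESS.md); for the next round of testing see [PHASE1_QA_ACCEPTANCE_MATRIX.md](./PHASE1_QA_ACCEPTANCE_MATRIX.md).
+
+This document is written for the implementing agent and is used to move the current AgentRunner pluginization implementation to a migratable state.
+
+The current code is no longer a from-scratch PoC. LangBot already has a registry, an orchestrator, context/resource builders, a result normalizer and plugin runtime actions. This plan focuses on the remaining work: completing the generic host capabilities, aligning with the behavior of the old built-in runners, and finishing migration acceptance for the official runner plugins.
+
+## 1. Final state
+
+In the end LangBot keeps only the host capabilities for Agent Runners:
+
+- Runner discovery: `AgentRunnerRegistry`
+- Runner selection: Pipeline config and future event binding config
+- Context construction: `AgentRunContext`
+- Resource trimming: models, tools, knowledge bases, files, storage, platform capabilities
+- Execution scheduling: `AgentRunOrchestrator`
+- Result normalization: `AgentRunResult` -> the current Pipeline's `Message` / `MessageChunk`
+- Error isolation: plugin exceptions, protocol errors, timeouts and oversized results must not break the main flow
+- Legacy config migration: migrate old built-in runner config to official AgentRunner plugin config
+- Call forwarding: the plugin runtime only maintains the running instance of each installed plugin; the Pipeline does not create plugin instances or runner instances
+
+LangBot will no longer maintain built-in business runner branches in the long term. `local-agent`, Dify, n8n, Coze, DashScope, Langflow, Tbox and the like all move to official AgentRunner plugins.
+
+During migration the old `RequestRunner` files may remain as a behavior-parity baseline and as material for rollback analysis. They do not affect current progress; the real end condition is that the main chat execution path no longer depends on the old runners.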
+
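+## 1.1 Current status snapshot
+
+Done or mostly done:
+
+- `AgentRunnerDescriptor`, runner id parsing, registry.
+- `AgentRunOrchestrator` replaces the runner dispatch inside `ChatMessageHandler`.
+- `AgentRunContextBuilder`, `AgentResourceBuilder`, `AgentResultNormalizer`.
+- Reading `ai.runner.id` + `ai.runner_config[id]`, plus legacy config mapping.
+- AgentRunner runtime actions: `LIST_AGENT_RUNNERS`, `RUN_AGENT`.
+- Run-scoped proxy authorization: models, tools, knowledge bases, storage, files.
+- EventLog / Transcript / ArtifactStore / PersistentStateStore.
+- The Pipeline adapter already delegates to the event-first `run(event, binding)`.
+- `local-agent` and the Claude Code runner have passed a local WebUI smoke test.
+
+Still to wrap up:
+
+- Docs final QA and cleanup of the install/release documentation.
+- End-to-end protection for timeout/deadline, cancellation, plugins with no output, and protocol errors.
+- Handling of install / pre-install / missing-at-migration cases for official runner plugins.
+- Release-grade security hardening: path isolation, permission boundaries, secrets, MCP/skill projection policy, resource quotas, audit. This does not block the current protocol loop; see [SECURITY_HARDENING.md](./SECURITY_HARDENING.md).
+- Full integration of the Codex / Kimi runners, issue-centric queues, a complex workflow engine, and complete joint debugging with the EBA branch.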
+
+## 2. High-level architecture
+
+```text
+Pipeline MessageProcessor / future EventRouter
+        |
+        v
+AgentRunOrchestrator
+        |
+        +--> AgentRunnerRegistry
+        |       +--> plugin runtime LIST_AGENT_RUNNERS
+        |       +--> descriptor cache / validation
+        |
+        +--> AgentRunContextBuilder
+        +--> AgentResourceBuilder
+        +--> AgentResultNormalizer
+        |
+        v
+PluginRuntimeConnector.run_agent()
+        |
+        v
+SDK Runtime RUN_AGENT -> plugin AgentRunner.run()
+```
+
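+Key constraints:
+
+- `ChatMessageHandler` does not parse `plugin:*`, does not instantiate a wrapper, and knows nothing about runner component details.
+- `PipelineService.get_pipeline_metadata()` does not access the plugin runtime directly; it reads from the registry.
+- The old `RequestRunner` serves only as a migration reference, not as the final execution path.
+- `AgentRunOrchestrator` is the run orchestration layer on the LangBot side: it handles runner binding resolution, resource authorization, context envelope provisioning, run scope registration, plugin invocation and result normalization; it does not decide the agent's final prompt/window/compaction strategy.
+- Plugins are stateless execution units: multiple Pipelines can bind the same runner id and each keep their own `ai.runner_config[id]`; at run time LangBot only puts the current binding config into `ctx.config` and forwards it to the same plugin runner.
+- Creating multiple plugin instances per Pipeline or per runner config is forbidden. State that must persist across requests has to go through explicitly authorized plugin storage / workspace storage / external services, and must not be stored implicitly in per-pipeline plugin objects.
+- EBA only gets reserved fields; EventBus, EventRouter and platform action execution are not implemented in this round.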
+
+## 3. New LangBot modules
+
+Suggested additions:
+
+```text
+src/langbot/pkg/agent/
+  __init__.py
+  runner/
+    __init__.py
+    descriptor.py
+    errors.py
+    id.py
+    registry.py
+    context_builder.py
+    resource_builder.py
+    orchestrator.py
+    result_normalizer.py
+    config_migration.py
+```
+
+### 3.1 descriptor.py
+
+Defines the descriptor used inside LangBot:
+
+```python
+class AgentRunnerDescriptor(BaseModel):
+    id: str
+    source: Literal["plugin"]
+    label: dict[str, str]
+    description: dict[str, str] | None = None
+    plugin_author: str
+    plugin_name: str
+    runner_name: str
+    plugin_version: str | None = None
+    protocol_version: str = "1"
+    config_schema: list[dict[str, Any]] = []
+    capabilities: dict[str, bool] = {}
+    permissions: dict[str, list[str]] = {}
+    raw_manifest: dict[str, Any] = {}
+```
+
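+`source == "builtin"` is not a final target. If a temporary adapter is needed during implementation, it must be marked as transitional test code and removed once the official plugins work end to end.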
+
+### 3.2 id.py
+
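+Unified runner id parsing and generation:
+
+- Plugin runner id: `plugin:{author}/{plugin_name}/{runner_name}`
+- `parse_runner_id(id)` returns a structured object
+- Business code must not split the string by hand
+- The `plugin:author/name/runner` form that already exists in the PoC remains a valid id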
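+
+The id rules above can be sketched as a small parser. This is a minimal sketch, not the actual `id.py`: the `RunnerId` dataclass, the `format_runner_id` helper and the allowed segment characters are assumptions made for illustration; only the `plugin:{author}/{plugin_name}/{runner_name}` shape and the `parse_runner_id` name come from this plan.
+
+```python
+import re
+from dataclasses import dataclass
+
+# Assumed segment charset: letters, digits, "_", "-" and ".".
+_SEGMENT = r"[A-Za-z0-9_.-]+"
+_RUNNER_ID_RE = re.compile(
+    rf"plugin:(?P<author>{_SEGMENT})/(?P<plugin_name>{_SEGMENT})/(?P<runner_name>{_SEGMENT})"
+)
+
+
+@dataclass(frozen=True)
+class RunnerId:
+    """Hypothetical structured form of a plugin runner id."""
+
+    author: str
+    plugin_name: str
+    runner_name: str
+
+
+def parse_runner_id(runner_id: str) -> RunnerId:
+    """Parse `plugin:{author}/{plugin_name}/{runner_name}` or raise ValueError."""
+    match = _RUNNER_ID_RE.fullmatch(runner_id)
+    if match is None:
+        raise ValueError(f"invalid runner id: {runner_id!r}")
+    return RunnerId(**match.groupdict())
+
+
+def format_runner_id(parsed: RunnerId) -> str:
+    """Inverse of parse_runner_id."""
+    return f"plugin:{parsed.author}/{parsed.plugin_name}/{parsed.runner_name}"
+```
+
+Keeping parse and format in one module gives callers a single place to depend on, which is the point of forbidding ad hoc string splitting.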
+
+### 3.3 registry.py
+
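+Responsibilities:
+
+- Call `ap.plugin_connector.list_agent_runners(bound_plugins=None)` to fetch plugin runners
+- Validate the manifest:
+  - `kind == AgentRunner`
+  - `metadata.name` exists
+  - `metadata.label` exists
+  - `spec.protocol_version` is compatible, default `1`
+  - `spec.config` is a list, default empty
+  - `spec.capabilities` is a dict, default empty
+  - `spec.permissions` is a dict, default empty
+- Output `AgentRunnerDescriptor`
+- Cache discovery results and provide `refresh()`
+- A failure in a single plugin manifest only logs a warning and does not affect other runners
+
+Refresh triggers:
+
+- After a plugin is installed, uninstalled, upgraded or restarted
+- When a Pipeline metadata request finds the cache empty
+- Optional TTL, with correctness taking priority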
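+
+The manifest checks and the per-manifest failure isolation listed above could look roughly like this. It is a sketch under stated assumptions: `validate_manifest`, `collect_valid` and `SUPPORTED_PROTOCOL_VERSIONS` are hypothetical names, manifests are treated as plain dicts, and the real registry would build `AgentRunnerDescriptor` objects instead of returning dicts.
+
+```python
+import logging
+from typing import Any
+
+logger = logging.getLogger(__name__)
+
+SUPPORTED_PROTOCOL_VERSIONS = {"1"}  # assumption: only protocol v1 is accepted
+
+
+def validate_manifest(manifest: dict[str, Any]) -> dict[str, Any]:
+    """Return normalized fields, or raise ValueError for an invalid manifest."""
+    if manifest.get("kind") != "AgentRunner":
+        raise ValueError("kind must be AgentRunner")
+    metadata = manifest.get("metadata") or {}
+    for key in ("name", "label"):
+        if not metadata.get(key):
+            raise ValueError(f"metadata.{key} is required")
+    spec = manifest.get("spec") or {}
+    protocol_version = str(spec.get("protocol_version", "1"))
+    if protocol_version not in SUPPORTED_PROTOCOL_VERSIONS:
+        raise ValueError(f"unsupported protocol_version {protocol_version}")
+    config = spec.get("config", [])
+    capabilities = spec.get("capabilities", {})
+    permissions = spec.get("permissions", {})
+    if not isinstance(config, list):
+        raise ValueError("spec.config must be a list")
+    if not isinstance(capabilities, dict) or not isinstance(permissions, dict):
+        raise ValueError("spec.capabilities and spec.permissions must be dicts")
+    return {
+        "name": metadata["name"],
+        "label": metadata["label"],
+        "protocol_version": protocol_version,
+        "config_schema": config,
+        "capabilities": capabilities,
+        "permissions": permissions,
+    }
+
+
+def collect_valid(manifests: list[dict[str, Any]]) -> list[dict[str, Any]]:
+    """Validate each manifest independently; a bad one is logged and skipped."""
+    valid = []
+    for manifest in manifests:
+        try:
+            valid.append(validate_manifest(manifest))
+        except ValueError as exc:
+            logger.warning("skipping invalid AgentRunner manifest: %s", exc)
+    return valid
+```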
+
+### 3.4 context_builder.py / pipeline_adapter.py
+
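+`context_builder.py` is only responsible for building the SDK v1 `AgentRunContext` from `AgentEventEnvelope + AgentBinding`. Reading the Pipeline Query, filtering parameters and extracting the prompt belong to `PipelineAdapter`, but PipelineAdapter no longer trims the history window or packages bootstrap data.
+
+The path by which the current message Pipeline enters the agent runner: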
+
+```text
+Query
+  -> PipelineAdapter.query_to_event(query)
+  -> PipelineAdapter.pipeline_config_to_binding(query, runner_id)
+  -> PipelineAdapter.build_adapter_context(query, binding)
+  -> AgentRunOrchestrator.run(event, binding, adapter_context=...)
+  -> AgentRunContextBuilder.build_context_from_event(...)
+```
+
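+Stable fields of the Protocol v1 context:
+
+- `run_id`: a new UUID; the query id is not used as the global run id
+- `trigger.type`: the event trigger type, for example `message.received`
+- `conversation`: projection of conversation/thread/launcher/sender/bot/pipeline
+- `event`: stable event context
+- `actor`: the initiator
+- `subject`: the current message, group, channel or other event subject
+- `input`: the input of the current event, not a history message window
+- `delivery`: output surface and platform delivery capabilities
+- `resources`: injected by `resource_builder` based on the binding policy
+- `state`: host-managed scoped state snapshot read from `PersistentStateStore`
+- `runtime`: host/version/workspace/bot/query/trace/deadline
+- `config`: the current binding's configuration for this runner id, i.e. `runner_config`
+- `bootstrap`: optional extension field; the LangBot Host does not fill in a history window by default
+- `adapter`: metadata of the Pipeline or other entry adapter
+
+The Pipeline adapter's `prompt` and public business variables do not enter top-level protocol fields:
+
+- filtered params -> `ctx.adapter.extra["params"]`
+- the legacy/effective prompt can be kept temporarily in `ctx.adapter.extra["prompt"]`, but official
+  runners should not treat it as a behavioral contract
+- The LangBot Host does not generate `bootstrap.messages`, `adapter_messages` or context packaging metadata
+
+For now, do not push new compaction or token-budget trimming back into a Pipeline stage. The Pipeline only handles entry adaptation; full history and long-term context are backed by EventLog / Transcript / pull APIs / a future ContextCompressor.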
+
+### 3.4.1 Agentic context plan
+
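+EventLog / Transcript / Host pull APIs have landed; `ContextCompressor` is still a design reservation.
+The goal is to let the Pipeline gradually shrink to an entry adapter and let the AgentRunner layer own context packaging.
+
+The Host should keep three kinds of sources of truth and restricted APIs: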
+
+```text
+ConversationStore / EventLog
+  -> durable append-only raw messages, events, tool results, artifact refs
+ConversationProjection
+  -> converts events into agent-readable conversation history
+ContextCompressor
+  -> future optional service for summaries/checkpoints, requested and consumed by runners
+```
+
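+Key principles:
+
+- Full history belongs to the LangBot host, not to the plugin instance. Plugins remain singleton/stateless.
+- `ctx.bootstrap.messages` is not a working context that the Host sends by default.
+- The full history must not be copied/serialized to the plugin runtime on every turn; otherwise long conversations incur O(n) cost and cross-process payload bloat.
+- `max-round` or similar window rules are not part of LangBot Host / Pipeline semantics.
+- Once LiteLLM is integrated, model window metadata should be exposed to the runner as resource/runtime metadata, and the runner decides the budget and compaction strategy.
+- What `ContextCompressor` produces is a derived summary/checkpoint; it must not overwrite or delete raw history.
+- Recovery after restart relies on the persistent store and summary checkpoints, not on the in-process conversation list in `SessionManager`.
+
+Restricted APIs needed in the future: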
+
+```python
+api.get_conversation_messages(cursor: str | None, limit: int) -> HistoryPage
+api.get_context_summary(scope: str = "conversation") -> ContextSummary | None
+api.request_context_compaction(policy: dict) -> CompactionResult
+```
+
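+These APIs must be bound to the `run_id`, runner id, actor/subject scope and resource permissions; the Host needs to limit
+page size, total bytes, deadline and the accessible conversations.
+
+### 3.4.2 Large artifacts and tool collaboration
+
+Large files, multimodal inputs and tool outputs should not be inlined into the prompt, bootstrap or tool results. Going forward, collaboration uses
+artifact/resource refs uniformly:
+
+- message/content carries only small text and necessary summaries.
+- Large files, images, audio and long tool outputs return `artifact_id`, `mime_type`, `size`, `digest`,
+  `summary`, `expires_at`, `permissions`.
+- `/tmp` can only serve as temporary staging for a single run, for short-lived reads and writes by plugins or tools; it is not a durable store
+  and cannot be the basis for recovery after restart.
+- box/object storage is the target location for long-term artifacts. The current branch has not merged the box capability yet, so this round only reserves it in the docs and implements no API.
+- When tools pass large results to each other they should pass an artifact ref, not the full blob. When the agent needs to read one, it goes through the restricted proxy.
+
+Suggested future APIs: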
+
+```python
+api.get_artifact_metadata(artifact_id: str) -> ArtifactMetadata
+api.open_artifact_stream(artifact_id: str) -> AsyncIterator[bytes]
+api.read_artifact_range(artifact_id: str, offset: int, length: int) -> bytes
+api.create_temp_artifact(name: str, content_type: str, ttl_seconds: int) -> ArtifactWriter
+```
+
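+Security constraints:
+
+- The Host checks that an artifact belongs to the current run, conversation, actor/subject scope or authorized resources.
+- By default plugins are not allowed to read arbitrary local paths, including arbitrary paths under `/tmp`.
+- Temporary files should have a TTL and a cleanup mechanism; box artifacts should have a retention policy.
+- Before a multimodal file reaches the model, the runner/context packager decides whether to pass a reference, a summary, a thumbnail or the actual bytes.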
+
+### 3.5 resource_builder.py
+
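+Three layers of trimming are applied before execution:
+
+1. The `spec.permissions` declared by the runner manifest
+2. The Pipeline's `extensions_preferences`
+3. The resource scope selected in the current Pipeline runner binding config
+
+The output is written to `ctx.resources` and covers at least:
+
+- models: UUID, type and capability summary of callable models. This includes LLM, fallback LLM, rerank and other model resources selected in the runner config schema.
+- tools: visible tool manifests, using the current bound plugins / MCP server scope
+- knowledge_bases: list of retrievable knowledge bases
+- storage: permission summary for plugin storage / workspace storage
+- files: summary of config files and knowledge files that may be read
+- platform_capabilities: declared only in this phase; no platform actions are executed
+
+Note: the old unrestricted proxy actions must be validated a second time; relying on the context declaration alone is not enough. The resources available to an AgentRunner should come from `ctx.resources`, not from the global capabilities of the plugin runtime.
+
+This phase does not integrate sandbox/skills and does not reserve runner-visible fields for them. After the related branches are merged,
+execution, file, skill, MCP and similar capabilities should first be wrapped by the Host as ordinary tools and then enter the runner through
+`ctx.resources.tools`; the runner should not recognize or hard-code the execution environment provider.
+
+Resource trimming should be as generic as possible and not hard-coded for local-agent:
+
+- `model-fallback-selector` authorizes the primary/fallback LLM.
+- `llm-model-selector` authorizes an LLM.
+- `rerank-model-selector` authorizes a rerank model.
+- `knowledge-base-multi-selector` authorizes knowledge bases.
+- When new selectors are added later, extend the resource builder in one place.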
+
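+### 3.5.1 Reserved for the future EventRouter
+
+The current branch does not implement the EBA EventRouter, but the AgentRunner protocol must be compatible with non-message events from now on. Do not write a separate runner wrapper for message recall, group member join and friend request in the future; the unified entry point should be: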
+
+```text
+EventRouter -> AgentRunOrchestrator.run_from_event(event_request)
+```
+
+After EBA lands, `ConversationStore` should not store only chat messages; it should be generated as a projection from `EventLog`:
+
+```text
+Platform Adapter
+  -> EventLog append raw event
+  -> ConversationProjection update message/history view when applicable
+  -> EventRouter resolve binding
+  -> AgentRunOrchestrator.run_from_event(event_request)
+  -> Context packager builds working context from projection + state + artifacts
+```
+
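+This way message events, tool events, group member events and friend request events share the same run/session/state/resource
+boundary; non-message events also do not need to be disguised as a user text message.
+
+`event_request` needs to contain at least:
+
+- `event_type`: a stable protocol name, for example `message.recalled`, `group.member_joined`, `friend.request_received`
+- `event_id` / `event_timestamp`
+- `event_data`: summary of the platform's raw payload and the source event type
+- `actor`: the initiator, for example the user who recalled a message, a new member, or a friend requester
+- `subject`: the object the event acts on, for example the recalled message, the group/member relationship, or the friend request
+- `conversation`: optional. Group events carry launcher semantics; a friend request may not have a conversation yet
+- `input`: optional structured input. Non-message events may have `text=None`, `contents=[]`
+- `binding`: the runner id, runner config and resource scope resolved from the event binding
+
+Stable event names reserved for now:
+
+- `message.received`
+- `message.recalled`
+- `group.member_joined`
+- `friend.request_received`
+
+These event names should stay stable as part of the plugin protocol. Platform raw event names may only appear in `event_data` and must not become the public contract of `ctx.event.event_type`.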
+
+### 3.6 result_normalizer.py
+
+Only SDK v1 results are accepted:
+
+- `message.delta`
+- `message.completed`
+- `tool.call.started`
+- `tool.call.completed`
+- `state.updated`
+- `run.completed`
+- `run.failed`
+- `action.requested` may be returned experimentally, but in this phase it is only recorded as telemetry and not executed
+
+Mapping:
+
+- `message.delta.data.chunk` -> `provider_message.MessageChunk`
+- `message.completed.data.message` -> `provider_message.Message`
+- `run.completed.data.message` -> `provider_message.Message`
+- `run.failed` -> raise a controlled exception so that `ChatMessageHandler` applies its existing error policy
+- Tool and state events are not yielded to the Pipeline by default; they are only recorded as debug/telemetry
+
+Safeguards:
+
+- Unknown types are ignored after a warning
+- Size limit on each serialized result
+- A provider message schema validation failure is converted to `run.failed`
+- If the plugin outputs no message at all, treat it as a runner failure
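+
+A simplified picture of the mapping and safeguards above, operating on plain dicts rather than SDK models. Everything here is an assumption for illustration: `RunnerFailed`, `MAX_RESULT_BYTES` (the plan fixes no number) and the `data["error"]` key are hypothetical, and provider message schema validation is left out.
+
+```python
+import json
+from typing import Any, Iterable, Iterator
+
+MAX_RESULT_BYTES = 256 * 1024  # assumed limit, not taken from the plan
+MESSAGE_TYPES = {"message.delta", "message.completed", "run.completed"}
+SILENT_TYPES = {"tool.call.started", "tool.call.completed", "state.updated", "action.requested"}
+
+
+class RunnerFailed(Exception):
+    """Hypothetical controlled error surfaced to the chat handler."""
+
+
+def normalize(results: Iterable[dict[str, Any]]) -> Iterator[Any]:
+    """Yield message payloads; enforce the size limit and the no-output rule."""
+    produced = False
+    for result in results:
+        if len(json.dumps(result).encode("utf-8")) > MAX_RESULT_BYTES:
+            raise RunnerFailed("result exceeds size limit")
+        rtype = result.get("type")
+        data = result.get("data") or {}
+        if rtype == "run.failed":
+            raise RunnerFailed(str(data.get("error", "runner failed")))
+        if rtype in SILENT_TYPES:
+            continue  # telemetry only, never yielded to the Pipeline
+        if rtype not in MESSAGE_TYPES:
+            continue  # unknown type: a real implementation would log a warning
+        payload = data.get("chunk") if rtype == "message.delta" else data.get("message")
+        if payload is not None:
+            produced = True
+            yield payload
+    if not produced:
+        raise RunnerFailed("runner produced no message")
+```
+
+Because this is a generator, the no-output failure only surfaces once the result stream is exhausted, so callers need to consume it fully before treating the run as successful.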
+
+### 3.7 orchestrator.py
+
+Core entry point:
+
+```python
+async def run_from_query(query: pipeline_query.Query) -> AsyncGenerator[Message | MessageChunk, None]:
+    runner_id = resolve_runner_id(query.pipeline_config)
+    descriptor = await registry.get(runner_id, bound_plugins=query.variables.get("_pipeline_bound_plugins"))
+    ctx = await context_builder.from_query(query, descriptor)
+    async for raw in plugin_connector.run_agent(...):
+        async for message in result_normalizer.normalize(raw):
+            yield message
+```
+
+Must cover:
+
+- Runner id does not exist
+- Plugin system is disabled
+- Runner is outside the bound plugins scope
+- Plugin runtime disconnects
+- Runner protocol version is incompatible
+- Run timeout
+- Task cancellation
+
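+## 4. Switching the config model directly
+
+The config model expresses the binding from a Pipeline to a runner id, not a plugin instance. After a plugin is installed, the plugin runtime manages a single running instance of it; when different Pipelines select the same runner id, they only store different `runner_config[id]` values, which are passed in through `AgentRunContext.config` at call time.
+
+Target format: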
+
+```json
+{
+  "ai": {
+    "runner": {
+      "id": "plugin:langbot/local-agent/default",
+      "expire-time": 0
+    },
+    "runner_config": {
+      "plugin:langbot/local-agent/default": {}
+    }
+  }
+}
+```
+
+Compatible reads:
+
+- Read `ai.runner.id` first
+- If there is no `id`, read the legacy `ai.runner.runner`
+- Legacy built-in runner names are mapped through the migration table:
+  - `local-agent` -> `plugin:langbot/local-agent/default`
+  - `dify-service-api` -> `plugin:langbot/dify-agent/default`
+  - `n8n-service-api` -> `plugin:langbot/n8n-agent/default`
+  - `coze-api` -> `plugin:langbot/coze-agent/default`
+  - `dashscope-app-api` -> `plugin:langbot/dashscope-agent/default`
+  - `langflow-api` -> `plugin:langbot/langflow-agent/default`
+  - `tbox-app-api` -> `plugin:langbot/tbox-agent/default`
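+
+The compatible-read rule can be sketched as a small resolver. The mapping is copied from the table above; the `LEGACY_RUNNER_IDS` name and the choice to raise `ValueError` for an unknown legacy name are assumptions, since the plan does not say how an unmapped name is handled.
+
+```python
+from typing import Any
+
+# Mapping copied from the migration table above.
+LEGACY_RUNNER_IDS = {
+    "local-agent": "plugin:langbot/local-agent/default",
+    "dify-service-api": "plugin:langbot/dify-agent/default",
+    "n8n-service-api": "plugin:langbot/n8n-agent/default",
+    "coze-api": "plugin:langbot/coze-agent/default",
+    "dashscope-app-api": "plugin:langbot/dashscope-agent/default",
+    "langflow-api": "plugin:langbot/langflow-agent/default",
+    "tbox-app-api": "plugin:langbot/tbox-agent/default",
+}
+
+
+def resolve_runner_id(pipeline_config: dict[str, Any]) -> str:
+    """Prefer `ai.runner.id`; fall back to the legacy `ai.runner.runner` name."""
+    runner = (pipeline_config.get("ai") or {}).get("runner") or {}
+    runner_id = runner.get("id")
+    if runner_id:
+        return runner_id
+    legacy_name = runner.get("runner")
+    if legacy_name in LEGACY_RUNNER_IDS:
+        return LEGACY_RUNNER_IDS[legacy_name]
+    # Assumed behavior: an unmapped legacy name is an error, not a silent default.
+    raise ValueError(f"cannot resolve runner from config: {legacy_name!r}")
+```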
+
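+Write policy:
+
+- The new UI writes only `ai.runner.id` and `ai.runner_config`
+- The backend update API accepts legacy fields but normalizes to the new format on save
+- The migration persists everything to the database at the end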
+
+## 5. LangBot areas that need changes
+
+Must change:
+
+- `src/langbot/pkg/core/app.py`
+  - Add the `agent_runner_registry` / `agent_run_orchestrator` attributes
+- `src/langbot/pkg/core/stages/build_app.py`
+  - Initialize the Agent subsystem
+- `src/langbot/pkg/pipeline/process/handlers/chat.py`
+  - Remove `PluginAgentRunnerWrapper`
+  - Remove the built-in runner lookup logic
+  - Call the orchestrator
+- `src/langbot/pkg/api/http/service/pipeline.py`
+  - Generate metadata from the registry
+- `src/langbot/pkg/plugin/connector.py`
+  - Add protocol validation and the bound plugin parameter to `list_agent_runners()` / `run_agent()`
+- `src/langbot/pkg/plugin/handler.py`
+  - Second permission check for proxy actions
+- `src/langbot/pkg/pipeline/preproc/preproc.py`
+  - No longer build tools, knowledge bases and models only for `local-agent`
+  - Keep multimodal input for all agent runners
+- `src/langbot/pkg/pipeline/pipelinemgr.py`
+  - Runner name monitoring reads `runner.id` instead
+- `src/langbot/templates/metadata/pipeline/ai.yaml`
+  - Move the runner field from `runner` to `id`
+- `src/langbot/templates/default-pipeline-config.json`
+  - Change the default runner to the official local-agent plugin id
+- `web/src/app/home/pipelines/components/pipeline-form/PipelineFormComponent.tsx`
+  - Read the current runner from `ai.runner.id`
+  - Write the runner config section to `ai.runner_config[id]`
+
+Eventually remove or disable:
+
+- The business registration path in `src/langbot/pkg/provider/runner.py`
+- The run entry points in `src/langbot/pkg/provider/runners/*`
+
+The files can be kept temporarily as a reference for official plugin migration, but must not be referenced at run time.
+
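+## 6. Wrap-up implementation order
+
+### Step 1: Complete the host context
+
+- The SDK `AgentRunContext` stays event-first: `event/input/delivery/resources/context/state/runtime/config/bootstrap/adapter`.
+- The LangBot context builder writes stable protocol fields only from `AgentEventEnvelope + AgentBinding`.
+- The Pipeline adapter may write public business variables to `ctx.adapter.extra["params"]`; if the legacy/effective prompt is kept in `ctx.adapter.extra["prompt"]`, it is also only adapter metadata.
+- Keep `ctx.config` expressing static binding config only.
+
+### Step 2: Strengthen the host AgentRun proxy actions
+
+- `invoke_llm` / `invoke_llm_stream` recover the current Query through `run_id/query_id`.
+- Automatically merge the model's persisted `extra_args` with action-level overrides.
+- Automatically apply the pipeline's `remove-think`, and allow an explicit action-level override.
+- `call_tool` passes the current Query back, restoring the legacy tool call context.
+- `retrieve_knowledge` keeps settings such as `bot_uuid`, `sender_id` and `session_name`.
+- `invoke_rerank` uses run-scoped model authorization.
+
+### Step 3: Generalize resource building
+
+- Build resources from manifest permissions + bound plugins/MCP + runner config schema.
+- Support primary/fallback LLM, rerank model and KB selector.
+- Do not let local-agent special cases spread into the generic resource layer.
+
+### Step 4: local-agent parity
+
+- Use the static binding config `ctx.config["prompt"]`; do not read `ctx.adapter.extra["prompt"]`.
+- Pull the transcript through the Host history API; do not read `ctx.bootstrap.messages` or adapter window fields.
+- Build the current user message from `ctx.input.contents`, preserving multimodal content.
+- RAG only replaces/inserts the text part and does not drop images/files.
+- Streaming/non-streaming follows `runtime.metadata.streaming_supported` by default.
+- After the first-round fallback succeeds, the tool loop sticks to the committed model.
+- The tool loop keeps passing the available tools and supports multi-step tool calls.
+- Rerank is invoked through an authorized model resource.
+
+### Step 5: End-to-end protection and tests
+
+- When a plugin produces no output, treat it as a runner failure.
+- Timeout/deadline covers the plugin runtime, model calls and external runner calls.
+- Runner protocol errors are converted to controlled errors.
+- Cover the user-visible behavior of local-agent: plain reply, streaming, tools, multi-step tools, KB, rerank, multimodal, binding prompt, history API.
+
+### Step 6: Official runner migration
+
+- Remove the built-in runner registry once the official plugins are ready
+- Remove or isolate run-time references to the provider runners
+- Test that legacy runner names can only be mapped to plugin ids through the migration
+
+### Step 7: Historical config migration
+
+- Write the persistence migration
+- Update the default pipeline config
+- Migrate legacy fields to the new fields for existing Pipelines
+- Switch the runner field in monitoring/logs to the new id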
+
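+## 7. Test requirements
+
+Unit tests:
+
+- runner id parse / format
+- registry manifest validation, failure isolation, bound plugins filtering
+- context builder generates a complete v1 context from a query
+- resource builder three-layer trimming
+- result normalizer mapping for each result type
+- legacy config resolve and migration
+
+Integration tests:
+
+- A fake AgentRunner plugin can be selected by a Pipeline
+- Streaming output can still update the message card
+- A plugin exception returns a user-understandable error without interrupting the runtime
+- A runner outside the bound plugins cannot be executed
+- Unauthorized tool / knowledge base / model proxy calls are rejected
+- A legacy `local-agent` Pipeline config is migrated to the official plugin id
+
+## 8. Acceptance criteria
+
+- A LangBot Pipeline can select a plugin AgentRunner and complete both non-streaming and streaming replies.
+- `ChatMessageHandler` contains no plugin runner parsing or wrapper.
+- `PipelineService` does not assemble plugin runner metadata directly.
+- All runner config uses `ai.runner.id` + `ai.runner_config`.
+- The plugin runtime does not create a plugin instance per Pipeline or per runner config; `runner_config` is passed in only as binding config through `ctx.config`.
+- The main chat path no longer executes business runners through the old built-in runners. The old files may remain during migration.
+- Plugins can only access the models, tools, knowledge bases and files authorized by `ctx.resources`.
+- Host actions can restore the necessary Query semantics for AgentRunner calls; plugins do not need a raw Query.
+- The external behavior of the official `local-agent` plugin matches the old built-in local-agent.
+- EBA-related fields are only reserved in context/result; no platform actions are executed.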
--- a/docs/agent-runner-pluginization/OFFICIAL_RUNNER_PLUGINS.md
+++ b/docs/agent-runner-pluginization/OFFICIAL_RUNNER_PLUGINS.md
@@ -0,0 +1,329 @@
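+# Official AgentRunner plugin migration plan
+
+This document describes how the official runner plugins are organized, migrated and accepted after the built-in `RequestRunner` is moved out of LangBot.
+It is a downstream rollout plan of [HOST_SDK_INFRASTRUCTURE.md](./HOST_SDK_INFRASTRUCTURE.md) and
+[AGENT_CONTEXT_PROTOCOL.md](./AGENT_CONTEXT_PROTOCOL.md), not a design premise of the LangBot
+host protocol.
+
+The official `local-agent` can be moved out as-is or rewritten. The design focus is not to preserve the internal structure of the old built-in runner,
+but to verify that an official agent relying on LangBot host infrastructure can work end to end. At the same time, LangBot's
+host protocol must serve runners that manage their own context/runtime, such as Claude Code SDK, Codex, Pi Agent SDK and external agent platforms,
+and must not be tied to the implementation details of the official plugins.
+
+The current implementation is already in a transition phase:
+
+- LangBot's main chat path calls the pluginized `AgentRunner` through `AgentRunOrchestrator`.
+- The old `src/langbot/pkg/provider/runners/*` is still kept as a migration reference and rollback analysis material; deleting it is not required before the official plugin migration is complete.
+- Official runners are currently developed as independent plugin directories/repositories, for example `langbot-local-agent/` and `langbot-agent-runner/*-agent/`. Landing a single monorepo first is no longer required.
+- `claude-code-agent` and `codex-agent` have been integrated as external harness runner MVPs to validate the boundary for self-managed runtimes such as Claude Code / Codex / Kimi Code.
+
+## 1. Why a new repository
+
+Official runner plugins iterate at a different pace from the LangBot main repository and the SDK repository:
+
+- The LangBot main repository maintains only the host protocol and scheduling.
+- The SDK repository maintains the AgentRunner component and the runtime protocol.
+- Official runner plugins carry the concrete implementation of business runners and third-party platform adaptation.
+
+Do not tie official runner plugins back into the LangBot main repository. Local path plugins are allowed during development, but the runtime boundary must remain:
+
+- LangBot provides generic host capabilities: the current event, context handles, resource authorization, state/storage, history, artifacts, model/tool/knowledge base call proxies, and result normalization.
+- Plugins consume these public capabilities and implement the concrete runner behavior.
+- By default LangBot does not inline the full message history to the runner; the runner pulls history and artifacts on demand through authorized APIs.
+- The old built-in runners serve only as a baseline for behavior parity, not as a long-term execution path.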
+
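+## 2. Repository layout
+
+The currently recommended strategy is "official plugins can be released independently, sharing an SDK helper when necessary". During development a local multi-directory layout can be used: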
+
+```text
+langbot-app/
+  langbot-local-agent/
+    manifest.yaml
+    components/agent_runner/default.yaml
+    components/agent_runner/default.py
+    pkg/
+    tests/
+  langbot-agent-runner/
+    claude-code-agent/
+    codex-agent/
+    n8n-agent/
+    ...
+```
+
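+Later, multiple official runners can be aggregated into a monorepo or continue to be released independently. This choice does not affect the protocol design; the protocol boundary is guaranteed by the SDK and the LangBot host.
+
+If multiple runners end up with duplicated logic, prefer to consolidate it into the SDK or a clearly defined shared helper package; do not leak host-private structures to plugins.
+
+## 3. Plugin naming and runner id
+
+Fixed mapping:
+
+| Legacy runner | Official plugin | runner id |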
+| --- | --- | --- |
+| `local-agent` | `langbot/local-agent` | `plugin:langbot/local-agent/default` |
+| `dify-service-api` | `langbot/dify-agent` | `plugin:langbot/dify-agent/default` |
+| `n8n-service-api` | `langbot/n8n-agent` | `plugin:langbot/n8n-agent/default` |
+| `coze-api` | `langbot/coze-agent` | `plugin:langbot/coze-agent/default` |
+| - | `langbot/claude-code-agent` | `plugin:langbot/claude-code-agent/default` |
+| - | `langbot/codex-agent` | `plugin:langbot/codex-agent/default` |
+| `dashscope-app-api` | `langbot/dashscope-agent` | `plugin:langbot/dashscope-agent/default` |
+| `langflow-api` | `langbot/langflow-agent` | `plugin:langbot/langflow-agent/default` |
+| `tbox-app-api` | `langbot/tbox-agent` | `plugin:langbot/tbox-agent/default` |
+
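+Each plugin may provide multiple runners later, but the default runner targeted by the migration is always named `default`.
+
+## 4. Migration priority
+
+### Batch 1: Prove the protocol end to end
+
+1. `local-agent`
+2. `claude-code-agent`
+3. `codex-agent`
+4. `dify-agent`
+
+Rationale:
+
+- `local-agent` covers models, tools, knowledge bases, streaming and conversation history, making it the most complete baseline.
+- `claude-code-agent` / `codex-agent` represent local or external code-agent harnesses such as Claude Code / Codex / Kimi Code: they usually bring their own session, tool loop, context compaction and permission model, and LangBot mainly provides IM events, resource projection, audit and state pointers.
+- `dify-agent` represents calls to an external agent platform; its configuration and error handling validate how traditional service API runners migrate.
+
+### Batch 2: Migrate external workflow runners
+
+1. `n8n-agent`
+2. `langflow-agent`
+
+This batch mainly validates webhook/workflow input and output, timeouts and external conversation ids.
+
+### Batch 3: Migrate platform Agent APIs
+
+1. `coze-agent`
+2. `dashscope-agent`
+3. `tbox-agent`
+
+This batch mainly validates platform-specific response formats, cited references and file/image input.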
+
+## 5. Component requirements for each official plugin
+
+Each plugin contains at least:
+
+```yaml
+apiVersion: langbot/v1
+kind: AgentRunner
+metadata:
+  name: default
+  label:
+    en_US: Dify Agent
+    zh_Hans: Dify Agent
+  description:
+    en_US: Run a Dify application as a LangBot AgentRunner.
+    zh_Hans: 将 Dify 应用作为 LangBot AgentRunner 运行。
+spec:
+  config: []
+  capabilities:
+    streaming: true
+    tool_calling: false
+    knowledge_retrieval: false
+    multimodal_input: false
+    event_context: true
+    platform_api: false
+    interrupt: false
+    stateful_session: true
+  permissions:
+    models: []
+    tools: []
+    knowledge_bases: []
+    storage: ["plugin"]
+    files: []
+    platform_api: []
+execution:
+  python:
+    path: ./main.py
+    attr: DefaultAgentRunner
+```
+
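+## 6. Direction for the local-agent plugin
+
+`local-agent` is an important consumer among the official plugins, but it is not the design center of the host protocol. It may reuse
+the old implementation or be rewritten entirely. It needs to prove that an agent runner relying mainly on LangBot host capabilities
+can handle models, tools, knowledge bases, state, history, artifacts, context compaction and message delivery through the public protocol.
+
+LangBot core should not keep business orchestration logic for the sake of local-agent. local-agent's prompt assembly, history
+pulling, summary/checkpoint, tool loop, RAG orchestration, fallback and multimodal handling should all be done inside the plugin.
+
+A migration or rewrite needs to cover the user-visible capabilities of the old built-in runner:
+
+- model primary/fallback selection
+- prompt
+- knowledge-bases
+- rerank-model
+- rerank-top-k
+- function calling
+- streaming
+- multimodal input
+- conversation history
+- monitoring metadata
+
+Responsibility boundary with the LangBot main repository:
+
+- LangBot builds the current event, structured input, resource authorization, context handles, state/storage capabilities and delivery capabilities
+- LangBot does not inline the full history by default and does not assemble the final model context for the plugin
+- The plugin selects the model, builds requests, calls the LLM, handles the tool call loop and outputs the result stream
+- The plugin must not bypass `ctx.resources` to call unauthorized models, tools or knowledge bases
+
+To preserve the user-visible behavior of the old built-in runner, the `local-agent` plugin should consume the effective input processed by the host and
+the restricted APIs rather than read host-internal private structures:
+
+- `ctx.event` / `ctx.input`: the current structured input; multimodal content such as images and files must be preserved.
+- `ctx.context`: history cursor, inline policy, available context APIs.
+- `AgentRunAPIProxy.history`: read the transcript on demand instead of relying on the host forcing a history window every turn.
+- `AgentRunAPIProxy.artifacts`: read images, files and large tool results on demand.
+- `AgentRunAPIProxy.state` / storage: store optional state such as summaries, external conversation ids and user preferences.
+- `ctx.resources`: authorized models, tools, knowledge bases, files and storage.
+- `ctx.runtime.metadata.streaming_supported`: whether the current adapter can consume streaming output.
+- Host proxy actions: model, tool, knowledge base and rerank calls must have their resource permissions checked against `run_id`.
+
+`local-agent` should not consume a history window generated by the Pipeline adapter, nor read
+`ctx.adapter.extra.prompt`. It should read the static `prompt` from the binding config and pull the transcript through the Host
+history API. The Pipeline adapter keeps no Host-side window compatibility logic.
+
+The local-agent manifest should use hybrid or self-managed context: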
+
+```yaml
+context:
+  ownership: hybrid
+  bootstrap: current_event
+  max_inline_events: 0
+  max_inline_bytes: 0
+  supports_history_pull: true
+  supports_history_search: true
+  supports_artifact_pull: true
+  owns_compaction: true
+  wants_static_context_refs: true
+```
+
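+This means LangBot provides only the current event and context handles; local-agent itself decides whether to pull history, whether to search,
+when to summarize, and how to build the final prompt.
+
+### 6.1 Native Execution / Skills: later integration
+
+This phase does not turn sandbox/skills into AgentRunner protocol fields and does not reserve runner-visible fields for them.
+After the sandbox/skills branch is merged, capabilities such as command execution, file operations, skills and MCP managed processes
+should first be wrapped by the LangBot Host as scoped tools and then exposed to the runner through `ctx.resources.tools`.
+
+This lets local-agent consume only authorized Host infrastructure instead of holding host machine execution capabilities directly.
+External harness runners such as Claude Code / Codex can keep their own execution model for now, but the docs and
+configuration must state clearly whether they use the tool projection provided by LangBot.
+
+## 7. Requirements for external runner plugins
+
+When migrating external platform runners:
+
+- Keep legacy config field names unchanged where possible so the migration can copy them
+- Convert all output to `AgentRunResult`
+- Read the external API timeout from the runner config
+- Store the platform conversation id in plugin storage or context runtime state; do not depend on the private structure of LangBot's built-in conversation uuid
+- Declare streaming support according to platform capability; without streaming, emit only `message.completed`
+
+### 7.1 Requirements for code-agent harness runners
+
+Runners such as Claude Code, Codex and Kimi Code do not necessarily execute through LangBot's model/tool loop. They can rely on their own harness, but must still respect LangBot's host boundary:
+
+- Input comes from `ctx.event` / `ctx.input`; do not depend directly on the Pipeline-private `Query`.
+- Resources authorized by LangBot should be projected as harness-readable context files, MCP config, skill directories, environment variables or CLI arguments.
+- Cross-turn pointers such as the external session id, workspace and checkpoint should be written to Host state or plugin storage; the plugin instance itself stays stateless.
+- CLI / subprocess runners must handle timeout, cancellation, empty output, non-zero exit and stderr mapping.
+- If the external harness chooses to use LangBot-managed execution capabilities, it should consume Host-authorized resources through a scoped MCP/tool
+  projection; otherwise it is in external harness mode and must not claim to have
+  LangBot-managed execution isolation.
+- The external harness's permission mode, allowed/disallowed tools and MCP config are only one layer of execution constraints; LangBot remains responsible for pre-call resource authorization, path policy, secret filtering and audit. For release-grade requirements see [SECURITY_HARDENING.md](./SECURITY_HARDENING.md).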
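+
+The subprocess requirement in the list above (timeout, cancellation, empty output, non-zero exit, stderr mapping) can be sketched with the standard library's `asyncio` subprocess API. `run_harness` and `HarnessError` are hypothetical names; this is not how any official runner is implemented, and a real runner would map the error into a `run.failed` result.
+
+```python
+import asyncio
+import sys
+
+
+class HarnessError(Exception):
+    """Hypothetical controlled error for CLI harness failures."""
+
+
+async def run_harness(argv: list[str], timeout: float) -> str:
+    """Run a CLI harness once and map its failure modes to HarnessError."""
+    proc = await asyncio.create_subprocess_exec(
+        *argv,
+        stdout=asyncio.subprocess.PIPE,
+        stderr=asyncio.subprocess.PIPE,
+    )
+    try:
+        stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout=timeout)
+    except (asyncio.TimeoutError, asyncio.CancelledError) as exc:
+        # Do not leave the child running after a timeout or cancellation.
+        try:
+            proc.kill()
+        except ProcessLookupError:
+            pass  # the process already exited
+        await proc.wait()
+        if isinstance(exc, asyncio.CancelledError):
+            raise
+        raise HarnessError(f"harness timed out after {timeout}s") from exc
+    if proc.returncode != 0:
+        detail = stderr.decode(errors="replace").strip()
+        raise HarnessError(f"harness exited with {proc.returncode}: {detail}")
+    text = stdout.decode(errors="replace").strip()
+    if not text:
+        raise HarnessError("harness produced no output")
+    return text
+
+
+if __name__ == "__main__":
+    # Usage example: the Python interpreter stands in for a real harness CLI.
+    print(asyncio.run(run_harness([sys.executable, "-c", "print('ok')"], timeout=10)))
+```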
+
+### 7.2 SDK-owned LangBot MCP bridge
+
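+External harnesses such as Claude Code / Codex cannot hold the in-process Python
+`plugin_runtime_handler`, so they cannot call `AgentRunAPIProxy` directly the way
+`local-agent` does. The current lightweight approach is a per-run MCP bridge provided by the SDK:
+
+- `AgentRunner.create_external_mcp_bridge(ctx)` is the entry point on the runner base class.
+- The bridge is constructed from `AgentRunAPIProxy` and `AgentRunContext`, and its lifetime covers only the current run.
+- The bridge exposes the explicitly annotated `AgentRunExternalTools` in the SDK rather than scanning or exporting every SDK action.
+- The MCP tool schema is generated from annotations and Pydantic args models; runner plugins do not each hand-write LangBot tool schemas.
+- The stdio MCP proxy only forwards the external harness's MCP calls back to the local bridge of the current run.
+- The bridge is closed when the run ends; it is not a global MCP server of the LangBot main process.
+
+The first batch of tools is kept deliberately small: current event snapshot, history page, knowledge retrieve, and authorized tool call. Any tool added later must first enter the SDK-owned annotated surface and is then projected automatically by the MCP adapter.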
+
+## 8. Current shape of the Claude Code runner
+
+The current `claude-code-agent` is a minimal runnable MVP that demonstrates an external harness runner can plug into the same AgentRunner protocol.
+
+### 8.1 Basic behavior
+
+- Runner ID: `plugin:langbot/claude-code-agent/default`
+- Execution: local Claude Code CLI in print mode; the default command is `claude -p`
+- Default output: `message.completed` + `run.completed`
+- Default permissions: `permission-mode=plan`, `max-turns=1`, `disallowedTools=AskUserQuestion`
+- Default state: if Claude Code returns a `session_id`, the runner writes it back as `external.session_id` through `state.updated`
+- Working directory: the binding config's `working-directory` takes precedence, followed by `external.working_directory` in Host state
+
+### 8.2 Context / skill / MCP projection
+
+The Claude Code runner currently projects LangBot's event-first context to the external harness:
+
+- Writes `agent-context.json` with schema `langbot.agent_runner.external_harness_context.v1`
+- Writes `LANGBOT_CONTEXT.md` as a human-readable summary
+- Points the prompt prefix at the context file path
+- Can write the binding-provided `skills-json` to Claude Code's native `.claude/skills/<name>/SKILL.md`
+- Can write the binding-provided `mcp-config-json` as a per-run MCP config and pass it to Claude Code through `--mcp-config` / `--strict-mcp-config`
+- Can enable the SDK-owned per-run LangBot MCP bridge with `enable-langbot-mcp=true`, so Claude Code calls the restricted `AgentRunAPIProxy` capabilities over MCP
+
+These projections are currently done by the runner adapter. The preferable long-term shape is for the LangBot Host to generate the scoped resource projection, with the runner only adapting it to Claude Code's native directories and CLI arguments.
+
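+### 8.3 Verified capabilities
+
+Verified locally on 2026-05-29:
+
+- WebUI Debug Chat can invoke `claude-code-agent` through the Pipeline adapter
+- Claude Code can read the LangBot context file and output the sentinel as instructed
+- Skill files can be projected into `.claude/skills/`
+- MCP config can be projected from the binding config into Claude Code CLI arguments
+- The SDK-owned per-run LangBot MCP bridge can be called by the real Claude Code CLI, which reads the current run_id through `langbot_get_current_event`
+- `external.session_id` and `external.working_directory` can be written to host-owned state for later resume
+- `codex-agent` can invoke the local Codex CLI through WebUI Debug Chat, read the LangBot event context, and write the Codex `thread_id` to host-owned state
+- The SDK-owned per-run LangBot MCP bridge can be called by the real Codex CLI, which reads the current run_id through `langbot_get_current_event`
+- For local environments that need a proxy, `codex-agent` can pass non-secret environment variables explicitly through the binding config's `environment-json`
+
+For the next round's test entry point see [PHASE1_QA_ACCEPTANCE_MATRIX.md](./PHASE1_QA_ACCEPTANCE_MATRIX.md).
+
+### 8.4 Current limitations
+
+- Not a release-grade security boundary implementation.
+- By default it only makes local CLI calls; it does not implement full execution isolation or a workspace lifecycle.
+- Does not implement an issue-centric queue, a complex workflow engine, or long-running task scheduling.
+- Does not imply that release-grade Codex capabilities or a Kimi runner are complete; it currently validates only the protocol shape of external harness runners.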
+
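+## 9. Release and installation strategy
+
+A final LangBot install or upgrade must ensure the official runner plugins are available. Options:
+
+1. On first start, detect missing official runner plugins and prompt to install them.
+2. Pre-install the official runner plugins when packaging a distribution.
+3. Before migration, check that the corresponding plugin exists; if not, install it automatically or block the migration.
+
+Suggested implementation order:
+
+- Use local-path plugins during development.
+- Support marketplace installation before release.
+- Run the legacy-config migration only when the official plugin is available.
+- Keep the old built-in runner files during migration until the corresponding official plugin passes parity acceptance.
+
+## 10. Acceptance criteria
+
+- Every legacy runner has a corresponding official AgentRunner plugin.
+- Legacy runner config can be copied losslessly into the new `runner_config[id]`.
+- The LangBot main chat path no longer executes business runners through `RequestRunner`.
+- Official plugin tests cover non-streaming, streaming, errors, timeout, and missing config.
+- The `local-agent` plugin can perform model fallback, tool calling, knowledge-base retrieval, multimodal input, consumption of the statically bound prompt, history API pulls, and rerank.
+- `claude-code-agent` or a comparable code-agent harness runner can consume event-first context, project scoped resources, save external session state, and pass the WebUI Debug Chat smoke.
+- External behavior stays consistent with the old built-in local-agent runner; the code structure does not need to match.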
--- a/docs/agent-runner-pluginization/PHASE1_QA_ACCEPTANCE_MATRIX.md
+++ b/docs/agent-runner-pluginization/PHASE1_QA_ACCEPTANCE_MATRIX.md
@@ -0,0 +1,245 @@
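+# Agent Runner QA Guide
+
+This document is the single QA entry point for the next round of agent-runner pluginization testing. It merges and supersedes the old Phase 1 acceptance matrix and the two local QA reports from 2026-05-18 and 2026-05-29.
+
+The goal is not to preserve a complete historical log but to guide the testing agent in judging, through a minimal yet high-value path, whether the current branch is still healthy.
+
+## 1. Test scope
+
+The current mainline validates AgentRunner Protocol v1: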
+
+```text
+event -> binding -> runner.run(ctx) -> result stream
+```
+
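+This guide verifies that:
+
+- The Host can enter the event-first `run(event, binding)` main path through the current Pipeline adapter.
+- Runners come from the plugin registry rather than the old built-in runner branch.
+- `local-agent` can consume Host infrastructure such as models, tools, knowledge bases, history, state, and artifacts.
+- External harness runners (Claude Code / Codex) can consume event-first context and write pointers such as session / working directory back to host-owned state.
+- Error, permission-clipping, no-output, and timeout paths do not break the main chat flow.
+
+This guide does not verify:
+
+- Runtime Control Plane v2.
+- A complete EventGateway / EventRouter rollout.
+- Release-grade path isolation, secret filtering, MCP allowlist, resource quotas, and workspace cleanup.
+- Live-credential integration for every external service runner.
+
+These are later capabilities or release gates; see [RUNTIME_CONTROL_PLANE_V2.md](./RUNTIME_CONTROL_PLANE_V2.md) and [SECURITY_HARDENING.md](./SECURITY_HARDENING.md) respectively.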
+
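+## 2. Status definitions
+
+Test reports use only the following statuses:
+
+| Status | Meaning |
+| --- | --- |
+| PASS | Executed as specified; both user-visible behavior and log evidence meet the pass criteria. |
+| FAIL | The environment is usable, but behavior does not meet the pass criteria. |
+| BLOCKED | Cannot be executed because credentials, CLI, external services, test data, or local configuration are missing. The blocking reason must be stated. |
+| N/A | The current runner or platform explicitly does not support the capability. Must cite the manifest, documentation, or configuration. |
+
+Vague statuses such as "looks fine", "probably passes", or "mostly OK" are not allowed.
+
+## 3. Execution order
+
+Run in the following order; when an earlier layer fails, do not widen the test surface:
+
+1. Host / SDK / runner unit tests.
+2. WebUI login and basic Pipeline Debug Chat smoke.
+3. `local-agent` high-value scenarios.
+4. Claude Code / Codex external harness smoke.
+5. Supplementary permission and error-path checks.
+6. Summarize PASS / FAIL / BLOCKED and recommend next steps.
+
+User-visible flows must be verified through the WebUI or a real messaging platform. API / curl output may serve only as diagnostic evidence and cannot by itself make a UI case PASS.
+
+## 4. Required baseline
+
+### 4.1 Unit-test baseline
+
+Run in the LangBot repository: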
+
+```bash
+uv run --frozen pytest tests/unit_tests/agent
+```
+
+If the change touches only default configuration or the API service, still run at least the relevant targeted tests, for example:
+
+```bash
+uv run pytest tests/unit_tests/api/test_pipeline_service_defaults.py
+```
+
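+Pass criteria:
+
+- All agent unit tests PASS, or the failures are confirmed to be unrelated to this agent-runner path.
+- If a failure comes from `context_builder`, `orchestrator`, `session_registry`, `resource_builder`, or the run action permission path in `plugin/handler.py`, do not proceed to UI smoke.
+
+### 4.2 Environment baseline
+
+Use `langbot-skills` for the environment check: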
+
+```bash
+cd "$LANGBOT_SKILLS_REPO"
+bin/lbs env doctor
+bin/lbs case list
+```
+
+`LANGBOT_SKILLS_REPO` points to the `langbot-skills` repository in the current workspace. Prefer existing cases over ad hoc test paths.
+
+Recommended first batch of cases:
+
+- `webui-login-state`
+- `pipeline-debug-chat`
+- `local-agent-basic-debug-chat`
+- `local-agent-rag-debug-chat` (when the change involves RAG / knowledge)
+- `local-agent-plugin-tool-call-debug-chat` (when the change involves tool / resource policy)
+
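+## 5. WebUI main-path smoke
+
+### 5.1 Runner registry
+
+Steps:
+
+1. Open the WebUI Pipeline configuration page.
+2. Check the AI runner dropdown.
+3. Select `plugin:langbot/local-agent/default`.
+4. Save and refresh the page.
+
+Pass criteria:
+
+- Runner options come from the plugin registry.
+- After saving, the configuration is still `ai.runner.id` + `ai.runner_config[id]`.
+- `runner_config` represents the binding config, not plugin instance state.
+- The plugin does not restart in a loop or fail to load metadata.
+
+### 5.2 Main chat path
+
+Steps:
+
+1. Use a Pipeline bound to `plugin:langbot/local-agent/default`.
+2. Send deterministic plain text in Debug Chat.
+3. Check the WebUI reply and the backend logs.
+
+Pass criteria:
+
+- The user-visible reply is normal.
+- Backend logs show the `AgentRunOrchestrator` / `RUN_AGENT` path.
+- The old built-in local-agent main execution branch is not used.
+- The conversation transcript records both the user message and the assistant message.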
+
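+## 6. `local-agent` high-value tests
+
+Only the scenarios that best cover the architectural boundaries are kept.
+
+| ID | Scenario | Action | Pass criteria |
+| --- | --- | --- | --- |
+| LA-01 | Bound prompt | Configure a system prompt, then send text. | The runner uses `ctx.config.prompt` and does not read `ctx.adapter.extra["prompt"]`; the reply reflects the bound prompt. |
+| LA-02 | history API | Two consecutive turns; the second turn references a marker from the first. | The runner reads history through the Host history API or self-managed context and does not depend on a bootstrap window. |
+| LA-03 | Streaming / non-streaming | Send text over a streaming-enabled path and over a streaming-disabled path. | The streaming UI shows no duplicates or blanks; non-streaming emits only the final message. |
+| LA-04 | Tool calling | Bind a test tool and send a prompt that triggers it. | `ctx.resources.tools` contains only authorized tools; tool call started/completed are emitted; the final reply includes the tool result. |
+| LA-05 | RAG | Bind a test knowledge base and send a prompt that hits a document. | `ctx.resources.knowledge_bases` contains the selected knowledge base; the runner retrieves through the authorized API; the reply uses the retrieved content. |
+| LA-06 | Multimodal | Send an image input. | `ctx.input.contents` keeps the image; it is processed normally when a vision model is available and fails in a controlled way otherwise. |
+| LA-07 | Fallback / error | Simulate a primary model failure or a runner exception. | Fallback or `run.failed` behavior is controlled; subsequent requests are unaffected. |
+| LA-08 | No-output guard | Test a runner that completes without producing a message. | No blank success reply is produced; the case is handled as a controlled failure or an explicit defect. |
+
+Rerank, remove-think, file input, and similar scenarios are retested only when the change directly involves them; they are not mandatory every round.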
+
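+## 7. External harness runner smoke
+
+These tests verify that self-managed runtimes such as Claude Code / Codex can go through the same Host protocol path. If the local machine lacks the CLI, a logged-in session, or proxy configuration, mark the case BLOCKED; do not fake a PASS.
+
+### 7.1 Claude Code runner
+
+Steps:
+
+1. Confirm the `claude` CLI is executable on the LangBot runtime host.
+2. Bind `plugin:langbot/claude-code-agent/default`.
+3. Use a conservative permission mode and a deterministic prompt.
+4. Run one real smoke in Debug Chat.
+5. Check the context / skill / MCP projection and host-owned state.
+
+Pass criteria:
+
+- The reply visible in the WebUI contains the expected sentinel.
+- The context JSON schema is `langbot.agent_runner.external_harness_context.v1` or an equivalent schema declared in the current docs.
+- The context contains event, input, delivery, resources, context, and state.
+- If skills / MCP are enabled, Claude Code can read the projected paths and configuration.
+- `external.session_id` / `external.working_directory` are written to host-owned state.
+- CLI missing, nonzero exit, timeout, and empty output are all converted to a controlled `run.failed`.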
+
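+### 7.2 Codex runner
+
+Steps:
+
+1. Confirm the `codex` CLI is executable on the LangBot runtime host.
+2. Bind `plugin:langbot/codex-agent/default`.
+3. If a proxy is needed, pass it explicitly through the binding config's `environment-json`.
+4. Run one real smoke in Debug Chat.
+5. Check the JSONL events, the last message, and host-owned state.
+
+Pass criteria:
+
+- The reply visible in the WebUI contains the expected sentinel.
+- The Codex JSONL contains at least a thread/session start event, an agent message, and turn completed.
+- `external.session_id` / `external.working_directory` are written to host-owned state.
+- Timeout/cancel leaves no orphan CLI subprocess behind.
+- CLI missing, nonzero exit, timeout, and empty output are all converted to a controlled `run.failed`.
+
+### 7.3 API-based external runners
+
+External service runners such as Dify, n8n, Coze, DashScope, Langflow, and Tbox are not mandatory every round. Run their smoke only when the change touches the corresponding runner or credentials are already available.
+
+Pass criteria:
+
+- The runner is selectable and its configuration can be saved.
+- The request succeeds, or the external service error is returned clearly.
+- When external service credentials are missing, mark the case BLOCKED and record what is missing.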
+
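+## 8. Permission and isolation supplement
+
+Cover the following with unit tests / targeted fixtures first; there is no need to hand-build a malicious runner through the UI every time.
+
+| Scenario | Recommended evidence |
+| --- | --- |
+| Unauthorized model call is rejected | `plugin/handler.py` run action permission tests or targeted unit tests. |
+| Unauthorized tool call is rejected | `ctx.resources.tools` and the host action rejection log. |
+| Unauthorized knowledge-base retrieval is rejected | `ctx.resources.knowledge_bases` and the host action rejection log. |
+| Reuse of a run_id after the run ends is rejected | Session registry unregistration test. |
+| Plugin identity mismatch is rejected | `caller_plugin_identity` mismatch test. |
+| Out-of-scope storage/state access is rejected | state/storage proxy unit tests. |
+
+If these unit tests fail, a normal WebUI reply cannot stand in for them.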
+
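+## 9. Evidence requirements
+
+Each test report records at least:
+
+- The LangBot commit, SDK commit, and relevant runner plugin commits.
+- The Pipeline UUID/name, runner id, and a summary of the key runner config.
+- WebUI screenshots or a Playwright action log.
+- The key backend log lines for the corresponding query id / run id.
+- The `langbot-skills` case/report path.
+- For external harness runners: the context file, session id, working directory, and CLI error summary.
+- Reproduction steps for each FAIL/BLOCKED and a suggestion for which repository owns it.
+
+The report conclusion must answer:
+
+- Whether proceeding to the next testing stage is recommended.
+- Whether anything blocks the main chat path.
+- Whether a BLOCKED status is caused only by missing credentials / external services / local CLI.
+- Whether the release-grade acceptance in [SECURITY_HARDENING.md](./SECURITY_HARDENING.md) is needed.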
+
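+## 10. High-value historical records
+
+Historical reports have been merged into this guide and are no longer kept as separate documents. To trace back later, check the original execution reports under `langbot-skills/reports/` first.
+
+As of 2026-05-29, local smoke runs have shown that:
+
+- `local-agent` can go through the pluginized `AgentRunOrchestrator` main path via Pipeline Debug Chat.
+- The Claude Code runner can execute through the same `run(event, binding)` path.
+- The Claude Code runner can read the LangBot event-first context / skill / MCP projection and write back `external.session_id` / `external.working_directory`.
+- The Codex runner can execute through the same path and write the Codex `thread_id` back to host-owned state.
+
+These records only show that the local protocol loop works; they do not mean release-grade security hardening is complete.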
--- a/docs/agent-runner-pluginization/PROGRESS.md
+++ b/docs/agent-runner-pluginization/PROGRESS.md
@@ -0,0 +1,157 @@
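+# Agent Runner Pluginization Implementation Progress
+
+This document tracks the implementation status of Agent Runner pluginization for a quick view of current progress.
+
+## Overall progress
+
+**Current phase**: Phase 3.5 is complete and the event-first infrastructure is in place; on 2026-05-29 the local `local-agent` and Claude Code runner smoke passed.
+
+| Phase | Description | Status |
+|-------|------|------|
+| Phase 0 | PoC validation | ✅ Done |
+| Phase 1 | Core architecture (Registry, Orchestrator, context model) | ✅ Done |
+| Phase 2 | Permissions, capability declaration, resource injection | ✅ Done |
+| Phase 3 | Migration of built-in runners to plugins | ✅ Done (7/7) |
+| Phase 3.5 | Event-first infrastructure | ✅ Done |
+| Phase 3.6 | External harness runner protocol smoke | ✅ Done (Claude Code MVP) |
+| Phase 4 | EBA event support | 🔲 Not started (event-first entry point reserved; EventGateway is implemented on another branch) |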
+
+---
+
+## Detailed status
+
+### SDK side (`langbot-plugin-sdk`)
+
+| Component | Status | Notes |
+|------|------|------|
+| `AgentRunner` component | ✅ | `api/definition/components/agent_runner/runner.py` |
+| `AgentRunContext` | ✅ | `api/entities/builtin/agent_runner/context.py` |
+| `AgentRunResult` | ✅ | `api/entities/builtin/agent_runner/result.py` |
+| `AgentRunnerCapabilities` | ✅ | `api/entities/builtin/agent_runner/capabilities.py` |
+| `AgentRunnerPermissions` | ✅ | `api/entities/builtin/agent_runner/permissions.py` |
+| EBA event model (Event/Actor/Subject) | ✅ | `api/entities/builtin/agent_runner/event.py` |
+| `LIST_AGENT_RUNNERS` action | ✅ | `runtime/io/handlers/control.py` |
+| `RUN_AGENT` action | ✅ | `runtime/io/handlers/control.py` |
+| `AgentRunAPIProxy` | ✅ | `api/proxies/agent_run_api.py` |
+| Pull API handlers (State/History/Event/Artifact) | ✅ | `runtime/io/handlers/plugin.py` |
+| `caller_plugin_identity` injection | ✅ | Pull API handlers inject caller identity |
+
+### LangBot side
+
+| Component | Status | Notes |
+|------|------|------|
+| `AgentRunnerRegistry` | ✅ | `pkg/agent/runner/registry.py` |
+| `AgentRunOrchestrator` | ✅ | `pkg/agent/runner/orchestrator.py` - event-first `run(event, binding)` |
+| `AgentRunnerDescriptor` | ✅ | `pkg/agent/runner/descriptor.py` |
+| `AgentResourceBuilder` | ✅ | `pkg/agent/runner/resource_builder.py` |
+| `AgentRunContextBuilder` | ✅ | `pkg/agent/runner/context_builder.py` - event-first context |
+| `AgentResultNormalizer` | ✅ | `pkg/agent/runner/result_normalizer.py` |
+| `ConfigMigration` | ✅ | `pkg/agent/runner/config_migration.py` |
+| `PipelineAdapter` | ✅ | `pkg/agent/runner/pipeline_adapter.py` - Query → Event + Binding |
+| `run_from_query()` → `run(event, binding)` | ✅ | The Pipeline path delegates to the event-first path |
+| `ChatMessageHandler` integration | ✅ | Uses the orchestrator instead of the wrapper |
+| `PipelineService` integration | ✅ | Gets runner metadata from the registry |
+| Plugin connector | ✅ | `list_agent_runners()` / `run_agent()` |
+| `EventLogStore` | ✅ | `pkg/agent/runner/event_log_store.py` |
+| `TranscriptStore` | ✅ | `pkg/agent/runner/transcript_store.py` |
+| `ArtifactStore` | ✅ | `pkg/agent/runner/artifact_store.py` |
+| `PersistentStateStore` | ✅ | `pkg/agent/runner/persistent_state_store.py` |
+| History / Event pull APIs | ✅ | Orchestrator + APIProxy |
+| Artifact pull APIs | ✅ | Orchestrator + APIProxy |
+| State pull APIs | ✅ | Orchestrator + APIProxy |
+| `artifact.created` / `state.updated` handling | ✅ | Event-first handlers in orchestrator |
+| Pipeline path host capability coverage | ✅ | EventLog/Transcript/ArtifactStore/PersistentStateStore |
+| External harness state handoff | ✅ | `external.session_id` / `external.working_directory` are written to PersistentStateStore |
+
+### Official plugins
+
+> External service plugin repository: `/home/glwuy/langbot-app/langbot-agent-runner/`  
+> Local Agent plugin repository: `/home/glwuy/langbot-app/langbot-local-agent/`
+
+| Plugin | Status | Notes |
+|------|------|------|
+| `local-agent` | ✅ Done | Core features: models, tools, knowledge bases, streaming, sessions |
+| `dify-agent` | ✅ Done | Supports three app types: chat/agent/workflow |
+| `n8n-agent` | ✅ Done | Webhook calls; supports basic/jwt/header authentication |
+| `coze-agent` | ✅ Done | Multimodal input, chain-of-thought handling |
+| `claude-code-agent` | ✅ MVP smoke passed | Local Claude Code CLI; context / skill / MCP projection; host-owned resume state |
+| `dashscope-agent` | ✅ Done | Alibaba Cloud Bailian; supports agent/workflow modes |
+| `langflow-agent` | ✅ Done | SSE streaming, tweaks configuration support |
+| `tbox-agent` | ✅ Done | Ant Group Tbox (百宝箱), multimodal input |
+
+**Note**: LangBot's built-in runners (`pkg/provider/runners/`) are disabled, and a DEPRECATED comment has been added at the top of the files.
+
+### Local acceptance
+
+| Date | Scope | Status | Evidence |
+|------|------|------|------|
+| 2026-05-29 | `local-agent` Pipeline Debug Chat | ✅ PASS | `langbot-skills/reports/2026-05-29-17-59-00-462-08-00-pipeline-debug-chat.md` |
+| 2026-05-29 | `claude-code-agent` Pipeline Debug Chat | ✅ PASS | `langbot-skills/reports/2026-05-29-18-03-31-169-08-00-pipeline-debug-chat.md` |
+| 2026-05-29 | Claude Code context / skill / MCP projection | ✅ PASS | `langbot-skills/reports/claude-code-agent-resource-context-20260529.md` |
+| 2026-05-29 | Claude Code resume state | ✅ PASS | `langbot-skills/reports/claude-code-agent-real-workdir-20260529.md` |
+
+---
+
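+## Unfinished but still part of this branch's wrap-up
+
+The following items are wrap-up work for this branch:
+
+- [x] Smoke / manual validation: `local-agent`, Claude Code MVP, and Codex MVP have passed local WebUI smoke
+- [ ] Docs final QA
+- [ ] Claude Code runner documentation, installation, and marketplace release preparation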
+
+---
+
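+## Out of scope for this branch
+
+The following capabilities are owned by other branches:
+
+| Capability | Owning branch | Notes |
+|------|----------|------|
+| EventGateway implementation | event branch | Full event gateway, event routing, persistence management |
+| Event subscription / notification | event branch | Event subscription, push notifications |
+| BindingResolver persistence UI | Other modules | Persistence UI for binding configuration |
+| Event router integration | event branch | Integration with BindingResolver |
+| Scheduler / background event source | Other modules | Scheduled tasks, background event sources |
+| Security release hardening | Later release gate | Path isolation, permission boundaries, secrets, MCP/skill projection policy, resource quotas, auditing |
+| Full Codex / Kimi runner integration | Later runner plugin work | The Codex MVP works end to end; release-grade Codex capabilities, the Kimi runner, and full hardening remain outside the current protocol loop |
+| Issue-centric product model / async queue / workflow engine | Later product architecture | Not part of the current agent-runner plugin protocol loop |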
+
+---
+
+## To-do
+
+### High priority
+
+- [x] Tool detail API: the SDK `GET_TOOL_DETAIL` action, `AgentRunAPIProxy.get_tool_detail()`, and Host-side authorization checks are wired up
+- [x] Pipeline `run_from_query()` → `run(event, binding)`: done
+- [x] EventLog / Transcript / ArtifactStore / PersistentStateStore: done
+- [x] History / Event / Artifact / State pull APIs: done
+- [x] `caller_plugin_identity` verification path: done
+
+### Low priority / future
+
+- [ ] Full EBA integration: EventGateway, event subscription, and event notification are implemented on other branches
+- [ ] Platform API action execution: the `action.requested` result type exists but is not executed
+- [ ] Release-grade security hardening: a release gate before being enabled by default in production; does not block the current protocol loop
+
+---
+
+## Key decision log
+
+| Date | Decision |
+|------|------|
+| 2026-05-10 | Phase 0 integration tests passed; SDK v1 protocol validated |
+| 2026-05-13 | Phase 3 complete: all 7 official runner plugins migrated |
+| 2026-05-23 | Phase 3.5 complete: `run_from_query()` delegates to event-first `run(event, binding)`, and the Pipeline path gains host capabilities |
+| 2026-05-29 | Local `local-agent` and `claude-code-agent` passed WebUI smoke; the Claude Code runner validated external harness context projection and host-owned resume state |
+
+---
+
+## Related documents
+
+- [README.md](./README.md): overall design
+- [PHASE1_QA_ACCEPTANCE_MATRIX.md](./PHASE1_QA_ACCEPTANCE_MATRIX.md): Agent Runner QA guide and next-round test entry point
+- [OFFICIAL_RUNNER_PLUGINS.md](./OFFICIAL_RUNNER_PLUGINS.md): official plugin repository plan
+- [SECURITY_HARDENING.md](./SECURITY_HARDENING.md): follow-up gate for release-grade security hardening
+- [IMPLEMENTATION_PLAN.md](./IMPLEMENTATION_PLAN.md): concrete implementation details
--- a/docs/agent-runner-pluginization/PROTOCOL_V1.md
+++ b/docs/agent-runner-pluginization/PROTOCOL_V1.md
@@ -0,0 +1,702 @@
+# LangBot AgentRunner Protocol v1
+
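+This document defines the protocol contract between the LangBot Host and the plugin SDK / Runtime / AgentRunner. It describes what the stable interfaces should be rather than the concrete implementation tasks.
+
+## Current status
+
+**Protocol v1 has landed on the current branch**:
+
+- ✅ The SDK defines `AgentRunnerManifest`, `AgentRunContext`, `AgentRunResult`, and `AgentRunAPIProxy`
+- ✅ The Runtime supports `LIST_AGENT_RUNNERS` and `RUN_AGENT`
+- ✅ The Host supports `run_id` session authorization
+- ✅ The Host can generate an event-first context from the current Pipeline entry point
+- ✅ `messages` is demoted to an optional bootstrap
+- ✅ `max-round` does not appear in protocol entities and is not part of Host / Pipeline semantics; if a similar parameter exists, the runner interprets it from `ctx.config` itself
+- ✅ The proxy covers model, tool, knowledge, and state/storage
+- ✅ History / Event / Artifact / State APIs have landed
+- ✅ EventLog / Transcript / ArtifactStore / PersistentStateStore have landed
+- ✅ `local-agent` and the Claude Code runner have passed local WebUI smoke, verifying that host-infra runners and external harness runners share the same protocol path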
+
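+## 1. Protocol goals
+
+Protocol v1 addresses four things:
+
+- How LangBot discovers the AgentRunners that plugins provide.
+- How LangBot wraps a single event invocation into an `AgentRunContext`.
+- How an AgentRunner returns run results as an event stream.
+- How an AgentRunner accesses LangBot host capabilities through a restricted API.
+
+Protocol v1 does not define:
+
+- How LangBot persists AgentBinding internally.
+- How an AgentRunner internally assembles prompts, compacts history, or manages memory.
+- The concrete implementation of the official local-agent.
+- The long-term configuration model of Pipeline.
+- A complete implementation of release-grade security hardening; it currently defines only the Host-side resource, permission, state, and audit boundaries. For the release gate see [SECURITY_HARDENING.md](./SECURITY_HARDENING.md).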
+
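+## 2. Participants
+
+| Name | Responsibility |
+| --- | --- |
+| LangBot Host | Event entry, binding resolution, permissions, resources, storage, lifecycle, result delivery. |
+| Plugin Runtime | Loads plugins and responds to the Host's runner discovery and run calls. |
+| AgentRunner | The agent execution component provided by a plugin. |
+| AgentRunAPIProxy | The restricted API through which an AgentRunner accesses Host capabilities. |
+| AgentBinding | The Host-internal event-to-runner binding configuration; not exposed directly to the SDK. |
+
+`AgentBinding` only influences the `ctx.config`, `ctx.resources`, `ctx.context`, and `ctx.delivery` that the Host constructs. The SDK does not need to know how a binding is persisted.
+
+External harness runners (Claude Code, Codex, Kimi Code, and so on) are still `AgentRunner`s. Protocol v1 only requires them to consume the event-first `AgentRunContext`, return `AgentRunResult`, and save cross-turn pointers through Host-authorized state/storage/artifact APIs. Internally they may keep using their own session, tool loop, MCP, context compaction, and permission model.
+
+## 3. Discovery protocol
+
+### 3.1 LIST_AGENT_RUNNERS
+
+The Host calls the Plugin Runtime to get the list of runners the current plugin exposes. The request needs no additional payload.
+
+The Runtime returns: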
+
+```python
+class ListAgentRunnersResponse(BaseModel):
+    runners: list[AgentRunnerManifest]
+```
+
+### 3.2 AgentRunnerManifest
+
+```python
+class AgentRunnerManifest(BaseModel):
+    id: str
+    name: str
+    label: I18nObject
+    description: I18nObject | None = None
+    capabilities: AgentRunnerCapabilities
+    permissions: AgentRunnerPermissions
+    context: AgentRunnerContextPolicy
+    config_schema: list[DynamicFormItemSchema] = []
+    metadata: dict[str, Any] = {}
+```
+
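+Field requirements:
+
+- `id` must be stable; the recommended form is `plugin:author/name/runner`.
+- `name` is the runner name within the plugin, for example `default`.
+- `config_schema` only describes the binding config form; it does not represent plugin instance state.
+- `metadata` may hold only display, diagnostic, and non-stable extension information.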
+
+### 3.3 Capabilities
+
+```python
+class AgentRunnerCapabilities(BaseModel):
+    streaming: bool = False
+    tool_calling: bool = False
+    knowledge_retrieval: bool = False
+    multimodal_input: bool = False
+    event_context: bool = True
+    platform_api: bool = False
+    interrupt: bool = False
+    stateful_session: bool = False
+    self_managed_context: bool = True
+```
+
+Semantics:
+
+- `streaming`: the runner can return `message.delta`.
+- `tool_calling`: the runner may call Host tool APIs.
+- `knowledge_retrieval`: the runner may call Host knowledge APIs.
+- `multimodal_input`: the runner can handle non-plain-text input / artifacts.
+- `event_context`: the runner understands event-first input.
+- `platform_api`: the runner may request platform actions.
+- `interrupt`: the runner supports cancellation or interruption.
+- `stateful_session`: the runner may maintain session state across runs.
+- `self_managed_context`: the runner manages its own working context, so the Host should not inline history by default.
+
+### 3.4 Permissions
+
+```python
+class AgentRunnerPermissions(BaseModel):
+    models: list[Literal["invoke", "stream", "rerank"]] = []
+    tools: list[Literal["detail", "call"]] = []
+    knowledge_bases: list[Literal["list", "retrieve"]] = []
+    history: list[Literal["page", "search"]] = []
+    events: list[Literal["get", "page"]] = []
+    artifacts: list[Literal["metadata", "read"]] = []
+    storage: list[Literal["plugin", "workspace", "binding"]] = []
+    files: list[Literal["config", "knowledge"]] = []
+    platform_api: list[str] = []
+```
+
+Manifest permissions are the maximum set of capabilities a runner needs. The resources actually available are further clipped by the Host binding policy and the current run scope.
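+
+The clipping described above can be pictured as a per-area intersection. This is only a sketch: `effective_permissions` and the dict shape of `binding_policy` are assumptions for illustration, and run-scope clipping is not modeled.
+
+```python
+def effective_permissions(
+    manifest: dict[str, list[str]],
+    binding_policy: dict[str, list[str]],
+) -> dict[str, list[str]]:
+    """Keep only the operations that both the manifest and the binding allow."""
+    effective = {}
+    for area, requested in manifest.items():
+        # An area missing from the binding policy grants nothing.
+        allowed = set(binding_policy.get(area, []))
+        # Preserve the order in which the manifest declared the operations.
+        effective[area] = [op for op in requested if op in allowed]
+    return effective
+
+
+manifest = {
+    "models": ["invoke", "stream", "rerank"],
+    "tools": ["detail", "call"],
+    "history": ["page", "search"],
+}
+# Hypothetical binding policy: "storage" is ignored because the manifest never asked for it.
+binding_policy = {"models": ["invoke", "stream"], "tools": ["call"], "storage": ["plugin"]}
+
+print(effective_permissions(manifest, binding_policy))
+```
+
+Iterating over the manifest rather than the policy means a binding can never grant an operation the runner did not declare.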
+
+### 3.5 Context Policy
+
+```python
+class AgentRunnerContextPolicy(BaseModel):
+    ownership: Literal["self_managed", "host_bootstrap", "hybrid"] = "self_managed"
+    bootstrap: Literal["none", "current_event", "recent_tail", "summary_tail"] = "current_event"
+    max_inline_events: int = 0
+    max_inline_bytes: int = 0
+    supports_history_pull: bool = True
+    supports_history_search: bool = False
+    supports_artifact_pull: bool = True
+    owns_compaction: bool = True
+    wants_static_context_refs: bool = True
+```
+
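+The Host does not use this declaration to inline a history window for the runner. Default principles:
+
+- The Host must not inline the full history by default.
+- The Host inlines only the current event / input and context handles.
+- The runner owns working context assembly.
+- Once authorized, the runner may pull more context through the Host history / event / artifact / state APIs.
+- `max-round` and similar window parameters are not Protocol v1 fields and are not part of general Pipeline / Host semantics; a runner that needs one should interpret it from `ctx.config` itself.
+
+## 4. Run protocol
+
+### 4.1 RUN_AGENT
+
+The Host calls the Runtime: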
+
+```python
+class AgentRunRequest(BaseModel):
+    runner_id: str
+    runner_name: str
+    context: AgentRunContext
+```
+
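+The Runtime returns an asynchronous stream of `AgentRunResult`.
+
+The plugin runtime may keep using `plugin_author`, `plugin_name`, and `runner_name` in the underlying transport to locate the component, but the protocol semantics are defined by `runner_id` and `context`.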
+
+### 4.2 AgentRunContext
+
+```python
+class AgentRunContext(BaseModel):
+    run_id: str
+    trigger: AgentTrigger
+    event: AgentEventContext
+    conversation: ConversationContext | None = None
+    actor: ActorContext | None = None
+    subject: SubjectContext | None = None
+    input: AgentInput
+    delivery: DeliveryContext
+    resources: AgentResources
+    context: ContextAccess
+    state: AgentRunState
+    runtime: AgentRuntimeContext
+    config: dict[str, Any] = {}
+    bootstrap: BootstrapContext | None = None
+    adapter: AdapterContext | None = None
+    metadata: dict[str, Any] = {}
+```
+
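+Core constraints:
+
+- `event` is required; Protocol v1 is event-first.
+- `input` is the primary input of the current event, not the message history.
+- `bootstrap` is optional; the LangBot Host does not fill in a history window by default.
+- `adapter` carries only Pipeline adapter fields; runners should not depend on it for long-term capabilities.
+- `config` is the Host binding config, not plugin instance state.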
+
+### 4.3 AgentTrigger
+
+```python
+class AgentTrigger(BaseModel):
+    type: str
+    source: Literal["platform", "webui", "api", "scheduler", "system", "pipeline_adapter"]
+    timestamp: int | None = None
+```
+
+`trigger.type` should match `event.event_type` or be coarser-grained. For example, when the Pipeline compatibility entry point triggers a message:
+
+```json
+{
+  "type": "message.received",
+  "source": "pipeline_adapter"
+}
+```
+
+### 4.4 AgentEventContext
+
+```python
+class AgentEventContext(BaseModel):
+    event_id: str
+    event_type: str
+    event_time: int | None = None
+    source: str
+    source_event_type: str | None = None
+    raw_ref: RawEventRef | None = None
+    data: dict[str, Any] = {}
+```
+
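+Requirements:
+
+- `event_type` uses LangBot's stable protocol names, for example `message.received`.
+- The platform's original event name goes into `source_event_type`.
+- Large raw payloads must go into `raw_ref` or an artifact and should not be put directly into `data`.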
+
+### 4.5 Actor / Subject / Conversation
+
+```python
+class ConversationContext(BaseModel):
+    conversation_id: str | None = None
+    thread_id: str | None = None
+    launcher_type: str | None = None
+    launcher_id: str | None = None
+    bot_id: str | None = None
+    workspace_id: str | None = None
+
+class ActorContext(BaseModel):
+    actor_type: str
+    actor_id: str | None = None
+    actor_name: str | None = None
+    metadata: dict[str, Any] = {}
+
+class SubjectContext(BaseModel):
+    subject_type: str
+    subject_id: str | None = None
+    data: dict[str, Any] = {}
+```
+
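+Examples:
+
+- Message event: the actor is the person who sent the message, and the subject is the current message.
+- Group-join event: the actor is the new member or the inviter, and the subject is the group/member relationship.
+- Scheduled event: the actor can be system, and the subject is the schedule.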
+
+### 4.6 AgentInput
+
+```python
+class AgentInput(BaseModel):
+    text: str | None = None
+    contents: list[ContentElement] = []
+    attachments: list[ArtifactRef] = []
+    message_chain: dict[str, Any] | None = None
+```
+
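+Requirements:
+
+- Text, multimodal content, and attachments all belong to the current event input.
+- Large files, images, audio, and large tool results should be passed as artifact refs.
+- `message_chain` is a platform compatibility field and should not become a long-term stable dependency.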
+
+### 4.7 DeliveryContext
+
+```python
+class DeliveryContext(BaseModel):
+    surface: str
+    reply_target: dict[str, Any] | None = None
+    supports_streaming: bool = False
+    supports_edit: bool = False
+    supports_reaction: bool = False
+    max_message_size: int | None = None
+    platform_capabilities: dict[str, Any] = {}
+```
+
+The runner can consult the delivery capabilities to decide whether to return `message.delta`, `message.completed`, or `action.requested`.
+
+### 4.8 ContextAccess
+
+```python
+class ContextAccess(BaseModel):
+    conversation_id: str | None = None
+    thread_id: str | None = None
+    latest_cursor: str | None = None
+    event_seq: int | None = None
+    transcript_seq: int | None = None
+    has_history_before: bool = False
+    inline_policy: InlineContextPolicy
+    available_apis: ContextAPICapabilities
+```
+
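+`ContextAccess` tells the runner what the Host inlined, what it did not, and which APIs to call when more context is needed.
+It is not the Host's business-context orchestration policy but a description of the entry points through which the runner reads context on demand.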
+
+```python
+class InlineContextPolicy(BaseModel):
+    mode: Literal["none", "current_event", "recent_tail", "summary_tail"]
+    delivered_count: int = 0
+    source_total_count: int | None = None
+    messages_complete: bool = False
+    reason: str | None = None
+
+class ContextAPICapabilities(BaseModel):
+    history_page: bool = False
+    history_search: bool = False
+    event_get: bool = False
+    event_page: bool = False
+    artifact_metadata: bool = False
+    artifact_read: bool = False
+    state: bool = False
+    storage: bool = False
+```
+
+### 4.9 BootstrapContext
+
+```python
+class BootstrapContext(BaseModel):
+    messages: list[Message] = []
+    summary: str | None = None
+    artifacts: list[ArtifactRef] = []
+    metadata: dict[str, Any] = {}
+```
+
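+Constraints:
+
+- `bootstrap.messages` is not the default behavior of the LangBot Host.
+- A self-managed context runner should receive an empty bootstrap by default.
+- The Host should not automatically stitch together the full transcript just to "make the agent smarter".
+- History-window-style policies should be interpreted from the binding config by the specific runner, which then pulls history through the Host history API; new/official runners should not depend on the Pipeline adapter to deliver a history window.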
+
+### 4.10 RuntimeContext
+
+```python
+class AgentRuntimeContext(BaseModel):
+    host: str = "langbot"
+    langbot_version: str | None = None
+    trace_id: str
+    deadline_at: float | None = None
+    locale: str | None = None
+    timezone: str | None = None
+    static_refs: dict[str, StaticContextRef] = {}
+    metadata: dict[str, Any] = {}
+```
+
+`static_refs` holds KV-cache-friendly references to static context, such as the hash/version of the system policy, tool schema, or resource manifest.
+
+### 4.11 State
+
+```python
+class AgentRunState(BaseModel):
+    conversation: dict[str, Any] = {}
+    actor: dict[str, Any] = {}
+    subject: dict[str, Any] = {}
+    runner: dict[str, Any] = {}
+```
+
+State is an optional host-owned snapshot. A runner may also manage its state entirely on its own.
+
+## 5. Resources
+
+```python
+class AgentResources(BaseModel):
+    models: list[ModelResource] = []
+    tools: list[ToolResource] = []
+    knowledge_bases: list[KnowledgeBaseResource] = []
+    files: list[FileResource] = []
+    storage: StorageResource = StorageResource()
+    platform_capabilities: dict[str, Any] = {}
+```
+
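+The resource lists are the authorization result for this run. History / Event / Artifact access is controlled through permissions, `ctx.context.available_apis`, and Host-side run session checks, and is not exposed as an enumerable resource list. A runner can reach these capabilities only through `AgentRunAPIProxy`.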
+
+## 6. Result Stream
+
+### 6.1 AgentRunResult
+
+```python
+class AgentRunResult(BaseModel):
+    run_id: str
+    type: str
+    data: dict[str, Any] = {}
+    sequence: int | None = None
+    timestamp: int | None = None
+```
+
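+### 6.2 Stable result types
+
+| type | Description |
+| --- | --- |
+| `message.delta` | Streaming message fragment. |
+| `message.completed` | Complete message. |
+| `tool.call.started` | Observable event: the runner started a tool call. |
+| `tool.call.completed` | Observable event: the runner completed a tool call. |
+| `artifact.created` | The runner produced an artifact. |
+| `state.updated` | The runner requests an update to host-owned state. |
+| `action.requested` | The runner requests that the Host perform a platform action. |
+| `run.completed` | The run ended normally. |
+| `run.failed` | The run failed. |
+
+The Host must ignore unknown result types and log a warning, unless the type explicitly requires strict validation.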
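+
+A rough sketch of the ignore-and-warn rule above, operating on plain dicts rather than the real `AgentRunResult` model. `consume` and the logger name are hypothetical; strict-validation types are not modeled.
+
+```python
+import logging
+
+logger = logging.getLogger("agent_run")
+
+# The stable result types listed in the table above.
+KNOWN_RESULT_TYPES = {
+    "message.delta", "message.completed",
+    "tool.call.started", "tool.call.completed",
+    "artifact.created", "state.updated", "action.requested",
+    "run.completed", "run.failed",
+}
+
+
+def consume(results: list[dict]) -> list[dict]:
+    """Return the results a Host would act on, skipping unknown types."""
+    accepted = []
+    for result in results:
+        result_type = result.get("type")
+        if result_type not in KNOWN_RESULT_TYPES:
+            # Unknown types are dropped with a warning instead of failing the run.
+            logger.warning("ignoring unknown result type %r for run %s", result_type, result.get("run_id"))
+            continue
+        accepted.append(result)
+    return accepted
+
+
+stream = [
+    {"run_id": "r1", "type": "message.delta", "data": {}},
+    {"run_id": "r1", "type": "vendor.custom", "data": {}},
+    {"run_id": "r1", "type": "run.completed", "data": {}},
+]
+print([r["type"] for r in consume(stream)])
+```
+
+Dropping rather than rejecting unknown types lets a newer runner emit extra result types without breaking an older Host.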
+
+### 6.3 message.delta
+
+```json
+{
+  "type": "message.delta",
+  "data": {
+    "chunk": {
+      "role": "assistant",
+      "content": "hel"
+    }
+  }
+}
+```
+
+### 6.4 message.completed
+
+```json
+{
+  "type": "message.completed",
+  "data": {
+    "message": {
+      "role": "assistant",
+      "content": "hello"
+    }
+  }
+}
+```
+
+### 6.5 state.updated
+
+```json
+{
+  "type": "state.updated",
+  "data": {
+    "scope": "conversation",
+    "key": "external.session_id",
+    "value": "abc"
+  }
+}
+```
+
+The Host must validate the scope, the key, the value size, and that the value is JSON-serializable.
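+
+One way these checks might look, as a sketch only. The allowed scopes mirror the `AgentRunState` fields, while `validate_state_update` and both limits are assumptions; the real Host may use different scopes and thresholds.
+
+```python
+import json
+
+ALLOWED_SCOPES = {"conversation", "actor", "subject", "runner"}  # assumed: mirrors AgentRunState
+MAX_KEY_LENGTH = 128         # assumed limit
+MAX_VALUE_BYTES = 16 * 1024  # assumed limit
+
+
+def validate_state_update(data: dict) -> str | None:
+    """Return None when the update is acceptable, otherwise a reason string."""
+    if data.get("scope") not in ALLOWED_SCOPES:
+        return "invalid scope"
+    key = data.get("key")
+    if not isinstance(key, str) or not key or len(key) > MAX_KEY_LENGTH:
+        return "invalid key"
+    try:
+        # allow_nan=False rejects NaN/Infinity, which are not valid JSON.
+        encoded = json.dumps(data.get("value"), allow_nan=False)
+    except (TypeError, ValueError):
+        return "value is not JSON-serializable"
+    if len(encoded.encode("utf-8")) > MAX_VALUE_BYTES:
+        return "value too large"
+    return None
+
+
+print(validate_state_update({"scope": "conversation", "key": "external.session_id", "value": "abc"}))
+```
+
+Serializing once covers both the serializability check and the size check, so the value is never measured in a form that differs from what would be stored.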
+
+### 6.6 action.requested
+
+```json
+{
+  "type": "action.requested",
+  "data": {
+    "action": "message.edit",
+    "target": {"message_id": "..."},
+    "payload": {"text": "..."}
+  }
+}
+```
+
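+Protocol v1 defines only how the request is expressed. Whether the Host executes the action depends on platform API capability, binding policy, approval policy, and implementation stage.
+
+## 7. AgentRunAPIProxy
+
+Every proxy action must carry a `run_id`. The Host must verify that:
+
+- An active run session exists.
+- The caller plugin identity matches.
+- The resource is authorized in this run's `ctx.resources`.
+- The scope is not exceeded.
+- Payload size, rate limit, and deadline are valid.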
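+
+A simplified sketch of the first three checks for a tool call. `RunSession` and `RunSessionRegistry` are hypothetical stand-ins for the Host's session registry; scope, payload size, rate limit, and deadline checks are omitted.
+
+```python
+from dataclasses import dataclass, field
+
+
+@dataclass
+class RunSession:
+    run_id: str
+    plugin_identity: str
+    tool_names: set[str] = field(default_factory=set)  # tools granted in ctx.resources.tools
+
+
+class RunSessionRegistry:
+    def __init__(self) -> None:
+        self._active: dict[str, RunSession] = {}
+
+    def register(self, session: RunSession) -> None:
+        self._active[session.run_id] = session
+
+    def unregister(self, run_id: str) -> None:
+        self._active.pop(run_id, None)
+
+    def authorize_tool_call(self, run_id: str, caller_plugin_identity: str, tool_name: str) -> str | None:
+        """Return None when allowed, otherwise an error code."""
+        session = self._active.get(run_id)
+        if session is None:
+            return "unauthorized"  # no active run session (finished or never started)
+        if session.plugin_identity != caller_plugin_identity:
+            return "unauthorized"  # a different plugin is presenting this run_id
+        if tool_name not in session.tool_names:
+            return "unauthorized"  # the tool was not granted for this run
+        return None
+
+
+registry = RunSessionRegistry()
+registry.register(RunSession("run-1", "langbot/local-agent", {"search"}))
+print(registry.authorize_tool_call("run-1", "langbot/local-agent", "search"))
+```
+
+Because authorization is keyed on the active session, unregistering the run is enough to reject any later reuse of its `run_id`.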
+
+### 7.1 Model APIs
+
+```python
+await api.models.invoke(model_id, messages, tools=None, extra_args=None)
+await api.models.stream(model_id, messages, tools=None, extra_args=None)
+await api.models.rerank(model_id, query, documents, top_k=None)
+```
+
+### 7.2 Tool APIs
+
+```python
+await api.tools.get_detail(tool_name)
+await api.tools.call(tool_name, parameters)
+```
+
+### 7.3 Knowledge APIs
+
+```python
+await api.knowledge.retrieve(kb_id, query_text, top_k=5, filters=None)
+```
+
+### 7.4 History APIs
+
+```python
+await api.history.page(
+    conversation_id=None,
+    before_cursor=None,
+    after_cursor=None,
+    limit=50,
+    direction="backward",
+    include_artifacts=False,
+)
+
+await api.history.search(
+    query,
+    filters=None,
+    top_k=10,
+)
+```
+
+The History API returns a Transcript projection, not the raw platform payload.
+
+### 7.5 Event APIs
+
+```python
+await api.events.get(event_id)
+await api.events.page(before_cursor=None, limit=50)
+```
+
+The Event API returns a stable event envelope or a restricted raw ref; it does not return large payloads by default.
+
+### 7.6 Artifact APIs
+
+```python
+await api.artifacts.metadata(artifact_id)
+await api.artifacts.read_range(artifact_id, offset=0, length=65536)
+await api.artifacts.open_stream(artifact_id)
+```
+
+The Artifact API must support size limits, MIME validation, expiry times, and authorization scopes.
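+
+On the runner side, a size limit can be respected by reading in bounded chunks. This is a sketch under assumptions: `FakeArtifactAPI` is a stand-in, and it is assumed that `read_range` returns `bytes` and an empty value at end of data.
+
+```python
+import asyncio
+
+
+class FakeArtifactAPI:
+    """In-memory stand-in for api.artifacts; the real return type is assumed to be bytes."""
+
+    def __init__(self, blob: bytes):
+        self._blob = blob
+
+    async def read_range(self, artifact_id, offset=0, length=65536):
+        return self._blob[offset:offset + length]
+
+
+async def read_up_to(artifacts, artifact_id, max_bytes, chunk_size=65536):
+    """Read an artifact chunk by chunk, never exceeding max_bytes in total."""
+    chunks, offset = [], 0
+    while offset < max_bytes:
+        length = min(chunk_size, max_bytes - offset)
+        chunk = await artifacts.read_range(artifact_id, offset=offset, length=length)
+        if not chunk:
+            break  # end of artifact
+        chunks.append(chunk)
+        offset += len(chunk)
+    return b"".join(chunks)
+```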
+
+### 7.7 State / Storage APIs
+
+```python
+await api.state.get(scope, key)
+await api.state.set(scope, key, value)
+await api.state.delete(scope, key)
+
+await api.storage.get(area, key)
+await api.storage.set(area, key, value)
+await api.storage.delete(area, key)
+await api.storage.list(area, prefix=None)
+```
+
+Suggested distinction:
+
+- `state`: small JSON state, suited to conversation / actor / runner / binding scopes.
+- `storage`: blobs or larger data, suited to plugin-private data, workspace data, and checkpoints.
+
+### 7.8 Platform APIs
+
+```python
+await api.platform.request_action(action, target, payload)
+```
+
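+Platform APIs are a restricted capability and are closed by default. They require the runner manifest, the binding policy, and the user approval policy to all allow them.
+
+## 8. Error model
+
+Host API errors are returned in a uniform shape: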
+
+```python
+from typing import Any
+
+from pydantic import BaseModel
+
+
+class AgentAPIError(BaseModel):
+    code: str
+    message: str
+    retryable: bool = False
+    details: dict[str, Any] = {}
+```
+
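+Suggested codes:
+
+| code | Description |
+| --- | --- |
+| `unauthorized` | Unauthorized access to a resource or scope. |
+| `not_found` | The resource does not exist or is not visible to the current runner. |
+| `deadline_exceeded` | The run deadline was exceeded. |
+| `payload_too_large` | The request or response is too large. |
+| `rate_limited` | Throttled by the Host. |
+| `invalid_argument` | Invalid argument. |
+| `runtime_error` | Error in the Host or in a downstream capability. |
+
+A runner failure is reported with `run.failed`: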
+
+```json
+{
+  "type": "run.failed",
+  "data": {
+    "code": "runner.error",
+    "message": "failed to call external agent",
+    "retryable": false
+  }
+}
+```
+
+## 9. Timeout and Cancellation
+
+The Host sends the overall deadline in `ctx.runtime.deadline_at`. The SDK proxy must use that deadline to bound the timeout of each individual action.
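+
+One way to derive a per-action timeout from the run deadline is sketched below. It assumes `deadline_at` is a Unix timestamp in seconds, which this document does not state; `action_timeout` is a hypothetical helper name.
+
+```python
+import time
+from typing import Optional
+
+
+def action_timeout(deadline_at: float, default_timeout: float,
+                   now: Optional[float] = None) -> float:
+    """Clamp one action's timeout to the time remaining before the run deadline."""
+    now = time.time() if now is None else now
+    remaining = deadline_at - now
+    if remaining <= 0:
+        raise TimeoutError("run deadline already exceeded")
+    return min(default_timeout, remaining)
+```
+
+Failing fast when no time remains lets the proxy surface `deadline_exceeded` without issuing a request that cannot finish.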
+
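+Cancellation semantics:
+
+- The Host can cancel an active run.
+- The Runtime should make a best effort to interrupt the runner.
+- A runner that supports interruption should return or trigger `run.failed` with code `cancelled`.
+- The Host must unregister the active run session.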
+
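+## 10. Security and Guardrails
+
+In Protocol v1 the security boundary sits in the Host:
+
+- A runner cannot directly access an unauthorized model/tool/kb/history/artifact/storage.
+- Local validation in the SDK only improves the developer experience; it cannot replace Host-side validation.
+- Every resource id is opaque to the runner.
+- By default a runner can only access the history of the current conversation / thread.
+- Cross-conversation or workspace-level history and storage require additional authorization.
+- Large payloads must be turned into artifacts.
+- The Host must record run_id, runner_id, action, resource, scope, and result.
+
+For external harness runners, the boundary is further split as follows:
+
+- Before the call, the Host completes binding/resource policy trimming, path policy, secret filtering, and audit logging.
+- The runner plugin adapts the authorized context/resource projection into the target harness's context files, MCP configuration, skill directory, environment variables, or CLI arguments.
+- The native permission mode, allowed/disallowed tools, and execution isolation policy of external harnesses such as Claude Code / Codex / Kimi Code are only additional execution constraints; they cannot replace Host-side authorization.
+- Cross-turn pointers such as the external session id, working directory, and checkpoint should be stored as small JSON state, for example `external.session_id` and `external.working_directory`.
+
+Full path isolation, MCP allowlists, secret redaction, quotas, workspace cleanup, and release-grade security testing are outside the current Protocol v1 smoke loop; see [SECURITY_HARDENING.md](./SECURITY_HARDENING.md).
+
+The Host is not responsible for business orchestration:
+
+- It does not concatenate the full history.
+- It does not do business prompt assembly on the runner's behalf.
+- It has no built-in agent memory policy.
+- It has no built-in tool-loop business flow.
+- It has no built-in context compression policy.
+
+These capabilities can be implemented by official or third-party AgentRunner plugins, which consume LangBot's state, history, storage, artifact, model, tool, and knowledge-base capabilities through the public Host APIs.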
+
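+## 11. Pipeline Adapter
+
+The Pipeline is the current entry adapter, not the center of the protocol.
+
+**Implemented on the current branch**:
+
+- ✅ `PipelineAdapter.query_to_event(query)`: builds an `AgentEventEnvelope` from a `Query`
+- ✅ `PipelineAdapter.pipeline_config_to_binding(query, runner_id)`: builds a temporary AgentBinding from the Pipeline config
+- ✅ `run_from_query()` delegates to `run(event, binding)`
+- ✅ Runner-specific config is passed through from the Pipeline's current binding config to `AgentBinding.runner_config` / `ctx.config`
+- ✅ Query-only fields are placed in the `adapter` context
+
+The Pipeline adapter is responsible for:
+
+- Building the `AgentEventContext` from a `Query`.
+- Building a temporary AgentBinding from the Pipeline config.
+- Building `ctx.config` from the current runner binding config.
+- Keeping the necessary legacy adapter metadata, without defining the history window, prompt assembly, or agentic context policy.
+- If effective instructions produced by preprocessing / hooks need to be passed on later, exposing capability flags and references through a Host prompt/instruction package pull API, rather than continuing to push the prompt into `ctx.adapter.extra`.
+- Placing Query-only fields in `adapter`.
+
+Runners should not depend on `adapter` in the long term. New runners should depend only on the event-first context and the Host APIs.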
+
+## 12. Minimum v1 completion criteria
+
+Protocol v1 is complete on the current branch:
+
+- ✅ The SDK defines `AgentRunnerManifest`, `AgentRunContext`, `AgentRunResult`, and `AgentRunAPIProxy`
+- ✅ The Runtime supports `LIST_AGENT_RUNNERS` and `RUN_AGENT`
+- ✅ The Host supports `run_id` session authorization
+- ✅ The Host can generate an event-first context from the current Pipeline entry
+- ✅ `messages` is demoted to an optional bootstrap
+- ✅ `max-round` does not appear in any protocol entity and is not part of Host / Pipeline semantics
+- ✅ The proxy covers at least model, tool, knowledge, and state/storage
+- ✅ The History / event / artifact APIs have landed
+- ✅ EventLog / Transcript / ArtifactStore / PersistentStateStore have landed
+- ✅ A minimal smoke test for an external harness runner has landed: the Claude Code runner can consume the event-first context, return messages, and write back `external.session_id` / `external.working_directory`
+
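+## 13. Open questions
+
+- Whether `AgentBinding` should enter the SDK docs as read-only diagnostic information, or stay entirely internal to the Host.
+- How to define the minimum field set of `TranscriptItem`.
+- Whether ArtifactStore should reuse the existing BinaryStorage backend or introduce a separate entity.
+- Whether the boundary between State and Storage needs stronger typing.
+- How to express the approval model for `platform_api` actions.
+- How to define the conflict policy for result delivery when multiple runners process the same event concurrently.
+- Whether Host-side scoped MCP / skill / workspace projection should be lifted from runner config into a first-class resource projection API.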
--- a/docs/agent-runner-pluginization/README.md
+++ b/docs/agent-runner-pluginization/README.md
@@ -0,0 +1,125 @@
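+# Agent Runner pluginization: documentation entry point
+
+This document is the routing page for the agent-runner pluginization work. The detailed designs are maintained in separate documents, to avoid mixing the LangBot host architecture, the SDK protocol, context management, the EBA reservation, and the official runner migration in a single README.
+
+## Goal of this branch
+
+**Goal of this branch: infrastructure for externalizing / pluginizing AgentRunner**
+
+This branch only builds the foundational capabilities of LangBot as an Agent Host:
+
+- A stable protocol contract between LangBot and the SDK (Protocol v1)
+- Host-side `AgentEventEnvelope` / `AgentBinding` models
+- The event-first `run(event, binding)` entry point
+- `PipelineAdapter`: Pipeline Query → AgentEventEnvelope + AgentBinding
+- EventLog / Transcript / ArtifactStore / PersistentStateStore
+- History / Event / Artifact / State pull APIs
+- SDK runtime forwarding of pull APIs, plus the `caller_plugin_identity` verification path
+
+## Not implemented in this branch
+
+The following capabilities are owned by other branches; this branch only reserves integration points:
+
+- **EventGateway**: the full event gateway implementation, event routing, and event persistence management
+- **Event subscription / Event notification**: event subscription and push notification
+- **BindingResolver persistence UI**: the persistence UI for binding configuration and the event router integration (if owned by another module)
+- **Scheduler / Background event source**: scheduled tasks and background event sources
+- **Runtime control plane v2**: runtime registry, heartbeat, task queue, daemon claim, progress/cancel, and runtime audit
+
+EventGateway is described in this document as a **future integration point** provided by an external event branch. This branch only defines the host-side envelope/binding models and the `run(event, binding)` orchestrator entry point.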
+
+## Current status
+
+**The Pipeline is currently an entry adapter and is no longer the core of the agent runner design.**
+
+The main entry point can still be triggered by the Pipeline, but internally it has been converted to an event-first path:
+
+1. `run_from_query()` uses `PipelineAdapter.query_to_event(query)` to convert the query into an `AgentEventEnvelope`
+2. `run_from_query()` uses `PipelineAdapter.pipeline_config_to_binding(query, runner_id)` to build an `AgentBinding`
+3. `run_from_query()` delegates to `run(event, binding, bound_plugins, adapter_context)`
+
+The Pipeline path now has the event-first host capabilities:
+- EventLog / Transcript writes
+- ArtifactStore registration
+- PersistentStateStore state persistence
+- History / Event / Artifact / State pull APIs
+
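+## Design documents
+
+| Document | Focus |
+| --- | --- |
+| [PROTOCOL_V1.md](./PROTOCOL_V1.md) | The protocol contract between the LangBot Host and the SDK / Runtime / AgentRunner: run context, result stream, proxy actions, errors, and adapter boundaries. |
+| [HOST_SDK_INFRASTRUCTURE.md](./HOST_SDK_INFRASTRUCTURE.md) | LangBot host capabilities, the SDK protocol, runner discovery, binding, permissions, state, storage, lifecycle, and call chain. |
+| [AGENT_CONTEXT_PROTOCOL.md](./AGENT_CONTEXT_PROTOCOL.md) | The agent-owned context direction: what LangBot passes when an event arrives, how the agent pulls more history / artifact / state on demand, and how to support KV-cache-friendly context management. |
+| [EVENT_BASED_AGENT.md](./EVENT_BASED_AGENT.md) | EBA reservation: the event model, event sources, trigger bindings, and how non-message events reuse AgentRunner scheduling. **Marked as a future design note**. |
+| [RUNTIME_CONTROL_PLANE_V2.md](./RUNTIME_CONTROL_PLANE_V2.md) | Agent Platform v2 / runtime control plane reservation: the Host adds a runtime registry, heartbeat, task queue, daemon execution, and audit; management plugins are built on top of these Host capabilities. **Marked as a future design note**. |
+| [OFFICIAL_RUNNER_PLUGINS.md](./OFFICIAL_RUNNER_PLUGINS.md) | Migration of the official runner plugins, including local-agent and external runners. It is a downstream rollout plan, not a precondition for the design of LangBot's foundational capabilities. |
+| [PHASE1_QA_ACCEPTANCE_MATRIX.md](./PHASE1_QA_ACCEPTANCE_MATRIX.md) | Agent Runner QA guide: keeps the highest-value test paths and guides the agent through the next round of WebUI / runner smoke verification. |
+| [SECURITY_HARDENING.md](./SECURITY_HARDENING.md) | Follow-up release gates for release-grade security hardening: path isolation, permission boundaries, secrets, resource quotas, MCP / skill projection, and audit. |
+| [PROGRESS.md](./PROGRESS.md) | Current implementation progress, accepted capabilities, unfinished wrap-up work, and items outside this branch's scope. |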
+
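+## Work breakdown
+
+### 1. LangBot + SDK infrastructure
+
+The goal is to turn LangBot from a built-in runner executor into an agent host:
+
+- A stable protocol contract between LangBot and the SDK
+- runner manifest / descriptor / registry
+- agent binding and configuration resolution
+- run orchestration and lifecycle management
+- resource authorization and `run_id`-level permission checks
+- host-owned state / storage / event log / transcript / artifact capabilities
+- SDK `AgentRunner`, `AgentRunContext`, `AgentRunResult`, and `AgentRunAPIProxy`
+
+For the protocol contract, see [PROTOCOL_V1.md](./PROTOCOL_V1.md).
+
+For details, see [HOST_SDK_INFRASTRUCTURE.md](./HOST_SDK_INFRASTRUCTURE.md).
+
+### 2. Agent-owned context
+
+LangBot should not become the final agentic context manager. It should provide the source of truth, default context references, and on-demand read APIs; the agent, or the runtime behind it, is responsible for history trimming, summarization, recall, and KV cache strategy.
+
+History-window parameters such as `max-round` should not keep being extended as part of the target protocol; if a runner still needs a similar policy, that runner's manifest/config schema should expose it as binding config.
+
+For details, see [AGENT_CONTEXT_PROTOCOL.md](./AGENT_CONTEXT_PROTOCOL.md).
+
+### 3. Event Based Agent (Future)
+
+A message is only one kind of event. Later, events such as `message.received`, `message.recalled`, `group.member_joined`, and `friend.request_received` should all be able to trigger an AgentRunner through a unified event envelope.
+
+**This branch does not implement the full EBA capability; it only reserves:**
+- the event-first envelope (`AgentEventEnvelope`)
+- the AgentBinding model
+- the `run(event, binding)` entry point
+- PipelineAdapter (currently the Pipeline adapter source for AgentEventEnvelope / AgentBinding)
+
+For details, see [EVENT_BASED_AGENT.md](./EVENT_BASED_AGENT.md).
+
+### 4. Official runner plugins
+
+Migrating the official `local-agent` and the external runners is downstream work. They depend on the host capabilities LangBot provides, but should not in turn dictate the host protocol.
+
+`local-agent` can be moved out or rewritten. The acceptance focus is that it can fully consume LangBot's models, tools, knowledge bases, storage, events, history API, and result stream, not that it preserves the internal structure of the old built-in runner.
+
+For details, see [OFFICIAL_RUNNER_PLUGINS.md](./OFFICIAL_RUNNER_PLUGINS.md).
+
+### 5. Runtime Control Plane v2 (Future)
+
+The current AgentRunner v1 mainline is only responsible for `event -> binding -> runner.run(ctx) -> result stream`.
+A later Agent Platform v2 can add a runtime registry, heartbeat, task queue, daemon claim, progress/cancel, and runtime audit on the Host side.
+
+On top of these Host capabilities, a standalone agent control-plane plugin can be built; the plugin is responsible for UI, policy, and orchestration experience, while the source of truth for runtime/task remains with the Host.
+
+For details, see [RUNTIME_CONTROL_PLANE_V2.md](./RUNTIME_CONTROL_PLANE_V2.md).
+
+## Confirmed decisions
+
+- A plugin can declare multiple `AgentRunner` components; each component independently exposes its manifest, config schema, capabilities, and permissions.
+- The plugin itself is understood as a single-instance, stateless execution unit; different bindings do not create multiple plugin instances.
+- A binding only stores the runner id and binding config; it does not represent plugin instance state.
+- LangBot can provide host-owned state / storage so that a runner can keep its state in LangBot; this should be an authorized capability, not a mandatory requirement.
+- Official runner plugins are consumers of the protocol, not a priority constraint on protocol design.
+- The Pipeline is the current entry adapter, not the center of the future architecture.
+- EventGateway is a future integration point provided by an external event branch.
+- The runtime control plane is a v2 Host capability layer and does not block the current AgentRunner v1 mainline; agent control-plane plugins should be built on top of that Host capability layer.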
--- a/docs/agent-runner-pluginization/RUNTIME_CONTROL_PLANE_V2.md
+++ b/docs/agent-runner-pluginization/RUNTIME_CONTROL_PLANE_V2.md
@@ -0,0 +1,225 @@
+# Agent Runtime Control Plane V2
+
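+This document records the design direction for the later Agent Platform / runtime control plane. It is the **v2 document** under discussion, but v2 here refers to the Host capability layer / runtime control plane, not an `AgentRunner Protocol v2`, and it is not part of the deliverables of the current AgentRunner Protocol v1 pluginization mainline.
+
+## 1. Conclusion
+
+The current mainline should keep converging on AgentRunner v1: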
+
+```text
+message/event -> binding -> runner.run(ctx) -> result stream
+```
+
+Runtime Control Plane v2 adds a runtime control plane on the Host side:
+
+```text
+event -> task -> runtime selection -> daemon claim -> execute -> progress/audit/result
+```
+
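+On top of Runtime Control Plane v2, a standalone agent control-plane plugin can be built. The plugin is responsible for UI, policy, and orchestration experience; the source of truth for runtime, task, heartbeat, and audit must belong to the LangBot Host, not to plugin-private storage.
+
+## 2. No impact on the v1 mainline
+
+v2 should not change the basic contract of AgentRunner v1:
+
+- Existing runners such as `local-agent`, Dify, n8n, and Coze can still be executed directly under v1.
+- The current Claude Code / Codex MVP runners can continue to serve as a local subprocess development path.
+- The event-first context, resource authorization, and history / event / artifact / state / storage pull APIs that Host v1 already has are kept.
+- The Pipeline remains only the current entry adapter and is not part of the design center of the v2 runtime control plane.
+
+v2 only adds an optional capability layer on the Host. Runners or management plugins that need the control plane can declare that they use it; runners that do not need it are unaffected.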
+
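+## 3. Current Host capabilities and gaps
+
+The Host already has the infrastructure foundation for v2:
+
+- `AgentEventEnvelope` / `AgentBinding`
+- run-scoped resource authorization
+- EventLog / Transcript / ArtifactStore / PersistentStateStore
+- History / Event / Artifact / State / Storage pull APIs
+- AgentRunner result stream and controlled error return
+- binding config and host-owned state
+
+These capabilities are enough to support safe execution within a single `runner.run(ctx)`, but not enough to carry a full runtime control plane.
+
+v2 additionally needs the Host to add:
+
+- runtime registry: runtime id, owning workspace, host machine, provider capabilities, and status.
+- capability discovery: whether `claude` / `codex` / other CLIs exist, their versions, login status, and execution isolation capability.
+- heartbeat / liveness: whether the runtime is online, busy or idle, its last heartbeat, and available slots.
+- task queue: enqueue, claim, start, progress, complete, fail, cancel.
+- workspace mapping: how a LangBot workspace / project maps to a real directory, repository, or mount on the runtime.
+- secret / env projection: projecting tokens, proxies, MCP configuration, skills, and environment variables to the runtime according to authorization.
+- runtime audit: stdout, stderr, event stream, artifacts, failure reasons, execution time, and usage.
+- control API / UI: selecting a runtime, testing a runtime, viewing status, taking a runtime offline, cancelling tasks, and retrying tasks.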
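+
+## 4. Role boundaries
+
+### 4.1 LangBot Host
+
+The Host is the source of truth and the kernel of the control plane:
+
+- It stores runtime / task / heartbeat / audit state.
+- It performs permission checks, resource trimming, workspace binding, and auditing.
+- It decides whether a task can be claimed by a given runtime.
+- It writes execution results back to event / transcript / artifact / state in a uniform way.
+
+The Host should not embed the complex business logic of a specific agent CLI, nor promote the special behavior of one official runner into the general protocol.
+
+### 4.2 Agent control-plane plugin
+
+The management plugin is the productized management layer of the v2 control plane:
+
+- It displays runtimes, agents, tasks, progress, failures, and audit records.
+- It provides policy configuration, such as the default runtime, provider preference, concurrency limits, and retry policy.
+- It triggers runtime tests, task cancellation, task retry, and manual assignment.
+
+The management plugin should not put the source of truth for runtime/task into its own plugin storage. It should call the Host v2 API.
+
+### 4.3 Runtime daemon / worker
+
+The runtime daemon is responsible for the actual execution:
+
+- It detects CLIs and their versions on its machine.
+- It manages working directories, repositories, mounts, temporary files, and processes.
+- It claims tasks from the Host and reports progress / complete / fail after execution.
+- It streams stdout / stderr / artifacts / session id back to the Host.
+
+Provider adaptation logic for Claude Code, Codex, OpenCode, Gemini CLI, and similar tools should mainly live in the daemon / worker or in a provider adapter.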
+
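+## 5. Deployment forms
+
+### 5.1 uv / local embedded
+
+When the user starts LangBot with `uv` or directly from source, the machine running the LangBot process is the runtime host.
+
+In this mode LangBot can directly detect CLIs such as `claude` and `codex` on the user's machine and can run them directly as subprocesses. It suits personal development and local smoke tests, but should not be the only form of a team-level control plane.
+
+### 5.2 Docker embedded
+
+When the user starts LangBot with Docker, the runtime host is the container, not the host machine.
+
+Therefore:
+
+- Only the `claude` and `codex` inside the container can be detected.
+- Only the HOME, PATH, credentials, and mounted directories inside the container can be used.
+- If the image does not have the CLI installed, or the auth files / workspace are not mounted, the CLI runner is unavailable.
+
+Docker embedded can be offered as an advanced deployment option, but it requires the user to explicitly install the CLI and mount the workspace and credentials. The Host should not assume that a Docker container can automatically access the host machine's CLI.
+
+### 5.3 Sidecar daemon
+
+The recommended v2 form is a sidecar daemon: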
+
+```text
+LangBot Host (Docker or server)
+  <-> Runtime daemon on user host / worker host
+        -> claude / codex / other CLI
+```
+
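+In this mode LangBot can run inside Docker while the runtime daemon runs on the host machine or on a separate worker machine. The daemon is responsible for detecting local CLIs and holds the local credentials and workspace access.
+
+### 5.4 Remote runtime
+
+Team scenarios can use a remote runtime:
+
+- A development machine, build machine, cloud host, or dedicated worker.
+- Multiple workspaces can be bound to different runtimes.
+- The Host manages them only through registry / task queue / heartbeat / audit.
+
+### 5.5 API-only agent
+
+API-based runners such as Dify, n8n, Coze, and DashScope do not depend on a local CLI. They can continue to be executed directly under v1, and can be connected to v2 task/audit later if needed.
+
+## 6. Relationship to the Claude Code / Codex MVP runners
+
+The current Claude Code / Codex runners are v1 runners: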
+
+```text
+runner.run(ctx) -> subprocess("claude" / "codex")
+```
+
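+They are suited to verifying Host context projection, state resume, the result stream, and basic CLI invocation, but they have clear limitations:
+
+- Commands only run on the LangBot runtime host.
+- A Docker environment can only see the CLIs inside the container.
+- There is no runtime registry, heartbeat, task queue, cancel, or workspace lifecycle.
+- They do not provide release-grade execution isolation, secret projection, or team-level audit.
+
+v2 does not need to remove these runners. They can continue to exist as a dev / MVP path. If they are connected to the control plane in the future, a runtime-managed execution mode can be added: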
+
+```text
+runner binding -> Host task -> runtime daemon -> provider CLI -> Host result
+```
+
+## 7. Minimal v2 API draft
+
+The following only records capability boundaries; it does not represent final API naming.
+
+Runtime:
+
+- `runtime.register`
+- `runtime.heartbeat`
+- `runtime.list`
+- `runtime.get`
+- `runtime.disable`
+- `runtime.capabilities.report`
+- `runtime.capabilities.probe`
+
+Task:
+
+- `task.enqueue`
+- `task.claim`
+- `task.start`
+- `task.progress`
+- `task.complete`
+- `task.fail`
+- `task.cancel`
+- `task.retry`
+
+Workspace:
+
+- `runtime.workspace.bind`
+- `runtime.workspace.unbind`
+- `runtime.workspace.resolve`
+
+Audit / artifacts:
+
+- `task.log.append`
+- `task.artifact.create`
+- `task.events.page`
+
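+These APIs should be provided by the Host and constrained by workspace, runtime, binding, actor, and plugin identity.
+
+## 8. Capabilities a control-plane plugin can build
+
+Based on the v2 Host capabilities, an agent control-plane plugin similar to Multica can be implemented:
+
+- Runtime list, online status, CLI capabilities, versions, and authentication status.
+- Binding of agent profiles to runtimes/providers.
+- Task board, task details, progress stream, failure reasons, retry, and cancel.
+- Management of the mapping from workspaces to runtime directories / repositories.
+- Provider capability tests, for example whether Claude Code / Codex can be executed.
+- Audit view: input, output, tools, artifacts, stdout/stderr, and session id.
+- Policy configuration: concurrency, queues, default runtime, fallback runtime, and permission mode.
+
+The plugin should be a consumer of Host v2, not a replacement for it.
+
+## 9. Design principles
+
+- Stabilize v1 first; v2 is an optional layer on top.
+- The Host stores the source of truth; plugins provide the management experience.
+- The runtime daemon runs the concrete CLIs and accesses local resources.
+- Docker is not assumed to have the host machine's CLI; a sidecar or an explicit mount is required.
+- The Pipeline does not enter the center of the v2 control plane.
+- Direct subprocess runners can be kept, but only as a local/dev/MVP path.
+- Release-grade capabilities must go through the Host's permission, audit, and resource boundaries.
+
+## 10. Open issues
+
+- The authentication model between the runtime daemon and the Host: workspace token, device token, or scoped PAT.
+- The mapping between a task and an AgentRunner binding: enqueued directly by the binding, or decided by a separate task policy.
+- The stable fields of the runtime capability schema: provider, version, login status, execution isolation, workspace access, slot.
+- The boundary of secret projection: Host storage, storage on the user's machine, or an external secret manager.
+- Whether Docker compose should ship an official sidecar daemon example.
+- Whether the v2 UI is part of the core frontend or is provided entirely by the management plugin.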
--- a/docs/agent-runner-pluginization/SECURITY_HARDENING.md
+++ b/docs/agent-runner-pluginization/SECURITY_HARDENING.md
@@ -0,0 +1,73 @@
+# Agent Runner Security Hardening
+
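+This document records the security and stability hardening items that must be completed before agent-runner pluginization goes to production release.
+
+## Status
+
+**Current conclusion: this is not folded into the agent-runner plugin protocol loop of the current phase.**
+
+The goal of the current phase is to verify that LangBot can connect `local-agent` and external harness runners (such as the Claude Code runner) through the unified `run(event, binding)` protocol, and can pass events, context, resource handles, state, and the result stream.
+
+Release-grade security hardening is a later release gate. It should not block the current protocol loop, but it must be an acceptance condition before the feature is enabled by default in production.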
+
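+## Responsibility boundaries
+
+### LangBot Host is responsible for
+
+- Resource authorization: deciding which models, RAG, MCP, skills, artifacts, history, and state a given `run_id` / binding can access.
+- Resource projection: passing only authorized resource handles, config fragments, or context files to the runner.
+- Path policy: restricting the allowed paths and the cleanup policy for workspace / context file / artifact.
+- Secret policy: filtering secrets out of environment variables, configuration, logs, and transcripts.
+- Run constraints: configuring timeouts, turn counts, concurrency, quotas, output size, and the cancellation path.
+- Audit records: recording events, bindings, resource authorization, runner invocations, external harness session ids, key errors, and result summaries.
+
+### Runner Plugin is responsible for
+
+- Complying with the binding config, authorized resources, and run constraints that LangBot sends.
+- Projecting LangBot resources into a form the target runner can consume, such as context files, MCP configuration, environment variables, or CLI arguments.
+- Not keeping long-lived state inside the plugin instance; state that must persist across turns, such as the external session id / working directory, should be written to host-owned state.
+- Wrapping external processes with the minimum necessary logic, including command argument construction, timeout, cancellation, output parsing, and error mapping.
+
+### External Harness is responsible for
+
+External harnesses such as Claude Code, Codex, and Kimi Code can keep using their own permission model, tool allow / deny rules, MCP loading policy, session/resume mechanism, and sandbox capabilities.
+
+However, the external harness is not LangBot's only security boundary. LangBot must still complete resource authorization, path restriction, secret filtering, and audit logging before the call.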
+
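+## Boundaries acceptable for the current MVP
+
+The current phase can accept the following preconditions:
+
+- Runner bindings are configured by a trusted administrator.
+- The working directory and the context output directory are explicitly configured or are host-generated paths.
+- External runners use conservative permissions by default, for example plan / no-write mode or disabling high-risk tools.
+- The risk of runaway execution is reduced through timeout, max turns, output length limits, and process cancellation.
+- Pointers needed for resume, such as `external.session_id` and `external.working_directory`, are stored in host-owned state.
+
+These preconditions are enough for local E2E and protocol acceptance; they do not mean the production release is complete.
+
+## Release Gate Checklist
+
+Before enabling by default in production, the following must be completed:
+
+- Path isolation: workspace allowlist, path normalization, prevention of `..` escapes, context / artifact cleanup.
+- Permission boundary: runner capability declaration, binding-level resource authorization, run-level permission checks.
+- Secret handling: environment variable allowlist, config masking, log and transcript redaction.
+- MCP policy: MCP server allowlist, scoped tokens, tool allow / deny, auditing of dangerous tools.
+- Skill projection policy: skill source verification, read-only projection, version and digest records.
+- Process isolation: process group management, cancellation, timeout, CPU / memory / output quotas.
+- State lifecycle: expiry, cleanup, migration, and audit of session ids, workspaces, and artifacts.
+- Audit first-class: events, resource authorization, external commands, session ids, and result summaries are traceable.
+- UI / Admin control: administrators can see runner permissions, risk warnings, resource bindings, and the disable switch.
+- Test matrix: path escape, secret leakage, permission denial, timeout, cancellation, MCP deny, resume, cleanup, audit completeness.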
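+
+As an illustration of the path isolation item, the sketch below normalizes a candidate path and rejects anything that resolves outside a workspace root. It is only a starting point under assumptions: `resolve_inside` is a hypothetical helper, and a real implementation would also need an allowlist of roots and a policy for symlinks created after the check.
+
+```python
+from pathlib import Path
+
+
+def resolve_inside(root: str, candidate: str) -> Path:
+    """Resolve candidate relative to root; raise if the result leaves root."""
+    root_path = Path(root).resolve()
+    # resolve() collapses ".." segments and follows existing symlinks.
+    target = (root_path / candidate).resolve()
+    if target != root_path and root_path not in target.parents:
+        raise PermissionError(f"path escapes workspace root: {candidate}")
+    return target
+```
+
+Joining with an absolute `candidate` discards `root_path`, so absolute paths outside the root are rejected by the same check.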
+
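+## Out of current scope
+
+The following are not part of the protocol loop of the current phase:
+
+- A full asynchronous queue and an issue-centric product model.
+- A complex workflow engine.
+- Full integration of the Codex / Kimi runners.
+- Full migration of and joint debugging with the EBA branch.
+- The complete implementation of release-grade security hardening.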
+
--- a/docs/review/box-architecture.md
+++ b/docs/review/box-architecture.md
@@ -1,595 +0,0 @@
-# Box 系统架构深度分析
-
-> 更新日期: 2026-06-02
-> 状态更新: 自部署社区版已具备发布条件（box 可选、降级完善、无迁移欠债）；工具调用循环上限、配额遍历异步化、`host_path` 挂载白名单等已落地。剩余多租户 / 安全硬化项见 [SaaS 阻塞项清单](./box-issues.md)。
-> 分支: `feat/sandbox` (LangBot + langbot-plugin-sdk)
-> 相关文档: [SaaS 阻塞项](./box-issues.md) | [Session 作用域](./box-session-scope.md) | [Runtime 对比](./box-vs-plugin-runtime.md) | [测试覆盖](./box-test-coverage.md) | [toB 分析](./box-tob-analysis.md)
-
---
-
-## 1. 全局架构
-
-```
-┌──────────────────────────────────────────────────────────────────┐
-│                       LangBot 主进程                              │
-│                                                                   │
-│  LocalAgentRunner ──> ToolManager ──> NativeToolLoader            │
-│       │                    │              │                       │
-│       │                    │      exec / read / write / edit      │
-│       │                    │              glob / grep             │
-│       │                    │                                      │
-│       │                    ├──> MCPLoader ──> BoxStdioSession     │
-│       │                    │       (shared 容器, 多 process)       │
-│       │                    │                                      │
-│       │                    ├──> SkillToolLoader (activate 工具)    │
-│       │                    │                                      │
-│       │                    ├──> SkillAuthoringToolLoader          │
-│       │                    │                                      │
-│       │                    └──> PluginToolLoader                  │
-│       │                                                           │
-│  BoxService (门面)                                                 │
-│    ├─ Profile 管理 (locked 字段)                                   │
-│    ├─ Host mount 校验 (allowed_mount_roots)                        │
-│    ├─ Workspace quota 检查                                         │
-│    ├─ 输出截断 (head+tail)                                         │
-│    ├─ Session ID 模板解析 (resolve_box_session_id)                 │
-│    ├─ 技能挂载组装 (build_skill_extra_mounts)                      │
-│    ├─ 重连循环 (_reconnect_loop, 指数退避)                          │
-│    └─ BoxRuntimeConnector                                          │
-│         ├─ 心跳 loop (20s ping)                                    │
-│         └─ ActionRPCBoxClient                                      │
-│              │  Action RPC (stdio 或 WebSocket)                    │
-│                                                                    │
-│  SkillManager (skill_mgr)                                          │
-│    └─ 从 Box runtime 拉取 skills, 不可用时回落 data/skills          │
-└──────────────────────────────────────────────────────────────────┘
-               │
-               ▼
-┌──────────────────────────────────────────────────────────────────┐
-│              Box Runtime 进程 (SDK 侧)                            │
-│                                                                   │
-│  BoxServerHandler (Action RPC 处理, INIT 配置注入)                  │
-│       │                                                           │
-│  BoxRuntime (session 管理 / 进程生命周期 / TTL reaper)              │
-│       │       └─ session.managed_processes: dict[pid, _ManagedProcess]
-│       │                                                           │
-│  Backend (启动时根据 box.backend 配置选择):                          │
-│    DockerBackend ──┐                                              │
-│    PodmanBackend ──┤── CLISandboxBackend                          │
-│    NsjailBackend ──┘  (本地 CLI 或 fallback 到容器内 CLI)            │
-│    E2BBackend         (云沙箱, 需要 E2B_API_KEY)                    │
-│                                                                   │
-│  BoxSkillStore                                                    │
-│    ├─ list / get / create / update / delete                       │
-│    ├─ scan_skill_directory / read_skill_file / write_skill_file   │
-│    └─ preview_skill_zip / install_skill_zip (zip 或 GitHub)        │
-│                                                                   │
-│  aiohttp 单端口服务 (默认 :5410):                                    │
-│    /rpc/ws                                       — Action RPC      │
-│    /v1/sessions/{id}/managed-process/ws          — 默认 process     │
-│    /v1/sessions/{id}/managed-process/{pid}/ws    — 指定 process     │
-└──────────────────────────────────────────────────────────────────┘
-               │
-               ▼
-┌──────────────────────────────────────────────────────────────────┐
-│  容器 / 沙箱 (Docker/Podman 容器, nsjail sandbox, 或 E2B 远程沙箱)  │
-│  - 隔离文件系统 / 网络 / PID 命名空间                                │
-│  - 资源限制 (CPU, 内存, PID 数, 可选 workspace 配额)                 │
-│  - 主挂载 (host_path → mount_path) + 任意条 extra_mounts             │
-│      └─ Skills 通过 extra_mounts 挂在 /workspace/.skills/<name>     │
-│  - exec: 用户命令在此执行                                            │
-│  - managed process: 多个长驻进程并存 (MCP Server / 自定义服务)        │
-└──────────────────────────────────────────────────────────────────┘
-```
-
-**核心设计原则**:
- Box Runtime 作为独立进程运行，通过 Action RPC 与 LangBot 主进程通信，两者复用 SDK 的 IO 层（Handler → Connection → Controller）
- 一个 session_id 对应一个容器/沙箱实例。同一 session 内可并存多条 mount 与多个 managed process
- Skill / 默认 exec / MCP Server 共享同一个 session 容器（详见 [box-session-scope.md](./box-session-scope.md)）
-
---
-
-## 2. LangBot 侧模块
-
-### 2.1 BoxService (`pkg/box/service.py`, 722 行)
-
-应用层门面，协调 Profile、安全校验、配额、连接、Skill 挂载与 Session 模板：
-
-主要公开方法（按定义顺序）：
-
-```
-BoxService
-  ├─ initialize()                              连接 Box Runtime + 默认 workspace 准备
-  ├─ _on_runtime_disconnect(connector)         触发重连
-  ├─ _reconnect_loop(connector)                指数退避重连
-  ├─ available (property)                      连接状态
-  │
-  ├─ resolve_box_session_id(query)             从 pipeline 模板解析 session_id
-  ├─ build_skill_extra_mounts(query)           组装 pipeline-bound skill 的挂载列表
-  │
-  ├─ execute_tool(parameters, query)           Agent 调用 exec 时的入口
-  │    ├─ _apply_profile / build_spec
-  │    ├─ _validate_host_mount
-  │    ├─ _enforce_workspace_quota (phase=pre)
-  │    ├─ client.execute(spec)
-  │    ├─ _enforce_workspace_quota (phase=post)
-  │    └─ _truncate (stdout/stderr)
-  │
-  ├─ execute_spec_payload(spec_payload, ...)   内部入口（其他 loader 调用）
-  ├─ create_session(spec_payload, ...)         显式创建 session
-  ├─ start_managed_process(session_id, ...)    启动 managed process
-  ├─ get_managed_process(session_id, pid)      查询进程状态（pid 默认 'default'）
-  ├─ stop_managed_process(session_id, pid)     单独停止某个 managed process
-  ├─ get_managed_process_websocket_url(...)    返回 WS attach URL
-  │
-  ├─ list_skills() / get_skill(name)           Skill 元数据
-  ├─ create_skill / update_skill / delete_skill  Skill CRUD
-  ├─ scan_skill_directory(path)                扫描目录
-  ├─ list_skill_files / read_skill_file / write_skill_file
-  ├─ preview_skill_zip / install_skill_zip     zip / GitHub 安装
-  │
-  ├─ shutdown() / dispose()                    清理：RPC SHUTDOWN + 进程终止
-  ├─ get_status() / get_sessions() / get_recent_errors()
-  └─ get_system_guidance()                     LLM 系统提示
-```
-
-**Profile 系统**: 4 个内置 Profile（`default` / `offline_readonly` / `network_basic` / `network_extended`），`locked` frozenset 字段不可被 LLM 覆盖。参数合并顺序：Profile defaults → LLM 请求参数 → locked 强制值。
-
-**输出截断**: 默认 4000 字符上限，保留前 60% + 后 40%，中间插入 `[...truncated...]`。
-
-**Skill 挂载合并**: `execute_tool()` 调用时，`build_skill_extra_mounts(query)` 会把当前 pipeline-bound 的所有 skill 的 `package_root` 作为 `extra_mounts` 加入 BoxSpec，挂在 `/workspace/.skills/<name>`。LLM 通过 `activate` 工具显式激活某个 skill 后，工具调用才允许引用这个 skill 的虚拟路径。
-
-### 2.2 BoxRuntimeConnector (`pkg/box/connector.py`, 357 行)
-
-管理与 Box Runtime 的通信连接：
-
- **本地 stdio**: Unix/macOS 默认路径，fork `python -m langbot_plugin.cli.__init__ box -s --ws-control-port {port}` 子进程（与 plugin runtime 统一走 `lbp` CLI 入口）
- **本地 subprocess + WS**: Windows 本地（asyncio ProactorEventLoop 不支持 stdio pipe）
- **远程 WebSocket**: Docker 部署 / `box.runtime.endpoint` 显式配置时，连接 `ws://{host}:{port}/rpc/ws`
- **同步等待**: `asyncio.Event` + `wait_for(timeout=30s)` 模式确认连接
- **心跳**: `_heartbeat_loop()` 每 20s 调用 `ping()`，失败仅 DEBUG 日志（断开检测靠 connection close）
- **重连**: `runtime_disconnect_callback` 由 BoxService 提供，触发 `_reconnect_loop`
- **INIT 注入**: 连接建立后立即下发当前 `box.*` 配置子树（剔除 `runtime` 私有字段），Runtime 据此初始化 backend
-
-> **历史改进**: 2026-04-16 版本本文档曾列 P0 「Box 无心跳 / 无重连」，已修复（commit `2dfd9d5d`、`c6882cf`、`5029d9c` 等）。
-
-### 2.3 BoxWorkspaceSession 工具 (`pkg/box/workspace.py`, 413 行)
-
-此文件目前提供两类能力：
-
-1. **路径与命令重写工具函数** — `normalize_host_path` / `rewrite_mounted_path` / `unwrap_venv_path` / `rewrite_venv_command` / `infer_workspace_host_path`，被 MCP loader 与 Skill 路径解析共用。
-2. **`BoxWorkspaceSession`** — 围绕 BoxService 的轻量包装，专供 MCP-in-Box 场景使用（管理一个共享 session 的 session_id、构建挂载 payload、stage host 文件到共享 workspace）。
-
-**变化点**: 早期 Skill exec 会为每个 skill 创建独立 BoxWorkspaceSession（独占 session）；当前实现已转为 `extra_mounts` 模式，Skill 不再独占容器，只追加挂载。这部分 wrapping 逻辑已从 native loader 移除。
-
-### 2.4 policy.py (`pkg/box/policy.py`, 98 行) — 仍是死代码
-
-三层安全策略设计（`SandboxPolicy` / `ToolPolicy` / `ElevatedPolicy`），全项目无任何导入或调用。详见 [SaaS 阻塞项 S2](./box-issues.md)。
-
-### 2.5 SkillManager (`pkg/skill/manager.py`, 186 行)
-
-```
-SkillManager
-  ├─ initialize()                  调用 reload_skills()
-  ├─ reload_skills()               先从 Box runtime list_skills()，
-  │                                 不可用则回落 data/skills/ 扫描
-  ├─ refresh_skill_from_disk()     单 skill 重新加载
-  ├─ get_skill_by_name(name)
-  └─ get_managed_skills_root()     返回 Box 视角的 skills_root 路径
-```
-
-skill 元数据通过 `parse_frontmatter` 解析 `SKILL.md` 头部（`name` / `description` / `instructions`），不再做整体扫描的代价（典型 < 50 个）。
-
-### 2.6 Skill activation (`pkg/skill/activation.py`, 33 行) + Skill loader 辅助
-
-历史上 skill 通过 LLM 在文本中输出 `[ACTIVATE_SKILL:name]` 标记激活；当前已改为 **Tool Call 机制**：
-
- `SkillToolLoader` (`pkg/provider/tools/loaders/skill.py`, 157 行) 暴露 `activate` 工具，参数为 skill 名
- 工具实现调用 `register_activated_skill(query, skill_data)`，将激活态写入 `query.variables['_activated_skills']`
- 这种 KV-cache-friendly 模式对齐 Claude Code 设计；详见 [box-session-scope.md §4.3](./box-session-scope.md) 的 Tool Call 描述
-
-`activation.py` 现仅保留对外辅助函数（pipeline 层调用 loader 的 `register_activated_skill`）。
-
---
-
-## 3. SDK 侧模块
-
-### 3.1 BoxRuntime (`box/runtime.py`, 599 行)
-
-核心编排器，管理 session 生命周期与 backend 调度：
-
-```
-Session 生命周期:
-
-  Client EXEC / CREATE_SESSION
-       │
-       ▼
-  _get_or_create_session(spec)
-    ├─ _reap_expired_sessions_locked()   清理 TTL 过期 session
-    ├─ 已存在? → _assert_session_compatible() → 复用
-    ├─ Backend session 失踪? → 重建 (commit c6882cf)
-    └─ 新建? → backend.start_session(spec) → 创建容器
-       │       └─ 应用 spec.extra_mounts （多挂载）
-       ▼
-  execute(spec)
-    ├─ 获取 session lock (每 session 独立)
-    ├─ backend.exec(session, spec)       在容器中执行命令
-    ├─ 更新 last_used_at
-    └─ 超时? → 销毁 session
-       │
-       ▼
-  Session 保持存活直到:
-    ├─ TTL 过期 (默认 300s，下次操作时清理)
-    ├─ 执行超时 (自动销毁)
-    ├─ 客户端 DELETE_SESSION
-    └─ SHUTDOWN
-```
-
-**关键设计**:
- 每 session 有独立 `asyncio.Lock`，同一 session 内的命令串行执行
- 每 session 维护 `managed_processes: dict[process_id, _ManagedProcess]`，支持多个长驻进程并存（MCP / 自定义）
- 全局 `_lock` 保护 `_sessions` dict 的读写
- 兼容性检查：比较核心 spec 字段，`image` 字段对不支持自定义镜像的 backend（nsjail/E2B）会跳过
-
-**Backend 选择 (`_select_backend`)**: 优先级
-1. 显式 `box.backend` 配置（`docker` / `nsjail` / `e2b`）
-2. `local` (默认) → Docker / Podman / nsjail CLI 顺序探测
-3. `get_status` 调用时若当前 backend 不可用，会尝试重新选择 (commit `e5617c7`)
-
-### 3.2 Backend 系统
-
-#### CLISandboxBackend (`box/backend.py`, 411 行)
-
-Docker / Podman 公共基类：
-
-```
-start_session(spec):
-  1. validate_sandbox_security(spec)
-  2. docker/podman run -d --rm --name <name>
-     --network none (可选)
-     --cpus/--memory/--pids-limit
-     --read-only + --tmpfs /tmp
-     -v <host>:<mount>:<mode>          主挂载
-     -v <extra.host>:<extra.mount>:..  额外挂载 (extra_mounts)
-     <image> sh -lc 'while true; do sleep 3600; done'
-  3. 返回 BoxSessionInfo
-
-exec(session, spec):
-  docker/podman exec -e KEY=VAL <container>
-    sh -lc 'mkdir -p <workdir> && cd <workdir> && <cmd>'
-
-start_managed_process(session, spec):
-  docker/podman exec -i <container>
-    sh -lc 'mkdir -p <cwd> && cd <cwd> && exec <command> <args>'
-  返回 asyncio.subprocess.Process (stdin/stdout PIPE)
-```
-
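-The container starts with an idle process, and the actual commands run through `docker exec`. `--rm` makes sure the container is removed automatically when it exits.
-
-**Windows support**: the backend adapts its Windows path handling and subprocess invocation (commit `120817a`).
-
-**Orphan cleanup**: on startup, containers labeled `langbot.box=true` are enumerated, and any whose instance_id does not match are force-removed.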
-
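-#### NsjailBackend (`box/nsjail_backend.py`, 552 lines)
-
-A lightweight Linux sandbox (no container engine dependency):
-
- Uses namespace isolation (user/mount/pid/ipc/uts/cgroup/net)
- Mounts the host's `/usr`/`/lib`/`/bin`/`/sbin` read-only, plus selected `/etc` entries
- Creates a separate directory per session (workspace/tmp/home)
- Resource limits: cgroup v2 preferred, falling back to rlimit
- **CLI compatibility**: detects a system-installed nsjail via `shutil.which(self._nsjail_bin)`; if it is absent, tries the nsjail inside the container (commits `686fcc0`, `feed530`)
- **No custom images**: uses the host OS; the `image` field is fixed to `'host'`, and the compatibility check skips image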
-
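-#### E2BBackend (`box/e2b_backend.py`, 429 lines)
-
-Cloud sandbox backend (introduced in commit `75b547f`):
-
- Talks to the E2B platform through the `e2b` SDK
- Configuration: `box.e2b.api_key` / `api_url` / `template`
- Supports `extra_mounts` (commit `0fea9b1` syncs the files by uploading them)
- No local container engine dependency; suited to deployments without Docker or to multi-tenant SaaS
- Does not support a custom image field; the template controls the image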
-
-### 3.3 Server (`box/server.py`, 508 lines)
-
-A single-port aiohttp service (default 5410) that routes by path (ports merged in commit `8c71ec5`):
-
-1. **Action RPC** (`/rpc/ws`): `BoxServerHandler` handles every action, including `INIT` config injection and skill store operations
-2. **WS Relay** (`/v1/sessions/{id}/managed-process/ws` and `/v1/sessions/{id}/managed-process/{pid}/ws`): bridges a WebSocket to the stdin/stdout of the chosen managed process, in both directions
-
-In stdio mode aiohttp still starts on 5410, solely to serve managed process attach; Action RPC goes over stdin/stdout.
-
-### 3.4 Client (`box/client.py`, 377 lines)
-
-`ActionRPCBoxClient` wraps calls to `Handler.call_action()`:
-
- 25+ methods map to 25+ RPC actions (exec / session / managed-process / skill / status / shutdown)
- Error restoration: `_translate_action_error()` restores SDK-side exception types by string-prefix matching
- `execute()` timeout = 300s; everything else defaults to 15s
- `BoxRuntimeClient` is an ABC, available for possible future non-RPC implementations
-
-The package-level `__init__.py` explicitly exports `BoxRuntimeClient` and `ActionRPCBoxClient` (commit `df9c722`).
-
-### 3.5 Actions (`box/actions.py`, 34 lines)
-
-The `LangBotToBoxAction` enum defines **25** actions in total:
-
-| Category | Actions |
-|------|---------|
-| Control | `INIT`、`HEALTH`、`STATUS`、`GET_BACKEND_INFO`、`SHUTDOWN` |
-| Execution | `EXEC` |
-| Session | `CREATE_SESSION` / `GET_SESSION` / `GET_SESSIONS` / `DELETE_SESSION` |
-| Managed Process | `START_MANAGED_PROCESS` / `GET_MANAGED_PROCESS` / `STOP_MANAGED_PROCESS` |
-| Skill | `LIST_SKILLS` / `GET_SKILL` / `CREATE_SKILL` / `UPDATE_SKILL` / `DELETE_SKILL` / `SCAN_SKILL_DIRECTORY` / `LIST_SKILL_FILES` / `READ_SKILL_FILE` / `WRITE_SKILL_FILE` / `PREVIEW_SKILL_ZIP` / `INSTALL_SKILL_ZIP` |
-
-### 3.6 Models (`box/models.py`, 331 lines)
-
-Core data models:
-
-| Model | Purpose |
-|------|------|
-| `BoxNetworkMode` | `OFF` / `ON` |
-| `BoxExecutionStatus` | `COMPLETED` / `TIMED_OUT` |
-| `BoxHostMountMode` | `NONE` / `READ_ONLY` / `READ_WRITE` |
-| `BoxManagedProcessStatus` | `RUNNING` / `EXITED` |
-| `BoxMountSpec` | A single mount (host_path/mount_path/mode), **new** |
-| `BoxSpec` | Execution request; adds `extra_mounts: list[BoxMountSpec]`, `persistent`, `workspace_quota_mb` |
-| `BoxProfile` | 4 built-in profiles + `locked` frozenset |
-| `BoxSessionInfo` | Session state (including backend_name/created_at/last_used_at) |
-| `BoxManagedProcessSpec` | Long-running process parameters (process_id/command/args/env/cwd) |
-| `BoxManagedProcessInfo` | Process state (status/exit_code/stderr_preview/attached) |
-| `BoxExecutionResult` | Execution result (status/exit_code/stdout/stderr/duration_ms) |
-
-`BoxSpec` validators: `workdir` defaults to `mount_path`; `host_path` accepts POSIX and Windows paths; when `host_path` is set, `workdir` must be under `mount_path`.
-
-### 3.7 BoxSkillStore (`box/skill_store.py`, 647 lines)
-
-A new module (commit `4ab3502`) that moves skill persistence into the Box runtime:
-
-```
-BoxSkillStore
-  ├─ list_skills() / get_skill(name)
-  ├─ create_skill(data) / update_skill(name, data) / delete_skill(name)
-  ├─ scan_skill_directory(path)            scan a directory and return candidate skill packages
-  ├─ list_skill_files(name, path)          browse the file tree inside a skill
-  ├─ read_skill_file(name, path) / write_skill_file(name, path, content)
-  ├─ preview_skill_zip(zip_bytes, ...)     preview zip contents without writing to disk
-  └─ install_skill_zip(zip_bytes, ...)     extract, validate, and copy into skills_root
-     └─ supports source_subdir / target_suffix (commit 1aa043f)
-```
-
-GitHub install path: the HTTP layer (`api/http/service/skill.py`) first fetches with `git clone`, then goes through `install_skill_zip` or the directory path. Skill files live under `box.local.skills_root` (default `skills`, relative to `host_root`), which maps to `/workspace/.skills/` inside the container.
-
-### 3.8 Security (`box/security.py`, 52 lines)
-
-`validate_sandbox_security()`: checks host_path against a denylist, blocking mounts of `/etc`/`/proc`/`/sys`/`/dev`/`/root`/`/boot` and the Docker/Podman sockets.
-
-**Known gaps**: the root path `/` is not blocked, user home directories are not blocked, and the policy is a denylist rather than an allowlist. See [SaaS blocker S5](./box-issues.md).
-
-### 3.9 Errors (`box/errors.py`, 33 lines)
-
-| Exception type | Meaning |
-|----------|------|
-| `BoxError` | Base class |
-| `BoxValidationError` | spec/parameter validation failed |
-| `BoxBackendUnavailableError` | No backend available |
-| `BoxRuntimeUnavailableError` | Runtime service unavailable |
-| `BoxSessionConflictError` | Session already exists but the spec is incompatible |
-| `BoxSessionNotFoundError` | Session does not exist |
-| `BoxManagedProcessConflictError` | Session already has a process with the same name |
-| `BoxManagedProcessNotFoundError` | Process does not exist |
-
---
-
-## 4. Tool system integration
-
-### 4.1 ToolManager orchestration (`toolmgr.py`)
-
-```
-ToolManager.initialize()
-  ├─ NativeToolLoader      (exec / read / write / edit / glob / grep)
-  ├─ PluginToolLoader      (plugin tools)
-  ├─ MCPLoader             (MCP Server tools)
-  ├─ SkillToolLoader       (activate tool, activation via Tool Call)
-  └─ SkillAuthoringToolLoader  (Skill CRUD)
-
-Tool call priority: native → plugin → mcp → skill → skill_authoring
-```
-
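-### 4.2 Native Tools (`native.py`, 846 lines)
-
-| Tool | Runs inside Box | Accesses host filesystem |
-|------|:---:|:---:|
-| `exec`  | Yes | No |
-| `read`  | **No** | **Yes**, calls `open()` directly on host files |
-| `write` | **No** | **Yes**, calls `open()` directly on host files |
-| `edit`  | **No** | **Yes**, calls `open()` directly on host files |
-| `glob`  | **No** | **Yes**, walks host directories directly |
-| `grep`  | **No** | **Yes**, reads host files directly |
-
-**Asymmetric sandbox boundary**: this is a deliberate design trade-off. `read`/`write`/`edit`/`glob`/`grep` bypass the sandbox for performance (avoiding container I/O overhead and cross-process copies), which means the LLM can directly read and write any file under `allowed_mount_roots`. Skill paths are rewritten by `_resolve_host_path()`, which forbids escaping `package_root`.
-
-**Skill branch of exec**: when a command references a skill at `/workspace/.skills/<name>`:
-1. Verify that the skill is activated
-2. A single exec may reference only one skill package
-3. If the skill is a Python project (it has `requirements.txt` or `pyproject.toml`), the command is wrapped in a venv bootstrap (creating `.venv` inside the skill mount point)
-4. Call `box_service.execute_tool()`, which uses the default session_id and the already assembled `extra_mounts`; **a separate session per skill is no longer started**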
-
-### 4.3 MCP-in-Box (`mcp_stdio.py`, 354 lines)
-
-`BoxStdioSessionRuntime` runs MCP stdio servers inside a Box container, in a **shared-session, multi-process** mode (commit `529088e`):
-
-```
-initialize()
-  1. Reuse/create the shared session (session_id = _build_box_session_id())
-     - persistent=True, kept alive long-term
-  2. workspace.execute_raw(install_cmd) installs dependencies (optional)
-  3. Stage each MCP server's files to /workspace/.mcp/<process_id>/
-  4. workspace.start_managed_process(process_id=<server>)
-  5. websocket_client(ws_url) connects through the WS relay
-  6. ClientSession.initialize() performs the MCP protocol handshake
-```
-
-Configuration (`MCPServerBoxConfig`): `network='on'` (MCP servers usually need network access), `host_path_mode='ro'` (read-only by default), `startup_timeout_sec=120` (leaves time for pip install).
-
-Each MCP server is one managed process in the same session, with its own `process_id` and its own attach URL; the servers do not block one another.
-
---
-
-## 5. Startup and lifecycle
-
-### 5.1 Startup order (`build_app.py`)
-
-```
-BuildAppStage.run(ap)
-  ├─ ... (persistence, models, sessions) ...
-  │
-  ├─ BoxService(ap)
-  ├─ box_service.initialize()
-  │    └─ connector.initialize()
-  │         ├─ [stdio] fork box subprocess
-  │         ├─ [subprocess+WS] local on Windows
-  │         └─ [remote WS] connect URL
-  │    └─ start the heartbeat _heartbeat_task
-  ├─ ap.box_service = box_service
-  │
-  ├─ ToolManager(ap)
-  ├─ tool_mgr.initialize()
-  │    ├─ NativeToolLoader   (checks box_service.available)
-  │    ├─ PluginToolLoader
-  │    ├─ MCPLoader          (when Box is available, stdio MCP runs in the sandbox)
-  │    └─ SkillAuthoringToolLoader
-  ├─ ap.tool_mgr = tool_mgr
-  │
-  ├─ ... (platform, pipeline) ...
-  ├─ SkillManager.initialize()    (loads the skill list from the Box runtime)
-  └─ ... (RAG, HTTP, plugins) ...
-```
-
-BoxService is initialized **before** ToolManager. ToolManager checks `box_service.available` when it creates its loaders.
-
-### 5.2 Handling initialization failure
-
-```python
-try:
-    await self._runtime_connector.initialize()
-    self._available = True
-except Exception as e:
-    self._available = False
-    logger.warning(f"Box runtime unavailable: {e}")
-```
-
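-**Silent degradation**: a Box initialization failure does not stop the application from starting; it only means the 6 native tools, all Skill tools, and the MCP-in-Box tools are not exposed to the LLM. This differs from Plugin behavior (a Plugin failure raises an exception).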
-
-### 5.3 Teardown flow
-
-```
-app.dispose()
-  └─ box_service.dispose()
-       ├─ connector.dispose()
-       │    ├─ cancel _heartbeat_task
-       │    ├─ cancel _handler_task / _ctrl_task
-       │    └─ terminate subprocess (SIGTERM)
-       └─ loop.create_task(client.shutdown())
-            └─ RPC SHUTDOWN → Box Runtime cleans up all containers
-```
-
-Box additionally sends an RPC SHUTDOWN so that the Runtime cleans up its containers itself, which is safer than the Plugin approach of killing the process directly.
-
---
-
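-## 6. Configuration
-
-### config.yaml (after the refactor)
-
-```yaml
-box:
-    enabled: true         # Master switch for the whole Box subsystem. When false:
-                          #  - no connection to a remote Box runtime, no local stdio subprocess forked
-                          #  - sandbox tools (exec/read/write/edit/glob/grep) are not exposed to the LLM
-                          #  - skill add/edit, GitHub install, and file writes are all rejected
-                          #  - stdio-mode MCP servers fail at startup (http/sse modes are unaffected)
-                          #  - skill listing/reading stays available read-only
-                          # The BOX__ENABLED environment variable can override this (unified convention)
-    backend: 'local'      # 'local' (probe) / 'docker' / 'nsjail' / 'e2b'
-                          # The backend is chosen by box.backend / BOX__BACKEND
-    runtime:
-        endpoint: ''      # WS base address of an external Runtime, 'ws://host:5410'
-                          # Empty = locally self-managed Runtime
-    local:
-        profile: 'default'
-        image: ''                       # Overrides the profile's default image
-        host_root: './data/box'         # Workspace mount root; Docker deployments need an absolute path
-        default_workspace: ''           # Defaults to '<host_root>/default'
-        skills_root: 'skills'           # Box-managed skill package directory (relative to host_root)
-        allowed_mount_roots:            # Defaults to ['<host_root>']
-            - './data/box'
-            - '/tmp'
-        workspace_quota_mb: null        # Quota override; null = use the profile
-    e2b:
-        api_key: ''                     # Can also come from the E2B_API_KEY environment variable
-        api_url: ''                     # Set when self-hosting E2B
-        template: ''                    # Default template ID
-```
-
-> **Breaking change**: compared with the 2026-04-16 document, the configuration structure was completely reorganized (commit `eefdea4`). The former fields `box.profile` / `box.runtime_url` / `box.shared_host_root` / `box.allowed_host_mount_roots` all moved into the `box.local.*` subtable, and `box.backend` plus the `box.e2b.*` group were added.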
-
-### docker-compose.yaml
-
-The `langbot_box` service is gated by a compose profile, so a plain `docker compose up` does **not** start it. When the sandbox is needed:
-
-```bash
-docker compose --profile box up        # starts langbot + langbot_box + plugin runtime
-docker compose --profile all up        # same as above
-docker compose up                       # starts only langbot + plugin runtime (box off)
-```
-
-If `langbot_box` is not started, also set `box.enabled: false` in `data/config.yaml` (or add `BOX__ENABLED=false` to the langbot container env); otherwise LangBot keeps trying to connect to a Box runtime that does not exist and reports errors.
-
-```yaml
-# Key volumes for langbot_box
-volumes:
-  - ${LANGBOT_BOX_ROOT}:${LANGBOT_BOX_ROOT}         # workspace mount (same path for source and target)
-  - /var/run/docker.sock:/var/run/docker.sock       # Docker backend reuses the host docker
-```
-
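-### Behavior matrix when disabled or when the connection fails
-
-`box.enabled = false` and "enabled but connection failed" are **identical** in user-observable behavior: both are expressed through `BoxService.available = False`, except that `get_status` also returns an `enabled` field so the frontend can vary its wording.
-
-| Consumer | Box available | Box unavailable (disabled or failed) |
-|---|---|---|
-| native exec/read/write/edit/glob/grep tools | Exposed to the LLM | **Not exposed** |
-| `activate` / `register_skill` tools | Exposed to the LLM | **Not exposed** |
-| stdio MCP server | Started inside Box | **`_init_stdio_python_server` raises RuntimeError** and refuses; no fallback to host stdio |
-| http/sse MCP server | Normal | Normal (does not depend on Box) |
-| Skill list/read (`list_skills`/`get_skill`/`read_skill_file`) | Goes through the Box runtime | Read-only fallback to LangBot's local `data/skills/` |
-| Skill create/edit/install/file write | Goes through the Box runtime | **HTTP 400** with an explicit error message (`_require_box_for_write`) |
-| `box-session-id-template` in the Pipeline AI config | Takes effect normally | **Frontend banner** warns that the field has no effect |
-| Pipeline extensions page `enable_all_skills` / bound skills | Editable | **Disabled in the frontend**, with a banner |
-| Dashboard Box status card | Green dot / "已连接" (connected) | Gray dot / "已禁用" (disabled) or red dot / "已断开" (disconnected, failed) |
-
-> Edge case for the backend write rejection: if `ap.box_service` is **not installed at all** (old-style dev mode that skipped BuildAppStage), `_require_box_for_write` is treated as a no-op and the local `data/skills/` path is kept, for compatibility with historical tests and minimal setups. Production always installs `ap.box_service`, so this fallback is never triggered.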
-
-### Pipeline configuration (templates/metadata/pipeline/ai.yaml)
-
-`local-agent.config.box-session-id-template` controls the session scope. Presets:
-
- `{launcher_type}_{launcher_id}`: per chat (recommended, default)
- `{launcher_type}_{launcher_id}_{sender_id}`: per user in a group chat
- `{launcher_type}_{launcher_id}_{conversation_id}`: per conversation context
- `{query_id}`: per message (fully isolated)
-
-See [box-session-scope.md](./box-session-scope.md) for details.
-
-### REST API
-
-| Endpoint | Method | Description | Frontend |
-|------|------|------|:---:|
-| `/api/v1/box/status` | GET | Availability, profile, backend info | ✅ Monitoring page |
-| `/api/v1/box/sessions` | GET | List of active sessions | ❌ |
-| `/api/v1/box/errors` | GET | Most recent 50 errors | ❌ |
-| `/api/v1/skills` etc. | GET/POST/PUT/DELETE | Skill CRUD, file browsing, zip/GitHub install, preview | ✅ Skill management page |
-
-The frontend `web/src/app/home/monitoring/components/overview-cards/SystemStatusCards.tsx` already consumes `/api/v1/box/status` and shows the backend name, profile, and active session count. The sessions and errors APIs are not wired up yet.
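
The `box-session-id-template` presets listed under the Pipeline configuration above can be illustrated with a minimal Python sketch. It is only a sketch: the function name `render_session_id` and the sample values are hypothetical, and the `'unknown'` fallback for missing variables follows the design proposal in box-session-scope.md rather than confirmed shipped behavior of `BoxService.resolve_box_session_id`.

```python
from collections import defaultdict

# Default preset named in the pipeline metadata (per chat).
DEFAULT_TEMPLATE = "{launcher_type}_{launcher_id}"


def render_session_id(template: str, variables: dict) -> str:
    """Render a session id from a template (hypothetical helper).

    Placeholders missing from `variables` render as 'unknown' instead of
    raising KeyError; an empty template falls back to the default preset.
    """
    values = defaultdict(lambda: "unknown", {k: str(v) for k, v in variables.items()})
    return (template or DEFAULT_TEMPLATE).format_map(values)


if __name__ == "__main__":
    sample = {"launcher_type": "group", "launcher_id": 123456, "sender_id": 789}
    for preset in (
        "{launcher_type}_{launcher_id}",
        "{launcher_type}_{launcher_id}_{sender_id}",
        "{query_id}",
    ):
        print(preset, "->", render_session_id(preset, sample))
```

Two messages from the same group render the same id under the default preset and therefore reuse one session, while the `{query_id}` preset yields a fresh id for every message.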
--- a/docs/review/box-issues.md
+++ b/docs/review/box-issues.md
@@ -1,76 +0,0 @@
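-# Box system: blockers before the SaaS release
-
-> Updated: 2026-06-02
-> Branch: `feat/sandbox` (LangBot + langbot-plugin-sdk)
-> Related docs: [Architecture analysis](./box-architecture.md) | [Session scope](./box-session-scope.md) | [Runtime comparison](./box-vs-plugin-runtime.md) | [Test coverage](./box-test-coverage.md) | [toB analysis](./box-tob-analysis.md)
-
-## Scope
-
-**The self-hosted community edition is ready for release**: stdio mode is the default and box is optional; when box is off or unavailable, the backend, frontend, tools, skills, and stdio MCP all degrade cleanly (clear errors, no crashes); the configuration is backward compatible (an old `data/config.yaml` starts as-is); there are no new ORM models and no migration debt; a failed marketplace install does not break the instance. CI is fully green.
-
-This list **keeps only the blockers that must be handled before releasing as SaaS, multi-tenant, or publicly exposed**. The community edition (trusted, single operator, internal network) is not blocked by them: their risk surface only exists where "untrusted callers can reach the Box control plane directly" or "multiple tenants share resources".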
-
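-## Resolved (before the community edition release)
-
-| Item | Resolution |
-|----|------|
-| Unbounded tool-call loop (formerly #13) | `localagent.py` adds `MAX_TOOL_CALL_ROUNDS=128` and terminates gracefully when exceeded (`cafef1a3`) |
-| Quota check walked the workspace synchronously and blocked the event loop (formerly #10) | `_enforce_workspace_quota` is now async, and the workspace walk runs via `asyncio.to_thread` (`cafef1a3`) |
-| `host_path` mount allowlist (LangBot side of former #3) | `allowed_mount_roots` allowlist in `pkg/box/service.py`; an empty list rejects all host mounts |
-| Duplicate `_is_path_under` (formerly #12) | Deduplicated; only one definition remains |
-| Reconnect / heartbeat / Windows compatibility / nsjail image field / frontend Box status integration | See the previous review round; all merged |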
-
---
-
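-## SaaS blockers
-
-### S1. Box control plane has no authentication - Critical
-
- **Location**: SDK `box/server.py`, Action RPC WS (`/rpc/ws`) and the managed-process relay (`/v1/sessions/{id}/managed-process/{pid}/ws`)
- **Current state**: both WS handlers start serving right after `ws.prepare`, with no token or auth of any kind; box binds to `0.0.0.0:5410` by default. Anyone who can reach the port can issue `EXEC`, create sessions, attach to the managed-process stdin/stdout of any session, and even send `SHUTDOWN`. The LangBot→box INIT does not hand over any credential either.
- **Current mitigation**: in the default `docker-compose.yaml`, `langbot_box` does not publish 5410 to the host (blast radius limited to the internal bridge network); but box mounts `/var/run/docker.sock`, so any service on the same network (including a compromised plugin) can reach host root. If the operator publishes 5410 to the host or runs box standalone on `0.0.0.0`, it is fully exposed.
- **Required**: issue a token at INIT, and have both WS routes verify it per connection (query/header). This is the **number one** SaaS blocker.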
-
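-### S2. No exec authorization model (policy.py is dead code) - High
-
- **Location**: LangBot `pkg/box/policy.py` (`SandboxPolicy` / `ToolPolicy` / `ElevatedPolicy` are referenced nowhere in the project); `pkg/provider/tools/loaders/native.py`; `pkg/provider/tools/toolmgr.py`
- **Current state**: the native tools (`exec/read/write/edit/glob/grep`) are exposed all-or-nothing based on "is box available", with **no per-pipeline exec gateway, tool allowlist, sandbox mode, or privilege-escalation control**. As long as box is available, any pipeline using local-agent with a function-calling model can run arbitrary shell commands.
- **Required**: wire in policy.py (or an equivalent mechanism) to control, per pipeline, whether `exec` is exposed, which tools are allowlisted, and the sandbox network/read-only mode.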
-
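-### S3. Unbounded session resources (DoS) - High
-
- **#5 No cap on session count**: the `_sessions` dict in SDK `box/runtime.py` `_get_or_create_session` has no capacity limit, so a malicious caller varying `session_id` can create containers without bound and exhaust host CPU/memory/PIDs/disk.
- **#8 No scheduled reaping**: expired sessions are reaped only opportunistically inside `_get_or_create_session`, with no independent periodic task; a burst of creations followed by silence leaks containers permanently.
- **Required**: a `max_sessions` cap (reject or LRU), plus an independent periodic reaper (e.g. every 60s).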
-
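-### S4. Workspace quota has no kernel-level limit (TOCTOU) - Med-High
-
- **Location**: LangBot `pkg/box/service.py` `_enforce_workspace_quota` (application-level read-then-check); on the SDK side `workspace_quota_mb` is only recorded and passed through, with no kernel/FS limit such as `--storage-opt size=`
- **Current state**: there is a race window between the checks before and after execution; a single command (`dd`/`fallocate`) can fill the disk in that gap, and the post-check can only remediate afterwards.
- **Required**: a kernel-level limit via Docker `--storage-opt size=`, or a reservation-style quota with atomic Redis counters.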
-
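-### S5. Gaps in mount validation - Med-High
-
- **Location**: SDK `box/security.py` `_BLOCKED_HOST_PATHS_POSIX`; the `extra_mounts` handling in `box/backend.py`
- **Current state**: ① the SDK denylist still lacks `/` (matching is by prefix, so `host_path="/"` passes and mounts the entire host fs); user home, `/usr`, `/opt`, and `/tmp` are not blocked either. ② `validate_sandbox_security` validates only `spec.host_path` and **never iterates `spec.extra_mounts`**; LangBot's `allowed_mount_roots` also validates only `host_path`. At present `extra_mounts` is populated only internally by `build_skill_extra_mounts` (unreachable by the agent), but defense in depth is missing: once the unauthenticated RPC from S1 is reached, extra_mounts can mount any host path, and neither layer stops it.
- **Required**: add `/` to the SDK denylist (or switch to an allowlist); include `extra_mounts` in mount validation on both the SDK and LangBot sides.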
-
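-### S6. Missing container hardening - Med
-
- **Location**: the `docker run` assembly in SDK `box/backend.py`
- **Current state**: `--cap-drop=ALL`, `--security-opt=no-new-privileges`, and a non-root `--user` are not set; combined with the mounted docker.sock, the escape surface is large.
- **Required**: add the hardening flags above by default (with regression checks that common skills still work).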
-
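-### S7. Slow operations run under the global lock (scalability) - Med
-
- **Location**: SDK `box/runtime.py` `_get_or_create_session`: `backend.start_session()` (`docker run` / nsjail startup / E2B `Sandbox.create`) is called while `self._lock` is held
- **Impact**: during a cold start (several seconds for an image pull, >1s for E2B) all concurrent requests are serialized and blocked; under multi-tenant load the whole Box runtime stalls. The degradation shows up as latency rather than failure.
- **Required**: do only state checks and registration under the lock, and move container creation outside it.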
-
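-### S8. Other hardening / follow-ups - Low
-
- **#9** SDK `box/server.py` reads the private field `runtime._sessions` directly, bypassing the lock, and may observe inconsistent state under concurrency; a public accessor should be added.
- **#16** `execute_func_call` in `pkg/provider/tools/toolmgr.py` dispatches by priority, so a plugin/MCP tool with the same name as `exec/read/write/...` is silently shadowed; namespacing or a conflict warning should be added.
- **#4** Residual race in SDK `box/runtime.py` between INIT/handshake and backend instantiation (it only applies when a pure remote WS box starts first and LangBot connects later; on the stdio/compose paths the config is already in place via env at spawn time, so there is no race); business actions should be rejected until INIT completes.
- **#11** `extra_mounts` is fixed at container creation (the compatibility check in SDK `runtime.py` does not cover extra_mounts); in a long-lived shared session, skills activated later are not mounted (current mitigation: mount all pipeline-bound skills at creation); dynamic binding needs destroy-and-recreate or a documentation note.
- **#21** Integration tests are not in CI: real container execution, live E2B, and managed-process WS attach run only locally. Security-critical paths lack automated coverage; before SaaS, adding a Docker-in-Docker CI stage or a manual pre-merge checklist is recommended.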
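
The S5 requirement above (validate `extra_mounts` as well as `host_path`, and never let `/` through) could look roughly like the following. This is a hedged sketch, not the SDK's `validate_sandbox_security`: the names `validate_mounts` and `is_under` are hypothetical, an allowlist is assumed in place of the current denylist, and symlinks are not resolved because only pure path arithmetic is used.

```python
from pathlib import PurePosixPath


def is_under(path: PurePosixPath, root: PurePosixPath) -> bool:
    """True if `path` equals `root` or lies beneath it (component-wise)."""
    return path == root or root in path.parents


def validate_mounts(host_path, extra_mounts, allowed_roots):
    """Check the primary mount and every extra mount against an allowlist.

    Hypothetical helper. Raises ValueError on the first offending path.
    An empty allowlist rejects every host mount.
    """
    roots = [PurePosixPath(r) for r in allowed_roots]
    candidates = [host_path] if host_path else []
    candidates += [m["host_path"] for m in extra_mounts]
    for raw in candidates:
        path = PurePosixPath(raw)
        # Reject relative paths and '..' components before comparing prefixes.
        if not path.is_absolute() or ".." in path.parts:
            raise ValueError(f"host path must be absolute and normalized: {raw}")
        # The filesystem root is refused even if a root of '/' was configured.
        if path == PurePosixPath("/"):
            raise ValueError("mounting '/' is not allowed")
        if not any(is_under(path, root) for root in roots):
            raise ValueError(f"host path is not under an allowed root: {raw}")


if __name__ == "__main__":
    validate_mounts(
        "/data/box/default",
        [{"host_path": "/data/box/skills/web-search"}],
        ["/data/box"],
    )
    print("mounts accepted")
```

Comparing path components rather than string prefixes keeps `/data/boxes` from being accepted under an allowed root of `/data/box`.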
--- a/docs/review/box-session-scope.md
+++ b/docs/review/box-session-scope.md
@@ -1,402 +0,0 @@
-# Box Session Scope Design
-
-> Date: 2026-04-18 (last reviewed 2026-06-02)
-> Status (2026-06-02): the self-hosted community edition is release-ready (box optional, clean degradation, no migration debt). Tool-call loop cap, async quota scan, and the host_path mount allowlist have landed. Remaining multi-tenant / security hardening is tracked in [box-issues.md](./box-issues.md).
-> Branch: `feat/sandbox` (LangBot + langbot-plugin-sdk)
-> Related: [Box Architecture](./box-architecture.md) | [Box vs Plugin Runtime](./box-vs-plugin-runtime.md)
-
---
-
-## 0. Implementation Status (2026-05-19)
-
-This document was authored as a design proposal. The current `feat/sandbox` branch
-has shipped the design largely as written:
-
-| Item | Status | Notes |
-|------|--------|-------|
-| `BoxMountSpec` + `BoxSpec.extra_mounts` | ✅ Shipped | SDK `box/models.py` |
-| Docker / nsjail / E2B backends apply extra mounts | ✅ Shipped | Last gap closed by SDK commit `0fea9b1` (E2B) |
-| `box-session-id-template` in `local-agent` pipeline config | ✅ Shipped | `templates/metadata/pipeline/ai.yaml`, default `{launcher_type}_{launcher_id}` |
-| `BoxService.resolve_box_session_id(query)` | ✅ Shipped | `pkg/box/service.py:166` |
-| `BoxService.build_skill_extra_mounts(query)` | ✅ Shipped | `pkg/box/service.py:189` |
-| Skill exec uses unified container + extra mounts | ✅ Shipped | `pkg/provider/tools/loaders/native.py` skill branch |
-| MCP-in-Box uses shared persistent session, multi-process | ✅ Shipped (earlier than originally scoped) | SDK commit `529088e`, LangBot `mcp_stdio.py:_build_box_session_id` |
-| `BoxManagedProcessSpec.process_id` + multi-process per session | ✅ Shipped | `BoxRuntime` keeps `managed_processes: dict[pid, _ManagedProcess]` |
-| Per-tenant / quota integration with templates | ❌ Not started | See [box-tob-analysis.md](./box-tob-analysis.md) |
-
-The "Phase 2 deferred" note in §10 is **out of date**: MCP unification landed in
-the same line of work. A pipeline-scoped (not user-scoped) MCP container is the
-realized behavior: each pipeline's MCP servers share one `mcp-<pipeline>` session, and
-user exec sessions use the template-derived id.
-
-The remaining open work is multi-tenant overlays (tenant_id in session_id,
-quota counters keyed by tenant), tracked in the toB analysis doc rather than here.
-
---
-
-## 1. Problems
-
-### 1.1 Default exec: per-message containers
-
-Currently, `BoxService.execute_tool()` sets `session_id = str(query.query_id)` — an
-auto-incrementing integer per incoming message. Every user message creates a new sandbox
-container. Dependencies installed and in-container state are lost between messages.
-
-### 1.2 Three isolated container pools
-
-Default exec, skills, and MCP servers each manage their own containers with
-independent session IDs:
-
-| Path         | Session ID                                    | Container   |
-|--------------|-----------------------------------------------|-------------|
-| Default exec | `str(query_id)` (per message)                 | Ephemeral   |
-| Skill exec   | `skill-{launcher}_{id}-{skill_name}`          | Per skill   |
-| MCP stdio    | `mcp-{server_uuid}`                           | Per server  |
-
-This means a single logical user interaction can spawn 3+ containers that cannot
-share state, see each other's files, or reuse installed dependencies.
-
-### 1.3 Single bind mount limitation
-
-`BoxSpec` currently supports only **one** `host_path` → `mount_path` bind mount.
-This prevents mounting both a default workspace and skill directories into the
-same container.
-
---
-
-## 2. Concept Model
-
-```
-Platform Message
-  → Query (query_id: int, auto-increment, per message)
-    → Session (launcher_type + launcher_id, per chat window)
-      → Conversation (uuid, per dialogue context within a Session)
-```
-
-| Concept       | Key                                 | Example                    | Scope                        |
-|---------------|-------------------------------------|----------------------------|------------------------------|
-| Query         | `query_id`                          | `42`                       | Single message               |
-| Session       | `launcher_type` + `launcher_id`     | `group_123456`             | Chat window (group or PM)    |
-| Conversation  | `conversation_id` (UUID)            | `a1b2c3d4-...`             | Dialogue context within a Session |
-| Sender        | `sender_id`                         | `789`                      | Individual user              |
-
-Note: in a **group chat**, all users share the same Session (keyed by `group_id`). The
-individual sender is tracked as `sender_id` but does not affect Session/Conversation routing.
-
---
-
-## 3. Target Scenarios
-
-| #  | Scenario                       | Box Granularity                          | Desired `session_id`                                   |
-|----|--------------------------------|------------------------------------------|---------------------------------------------------------|
-| 1  | Personal assistant             | 1 Box per user, long-lived               | `{launcher_type}_{launcher_id}`                          |
-| 2  | Customer service               | 1 Box per customer, cross-pipeline       | `{launcher_type}_{launcher_id}`                          |
-| 3  | Internal employee tool         | 1 Box per employee                       | `{launcher_type}_{launcher_id}`                          |
-| 4  | Group chat shared assistant    | 1 Box per group                          | `{launcher_type}_{launcher_id}`                          |
-| 5  | Group chat isolated per user   | 1 Box per user within a group            | `{launcher_type}_{launcher_id}_{sender_id}`              |
-| 6  | Teaching (cross-channel)       | 1 Box per student across groups/PMs      | `{sender_id}`                                           |
-| 7  | One-off execution              | 1 Box per message (current behavior)     | `{query_id}`                                            |
-| 8  | Multi-project development      | 1 Box per conversation context           | `{launcher_type}_{launcher_id}_{conversation_id}`        |
-
-No single fixed granularity covers all scenarios. A template-based approach is needed.
-
---
-
-## 4. Design Overview
-
-Two key changes:
-
-1. **Unified container**: exec, skills, and MCP all share the same container per
-   session scope. No more separate container pools.
-2. **Configurable session scope**: `session_id` is generated from a template with
-   pipeline variables, configurable per pipeline.
-
-### 4.1 Unified Container with Multiple Mounts
-
-A single container per session scope is created on first use. It has:
-
- **Primary mount**: default workspace at `/workspace` (from `default_host_workspace`)
- **Skill mounts**: each pipeline-bound skill's `package_root` mounted at
-  `/workspace/.skills/{skill_name}/`
- **MCP servers**: run as managed processes inside the same container
-
-```
-Container (session_id = "group_123456")
-  /workspace/                          ← default workspace (bind mount, rw)
-  /workspace/.skills/web-search/       ← skill package (bind mount, rw)
-  /workspace/.skills/data-analysis/    ← skill package (bind mount, rw)
-  [managed process: mcp-server-a]      ← MCP server running inside
-  [managed process: mcp-server-b]      ← MCP server running inside
-```
-
-This requires extending `BoxSpec` to support multiple mounts (see §5).
-
-### 4.2 Session ID Template
-
-A new field `box-session-id-template` in the `local-agent` pipeline runner config
-controls the session scope:
-
-```yaml
-# templates/metadata/pipeline/ai.yaml (under local-agent.config)
- name: box-session-id-template
-  label:
-    en_US: Sandbox Scope
-    zh_Hans: 沙箱作用域
-  description:
-    en_US: >-
-      Determines how sandbox environments are shared. Use variables to
-      control isolation granularity.
-    zh_Hans: >-
-      决定沙箱环境的共享方式。使用变量控制隔离粒度。
-  type: select
-  required: false
-  default: "{launcher_type}_{launcher_id}"
-  options:
-    - value: "{launcher_type}_{launcher_id}"
-      label:
-        en_US: Per chat (Recommended)
-        zh_Hans: 每个会话（推荐）
-    - value: "{launcher_type}_{launcher_id}_{sender_id}"
-      label:
-        en_US: Per user in chat
-        zh_Hans: 会话中每个用户
-    - value: "{launcher_type}_{launcher_id}_{conversation_id}"
-      label:
-        en_US: Per conversation context
-        zh_Hans: 每个对话上下文
-    - value: "{query_id}"
-      label:
-        en_US: Per message (isolated)
-        zh_Hans: 每条消息（完全隔离）
-```
-
-Available template variables (populated by PreProcessor in `query.variables`):
-
-| Variable            | Source                          | Example              |
-|---------------------|---------------------------------|----------------------|
-| `{launcher_type}`   | `query.session.launcher_type`   | `person` / `group`   |
-| `{launcher_id}`     | `query.session.launcher_id`     | `123456`             |
-| `{sender_id}`       | `query.sender_id`               | `789`                |
-| `{conversation_id}` | `conversation.uuid`             | `a1b2c3d4-...`       |
-| `{query_id}`        | `query.query_id`                | `42`                 |
-
-Default `{launcher_type}_{launcher_id}` covers scenarios 1–4 out of the box.
-
---
-
-## 5. SDK Changes: Multi-Mount BoxSpec
-
-### 5.1 Model Extension
-
-```python
-# box/models.py
-
-class BoxMountSpec(pydantic.BaseModel):
-    """A single bind mount specification."""
-    host_path: str
-    mount_path: str
-    mode: BoxHostMountMode = BoxHostMountMode.READ_WRITE
-
-class BoxSpec(pydantic.BaseModel):
-    # ... existing fields ...
-    host_path: str | None = None              # Primary mount (backward compat)
-    host_path_mode: BoxHostMountMode = BoxHostMountMode.READ_WRITE
-    mount_path: str = DEFAULT_BOX_MOUNT_PATH
-    extra_mounts: list[BoxMountSpec] = []     # NEW: additional mounts
-```
-
-`extra_mounts` is additive — the existing `host_path` / `mount_path` pair remains
-the primary mount for backward compatibility.
-
-### 5.2 Backend: Apply Extra Mounts
-
-```python
-# box/backend.py — CLISandboxBackend.start_session()
-
-# Primary mount (unchanged)
-if spec.host_path is not None and spec.host_path_mode != BoxHostMountMode.NONE:
-    args.extend(['-v', f'{spec.host_path}:{spec.mount_path}:{spec.host_path_mode.value}'])
-
-# Extra mounts (NEW)
-for mount in spec.extra_mounts:
-    if mount.mode != BoxHostMountMode.NONE:
-        args.extend(['-v', f'{mount.host_path}:{mount.mount_path}:{mount.mode.value}'])
-```
-
-Same pattern for nsjail backend.
-
---
-
-## 6. LangBot Changes
-
-### 6.1 Session ID Resolution
-
-In `BoxService.execute_tool()`:
-
-```python
-# Before:
-spec_payload.setdefault('session_id', str(query.query_id))
-
-# After (needs `import collections` at module level):
-template = (query.pipeline_config or {}).get('ai', {}) \
-    .get('local-agent', {}).get('box-session-id-template',
-         '{launcher_type}_{launcher_id}')
-variables = query.variables or {}
-session_id = template.format_map(collections.defaultdict(
-    lambda: 'unknown', variables
-))
-spec_payload.setdefault('session_id', session_id)
-```
-
-### 6.2 Skill Exec: Use Same Container
-
-Currently `native.py:_invoke_exec` creates a separate `BoxWorkspaceSession` per
-skill with `host_path=package_root`. Instead:
-
-1. Use the **same session_id** as default exec (from the template).
-2. Pass the skill's `package_root` as an **extra mount** at
-   `/workspace/.skills/{skill_name}/` instead of replacing `/workspace`.
-3. The container already has the default workspace at `/workspace`.
-
-```python
-# native.py — _invoke_exec, skill branch (REVISED)
-
-# Same session_id as default exec
-session_id = resolve_box_session_id(query)
-
-spec_payload = {
-    'cmd': rewritten_command,
-    'workdir': rewritten_workdir,
-    'session_id': session_id,
-    'extra_mounts': [{
-        'host_path': package_root,
-        'mount_path': f'/workspace/.skills/{selected_skill_name}',
-        'mode': 'rw',
-    }],
-}
-result = await self.ap.box_service.execute_spec_payload(spec_payload, query)
-```
-
-The virtual path `/workspace/.skills/{name}` no longer needs rewriting at the
-command level — it maps directly to the bind mount path inside the container.
-
-### 6.3 MCP: Use Same Container
-
-MCP servers should run inside the same container as exec and skills. Changes:
-
-1. `BoxStdioSessionRuntime` uses the pipeline's session_id template instead of
-   `mcp-{server_uuid}`.
-2. MCP server's working directory is a subdirectory (e.g. `/workspace/.mcp/{name}/`).
-3. MCP server's dependencies are mounted or installed into that subdirectory.
-4. The MCP server runs as a managed process inside the shared container.
-
-Since MCP servers start at LangBot boot (not per-query), the session must be
-created eagerly. The container will be kept alive by the managed process
-exemption in TTL reaping (`runtime.py:259`).
-
-**Note**: MCP sessions are pipeline-scoped (not per-launcher), so their session_id
-should be a **fixed identifier per pipeline** rather than the user-facing template.
-This means one shared MCP container per pipeline, with user exec sessions separate.
-
-Alternatively, in a future iteration, MCP managed processes could be launched
-lazily into the user's container on first MCP tool call. This is more complex
-but maximizes sharing. For V1, keeping MCP containers at pipeline scope is
-simpler and more predictable.
-
---
-
-## 7. Mount Layout Summary
-
-### Default exec (no skills activated)
-
-```
-Container (session_id from template)
-  /workspace/          ← default_host_workspace (rw)
-```
-
-### Exec with activated skills
-
-```
-Container (same session_id)
-  /workspace/                          ← default_host_workspace (rw)
-  /workspace/.skills/web-search/       ← skill package_root (rw)
-  /workspace/.skills/data-analysis/    ← skill package_root (rw)
-```
-
-Extra mounts are **additive** — they are added when the container is first
-created (or on the first exec that references a skill). Since Docker bind
-mounts are specified at container creation time, skills must be known at
-creation time.
-
-**Resolution**: When creating a container, inject `extra_mounts` for **all
-pipeline-bound skills** (from `extensions_preferences`), not just the
-currently activated one. This way any skill can be activated later without
-recreating the container.
-
-### MCP servers (V1: pipeline-scoped)
-
-```
-Container (session_id = "mcp-pipeline-{pipeline_uuid}")
-  /workspace/                    ← MCP shared workspace
-  /workspace/.mcp/server-a/      ← MCP server A files
-  /workspace/.mcp/server-b/      ← MCP server B files
-  [managed process: server-a]
-  [managed process: server-b]
-```
-
----
-
-## 8. Data Migration
-
-Existing pipelines do not have `box-session-id-template`. The backend uses
-`.get(..., default)` so missing keys fall back to `{launcher_type}_{launcher_id}`.
-This changes behavior from per-message to per-launcher for existing pipelines.
-
-Recommendation: **accept the behavior change** — per-launcher is the more
-intuitive default, and the old per-message behavior was rarely desired.
-
----
-
-## 9. Cloud Quota Implications
-
-| Scope                                         | Typical concurrent containers |
-|-----------------------------------------------|-------------------------------|
-| `{query_id}` (per message)                    | Many, short-lived             |
-| `{launcher_type}_{launcher_id}` (per chat)    | = active chat count           |
-| `{sender_id}` (per user)                      | = active user count           |
-| `{conversation_id}` (per conversation)        | Between per-chat and per-msg  |
-
-With the unified container model, each scope value maps to exactly **one**
-container (instead of potentially 3+ per-message). This significantly reduces
-resource usage.
-
-Quota enforcement point: `BoxRuntime._get_or_create_session()` in the SDK.
-
----
-
-## 10. Implementation Phases
-
-### Phase 1: Session scope + skill unification (this PR)
-
-1. **SDK**: Extend `BoxSpec` with `extra_mounts: list[BoxMountSpec]`.
-2. **SDK**: Update Docker/nsjail backends to apply extra mounts.
-3. **LangBot**: Add `box-session-id-template` to `local-agent` YAML metadata
-   and default pipeline config JSON.
-4. **LangBot**: Update `BoxService.execute_tool()` to use template interpolation.
-5. **LangBot**: Update `native.py:_invoke_exec` skill branch to use same
-   session_id + extra mounts instead of separate `BoxWorkspaceSession`.
-6. **LangBot**: On container creation, inject extra mounts for all
-   pipeline-bound skills.
-7. **Frontend**: No code change — `DynamicFormComponent` renders `select` fields.
-8. **Tests**: Unit tests for template interpolation and multi-mount specs.
-
-### Phase 2: MCP unification (future)
-
-1. Refactor `BoxStdioSessionRuntime` to use pipeline-scoped shared container.
-2. MCP servers become managed processes in the shared container.
-3. Support multiple concurrent managed processes per container.
-
-MCP unification is deferred because it requires changes to the managed process
-model (currently 1 managed process per session) and has startup ordering
-concerns (MCP servers start at boot, before any user query determines
-a session_id).
--- a/docs/review/box-test-coverage.md
+++ b/docs/review/box-test-coverage.md
@@ -1,122 +0,0 @@
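-# Box System Test Coverage Analysis
-
-> Updated: 2026-06-02
-> Status update: The self-hosted community edition is ready for release (box is optional, graceful degradation is in place, no migration debt); the tool-call loop limit, asynchronous quota traversal, and the `host_path` mount allowlist have landed. For the remaining multi-tenancy and security-hardening items, see the [SaaS blocker list](./box-issues.md).
-> Branch: `feat/sandbox` (LangBot + langbot-plugin-sdk)
-
----
-
-## 1. Test File Inventory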
-
-### LangBot repository
-
-| File | Lines | Runs in CI | Coverage |
-|------|------|---------|---------|
-| `tests/unit_tests/box/test_box_connector.py` | 106 | Yes | Connector transport decision, WS relay URL, dispose, heartbeat/reconnect |
-| `tests/unit_tests/box/test_box_service.py` | 1224 | Yes | Service core logic (most comprehensive) |
-| `tests/unit_tests/box/test_workspace.py` | 147 | Yes | WorkspaceSession path rewriting, payload construction |
-| `tests/unit_tests/provider/test_mcp_box_integration.py` | 707 | Yes | MCP Box config, path rewriting, payload, shared-session/multi-process, runtime info |
-| `tests/unit_tests/provider/test_localagent_sandbox_exec.py` | 444 | Yes | LocalAgent exec flow, streaming, Skill activation (Tool Call) |
-| `tests/unit_tests/provider/test_tool_manager_native.py` | 249 | Yes | ToolManager routing, native tool CRUD, path traversal, exposure of the 6 tools |
-| `tests/unit_tests/provider/test_skill_tools.py` | 582 | Yes | Skill management, Tool Call activation, paths, authoring CRUD |
-| `tests/unit_tests/test_skill_service.py` | 396 | Yes | HTTP service: skill CRUD, zip/GitHub install, file browsing |
-| `tests/unit_tests/test_paths.py` | 23 | Yes | paths utilities |
-| `tests/unit_tests/test_preproc.py` | 134 | Yes | PreProcessor session variable injection, bound skill resolution |
-| `tests/unit_tests/pipeline/test_chat_handler_logging.py` | 78 | Yes | Chat handler logging regressions |
-| `tests/integration_tests/box/test_box_integration.py` | 329 | **No** | Real container execution, timeouts, network isolation |
-| `tests/integration_tests/box/test_box_mcp_integration.py` | 368 | **No** | Managed process, WS attach, shared-session cleanup |
-
-### SDK repository
-
-| File | Lines | Runs in CI | Coverage |
-|------|------|---------|---------|
-| `tests/box/test_backend_selection.py` | 255 | Yes | Explicit backend / local-mode probe order / reselect on config change |
-| `tests/box/test_nsjail_backend.py` | 452 | Yes | nsjail availability, installed CLI vs in-container CLI, session, arg construction, resource limits |
-| `tests/box/test_e2b_backend.py` | 482 | Yes | E2B SDK mock, session lifecycle, extra_mounts sync |
-| `tests/box/test_skill_store.py` | 88 | Yes | zip preview/install, basic file CRUD |
-
-**Total**: 17 test files, ~6,500 lines of test code; 2 of them are integration tests (about 700 lines) that do not run in CI.
-
-> Added since the 2026-04-16 version: `test_skill_service.py`, `test_paths.py`, `test_preproc.py`, `test_chat_handler_logging.py` (LangBot); `test_backend_selection.py`, `test_e2b_backend.py`, `test_skill_store.py` (SDK). `test_nsjail_backend.py` gained CLI compatibility cases (commit `feed530`).
-
----
-
-## 2. Well-Covered Areas
-
-| Area | Quality | Notes |
-|------|------|------|
-| BoxRuntime session management | Excellent | Session reuse, conflict detection, TTL config, recreation of vanished sessions |
-| BoxService Profile system | Excellent | 4 built-in Profiles, locked/unlocked fields, timeout clamp |
-| BoxService host mount security | Excellent | allowed_mount_roots, disallowed_roots, shared host root |
-| BoxService workspace quota | Excellent | Pre-/post-execution quota checks, over-quota cleanup |
-| BoxService output truncation | Excellent | Short / exact-boundary / long output, separate stderr |
-| BoxService observability | Excellent | Status report, error ring buffer, buffer cap |
-| BoxService session template | Good | `resolve_box_session_id` + `build_skill_extra_mounts` are covered in all three places: service / native / mcp |
-| RPC client/server protocol | Excellent | execute/get_sessions/delete/create/conflict error |
-| BoxRuntimeConnector | Good | local/remote modes, Docker platform, relay URL, heartbeat and reconnect callbacks |
-| BoxWorkspaceSession | Good | Payload construction, managed process path rewriting, stage host file |
-| BoxHostMountMode.NONE | Good | Enum validation, workdir constraints |
-| NsjailBackend | Good | Availability, installed vs in-container, session lifecycle, arg construction, resource limits |
-| E2BBackend | Good | Mock SDK, session/extra_mounts sync |
-| Backend selection | Good | Explicit backend priority, local probe order, reselect on config change |
-| MCP Box integration | Good | Config model, path rewriting, payload, shared-session multi-process |
-| Native tool loader | Good | 6 tools (exec/read/write/edit/glob/grep), path traversal blocking |
-| LocalAgent exec flow | Good | Full tool call loop, streaming, system prompt injection, Tool Call activation |
-| Skill system | Good | Loading, Tool Call activation, marker, path resolution, authoring CRUD, HTTP service |
-
----
-
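-## 3. Coverage Gaps
-
-### 3.1 No tests / severely insufficient
-
-| Area | Source file | Impact |
-|------|--------|------|
-| **`security.py`** | SDK `box/security.py` (52 lines) | `validate_sandbox_security()` has no tests at all. The security function that blocks dangerous mounts such as `/etc`, `/proc`, and the Docker socket has never been verified |
-| **`policy.py`** | `pkg/box/policy.py` (98 lines) | The three-layer security policy has no tests (and is also dead code) |
-| **`skill_store.py` edge cases** | SDK `box/skill_store.py` (647 lines) vs 88 lines of tests | The GitHub install path, `source_subdir` / `target_suffix` combinations, corrupted zips, file conflicts, and similar scenarios are not covered |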
-
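-### 3.2 Untested critical paths
-
-| Area | Notes |
-|------|------|
-| **Session TTL expiry** | Tests configure `session_ttl_sec` but never advance time to verify expiry cleanup |
-| **Concurrent session access** | No tests for concurrent exec / concurrent creation / race conditions |
-| **Container backend (Docker)** | Covered only by integration tests (not run in CI); unit tests all use FakeBackend |
-| **E2B real sandbox** | Unit tests are all mocks and never exercise the real E2B API |
-| **BoxRuntime shutdown()** | Called in test cleanup, but its behavior is never verified |
-| **BoxServerHandler error paths** | Malformed requests, unknown action types |
-| **WS relay** | Covered only in integration tests (not run in CI) |
-| **NsjailBackend managed process** | Completely untested |
-| **MCP stdio full lifecycle** | Dependency install → process start → health check → multi-process concurrency → retry |
-| **BoxService start/stop_managed_process** | The single-process flow has unit tests; the guarantee that multiple processes do not block each other relies mainly on integration tests |
-| **Reconnect exponential backoff** | Connector unit tests cover callback wiring but never run a full reconnect cycle |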
-
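-### 3.3 Missing edge cases
-
-| Area | Notes |
-|------|------|
-| BoxSpec validation | Invalid session_id formats, overly long commands, special characters in env |
-| BoxSpec.extra_mounts | Duplicate mount_path, conflicts with host_path, absolute vs relative paths |
-| BoxExecutionResult | Only COMPLETED and TIMED_OUT are tested; no test for the ERROR status |
-| Multi-backend fallback | Local-mode probe order is tested only with mocks; no real-machine fallback test for Docker unavailable → nsjail |
-| Profile YAML loading | Tests use hard-coded strings rather than loading from a real config.yaml |
-| Backend rebuild triggered by INIT config change | Unit tests verify this only in the initialization scenario |
-
----
-
-## 4. Gap Between Integration Tests and CI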
-
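-CI runs only `tests/unit_tests/`, so the following scenarios are **never verified by automation**:
-
-- Creation/execution/destruction of real containers
-- Container network isolation (`--network none`)
-- Container resource limits taking effect (cpus/memory/pids_limit)
-- Bidirectional WS I/O for managed processes
-- Concurrent I/O from multiple processes in the same session
-- Orphaned container cleanup
-- Container cleanup on session deletion
-- Process exit detection
-- Real E2B sandbox behavior
-
-**Recommendation**: Add an optional Docker-in-Docker integration test stage to CI that covers at least the core execution paths (exec / MCP attach / session destruction).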
--- a/docs/review/box-tob-analysis.md
+++ b/docs/review/box-tob-analysis.md
@@ -1,167 +0,0 @@
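-# Box System B2B Commercialization Analysis
-
-> Updated: 2026-06-02
-> Status update: The self-hosted community edition is ready for release (box is optional, graceful degradation is in place, no migration debt); the tool-call loop limit, asynchronous quota traversal, and the `host_path` mount allowlist have landed. For the remaining multi-tenancy and security-hardening items, see the [SaaS blocker list](./box-issues.md).
-> Branch: `feat/sandbox` (LangBot + langbot-plugin-sdk)
-
----
-
-## 1. Existing Strengths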
-
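-| Capability | B2B value | Code location |
-|------|---------|---------|
-| **Sandboxed isolated execution** | Foundational capability for enterprises to run untrusted code safely | SDK `box/backend.py` |
-| **Multi-backend support** | Fits different enterprise container infrastructures (Podman/Docker/nsjail/E2B) | SDK `box/runtime.py` `_select_backend()` |
-| **E2B cloud sandbox** | Fallback execution environment for SaaS / Docker-less deployments | SDK `box/e2b_backend.py` |
-| **Self-healing connection** | Heartbeat + automatic reconnect; a failure of the single Box runtime is recoverable | `pkg/box/connector.py` `_heartbeat_loop`, `pkg/box/service.py` `_reconnect_loop` |
-| **Profile + locked fields** | Operators lock down the security boundary; the LLM and users cannot bypass it | `pkg/box/service.py`, SDK `box/models.py` |
-| **Resource limits** | CPU / memory / PID-count limits prevent resource abuse | SDK `backend.py` `--cpus/--memory/--pids-limit` |
-| **Workspace quota** | Disk usage control | `pkg/box/service.py` `_enforce_workspace_quota` |
-| **Silent degradation** | An unavailable Box does not affect other features, lowering the deployment barrier | `pkg/box/service.py:78` `_available=False` |
-| **Orphaned container cleanup** | Prevents leaked containers from continuing to consume resources | SDK `backend.py` `cleanup_orphaned_containers` |
-| **Network isolation** | `--network none` prevents data exfiltration | SDK `backend.py` start_session |
-| **Read-only root filesystem** | `--read-only` prevents persistent tampering with the container | SDK `backend.py` start_session |
-| **Host path allowlist** | `allowed_host_mount_roots` restricts which directories can be mounted | `pkg/box/service.py` `_validate_host_mount` |
-
----
-
-## 2. B2B Gap Analysis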
-
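-### 2.1 Security and compliance
-
-| Dimension | Current state | B2B requirement | Priority |
-|------|------|---------|--------|
-| **WS relay authentication** | No authentication; anyone can attach | At least token authentication | **P0** |
-| **Security policy** | policy.py is dead code; there is effectively no fine-grained control | Tool-level allow/deny, sandbox mode control | **P0** |
-| **Audit log** | Only 50 in-memory `_recent_errors` entries | Persistent audit: who executed what, when, and with what result | **P0** |
-| **Host path validation** | Denylist policy; `/` is not blocked | Allowlist policy, deny by default | **P1** |
-| **Data residency** | No controls | Data isolation as required by GDPR / China's MLPS (等保) | **P2** |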
-
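-### 2.2 Multi-tenancy
-
-| Dimension | Current state | B2B requirement | Priority |
-|------|------|---------|--------|
-| **Tenant isolation** | No tenant concept | BoxSpec/Profile bound to tenant_id | **P0** |
-| **RBAC** | Token authentication only | admin/operator/viewer role permissions | **P0** |
-| **Resource quotas** | A single workspace quota | Per-tenant quotas for CPU time / memory / concurrency / execution count | **P1** |
-| **Session isolation** | All sessions share one dict | Partitioned by tenant, mutually invisible | **P1** |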
-
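-### 2.3 Reliability
-
-| Dimension | Current state | B2B requirement | Priority |
-|------|------|---------|--------|
-| **Connection recovery** | Implemented: 20s heartbeat + `_reconnect_loop` exponential backoff | Basic requirement met | Done |
-| **Session cleanup** | Opportunistic (triggered only when a session is created) | Scheduled cleanup + standalone reaper | **P1** |
-| **Horizontal scaling** | Single Box Runtime instance | Multi-instance load balancing (routed by tenant) | **P1** |
-| **Graceful degradation** | Present (_available=False) | Basic requirement met | Done |
-| **Backend self-healing** | Implemented: `get_status` reselects the backend if it is unavailable | Basic requirement met | Done |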
-
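-### 2.4 Observability
-
-| Dimension | Current state | B2B requirement | Priority |
-|------|------|---------|--------|
-| **Metrics** | No Prometheus metrics | Session count / execution latency / resource usage / error rate | **P1** |
-| **Structured logging** | Python logging, unstructured | JSON logs including trace_id/tenant_id | **P1** |
-| **Frontend panel** | The monitoring page is wired to `/api/v1/box/status` (backend name + active session count); `sessions` / `errors` are not yet wired in | Full status panel + historical error/audit list | **P2** |
-
----
-
-## 3. SaaS Deployment Architecture Recommendations
-
-### 3.1 Option A: Shared Box Runtime Pool (fast launch)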
-
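-```
-LangBot Instance ──> Box Runtime (shared)
-                       ├─ tenant_id label isolation
-                       ├─ Redis quota counters
-                       └─ Container labels: langbot.tenant_id=xxx
-```
-
-- **Pros**: Minimal change; just add tenant_id to BoxSpec/labels
-- **Cons**: The container engine is shared, so security isolation is weak
-
-### 3.2 Option B: Per-tenant K8s Namespace + gVisor (recommended for the mid term)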
-
-```
-LangBot ──> K8s API
-              ├─ namespace: tenant-xxx
-              │    ├─ RuntimeClass: gVisor (runsc)
-              │    ├─ ResourceQuota
-              │    └─ NetworkPolicy
-              └─ namespace: tenant-yyy
-                   └─ ...
-```
-
-- **Pros**: Strong isolation (namespace + gVisor), native K8s quotas
-- **Cons**: Requires rewriting the backend as K8s Jobs; high deployment complexity
-
-### 3.3 Option C: Direct K8s Job orchestration (long term)
-
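-```
-LangBot ──> K8s Job per execution
-              ├─ One Job created per execution
-              ├─ Pod Security Standards
-              ├─ Automatic scheduling and resource allocation
-              └─ Automatic cleanup by the Job TTL Controller
-```
-
-- **Pros**: Strongest isolation, naturally horizontally scalable
-- **Cons**: Cold-start latency, architectural rewrite
-
-**Recommended evolution path**: A → B → C
-
----
-
-## 4. Quota System Recommendations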
-
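-### Three quota layers
-
-| Layer | Implementation | Purpose |
-|----|------|------|
-| **Kernel layer** | Docker `--cpus`/`--memory`/`--storage-opt` | Hard resource caps that cannot be bypassed |
-| **Application layer** | Redis atomic counters | Concurrent session count / execution count / CPU time budget |
-| **Billing layer** | Monthly aggregation | Per-tenant billing (session-hours/execution-count) |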
-
-### Mapping Profiles to plans
-
-| Plan | Profile | locked fields | Quota |
-|------|---------|------------|------|
-| Free | `offline_readonly` | network, host_path_mode, rootfs | 10 exec/day, 0.5 CPU, 256MB |
-| Pro | `default` | (none) | 100 exec/day, 1 CPU, 512MB |
-| Enterprise | `network_extended` | (as needed) | Unlimited, 2 CPU, 1GB, custom images |
-
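-### TOCTOU quota fix
-
-The current TOCTOU issue in `_enforce_workspace_quota` can be solved in two ways:
-
-1. **Reservation-based quota** (application layer): pre-deduct the allowance with Redis `INCRBY` → execute → commit the deduction on success, roll back on failure
-2. **Kernel-level limit** (Docker): `--storage-opt size=500m` directly caps the size of the container's writable layer
-
----
-
-## 5. Priority Implementation Roadmap
-
-### Phase 1 (2-4 weeks): Security baseline
-
-- [ ] Add token authentication to the WS relay
-- [ ] Wire in or delete policy.py
-- [x] ~~Add reconnect and heartbeat to Box~~ (done, see [box-issues.md, resolved](./box-issues.md))
-- [ ] Persist audit logs (at least to a file or database)
-- [ ] Block `/` in `security.py`; consider an allowlist
-- [ ] Sort out the ordering of INIT and backend initialization (avoid instantiating the backend before the config arrives)
-
-### Phase 2 (4-8 weeks): Multi-tenancy foundation
-
-- [ ] Add a `tenant_id` field to BoxSpec
-- [ ] Add a tenant identifier to container labels
-- [ ] Redis quota counters (concurrency / execution count / time)
-- [ ] Basic RBAC framework
-- [ ] Scheduled session reaper
-
-### Phase 3 (8-16 weeks): Production readiness
-
-- [ ] Prometheus metrics exporter
-- [ ] Frontend Box status panel
-- [ ] K8s backend support (Option B)
-- [ ] Structured logging (JSON, trace_id)
-- [ ] Horizontal scaling support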
--- a/docs/review/box-vs-plugin-runtime.md
+++ b/docs/review/box-vs-plugin-runtime.md
@@ -1,222 +0,0 @@
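-# Box Runtime vs Plugin Runtime: Connection Architecture Comparison
-
-> Updated: 2026-06-02
-> Status update: The self-hosted community edition is ready for release (box is optional, graceful degradation is in place, no migration debt); the tool-call loop limit, asynchronous quota traversal, and the `host_path` mount allowlist have landed. For the remaining multi-tenancy and security-hardening items, see the [SaaS blocker list](./box-issues.md).
-> Branch: `feat/sandbox` (LangBot + langbot-plugin-sdk)
-
----
-
-## 1. Overall Differences
-
-| Dimension | Plugin Runtime | Box Runtime |
-|------|---------------|-------------|
-| **Inheritance** | `PluginRuntimeConnector(ManagedRuntimeConnector)` | `BoxRuntimeConnector` (standalone class) |
-| **Transport branches** | 3 (Docker/WS, Win32/subprocess+WS, Unix/stdio) | 3 (local stdio, Win32/subprocess+WS, remote WS) |
-| **Heartbeat** | 20s ping loop | 20s ping loop (`_heartbeat_loop`) |
-| **Reconnect** | WS mode: sleep 3s → re-initialize | Handled by BoxService `_reconnect_loop`, exponential backoff |
-| **Handler type** | `RuntimeConnectionHandler` (1132 lines, 25+ actions) | Base `Handler` + `BoxServerHandler` (25 actions on the SDK side) |
-| **Client abstraction** | The Handler is the API | Standalone `ActionRPCBoxClient` wrapping the Handler |
-| **Enable/disable** | `is_enable_plugin` switch | No switch (availability is determined by the initialization result) |
-| **Initialization failure** | Exception propagates | Silent degradation `_available=False` |
-| **Shutdown** | Kills the process directly | RPC SHUTDOWN → clean up containers → then kill the process |
-
----
-
-## 2. Transport Decision
-
-### Plugin: 3-way decision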
-
-```python
-# pkg/plugin/connector.py:106-165
-if get_platform() == 'docker' or use_websocket_to_connect_plugin_runtime():
-    # Docker/WS → ws://langbot_plugin_runtime:5400/control/ws
-elif get_platform() == 'win32':
-    # Windows → spawn a subprocess (no pipe) + ws://localhost:5400/control/ws
-else:
-    # Unix/Mac → StdioClientController(python -m langbot_plugin.cli rt -s)
-```
-
-### Box: 3-way decision
-
-```python
-# pkg/box/connector.py
-if self._uses_websocket():
-    if platform.get_platform() == 'win32' and not self.configured_runtime_url:
-        await self._start_subprocess_then_ws()  # subprocess + ws://localhost:5410/rpc/ws
-    else:
-        await self._connect_remote_ws()         # ws://{host}:5410/rpc/ws
-else:
-    await self._start_local_stdio()             # StdioClientController
-```
-
-> History: The 2026-04-16 version of this document described Box as a 2-way decision (missing the Windows branch). It is now aligned with Plugin's 3-way design.
-
-### Decision matrix
-
-| Environment | Plugin | Box |
-|------|--------|-----|
-| Docker | WS → `:5400` | WS → `:5410/rpc/ws` |
-| `--standalone-box` | N/A | WS → `localhost:5410/rpc/ws` |
-| Windows, non-Docker | subprocess + WS (`:5400`) | subprocess + WS (`localhost:5410/rpc/ws`) |
-| Unix/Mac, non-Docker | stdio | stdio |
-| Manually configured URL | Via a config option | WS → user-configured URL |
-
----
-
-## 3. Connection Establishment
-
-### Differences in synchronization mode
-
-**Plugin**: `new_connection_callback` pings and awaits handler_task directly; `initialize()` starts asynchronously via `create_task()` and does not block waiting for the connection.
-
-**Box**: Uses the `asyncio.Event` + `wait_for(timeout=30s)` pattern; `initialize()` waits synchronously until the connection succeeds or times out.
-
-### Box stdio path
-
-```
-connector._start_local_stdio()
-  ├─ connected = asyncio.Event()
-  ├─ ctrl = StdioClientController(python, ['-m', 'langbot_plugin.cli.__init__', 'box', '-s', '--ws-control-port', N])
-  ├─ _ctrl_task = create_task(ctrl.run(callback))
-  │    callback:
-  │      handler = Handler(connection)          ← base Handler, no disconnect_callback
-  │      client.set_handler(handler)
-  │      _handler_task = create_task(handler.run())
-  │      call_action(PING, {})                  ← handshake, timeout=15s
-  │      connected.set()                        ← notify the outer scope
-  │      await _handler_task                    ← block until disconnect
-  └─ await wait_for(connected.wait(), 30s)      ← synchronous wait
-```
-
-### Plugin stdio path
-
-```
-connector.initialize()
-  ├─ ctrl = StdioClientController(python, ['-m', 'langbot_plugin.cli', 'rt', '-s'])
-  ├─ task = ctrl.run(callback)
-  │    callback:
-  │      disconnect_callback:
-  │        [WS] → runtime_disconnect_callback → reconnect
-  │        [stdio] → log only, no reconnect
-  │      handler = RuntimeConnectionHandler(conn, disconnect_cb, ap)
-  │      create_task(handler.run())
-  │      handler.ping()                         ← handshake, timeout=10s
-  │      await handler_task                     ← block until disconnect
-  ├─ create_task(heartbeat_loop())              ← 20s ping loop
-  └─ create_task(task)                          ← does not wait for the connection
-```
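-
----
-
-## 4. Heartbeat and Reconnect
-
-### Heartbeat
-
-| Dimension | Plugin | Box |
-|------|--------|-----|
-| Has heartbeat? | Yes | Yes (`connector.py` `_heartbeat_loop`) |
-| Interval | 20s | 20s |
-| Failure handling | DEBUG log only; does not trigger reconnect | DEBUG log only; relies on connection close to trigger reconnect |
-| Lifetime | Entire application lifetime | Starts after the connection is established; cancelled on `dispose()` |
-
-### Reconnect
-
-| Dimension | Plugin | Box |
-|------|--------|-----|
-| Docker/WS disconnect | `runtime_disconnect_callback` → sleep 3s → re-initialize | `runtime_disconnect_callback` → `BoxService._reconnect_loop()` (exponential backoff) |
-| WS connection failure | Same as above | Same as above; `_available=False` on initial failure, restored after a successful reconnect |
-| stdio disconnect | Log only, no reconnect | Wired to the same callback; a stdio reconnect requires re-forking the subprocess |
-| Reconnect backoff | Fixed 3s, no backoff | Exponential backoff |
-
-> History: The 2026-04-16 version of this document marked heartbeat and reconnect as missing in Box. Both were fixed in commits `2dfd9d5d` / `c6882cf` / `5029d9c` and others (see [box-issues.md, resolved](./box-issues.md)).
-
----
-
-## 5. Shared IO Layer
-
-Both reuse the same SDK IO infrastructure: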
-
-```
-Handler ← ABC                              (runtime/io/handler.py)
-  ├── RuntimeConnectionHandler              (used by Plugin, LangBot side)
-  ├── ControlConnectionHandler              (used by Plugin, SDK side)
-  ├── BoxServerHandler                      (used by Box, SDK side)
-  └── anonymous Handler instance            (used by Box, LangBot side)
-
-Connection ← ABC
-  ├── StdioConnection    (stdio: 16KB chunks, application-layer framing protocol)
-  └── WebSocketConnection (WS: 64KB chunks, native WS framing)
-
-Controller ← ABC
-  ├── StdioClientController    (forks a subprocess, pipes stdin/stdout)
-  ├── StdioServerController    (takes over the current process's stdin/stdout)
-  ├── WebSocketClientController (connects to a WS server)
-  └── WebSocketServerController (listens on a WS port)
-```
-
-Shared core mechanisms:
-- `call_action()` / `call_action_generator()`: RPC call / streaming call
-- `ActionRequest` / `ActionResponse`: request/response protocol
-- `seq_id` correlation: concurrent requests multiplexed over a single connection
-- `CommonAction.PING`: used by both for the initial handshake
-- File transfer (`send_file`): used by Plugin, not by Box
-
----
-
-## 6. Port Scheme
-
-| Service | Plugin | Box |
-|------|--------|-----|
-| Action RPC (stdio) | stdin/stdout | stdin/stdout |
-| Action RPC (WS) | `:5400` | `:5410/rpc/ws` |
-| Auxiliary services | debug WS `:5401` | managed process WS relay `:5410/v1/sessions/{id}/managed-process/ws` |
-
-**Box specifics**: A single-port aiohttp service (default 5410) that distinguishes Action RPC from the managed process relay by path. Even in stdio mode, aiohttp is started on `:5410` for managed process attach. Plugin opens no additional port in stdio mode.
-
----
-
-## 7. Teardown Comparison
-
-### Plugin
-
-```python
-dispose():
-  if stdio: ctrl.process.terminate()
-  _dispose_subprocess()         # Windows subprocess
-  heartbeat_task.cancel()
-```
-
-### Box
-
-```python
-connector.dispose():
-  _handler_task.cancel()
-  _ctrl_task.cancel()
-  _subprocess.terminate()
-
-service.dispose():
-  connector.dispose()
-  loop.create_task(client.shutdown())   # RPC SHUTDOWN → clean up all containers
-```
-
-Box's RPC SHUTDOWN ensures containers are stopped properly and do not become orphans. Plugin kills the process directly.
-
----
-
-## 8. Improvement Suggestions
-
-### P0
-
-1. **Add WS authentication to both**: at least token authentication (issued at INIT, verified on connect)
-
-### P1
-
-2. **Consider having Box inherit from ManagedRuntimeConnector**: reuse `_start_runtime_subprocess` / `_wait_until_ready` / `_dispose_subprocess` to reduce duplicated code
-3. **Add backoff to Plugin reconnect**: a fixed 3s with no backoff can cause log flooding; aligning with Box's exponential backoff is recommended
-4. **Unify the connection management pattern**: Event-based (Box) vs direct-await (Plugin); consider converging on one
-
-### Completed (since the previous round)
-
-- ~~Add reconnect to Box~~ (commit `2dfd9d5d`)
-- ~~Add heartbeat to Box~~ (20s loop, consistent with Plugin)
-- ~~Add Windows support to Box~~ (commit `120817a` / `fafb7a4`)
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "langbot"
-version = "4.10.0-beta.1"
+version = "4.9.7"
 description = "Production-grade platform for building agentic IM bots"
 readme = "README.md"
 license-files = ["LICENSE"]
@@ -70,7 +70,7 @@ dependencies = [
    "chromadb>=1.0.0,<2.0.0",
    "qdrant-client (>=1.15.1,<2.0.0)",
    "pyseekdb==1.1.0.post3",
-    "langbot-plugin==0.4.0b1",
+    "langbot-plugin==0.3.11",
    "asyncpg>=0.30.0",
    "line-bot-sdk>=3.19.0",
    "matrix-nio>=0.25.2",
@@ -105,6 +105,9 @@ classifiers = [
    "Topic :: Communications :: Chat",
 ]

+[tool.uv.sources]
+langbot-plugin = { path = "../langbot-plugin-sdk", editable = true }
+
 [project.urls]
 Homepage = "https://langbot.app"
 Documentation = "https://docs.langbot.app"
--- a/src/langbot/__init__.py
+++ b/src/langbot/__init__.py
@@ -1,3 +1,3 @@
 """LangBot - Production-grade platform for building agentic IM bots"""

-__version__ = '4.10.0-beta.1'
+__version__ = '4.9.7'
--- a/src/langbot/__main__.py
+++ b/src/langbot/__main__.py
@@ -5,8 +5,6 @@ import argparse
 import sys
 import os

-from langbot.pkg.utils import paths
-
 # ASCII art banner
 asciiart = r"""
 _                   ___      _   
@@ -29,12 +27,6 @@ async def main_entry(loop: asyncio.AbstractEventLoop):
        help='Use standalone plugin runtime / 使用独立插件运行时',
        default=False,
    )
-    parser.add_argument(
-        '--standalone-box',
-        action='store_true',
-        help='Use standalone box runtime / 使用独立 Box 运行时',
-        default=False,
-    )
    parser.add_argument('--debug', action='store_true', help='Debug mode / 调试模式', default=False)
    args = parser.parse_args()

@@ -43,11 +35,6 @@ async def main_entry(loop: asyncio.AbstractEventLoop):

        platform.standalone_runtime = True

-    if args.standalone_box:
-        from langbot.pkg.utils import platform
-
-        platform.standalone_box = True
-
    if args.debug:
        from langbot.pkg.utils import constants

@@ -100,7 +87,7 @@ def main():
    # Set up the working directory
    # When installed as a package, we need to handle the working directory differently
    # We'll create data directory in current working directory if not exists
-    os.makedirs(paths.get_data_root(), exist_ok=True)
+    os.makedirs('data', exist_ok=True)

    loop = asyncio.new_event_loop()

--- a/src/langbot/pkg/agent/__init__.py
+++ b/src/langbot/pkg/agent/__init__.py
@@ -0,0 +1,37 @@
+"""Agent runner subsystem for LangBot."""
+from __future__ import annotations
+
+from .runner.descriptor import AgentRunnerDescriptor
+from .runner.id import parse_runner_id, format_runner_id, RunnerIdParts, is_plugin_runner_id
+from .runner.errors import (
+    AgentRunnerError,
+    RunnerNotFoundError,
+    RunnerNotAuthorizedError,
+    RunnerProtocolError,
+    RunnerExecutionError,
+)
+from .runner.registry import AgentRunnerRegistry
+from .runner.context_builder import AgentRunContextBuilder
+from .runner.resource_builder import AgentResourceBuilder
+from .runner.result_normalizer import AgentResultNormalizer
+from .runner.orchestrator import AgentRunOrchestrator
+from .runner.config_migration import ConfigMigration
+
+__all__ = [
+    'AgentRunnerDescriptor',
+    'parse_runner_id',
+    'format_runner_id',
+    'is_plugin_runner_id',
+    'RunnerIdParts',
+    'AgentRunnerError',
+    'RunnerNotFoundError',
+    'RunnerNotAuthorizedError',
+    'RunnerProtocolError',
+    'RunnerExecutionError',
+    'AgentRunnerRegistry',
+    'AgentRunContextBuilder',
+    'AgentResourceBuilder',
+    'AgentResultNormalizer',
+    'AgentRunOrchestrator',
+    'ConfigMigration',
+]
--- a/src/langbot/pkg/agent/runner/__init__.py
+++ b/src/langbot/pkg/agent/runner/__init__.py
@@ -0,0 +1,52 @@
+"""Agent runner modules."""
+from __future__ import annotations
+
+from .descriptor import AgentRunnerDescriptor
+from .id import parse_runner_id, format_runner_id, RunnerIdParts
+from .errors import (
+    AgentRunnerError,
+    RunnerNotFoundError,
+    RunnerNotAuthorizedError,
+    RunnerProtocolError,
+    RunnerExecutionError,
+)
+from .registry import AgentRunnerRegistry
+from .context_builder import AgentRunContextBuilder
+from .resource_builder import AgentResourceBuilder
+from .result_normalizer import AgentResultNormalizer
+from .orchestrator import AgentRunOrchestrator
+from .config_migration import ConfigMigration
+from .session_registry import AgentRunSessionRegistry, AgentRunSession, get_session_registry
+from .events import (
+    MESSAGE_RECEIVED,
+    MESSAGE_RECALLED,
+    GROUP_MEMBER_JOINED,
+    FRIEND_REQUEST_RECEIVED,
+    RESERVED_EVENT_TYPES,
+)
+
+__all__ = [
+    'AgentRunnerDescriptor',
+    'parse_runner_id',
+    'format_runner_id',
+    'RunnerIdParts',
+    'AgentRunnerError',
+    'RunnerNotFoundError',
+    'RunnerNotAuthorizedError',
+    'RunnerProtocolError',
+    'RunnerExecutionError',
+    'AgentRunnerRegistry',
+    'AgentRunContextBuilder',
+    'AgentResourceBuilder',
+    'AgentResultNormalizer',
+    'AgentRunOrchestrator',
+    'ConfigMigration',
+    'AgentRunSessionRegistry',
+    'AgentRunSession',
+    'get_session_registry',
+    'MESSAGE_RECEIVED',
+    'MESSAGE_RECALLED',
+    'GROUP_MEMBER_JOINED',
+    'FRIEND_REQUEST_RECEIVED',
+    'RESERVED_EVENT_TYPES',
+]
--- a/src/langbot/pkg/agent/runner/artifact_store.py
+++ b/src/langbot/pkg/agent/runner/artifact_store.py
@@ -0,0 +1,300 @@
+"""Artifact store for managing Host-owned artifacts."""
+from __future__ import annotations
+
+import json
+import datetime
+import typing
+import uuid
+import base64
+
+import sqlalchemy
+from sqlalchemy.ext.asyncio import AsyncEngine, AsyncSession
+from sqlalchemy.orm import sessionmaker
+
+from ...entity.persistence.artifact import AgentArtifact
+from ...entity.persistence.bstorage import BinaryStorage
+
+
+class ArtifactStore:
+    """Store for AgentArtifact records.
+
+    Handles artifact metadata registration and content retrieval.
+    Actual blob storage is delegated to BinaryStorage or external storage.
+
+    All methods are async and use the provided database engine.
+    """
+
+    engine: AsyncEngine
+
+    # Hard limits
+    MAX_INLINE_READ_BYTES = 1024 * 1024  # 1MB max for inline base64
+    MAX_RANGE_READ_BYTES = 10 * 1024 * 1024  # 10MB max for range reads
+
+    def __init__(self, engine: AsyncEngine):
+        self.engine = engine
+        self._session_factory = sessionmaker(
+            engine, class_=AsyncSession, expire_on_commit=False
+        )
+
+    async def register_artifact(
+        self,
+        artifact_id: str | None,
+        artifact_type: str,
+        source: str,
+        storage_key: str | None = None,
+        storage_type: str = 'binary_storage',
+        mime_type: str | None = None,
+        name: str | None = None,
+        size_bytes: int | None = None,
+        sha256: str | None = None,
+        conversation_id: str | None = None,
+        run_id: str | None = None,
+        runner_id: str | None = None,
+        bot_id: str | None = None,
+        workspace_id: str | None = None,
+        expires_at: datetime.datetime | None = None,
+        metadata: dict[str, typing.Any] | None = None,
+        content: bytes | None = None,
+    ) -> str:
+        """Register a new artifact.
+
+        If content is provided and storage_key is None, stores content
+        in BinaryStorage automatically.
+
+        Args:
+            artifact_id: Unique artifact ID (generated if None)
+            artifact_type: Type of artifact (image, file, voice, tool_result, etc.)
+            source: Source of artifact (platform, runner, tool, system)
+            storage_key: Key in BinaryStorage or external reference
+            storage_type: Storage type (binary_storage, file, url)
+            mime_type: MIME type
+            name: Original file name
+            size_bytes: Size in bytes
+            sha256: SHA256 hash
+            conversation_id: Conversation ID
+            run_id: Run ID that created this
+            runner_id: Runner ID that created this
+            bot_id: Bot UUID
+            workspace_id: Workspace ID
+            expires_at: Expiration time
+            metadata: Additional metadata
+            content: Optional content to store in BinaryStorage
+
+        Returns:
+            The artifact_id
+        """
+        if artifact_id is None:
+            artifact_id = str(uuid.uuid4())
+
+        # If content is provided without a storage_key, derive a default BinaryStorage key
+        if content is not None and storage_key is None:
+            storage_key = f"artifact:{artifact_id}"
+            storage_type = 'binary_storage'
+            if size_bytes is None:
+                size_bytes = len(content)
+
+        async with self._session_factory() as session:
+            # Store content in BinaryStorage if provided
+            if content is not None:
+                binary_storage = BinaryStorage(
+                    unique_key=f'artifact:{artifact_id}',
+                    key=storage_key,
+                    owner_type='artifact',
+                    owner='host',
+                    value=content,
+                )
+                session.add(binary_storage)
+
+            # Store artifact metadata
+            artifact = AgentArtifact(
+                artifact_id=artifact_id,
+                artifact_type=artifact_type,
+                mime_type=mime_type,
+                name=name,
+                size_bytes=size_bytes,
+                sha256=sha256,
+                source=source,
+                storage_key=storage_key,
+                storage_type=storage_type,
+                conversation_id=conversation_id,
+                run_id=run_id,
+                runner_id=runner_id,
+                bot_id=bot_id,
+                workspace_id=workspace_id,
+                created_at=datetime.datetime.utcnow(),
+                expires_at=expires_at,
+                metadata_json=json.dumps(metadata) if metadata else None,
+            )
+            session.add(artifact)
+            await session.commit()
+
+        return artifact_id
+
+    async def get_metadata(
+        self,
+        artifact_id: str,
+    ) -> dict[str, typing.Any] | None:
+        """Get artifact metadata (public fields only, no internal storage info).
+
+        Args:
+            artifact_id: Artifact ID
+
+        Returns:
+            Artifact metadata dict compatible with SDK ArtifactMetadata, or None if not found
+        """
+        async with self._session_factory() as session:
+            result = await session.execute(
+                sqlalchemy.select(AgentArtifact).where(
+                    AgentArtifact.artifact_id == artifact_id
+                )
+            )
+            row = result.scalars().first()
+            if row is None:
+                return None
+            return self._row_to_public_dict(row)
+
+    async def _get_internal_record(
+        self,
+        artifact_id: str,
+    ) -> AgentArtifact | None:
+        """Get full artifact record including internal fields.
+
+        Used internally by read_artifact to access storage_key/storage_type.
+
+        Args:
+            artifact_id: Artifact ID
+
+        Returns:
+            AgentArtifact ORM instance, or None if not found
+        """
+        async with self._session_factory() as session:
+            result = await session.execute(
+                sqlalchemy.select(AgentArtifact).where(
+                    AgentArtifact.artifact_id == artifact_id
+                )
+            )
+            return result.scalars().first()
+
+    async def read_artifact(
+        self,
+        artifact_id: str,
+        offset: int = 0,
+        limit: int | None = None,
+    ) -> dict[str, typing.Any] | None:
+        """Read artifact content.
+
+        For binary_storage artifacts, returns a content_base64 slice directly.
+        For other storage types, returns file_key for chunked transfer.
+
+        Args:
+            artifact_id: Artifact ID
+            offset: Byte offset to start reading from (must be >= 0)
+            limit: Maximum bytes to read (must be > 0 if provided)
+
+        Returns:
+            ArtifactReadResult dict, or None if not found
+
+        Raises:
+            ValueError: If offset < 0 or limit <= 0
+        """
+        # Validate offset and limit
+        if offset < 0:
+            raise ValueError("offset must be >= 0")
+
+        if limit is not None and limit <= 0:
+            raise ValueError("limit must be > 0")
+
+        # Get internal record (includes storage_key/storage_type)
+        record = await self._get_internal_record(artifact_id)
+        if record is None:
+            return None
+
+        storage_type = record.storage_type or 'binary_storage'
+        storage_key = record.storage_key
+        size_bytes = record.size_bytes or 0
+
+        # Cap limit at hard limit
+        if limit is None:
+            limit = self.MAX_INLINE_READ_BYTES
+        limit = min(limit, self.MAX_RANGE_READ_BYTES)
+
+        # For binary_storage, read content
+        if storage_type == 'binary_storage' and storage_key:
+            content = await self._read_binary_storage(storage_key)
+            if content is None:
+                return None
+
+            # Apply offset and limit
+            if offset > 0:
+                content = content[offset:]
+            if limit and len(content) > limit:
+                content = content[:limit]
+                has_more = True
+            else:
+                has_more = False
+
+            return {
+                'artifact_id': artifact_id,
+                'mime_type': record.mime_type,
+                'size_bytes': size_bytes,
+                'offset': offset,
+                'length': len(content),
+                'content_base64': base64.b64encode(content).decode('utf-8'),
+                'file_key': None,
+                'has_more': has_more,
+            }
+
+        # For other storage types, return storage reference
+        # (caller can use file_key for chunked transfer)
+        return {
+            'artifact_id': artifact_id,
+            'mime_type': record.mime_type,
+            'size_bytes': size_bytes,
+            'offset': offset,
+            'length': None,
+            'content_base64': None,
+            'file_key': storage_key,
+            'has_more': False,
+        }
+
+    async def _read_binary_storage(self, key: str) -> bytes | None:
+        """Read content from BinaryStorage.
+
+        Uses unique_key for isolation to prevent cross-artifact access.
+
+        Args:
+            key: The unique_key used when storing the artifact
+
+        Returns:
+            Content bytes, or None if not found
+        """
+        async with self._session_factory() as session:
+            result = await session.execute(
+                sqlalchemy.select(BinaryStorage).where(BinaryStorage.unique_key == key)
+            )
+            row = result.scalars().first()
+            if row is None:
+                return None
+            return row.value
+
+    def _row_to_public_dict(self, row: AgentArtifact) -> dict[str, typing.Any]:
+        """Convert an AgentArtifact row to a public dict.
+
+        Returns only fields that match SDK ArtifactMetadata entity.
+        Host-only fields (bot_id, workspace_id, storage_key, storage_type) are excluded.
+        """
+        return {
+            'artifact_id': row.artifact_id,
+            'artifact_type': row.artifact_type,
+            'mime_type': row.mime_type,
+            'name': row.name,
+            'size_bytes': row.size_bytes,
+            'sha256': row.sha256,
+            'source': row.source,
+            'conversation_id': row.conversation_id,
+            'run_id': row.run_id,
+            'runner_id': row.runner_id,
+            'created_at': int(row.created_at.timestamp()) if row.created_at else None,
+            'expires_at': int(row.expires_at.timestamp()) if row.expires_at else None,
+            'metadata': json.loads(row.metadata_json) if row.metadata_json else {},
+        }
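Reviewer note on `read_artifact` above: the offset/limit handling can be sketched in isolation as follows. This is a minimal sketch, not the store's implementation. The two byte caps are hypothetical values chosen for illustration (the real `MAX_INLINE_READ_BYTES` and `MAX_RANGE_READ_BYTES` are defined outside this hunk), and `slice_artifact` is an invented helper name.

```python
from __future__ import annotations

import base64

# Hypothetical caps for illustration; the real constants live on the store class.
MAX_INLINE_READ_BYTES = 4
MAX_RANGE_READ_BYTES = 8


def slice_artifact(content: bytes, offset: int = 0, limit: int | None = None) -> dict:
    """Apply the same validation, capping, and slicing order as read_artifact."""
    if offset < 0:
        raise ValueError('offset must be >= 0')
    if limit is not None and limit <= 0:
        raise ValueError('limit must be > 0')

    # No explicit limit means "inline read"; any limit is capped at the range maximum.
    if limit is None:
        limit = MAX_INLINE_READ_BYTES
    limit = min(limit, MAX_RANGE_READ_BYTES)

    remaining = content[offset:]
    chunk = remaining[:limit]
    return {
        'offset': offset,
        'length': len(chunk),
        'content_base64': base64.b64encode(chunk).decode('ascii'),
        'has_more': len(remaining) > limit,
    }


def read_all(content: bytes, page_size: int) -> bytes:
    """Page through the content by advancing offset until has_more is False."""
    collected = b''
    offset = 0
    while True:
        page = slice_artifact(content, offset=offset, limit=page_size)
        collected += base64.b64decode(page['content_base64'])
        offset += page['length']
        if not page['has_more']:
            return collected
```

A caller that wants the whole artifact would advance `offset` by the returned `length` until `has_more` is false, as `read_all` does. Note that the real method loads the full blob before slicing, so paging bounds the response size rather than the memory used per call.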
--- a/src/langbot/pkg/agent/runner/config_migration.py
+++ b/src/langbot/pkg/agent/runner/config_migration.py
@@ -0,0 +1,230 @@
+"""Configuration migration for agent runner IDs."""
+
+from __future__ import annotations
+
+import typing
+
+from .id import is_plugin_runner_id
+
+
+# Mapping from old built-in runner names to official plugin runner IDs
+OLD_RUNNER_TO_PLUGIN_RUNNER_ID = {
+    'local-agent': 'plugin:langbot/local-agent/default',
+    'dify-service-api': 'plugin:langbot/dify-agent/default',
+    'n8n-service-api': 'plugin:langbot/n8n-agent/default',
+    'coze-api': 'plugin:langbot/coze-agent/default',
+    'dashscope-app-api': 'plugin:langbot/dashscope-agent/default',
+    'langflow-api': 'plugin:langbot/langflow-agent/default',
+    'tbox-app-api': 'plugin:langbot/tbox-agent/default',
+}
+
+
+class ConfigMigration:
+    """Configuration migration helper for agent runner IDs.
+
+    Responsibilities:
+    - Resolve runner ID from new ai.runner.id or old ai.runner.runner
+    - Map old built-in runner names to official plugin runner IDs
+    - Extract runtime runner config from ai.runner_config
+    - Migrate old ai.<runner-name> blocks into ai.runner_config
+    """
+
+    @staticmethod
+    def resolve_runner_id(pipeline_config: dict[str, typing.Any]) -> str | None:
+        """Resolve runner ID from pipeline configuration.
+
+        Priority:
+        1. New format: ai.runner.id (plugin:* format; legacy names are mapped)
+        2. Old format: ai.runner.runner (mapped to plugin:* if built-in)
+
+        Args:
+            pipeline_config: Pipeline configuration dict
+
+        Returns:
+            Runner ID string, or None if not configured
+        """
+        ai_config = pipeline_config.get('ai', {})
+        runner_config = ai_config.get('runner', {})
+
+        # Check new format first
+        runner_id = runner_config.get('id')
+        if runner_id:
+            if is_plugin_runner_id(runner_id):
+                return runner_id
+            # If it's not a plugin ID, try to map it as old runner name
+            return OLD_RUNNER_TO_PLUGIN_RUNNER_ID.get(runner_id, runner_id)
+
+        # Check old format
+        old_runner_name = runner_config.get('runner')
+        if old_runner_name:
+            # If already plugin:* format, return directly
+            if is_plugin_runner_id(old_runner_name):
+                return old_runner_name
+            # Map old built-in runner to official plugin ID
+            mapped_id = OLD_RUNNER_TO_PLUGIN_RUNNER_ID.get(old_runner_name)
+            if mapped_id:
+                return mapped_id
+            # Return old name if no mapping exists (will error in registry)
+            return old_runner_name
+
+        return None
+
+    @staticmethod
+    def resolve_runner_config(
+        pipeline_config: dict[str, typing.Any],
+        runner_id: str,
+    ) -> dict[str, typing.Any]:
+        """Resolve runner binding configuration from pipeline configuration.
+
+        Runtime code should only read the migrated format. Legacy
+        ai.<runner-name> blocks are handled by migration helpers, not by the
+        hot path.
+
+        Args:
+            pipeline_config: Pipeline configuration dict
+            runner_id: Resolved runner ID
+
+        Returns:
+            Runner configuration dict (empty if not found)
+        """
+        ai_config = pipeline_config.get('ai', {})
+
+        # Check new format
+        runner_configs = ai_config.get('runner_config', {})
+        if runner_id in runner_configs:
+            return runner_configs[runner_id]
+
+        return {}
+
+    @staticmethod
+    def resolve_legacy_runner_config(
+        pipeline_config: dict[str, typing.Any],
+        runner_id: str,
+    ) -> dict[str, typing.Any]:
+        """Resolve old ai.<runner-name> config for migration only."""
+        ai_config = pipeline_config.get('ai', {})
+
+        # Try to find old runner name from runner_id
+        old_runner_name = None
+        for old_name, mapped_id in OLD_RUNNER_TO_PLUGIN_RUNNER_ID.items():
+            if mapped_id == runner_id:
+                old_runner_name = old_name
+                break
+
+        if old_runner_name:
+            old_config = ai_config.get(old_runner_name, {})
+            if old_config:
+                old_config = dict(old_config)
+                if runner_id == OLD_RUNNER_TO_PLUGIN_RUNNER_ID['local-agent']:
+                    old_config.pop('max-round', None)
+                return ConfigMigration.normalize_runner_config_for_migration(runner_id, old_config)
+
+        return {}
+
+    @staticmethod
+    def normalize_runner_config_for_migration(
+        runner_id: str,
+        runner_config: dict[str, typing.Any],
+    ) -> dict[str, typing.Any]:
+        """Normalize released legacy runner config before storing binding config.
+
+        Runtime code should not carry aliases. This helper is intentionally used
+        only by config migration so AgentRunner implementations can consume the
+        current manifest-defined field names.
+        """
+        normalized = dict(runner_config)
+
+        if runner_id == OLD_RUNNER_TO_PLUGIN_RUNNER_ID['local-agent']:
+            legacy_kb = normalized.pop('knowledge-base', None)
+            if 'knowledge-bases' not in normalized:
+                if isinstance(legacy_kb, str) and legacy_kb and legacy_kb not in {'__none__', '__none'}:
+                    normalized['knowledge-bases'] = [legacy_kb]
+                elif legacy_kb is not None:
+                    normalized['knowledge-bases'] = []
+
+        return normalized
+
+    @staticmethod
+    def get_old_runner_name(runner_id: str) -> str | None:
+        """Get old runner name from mapped runner ID.
+
+        Args:
+            runner_id: Plugin runner ID
+
+        Returns:
+            Old runner name if mapped, None otherwise
+        """
+        for old_name, mapped_id in OLD_RUNNER_TO_PLUGIN_RUNNER_ID.items():
+            if mapped_id == runner_id:
+                return old_name
+        return None
+
+    @staticmethod
+    def get_expire_time(pipeline_config: dict[str, typing.Any]) -> int:
+        """Get conversation expire time from configuration.
+
+        Args:
+            pipeline_config: Pipeline configuration dict
+
+        Returns:
+            Expire time in seconds (0 means no expiry)
+        """
+        ai_config = pipeline_config.get('ai', {})
+        runner_config = ai_config.get('runner', {})
+        return runner_config.get('expire-time', 0)
+
+    @staticmethod
+    def migrate_pipeline_config(pipeline_config: dict[str, typing.Any]) -> dict[str, typing.Any]:
+        """Migrate pipeline config to new format.
+
+        This converts old ai.runner.runner and ai.<runner-name> to
+        new ai.runner.id and ai.runner_config format.
+
+        Args:
+            pipeline_config: Original pipeline configuration
+
+        Returns:
+            Migrated pipeline configuration
+        """
+        import copy  # local import; deep-copy so the caller's nested dicts are not mutated
+        new_config = copy.deepcopy(pipeline_config)
+        ai_config = new_config.get('ai', {})
+        if not ai_config:
+            return new_config
+
+        runner_config = ai_config.get('runner', {})
+        runner_configs = ai_config.get('runner_config', {})
+
+        # Resolve runner ID
+        runner_id = ConfigMigration.resolve_runner_id(pipeline_config)
+        if runner_id:
+            # Set new format
+            runner_config['id'] = runner_id
+            # Remove old runner field if present
+            if 'runner' in runner_config and is_plugin_runner_id(runner_config['runner']):
+                # Already migrated plugin:* format, keep as id
+                pass
+            elif 'runner' in runner_config:
+                # Old built-in runner name, remove after migration
+                old_name = runner_config['runner']
+                if old_name in OLD_RUNNER_TO_PLUGIN_RUNNER_ID:
+                    del runner_config['runner']
+
+            # Migrate runner config
+            resolved_config = ConfigMigration.resolve_runner_config(pipeline_config, runner_id)
+            if not resolved_config:
+                resolved_config = ConfigMigration.resolve_legacy_runner_config(pipeline_config, runner_id)
+            if resolved_config:
+                resolved_config = ConfigMigration.normalize_runner_config_for_migration(runner_id, resolved_config)
+                runner_configs[runner_id] = resolved_config
+                # Remove old runner config block
+                for old_name, mapped_id in OLD_RUNNER_TO_PLUGIN_RUNNER_ID.items():
+                    if mapped_id == runner_id and old_name in ai_config:
+                        del ai_config[old_name]
+
+        # Update configs
+        ai_config['runner'] = runner_config
+        ai_config['runner_config'] = runner_configs
+        new_config['ai'] = ai_config
+
+        return new_config
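Reviewer note on `migrate_pipeline_config` above: the before/after shape of a legacy `local-agent` pipeline config can be illustrated with a simplified re-statement. This is a sketch under stated assumptions, not the module itself: `migrate_sketch` is an invented name, it handles only the legacy `ai.runner.runner` path for one runner, and the `model` field in the sample config is a hypothetical key.

```python
from __future__ import annotations

import copy
import typing

# Subset of OLD_RUNNER_TO_PLUGIN_RUNNER_ID from the diff above.
OLD_TO_PLUGIN = {'local-agent': 'plugin:langbot/local-agent/default'}
NONE_VALUES = {'', '__none__', '__none'}


def migrate_sketch(pipeline_config: dict[str, typing.Any]) -> dict[str, typing.Any]:
    """Simplified legacy-to-new migration for a single built-in runner."""
    new_config = copy.deepcopy(pipeline_config)  # leave the caller's dict untouched
    ai = new_config.get('ai') or {}
    runner = ai.get('runner') or {}
    old_name = runner.get('runner')
    if old_name not in OLD_TO_PLUGIN:
        return new_config  # nothing legacy to migrate in this sketch

    runner_id = OLD_TO_PLUGIN[old_name]
    del runner['runner']
    runner['id'] = runner_id

    legacy_block = ai.pop(old_name, None)
    if legacy_block:
        # Drop the removed windowing option and rename the single-KB field.
        legacy_block.pop('max-round', None)
        legacy_kb = legacy_block.pop('knowledge-base', None)
        if 'knowledge-bases' not in legacy_block and legacy_kb is not None:
            usable = isinstance(legacy_kb, str) and legacy_kb not in NONE_VALUES
            legacy_block['knowledge-bases'] = [legacy_kb] if usable else []
        configs = ai.setdefault('runner_config', {})
        if not configs.get(runner_id):
            configs[runner_id] = legacy_block
    return new_config


legacy = {
    'ai': {
        'runner': {'runner': 'local-agent', 'expire-time': 0},
        'local-agent': {'model': 'm-1', 'max-round': 10, 'knowledge-base': 'kb-1'},
    }
}
migrated = migrate_sketch(legacy)
```

After the call, `migrated['ai']['runner']` carries `id` instead of `runner`, the `ai.local-agent` block is gone, and its normalized contents sit under `ai.runner_config[<runner id>]`.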
--- a/src/langbot/pkg/agent/runner/config_schema.py
+++ b/src/langbot/pkg/agent/runner/config_schema.py
@@ -0,0 +1,208 @@
+"""Helpers for interpreting AgentRunner DynamicForm configuration."""
+from __future__ import annotations
+
+import typing
+
+from .descriptor import AgentRunnerDescriptor
+
+
+LLM_MODEL_SELECTOR_TYPES = {'model-fallback-selector', 'llm-model-selector'}
+KB_SELECTOR_TYPES = {'knowledge-base-multi-selector'}
+PROMPT_EDITOR_TYPES = {'prompt-editor'}
+NONE_SENTINELS = {'', '__none__', '__none'}
+
+
+def iter_schema_items(
+    descriptor: AgentRunnerDescriptor | None,
+    field_types: set[str],
+) -> typing.Iterator[dict[str, typing.Any]]:
+    """Yield descriptor config schema items whose type is in field_types."""
+    if descriptor is None:
+        return
+    for item in descriptor.config_schema or []:
+        if not isinstance(item, dict):
+            continue
+        if item.get('type') in field_types:
+            yield item
+
+
+def has_permission(
+    descriptor: AgentRunnerDescriptor | None,
+    name: str,
+    actions: set[str],
+) -> bool:
+    """Return whether a runner descriptor requests one of the given actions."""
+    if descriptor is None:
+        return False
+    configured_actions = descriptor.permissions.get(name, [])
+    return any(action in configured_actions for action in actions)
+
+
+def uses_host_models(descriptor: AgentRunnerDescriptor | None) -> bool:
+    """Return whether LangBot should resolve model resources for this runner."""
+    return (
+        has_permission(descriptor, 'models', {'invoke', 'stream', 'list'})
+        and any(True for _ in iter_schema_items(descriptor, LLM_MODEL_SELECTOR_TYPES))
+    )
+
+
+def uses_host_tools(descriptor: AgentRunnerDescriptor | None) -> bool:
+    """Return whether LangBot should expose tool resources to this runner."""
+    return (
+        descriptor is not None
+        and descriptor.supports_tool_calling()
+        and has_permission(descriptor, 'tools', {'list', 'detail', 'call'})
+    )
+
+
+def uses_host_knowledge_bases(descriptor: AgentRunnerDescriptor | None) -> bool:
+    """Return whether LangBot should expose knowledge-base resources to this runner."""
+    return (
+        descriptor is not None
+        and descriptor.supports_knowledge_retrieval()
+        and has_permission(descriptor, 'knowledge_bases', {'list', 'retrieve'})
+    )
+
+
+def extract_prompt_config(
+    descriptor: AgentRunnerDescriptor | None,
+    runner_config: dict[str, typing.Any],
+    default_prompt: list[dict[str, typing.Any]],
+) -> list[dict[str, typing.Any]]:
+    """Extract the prompt-editor value selected by the runner schema."""
+    for item in iter_schema_items(descriptor, PROMPT_EDITOR_TYPES):
+        field_name = item.get('name')
+        if field_name and field_name in runner_config:
+            configured_prompt = runner_config[field_name]
+            if isinstance(configured_prompt, list):
+                return configured_prompt
+        default_value = item.get('default')
+        if isinstance(default_value, list):
+            return default_value
+    return default_prompt
+
+
+def extract_model_selection(
+    descriptor: AgentRunnerDescriptor | None,
+    runner_config: dict[str, typing.Any],
+) -> tuple[str, list[str]]:
+    """Extract primary/fallback LLM selections from schema-defined fields."""
+    primary_uuid = ''
+    fallback_uuids: list[str] = []
+
+    for item in iter_schema_items(descriptor, LLM_MODEL_SELECTOR_TYPES):
+        field_name = item.get('name')
+        if not field_name:
+            continue
+
+        value = runner_config.get(field_name, item.get('default'))
+        if item.get('type') == 'model-fallback-selector':
+            if isinstance(value, str):
+                primary_uuid = value
+            elif isinstance(value, dict):
+                primary_uuid = value.get('primary') or ''
+                fallbacks = value.get('fallbacks', [])
+                if isinstance(fallbacks, list):
+                    fallback_uuids = [fallback for fallback in fallbacks if isinstance(fallback, str)]
+            break
+
+        if item.get('type') == 'llm-model-selector' and isinstance(value, str):
+            primary_uuid = value
+            break
+
+    return primary_uuid, fallback_uuids
+
+
+def extract_knowledge_base_uuids(
+    descriptor: AgentRunnerDescriptor | None,
+    runner_config: dict[str, typing.Any],
+) -> list[str]:
+    """Extract configured knowledge-base UUIDs from schema-defined fields."""
+    if not uses_host_knowledge_bases(descriptor):
+        return []
+
+    kb_uuids: list[str] = []
+    for item in iter_schema_items(descriptor, KB_SELECTOR_TYPES):
+        field_name = item.get('name')
+        if not field_name:
+            continue
+        value = runner_config.get(field_name, item.get('default', []))
+        if isinstance(value, list):
+            kb_uuids.extend(
+                kb_uuid for kb_uuid in value if isinstance(kb_uuid, str) and kb_uuid not in NONE_SENTINELS
+            )
+
+    return list(dict.fromkeys(kb_uuids))
+
+
+def iter_config_model_refs(
+    descriptor: AgentRunnerDescriptor,
+    runner_config: dict[str, typing.Any],
+) -> typing.Iterator[tuple[str, str]]:
+    """Yield model references declared by schema-defined model selector fields."""
+    for item in descriptor.config_schema or []:
+        if not isinstance(item, dict):
+            continue
+
+        field_name = item.get('name')
+        field_type = item.get('type')
+        if not field_name or field_name not in runner_config:
+            continue
+
+        value = runner_config.get(field_name)
+        if field_type == 'model-fallback-selector':
+            if isinstance(value, str) and value not in NONE_SENTINELS:
+                yield 'llm', value
+            elif isinstance(value, dict):
+                primary = value.get('primary')
+                if isinstance(primary, str) and primary not in NONE_SENTINELS:
+                    yield 'llm', primary
+                fallbacks = value.get('fallbacks', [])
+                if isinstance(fallbacks, list):
+                    for fallback_uuid in fallbacks:
+                        if isinstance(fallback_uuid, str) and fallback_uuid not in NONE_SENTINELS:
+                            yield 'llm', fallback_uuid
+        elif field_type == 'llm-model-selector':
+            if isinstance(value, str) and value not in NONE_SENTINELS:
+                yield 'llm', value
+        elif field_type == 'rerank-model-selector':
+            if isinstance(value, str) and value not in NONE_SENTINELS:
+                yield 'rerank', value
+
+
+def set_empty_llm_model_selection(
+    descriptor: AgentRunnerDescriptor,
+    runner_config: dict[str, typing.Any],
+    model_uuid: str,
+) -> bool:
+    """Set the first empty schema-defined LLM selector to model_uuid."""
+    for item in iter_schema_items(descriptor, LLM_MODEL_SELECTOR_TYPES):
+        field_name = item.get('name')
+        field_type = item.get('type')
+        if not field_name:
+            continue
+
+        value = runner_config.get(field_name, item.get('default'))
+        if field_type == 'model-fallback-selector':
+            if isinstance(value, dict):
+                primary = value.get('primary') or ''
+                if primary not in NONE_SENTINELS:
+                    return False
+                fallbacks = value.get('fallbacks', [])
+                runner_config[field_name] = {
+                    'primary': model_uuid,
+                    'fallbacks': fallbacks if isinstance(fallbacks, list) else [],
+                }
+                return True
+            if isinstance(value, str) and value not in NONE_SENTINELS:
+                return False
+            runner_config[field_name] = {'primary': model_uuid, 'fallbacks': []}
+            return True
+
+        if field_type == 'llm-model-selector':
+            if isinstance(value, str) and value not in NONE_SENTINELS:
+                return False
+            runner_config[field_name] = model_uuid
+            return True
+
+    return False
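Reviewer note on the selector helpers above: a `model-fallback-selector` value may be a bare UUID string or a `{'primary', 'fallbacks'}` dict, while an `llm-model-selector` value is a bare string. The sketch below mirrors the branch order of `extract_model_selection` on a plain list of schema dicts, so the accepted shapes can be tried without an `AgentRunnerDescriptor`. `pick_models`, the field name `model`, and the UUID strings are illustrative assumptions.

```python
from __future__ import annotations

import typing

LLM_SELECTOR_TYPES = {'model-fallback-selector', 'llm-model-selector'}


def pick_models(
    schema: list[typing.Any],
    runner_config: dict[str, typing.Any],
) -> tuple[str, list[str]]:
    """Return (primary, fallbacks) from the first usable LLM selector field."""
    for item in schema:
        if not isinstance(item, dict) or item.get('type') not in LLM_SELECTOR_TYPES:
            continue
        name = item.get('name')
        if not name:
            continue

        value = runner_config.get(name, item.get('default'))
        if item['type'] == 'model-fallback-selector':
            # The first fallback selector always ends the search, even if empty.
            if isinstance(value, str):
                return value, []
            if isinstance(value, dict):
                fallbacks = value.get('fallbacks', [])
                if not isinstance(fallbacks, list):
                    fallbacks = []
                return value.get('primary') or '', [f for f in fallbacks if isinstance(f, str)]
            return '', []

        # llm-model-selector: only a string value ends the search.
        if isinstance(value, str):
            return value, []
    return '', []


schema = [{'name': 'model', 'type': 'model-fallback-selector'}]
as_string = pick_models(schema, {'model': 'uuid-a'})
as_dict = pick_models(schema, {'model': {'primary': 'uuid-a', 'fallbacks': ['uuid-b', 42]}})
```

Like the function it mirrors, this sketch does not filter `__none__` sentinels from the primary selection, whereas `iter_config_model_refs` does. Callers may need to handle that difference.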
--- a/src/langbot/pkg/agent/runner/context_builder.py
+++ b/src/langbot/pkg/agent/runner/context_builder.py
@@ -0,0 +1,427 @@
+"""Agent run context builder for provisioning AgentRunContext envelopes."""
+
+from __future__ import annotations
+
+import uuid
+import time
+import typing
+
+from ...core import app
+from .descriptor import AgentRunnerDescriptor
+from .persistent_state_store import get_persistent_state_store
+from .host_models import AgentEventEnvelope, AgentBinding
+
+
+DEFAULT_RUNNER_TIMEOUT_SECONDS = 300
+
+
+# Internal models for the agent runner context protocol.
+
+
+class AgentTrigger(typing.TypedDict):
+    """Agent trigger information."""
+
+    type: str
+    source: str  # 'pipeline' or 'event_router'
+    timestamp: int | None
+
+
+class ConversationContext(typing.TypedDict):
+    """Conversation context."""
+
+    conversation_id: str | None
+    thread_id: str | None
+    launcher_type: str | None
+    launcher_id: str | None
+    sender_id: str | None
+    bot_id: str | None
+    workspace_id: str | None
+    session_id: str | None
+    pipeline_uuid: str | None
+
+
+class AgentInput(typing.TypedDict):
+    """Agent input."""
+
+    text: str | None
+    contents: list[dict[str, typing.Any]]
+    message_chain: dict[str, typing.Any] | None
+    attachments: list[dict[str, typing.Any]]
+
+
+class AgentRunState(typing.TypedDict):
+    """Agent run state with 4 scopes."""
+
+    conversation: dict[str, typing.Any]
+    actor: dict[str, typing.Any]
+    subject: dict[str, typing.Any]
+    runner: dict[str, typing.Any]
+
+
+# Resource payload models matching langbot-plugin-sdk/resources.py.
+
+
+class ModelResource(typing.TypedDict):
+    """Model resource payload."""
+
+    model_id: str
+    model_type: str | None
+    provider: str | None
+
+
+class ToolResource(typing.TypedDict):
+    """Tool resource payload."""
+
+    tool_name: str
+    tool_type: str | None
+    description: str | None
+
+
+class KnowledgeBaseResource(typing.TypedDict):
+    """Knowledge base resource payload."""
+
+    kb_id: str
+    kb_name: str | None
+    kb_type: str | None
+
+
+class FileResource(typing.TypedDict):
+    """File resource payload."""
+
+    file_id: str
+    file_name: str | None
+    mime_type: str | None
+    source: str | None
+
+
+class StorageResource(typing.TypedDict):
+    """Storage resource payload."""
+
+    plugin_storage: bool
+    workspace_storage: bool
+
+
+class AgentResources(typing.TypedDict):
+    """Agent resources payload."""
+
+    models: list[ModelResource]
+    tools: list[ToolResource]
+    knowledge_bases: list[KnowledgeBaseResource]
+    files: list[FileResource]
+    storage: StorageResource
+    platform_capabilities: dict[str, typing.Any]
+
+
+class AgentRuntimeContext(typing.TypedDict):
+    """Agent runtime context."""
+
+    langbot_version: str | None
+    sdk_protocol_version: str
+    query_id: int | None
+    trace_id: str | None
+    deadline_at: float | None
+    metadata: dict[str, typing.Any]
+
+
+class AgentRunContextPayload(typing.TypedDict):
+    """AgentRunContext payload passed to an agent runner.
+
+    Protocol v1 structure - matches SDK AgentRunContext.
+
+    Note: The 'config' field contains the binding config from ai.runner_config[runner_id],
+    which is Pipeline's configuration for this specific runner binding (not plugin instance config).
+    """
+
+    run_id: str
+    trigger: AgentTrigger
+    conversation: ConversationContext | None
+    event: dict[str, typing.Any]  # REQUIRED for Protocol v1
+    actor: dict[str, typing.Any] | None
+    subject: dict[str, typing.Any] | None
+    input: AgentInput
+    delivery: dict[str, typing.Any]  # REQUIRED for Protocol v1
+    resources: AgentResources
+    context: dict[str, typing.Any]  # ContextAccess - REQUIRED for Protocol v1
+    state: AgentRunState
+    runtime: AgentRuntimeContext
+    config: dict[str, typing.Any]  # Binding config from ai.runner_config[runner_id]
+    bootstrap: dict[str, typing.Any] | None  # Optional bootstrap context
+    adapter: dict[str, typing.Any] | None  # Pipeline adapter context
+    metadata: dict[str, typing.Any]  # Additional metadata
+
+
+class AgentRunContextBuilder:
+    """Builder for provisioning AgentRunContext.
+
+    Responsibilities:
+    - Generate new run_id (UUID, not query id)
+    - Set trigger type based on event source
+    - Build conversation context from event
+    - Build input from event
+    - Build state snapshot from PersistentStateStore
+    - Build runtime context with host info, trace_id, deadline
+    - Set config from runner binding configuration
+
+    Pipeline Query adaptation belongs to PipelineAdapter, not this builder.
+    """
+
+    ap: app.Application
+
+    def __init__(self, ap: app.Application):
+        self.ap = ap
+
+    async def build_context_from_event(
+        self,
+        event: AgentEventEnvelope,
+        binding: AgentBinding,
+        descriptor: AgentRunnerDescriptor,
+        resources: AgentResources,
+    ) -> AgentRunContextPayload:
+        """Build AgentRunContext from event-first envelope.
+
+        This is the main entry point for Protocol v1.
+        Does NOT inline full history by default.
+
+        Args:
+            event: Event envelope
+            binding: Agent binding configuration
+            descriptor: Runner descriptor
+            resources: Built resources
+
+        Returns:
+            AgentRunContextPayload for the runner
+        """
+        # Generate new run_id
+        run_id = str(uuid.uuid4())
+
+        # Build trigger from event
+        trigger: AgentTrigger = {
+            'type': event.event_type,
+            'source': event.source,
+            'timestamp': event.event_time or int(time.time()),
+        }
+
+        # Build conversation context from event
+        conversation: ConversationContext | None = None
+        if event.conversation_id:
+            conversation = {
+                'session_id': None,  # Pipeline adapter field
+                'conversation_id': event.conversation_id,
+                'thread_id': event.thread_id,
+                'launcher_type': None,  # Will be filled from actor/subject if needed
+                'launcher_id': None,
+                'sender_id': event.actor.actor_id if event.actor else None,
+                'bot_id': event.bot_id,
+                'workspace_id': event.workspace_id,
+                'pipeline_uuid': binding.pipeline_uuid,  # Pipeline adapter field
+            }
+
+        # Build event context (Protocol v1 event-first)
+        event_context = {
+            'event_id': event.event_id,
+            'event_type': event.event_type,
+            'event_time': event.event_time,
+            'source': event.source,
+            'source_event_type': event.source_event_type,
+            'raw_ref': event.raw_ref.model_dump(mode='json') if event.raw_ref else None,
+            'data': event.data,
+        }
+
+        # Build actor context
+        actor_context = None
+        if event.actor:
+            actor_context = {
+                'actor_type': event.actor.actor_type,
+                'actor_id': event.actor.actor_id,
+                'actor_name': event.actor.actor_name,
+            }
+
+        # Build subject context
+        subject_context = None
+        if event.subject:
+            subject_context = {
+                'subject_type': event.subject.subject_type,
+                'subject_id': event.subject.subject_id,
+                'data': event.subject.data,
+            }
+
+        # Build input from event
+        input: AgentInput = {
+            'text': event.input.text,
+            'contents': [c.model_dump(mode='json') if hasattr(c, 'model_dump') else c for c in event.input.contents],
+            'message_chain': event.input.message_chain,
+            'attachments': [
+                a.model_dump(mode='json') if hasattr(a, 'model_dump') else a for a in event.input.attachments
+            ],
+        }
+
+        # Build context access (no history inlined by default for Protocol v1)
+        # Populate with actual values from stores
+        context_access = await self._build_context_access(event, descriptor, binding)
+
+        # Build state snapshot from persistent state store (event-first Protocol v1)
+        persistent_state_store = get_persistent_state_store(self.ap.persistence_mgr.get_db_engine())
+        state: AgentRunState = await persistent_state_store.build_snapshot_from_event(event, binding, descriptor)
+
+        # Build runtime context
+        runtime: AgentRuntimeContext = {
+            'langbot_version': self.ap.ver_mgr.get_current_version(),
+            'sdk_protocol_version': descriptor.protocol_version,
+            'query_id': None,  # No query_id in event-first mode
+            'trace_id': run_id,
+            'deadline_at': self._build_deadline_from_binding(binding),
+            'metadata': {
+                'bot_id': event.bot_id,
+                'workspace_id': event.workspace_id,
+                'streaming_supported': event.delivery.supports_streaming,
+                'model_context_window_tokens': None,
+                # TODO(model-info): populate model_context_window_tokens after
+                # LiteLLM/model metadata lands. Runners fall back to their
+                # binding config until Host can provide the real window.
+            },
+        }
+
+        # Build delivery context
+        delivery_context = {
+            'surface': event.delivery.surface,
+            'reply_target': event.delivery.reply_target,
+            'supports_streaming': event.delivery.supports_streaming,
+            'supports_edit': event.delivery.supports_edit,
+            'supports_reaction': event.delivery.supports_reaction,
+            'max_message_size': event.delivery.max_message_size,
+            'platform_capabilities': event.delivery.platform_capabilities,
+        }
+
+        # Build adapter context (no query_id or extras in event-first mode)
+        adapter_context = {
+            'query_id': None,
+            'pipeline_uuid': binding.pipeline_uuid,
+            'extra': {},
+        }
+
+        # Build full context - Protocol v1 structure
+        context: AgentRunContextPayload = {
+            'run_id': run_id,
+            'trigger': trigger,
+            'conversation': conversation,
+            'event': event_context,  # REQUIRED
+            'actor': actor_context,
+            'subject': subject_context,
+            'input': input,
+            'delivery': delivery_context,  # REQUIRED
+            'resources': resources,
+            'context': context_access,  # ContextAccess - REQUIRED
+            'state': state,
+            'runtime': runtime,
+            'config': binding.runner_config,
+            'bootstrap': None,
+            'adapter': adapter_context,
+            'metadata': {},  # Additional metadata
+        }
+
+        return context
+
+    def _build_deadline_from_binding(self, binding: AgentBinding) -> float | None:
+        """Build deadline timestamp from binding timeout config.
+
+        Args:
+            binding: Agent binding with runner_config
+
+        Returns:
+            Deadline as a Unix timestamp in seconds, or None if the timeout is unset, invalid, or non-positive
+        """
+        timeout = binding.runner_config.get('timeout', DEFAULT_RUNNER_TIMEOUT_SECONDS)
+        if timeout is None:
+            return None
+
+        try:
+            timeout_seconds = float(timeout)
+        except (TypeError, ValueError):
+            return None
+
+        if timeout_seconds <= 0:
+            return None
+
+        return time.time() + timeout_seconds
+
+    async def _build_context_access(
+        self,
+        event: AgentEventEnvelope,
+        descriptor: AgentRunnerDescriptor,
+        binding: AgentBinding | None = None,
+    ) -> dict[str, typing.Any]:
+        """Build ContextAccess with actual values from stores.
+
+        Args:
+            event: Event envelope
+            descriptor: Runner descriptor
+            binding: Agent binding supplying state_policy; when None, the state API is reported as disabled
+
+        Returns:
+            ContextAccess dict
+        """
+        conversation_id = event.conversation_id
+
+        # Check which history, event, and artifact APIs are available to this
+        # runner, based on the permissions declared in its descriptor
+        permissions = descriptor.permissions or {}
+        history_permissions = permissions.get('history', [])
+        event_permissions = permissions.get('events', [])
+        artifact_permissions = permissions.get('artifacts', [])
+
+        history_page_enabled = 'page' in history_permissions and conversation_id is not None
+        history_search_enabled = 'search' in history_permissions and conversation_id is not None
+        event_get_enabled = 'get' in event_permissions
+        event_page_enabled = 'page' in event_permissions and conversation_id is not None
+        artifact_metadata_enabled = 'metadata' in artifact_permissions
+        artifact_read_enabled = 'read' in artifact_permissions
+
+        # Determine state API availability based on binding state_policy.
+        state_enabled = False
+        if binding is not None:
+            state_policy = binding.state_policy
+            if state_policy.enable_state and state_policy.state_scopes:
+                state_enabled = True
+
+        # Get latest cursor and has_history_before if conversation exists
+        latest_cursor = None
+        has_history_before = False
+
+        if conversation_id:
+            try:
+                from .transcript_store import TranscriptStore
+
+                store = TranscriptStore(self.ap.persistence_mgr.get_db_engine())
+
+                latest_cursor = await store.get_latest_cursor(conversation_id)
+                if latest_cursor:
+                    has_history_before = True
+            except Exception as e:
+                self.ap.logger.warning(f'Failed to get transcript cursor: {e}')
+
+        return {
+            'conversation_id': conversation_id,
+            'thread_id': event.thread_id,
+            'latest_cursor': latest_cursor,
+            'event_seq': None,  # Will be populated when EventLog is written
+            'transcript_seq': int(latest_cursor) if latest_cursor else None,
+            'has_history_before': has_history_before,
+            'inline_policy': {
+                'mode': 'current_event',
+                'delivered_count': 0,
+                'source_total_count': None,
+                'messages_complete': False,
+                'reason': 'self_managed_context',
+            },
+            'available_apis': {
+                'history_page': history_page_enabled,
+                'history_search': history_search_enabled,
+                'event_get': event_get_enabled,
+                'event_page': event_page_enabled,
+                'artifact_metadata': artifact_metadata_enabled,
+                'artifact_read': artifact_read_enabled,
+                'state': state_enabled,
+                'storage': True,
+                'prompt_get': False,
+            },
+        }
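Review note: the permission-to-flag mapping in `_build_context_access` can be exercised in isolation. The following is a minimal sketch, not the Host implementation; the helper name `available_apis` is hypothetical, and it covers only the history, event, and artifact flags (the state, storage, and prompt flags are left out).

```python
from __future__ import annotations


def available_apis(
    permissions: dict[str, list[str]] | None,
    conversation_id: str | None,
) -> dict[str, bool]:
    """Derive API availability flags from declared runner permissions.

    Hypothetical helper mirroring the rules in the diff: paging and search
    need a conversation, while single-item lookups do not.
    """
    perms = permissions or {}
    history = perms.get('history', [])
    events = perms.get('events', [])
    artifacts = perms.get('artifacts', [])
    has_conversation = conversation_id is not None

    return {
        'history_page': 'page' in history and has_conversation,
        'history_search': 'search' in history and has_conversation,
        'event_get': 'get' in events,
        'event_page': 'page' in events and has_conversation,
        'artifact_metadata': 'metadata' in artifacts,
        'artifact_read': 'read' in artifacts,
    }
```

A runner that declares `history: ['page']` therefore still sees `history_page` as False when the event carries no conversation ID.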
--- a/src/langbot/pkg/agent/runner/descriptor.py
+++ b/src/langbot/pkg/agent/runner/descriptor.py
@@ -0,0 +1,72 @@
+"""Agent runner descriptor."""
+from __future__ import annotations
+
+import typing
+import pydantic
+
+
+class AgentRunnerDescriptor(pydantic.BaseModel):
+    """Descriptor for an agent runner.
+
+    Represents the discovered metadata for a runner, including
+    its identity, capabilities, permissions, and configuration schema.
+    """
+
+    id: str
+    """Unique runner ID: plugin:author/plugin_name/runner_name"""
+
+    source: typing.Literal['plugin']
+    """Runner source type"""
+
+    label: dict[str, str]
+    """Display labels keyed by locale (e.g., en_US, zh_Hans)"""
+
+    description: dict[str, str] | None = None
+    """Optional description keyed by locale"""
+
+    plugin_author: str
+    """Plugin author from manifest"""
+
+    plugin_name: str
+    """Plugin name from manifest"""
+
+    runner_name: str
+    """AgentRunner component name from manifest"""
+
+    plugin_version: str | None = None
+    """Optional plugin version"""
+
+    protocol_version: str = '1'
+    """SDK protocol version, default '1'"""
+
+    config_schema: list[dict[str, typing.Any]] = []
+    """Configuration schema using DynamicForm format"""
+
+    capabilities: dict[str, bool] = {}
+    """Runner capabilities: streaming, tool_calling, knowledge_retrieval, etc."""
+
+    permissions: dict[str, list[str]] = {}
+    """Requested permissions: models, tools, knowledge_bases, storage, files, platform_api, history, events, artifacts"""
+
+    raw_manifest: dict[str, typing.Any] = {}
+    """Original manifest for reference"""
+
+    model_config = pydantic.ConfigDict(
+        extra='allow',
+    )
+
+    def get_plugin_id(self) -> str:
+        """Return plugin identifier as author/name."""
+        return f'{self.plugin_author}/{self.plugin_name}'
+
+    def supports_streaming(self) -> bool:
+        """Check if runner supports streaming output."""
+        return self.capabilities.get('streaming', False)
+
+    def supports_tool_calling(self) -> bool:
+        """Check if runner supports tool calling."""
+        return self.capabilities.get('tool_calling', False)
+
+    def supports_knowledge_retrieval(self) -> bool:
+        """Check if runner supports knowledge retrieval."""
+        return self.capabilities.get('knowledge_retrieval', False)
--- a/src/langbot/pkg/agent/runner/errors.py
+++ b/src/langbot/pkg/agent/runner/errors.py
@@ -0,0 +1,37 @@
+"""Agent runner errors."""
+from __future__ import annotations
+
+
+class AgentRunnerError(Exception):
+    """Base error for agent runner operations."""
+    pass
+
+
+class RunnerNotFoundError(AgentRunnerError):
+    """Runner not found in registry."""
+    def __init__(self, runner_id: str):
+        self.runner_id = runner_id
+        super().__init__(f'Agent runner not found: {runner_id}')
+
+
+class RunnerNotAuthorizedError(AgentRunnerError):
+    """Runner not authorized for this pipeline."""
+    def __init__(self, runner_id: str, bound_plugins: list[str] | None):
+        self.runner_id = runner_id
+        self.bound_plugins = bound_plugins
+        super().__init__(f'Agent runner {runner_id} not authorized for bound_plugins={bound_plugins}')
+
+
+class RunnerProtocolError(AgentRunnerError):
+    """Runner protocol version mismatch or invalid manifest."""
+    def __init__(self, runner_id: str, message: str):
+        self.runner_id = runner_id
+        super().__init__(f'Agent runner protocol error for {runner_id}: {message}')
+
+
+class RunnerExecutionError(AgentRunnerError):
+    """Runner execution failed."""
+    def __init__(self, runner_id: str, message: str, retryable: bool = False):
+        self.runner_id = runner_id
+        self.retryable = retryable
+        super().__init__(f'Agent runner {runner_id} execution failed: {message}')
--- a/src/langbot/pkg/agent/runner/event_log_store.py
+++ b/src/langbot/pkg/agent/runner/event_log_store.py
@@ -0,0 +1,255 @@
+"""EventLog store for writing and querying event records."""
+from __future__ import annotations
+
+import json
+import datetime
+import typing
+import uuid
+
+import sqlalchemy
+from sqlalchemy.ext.asyncio import AsyncEngine, AsyncSession
+from sqlalchemy.orm import sessionmaker
+
+from ...entity.persistence.event_log import EventLog
+
+
+class EventLogStore:
+    """Store for EventLog records.
+
+    Handles writing events to the event log and querying them.
+    All methods are async and use the provided database engine.
+    """
+
+    engine: AsyncEngine
+
+    # Hard limits
+    MAX_INPUT_SUMMARY_LENGTH = 1000
+
+    def __init__(self, engine: AsyncEngine):
+        self.engine = engine
+        self._session_factory = sessionmaker(
+            engine, class_=AsyncSession, expire_on_commit=False
+        )
+
+    async def append_event(
+        self,
+        event_id: str | None,
+        event_type: str,
+        source: str,
+        bot_id: str | None = None,
+        workspace_id: str | None = None,
+        conversation_id: str | None = None,
+        thread_id: str | None = None,
+        actor_type: str | None = None,
+        actor_id: str | None = None,
+        actor_name: str | None = None,
+        subject_type: str | None = None,
+        subject_id: str | None = None,
+        input_summary: str | None = None,
+        input_json: dict[str, typing.Any] | None = None,
+        raw_ref: str | None = None,
+        run_id: str | None = None,
+        runner_id: str | None = None,
+        event_time: datetime.datetime | None = None,
+        metadata: dict[str, typing.Any] | None = None,
+    ) -> str:
+        """Append an event to the event log.
+
+        Args:
+            event_id: Unique event ID (generated if None)
+            event_type: Event type
+            source: Event source
+            bot_id: Bot UUID
+            workspace_id: Workspace ID
+            conversation_id: Conversation ID
+            thread_id: Thread ID
+            actor_type: Actor type
+            actor_id: Actor ID
+            actor_name: Actor display name
+            subject_type: Subject type
+            subject_id: Subject ID
+            input_summary: Brief input summary
+            input_json: Full input JSON
+            raw_ref: Reference to raw event payload
+            run_id: Run ID processing this event
+            runner_id: Runner ID processing this event
+            event_time: When the event occurred
+            metadata: Additional metadata
+
+        Returns:
+            The event_id
+        """
+        if event_id is None:
+            event_id = str(uuid.uuid4())
+
+        # Truncate input summary if too long
+        if input_summary and len(input_summary) > self.MAX_INPUT_SUMMARY_LENGTH:
+            input_summary = input_summary[:self.MAX_INPUT_SUMMARY_LENGTH - 3] + "..."
+
+        async with self._session_factory() as session:
+            event = EventLog(
+                event_id=event_id,
+                event_type=event_type,
+                event_time=event_time,
+                source=source,
+                bot_id=bot_id,
+                workspace_id=workspace_id,
+                conversation_id=conversation_id,
+                thread_id=thread_id,
+                actor_type=actor_type,
+                actor_id=actor_id,
+                actor_name=actor_name,
+                subject_type=subject_type,
+                subject_id=subject_id,
+                input_summary=input_summary,
+                input_json=json.dumps(input_json) if input_json else None,
+                raw_ref=raw_ref,
+                run_id=run_id,
+                runner_id=runner_id,
+                metadata_json=json.dumps(metadata) if metadata else None,
+                created_at=datetime.datetime.now(datetime.timezone.utc).replace(tzinfo=None),
+            )
+            session.add(event)
+            await session.commit()
+
+        return event_id
+
+    async def get_event(
+        self,
+        event_id: str,
+    ) -> dict[str, typing.Any] | None:
+        """Get a single event by ID.
+
+        Args:
+            event_id: Event ID
+
+        Returns:
+            Event record as dict, or None if not found
+        """
+        async with self._session_factory() as session:
+            result = await session.execute(
+                sqlalchemy.select(EventLog).where(EventLog.event_id == event_id)
+            )
+            row = result.scalars().first()
+            if row is None:
+                return None
+            return self._row_to_dict(row)
+
+    async def page_events(
+        self,
+        conversation_id: str | None = None,
+        event_types: list[str] | None = None,
+        before_seq: int | None = None,
+        limit: int = 50,
+    ) -> tuple[list[dict[str, typing.Any]], int | None, bool]:
+        """Page through event records.
+
+        Args:
+            conversation_id: Filter by conversation ID
+            event_types: Filter by event types
+            before_seq: Get events before this sequence number
+            limit: Maximum items to return (clamped to 1-100)
+
+        Returns:
+            Tuple of (items, next_seq, has_more)
+        """
+        limit = max(1, min(limit, 100))  # Clamp: zero or negative limits would break the limit + 1 probe
+
+        async with self._session_factory() as session:
+            query = sqlalchemy.select(EventLog)
+
+            if conversation_id is not None:
+                query = query.where(EventLog.conversation_id == conversation_id)
+
+            if event_types:
+                query = query.where(EventLog.event_type.in_(event_types))
+
+            if before_seq is not None:
+                query = query.where(EventLog.id < before_seq)
+
+            query = query.order_by(EventLog.id.desc()).limit(limit + 1)
+
+            result = await session.execute(query)
+            rows = result.scalars().all()
+
+            items = [self._row_to_dict(row) for row in rows[:limit]]
+            has_more = len(rows) > limit
+            next_seq = items[-1]['id'] if items and has_more else None
+
+            return items, next_seq, has_more
+
+    async def get_latest_cursor(
+        self,
+        conversation_id: str,
+    ) -> str | None:
+        """Get the latest cursor for a conversation.
+
+        Args:
+            conversation_id: Conversation ID
+
+        Returns:
+            Cursor string (seq number), or None if no events
+        """
+        async with self._session_factory() as session:
+            result = await session.execute(
+                sqlalchemy.select(EventLog.id)
+                .where(EventLog.conversation_id == conversation_id)
+                .order_by(EventLog.id.desc())
+                .limit(1)
+            )
+            row = result.scalars().first()
+            if row is None:
+                return None
+            return str(row)
+
+    async def has_events_before(
+        self,
+        conversation_id: str,
+        seq: int,
+    ) -> bool:
+        """Check if there are events before a sequence number.
+
+        Args:
+            conversation_id: Conversation ID
+            seq: Sequence number
+
+        Returns:
+            True if there are events before
+        """
+        async with self._session_factory() as session:
+            result = await session.execute(
+                sqlalchemy.select(sqlalchemy.func.count())
+                .select_from(EventLog)
+                .where(
+                    EventLog.conversation_id == conversation_id,
+                    EventLog.id < seq,
+                )
+            )
+            count = result.scalar()
+            return count > 0
+
+    def _row_to_dict(self, row: EventLog) -> dict[str, typing.Any]:
+        """Convert an EventLog row to dict."""
+        return {
+            'id': row.id,
+            'event_id': row.event_id,
+            'event_type': row.event_type,
+            'event_time': int(row.event_time.timestamp()) if row.event_time else None,
+            'source': row.source,
+            'bot_id': row.bot_id,
+            'workspace_id': row.workspace_id,
+            'conversation_id': row.conversation_id,
+            'thread_id': row.thread_id,
+            'actor_type': row.actor_type,
+            'actor_id': row.actor_id,
+            'actor_name': row.actor_name,
+            'subject_type': row.subject_type,
+            'subject_id': row.subject_id,
+            'input_summary': row.input_summary,
+            'input_json': json.loads(row.input_json) if row.input_json else None,
+            'raw_ref': row.raw_ref,
+            'run_id': row.run_id,
+            'runner_id': row.runner_id,
+            'created_at': int(row.created_at.timestamp()) if row.created_at else None,
+            'metadata': json.loads(row.metadata_json) if row.metadata_json else {},
+        }
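Review note: `page_events` over-fetches one row (`limit + 1`) to detect a further page without a separate COUNT query. The following is a minimal in-memory sketch of that technique, assuming integer sequence IDs; the function name `page_before` is hypothetical and the lower bound of 1 on `limit` is an assumption of this sketch.

```python
from __future__ import annotations


def page_before(
    ids: list[int],
    before_seq: int | None = None,
    limit: int = 50,
) -> tuple[list[int], int | None, bool]:
    """Return one page of IDs in descending order using a limit + 1 probe."""
    limit = max(1, min(limit, 100))  # assumption: clamp to [1, 100]

    # Equivalent of WHERE id < before_seq ORDER BY id DESC
    candidates = sorted(
        (i for i in ids if before_seq is None or i < before_seq),
        reverse=True,
    )

    # Fetch one extra row; its presence means another page exists.
    rows = candidates[: limit + 1]
    items = rows[:limit]
    has_more = len(rows) > limit

    # The last returned ID becomes the cursor for the next call.
    next_seq = items[-1] if items and has_more else None
    return items, next_seq, has_more
```

A caller pages backwards by passing each returned `next_seq` as the next `before_seq` until `has_more` is False.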
--- a/src/langbot/pkg/agent/runner/events.py
+++ b/src/langbot/pkg/agent/runner/events.py
@@ -0,0 +1,25 @@
+"""Canonical AgentRunner event names reserved for future EBA integration."""
+from __future__ import annotations
+
+
+MESSAGE_RECEIVED = 'message.received'
+"""A normal message entered the current Pipeline."""
+
+MESSAGE_RECALLED = 'message.recalled'
+"""A platform message was recalled or deleted."""
+
+GROUP_MEMBER_JOINED = 'group.member_joined'
+"""A new member joined a group/channel conversation."""
+
+FRIEND_REQUEST_RECEIVED = 'friend.request_received'
+"""A new friend/contact request was received."""
+
+
+RESERVED_EVENT_TYPES = frozenset(
+    {
+        MESSAGE_RECEIVED,
+        MESSAGE_RECALLED,
+        GROUP_MEMBER_JOINED,
+        FRIEND_REQUEST_RECEIVED,
+    }
+)
--- a/src/langbot/pkg/agent/runner/host_models.py
+++ b/src/langbot/pkg/agent/runner/host_models.py
@@ -0,0 +1,172 @@
+"""Agent event envelope and binding models for LangBot Host.
+
+These are Host-internal models, not exposed to the SDK.
+"""
+from __future__ import annotations
+
+import typing
+import pydantic
+
+from langbot_plugin.api.entities.builtin.agent_runner.event import (
+    ActorContext,
+    SubjectContext,
+    RawEventRef,
+)
+from langbot_plugin.api.entities.builtin.agent_runner.input import AgentInput
+from langbot_plugin.api.entities.builtin.agent_runner.delivery import DeliveryContext
+
+
+class AgentEventEnvelope(pydantic.BaseModel):
+    """Event envelope for LangBot Host event gateway.
+
+    This is the unified input model that replaces the Query-first approach.
+    IM / WebUI / API / EventRouter all produce this envelope.
+    """
+
+    event_id: str
+    """Unique event identifier."""
+
+    event_type: str
+    """Event type (message.received, message.recalled, etc.)."""
+
+    event_time: int | None = None
+    """Event timestamp (epoch seconds)."""
+
+    source: str
+    """Event source (platform, webui, api, scheduler, system)."""
+
+    source_event_type: str | None = None
+    """Original source event type, when available."""
+
+    bot_id: str | None = None
+    """Bot UUID handling this event."""
+
+    workspace_id: str | None = None
+    """Workspace ID (for multi-tenant)."""
+
+    conversation_id: str | None = None
+    """Conversation ID."""
+
+    thread_id: str | None = None
+    """Thread ID (for platforms supporting threads)."""
+
+    actor: ActorContext | None = None
+    """Actor (who triggered the event)."""
+
+    subject: SubjectContext | None = None
+    """Subject (what the event is about)."""
+
+    input: AgentInput
+    """Event input."""
+
+    delivery: DeliveryContext
+    """Delivery context."""
+
+    raw_ref: RawEventRef | None = None
+    """Reference to raw event payload."""
+
+    data: dict[str, typing.Any] = pydantic.Field(default_factory=dict)
+    """Small structured event payload. Large payloads should be referenced via raw_ref/artifacts."""
+
+
+# Binding scope types
+class BindingScope(pydantic.BaseModel):
+    """Scope for agent binding."""
+
+    scope_type: typing.Literal["bot", "pipeline", "workspace", "global"] = "pipeline"
+    """Scope type."""
+
+    scope_id: str | None = None
+    """Scope identifier (bot_uuid, pipeline_uuid, etc.)."""
+
+
+class ResourcePolicy(pydantic.BaseModel):
+    """Resource policy for agent binding.
+
+    Controls what resources the runner can access.
+    """
+
+    allowed_model_uuids: list[str] | None = None
+    """Additional model UUID grants. None means no additional model grants."""
+
+    allowed_tool_names: list[str] | None = None
+    """Additional tool name grants. None means no additional tool grants."""
+
+    allowed_kb_uuids: list[str] | None = None
+    """Additional knowledge base UUID grants. None means no additional KB grants."""
+
+    allow_plugin_storage: bool = True
+    """Whether plugin storage is allowed."""
+
+    allow_workspace_storage: bool = False
+    """Whether workspace storage is allowed."""
+
+
+class StatePolicy(pydantic.BaseModel):
+    """State policy for agent binding.
+
+    Controls state management behavior.
+    """
+
+    enable_state: bool = True
+    """Whether host-owned state is enabled."""
+
+    state_scopes: list[typing.Literal["conversation", "actor", "subject", "runner"]] = (
+        pydantic.Field(default_factory=lambda: ["conversation", "actor"])
+    )
+    """Enabled state scopes."""
+
+
+class DeliveryPolicy(pydantic.BaseModel):
+    """Delivery policy for agent binding.
+
+    Controls how results are delivered.
+    """
+
+    enable_streaming: bool = True
+    """Whether streaming output is enabled."""
+
+    enable_reply: bool = True
+    """Whether reply is enabled."""
+
+    max_message_size: int | None = None
+    """Maximum message size."""
+
+
+class AgentBinding(pydantic.BaseModel):
+    """Binding configuration for mapping events to runners.
+
+    This is a Host-internal model for event-to-runner binding.
+    It replaces the old Pipeline runner config role.
+    """
+
+    binding_id: str
+    """Unique binding identifier."""
+
+    scope: BindingScope = pydantic.Field(default_factory=BindingScope)
+    """Binding scope."""
+
+    event_types: list[str] = pydantic.Field(default_factory=lambda: ["message.received"])
+    """Event types this binding handles."""
+
+    runner_id: str
+    """Runner ID to invoke."""
+
+    runner_config: dict[str, typing.Any] = pydantic.Field(default_factory=dict)
+    """Runner binding configuration."""
+
+    resource_policy: ResourcePolicy = pydantic.Field(default_factory=ResourcePolicy)
+    """Resource policy."""
+
+    state_policy: StatePolicy = pydantic.Field(default_factory=StatePolicy)
+    """State policy."""
+
+    delivery_policy: DeliveryPolicy = pydantic.Field(default_factory=DeliveryPolicy)
+    """Delivery policy."""
+
+    enabled: bool = True
+    """Whether binding is enabled."""
+
+    # Fields for Pipeline adapter
+    pipeline_uuid: str | None = None
+    """Pipeline UUID (for Pipeline adapter)."""
--- a/src/langbot/pkg/agent/runner/id.py
+++ b/src/langbot/pkg/agent/runner/id.py
@@ -0,0 +1,91 @@
+"""Agent runner ID parsing and formatting."""
+from __future__ import annotations
+
+import dataclasses
+
+
+@dataclasses.dataclass(frozen=True)
+class RunnerIdParts:
+    """Parsed runner ID components."""
+    source: str  # 'plugin' (future: 'builtin')
+    plugin_author: str
+    plugin_name: str
+    runner_name: str
+
+    def to_plugin_id(self) -> str:
+        """Return plugin identifier as author/name."""
+        return f'{self.plugin_author}/{self.plugin_name}'
+
+
+def parse_runner_id(runner_id: str) -> RunnerIdParts:
+    """Parse runner ID string into components.
+
+    Args:
+        runner_id: Runner ID in format 'plugin:author/plugin_name/runner_name'
+
+    Returns:
+        RunnerIdParts with parsed components
+
+    Raises:
+        ValueError: If runner_id format is invalid
+    """
+    if runner_id.startswith('plugin:'):
+        parts = runner_id.removeprefix('plugin:').split('/')
+        if len(parts) != 3:
+            raise ValueError(
+                f'Invalid plugin runner ID format: {runner_id}. '
+                f'Expected: plugin:author/plugin_name/runner_name'
+            )
+        plugin_author, plugin_name, runner_name = parts
+        if not plugin_author or not plugin_name or not runner_name:
+            raise ValueError(
+                f'Invalid plugin runner ID: {runner_id}. '
+                f'author, plugin_name, and runner_name must be non-empty'
+            )
+        return RunnerIdParts(
+            source='plugin',
+            plugin_author=plugin_author,
+            plugin_name=plugin_name,
+            runner_name=runner_name,
+        )
+    else:
+        # Only plugin runner IDs are valid at the protocol boundary.
+        raise ValueError(
+            f'Invalid runner ID format: {runner_id}. '
+            f'Expected: plugin:author/plugin_name/runner_name'
+        )
+
+
+def format_runner_id(
+    source: str,
+    plugin_author: str,
+    plugin_name: str,
+    runner_name: str,
+) -> str:
+    """Format runner ID from components.
+
+    Args:
+        source: Runner source ('plugin')
+        plugin_author: Plugin author
+        plugin_name: Plugin name
+        runner_name: Runner component name
+
+    Returns:
+        Runner ID string
+    """
+    if source == 'plugin':
+        return f'plugin:{plugin_author}/{plugin_name}/{runner_name}'
+    else:
+        raise ValueError(f'Invalid runner source: {source}')
+
+
+def is_plugin_runner_id(runner_id: str) -> bool:
+    """Check if runner ID is a plugin runner.
+
+    Args:
+        runner_id: Runner ID string
+
+    Returns:
+        True if runner ID starts with 'plugin:'
+    """
+    return runner_id.startswith('plugin:')
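Review note: the parse/format pair in `id.py` is meant to round-trip. The following is a condensed, self-contained sketch of that contract rather than the module itself; `PLUGIN_PREFIX` is a hypothetical constant (the diff inlines the literal), the `source` field is omitted, and `format_runner_id` here takes the parsed parts instead of four separate arguments.

```python
from __future__ import annotations

import dataclasses

PLUGIN_PREFIX = 'plugin:'  # hypothetical constant for illustration


@dataclasses.dataclass(frozen=True)
class RunnerIdParts:
    plugin_author: str
    plugin_name: str
    runner_name: str


def parse_runner_id(runner_id: str) -> RunnerIdParts:
    """Split 'plugin:author/plugin_name/runner_name' into its components."""
    if not runner_id.startswith(PLUGIN_PREFIX):
        raise ValueError(f'Invalid runner ID format: {runner_id}')
    parts = runner_id[len(PLUGIN_PREFIX):].split('/')
    # Exactly three non-empty segments are required.
    if len(parts) != 3 or not all(parts):
        raise ValueError(f'Invalid plugin runner ID: {runner_id}')
    return RunnerIdParts(*parts)


def format_runner_id(parts: RunnerIdParts) -> str:
    """Inverse of parse_runner_id."""
    return f'{PLUGIN_PREFIX}{parts.plugin_author}/{parts.plugin_name}/{parts.runner_name}'
```

Because the dataclass is frozen, parsed IDs are hashable and can serve as dictionary keys, for example in a registry lookup.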
--- a/src/langbot/pkg/agent/runner/orchestrator.py
+++ b/src/langbot/pkg/agent/runner/orchestrator.py
@@ -0,0 +1,884 @@
+"""Agent run orchestrator for coordinating runner execution."""
+from __future__ import annotations
+
+import typing
+import traceback
+import asyncio
+import time
+
+from langbot_plugin.api.entities.builtin.provider import message as provider_message
+from langbot_plugin.api.entities.builtin.pipeline import query as pipeline_query
+from langbot_plugin.entities.io.errors import ActionCallTimeoutError
+
+from ...core import app
+from .descriptor import AgentRunnerDescriptor
+from .registry import AgentRunnerRegistry
+from .context_builder import AgentRunContextBuilder, AgentRunContextPayload
+from .resource_builder import AgentResourceBuilder
+from .result_normalizer import AgentResultNormalizer
+from .persistent_state_store import get_persistent_state_store, PersistentStateStore
+from .session_registry import get_session_registry, AgentRunSessionRegistry
+from .config_migration import ConfigMigration
+from .host_models import AgentEventEnvelope, AgentBinding
+from .pipeline_adapter import PipelineAdapter
+from .state_scope import build_state_context
+from .errors import (
+    RunnerNotFoundError,
+    RunnerExecutionError,
+    RunnerProtocolError,
+)
+
+
+# Maximum inline artifact content size (1MB)
+MAX_ARTIFACT_INLINE_BYTES = 1 * 1024 * 1024
+
+
+class AgentRunOrchestrator:
+    """Orchestrator for agent runner execution.
+
+    Responsibilities:
+    - Resolve runner ID from pipeline config (new or old format)
+    - Get runner descriptor from registry
+    - Provision AgentRunContext envelope from Query
+    - Build AgentResources with permission filtering
+    - Invoke plugin runtime RUN_AGENT action
+    - Normalize AgentRunResult to Pipeline messages
+    - Handle errors, timeouts, protocol errors
+    - Maintain streaming card behavior
+
+    Entry points:
+    - run(event, binding): Main entry for event-first Protocol v1
+    - run_from_query(query): Pipeline adapter wrapper
+    """
+
+    ap: app.Application
+
+    registry: AgentRunnerRegistry
+
+    context_builder: AgentRunContextBuilder
+
+    resource_builder: AgentResourceBuilder
+
+    result_normalizer: AgentResultNormalizer
+
+    # Cached singleton references (set in __init__)
+    _session_registry: AgentRunSessionRegistry
+    _persistent_state_store: PersistentStateStore | None
+
+    def __init__(
+        self,
+        ap: app.Application,
+        registry: AgentRunnerRegistry,
+    ):
+        self.ap = ap
+        self.registry = registry
+        self.context_builder = AgentRunContextBuilder(ap)
+        self.resource_builder = AgentResourceBuilder(ap)
+        self.result_normalizer = AgentResultNormalizer(ap)
+        # Cache singleton references to avoid per-request getter calls
+        self._session_registry = get_session_registry()
+        self._persistent_state_store = None  # Lazy init on first use
+
+    async def run(
+        self,
+        event: AgentEventEnvelope,
+        binding: AgentBinding,
+        bound_plugins: list[str] | None = None,
+        adapter_context: dict[str, typing.Any] | None = None,
+    ) -> typing.AsyncGenerator[provider_message.Message | provider_message.MessageChunk, None]:
+        """Run agent runner from event-first envelope.
+
+        This is the main entry point for Protocol v1.
+        Event Gateway -> AgentBindingResolver -> run(event, binding).
+
+        Args:
+            event: Event envelope from event gateway
+            binding: Agent binding configuration
+            bound_plugins: Optional list of bound plugin identities for authorization
+            adapter_context: Optional adapter context from Pipeline adapter
+
+        Yields:
+            Message or MessageChunk for pipeline response
+
+        Raises:
+            RunnerNotFoundError: If runner not found
+            RunnerNotAuthorizedError: If runner not authorized
+            RunnerExecutionError: If runner execution failed
+        """
+        runner_id = binding.runner_id
+
+        # Get runner descriptor
+        descriptor = await self.registry.get(runner_id, bound_plugins)
+
+        # Build resources from binding
+        resources = await self.resource_builder.build_resources_from_binding(
+            event=event,
+            binding=binding,
+            descriptor=descriptor,
+        )
+
+        # Build context from event + binding
+        context = await self.context_builder.build_context_from_event(
+            event=event,
+            binding=binding,
+            descriptor=descriptor,
+            resources=resources,
+        )
+
+        # Merge adapter context if provided (for Pipeline adapter)
+        if adapter_context:
+            # Merge params into adapter.extra
+            if 'params' in adapter_context:
+                context['adapter']['extra']['params'] = adapter_context['params']
+            # Merge prompt into adapter.extra for Pipeline adapter consumers.
+            if 'prompt' in adapter_context:
+                context['adapter']['extra']['prompt'] = adapter_context['prompt']
+            # Set query_id if provided
+            if adapter_context.get('query_id'):
+                context['runtime']['query_id'] = adapter_context['query_id']
+
+        # Build state context for State API handlers
+        state_context = build_state_context(event, binding, descriptor)
+
+        # Register session for proxy action permission validation
+        run_id = context['run_id']
+        query_id = context['runtime'].get('query_id')  # May be None for pure event-first mode
+        await self._session_registry.register(
+            run_id=run_id,
+            runner_id=descriptor.id,
+            query_id=query_id,
+            plugin_identity=descriptor.get_plugin_id(),
+            resources=resources,
+            permissions=descriptor.permissions or {},
+            conversation_id=event.conversation_id,
+            state_policy={
+                'enable_state': binding.state_policy.enable_state,
+                'state_scopes': list(binding.state_policy.state_scopes),
+            },
+            state_context=state_context,
+        )
+
+        try:
+            # Write incoming event to EventLog (inside try so a failure still unregisters the session)
+            event_log_id = await self._write_event_log(
+                event=event,
+                binding=binding,
+                run_id=run_id,
+                runner_id=descriptor.id,
+            )
+
+            # Register incoming attachments so input/transcript artifact_refs are resolvable.
+            await self._register_input_artifacts(
+                event=event,
+                run_id=run_id,
+                runner_id=descriptor.id,
+            )
+
+            # Write the user message to Transcript for message.received events in a conversation
+            if event.event_type == 'message.received' and event.conversation_id:
+                await self._write_user_transcript(
+                    event=event,
+                    event_log_id=event_log_id,
+                )
+
+            # Track artifact refs for assistant transcript (cleared after each message.completed)
+            pending_artifact_refs: list[dict[str, typing.Any]] = []
+
+            # Run via plugin connector
+            async for result_dict in self._invoke_runner(descriptor, context):
+                # Handle artifact.created first: consumed here, never yielded to the pipeline
+                if result_dict.get('type') == 'artifact.created':
+                    artifact_ref = await self._handle_artifact_created(
+                        result_dict=result_dict,
+                        event=event,
+                        run_id=run_id,
+                        runner_id=descriptor.id,
+                    )
+                    pending_artifact_refs.append(artifact_ref)
+                    # Pass to normalizer for logging, but don't yield to pipeline
+                    await self.result_normalizer.normalize(result_dict, descriptor)
+                    continue
+
+                # Handle state.updated next: persisted here, never yielded to the pipeline
+                if result_dict.get('type') == 'state.updated':
+                    await self._handle_state_updated_event(result_dict, event, binding, descriptor)
+                    # Pass to normalizer for logging, but don't yield to pipeline
+                    await self.result_normalizer.normalize(result_dict, descriptor)
+                    continue
+
+                # Handle message.completed - write to Transcript
+                if result_dict.get('type') == 'message.completed' and event.conversation_id:
+                    # Merge pending artifact refs with message's own refs
+                    merged_refs = self._merge_artifact_refs(
+                        pending_artifact_refs,
+                        result_dict,
+                    )
+                    # Clear pending refs after attaching to this message
+                    pending_artifact_refs.clear()
+
+                    await self._write_assistant_transcript(
+                        result_dict=result_dict,
+                        event=event,
+                        run_id=run_id,
+                        runner_id=descriptor.id,
+                        artifact_refs=merged_refs if merged_refs else None,
+                    )
+
+                # Normalize all remaining result types (including message.completed) and yield them
+                result = await self.result_normalizer.normalize(result_dict, descriptor)
+                if result is not None:
+                    yield result
+        finally:
+            # Unregister session after run completes (success or error)
+            await self._session_registry.unregister(run_id)
+
+    async def run_from_query(
+        self,
+        query: pipeline_query.Query,
+    ) -> typing.AsyncGenerator[provider_message.Message | provider_message.MessageChunk, None]:
+        """Run agent runner from pipeline query.
+
+        This is the Pipeline adapter wrapper for the Query-based flow.
+        It delegates to the event-first run(event, binding) method.
+
+        For the new event-first Protocol v1, use run(event, binding) instead.
+
+        Args:
+            query: Pipeline query with pipeline_config, session, messages, etc.
+
+        Yields:
+            Message or MessageChunk for pipeline response
+
+        Raises:
+            RunnerNotFoundError: If runner not found
+            RunnerNotAuthorizedError: If runner not authorized
+            RunnerExecutionError: If runner execution failed
+        """
+        # Resolve runner ID using ConfigMigration
+        runner_id = ConfigMigration.resolve_runner_id(query.pipeline_config)
+        if not runner_id:
+            raise RunnerNotFoundError('no runner configured')
+
+        # Convert Query to event-first envelope
+        event = PipelineAdapter.query_to_event(query)
+
+        # Convert Pipeline config to binding
+        binding = PipelineAdapter.pipeline_config_to_binding(query, runner_id)
+
+        # Extract bound plugins for authorization
+        bound_plugins = query.variables.get('_pipeline_bound_plugins')
+
+        # Build adapter context for Pipeline-specific fields
+        adapter_context = PipelineAdapter.build_adapter_context(query, binding)
+
+        # Delegate to event-first run()
+        async for result in self.run(
+            event,
+            binding,
+            bound_plugins=bound_plugins,
+            adapter_context=adapter_context,
+        ):
+            yield result
+
+    async def _invoke_runner(
+        self,
+        descriptor: AgentRunnerDescriptor,
+        context: AgentRunContextPayload,
+    ) -> typing.AsyncGenerator[dict[str, typing.Any], None]:
+        """Invoke runner via plugin connector.
+
+        Args:
+            descriptor: Runner descriptor
+            context: AgentRunContext dict
+
+        Yields:
+            Raw result dicts from plugin runtime
+
+        Raises:
+            RunnerExecutionError: If plugin system disabled or runtime error
+        """
+        if not self.ap.plugin_connector.is_enable_plugin:
+            raise RunnerExecutionError(
+                descriptor.id,
+                'Plugin system is disabled',
+                retryable=False,
+            )
+
+        try:
+            gen = self.ap.plugin_connector.run_agent(
+                plugin_author=descriptor.plugin_author,
+                plugin_name=descriptor.plugin_name,
+                runner_name=descriptor.runner_name,
+                context=context,
+            )
+
+            while True:
+                try:
+                    result_dict = await self._next_with_deadline(gen, descriptor, context)
+                except StopAsyncIteration:
+                    break
+                yield result_dict
+
+        except asyncio.TimeoutError as e:
+            raise RunnerExecutionError(
+                descriptor.id,
+                'Runner timed out (code: runner.timeout)',
+                retryable=True,
+            ) from e
+        except ActionCallTimeoutError as e:
+            raise RunnerExecutionError(
+                descriptor.id,
+                f'{e} (code: runner.timeout)',
+                retryable=True,
+            ) from e
+        except RunnerExecutionError:
+            raise
+        except Exception as e:
+            # Wrap unexpected errors
+            self.ap.logger.error(
+                f'Runner {descriptor.id} unexpected error: {traceback.format_exc()}'
+            )
+            raise RunnerExecutionError(
+                descriptor.id,
+                str(e),
+                retryable=False,
+            ) from e
+
+    async def _next_with_deadline(
+        self,
+        gen: typing.AsyncGenerator[dict[str, typing.Any], None],
+        descriptor: AgentRunnerDescriptor,
+        context: AgentRunContextPayload,
+    ) -> dict[str, typing.Any]:
+        """Read the next runner result while enforcing the run deadline."""
+        remaining = self._remaining_deadline_seconds(context)
+        if remaining is not None and remaining <= 0:
+            await self._close_generator(gen, descriptor)
+            raise asyncio.TimeoutError
+
+        try:
+            if remaining is None:
+                return await anext(gen)
+            return await asyncio.wait_for(anext(gen), timeout=remaining)
+        except StopAsyncIteration:
+            if self._is_deadline_exhausted(context):
+                raise asyncio.TimeoutError
+            raise
+        except asyncio.TimeoutError:
+            await self._close_generator(gen, descriptor)
+            raise
+
+    def _remaining_deadline_seconds(
+        self,
+        context: AgentRunContextPayload,
+    ) -> float | None:
+        runtime = context.get('runtime') or {}
+        deadline_at = runtime.get('deadline_at')
+        if deadline_at is None:
+            return None
+        try:
+            return float(deadline_at) - time.time()
+        except (TypeError, ValueError):
+            return None
+
+    def _is_deadline_exhausted(self, context: AgentRunContextPayload) -> bool:
+        remaining = self._remaining_deadline_seconds(context)
+        return remaining is not None and remaining <= 0
+
+    async def _close_generator(
+        self,
+        gen: typing.AsyncGenerator[dict[str, typing.Any], None],
+        descriptor: AgentRunnerDescriptor,
+    ) -> None:
+        try:
+            await gen.aclose()
+        except Exception as e:
+            self.ap.logger.warning(f'Failed to close timed-out runner {descriptor.id}: {e}')
+
+    def resolve_runner_id_for_telemetry(self, query: pipeline_query.Query) -> str | None:
+        """Resolve runner ID for telemetry/logging without full execution.
+
+        Args:
+            query: Pipeline query
+
+        Returns:
+            Runner ID string, or None
+        """
+        return ConfigMigration.resolve_runner_id(query.pipeline_config)
+
+    async def _handle_state_updated_event(
+        self,
+        result_dict: dict[str, typing.Any],
+        event: AgentEventEnvelope,
+        binding: AgentBinding,
+        descriptor: AgentRunnerDescriptor,
+    ) -> None:
+        """Handle state.updated result in event-first mode.
+
+        Persists state to database via PersistentStateStore.
+
+        Args:
+            result_dict: Raw result dict with type='state.updated'
+            event: Event envelope
+            binding: Agent binding configuration
+            descriptor: Runner descriptor
+        """
+        data = result_dict.get('data', {})
+
+        scope = data.get('scope')
+        if not scope:
+            raise RunnerProtocolError(
+                descriptor.id,
+                'state.updated missing required field: scope',
+            )
+
+        # Extract key and value
+        key = data.get('key')
+        value = data.get('value')
+
+        if not key:
+            raise RunnerProtocolError(
+                descriptor.id,
+                'state.updated missing required field: key',
+            )
+
+        # Lazy init persistent state store
+        if self._persistent_state_store is None:
+            self._persistent_state_store = get_persistent_state_store(
+                self.ap.persistence_mgr.get_db_engine()
+            )
+
+        # Apply update to persistent state store
+        success, error = await self._persistent_state_store.apply_update_from_event(
+            event=event,
+            binding=binding,
+            descriptor=descriptor,
+            scope=scope,
+            key=key,
+            value=value,
+            logger=self.ap.logger,
+        )
+
+        if success:
+            self.ap.logger.debug(
+                f'Runner {descriptor.id} state.updated (event mode): scope={scope}, key={key}'
+            )
+        elif error:
+            self.ap.logger.warning(
+                f'Runner {descriptor.id} state.updated rejected: {error}'
+            )
+
+    async def _write_event_log(
+        self,
+        event: AgentEventEnvelope,
+        binding: AgentBinding,
+        run_id: str,
+        runner_id: str,
+    ) -> str:
+        """Write incoming event to EventLog.
+
+        Args:
+            event: Event envelope
+            binding: Agent binding
+            run_id: Run ID
+            runner_id: Runner ID
+
+        Returns:
+            Event log ID
+        """
+        import datetime
+
+        from .event_log_store import EventLogStore
+        store = EventLogStore(self.ap.persistence_mgr.get_db_engine())
+
+        # Build input summary
+        input_summary = None
+        input_json = None
+        if event.input:
+            if event.input.text:
+                input_summary = event.input.text[:1000]
+            input_json = {
+                'text': event.input.text,
+                'contents': [c.model_dump(mode='json') if hasattr(c, 'model_dump') else c for c in event.input.contents],
+                'attachments': [a.model_dump(mode='json') if hasattr(a, 'model_dump') else a for a in event.input.attachments],
+            }
+
+        return await store.append_event(
+            event_id=event.event_id,
+            event_type=event.event_type,
+            source=event.source,
+            bot_id=event.bot_id,
+            workspace_id=event.workspace_id,
+            conversation_id=event.conversation_id,
+            thread_id=event.thread_id,
+            actor_type=event.actor.actor_type if event.actor else None,
+            actor_id=event.actor.actor_id if event.actor else None,
+            actor_name=event.actor.actor_name if event.actor else None,
+            subject_type=event.subject.subject_type if event.subject else None,
+            subject_id=event.subject.subject_id if event.subject else None,
+            input_summary=input_summary,
+            input_json=input_json,
+            run_id=run_id,
+            runner_id=runner_id,
+            event_time=datetime.datetime.fromtimestamp(event.event_time) if event.event_time else None,
+        )
+
+    async def _register_input_artifacts(
+        self,
+        event: AgentEventEnvelope,
+        run_id: str,
+        runner_id: str,
+    ) -> None:
+        """Register current-event attachments referenced by AgentInput."""
+        if not event.input or not event.input.attachments:
+            return
+
+        from .artifact_store import ArtifactStore
+        store = ArtifactStore(self.ap.persistence_mgr.get_db_engine())
+
+        for attachment in event.input.attachments:
+            data = attachment.model_dump(mode='json') if hasattr(attachment, 'model_dump') else attachment
+            if not isinstance(data, dict):
+                continue
+
+            artifact_id = data.get('artifact_id')
+            artifact_type = data.get('artifact_type') or 'file'
+            if not artifact_id:
+                continue
+
+            content, parsed_mime_type = self._decode_attachment_content(data.get('content'))
+            url = data.get('url')
+            platform_ref_id = data.get('id')
+            storage_key = None
+            storage_type = 'metadata_only'
+            if content is None:
+                if url:
+                    storage_key = url
+                    storage_type = 'url'
+                elif platform_ref_id:
+                    storage_key = platform_ref_id
+                    storage_type = 'platform_ref'
+
+            metadata = {
+                'input_attachment': True,
+                'input_source': data.get('source') or 'platform',
+            }
+            if url:
+                metadata['url'] = url
+            if platform_ref_id:
+                metadata['platform_ref_id'] = platform_ref_id
+
+            try:
+                await store.register_artifact(
+                    artifact_id=artifact_id,
+                    artifact_type=artifact_type,
+                    source='platform',
+                    storage_key=storage_key,
+                    storage_type=storage_type,
+                    mime_type=data.get('mime_type') or parsed_mime_type,
+                    name=data.get('name'),
+                    size_bytes=data.get('size') or (len(content) if content is not None else None),
+                    conversation_id=event.conversation_id,
+                    run_id=run_id,
+                    runner_id=runner_id,
+                    bot_id=event.bot_id,
+                    workspace_id=event.workspace_id,
+                    metadata=metadata,
+                    content=content,
+                )
+            except Exception as e:
+                self.ap.logger.warning(
+                    f'Failed to register input artifact {artifact_id}: {e}'
+                )
+
+    def _decode_attachment_content(
+        self,
+        content: typing.Any,
+    ) -> tuple[bytes | None, str | None]:
+        """Decode base64 attachment content, including data URLs."""
+        if not isinstance(content, str) or not content:
+            return None, None
+
+        import base64
+        import binascii
+
+        mime_type = None
+        payload = content
+        if content.startswith('data:') and ',' in content:
+            header, payload = content.split(',', 1)
+            if ';base64' in header:
+                mime_type = header[5:].split(';', 1)[0] or None
+
+        try:
+            return base64.b64decode(payload, validate=False), mime_type
+        except (binascii.Error, ValueError):
+            return None, mime_type
+
+    async def _write_user_transcript(
+        self,
+        event: AgentEventEnvelope,
+        event_log_id: str,
+    ) -> None:
+        """Write user message to Transcript.
+
+        Args:
+            event: Event envelope
+            event_log_id: Event log ID
+        """
+        from .transcript_store import TranscriptStore
+        store = TranscriptStore(self.ap.persistence_mgr.get_db_engine())
+
+        # Build content
+        content = event.input.text if event.input else None
+        content_json = None
+        if event.input:
+            content_json = {
+                'role': 'user',
+                'content': [c.model_dump(mode='json') if hasattr(c, 'model_dump') else c for c in event.input.contents] if event.input.contents else [],
+            }
+
+        # Build artifact refs
+        artifact_refs = []
+        if event.input and event.input.attachments:
+            for a in event.input.attachments:
+                artifact_refs.append(a.model_dump(mode='json') if hasattr(a, 'model_dump') else a)
+
+        await store.append_transcript(
+            transcript_id=None,  # Auto-generate
+            event_id=event_log_id,
+            conversation_id=event.conversation_id,
+            role='user',
+            content=content,
+            content_json=content_json,
+            artifact_refs=artifact_refs if artifact_refs else None,
+            thread_id=event.thread_id,
+            item_type='message',
+            metadata={
+                'actor_type': event.actor.actor_type if event.actor else None,
+                'actor_id': event.actor.actor_id if event.actor else None,
+            },
+        )
+
+    async def _handle_artifact_created(
+        self,
+        result_dict: dict[str, typing.Any],
+        event: AgentEventEnvelope,
+        run_id: str,
+        runner_id: str,
+    ) -> dict[str, typing.Any]:
+        """Handle artifact.created result - register artifact and write EventLog.
+
+        Args:
+            result_dict: Raw result dict with type='artifact.created'
+            event: Event envelope
+            run_id: Current run ID
+            runner_id: Runner ID
+
+        Returns:
+            Artifact reference dict for Transcript
+
+        Raises:
+            RunnerProtocolError: On validation failures or registration errors
+        """
+        import base64
+        import uuid
+
+        from .artifact_store import ArtifactStore
+        from .event_log_store import EventLogStore
+
+        data = result_dict.get('data', {})
+
+        # Validate run_id matches current context
+        result_run_id = result_dict.get('run_id')
+        if result_run_id and result_run_id != run_id:
+            raise RunnerProtocolError(
+                runner_id,
+                f'artifact.created run_id mismatch: expected {run_id}, got {result_run_id}',
+            )
+
+        # Extract artifact fields
+        artifact_id = data.get('artifact_id') or str(uuid.uuid4())
+        artifact_type = data.get('artifact_type')
+        if not artifact_type:
+            raise RunnerProtocolError(
+                runner_id,
+                'artifact.created missing required field: artifact_type',
+            )
+
+        mime_type = data.get('mime_type')
+        name = data.get('name')
+        size_bytes = data.get('size_bytes')
+        sha256 = data.get('sha256')
+        metadata = data.get('metadata')
+        content_base64 = data.get('content_base64')
+
+        # Decode and validate content if provided
+        content: bytes | None = None
+        if content_base64:
+            try:
+                content = base64.b64decode(content_base64, validate=True)
+            except Exception as e:
+                raise RunnerProtocolError(
+                    runner_id,
+                    f'artifact.created invalid base64 content: {e}',
+                ) from e
+
+            # Validate content size
+            if len(content) > MAX_ARTIFACT_INLINE_BYTES:
+                raise RunnerProtocolError(
+                    runner_id,
+                    f'artifact.created content size {len(content)} bytes exceeds limit {MAX_ARTIFACT_INLINE_BYTES} bytes',
+                )
+
+        # Register artifact via ArtifactStore
+        artifact_store = ArtifactStore(self.ap.persistence_mgr.get_db_engine())
+        try:
+            registered_id = await artifact_store.register_artifact(
+                artifact_id=artifact_id,
+                artifact_type=artifact_type,
+                source='runner',
+                mime_type=mime_type,
+                name=name,
+                size_bytes=size_bytes,
+                sha256=sha256,
+                conversation_id=event.conversation_id,
+                run_id=run_id,
+                runner_id=runner_id,
+                bot_id=event.bot_id,
+                workspace_id=event.workspace_id,
+                metadata=metadata,
+                content=content,
+            )
+        except Exception as e:
+            raise RunnerProtocolError(
+                runner_id,
+                f'artifact.created failed to register artifact: {e}',
+            ) from e
+
+        # Write to EventLog
+        event_log_store = EventLogStore(self.ap.persistence_mgr.get_db_engine())
+        await event_log_store.append_event(
+            event_id=str(uuid.uuid4()),
+            event_type='artifact.created',
+            source='runner',
+            bot_id=event.bot_id,
+            workspace_id=event.workspace_id,
+            conversation_id=event.conversation_id,
+            thread_id=event.thread_id,
+            actor_type=event.actor.actor_type if event.actor else None,
+            actor_id=event.actor.actor_id if event.actor else None,
+            actor_name=event.actor.actor_name if event.actor else None,
+            input_summary=f'Artifact created: {artifact_type}',
+            input_json={
+                'artifact_id': registered_id,
+                'artifact_type': artifact_type,
+                'mime_type': mime_type,
+                'name': name,
+                'size_bytes': size_bytes,
+            },
+            run_id=run_id,
+            runner_id=runner_id,
+        )
+
+        # Return artifact ref for Transcript
+        return {
+            'artifact_id': registered_id,
+            'artifact_type': artifact_type,
+            'mime_type': mime_type,
+            'name': name,
+        }
+
+    def _merge_artifact_refs(
+        self,
+        pending_refs: list[dict[str, typing.Any]],
+        result_dict: dict[str, typing.Any],
+    ) -> list[dict[str, typing.Any]]:
+        """Merge pending artifact refs with message's own refs, deduplicating by artifact_id.
+
+        Args:
+            pending_refs: Artifact refs accumulated from artifact.created events
+            result_dict: Result dict that may contain message with artifact_refs
+
+        Returns:
+            Merged and deduplicated list of artifact refs
+        """
+        # Start with pending refs
+        merged = list(pending_refs)
+        seen_ids = {ref.get('artifact_id') for ref in pending_refs if ref.get('artifact_id')}
+
+        # Extract refs from message data if present
+        data = result_dict.get('data') or {}
+        message = data.get('message') or {}
+        message_refs = message.get('artifact_refs') or []
+
+        if isinstance(message_refs, list):
+            for ref in message_refs:
+                if isinstance(ref, dict):
+                    artifact_id = ref.get('artifact_id')
+                    if artifact_id and artifact_id not in seen_ids:
+                        merged.append(ref)
+                        seen_ids.add(artifact_id)
+
+        return merged
+
+    async def _write_assistant_transcript(
+        self,
+        result_dict: dict[str, typing.Any],
+        event: AgentEventEnvelope,
+        run_id: str,
+        runner_id: str,
+        artifact_refs: list[dict[str, typing.Any]] | None = None,
+    ) -> None:
+        """Write assistant message to Transcript.
+
+        Args:
+            result_dict: Result dict from runner
+            event: Original event envelope
+            run_id: Run ID
+            runner_id: Runner ID
+            artifact_refs: Optional artifact references to include
+        """
+        import uuid
+
+        from .transcript_store import TranscriptStore
+        store = TranscriptStore(self.ap.persistence_mgr.get_db_engine())
+
+        data = result_dict.get('data') or {}
+        message = data.get('message') or {}
+
+        # Build content
+        content = None
+        content_json = None
+
+        if isinstance(message.get('content'), str):
+            content = message['content']
+            content_json = message
+        elif isinstance(message.get('content'), list):
+            # Extract text from content list
+            text_parts = []
+            for c in message['content']:
+                if isinstance(c, dict) and c.get('type') == 'text':
+                    text_parts.append(c.get('text', ''))
+            content = ' '.join(text_parts) if text_parts else None
+            content_json = message
+
+        # Generate a unique event ID for assistant message
+        assistant_event_id = str(uuid.uuid4())
+
+        await store.append_transcript(
+            transcript_id=str(uuid.uuid4()),
+            event_id=assistant_event_id,
+            conversation_id=event.conversation_id,
+            role='assistant',
+            content=content,
+            content_json=content_json,
+            artifact_refs=artifact_refs,
+            thread_id=event.thread_id,
+            item_type='message',
+            run_id=run_id,
+            runner_id=runner_id,
+            metadata={
+                'run_id': run_id,
+                'runner_id': runner_id,
+            },
+        )
--- a/src/langbot/pkg/agent/runner/persistent_state_store.py
+++ b/src/langbot/pkg/agent/runner/persistent_state_store.py
@@ -0,0 +1,431 @@
+"""Persistent state store for AgentRunner protocol state.
+
+This module provides a database-backed state store for event-first Protocol v1.
+"""
+from __future__ import annotations
+
+import typing
+import json
+import threading
+from datetime import datetime
+
+import sqlalchemy
+from sqlalchemy.ext.asyncio import AsyncEngine
+from sqlalchemy import select, delete, update
+
+from .descriptor import AgentRunnerDescriptor
+from .host_models import AgentEventEnvelope, AgentBinding
+from .state_scope import (
+    VALID_STATE_SCOPES,
+    build_state_scope_key,
+    get_binding_identity,
+    normalize_state_key,
+)
+from ...entity.persistence.agent_runner_state import AgentRunnerState
+
+
+# Maximum value_json size (256 KiB, measured on the UTF-8 encoded JSON)
+MAX_VALUE_JSON_BYTES = 256 * 1024
+
+
+class PersistentStateStore:
+    """Database-backed state store for AgentRunner protocol state.
+
+    IMPORTANT: This is HOST-OWNED protocol state, NOT plugin instance state.
+
+    This store provides:
+    1. Persistent storage across runs via database
+    2. Scope isolation by runner_id + binding_identity + scope
+    3. Policy enforcement (enable_state, state_scopes)
+    4. JSON value validation and size limits
+
+    Used by:
+    - Event-first Protocol v1 (async methods)
+    - State API handlers (get/set/delete/list)
+    """
+
+    def __init__(self, db_engine: AsyncEngine):
+        self._db_engine = db_engine
+
+    def _get_scope_key(
+        self,
+        scope: str,
+        event: AgentEventEnvelope,
+        binding: AgentBinding,
+        descriptor: AgentRunnerDescriptor,
+    ) -> str | None:
+        """Get scope key for given scope."""
+        return build_state_scope_key(scope, event, binding, descriptor)
+
+    def _check_scope_enabled(self, scope: str, binding: AgentBinding) -> bool:
+        """Check if scope is enabled by binding's state_policy."""
+        state_policy = binding.state_policy
+        if not state_policy.enable_state:
+            return False
+        return scope in state_policy.state_scopes
+
+    def _validate_json_value(
+        self,
+        value: typing.Any,
+        logger: typing.Any = None,
+    ) -> tuple[str | None, str | None]:
+        """Validate and serialize value to JSON.
+
+        Returns:
+            Tuple of (json_string, error_message). If error_message is not None,
+            json_string will be None.
+        """
+        try:
+            json_str = json.dumps(value, ensure_ascii=False)
+        except (TypeError, ValueError) as e:
+            return None, f'Value is not JSON-serializable: {e}'
+
+        # Check size limit
+        json_bytes = len(json_str.encode('utf-8'))
+        if json_bytes > MAX_VALUE_JSON_BYTES:
+            return None, f'Value size {json_bytes} bytes exceeds limit {MAX_VALUE_JSON_BYTES} bytes'
+
+        return json_str, None
+
+    # ========== Async DB Operations ==========
+
+    async def build_snapshot_from_event(
+        self,
+        event: AgentEventEnvelope,
+        binding: AgentBinding,
+        descriptor: AgentRunnerDescriptor,
+    ) -> dict[str, dict[str, typing.Any]]:
+        """Build state snapshot for all scopes from event and binding.
+
+        Reads from database, respects state_policy.
+        """
+        state_policy = binding.state_policy
+
+        # If state is disabled, return all empty scopes
+        if not state_policy.enable_state:
+            return {
+                'conversation': {},
+                'actor': {},
+                'subject': {},
+                'runner': {},
+            }
+
+        snapshot: dict[str, dict[str, typing.Any]] = {
+            'conversation': {},
+            'actor': {},
+            'subject': {},
+            'runner': {},
+        }
+
+        async with self._db_engine.connect() as conn:
+            for scope in VALID_STATE_SCOPES:
+                if not self._check_scope_enabled(scope, binding):
+                    continue
+
+                scope_key = self._get_scope_key(scope, event, binding, descriptor)
+                if not scope_key:
+                    continue
+
+                # Query all state entries for this scope_key
+                result = await conn.execute(
+                    select(AgentRunnerState.state_key, AgentRunnerState.value_json)
+                    .where(AgentRunnerState.scope_key == scope_key)
+                )
+                rows = result.fetchall()
+
+                for row in rows:
+                    key = row.state_key
+                    value_json = row.value_json
+                    if value_json:
+                        try:
+                            snapshot[scope][key] = json.loads(value_json)
+                        except json.JSONDecodeError:
+                            pass  # Skip invalid JSON
+
+        # Seed external.conversation_id from event.conversation_id if not set
+        if self._check_scope_enabled('conversation', binding) and event.conversation_id:
+            if 'external.conversation_id' not in snapshot['conversation']:
+                snapshot['conversation']['external.conversation_id'] = event.conversation_id
+
+        return snapshot
+
+    async def apply_update_from_event(
+        self,
+        event: AgentEventEnvelope,
+        binding: AgentBinding,
+        descriptor: AgentRunnerDescriptor,
+        scope: str,
+        key: str,
+        value: typing.Any,
+        logger: typing.Any = None,
+    ) -> tuple[bool, str | None]:
+        """Apply a state update from event context.
+
+        Returns:
+            Tuple of (success, error_message). If success is False, error_message
+            contains the reason.
+        """
+        state_policy = binding.state_policy
+
+        # Check if state is disabled
+        if not state_policy.enable_state:
+            return False, 'State is disabled by binding policy'
+
+        # Validate scope
+        if scope not in VALID_STATE_SCOPES:
+            return False, f'Invalid scope: {scope}'
+
+        # Check if scope is enabled
+        if not self._check_scope_enabled(scope, binding):
+            return False, f'Scope "{scope}" not enabled by binding policy'
+
+        # Map accepted key aliases
+        key = normalize_state_key(key)
+
+        # Get scope key
+        scope_key = self._get_scope_key(scope, event, binding, descriptor)
+        if not scope_key:
+            return False, f'Missing identity for scope "{scope}"'
+
+        # Validate and serialize value
+        value_json, error = self._validate_json_value(value, logger)
+        if error:
+            return False, error
+
+        # Build context fields
+        binding_identity = get_binding_identity(binding)
+
+        async with self._db_engine.begin() as conn:
+            # Check if entry exists
+            result = await conn.execute(
+                select(AgentRunnerState.id)
+                .where(AgentRunnerState.scope_key == scope_key)
+                .where(AgentRunnerState.state_key == key)
+            )
+            existing = result.first()
+
+            now = datetime.utcnow()
+
+            if existing:
+                # Update existing entry
+                await conn.execute(
+                    update(AgentRunnerState)
+                    .where(AgentRunnerState.id == existing.id)
+                    .values(
+                        value_json=value_json,
+                        updated_at=now,
+                    )
+                )
+            else:
+                # Insert new entry
+                await conn.execute(
+                    sqlalchemy.insert(AgentRunnerState).values(
+                        runner_id=descriptor.id,
+                        binding_identity=binding_identity,
+                        scope=scope,
+                        scope_key=scope_key,
+                        state_key=key,
+                        value_json=value_json,
+                        bot_id=event.bot_id,
+                        workspace_id=event.workspace_id,
+                        conversation_id=event.conversation_id,
+                        thread_id=event.thread_id,
+                        actor_type=event.actor.actor_type if event.actor else None,
+                        actor_id=event.actor.actor_id if event.actor else None,
+                        subject_type=event.subject.subject_type if event.subject else None,
+                        subject_id=event.subject.subject_id if event.subject else None,
+                        created_at=now,
+                        updated_at=now,
+                    )
+                )
+
+        return True, None
+
+    async def state_get(
+        self,
+        scope_key: str,
+        state_key: str,
+    ) -> typing.Any:
+        """Get a single state value by scope_key and state_key.
+
+        Used by State API handlers.
+        """
+        state_key = normalize_state_key(state_key)
+
+        async with self._db_engine.connect() as conn:
+            result = await conn.execute(
+                select(AgentRunnerState.value_json)
+                .where(AgentRunnerState.scope_key == scope_key)
+                .where(AgentRunnerState.state_key == state_key)
+            )
+            row = result.first()
+
+            if not row or not row.value_json:
+                return None
+
+            try:
+                return json.loads(row.value_json)
+            except json.JSONDecodeError:
+                return None
+
+    async def state_set(
+        self,
+        scope_key: str,
+        state_key: str,
+        value: typing.Any,
+        runner_id: str,
+        binding_identity: str,
+        scope: str,
+        context: dict[str, typing.Any] | None = None,
+        logger: typing.Any = None,
+    ) -> tuple[bool, str | None]:
+        """Set a state value.
+
+        Used by State API handlers.
+        Context contains optional fields like bot_id, conversation_id, etc.
+        """
+        state_key = normalize_state_key(state_key)
+
+        # Validate and serialize value
+        value_json, error = self._validate_json_value(value, logger)
+        if error:
+            return False, error
+
+        context = context or {}
+
+        async with self._db_engine.begin() as conn:
+            # Check if entry exists
+            result = await conn.execute(
+                select(AgentRunnerState.id)
+                .where(AgentRunnerState.scope_key == scope_key)
+                .where(AgentRunnerState.state_key == state_key)
+            )
+            existing = result.first()
+
+            now = datetime.utcnow()
+
+            if existing:
+                # Update existing entry
+                await conn.execute(
+                    update(AgentRunnerState)
+                    .where(AgentRunnerState.id == existing.id)
+                    .values(
+                        value_json=value_json,
+                        updated_at=now,
+                    )
+                )
+            else:
+                # Insert new entry
+                await conn.execute(
+                    sqlalchemy.insert(AgentRunnerState).values(
+                        runner_id=runner_id,
+                        binding_identity=binding_identity,
+                        scope=scope,
+                        scope_key=scope_key,
+                        state_key=state_key,
+                        value_json=value_json,
+                        bot_id=context.get('bot_id'),
+                        workspace_id=context.get('workspace_id'),
+                        conversation_id=context.get('conversation_id'),
+                        thread_id=context.get('thread_id'),
+                        actor_type=context.get('actor_type'),
+                        actor_id=context.get('actor_id'),
+                        subject_type=context.get('subject_type'),
+                        subject_id=context.get('subject_id'),
+                        created_at=now,
+                        updated_at=now,
+                    )
+                )
+
+        return True, None
+
+    async def state_delete(
+        self,
+        scope_key: str,
+        state_key: str,
+    ) -> bool:
+        """Delete a state value.
+
+        Returns True if deleted, False if not found.
+        """
+        state_key = normalize_state_key(state_key)
+
+        async with self._db_engine.begin() as conn:
+            result = await conn.execute(
+                delete(AgentRunnerState)
+                .where(AgentRunnerState.scope_key == scope_key)
+                .where(AgentRunnerState.state_key == state_key)
+                .returning(AgentRunnerState.id)
+            )
+            deleted = result.first()
+            return deleted is not None
+
+    async def state_list(
+        self,
+        scope_key: str,
+        prefix: str | None = None,
+        limit: int = 100,
+    ) -> tuple[list[str], bool]:
+        """List state keys in a scope.
+
+        Returns tuple of (keys, has_more).
+        """
+        # Clamp limit to [1, 100]; a negative limit would otherwise report has_more with no keys
+        limit = max(1, min(limit, 100))
+
+        async with self._db_engine.connect() as conn:
+            query = (
+                select(AgentRunnerState.state_key)
+                .where(AgentRunnerState.scope_key == scope_key)
+                .order_by(AgentRunnerState.state_key)
+                .limit(limit + 1)  # Fetch one extra to check has_more
+            )
+
+            if prefix:
+                prefix = normalize_state_key(prefix)
+                query = query.where(
+                    AgentRunnerState.state_key.startswith(prefix, autoescape=True)  # escape % and _ in prefix
+                )
+
+            result = await conn.execute(query)
+            rows = result.fetchall()
+
+            keys = [row.state_key for row in rows[:limit]]
+            has_more = len(rows) > limit
+
+            return keys, has_more
+
+    async def clear_all(self) -> None:
+        """Clear all state entries (for testing)."""
+        async with self._db_engine.begin() as conn:
+            await conn.execute(delete(AgentRunnerState))
+
+
+# Global singleton persistent state store
+_persistent_state_store: PersistentStateStore | None = None
+_persistent_state_store_lock = threading.Lock()
+
+
+def get_persistent_state_store(db_engine: AsyncEngine | None = None) -> PersistentStateStore:
+    """Get the global persistent state store singleton.
+
+    Args:
+        db_engine: Database engine (required on first call)
+
+    Returns:
+        PersistentStateStore singleton
+    """
+    global _persistent_state_store
+    with _persistent_state_store_lock:
+        if _persistent_state_store is None:
+            if db_engine is None:
+                raise RuntimeError("db_engine required for first call to get_persistent_state_store")
+            _persistent_state_store = PersistentStateStore(db_engine)
+        return _persistent_state_store
+
+
+def reset_persistent_state_store() -> None:
+    """Reset the global persistent state store (for testing)."""
+    global _persistent_state_store
+    with _persistent_state_store_lock:
+        _persistent_state_store = None
--- a/src/langbot/pkg/agent/runner/pipeline_adapter.py
+++ b/src/langbot/pkg/agent/runner/pipeline_adapter.py
@@ -0,0 +1,626 @@
+"""Pipeline adapter for converting Query to event-first envelope.
+
+This adapter bridges the Query/Pipeline entry point with the event-first
+Protocol v1 architecture.
+"""
+from __future__ import annotations
+
+import hashlib
+import typing
+
+from langbot_plugin.api.entities.builtin.pipeline import query as pipeline_query
+from langbot_plugin.api.entities.builtin.platform import message as platform_message
+from langbot_plugin.api.entities.builtin.agent_runner.event import (
+    AgentEventContext,
+    ConversationContext,
+    ActorContext,
+    SubjectContext,
+    RawEventRef,
+)
+from langbot_plugin.api.entities.builtin.agent_runner.input import AgentInput
+from langbot_plugin.api.entities.builtin.agent_runner.delivery import DeliveryContext
+
+from .host_models import (
+    AgentEventEnvelope,
+    AgentBinding,
+    BindingScope,
+    ResourcePolicy,
+    StatePolicy,
+    DeliveryPolicy,
+)
+from . import events as runner_events
+
+
+class PipelineAdapter:
+    """Adapter for converting Pipeline Query to event-first envelope.
+
+    This adapter is responsible for:
+    - Converting Query to AgentEventEnvelope
+    - Converting Pipeline config to temporary AgentBinding
+    - Putting Query-only fields into adapter context
+    """
+
+    INTERNAL_PREFIX = '_'
+    SENSITIVE_PATTERNS = ('secret', 'token', 'key', 'password', 'credential', 'api_key', 'apikey')
+    PERMISSION_VARS = ('_pipeline_bound_plugins', '_authorized', '_permission')
+
+    @classmethod
+    def query_to_event(
+        cls,
+        query: pipeline_query.Query,
+    ) -> AgentEventEnvelope:
+        """Convert Pipeline Query to AgentEventEnvelope.
+
+        Args:
+            query: Pipeline query
+
+        Returns:
+            AgentEventEnvelope for event-first processing
+        """
+        # Build event context
+        event = cls._build_event_context(query)
+
+        # Build conversation context
+        conversation = cls._build_conversation_context(query)
+
+        # Build actor context
+        actor = cls._build_actor_context(query)
+
+        # Build subject context
+        subject = cls._build_subject_context(query)
+
+        # Build input
+        agent_input = cls._build_input(query)
+
+        # Build delivery context
+        delivery = cls._build_delivery_context(query)
+
+        # Build raw ref
+        raw_ref = cls._build_raw_ref(query)
+
+        return AgentEventEnvelope(
+            event_id=event.event_id or str(query.query_id),
+            event_type=event.event_type or runner_events.MESSAGE_RECEIVED,
+            event_time=event.event_time,
+            source="pipeline_adapter",
+            source_event_type=event.source_event_type,
+            bot_id=query.bot_uuid,
+            workspace_id=None,  # Not available in Query
+            conversation_id=conversation.conversation_id,
+            thread_id=conversation.thread_id,
+            actor=actor,
+            subject=subject,
+            input=agent_input,
+            delivery=delivery,
+            raw_ref=raw_ref,
+            data=event.data,
+        )
+
+    @classmethod
+    def pipeline_config_to_binding(
+        cls,
+        query: pipeline_query.Query,
+        runner_id: str,
+    ) -> AgentBinding:
+        """Convert Pipeline config to temporary AgentBinding.
+
+        Args:
+            query: Pipeline query
+            runner_id: Resolved runner ID
+
+        Returns:
+            AgentBinding for this run
+        """
+        pipeline_config = query.pipeline_config or {}
+        ai_config = pipeline_config.get('ai') or {}
+        runner_config = (ai_config.get('runner_config') or {}).get(runner_id) or {}
+        pipeline_uuid = getattr(query, 'pipeline_uuid', None)
+
+        # Build scope
+        scope = BindingScope(
+            scope_type="pipeline",
+            scope_id=pipeline_uuid,
+        )
+
+        # Build resource policy from pipeline config
+        resource_policy = ResourcePolicy(
+            allowed_model_uuids=cls._extract_allowed_models(query),
+            allowed_tool_names=cls._extract_allowed_tools(query),
+            allowed_kb_uuids=cls._extract_allowed_kbs(query),
+        )
+
+        # Build state policy
+        state_policy = StatePolicy(
+            enable_state=True,
+            state_scopes=["conversation", "actor", "subject", "runner"],
+        )
+
+        # Build delivery policy
+        delivery_policy = DeliveryPolicy(
+            enable_streaming=True,
+            enable_reply=True,
+        )
+
+        return AgentBinding(
+            binding_id=f"pipeline_{pipeline_uuid or 'default'}_{runner_id}",
+            scope=scope,
+            event_types=[runner_events.MESSAGE_RECEIVED],
+            runner_id=runner_id,
+            runner_config=runner_config,
+            resource_policy=resource_policy,
+            state_policy=state_policy,
+            delivery_policy=delivery_policy,
+            enabled=True,
+            pipeline_uuid=pipeline_uuid,
+        )
+
+    @classmethod
+    def build_adapter_context(
+        cls,
+        query: pipeline_query.Query,
+        binding: AgentBinding,
+    ) -> dict[str, typing.Any]:
+        """Build Query-derived fields for the Pipeline adapter entry."""
+        return {
+            'params': cls.build_params(query),
+            'prompt': cls.build_prompt(query),
+            'query_id': getattr(query, 'query_id', None),
+        }
+
+    @classmethod
+    def build_params(cls, query: pipeline_query.Query) -> dict[str, typing.Any]:
+        """Build adapter params from Pipeline variables with host filtering."""
+        params: dict[str, typing.Any] = {}
+        variables = getattr(query, 'variables', None)
+        if not variables:
+            return params
+
+        for key, value in variables.items():
+            if key.startswith(cls.INTERNAL_PREFIX):
+                continue
+            key_lower = key.lower()
+            if any(pattern in key_lower for pattern in cls.SENSITIVE_PATTERNS):
+                continue
+            if key.startswith(cls.PERMISSION_VARS):  # str.startswith accepts a tuple of prefixes
+                continue
+            if cls.is_json_serializable(value):
+                params[key] = value
+
+        return params
+
+    @classmethod
+    def build_prompt(cls, query: pipeline_query.Query) -> list[dict[str, typing.Any]]:
+        """Build effective prompt messages from Pipeline preprocessing output."""
+        prompt = getattr(query, 'prompt', None)
+        messages = getattr(prompt, 'messages', None)
+        if not messages:
+            return []
+        return [cls._dump_message(msg) for msg in messages]
+
+    @classmethod
+    def is_json_serializable(cls, value: typing.Any) -> bool:
+        """Return whether a value can safely cross the adapter boundary as JSON."""
+        if value is None or isinstance(value, (str, int, float, bool)):
+            return True
+        if isinstance(value, (list, tuple)):
+            return all(cls.is_json_serializable(item) for item in value)
+        if isinstance(value, dict):
+            return all(
+                isinstance(k, str) and cls.is_json_serializable(v)
+                for k, v in value.items()
+            )
+        return False
+
+    @staticmethod
+    def _dump_message(message: typing.Any) -> dict[str, typing.Any]:
+        """Serialize a provider message-like object."""
+        if hasattr(message, 'model_dump'):
+            return message.model_dump(mode='json')
+        if isinstance(message, dict):
+            return message
+        return {
+            'role': getattr(message, 'role', None),
+            'content': getattr(message, 'content', None),
+        }
+
+    # Private helper methods
+
+    @classmethod
+    def _build_event_context(
+        cls,
+        query: pipeline_query.Query,
+    ) -> AgentEventContext:
+        """Build AgentEventContext from Query."""
+        message_event = getattr(query, 'message_event', None)
+
+        event_data: dict[str, typing.Any] = {}
+        if message_event and hasattr(message_event, 'model_dump'):
+            try:
+                event_data = message_event.model_dump(mode='json')
+            except TypeError:
+                event_data = message_event.model_dump()
+            except Exception:
+                event_data = {}
+            event_data.pop('source_platform_object', None)
+
+        source_event_type = None
+        if message_event:
+            source_event_type = getattr(message_event, 'type', None)
+
+        message_chain = getattr(query, 'message_chain', None)
+        message_id = getattr(message_chain, 'message_id', None)
+        if message_id == -1:
+            message_id = None
+
+        event_time = None
+        if message_event:
+            event_time = getattr(message_event, 'time', None)
+        if isinstance(event_time, (int, float)):
+            event_time = int(event_time)
+
+        source_event_id = str(message_id or query.query_id)
+        return AgentEventContext(
+            event_id=cls._build_scoped_event_id(query, source_event_id, event_time),
+            event_type=runner_events.MESSAGE_RECEIVED,
+            event_time=event_time,
+            source="pipeline_adapter",
+            source_event_type=source_event_type,
+            data=event_data,
+        )
+
+    @classmethod
+    def _build_scoped_event_id(
+        cls,
+        query: pipeline_query.Query,
+        source_event_id: str,
+        event_time: int | None,
+    ) -> str:
+        """Build a globally unique host event id from pipeline-local ids."""
+        launcher_type = getattr(query, 'launcher_type', None)
+        launcher_type_value = getattr(launcher_type, 'value', launcher_type) if launcher_type is not None else None
+        scope_parts = [
+            'pipeline_adapter',
+            getattr(query, 'pipeline_uuid', None),
+            getattr(query, 'bot_uuid', None),
+            launcher_type_value,
+            getattr(query, 'launcher_id', None),
+            getattr(query, 'sender_id', None),
+            source_event_id,
+            event_time,
+        ]
+        scoped = '|'.join('' if part is None else str(part) for part in scope_parts)
+        digest = hashlib.sha256(scoped.encode('utf-8')).hexdigest()[:32]
+        return f'pipeline:{digest}'
+
+    @classmethod
+    def _build_conversation_context(
+        cls,
+        query: pipeline_query.Query,
+    ) -> ConversationContext:
+        """Build ConversationContext from Query."""
+        # Handle launcher_type safely
+        launcher_type = getattr(query, 'launcher_type', None)
+        launcher_type_value = None
+        if launcher_type is not None:
+            launcher_type_value = getattr(launcher_type, 'value', launcher_type)
+
+        # Handle launcher_id
+        launcher_id = getattr(query, 'launcher_id', None)
+
+        # Build session_id from launcher info if available
+        session_id = None
+        if launcher_type_value and launcher_id:
+            session_id = f'{launcher_type_value}_{launcher_id}'
+
+        # Handle session and conversation_id
+        conversation_id = None
+        session = getattr(query, 'session', None)
+        if session:
+            conversation = getattr(session, 'using_conversation', None)
+            if conversation:
+                conversation_id = getattr(conversation, 'uuid', None)
+
+        if not conversation_id:
+            variables = getattr(query, 'variables', None) or {}
+            conversation_id = variables.get('conversation_id') or None
+
+        if not conversation_id:
+            conversation_id = session_id
+
+        # Handle sender_id
+        sender_id = getattr(query, 'sender_id', None)
+        if sender_id is not None:
+            sender_id = str(sender_id)
+
+        # Handle bot_uuid
+        bot_uuid = getattr(query, 'bot_uuid', None)
+
+        # Handle pipeline_uuid
+        pipeline_uuid = getattr(query, 'pipeline_uuid', None)
+
+        return ConversationContext(
+            conversation_id=str(conversation_id) if conversation_id is not None else None,
+            thread_id=None,
+            launcher_type=launcher_type_value,
+            launcher_id=launcher_id,
+            sender_id=sender_id,
+            bot_id=bot_uuid,
+            workspace_id=None,
+            session_id=session_id,
+            pipeline_uuid=pipeline_uuid,
+        )
+
+    @classmethod
+    def _build_actor_context(
+        cls,
+        query: pipeline_query.Query,
+    ) -> ActorContext:
+        """Build ActorContext from Query."""
+        message_event = getattr(query, 'message_event', None)
+        sender = getattr(message_event, 'sender', None) if message_event else None
+        sender_id = getattr(query, 'sender_id', None)
+        actor_id = getattr(sender, 'id', None) if sender else None
+        if actor_id is None:
+            actor_id = sender_id
+        actor_name = sender.get_name() if sender and hasattr(sender, 'get_name') else None
+
+        return ActorContext(
+            actor_type="user",
+            actor_id=str(actor_id) if actor_id is not None else None,
+            actor_name=actor_name,
+            metadata={},
+        )
+
+    @classmethod
+    def _build_subject_context(
+        cls,
+        query: pipeline_query.Query,
+    ) -> SubjectContext:
+        """Build SubjectContext from Query."""
+        message_chain = getattr(query, 'message_chain', None)
+        message_id = getattr(message_chain, 'message_id', None) if message_chain else None
+        if message_id == -1:
+            message_id = None
+
+        query_id = getattr(query, 'query_id', None)
+
+        # Safely get launcher_type
+        launcher_type = getattr(query, 'launcher_type', None)
+        launcher_type_value = None
+        if launcher_type is not None:
+            launcher_type_value = getattr(launcher_type, 'value', launcher_type)
+
+        return SubjectContext(
+            subject_type="message",
+            subject_id=str(message_id or (query_id if query_id is not None else '')),  # keep a query_id of 0
+            data={
+                "launcher_type": launcher_type_value,
+                "launcher_id": getattr(query, 'launcher_id', None),
+                "sender_id": str(getattr(query, 'sender_id', '')) if getattr(query, 'sender_id', None) is not None else None,
+                "bot_uuid": getattr(query, 'bot_uuid', None),
+                "pipeline_uuid": getattr(query, 'pipeline_uuid', None),
+            },
+        )
+
+    @classmethod
+    def _build_input(
+        cls,
+        query: pipeline_query.Query,
+    ) -> AgentInput:
+        """Build AgentInput from Query."""
+        text = None
+        text_parts: list[str] = []
+        contents: list[dict[str, typing.Any]] = []
+
+        user_message = getattr(query, 'user_message', None)
+        if user_message:
+            content = getattr(user_message, 'content', None)
+            if isinstance(content, list):
+                for elem in content:
+                    # Handle both real objects and mocks
+                    if hasattr(elem, 'model_dump'):
+                        contents.append(elem.model_dump(mode='json'))
+                    elif isinstance(elem, dict):
+                        contents.append(elem)
+                    else:
+                        # For mocks, extract type and text attributes
+                        elem_type = getattr(elem, 'type', None)
+                        if elem_type == 'text':
+                            elem_text = getattr(elem, 'text', None)
+                            contents.append({'type': 'text', 'text': elem_text})
+                            if elem_text:
+                                text_parts.append(elem_text)
+                        continue
+
+                    # Extract text for the text field (dict elements carry it as keys)
+                    get = elem.get if isinstance(elem, dict) else (lambda name: getattr(elem, name, None))
+                    elem_text = get('text') if get('type') == 'text' else None
+                    if elem_text:
+                        text_parts.append(elem_text)
+            elif content is not None:
+                text = str(content)
+                contents.append({'type': 'text', 'text': text})
+
+        if text_parts:
+            text = ''.join(text_parts)
+
+        message_chain_dict = None
+        message_chain = getattr(query, 'message_chain', None)
+        if message_chain:
+            if hasattr(message_chain, 'model_dump'):
+                message_chain_dict = message_chain.model_dump(mode='json')
+
+        attachments = cls._build_attachments(query, contents)
+
+        return AgentInput(
+            text=text,
+            contents=contents,
+            message_chain=message_chain_dict,
+            attachments=attachments,
+        )
+
+    @classmethod
+    def _build_attachments(
+        cls,
+        query: pipeline_query.Query,
+        contents: list[dict[str, typing.Any]],
+    ) -> list[dict[str, typing.Any]]:
+        """Extract attachments from query."""
+        import uuid
+
+        attachments: list[dict[str, typing.Any]] = []
+
+        for elem in contents:
+            elem_type = elem.get('type')
+            artifact_id = str(uuid.uuid4())  # Generate unique ID
+
+            if elem_type == 'image_url':
+                image_url = elem.get('image_url') or {}
+                attachments.append({
+                    'artifact_id': artifact_id,
+                    'artifact_type': 'image',
+                    'source': 'url',
+                    'url': image_url.get('url') if isinstance(image_url, dict) else str(image_url),
+                })
+            elif elem_type == 'image_base64':
+                attachments.append({
+                    'artifact_id': artifact_id,
+                    'artifact_type': 'image',
+                    'source': 'base64',
+                    'content': elem.get('image_base64'),
+                })
+            elif elem_type == 'file_url':
+                attachments.append({
+                    'artifact_id': artifact_id,
+                    'artifact_type': 'file',
+                    'source': 'url',
+                    'url': elem.get('file_url'),
+                    'name': elem.get('file_name'),
+                })
+            elif elem_type == 'file_base64':
+                attachments.append({
+                    'artifact_id': artifact_id,
+                    'artifact_type': 'file',
+                    'source': 'base64',
+                    'content': elem.get('file_base64'),
+                    'name': elem.get('file_name'),
+                })
+
+        message_chain = getattr(query, 'message_chain', None)
+        if message_chain:
+            try:
+                for component in message_chain:
+                    artifact_id = str(uuid.uuid4())  # Generate unique ID
+
+                    if isinstance(component, platform_message.Image):
+                        attachments.append({
+                            'artifact_id': artifact_id,
+                            'artifact_type': 'image',
+                            'source': 'message_chain',
+                            'id': component.image_id or None,
+                            'url': component.url or None,
+                        })
+                    elif isinstance(component, platform_message.File):
+                        attachments.append({
+                            'artifact_id': artifact_id,
+                            'artifact_type': 'file',
+                            'source': 'message_chain',
+                            'id': component.id or None,
+                            'name': component.name or None,
+                        })
+                    elif isinstance(component, platform_message.Voice):
+                        attachments.append({
+                            'artifact_id': artifact_id,
+                            'artifact_type': 'voice',
+                            'source': 'message_chain',
+                            'id': component.voice_id or None,
+                            'url': component.url or None,
+                        })
+            except TypeError:
+                # message_chain is not iterable (e.g., a Mock object)
+                pass
+
+        return attachments
+
+    @classmethod
+    def _build_delivery_context(
+        cls,
+        query: pipeline_query.Query,
+    ) -> DeliveryContext:
+        """Build DeliveryContext from Query."""
+        message_chain = getattr(query, 'message_chain', None)
+        return DeliveryContext(
+            surface="platform",
+            reply_target={
+                "message_id": getattr(message_chain, 'message_id', None),
+            },
+            supports_streaming=True,
+            supports_edit=False,
+            supports_reaction=False,
+            platform_capabilities={},
+        )
+
+    @classmethod
+    def _build_raw_ref(
+        cls,
+        query: pipeline_query.Query,
+    ) -> RawEventRef | None:
+        """Build RawEventRef from Query."""
+        # For now, we don't store raw event payload
+        return None
+
+    @classmethod
+    def _extract_allowed_models(
+        cls,
+        query: pipeline_query.Query,
+    ) -> list[str] | None:
+        """Extract allowed model UUIDs from query."""
+        model_uuids: list[str] = []
+        model_uuid = getattr(query, 'use_llm_model_uuid', None)
+        if model_uuid:
+            model_uuids.append(model_uuid)
+
+        variables = getattr(query, 'variables', None) or {}
+        for fallback_uuid in variables.get('_fallback_model_uuids', []) or []:
+            if fallback_uuid and fallback_uuid not in model_uuids:
+                model_uuids.append(fallback_uuid)
+
+        return model_uuids or None
+
+    @classmethod
+    def _extract_allowed_tools(
+        cls,
+        query: pipeline_query.Query,
+    ) -> list[str] | None:
+        """Extract allowed tool names from query."""
+        use_funcs = getattr(query, 'use_funcs', None)
+        if not use_funcs:
+            return None
+        try:
+            tool_names = []
+            for func in use_funcs:
+                if isinstance(func, dict):
+                    name = func.get('name')
+                elif hasattr(func, 'name'):
+                    name = func.name
+                else:
+                    continue
+                if name:
+                    tool_names.append(name)
+            return tool_names if tool_names else None
+        except (TypeError, AttributeError):
+            return None
+
+    @classmethod
+    def _extract_allowed_kbs(
+        cls,
+        query: pipeline_query.Query,
+    ) -> list[str] | None:
+        """Extract allowed knowledge base UUIDs from query."""
+        variables = getattr(query, 'variables', None)
+        if not variables:
+            return None
+        kb_uuids = variables.get('_knowledge_base_uuids')
+        if kb_uuids:
+            return kb_uuids
+        return None
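
Review note on `_extract_allowed_models`: the merge of the primary model and the fallback models can be checked in isolation. The sketch below is a standalone copy of that logic under stated assumptions: `extract_allowed_models` is a hypothetical free function, and `SimpleNamespace` stands in for `pipeline_query.Query`. It is not the shipped implementation.

```python
from __future__ import annotations

from types import SimpleNamespace


def extract_allowed_models(query) -> list[str] | None:
    """Illustrative copy of the primary-plus-fallback merge (hypothetical helper)."""
    model_uuids: list[str] = []

    # The primary model comes first when present.
    primary = getattr(query, 'use_llm_model_uuid', None)
    if primary:
        model_uuids.append(primary)

    # Fallbacks keep their order; empty values and duplicates are skipped.
    variables = getattr(query, 'variables', None) or {}
    for fallback_uuid in variables.get('_fallback_model_uuids', []) or []:
        if fallback_uuid and fallback_uuid not in model_uuids:
            model_uuids.append(fallback_uuid)

    # An empty list collapses to None, matching the "no restriction declared" case.
    return model_uuids or None


# SimpleNamespace is only a stand-in for a real Query object.
query = SimpleNamespace(
    use_llm_model_uuid='m-1',
    variables={'_fallback_model_uuids': ['m-2', 'm-1', '', 'm-2']},
)
result = extract_allowed_models(query)
empty = extract_allowed_models(SimpleNamespace())
```

Here `result` keeps the primary model first and drops the repeated and empty fallbacks, while a query with neither attribute yields `None`.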
--- a/src/langbot/pkg/agent/runner/registry.py
+++ b/src/langbot/pkg/agent/runner/registry.py
@@ -0,0 +1,293 @@
+"""Agent runner registry for discovering and caching runner descriptors."""
+
+from __future__ import annotations
+
+import typing
+import asyncio
+
+from ...core import app
+from .descriptor import AgentRunnerDescriptor
+from .id import parse_runner_id, format_runner_id
+from .errors import RunnerNotFoundError, RunnerNotAuthorizedError
+
+
+class AgentRunnerRegistry:
+    """Registry for discovering and managing agent runners.
+
+    Responsibilities:
+    - Discover runners from plugin runtime via LIST_AGENT_RUNNERS
+    - Validate runner manifests (kind, metadata, spec)
+    - Cache discovered runners for performance
+    - Filter runners by bound plugins
+    - Handle manifest errors gracefully (log warning, skip runner)
+    """
+
+    ap: app.Application
+
+    _cache: dict[str, AgentRunnerDescriptor] | None
+    """Cached runner descriptors keyed by runner ID"""
+
+    _cache_lock: asyncio.Lock
+    """Lock for cache refresh operations"""
+
+    def __init__(self, ap: app.Application):
+        self.ap = ap
+        self._cache = None
+        self._cache_lock = asyncio.Lock()
+
+    async def _discover_runners(self) -> dict[str, AgentRunnerDescriptor]:
+        """Discover runners from plugin runtime.
+
+        Always discovers ALL runners (no bound_plugins filter).
+        The cache should contain unfiltered discovery results.
+
+        Returns:
+            Dict of runner descriptors keyed by runner ID
+        """
+        if not self.ap.plugin_connector.is_enable_plugin:
+            return {}
+
+        runners: dict[str, AgentRunnerDescriptor] = {}
+
+        try:
+            # Always list all runners (bound_plugins=None)
+            plugin_runners = await self.ap.plugin_connector.list_agent_runners(None)
+
+            for runner_data in plugin_runners:
+                try:
+                    descriptor = self._validate_and_build_descriptor(runner_data)
+                    if descriptor is not None:
+                        runners[descriptor.id] = descriptor
+                except Exception as e:
+                    plugin_author = runner_data.get('plugin_author', 'unknown')
+                    plugin_name = runner_data.get('plugin_name', 'unknown')
+                    runner_name = runner_data.get('runner_name', 'unknown')
+                    self.ap.logger.warning(
+                        f'Invalid runner manifest for plugin:{plugin_author}/{plugin_name}/{runner_name}: {e}'
+                    )
+                    continue
+
+        except Exception as e:
+            self.ap.logger.warning(f'Failed to list agent runners from plugin runtime: {e}')
+            return {}
+
+        return runners
+
+    def _validate_and_build_descriptor(self, runner_data: dict[str, typing.Any]) -> AgentRunnerDescriptor | None:
+        """Validate runner manifest and build descriptor.
+
+        Args:
+            runner_data: Raw runner data from plugin runtime with fields:
+                - plugin_author, plugin_name, runner_name
+                - manifest (full component manifest dict)
+                - protocol_version, capabilities, permissions, config (extracted from spec)
+
+        Returns:
+            AgentRunnerDescriptor if valid, None if invalid
+        """
+        plugin_author = runner_data.get('plugin_author', '')
+        plugin_name = runner_data.get('plugin_name', '')
+        runner_name = runner_data.get('runner_name', '')
+
+        if not plugin_author or not plugin_name or not runner_name:
+            return None
+
+        manifest = runner_data.get('manifest', {})
+
+        # Validate kind
+        kind = manifest.get('kind', '')
+        if kind != 'AgentRunner':
+            return None
+
+        # Validate metadata
+        metadata = manifest.get('metadata', {})
+        name = metadata.get('name', '')
+        if not name:
+            return None
+
+        # metadata.label should be an i18n dict; fall back to the name if missing
+        label = metadata.get('label', {})
+        if not label:
+            label = {'en_US': name}  # fallback keyed by locale, not by the name itself
+
+        spec = manifest.get('spec', {})
+
+        # SDK now provides these directly extracted from spec. Fall back to
+        # manifest.spec for older runtimes/tests that return the raw manifest.
+        protocol_version = runner_data.get('protocol_version') or spec.get('protocol_version', '1')
+        config_schema = runner_data.get('config') or spec.get('config', [])
+        capabilities = runner_data.get('capabilities') or spec.get('capabilities', {})
+        permissions = runner_data.get('permissions') or spec.get('permissions', {})
+
+        # Build descriptor
+        runner_id = format_runner_id(
+            source='plugin',
+            plugin_author=plugin_author,
+            plugin_name=plugin_name,
+            runner_name=runner_name,
+        )
+
+        return AgentRunnerDescriptor(
+            id=runner_id,
+            source='plugin',
+            label=label,
+            description=metadata.get('description') or runner_data.get('runner_description'),
+            plugin_author=plugin_author,
+            plugin_name=plugin_name,
+            runner_name=runner_name,
+            plugin_version=runner_data.get('plugin_version'),
+            protocol_version=protocol_version,
+            config_schema=config_schema,
+            capabilities=capabilities,
+            permissions=permissions,
+            raw_manifest=manifest,
+        )
+
+    async def refresh(self) -> None:
+        """Refresh runner cache.
+
+        Always discovers ALL runners (no bound_plugins filter).
+        The cache contains unfiltered discovery results.
+        """
+        async with self._cache_lock:
+            self._cache = await self._discover_runners()
+
+    async def list_runners(
+        self,
+        bound_plugins: list[str] | None = None,
+        use_cache: bool = True,
+    ) -> list[AgentRunnerDescriptor]:
+        """List available runners.
+
+        Args:
+            bound_plugins: Optional filter for bound plugins (applied locally)
+            use_cache: Use cached data if available
+
+        Returns:
+            List of runner descriptors
+        """
+        if use_cache and self._cache is not None:
+            # Filter from cache
+            return self._filter_runners_by_bound_plugins(self._cache, bound_plugins)
+
+        # Discover fresh (always full list)
+        runners = await self._discover_runners()
+
+        # Update cache (full list, unfiltered)
+        async with self._cache_lock:
+            self._cache = runners
+
+        # Filter locally
+        return self._filter_runners_by_bound_plugins(runners, bound_plugins)
+
+    def _filter_runners_by_bound_plugins(
+        self,
+        runners: dict[str, AgentRunnerDescriptor],
+        bound_plugins: list[str] | None,
+    ) -> list[AgentRunnerDescriptor]:
+        """Filter runners by bound plugins.
+
+        Args:
+            runners: Dict of runner descriptors
+            bound_plugins: Optional filter (None means all plugins allowed)
+
+        Returns:
+            Filtered list of runner descriptors
+        """
+        if bound_plugins is None:
+            # All plugins allowed
+            return list(runners.values())
+
+        allowed_plugin_ids = set(bound_plugins)
+        filtered = []
+        for descriptor in runners.values():
+            plugin_id = descriptor.get_plugin_id()
+            if plugin_id in allowed_plugin_ids:
+                filtered.append(descriptor)
+
+        return filtered
+
+    async def get(
+        self,
+        runner_id: str,
+        bound_plugins: list[str] | None = None,
+    ) -> AgentRunnerDescriptor:
+        """Get a specific runner descriptor.
+
+        Args:
+            runner_id: Runner ID to lookup
+            bound_plugins: Optional bound plugins filter
+
+        Returns:
+            AgentRunnerDescriptor
+
+        Raises:
+            RunnerNotFoundError: If runner not found
+            RunnerNotAuthorizedError: If runner not in bound plugins
+        """
+        # Parse and validate runner ID format
+        try:
+            parse_runner_id(runner_id)
+        except ValueError as e:
+            raise RunnerNotFoundError(runner_id) from e
+
+        # Get from cache or discover (always full list)
+        if self._cache is None:
+            await self.refresh()
+
+        if self._cache is None:
+            raise RunnerNotFoundError(runner_id)
+
+        descriptor = self._cache.get(runner_id)
+        if descriptor is None:
+            raise RunnerNotFoundError(runner_id)
+
+        # Check authorization
+        if bound_plugins is not None:
+            plugin_id = descriptor.get_plugin_id()
+            if plugin_id not in bound_plugins:
+                raise RunnerNotAuthorizedError(runner_id, bound_plugins)
+
+        return descriptor
+
+    async def get_runner_metadata_for_pipeline(self) -> tuple[list[dict[str, typing.Any]], list[dict[str, typing.Any]]]:
+        """Get runner metadata for pipeline configuration UI.
+
+        Returns an (options, stages) tuple: runner options and their config schemas for the DynamicForm.
+        """
+        # Get all runners (no bound plugin filter for metadata listing)
+        runners = await self.list_runners(bound_plugins=None)
+
+        options = []
+        stages = []
+
+        for descriptor in runners:
+            config_schema = []
+            for index, config_item in enumerate(descriptor.config_schema):
+                item = dict(config_item)
+                if not item.get('id'):
+                    item_name = item.get('name') or str(index)
+                    item['id'] = f'{descriptor.id}.{item_name}'
+                config_schema.append(item)
+
+            # Add runner option
+            options.append(
+                {
+                    'name': descriptor.id,
+                    'label': descriptor.label,
+                    'description': descriptor.description,
+                }
+            )
+
+            # Add config schema as stage if not empty
+            if descriptor.config_schema:
+                stages.append(
+                    {
+                        'name': descriptor.id,
+                        'label': descriptor.label,
+                        'description': descriptor.description,
+                        'config': config_schema,
+                    }
+                )
+
+        return options, stages
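
Review note on the registry: the cache always holds the unfiltered discovery result, and `bound_plugins` is applied locally on every read. A minimal sketch of that filtering step follows. The `Descriptor` dataclass and the `author/name` shape returned by `get_plugin_id()` are assumptions for illustration; the real `AgentRunnerDescriptor` lives in `descriptor.py` and is not shown in this diff.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    """Hypothetical stand-in for AgentRunnerDescriptor."""

    id: str
    plugin_author: str
    plugin_name: str

    def get_plugin_id(self) -> str:
        # Assumed format: "author/name".
        return f'{self.plugin_author}/{self.plugin_name}'


def filter_by_bound_plugins(
    runners: dict[str, Descriptor],
    bound_plugins: list[str] | None,
) -> list[Descriptor]:
    """None means no restriction; an empty list means nothing is allowed."""
    if bound_plugins is None:
        return list(runners.values())
    allowed = set(bound_plugins)
    return [d for d in runners.values() if d.get_plugin_id() in allowed]


cache = {
    'plugin:alice/echo/default': Descriptor('plugin:alice/echo/default', 'alice', 'echo'),
    'plugin:bob/search/default': Descriptor('plugin:bob/search/default', 'bob', 'search'),
}
everything = filter_by_bound_plugins(cache, None)
only_alice = filter_by_bound_plugins(cache, ['alice/echo'])
nothing = filter_by_bound_plugins(cache, [])
```

The distinction between `None` and `[]` matters to callers: passing an empty list is a deny-all filter, not a request for every runner.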
--- a/src/langbot/pkg/agent/runner/resource_builder.py
+++ b/src/langbot/pkg/agent/runner/resource_builder.py
@@ -0,0 +1,268 @@
+"""Agent resource builder for constructing authorized resources."""
+from __future__ import annotations
+
+import typing
+
+from ...core import app
+from .descriptor import AgentRunnerDescriptor
+from .context_builder import (
+    AgentResources,
+    ModelResource,
+    ToolResource,
+    KnowledgeBaseResource,
+    StorageResource,
+)
+from . import config_schema
+from .host_models import AgentEventEnvelope, AgentBinding
+
+
+class AgentResourceBuilder:
+    """Builder for constructing AgentResources with permission filtering.
+
+    Responsibilities:
+    - Apply 3-layer permission filtering:
+        1. Runner manifest declared permissions
+        2. Binding resource policy (allowed models, tools, knowledge bases, storage)
+        3. Runner binding config selected resources
+    - Build models list from config-declared and policy-granted models
+    - Build tools list from policy-allowed tool names
+    - Build knowledge_bases list from config and policy grants
+    - Build storage permissions summary (files are populated at runtime)
+
+    Note: This only builds the resource declaration. The actual proxy actions
+    in handler.py must still validate against ctx.resources at runtime.
+
+    Resource field names match the plugin SDK payload:
+    - ModelResource: model_id, model_type, provider
+    - ToolResource: tool_name, tool_type, description
+    - KnowledgeBaseResource: kb_id, kb_name, kb_type
+    - StorageResource: plugin_storage, workspace_storage
+    """
+
+    ap: app.Application
+
+    def __init__(self, ap: app.Application):
+        self.ap = ap
+
+    async def build_resources_from_binding(
+        self,
+        event: AgentEventEnvelope,
+        binding: AgentBinding,
+        descriptor: AgentRunnerDescriptor,
+    ) -> AgentResources:
+        """Build AgentResources from event and binding.
+
+        This is the main entry point for Protocol v1.
+
+        Args:
+            event: Event envelope
+            binding: Agent binding with resource policy
+            descriptor: Runner descriptor with permissions and capabilities
+
+        Returns:
+            AgentResources dict with filtered resource lists
+        """
+        # Layer 1: Runner manifest permissions
+        manifest_perms = descriptor.permissions
+
+        # Layer 2: Binding resource policy
+        resource_policy = binding.resource_policy
+
+        # Layer 3: Runner binding config
+        runner_config = binding.runner_config
+
+        # Build each resource category
+        models = await self._build_models_from_binding(
+            manifest_perms, resource_policy, descriptor, runner_config
+        )
+        tools = await self._build_tools_from_binding(
+            manifest_perms, resource_policy, binding
+        )
+        knowledge_bases = await self._build_knowledge_bases_from_binding(
+            manifest_perms, resource_policy, descriptor, runner_config
+        )
+        storage = self._build_storage_from_binding(manifest_perms, binding)
+
+        return {
+            'models': models,
+            'tools': tools,
+            'knowledge_bases': knowledge_bases,
+            'files': [],  # Files are populated at runtime
+            'storage': storage,
+            'platform_capabilities': {},  # Reserved for EBA
+        }
+
+    async def _build_models_from_binding(
+        self,
+        manifest_perms: dict[str, list[str]],
+        resource_policy: typing.Any,
+        descriptor: AgentRunnerDescriptor,
+        runner_config: dict[str, typing.Any],
+    ) -> list[ModelResource]:
+        """Build models list from binding."""
+        models: list[ModelResource] = []
+        seen_model_ids: set[str] = set()
+
+        model_perms = manifest_perms.get('models', [])
+        allow_llm = 'invoke' in model_perms or 'stream' in model_perms
+        allow_rerank = 'rerank' in model_perms
+        if not allow_llm and not allow_rerank:
+            return models
+
+        # Get additional model UUID grants from resource policy.
+        allowed_uuids = resource_policy.allowed_model_uuids
+
+        # Add model resources from binding config schema
+        await self._append_config_declared_model_resources(
+            models=models,
+            seen_model_ids=seen_model_ids,
+            descriptor=descriptor,
+            runner_config=runner_config,
+            include_llm=allow_llm,
+            include_rerank=allow_rerank,
+        )
+
+        # Add explicitly allowed models
+        if allowed_uuids and allow_llm:
+            for model_uuid in allowed_uuids:
+                await self._append_llm_model_resource(models, seen_model_ids, model_uuid)
+
+        return models
+
+    async def _build_tools_from_binding(
+        self,
+        manifest_perms: dict[str, list[str]],
+        resource_policy: typing.Any,
+        binding: AgentBinding,
+    ) -> list[ToolResource]:
+        """Build tools list from binding."""
+        tools: list[ToolResource] = []
+
+        # Check manifest permission
+        tool_perms = manifest_perms.get('tools', [])
+        if 'detail' not in tool_perms and 'call' not in tool_perms:
+            return tools
+
+        # Get tool names from resource policy
+        allowed_names = resource_policy.allowed_tool_names
+
+        if allowed_names:
+            for tool_name in allowed_names:
+                tools.append({
+                    'tool_name': tool_name,
+                    'tool_type': None,
+                    'description': None,
+                })
+
+        return tools
+
+    async def _build_knowledge_bases_from_binding(
+        self,
+        manifest_perms: dict[str, list[str]],
+        resource_policy: typing.Any,
+        descriptor: AgentRunnerDescriptor,
+        runner_config: dict[str, typing.Any],
+    ) -> list[KnowledgeBaseResource]:
+        """Build knowledge bases list from binding."""
+        kb_resources: list[KnowledgeBaseResource] = []
+
+        # Check manifest permission
+        kb_perms = manifest_perms.get('knowledge_bases', [])
+        if 'list' not in kb_perms and 'retrieve' not in kb_perms:
+            return kb_resources
+
+        # Get KB UUID grants from schema-defined config fields.
+        kb_uuids = config_schema.extract_knowledge_base_uuids(descriptor, runner_config)
+
+        # Also include resource policy grants.
+        allowed_uuids = resource_policy.allowed_kb_uuids
+        if allowed_uuids:
+            kb_uuids = list(dict.fromkeys([*kb_uuids, *allowed_uuids]))
+
+        for kb_uuid in kb_uuids:
+            try:
+                kb = await self.ap.rag_mgr.get_knowledge_base_by_uuid(kb_uuid)
+                if kb:
+                    kb_resources.append({
+                        'kb_id': kb_uuid,
+                        'kb_name': kb.get_name(),
+                        'kb_type': getattr(kb.knowledge_base_entity, 'kb_type', None),
+                    })
+            except Exception as e:
+                self.ap.logger.warning(f'Failed to build knowledge base resource {kb_uuid}: {e}')
+
+        return kb_resources
+
+    def _build_storage_from_binding(
+        self,
+        manifest_perms: dict[str, list[str]],
+        binding: AgentBinding,
+    ) -> StorageResource:
+        """Build storage permissions from binding."""
+        storage_perms = manifest_perms.get('storage', [])
+        resource_policy = binding.resource_policy
+
+        return {
+            'plugin_storage': 'plugin' in storage_perms and resource_policy.allow_plugin_storage,
+            'workspace_storage': 'workspace' in storage_perms and resource_policy.allow_workspace_storage,
+        }
+
+    async def _append_config_declared_model_resources(
+        self,
+        models: list[ModelResource],
+        seen_model_ids: set[str],
+        descriptor: AgentRunnerDescriptor,
+        runner_config: dict[str, typing.Any],
+        include_llm: bool,
+        include_rerank: bool,
+    ) -> None:
+        """Authorize model-like values selected through DynamicForm fields."""
+        for model_type, model_uuid in config_schema.iter_config_model_refs(descriptor, runner_config):
+            if model_type == 'llm' and include_llm:
+                await self._append_llm_model_resource(models, seen_model_ids, model_uuid)
+            elif model_type == 'rerank' and include_rerank:
+                await self._append_rerank_model_resource(models, seen_model_ids, model_uuid)
+
+    async def _append_llm_model_resource(
+        self,
+        models: list[ModelResource],
+        seen_model_ids: set[str],
+        model_uuid: str | None,
+    ) -> None:
+        """Append an LLM model resource if it exists and has not been added."""
+        if not model_uuid or model_uuid == '__none__' or model_uuid in seen_model_ids:
+            return
+
+        try:
+            model = await self.ap.model_mgr.get_model_by_uuid(model_uuid)
+            if model and model.model_entity:
+                models.append({
+                    'model_id': model_uuid,
+                    'model_type': getattr(model.model_entity, 'model_type', None),
+                    'provider': getattr(model.provider_entity, 'name', None) if hasattr(model, 'provider_entity') else None,
+                })
+                seen_model_ids.add(model_uuid)
+        except Exception as e:
+            self.ap.logger.warning(f'Failed to build LLM model resource {model_uuid}: {e}')
+
+    async def _append_rerank_model_resource(
+        self,
+        models: list[ModelResource],
+        seen_model_ids: set[str],
+        model_uuid: str | None,
+    ) -> None:
+        """Append a rerank model resource if it exists and has not been added."""
+        if not model_uuid or model_uuid == '__none__' or model_uuid in seen_model_ids:
+            return
+
+        try:
+            model = await self.ap.model_mgr.get_rerank_model_by_uuid(model_uuid)
+            if model and model.model_entity:
+                models.append({
+                    'model_id': model_uuid,
+                    'model_type': getattr(model.model_entity, 'model_type', 'rerank') or 'rerank',
+                    'provider': getattr(model.provider_entity, 'name', None) if hasattr(model, 'provider_entity') else None,
+                })
+                seen_model_ids.add(model_uuid)
+        except Exception as e:
+            self.ap.logger.warning(f'Failed to build rerank model resource {model_uuid}: {e}')
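
Review note on the resource builder: two of its rules are easy to verify without the host application. Storage access is granted only when both the manifest and the resource policy allow it, and knowledge base grants from config and policy are merged in order without duplicates. The sketch below assumes a hypothetical `ResourcePolicy` dataclass carrying only the attributes read in this file; the real policy type is defined in `host_models.py`.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ResourcePolicy:
    """Hypothetical stand-in exposing only the attributes used below."""

    allowed_kb_uuids: list[str] = field(default_factory=list)
    allow_plugin_storage: bool = False
    allow_workspace_storage: bool = False


def build_storage(manifest_perms: dict[str, list[str]], policy: ResourcePolicy) -> dict[str, bool]:
    """A storage kind is enabled only if the manifest AND the policy allow it."""
    storage_perms = manifest_perms.get('storage', [])
    return {
        'plugin_storage': 'plugin' in storage_perms and policy.allow_plugin_storage,
        'workspace_storage': 'workspace' in storage_perms and policy.allow_workspace_storage,
    }


def merge_kb_uuids(config_uuids: list[str], policy: ResourcePolicy) -> list[str]:
    """Union of config and policy grants, first occurrence wins, order preserved."""
    return list(dict.fromkeys([*config_uuids, *policy.allowed_kb_uuids]))


policy = ResourcePolicy(
    allowed_kb_uuids=['kb-2', 'kb-3'],
    allow_plugin_storage=True,
    allow_workspace_storage=True,
)
# The manifest declares only plugin storage, so workspace storage stays off.
storage = build_storage({'storage': ['plugin']}, policy)
kb_uuids = merge_kb_uuids(['kb-1', 'kb-2'], policy)
```

`dict.fromkeys` is used for the merge because dict keys preserve insertion order, which keeps the config-selected knowledge bases ahead of the policy grants.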
--- a/src/langbot/pkg/agent/runner/result_normalizer.py
+++ b/src/langbot/pkg/agent/runner/result_normalizer.py
@@ -0,0 +1,193 @@
+"""Agent result normalizer for converting AgentRunResult to Pipeline messages."""
+from __future__ import annotations
+
+import typing
+
+from langbot_plugin.api.entities.builtin.provider import message as provider_message
+
+from ...core import app
+from .descriptor import AgentRunnerDescriptor
+from .errors import RunnerExecutionError, RunnerProtocolError
+
+
+# Maximum size for a single result payload (prevent memory exhaustion)
+MAX_RESULT_SIZE_BYTES = 1024 * 1024  # 1 MB
+
+
+class AgentResultNormalizer:
+    """Normalizer for converting AgentRunResult to Pipeline messages.
+
+    Responsibilities:
+    - Accept only supported result types (message.delta, message.completed, etc.)
+    - Map message.delta -> MessageChunk
+    - Map message.completed -> Message
+    - Map run.completed (with message) -> Message
+    - Handle run.failed as controlled error
+    - Ignore unknown types with warning
+    - Validate result size
+    - Validate message schema
+
+    Accepted result types:
+    - message.delta
+    - message.completed
+    - tool.call.started, tool.call.completed (log only)
+    - state.updated (log only)
+    - artifact.created (log only)
+    - run.completed
+    - run.failed
+    - action.requested (log only, don't execute)
+    """
+
+    ap: app.Application
+
+    def __init__(self, ap: app.Application):
+        self.ap = ap
+
+    async def normalize(
+        self,
+        result_dict: dict[str, typing.Any],
+        descriptor: AgentRunnerDescriptor,
+    ) -> provider_message.Message | provider_message.MessageChunk | None:
+        """Normalize AgentRunResult to Message or MessageChunk.
+
+        Args:
+            result_dict: Raw result dict from plugin runtime
+            descriptor: Runner descriptor for error context
+
+        Returns:
+            Message, MessageChunk, or None (for non-message events)
+
+        Raises:
+            RunnerExecutionError: On run.failed
+            RunnerProtocolError: On invalid result format
+        """
+        # Validate result type
+        result_type = result_dict.get('type')
+        if not result_type:
+            raise RunnerProtocolError(descriptor.id, 'Missing result type')
+
+        # Validate result size
+        try:
+            import json
+            result_json = json.dumps(result_dict)
+            if len(result_json) > MAX_RESULT_SIZE_BYTES:
+                self.ap.logger.warning(
+                    f'Runner {descriptor.id} result too large ({len(result_json)} bytes), truncating'
+                )
+                # Truncate oversized text in place, under the key that carries it
+                data = result_dict.get('data') or {}
+                for payload_key in ('chunk', 'message'):
+                    payload = data.get(payload_key)
+                    content = payload.get('content') if isinstance(payload, dict) else None
+                    if isinstance(content, str) and len(content) > 10000:
+                        payload['content'] = content[:10000] + '...[truncated]'
+        except Exception as e:
+            self.ap.logger.warning(f'Failed to validate runner {descriptor.id} result size: {e}')
+
+        # Handle each result type
+        data = result_dict.get('data', {})
+
+        if result_type == 'message.delta':
+            return self._normalize_message_delta(data, descriptor)
+
+        elif result_type == 'message.completed':
+            return self._normalize_message_completed(data, descriptor)
+
+        elif result_type == 'tool.call.started':
+            # Log only, don't yield to pipeline
+            self.ap.logger.debug(
+                f'Runner {descriptor.id} tool call started: {data.get("tool_name", "unknown")}'
+            )
+            return None
+
+        elif result_type == 'tool.call.completed':
+            # Log only, don't yield to pipeline
+            self.ap.logger.debug(
+                f'Runner {descriptor.id} tool call completed: {data.get("tool_name", "unknown")}'
+            )
+            return None
+
+        elif result_type == 'state.updated':
+            # Log for telemetry, don't yield to pipeline
+            # Orchestrator already handles the actual PersistentStateStore update.
+            scope = data.get('scope', 'unknown')
+            key = data.get('key', 'unknown')
+            value_repr = repr(data.get('value', '...'))[:100]  # Truncate for log
+            self.ap.logger.debug(
+                f'Runner {descriptor.id} state.updated logged: scope={scope}, key={key}, value={value_repr}'
+            )
+            return None
+
+        elif result_type == 'run.completed':
+            # May include final message
+            if 'message' in data:
+                return self._normalize_message_completed(data, descriptor)
+            # If no message, it's just completion signal
+            return None
+
+        elif result_type == 'run.failed':
+            error_msg = data.get('error', 'Unknown error')
+            error_code = data.get('code', 'unknown')
+            retryable = data.get('retryable', False)
+            raise RunnerExecutionError(
+                descriptor.id,
+                f'{error_msg} (code: {error_code})',
+                retryable=retryable,
+            )
+
+        elif result_type == 'action.requested':
+            # Reserved for EBA - log only, don't execute
+            self.ap.logger.info(
+                f'Runner {descriptor.id} requested action (not executed in current phase): '
+                f'{data.get("action", "unknown")}'
+            )
+            return None
+
+        elif result_type == 'artifact.created':
+            # Log for telemetry, consumed by orchestrator
+            artifact_id = data.get('artifact_id', 'unknown')
+            artifact_type = data.get('artifact_type', 'unknown')
+            self.ap.logger.debug(
+                f'Runner {descriptor.id} artifact.created logged: artifact_id={artifact_id}, type={artifact_type}'
+            )
+            return None
+
+        else:
+            # Unknown type - warn and ignore.
+            self.ap.logger.warning(
+                f'Runner {descriptor.id} returned unknown result type: {result_type}. '
+                f'Expected supported types (message.delta, message.completed, run.completed, run.failed, etc.)'
+            )
+            return None
+
+    def _normalize_message_delta(
+        self,
+        data: dict[str, typing.Any],
+        descriptor: AgentRunnerDescriptor,
+    ) -> provider_message.MessageChunk:
+        """Normalize message.delta to MessageChunk."""
+        chunk_data = data.get('chunk', {})
+        if not chunk_data:
+            raise RunnerProtocolError(descriptor.id, 'message.delta missing chunk data')
+
+        try:
+            chunk = provider_message.MessageChunk.model_validate(chunk_data)
+            return chunk
+        except Exception as e:
+            raise RunnerProtocolError(descriptor.id, f'Invalid chunk schema: {e}') from e
+
+    def _normalize_message_completed(
+        self,
+        data: dict[str, typing.Any],
+        descriptor: AgentRunnerDescriptor,
+    ) -> provider_message.Message:
+        """Normalize message.completed to Message."""
+        message_data = data.get('message', {})
+        if not message_data:
+            raise RunnerProtocolError(descriptor.id, 'message.completed missing message data')
+
+        try:
+            msg = provider_message.Message.model_validate(message_data)
+            return msg
+        except Exception as e:
+            raise RunnerProtocolError(descriptor.id, f'Invalid message schema: {e}') from e
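
Review note on the normalizer: its dispatch reduces to three outcomes. Message results are returned, `run.failed` raises, and everything else is logged and dropped. The sketch below mirrors that shape with plain dicts and tuples instead of the `provider_message` models. `normalize` and `RunFailed` are hypothetical names for illustration, not the host API.

```python
from __future__ import annotations

from typing import Any


class RunFailed(Exception):
    """Hypothetical stand-in for RunnerExecutionError."""


def normalize(result: dict[str, Any]) -> tuple[str, dict[str, Any]] | None:
    """Return ('chunk', ...) or ('message', ...) for message results, else None."""
    result_type = result.get('type')
    if not result_type:
        raise ValueError('Missing result type')
    data = result.get('data', {})

    if result_type == 'message.delta':
        chunk = data.get('chunk')
        if not chunk:
            raise ValueError('message.delta missing chunk data')
        return ('chunk', chunk)

    # run.completed may carry a final message; without one it is only a signal.
    if result_type == 'message.completed' or (result_type == 'run.completed' and 'message' in data):
        message = data.get('message')
        if not message:
            raise ValueError(f'{result_type} missing message data')
        return ('message', message)

    if result_type == 'run.failed':
        raise RunFailed(f"{data.get('error', 'Unknown error')} (code: {data.get('code', 'unknown')})")

    # tool.call.*, state.updated, artifact.created, action.requested and
    # unknown types produce nothing for the pipeline.
    return None


delta = normalize({'type': 'message.delta', 'data': {'chunk': {'content': 'hi'}}})
done = normalize({'type': 'run.completed', 'data': {}})
```

A bare `run.completed` returning `None` lets the caller distinguish a completion signal from a final message without inspecting the payload again.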
--- a/src/langbot/pkg/agent/runner/session_registry.py
+++ b/src/langbot/pkg/agent/runner/session_registry.py
@@ -0,0 +1,250 @@
+"""Agent run session registry for proxy action permission validation."""
+from __future__ import annotations
+
+import asyncio
+import typing
+import time
+import threading
+
+from .context_builder import AgentResources
+
+
+class AgentRunSessionStatus(typing.TypedDict):
+    """Status tracking for agent run session."""
+    started_at: int
+    last_activity_at: int
+
+
+class AgentRunSession(typing.TypedDict):
+    """Session for an active agent runner execution.
+
+    Stored in AgentRunSessionRegistry for proxy action permission validation.
+
+    Fields:
+        run_id: Unique run identifier (UUID from AgentRunContext)
+        runner_id: Runner descriptor ID (plugin:author/name/runner)
+        query_id: Pipeline query ID
+        plugin_identity: Plugin identifier (author/name) of the runner
+        conversation_id: Conversation ID for history/event access
+        resources: Authorized resources for this run (from AgentResources)
+        permissions: Runner permissions from descriptor (artifacts, history, events, etc.)
+        state_policy: State policy from binding (enable_state, state_scopes)
+        state_context: Context for state API (scope_keys, binding_identity, etc.)
+        status: Session status tracking
+        _authorized_ids: Pre-computed authorized resource IDs for O(1) lookup
+    """
+    run_id: str
+    runner_id: str
+    query_id: int | None
+    plugin_identity: str  # author/name
+    conversation_id: str | None
+    resources: AgentResources
+    permissions: dict[str, list[str]]
+    state_policy: dict[str, typing.Any]  # {enable_state: bool, state_scopes: list}
+    state_context: dict[str, typing.Any]  # {scope_keys: dict, binding_identity: str, ...}
+    status: AgentRunSessionStatus
+    _authorized_ids: dict[str, set[str]]  # Pre-computed sets for O(1) lookup
+
+
+class AgentRunSessionRegistry:
+    """Registry for active agent run sessions.
+
+    Host-owned registry for tracking active AgentRunner executions.
+    Used by proxy actions in handler.py to validate resource access.
+
+    Key: run_id (UUID from AgentRunContext)
+    Value: AgentRunSession with authorized resources
+
+    Coroutine-safe via asyncio.Lock; the lock does not guard against access from other threads.
+    """
+
+    _sessions: dict[str, AgentRunSession]
+    _lock: asyncio.Lock
+
+    def __init__(self):
+        self._sessions = {}
+        self._lock = asyncio.Lock()
+
+    async def register(
+        self,
+        run_id: str,
+        runner_id: str,
+        query_id: int | None,
+        plugin_identity: str,
+        resources: AgentResources,
+        conversation_id: str | None = None,
+        permissions: dict[str, list[str]] | None = None,
+        state_policy: dict[str, typing.Any] | None = None,
+        state_context: dict[str, typing.Any] | None = None,
+    ) -> None:
+        """Register a new agent run session.
+
+        Args:
+            run_id: Unique run identifier
+            runner_id: Runner descriptor ID
+            query_id: Pipeline query ID
+            plugin_identity: Plugin identifier (author/name)
+            resources: Authorized resources for this run
+            conversation_id: Conversation ID for history/event access
+            permissions: Runner permissions from descriptor (artifacts, history, events, etc.)
+            state_policy: State policy from binding (enable_state, state_scopes)
+            state_context: Context for state API (scope_keys, binding_identity, etc.)
+        """
+        now = int(time.time())
+
+        # Normalize permissions to empty dict if None
+        permissions = permissions or {}
+
+        # Normalize state_policy to defaults if None
+        if state_policy is None:
+            state_policy = {'enable_state': True, 'state_scopes': ['conversation', 'actor']}
+
+        # Normalize state_context to empty dict if None
+        state_context = state_context or {}
+
+        # Pre-compute authorized resource IDs for O(1) lookup
+        authorized_ids: dict[str, set[str]] = {
+            'model': {m['model_id'] for m in resources.get('models', []) if m.get('model_id')},
+            'tool': {t['tool_name'] for t in resources.get('tools', []) if t.get('tool_name')},
+            'knowledge_base': {kb['kb_id'] for kb in resources.get('knowledge_bases', []) if kb.get('kb_id')},
+            'file': {f['file_id'] for f in resources.get('files', []) if f.get('file_id')},
+        }
+
+        # NOTE: state_policy and state_context are stored at session top-level,
+        # NOT in resources. Resources should only contain resource authorization info.
+        session: AgentRunSession = {
+            'run_id': run_id,
+            'runner_id': runner_id,
+            'query_id': query_id,
+            'plugin_identity': plugin_identity,
+            'conversation_id': conversation_id,
+            'resources': resources,  # Original AgentResources, no state metadata mixed in
+            'permissions': permissions,
+            'state_policy': state_policy,
+            'state_context': state_context,
+            'status': {
+                'started_at': now,
+                'last_activity_at': now,
+            },
+            '_authorized_ids': authorized_ids,
+        }
+
+        async with self._lock:
+            self._sessions[run_id] = session
+
+    async def unregister(self, run_id: str) -> None:
+        """Unregister an agent run session.
+
+        Args:
+            run_id: Unique run identifier
+        """
+        async with self._lock:
+            if run_id in self._sessions:
+                del self._sessions[run_id]
+
+    async def get(self, run_id: str) -> AgentRunSession | None:
+        """Get session by run_id.
+
+        Args:
+            run_id: Unique run identifier
+
+        Returns:
+            AgentRunSession if found, None otherwise
+        """
+        async with self._lock:
+            return self._sessions.get(run_id)
+
+    async def update_activity(self, run_id: str) -> None:
+        """Update last activity timestamp for session.
+
+        Args:
+            run_id: Unique run identifier
+        """
+        async with self._lock:
+            if run_id in self._sessions:
+                self._sessions[run_id]['status']['last_activity_at'] = int(time.time())
+
+    def is_resource_allowed(
+        self,
+        session: AgentRunSession,
+        resource_type: str,
+        resource_id: str,
+    ) -> bool:
+        """Check if resource access is allowed for this session.
+
+        Uses pre-computed authorized IDs for O(1) lookup.
+
+        Args:
+            session: AgentRunSession to check
+            resource_type: Resource type ('model', 'tool', 'knowledge_base', 'storage', 'file')
+            resource_id: Resource identifier (model_id, tool_name, kb_id, 'plugin'/'workspace', file_id)
+
+        Returns:
+            True if resource is authorized, False otherwise
+        """
+        authorized_ids = session.get('_authorized_ids', {})
+
+        if resource_type in ('model', 'tool', 'knowledge_base', 'file'):
+            return resource_id in authorized_ids.get(resource_type, set())
+
+        if resource_type == 'storage':
+            storage = session['resources'].get('storage', {})
+            if resource_id == 'plugin':
+                return storage.get('plugin_storage', False)
+            elif resource_id == 'workspace':
+                return storage.get('workspace_storage', False)
+            return False
+
+        return False
+
+    async def list_active_runs(self) -> list[AgentRunSession]:
+        """List all active run sessions.
+
+        Returns:
+            List of active AgentRunSession dicts
+        """
+        async with self._lock:
+            return list(self._sessions.values())
+
+    async def cleanup_stale_sessions(self, max_age_seconds: int = 3600) -> int:
+        """Cleanup sessions that have been inactive for too long.
+
+        Args:
+            max_age_seconds: Maximum inactivity time in seconds (default 1 hour)
+
+        Returns:
+            Number of sessions cleaned up
+        """
+        now = int(time.time())
+        cleaned = 0
+
+        async with self._lock:
+            stale_run_ids = []
+            for run_id, session in self._sessions.items():
+                last_activity = session['status'].get('last_activity_at', 0)
+                if now - last_activity > max_age_seconds:
+                    stale_run_ids.append(run_id)
+
+            for run_id in stale_run_ids:
+                del self._sessions[run_id]
+                cleaned += 1
+
+        return cleaned
+
+
+# Global registry instance (singleton)
+_global_registry: AgentRunSessionRegistry | None = None
+_global_registry_lock = threading.Lock()
+
+
+def get_session_registry() -> AgentRunSessionRegistry:
+    """Get global session registry instance (thread-safe singleton).
+
+    Returns:
+        AgentRunSessionRegistry singleton
+    """
+    global _global_registry
+    with _global_registry_lock:
+        if _global_registry is None:
+            _global_registry = AgentRunSessionRegistry()
+        return _global_registry
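
A minimal, self-contained sketch of the pre-computed lookup that `register` and `is_resource_allowed` rely on above. The helper names `build_authorized_ids` and `is_allowed` and the sample `resources` dict are hypothetical stand-ins for illustration, not part of this change; the sketch assumes resources are plain dicts and skips entries that have no ID.

```python
from __future__ import annotations

import typing

# Maps a resource type to (list key in resources, ID field of each entry).
_ID_FIELDS = {
    'model': ('models', 'model_id'),
    'tool': ('tools', 'tool_name'),
    'knowledge_base': ('knowledge_bases', 'kb_id'),
    'file': ('files', 'file_id'),
}


def build_authorized_ids(resources: dict[str, typing.Any]) -> dict[str, set[str]]:
    """Build one set of IDs per resource type so later checks are set lookups."""
    return {
        rtype: {item[key] for item in resources.get(list_name, []) if item.get(key)}
        for rtype, (list_name, key) in _ID_FIELDS.items()
    }


def is_allowed(
    resources: dict[str, typing.Any],
    authorized_ids: dict[str, set[str]],
    resource_type: str,
    resource_id: str,
) -> bool:
    """Return True only for IDs granted at registration time."""
    if resource_type in authorized_ids:
        return resource_id in authorized_ids[resource_type]
    if resource_type == 'storage':
        # Storage is a pair of boolean flags rather than a list of IDs.
        flag = {'plugin': 'plugin_storage', 'workspace': 'workspace_storage'}.get(resource_id)
        return bool(flag and resources.get('storage', {}).get(flag, False))
    return False  # Unknown resource types are denied by default.


resources = {
    'models': [{'model_id': 'm-1'}, {}],
    'tools': [{'tool_name': 'search'}],
    'storage': {'plugin_storage': True, 'workspace_storage': False},
}
authorized = build_authorized_ids(resources)
print(is_allowed(resources, authorized, 'model', 'm-1'))
print(is_allowed(resources, authorized, 'tool', 'shell'))
```

Building the sets once per run keeps each proxy-action check independent of how many resources the binding grants, and the final `return False` means anything not explicitly granted is rejected.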
--- a/src/langbot/pkg/agent/runner/state_scope.py
+++ b/src/langbot/pkg/agent/runner/state_scope.py
@@ -0,0 +1,113 @@
+"""State scope key helpers for AgentRunner host-owned state."""
+from __future__ import annotations
+
+import typing
+
+from .descriptor import AgentRunnerDescriptor
+from .host_models import AgentBinding, AgentEventEnvelope
+
+
+VALID_STATE_SCOPES = ('conversation', 'actor', 'subject', 'runner')
+
+STATE_KEY_ALIASES = {
+    'conversation_id': 'external.conversation_id',
+}
+
+
+def normalize_state_key(key: str) -> str:
+    """Map accepted public aliases to protocol state keys."""
+    return STATE_KEY_ALIASES.get(key, key)
+
+
+def get_binding_identity(binding: AgentBinding) -> str:
+    """Return the stable binding identity used for state isolation."""
+    if binding.binding_id:
+        return binding.binding_id
+
+    scope = binding.scope
+    if scope.scope_type and scope.scope_id:
+        return f'{scope.scope_type}:{scope.scope_id}'
+
+    return 'unknown_binding'
+
+
+def build_state_scope_key(
+    scope: str,
+    event: AgentEventEnvelope,
+    binding: AgentBinding,
+    descriptor: AgentRunnerDescriptor,
+) -> str | None:
+    """Build the storage key for one state scope.
+
+    Returns None when the event lacks the identity required by that scope.
+    """
+    binding_identity = get_binding_identity(binding)
+
+    if scope == 'conversation':
+        if not event.conversation_id:
+            return None
+        parts = [descriptor.id, binding_identity, event.conversation_id]
+        if event.thread_id:
+            parts.append(event.thread_id)
+        return f'conversation:{":".join(parts)}'
+
+    if scope == 'actor':
+        if not event.actor or not event.actor.actor_id:
+            return None
+        parts = [
+            descriptor.id,
+            binding_identity,
+            event.actor.actor_type or 'user',
+            event.actor.actor_id,
+        ]
+        return f'actor:{":".join(parts)}'
+
+    if scope == 'subject':
+        if not event.subject or not event.subject.subject_id:
+            return None
+        parts = [
+            descriptor.id,
+            binding_identity,
+            event.subject.subject_type or 'unknown',
+            event.subject.subject_id,
+        ]
+        return f'subject:{":".join(parts)}'
+
+    if scope == 'runner':
+        return f'runner:{descriptor.id}:{binding_identity}'
+
+    return None
+
+
+def build_state_scope_keys(
+    event: AgentEventEnvelope,
+    binding: AgentBinding,
+    descriptor: AgentRunnerDescriptor,
+) -> dict[str, str]:
+    """Build all available scope keys for an event/binding pair."""
+    scope_keys: dict[str, str] = {}
+    for scope in VALID_STATE_SCOPES:
+        scope_key = build_state_scope_key(scope, event, binding, descriptor)
+        if scope_key:
+            scope_keys[scope] = scope_key
+    return scope_keys
+
+
+def build_state_context(
+    event: AgentEventEnvelope,
+    binding: AgentBinding,
+    descriptor: AgentRunnerDescriptor,
+) -> dict[str, typing.Any]:
+    """Build the State API context stored in the run session."""
+    return {
+        'scope_keys': build_state_scope_keys(event, binding, descriptor),
+        'binding_identity': get_binding_identity(binding),
+        'bot_id': event.bot_id,
+        'workspace_id': event.workspace_id,
+        'conversation_id': event.conversation_id,
+        'thread_id': event.thread_id,
+        'actor_type': event.actor.actor_type if event.actor else None,
+        'actor_id': event.actor.actor_id if event.actor else None,
+        'subject_type': event.subject.subject_type if event.subject else None,
+        'subject_id': event.subject.subject_id if event.subject else None,
+    }
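
To make the key layout produced by `build_state_scope_key` concrete, here is a small standalone sketch of the `conversation` and `runner` scopes. `DemoEvent`, `conversation_scope_key`, `runner_scope_key`, and the sample IDs are hypothetical; the real code takes `AgentEventEnvelope`, `AgentBinding`, and `AgentRunnerDescriptor` objects instead of plain strings.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DemoEvent:
    """Hypothetical stand-in carrying only the fields the conversation scope reads."""

    conversation_id: str | None = None
    thread_id: str | None = None


def conversation_scope_key(runner_id: str, binding_identity: str, event: DemoEvent) -> str | None:
    """Return the conversation-scoped key, or None when the event has no conversation."""
    if not event.conversation_id:
        return None
    parts = [runner_id, binding_identity, event.conversation_id]
    if event.thread_id:
        # Threads get their own state bucket inside the conversation.
        parts.append(event.thread_id)
    return 'conversation:' + ':'.join(parts)


def runner_scope_key(runner_id: str, binding_identity: str) -> str:
    """The runner scope needs no event identity, so it is always available."""
    return f'runner:{runner_id}:{binding_identity}'


key = conversation_scope_key('plugin:acme/demo/default', 'bot:42', DemoEvent('c1', 't1'))
print(key)  # conversation:plugin:acme/demo/default:bot:42:c1:t1
print(conversation_scope_key('plugin:acme/demo/default', 'bot:42', DemoEvent()))  # None
print(runner_scope_key('plugin:acme/demo/default', 'bot:42'))
```

Because the runner ID and binding identity come first in every key, two runners, or two bindings of the same runner, never share state even when they see the same conversation.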
--- a/src/langbot/pkg/agent/runner/transcript_store.py
+++ b/src/langbot/pkg/agent/runner/transcript_store.py
@@ -0,0 +1,290 @@
+"""Transcript store for writing and querying conversation history."""
+from __future__ import annotations
+
+import json
+import datetime
+import typing
+import uuid
+
+import sqlalchemy
+from sqlalchemy.ext.asyncio import AsyncEngine, AsyncSession
+from sqlalchemy.orm import sessionmaker
+
+from ...entity.persistence.transcript import Transcript
+
+
+class TranscriptStore:
+    """Store for Transcript records.
+
+    Handles writing transcript items and querying them for history API.
+    All methods are async and use the provided database engine.
+    """
+
+    engine: AsyncEngine
+
+    # Hard limits
+    MAX_CONTENT_LENGTH = 4000
+    HARD_LIMIT = 100
+
+    def __init__(self, engine: AsyncEngine):
+        self.engine = engine
+        self._session_factory = sessionmaker(
+            engine, class_=AsyncSession, expire_on_commit=False
+        )
+
+    async def append_transcript(
+        self,
+        transcript_id: str | None,
+        event_id: str,
+        conversation_id: str,
+        role: str,
+        content: str | None = None,
+        content_json: dict[str, typing.Any] | None = None,
+        artifact_refs: list[dict[str, typing.Any]] | None = None,
+        thread_id: str | None = None,
+        item_type: str = "message",
+        run_id: str | None = None,
+        runner_id: str | None = None,
+        metadata: dict[str, typing.Any] | None = None,
+    ) -> str:
+        """Append a transcript item.
+
+        Args:
+            transcript_id: Unique transcript ID (generated if None)
+            event_id: Source event ID
+            conversation_id: Conversation ID
+            role: Message role (user, assistant, system, tool)
+            content: Text content
+            content_json: Full structured content
+            artifact_refs: Artifact references
+            thread_id: Thread ID
+            item_type: Item type
+            run_id: Run ID that generated this
+            runner_id: Runner ID that generated this
+            metadata: Additional metadata
+
+        Returns:
+            The transcript_id
+        """
+        if transcript_id is None:
+            transcript_id = str(uuid.uuid4())
+
+        # Truncate content if too long
+        if content and len(content) > self.MAX_CONTENT_LENGTH:
+            content = content[:self.MAX_CONTENT_LENGTH - 3] + "..."
+
+        async with self._session_factory() as session:
+            item = Transcript(
+                transcript_id=transcript_id,
+                event_id=event_id,
+                conversation_id=conversation_id,
+                thread_id=thread_id,
+                role=role,
+                item_type=item_type,
+                content=content,
+                content_json=json.dumps(content_json) if content_json else None,
+                artifact_refs_json=json.dumps(artifact_refs) if artifact_refs else None,
+                seq=0,
+                run_id=run_id,
+                runner_id=runner_id,
+                created_at=datetime.datetime.utcnow(),
+                metadata_json=json.dumps(metadata) if metadata else None,
+            )
+            session.add(item)
+            await session.flush()
+            item.seq = item.id or await self._get_next_seq(conversation_id)
+            await session.commit()
+
+        return transcript_id
+
+    async def page_transcript(
+        self,
+        conversation_id: str,
+        before_seq: int | None = None,
+        after_seq: int | None = None,
+        limit: int = 50,
+        direction: str = "backward",
+        include_artifacts: bool = False,
+    ) -> tuple[list[dict[str, typing.Any]], int | None, int | None, bool]:
+        """Page through transcript items.
+
+        Args:
+            conversation_id: Conversation ID
+            before_seq: Get items before this sequence (backward)
+            after_seq: Get items after this sequence (forward)
+            limit: Maximum items to return (capped at 100)
+            direction: 'backward' (older) or 'forward' (newer)
+            include_artifacts: Include artifact refs
+
+        Returns:
+            Tuple of (items, next_seq, prev_seq, has_more)
+        """
+        limit = max(1, min(limit, self.HARD_LIMIT))
+
+        async with self._session_factory() as session:
+            query = sqlalchemy.select(Transcript).where(
+                Transcript.conversation_id == conversation_id
+            )
+
+            if direction == "backward" and before_seq is not None:
+                query = query.where(Transcript.seq < before_seq)
+                query = query.order_by(Transcript.seq.desc())
+            elif direction == "forward" and after_seq is not None:
+                query = query.where(Transcript.seq > after_seq)
+                query = query.order_by(Transcript.seq.asc())
+            else:
+                # Default: most recent items first (backward from latest)
+                query = query.order_by(Transcript.seq.desc())
+
+            query = query.limit(limit + 1)
+
+            result = await session.execute(query)
+            rows = result.scalars().all()
+
+            items = [self._row_to_dict(row, include_artifacts) for row in rows[:limit]]
+            has_more = len(rows) > limit
+
+            # Calculate cursors
+            next_seq = None
+            prev_seq = None
+
+            if direction == "backward":
+                # Items are in descending order
+                if items:
+                    next_seq = items[-1].get('seq') if has_more else None
+                    prev_seq = items[0].get('seq')
+            else:
+                # Items are in ascending order
+                if items:
+                    next_seq = items[-1].get('seq') if has_more else None
+                    prev_seq = items[0].get('seq')
+
+            return items, next_seq, prev_seq, has_more
+
+    async def search_transcript(
+        self,
+        conversation_id: str,
+        query_text: str,
+        filters: dict[str, typing.Any] | None = None,
+        top_k: int = 10,
+    ) -> list[dict[str, typing.Any]]:
+        """Search transcript items.
+
+        Basic implementation using LIKE filtering.
+
+        Args:
+            conversation_id: Conversation ID
+            query_text: Search query
+            filters: Optional filters
+            top_k: Maximum results
+
+        Returns:
+            List of matching items
+        """
+        async with self._session_factory() as session:
+            query = sqlalchemy.select(Transcript).where(
+                Transcript.conversation_id == conversation_id,
+                Transcript.content.ilike(f"%{query_text}%"),
+            )
+
+            # Apply additional filters
+            if filters:
+                if 'roles' in filters:
+                    query = query.where(Transcript.role.in_(filters['roles']))
+                if 'item_types' in filters:
+                    query = query.where(Transcript.item_type.in_(filters['item_types']))
+
+            query = query.order_by(Transcript.seq.desc()).limit(top_k)
+
+            result = await session.execute(query)
+            rows = result.scalars().all()
+
+            return [self._row_to_dict(row, include_artifacts=True) for row in rows]
+
+    async def get_latest_cursor(
+        self,
+        conversation_id: str,
+    ) -> str | None:
+        """Get the latest cursor for a conversation.
+
+        Args:
+            conversation_id: Conversation ID
+
+        Returns:
+            Cursor string (seq number), or None if no items
+        """
+        async with self._session_factory() as session:
+            result = await session.execute(
+                sqlalchemy.select(Transcript.seq)
+                .where(Transcript.conversation_id == conversation_id)
+                .order_by(Transcript.seq.desc())
+                .limit(1)
+            )
+            row = result.scalars().first()
+            if row is None:
+                return None
+            return str(row)
+
+    async def has_history_before(
+        self,
+        conversation_id: str,
+        seq: int,
+    ) -> bool:
+        """Check if there is history before a sequence number.
+
+        Args:
+            conversation_id: Conversation ID
+            seq: Sequence number
+
+        Returns:
+            True if there are items before
+        """
+        async with self._session_factory() as session:
+            result = await session.execute(
+                sqlalchemy.select(sqlalchemy.func.count())
+                .select_from(Transcript)
+                .where(
+                    Transcript.conversation_id == conversation_id,
+                    Transcript.seq < seq,
+                )
+            )
+            count = result.scalar()
+            return (count or 0) > 0
+
+    async def _get_next_seq(self, conversation_id: str) -> int:
+        """Fallback next sequence number for stores that cannot expose autoincrement IDs."""
+        async with self._session_factory() as session:
+            result = await session.execute(
+                sqlalchemy.select(sqlalchemy.func.max(Transcript.seq))
+                .where(Transcript.conversation_id == conversation_id)
+            )
+            max_seq = result.scalar()
+            return (max_seq or 0) + 1
+
+    def _row_to_dict(
+        self,
+        row: Transcript,
+        include_artifacts: bool = False,
+    ) -> dict[str, typing.Any]:
+        """Convert a Transcript row to dict."""
+        result = {
+            'transcript_id': row.transcript_id,
+            'event_id': row.event_id,
+            'conversation_id': row.conversation_id,
+            'thread_id': row.thread_id,
+            'role': row.role,
+            'item_type': row.item_type,
+            'content': row.content,
+            'content_json': json.loads(row.content_json) if row.content_json else None,
+            'seq': row.seq,
+            'cursor': str(row.seq),
+            'created_at': int(row.created_at.replace(tzinfo=row.created_at.tzinfo or datetime.timezone.utc).timestamp()) if row.created_at else None,
+            'metadata': json.loads(row.metadata_json) if row.metadata_json else {},
+        }
+
+        if include_artifacts and row.artifact_refs_json:
+            result['artifact_refs'] = json.loads(row.artifact_refs_json)
+        else:
+            result['artifact_refs'] = []
+
+        return result
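
The backward paging in `page_transcript` fetches `limit + 1` rows to detect whether another page exists. The sketch below reproduces that cursor logic over an in-memory list of sequence numbers; `page_backward` and the sample data are hypothetical, and a sorted list stands in for the SQL query.

```python
from __future__ import annotations

HARD_LIMIT = 100  # Mirrors TranscriptStore.HARD_LIMIT.


def page_backward(
    seqs: list[int],
    before_seq: int | None = None,
    limit: int = 50,
) -> tuple[list[int], int | None, int | None, bool]:
    """Return (items, next_seq, prev_seq, has_more), newest first."""
    limit = min(limit, HARD_LIMIT)
    # Equivalent of WHERE seq < before_seq ORDER BY seq DESC.
    candidates = sorted(
        (s for s in seqs if before_seq is None or s < before_seq),
        reverse=True,
    )
    # Fetch one extra row: its presence signals that an older page exists.
    rows = candidates[: limit + 1]
    items = rows[:limit]
    has_more = len(rows) > limit
    next_seq = items[-1] if items and has_more else None
    prev_seq = items[0] if items else None
    return items, next_seq, prev_seq, has_more


history = [1, 2, 3, 4, 5]
first = page_backward(history, limit=2)
second = page_backward(history, before_seq=first[1], limit=2)
third = page_backward(history, before_seq=second[1], limit=2)
print(first)   # ([5, 4], 4, 5, True)
print(second)  # ([3, 2], 2, 3, True)
print(third)   # ([1], None, 1, False)
```

A caller keeps passing the returned `next_seq` as `before_seq` until `has_more` is false, so no separate count query is needed.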
--- a/src/langbot/pkg/api/http/controller/groups/box.py
+++ b/src/langbot/pkg/api/http/controller/groups/box.py
@@ -1,22 +0,0 @@
-from __future__ import annotations
-
-from .. import group
-
-
-@group.group_class('box', '/api/v1/box')
-class BoxRouterGroup(group.RouterGroup):
-    async def initialize(self) -> None:
-        @self.route('/status', methods=['GET'], auth_type=group.AuthType.USER_TOKEN)
-        async def _() -> str:
-            status = await self.ap.box_service.get_status()
-            return self.success(data=status)
-
-        @self.route('/sessions', methods=['GET'], auth_type=group.AuthType.USER_TOKEN)
-        async def _() -> str:
-            sessions = await self.ap.box_service.get_sessions()
-            return self.success(data=sessions)
-
-        @self.route('/errors', methods=['GET'], auth_type=group.AuthType.USER_TOKEN)
-        async def _() -> str:
-            errors = self.ap.box_service.get_recent_errors()
-            return self.success(data=errors)
--- a/src/langbot/pkg/api/http/controller/groups/extensions.py
+++ b/src/langbot/pkg/api/http/controller/groups/extensions.py
@@ -1,52 +0,0 @@
-from __future__ import annotations
-
-import asyncio
-import quart
-
-from .. import group
-
-
-@group.group_class('extensions', '/api/v1/extensions')
-class ExtensionsRouterGroup(group.RouterGroup):
-    """Unified API for installed extensions (plugins, MCP servers, skills)."""
-
-    async def initialize(self) -> None:
-        @self.route('', methods=['GET'], auth_type=group.AuthType.USER_TOKEN_OR_API_KEY)
-        async def _() -> quart.Response:
-            plugins, mcp_servers, skills = await asyncio.gather(
-                self.ap.plugin_connector.list_plugins(),
-                self.ap.mcp_service.get_mcp_servers(contain_runtime_info=True),
-                self.ap.skill_service.list_skills(),
-                return_exceptions=True,
-            )
-
-            def _sort_key(item: dict) -> str:
-                if item['type'] == 'plugin':
-                    return (
-                        item['plugin']
-                        .get('manifest', {})
-                        .get('manifest', {})
-                        .get('metadata', {})
-                        .get('name', '')
-                        .lower()
-                    )
-                if item['type'] == 'mcp':
-                    return (item['server'].get('name') or '').lower()
-                if item['type'] == 'skill':
-                    return (item['skill'].get('display_name') or item['skill'].get('name') or '').lower()
-                return ''
-
-            extensions: list[dict] = []
-            if isinstance(plugins, list):
-                for plugin in plugins:
-                    extensions.append({'type': 'plugin', 'plugin': plugin})
-            if isinstance(mcp_servers, list):
-                for server in mcp_servers:
-                    extensions.append({'type': 'mcp', 'server': server})
-            if isinstance(skills, list):
-                for skill in skills:
-                    extensions.append({'type': 'skill', 'skill': skill})
-
-            extensions.sort(key=_sort_key)
-
-            return self.success(data={'extensions': extensions})
--- a/src/langbot/pkg/api/http/controller/groups/pipelines/pipelines.py
+++ b/src/langbot/pkg/api/http/controller/groups/pipelines/pipelines.py
@@ -73,21 +73,15 @@ class PipelinesRouterGroup(group.RouterGroup):
                plugins = await self.ap.plugin_connector.list_plugins(component_kinds=pipeline_component_kinds)
                mcp_servers = await self.ap.mcp_service.get_mcp_servers(contain_runtime_info=True)

-                # Get available skills
-                available_skills = await self.ap.skill_service.list_skills()
-
                extensions_prefs = pipeline.get('extensions_preferences', {})
                return self.success(
                    data={
                        'enable_all_plugins': extensions_prefs.get('enable_all_plugins', True),
                        'enable_all_mcp_servers': extensions_prefs.get('enable_all_mcp_servers', True),
-                        'enable_all_skills': extensions_prefs.get('enable_all_skills', True),
                        'bound_plugins': extensions_prefs.get('plugins', []),
                        'available_plugins': plugins,
                        'bound_mcp_servers': extensions_prefs.get('mcp_servers', []),
                        'available_mcp_servers': mcp_servers,
-                        'bound_skills': extensions_prefs.get('skills', []),
-                        'available_skills': available_skills,
                    }
                )
            elif quart.request.method == 'PUT':
@@ -95,19 +89,11 @@ class PipelinesRouterGroup(group.RouterGroup):
                json_data = await quart.request.json
                enable_all_plugins = json_data.get('enable_all_plugins', True)
                enable_all_mcp_servers = json_data.get('enable_all_mcp_servers', True)
-                enable_all_skills = json_data.get('enable_all_skills', True)
                bound_plugins = json_data.get('bound_plugins', [])
                bound_mcp_servers = json_data.get('bound_mcp_servers', [])
-                bound_skills = json_data.get('bound_skills', [])

                await self.ap.pipeline_service.update_pipeline_extensions(
-                    pipeline_uuid,
-                    bound_plugins,
-                    bound_mcp_servers,
-                    enable_all_plugins,
-                    enable_all_mcp_servers,
-                    bound_skills=bound_skills,
-                    enable_all_skills=enable_all_skills,
+                    pipeline_uuid, bound_plugins, bound_mcp_servers, enable_all_plugins, enable_all_mcp_servers
                )

                return self.success()
--- a/src/langbot/pkg/api/http/controller/groups/pipelines/websocket_chat.py
+++ b/src/langbot/pkg/api/http/controller/groups/pipelines/websocket_chat.py
@@ -43,12 +43,8 @@ class WebSocketChatRouterGroup(group.RouterGroup):
                    await quart.websocket.send(json.dumps({'type': 'error', 'message': 'WebSocket adapter not found'}))
                    return

-                # Dashboard pipeline-debug sessions must always run under the
-                # built-in websocket_proxy_bot identity. We deliberately do NOT
-                # resolve a web_page_bot owner here — even if one is bound to
-                # the same pipeline, debug requests must not be attributed to
-                # it. The embed widget path (`/api/v1/embed/<bot>/ws/connect`)
-                # is the one that carries the page-bot identity.
+                # Find the owning bot for this pipeline (e.g. a web_page_bot)
+                owner_bot = self._find_owner_bot(pipeline_uuid)

                 # Register the connection
                connection = await ws_connection_manager.add_connection(
@@ -77,7 +73,7 @@ class WebSocketChatRouterGroup(group.RouterGroup):
                )

                 # Create the receive and send tasks
-                receive_task = asyncio.create_task(self._handle_receive(connection, websocket_adapter))
+                receive_task = asyncio.create_task(self._handle_receive(connection, websocket_adapter, owner_bot))
                send_task = asyncio.create_task(self._handle_send(connection))

                 # Wait for the tasks to complete
@@ -185,7 +181,14 @@ class WebSocketChatRouterGroup(group.RouterGroup):
            except Exception as e:
                return self.http_status(500, -1, f'Internal server error: {str(e)}')

-    async def _handle_receive(self, connection, websocket_adapter):
+    def _find_owner_bot(self, pipeline_uuid: str):
+        """Find a user-created bot (e.g. web_page_bot) that owns this pipeline."""
+        for bot in self.ap.platform_mgr.bots:
+            if bot.bot_entity.adapter == 'web_page_bot' and bot.bot_entity.use_pipeline_uuid == pipeline_uuid:
+                return bot
+        return None
+
+    async def _handle_receive(self, connection, websocket_adapter, owner_bot=None):
         """Task that handles incoming messages."""
        try:
            while connection.is_active:
@@ -210,10 +213,7 @@ class WebSocketChatRouterGroup(group.RouterGroup):
                        logger.debug(f'收到消息: {data} from {connection.connection_id}')

                         # Handle the message (do not wait for a response; responses are sent asynchronously via broadcast)
-                        # owner_bot is intentionally NOT passed: the dashboard
-                        # debug WebSocket must always run under the proxy bot,
-                        # never under a coincidentally-bound web_page_bot.
-                        await websocket_adapter.handle_websocket_message(connection, data)
+                        await websocket_adapter.handle_websocket_message(connection, data, owner_bot=owner_bot)

                    elif message_type == 'disconnect':
                         # Client-initiated disconnect
--- a/src/langbot/pkg/api/http/controller/groups/plugins.py
+++ b/src/langbot/pkg/api/http/controller/groups/plugins.py
@@ -1,15 +1,11 @@
 from __future__ import annotations

 import base64
-import io
 import quart
 import re
 import httpx
 import uuid
 import os
-import zipfile
-import yaml
-from urllib.parse import urlparse
 import posixpath
 import sqlalchemy

@@ -57,97 +53,6 @@ def _get_request_origin() -> str:

@group.group_class('plugins', '/api/v1/plugins')
 class PluginsRouterGroup(group.RouterGroup):
-    @staticmethod
-    def _normalize_archive_path(path: str) -> str:
-        normalized = str(path or '').replace('\\', '/').strip('/')
-        return posixpath.normpath(normalized) if normalized else ''
-
-    @classmethod
-    def _component_source_path(cls, entry) -> str:
-        if isinstance(entry, dict):
-            return cls._normalize_archive_path(entry.get('path') or '')
-        return cls._normalize_archive_path(str(entry or ''))
-
-    @classmethod
-    def _count_component_configs(cls, component_config, archive_names: list[str]) -> int:
-        normalized_names = [cls._normalize_archive_path(name) for name in archive_names]
-        component_files: set[str] = set()
-
-        if isinstance(component_config, list):
-            return len(component_config)
-        if not isinstance(component_config, dict):
-            return 1 if component_config else 0
-
-        for entry in component_config.get('fromFiles') or []:
-            source_path = cls._component_source_path(entry)
-            if source_path and source_path in normalized_names:
-                component_files.add(source_path)
-
-        for entry in component_config.get('fromDirs') or []:
-            source_dir = cls._component_source_path(entry).rstrip('/')
-            if not source_dir:
-                continue
-            prefix = f'{source_dir}/'
-            for archive_name in normalized_names:
-                if not archive_name.startswith(prefix):
-                    continue
-                if archive_name.lower().endswith(('.yaml', '.yml')):
-                    component_files.add(archive_name)
-
-        if component_files:
-            return len(component_files)
-
-        return 1 if any(key in component_config for key in ('path', 'name', 'kind')) else 0
-
-    @classmethod
-    def _count_plugin_components(cls, components, archive_names: list[str]) -> dict[str, int]:
-        if not isinstance(components, dict):
-            return {}
-
-        component_counts: dict[str, int] = {}
-        for kind, component_config in components.items():
-            count = cls._count_component_configs(component_config, archive_names)
-            if count > 0:
-                component_counts[str(kind)] = count
-        return component_counts
-
-    @staticmethod
-    def _parse_github_repo_url(repo_url: str) -> dict | None:
-        raw_url = str(repo_url or '').strip()
-        if not raw_url:
-            return None
-
-        if not re.match(r'^[a-zA-Z][a-zA-Z0-9+.-]*://', raw_url):
-            raw_url = f'https://{raw_url}'
-
-        parsed = urlparse(raw_url)
-        if parsed.netloc.lower() not in ('github.com', 'www.github.com'):
-            return None
-
-        parts = [part for part in parsed.path.strip('/').split('/') if part]
-        if len(parts) < 2:
-            return None
-
-        owner = parts[0]
-        repo = parts[1]
-        if repo.endswith('.git'):
-            repo = repo[:-4]
-        if not owner or not repo:
-            return None
-
-        ref = ''
-        subdir = ''
-        if len(parts) >= 4 and parts[2] in ('tree', 'blob'):
-            ref = parts[3]
-            subdir = '/'.join(parts[4:]).strip('/')
-
-        return {
-            'owner': owner,
-            'repo': repo,
-            'ref': ref,
-            'subdir': subdir,
-        }
-
    async def _check_extensions_limit(self) -> str | None:
        """Check if extensions limit is reached. Returns error response if limit exceeded, None otherwise."""
        limitation = self.ap.instance_config.data.get('system', {}).get('limitation', {})
@@ -349,37 +254,17 @@ class PluginsRouterGroup(group.RouterGroup):
            data = await quart.request.json
            repo_url = data.get('repo_url', '')

-            parsed_repo = self._parse_github_repo_url(repo_url)
-            if not parsed_repo:
+            # Parse GitHub repository URL to extract owner and repo
+            # Supports: https://github.com/owner/repo or github.com/owner/repo
+            pattern = r'github\.com/([^/]+)/([^/]+?)(?:\.git)?(?:/.*)?$'
+            match = re.search(pattern, repo_url)
+
+            if not match:
                return self.http_status(400, -1, 'Invalid GitHub repository URL')

-            owner = parsed_repo['owner']
-            repo = parsed_repo['repo']
-            requested_ref = parsed_repo['ref']
-            requested_subdir = parsed_repo['subdir']
+            owner, repo = match.groups()

            try:
-                if requested_ref:
-                    return self.success(
-                        data={
-                            'releases': [
-                                {
-                                    'id': 0,
-                                    'tag_name': requested_ref,
-                                    'name': requested_ref,
-                                    'published_at': '',
-                                    'prerelease': False,
-                                    'draft': False,
-                                    'source_type': 'branch',
-                                    'archive_url': f'https://api.github.com/repos/{owner}/{repo}/zipball/{requested_ref}',
-                                }
-                            ],
-                            'owner': owner,
-                            'repo': repo,
-                            'source_subdir': requested_subdir,
-                        }
-                    )
-
                # Fetch releases from GitHub API
                url = f'https://api.github.com/repos/{owner}/{repo}/releases'
                async with httpx.AsyncClient(
@@ -405,14 +290,7 @@ class PluginsRouterGroup(group.RouterGroup):
                        }
                    )

-                return self.success(
-                    data={
-                        'releases': formatted_releases,
-                        'owner': owner,
-                        'repo': repo,
-                        'source_subdir': requested_subdir,
-                    }
-                )
+                return self.success(data={'releases': formatted_releases, 'owner': owner, 'repo': repo})
            except httpx.RequestError as e:
                return self.http_status(500, -1, f'Failed to fetch releases: {str(e)}')

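Note on the URL parsing above: the regex that replaces `_parse_github_repo_url` can be exercised on its own. The pattern is copied from the hunk; the `parse_owner_repo` wrapper name is hypothetical and exists only for this sketch.

```python
import re

# Pattern copied verbatim from the hunk above.
GITHUB_REPO_PATTERN = r'github\.com/([^/]+)/([^/]+?)(?:\.git)?(?:/.*)?$'


def parse_owner_repo(repo_url: str):
    """Return (owner, repo), or None when the URL does not match."""
    match = re.search(GITHUB_REPO_PATTERN, repo_url)
    return match.groups() if match else None


plain = parse_owner_repo('https://github.com/owner/repo')
bare = parse_owner_repo('github.com/owner/repo.git')
with_tree = parse_owner_repo('https://github.com/owner/repo/tree/main/plugins/demo')
no_repo = parse_owner_repo('https://github.com/owner')
other_host = parse_owner_repo('https://gitlab.com/owner/repo')
```

Two behaviours follow from the pattern itself. Anything after the repository segment, such as `/tree/<ref>/<subdir>`, is matched and discarded, so a branch or subdirectory in the URL no longer influences the result. And because `re.search` is not anchored on the left, a host that merely ends in `github.com` also matches.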
@@ -567,62 +445,6 @@ class PluginsRouterGroup(group.RouterGroup):

            return self.success(data={'task_id': wrapper.id})

-        @self.route('/install/local/preview', methods=['POST'], auth_type=group.AuthType.USER_TOKEN_OR_API_KEY)
-        async def _() -> str:
-            file = (await quart.request.files).get('file')
-            if file is None:
-                return self.http_status(400, -1, 'file is required')
-
-            file_bytes = file.read()
-            try:
-                with zipfile.ZipFile(io.BytesIO(file_bytes)) as zf:
-                    names = [name for name in zf.namelist() if not name.endswith('/')]
-                    manifest_name = next(
-                        (
-                            name
-                            for name in names
-                            if name.replace('\\', '/').strip('/').lower() in ('manifest.yaml', 'manifest.yml')
-                        ),
-                        None,
-                    )
-                    if manifest_name is None:
-                        return self.http_status(400, -1, 'manifest.yaml is required')
-
-                    manifest = yaml.safe_load(zf.read(manifest_name).decode('utf-8')) or {}
-                    requirements: list[str] = []
-                    requirements_name = next(
-                        (name for name in names if name.replace('\\', '/').strip('/').lower() == 'requirements.txt'),
-                        None,
-                    )
-                    if requirements_name is not None:
-                        requirements = [
-                            line.strip()
-                            for line in zf.read(requirements_name).decode('utf-8', errors='ignore').splitlines()
-                            if line.strip() and not line.strip().startswith('#')
-                        ]
-
-                    spec = manifest.get('spec') or {}
-                    components = spec.get('components') or {}
-                    component_counts = self._count_plugin_components(components, names)
-                    component_types = list(component_counts.keys())
-
-                    return self.success(
-                        data={
-                            'filename': file.filename or 'local plugin',
-                            'size': len(file_bytes),
-                            'manifest': manifest,
-                            'metadata': manifest.get('metadata') or {},
-                            'component_types': component_types,
-                            'component_counts': component_counts,
-                            'requirements': requirements,
-                            'file_count': len(names),
-                        }
-                    )
-            except zipfile.BadZipFile:
-                return self.http_status(400, -1, 'invalid .lbpkg file')
-            except Exception as exc:
-                return self.http_status(500, -1, f'Failed to preview plugin package: {exc}')
-
        @self.route('/config-files', methods=['POST'], auth_type=group.AuthType.USER_TOKEN)
        async def _() -> str:
            """Upload a file for plugin configuration"""
--- a/src/langbot/pkg/api/http/controller/groups/resources/mcp.py
+++ b/src/langbot/pkg/api/http/controller/groups/resources/mcp.py
@@ -31,9 +31,6 @@ class MCPRouterGroup(group.RouterGroup):
        @self.route('/servers/<server_name>', methods=['GET', 'PUT', 'DELETE'], auth_type=group.AuthType.USER_TOKEN)
        async def _(server_name: str) -> str:
            """Get, update, or delete an MCP server configuration"""
-            from urllib.parse import unquote
-
-            server_name = unquote(server_name)

            server_data = await self.ap.mcp_service.get_mcp_server_by_name(server_name)
            if server_data is None:
@@ -60,9 +57,6 @@ class MCPRouterGroup(group.RouterGroup):
        @self.route('/servers/<server_name>/test', methods=['POST'], auth_type=group.AuthType.USER_TOKEN)
        async def _(server_name: str) -> str:
            """Test the MCP server connection"""
-            from urllib.parse import unquote
-
-            server_name = unquote(server_name)
            server_data = await quart.request.json
            task_id = await self.ap.mcp_service.test_mcp_server(server_name=server_name, server_data=server_data)
            return self.success(data={'task_id': task_id})
--- a/src/langbot/pkg/api/http/controller/groups/skills.py
+++ b/src/langbot/pkg/api/http/controller/groups/skills.py
@@ -1,190 +0,0 @@
-from __future__ import annotations
-
-import quart
-
-from langbot_plugin.box.errors import BoxError
-
-from .. import group
-
-
-@group.group_class('skills', '/api/v1/skills')
-class SkillsRouterGroup(group.RouterGroup):
-    """Skills management API endpoints."""
-
-    async def initialize(self) -> None:
-        @self.route('', methods=['GET', 'POST'], auth_type=group.AuthType.USER_TOKEN_OR_API_KEY)
-        async def list_or_create_skills() -> quart.Response:
-            if quart.request.method == 'GET':
-                try:
-                    skills = await self.ap.skill_service.list_skills()
-                except (ValueError, BoxError) as exc:
-                    return self.http_status(400, -1, str(exc))
-                return self.success(data={'skills': skills})
-
-            data = await quart.request.json
-            if 'name' not in data or not data['name']:
-                return self.http_status(400, -1, 'Missing required field: name')
-
-            try:
-                skill = await self.ap.skill_service.create_skill(data)
-                return self.success(data={'skill': skill})
-            except (ValueError, BoxError) as exc:
-                return self.http_status(400, -1, str(exc))
-
-        @self.route('/<skill_name>', methods=['GET', 'PUT', 'DELETE'], auth_type=group.AuthType.USER_TOKEN_OR_API_KEY)
-        async def get_update_delete_skill(skill_name: str) -> quart.Response:
-            if quart.request.method == 'GET':
-                try:
-                    skill = await self.ap.skill_service.get_skill(skill_name)
-                except (ValueError, BoxError) as exc:
-                    return self.http_status(400, -1, str(exc))
-                if not skill:
-                    return self.http_status(404, -1, 'Skill not found')
-                return self.success(data={'skill': skill})
-
-            if quart.request.method == 'PUT':
-                data = await quart.request.json
-                try:
-                    skill = await self.ap.skill_service.update_skill(skill_name, data)
-                    return self.success(data={'skill': skill})
-                except (ValueError, BoxError) as exc:
-                    return self.http_status(400, -1, str(exc))
-
-            try:
-                await self.ap.skill_service.delete_skill(skill_name)
-                return self.success()
-            except (ValueError, BoxError) as exc:
-                return self.http_status(400, -1, str(exc))
-
-        @self.route('/<skill_name>/files', methods=['GET'], auth_type=group.AuthType.USER_TOKEN_OR_API_KEY)
-        async def list_skill_files(skill_name: str) -> quart.Response:
-            """List files in skill package directory."""
-            path = quart.request.args.get('path', '.').strip()
-            include_hidden = quart.request.args.get('include_hidden', 'false').lower() == 'true'
-
-            try:
-                result = await self.ap.skill_service.list_skill_files(
-                    skill_name,
-                    path=path,
-                    include_hidden=include_hidden,
-                )
-                return self.success(data=result)
-            except (ValueError, BoxError) as exc:
-                return self.http_status(400, -1, str(exc))
-
-        @self.route(
-            '/<skill_name>/files/<path:path>', methods=['GET', 'PUT'], auth_type=group.AuthType.USER_TOKEN_OR_API_KEY
-        )
-        async def read_or_write_skill_file(skill_name: str, path: str) -> quart.Response:
-            """Read or write a file in skill package."""
-            if quart.request.method == 'GET':
-                try:
-                    result = await self.ap.skill_service.read_skill_file(skill_name, path)
-                    return self.success(data=result)
-                except (ValueError, BoxError) as exc:
-                    return self.http_status(400, -1, str(exc))
-
-            # PUT - write file
-            data = await quart.request.json
-            content = data.get('content', '')
-            if content is None:
-                return self.http_status(400, -1, 'Missing required field: content')
-
-            try:
-                result = await self.ap.skill_service.write_skill_file(skill_name, path, content)
-                return self.success(data=result)
-            except (ValueError, BoxError) as exc:
-                return self.http_status(400, -1, str(exc))
-
-        @self.route('/<skill_name>/preview', methods=['GET'], auth_type=group.AuthType.USER_TOKEN_OR_API_KEY)
-        async def preview_skill(skill_name: str) -> quart.Response:
-            skill = self.ap.skill_mgr.get_skill_by_name(skill_name)
-            if not skill:
-                return self.http_status(404, -1, 'Skill not found')
-            return self.success(data={'instructions': skill.get('instructions', '')})
-
-        @self.route('/install/github', methods=['POST'], auth_type=group.AuthType.USER_TOKEN_OR_API_KEY)
-        async def install_skill_from_github() -> quart.Response:
-            data = await quart.request.json
-            required_fields = ['asset_url', 'owner', 'repo']
-            for field in required_fields:
-                if field not in data or not data[field]:
-                    return self.http_status(400, -1, f'Missing required field: {field}')
-            asset_url = str(data['asset_url']).strip().lower().split('?', 1)[0].split('#', 1)[0]
-            if not asset_url.endswith('skill.md') and not data.get('release_tag'):
-                return self.http_status(400, -1, 'Missing required field: release_tag')
-
-            try:
-                skill = await self.ap.skill_service.install_from_github(data)
-                return self.success(data={'skills': skill})
-            except (ValueError, BoxError) as exc:
-                return self.http_status(400, -1, str(exc))
-            except Exception as exc:
-                return self.http_status(500, -1, f'Failed to install skill: {exc}')
-
-        @self.route('/install/github/preview', methods=['POST'], auth_type=group.AuthType.USER_TOKEN_OR_API_KEY)
-        async def preview_skill_from_github() -> quart.Response:
-            data = await quart.request.json
-            required_fields = ['asset_url', 'owner', 'repo']
-            for field in required_fields:
-                if field not in data or not data[field]:
-                    return self.http_status(400, -1, f'Missing required field: {field}')
-            asset_url = str(data['asset_url']).strip().lower().split('?', 1)[0].split('#', 1)[0]
-            if not asset_url.endswith('skill.md') and not data.get('release_tag'):
-                return self.http_status(400, -1, 'Missing required field: release_tag')
-
-            try:
-                preview = await self.ap.skill_service.preview_install_from_github(data)
-                return self.success(data={'skills': preview})
-            except (ValueError, BoxError) as exc:
-                return self.http_status(400, -1, str(exc))
-            except Exception as exc:
-                return self.http_status(500, -1, f'Failed to preview skill: {exc}')
-
-        @self.route('/install/upload', methods=['POST'], auth_type=group.AuthType.USER_TOKEN_OR_API_KEY)
-        async def install_skill_from_upload() -> quart.Response:
-            file = (await quart.request.files).get('file')
-            if file is None:
-                return self.http_status(400, -1, 'file is required')
-            form = await quart.request.form
-
-            try:
-                skill = await self.ap.skill_service.install_from_zip_upload(
-                    file_bytes=file.read(),
-                    filename=file.filename or '',
-                    source_paths=form.getlist('source_paths'),
-                )
-                return self.success(data={'skills': skill})
-            except (ValueError, BoxError) as exc:
-                return self.http_status(400, -1, str(exc))
-            except Exception as exc:
-                return self.http_status(500, -1, f'Failed to install skill: {exc}')
-
-        @self.route('/install/upload/preview', methods=['POST'], auth_type=group.AuthType.USER_TOKEN_OR_API_KEY)
-        async def preview_skill_from_upload() -> quart.Response:
-            file = (await quart.request.files).get('file')
-            if file is None:
-                return self.http_status(400, -1, 'file is required')
-
-            try:
-                preview = await self.ap.skill_service.preview_install_from_zip_upload(
-                    file_bytes=file.read(),
-                    filename=file.filename or '',
-                )
-                return self.success(data={'skills': preview})
-            except (ValueError, BoxError) as exc:
-                return self.http_status(400, -1, str(exc))
-            except Exception as exc:
-                return self.http_status(500, -1, f'Failed to preview skill: {exc}')
-
-        @self.route('/scan', methods=['GET'], auth_type=group.AuthType.USER_TOKEN_OR_API_KEY)
-        async def scan_skill_directory() -> quart.Response:
-            path = quart.request.args.get('path', '').strip()
-            if not path:
-                return self.http_status(400, -1, 'Missing required parameter: path')
-
-            try:
-                result = await self.ap.skill_service.scan_directory_async(path)
-                return self.success(data=result)
-            except (ValueError, BoxError) as exc:
-                return self.http_status(400, -1, str(exc))
--- a/src/langbot/pkg/api/http/controller/groups/system.py
+++ b/src/langbot/pkg/api/http/controller/groups/system.py
@@ -140,6 +140,17 @@ class SystemRouterGroup(group.RouterGroup):
        async def _() -> str:
            return self.success(data=await self.ap.maintenance_service.get_storage_analysis())

+        @self.route('/debug/exec', methods=['POST'], auth_type=group.AuthType.USER_TOKEN)
+        async def _() -> str:
+            if not constants.debug_mode:
+                return self.http_status(403, 403, 'Forbidden')
+
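+            py_code = await quart.request.data
+            # exec() always returns None, so the snippet reports back by assigning `result`.
+            scope = {'ap': self.ap}
+            exec(py_code, scope)
+            return self.success(data=scope.get('result'))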
+
        @self.route(
            '/debug/plugin/action',
            methods=['POST'],
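
Note on `/debug/exec`: Python's built-in `exec()` always returns `None`, so a value can only leave the executed snippet through the namespace passed to it. The sketch below shows one way to do that. The `run_snippet` name and the convention of reading a `result` variable are assumptions made for illustration, not part of the project's API.

```python
def run_snippet(py_code, ap):
    """Execute py_code with `ap` in scope and return whatever it assigned to `result`."""
    scope = {'ap': ap}
    exec(py_code, scope)  # exec() itself always returns None
    return scope.get('result')


# `ap` is a plain dict here; in the endpoint it would be the application object.
value = run_snippet("result = ap['name'].upper()", {'name': 'langbot'})
nothing = run_snippet('x = 1', {})
```

Whatever the snippet assigns still has to be JSON-serializable before it can be returned in a response body.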
--- a/src/langbot/pkg/api/http/service/model.py
+++ b/src/langbot/pkg/api/http/service/model.py
@@ -9,6 +9,8 @@ from ....core import app
 from ....entity.persistence import model as persistence_model
 from ....entity.persistence import pipeline as persistence_pipeline
 from ....provider.modelmgr import requester as model_requester
+from ....agent.runner.config_migration import ConfigMigration
+from ....agent.runner import config_schema


 def _parse_provider_api_keys(provider_dict: dict) -> dict:
@@ -40,6 +42,40 @@ class LLMModelsService:
    def __init__(self, ap: app.Application) -> None:
        self.ap = ap

+    async def _get_runner_descriptor(self, runner_id: str):
+        registry = getattr(self.ap, 'agent_runner_registry', None)
+        if registry is None:
+            return None
+        try:
+            return await registry.get(runner_id, bound_plugins=None)
+        except Exception as e:
+            logger = getattr(self.ap, 'logger', None)
+            if logger:
+                logger.warning(f'Failed to load AgentRunner descriptor while setting default model: {e}')
+            return None
+
+    async def _auto_set_default_pipeline_llm_model(self, pipeline: persistence_pipeline.LegacyPipeline, model_uuid: str):
+        pipeline_config = pipeline.config
+        if not isinstance(pipeline_config, dict):
+            return
+
+        runner_id = ConfigMigration.resolve_runner_id(pipeline_config)
+        if not runner_id:
+            return
+
+        descriptor = await self._get_runner_descriptor(runner_id)
+        if descriptor is None:
+            return
+
+        ai_config = pipeline_config.setdefault('ai', {})
+        runner_configs = ai_config.setdefault('runner_config', {})
+        runner_config = runner_configs.setdefault(runner_id, {})
+
+        if not config_schema.set_empty_llm_model_selection(descriptor, runner_config, model_uuid):
+            return
+
+        await self.ap.pipeline_service.update_pipeline(pipeline.uuid, {'config': pipeline_config})
+
    async def get_llm_models(self, include_secret: bool = True) -> list[dict]:
        """Get all LLM models with provider info"""
        result = await self.ap.persistence_mgr.execute_async(sqlalchemy.select(persistence_model.LLMModel))
@@ -109,7 +145,6 @@ class LLMModelsService:
        self.ap.model_mgr.llm_models.append(runtime_llm_model)

        if auto_set_to_default_pipeline:
-            # set the default pipeline model to this model
            result = await self.ap.persistence_mgr.execute_async(
                sqlalchemy.select(persistence_pipeline.LegacyPipeline).where(
                    persistence_pipeline.LegacyPipeline.is_default == True
@@ -117,15 +152,7 @@ class LLMModelsService:
            )
            pipeline = result.first()
            if pipeline is not None:
-                model_config = pipeline.config.get('ai', {}).get('local-agent', {}).get('model', {})
-                if not model_config.get('primary', ''):
-                    pipeline_config = pipeline.config
-                    pipeline_config['ai']['local-agent']['model'] = {
-                        'primary': model_data['uuid'],
-                        'fallbacks': [],
-                    }
-                    pipeline_data = {'config': pipeline_config}
-                    await self.ap.pipeline_service.update_pipeline(pipeline.uuid, pipeline_data)
+                await self._auto_set_default_pipeline_llm_model(pipeline, model_data['uuid'])

        return model_data['uuid']

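Note on `_auto_set_default_pipeline_llm_model`: the chained `setdefault` calls create `ai.runner_config[<runner_id>]` on demand and then hand the innermost dict to `config_schema.set_empty_llm_model_selection`, whose body is not part of this hunk. The sketch below shows the same create-then-fill-if-empty shape. The `set_model_if_empty` helper and the `model.primary` / `model.fallbacks` layout are assumptions borrowed from the removed `local-agent` code path, not the schema-driven logic itself.

```python
def set_model_if_empty(pipeline_config: dict, runner_id: str, model_uuid: str) -> bool:
    """Set a primary model under ai.runner_config[runner_id] only when none is set.

    Returns True when the config was changed, so the caller knows whether to persist it.
    """
    ai_config = pipeline_config.setdefault('ai', {})
    runner_configs = ai_config.setdefault('runner_config', {})
    runner_config = runner_configs.setdefault(runner_id, {})

    # Assumed layout: {'model': {'primary': <uuid>, 'fallbacks': [...]}}
    model = runner_config.setdefault('model', {})
    if model.get('primary'):
        return False
    model['primary'] = model_uuid
    model.setdefault('fallbacks', [])
    return True


runner_id = 'plugin:langbot/local-agent/default'
config = {}
first_changed = set_model_if_empty(config, runner_id, 'model-1')
second_changed = set_model_if_empty(config, runner_id, 'model-2')
```

Returning a boolean mirrors the early `return` in the diff: the pipeline is written back only when the selection actually changed.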
--- a/src/langbot/pkg/api/http/service/pipeline.py
+++ b/src/langbot/pkg/api/http/service/pipeline.py
@@ -3,17 +3,22 @@ from __future__ import annotations
 import uuid
 import json
 import sqlalchemy
+import typing

 from ....core import app
 from ....entity.persistence import pipeline as persistence_pipeline

+# Prefer the official local-agent plugin when it is installed. This is not a
+# built-in fallback: when no AgentRunner plugin is available, the default
+# pipeline stays unbound so the UI can guide users to install a runner.
+PREFERRED_DEFAULT_RUNNER_ID = 'plugin:langbot/local-agent/default'
+

 default_stage_order = [
    'GroupRespondRuleCheckStage',  # group response rule check
    'BanSessionCheckStage',  # banned session check
    'PreContentFilterStage',  # content filter (pre-processing stage)
    'PreProcessor',  # preprocessor
-    'ConversationMessageTruncator',  # conversation message truncator
    'RequireRateLimitOccupancy',  # acquire rate-limit occupancy
    'MessageProcessor',  # processor
    'ReleaseRateLimitOccupancy',  # release rate-limit occupancy
@@ -30,11 +35,108 @@ class PipelineService:
    def __init__(self, ap: app.Application) -> None:
        self.ap = ap

+    def _get_default_values_from_schema(self, config_schema: list[dict[str, typing.Any]]) -> dict[str, typing.Any]:
+        """Build runner config defaults from a DynamicForm schema."""
+        defaults: dict[str, typing.Any] = {}
+        for item in config_schema:
+            name = item.get('name')
+            if not name:
+                continue
+            if 'default' in item:
+                defaults[name] = item['default']
+        return defaults
+
+    async def get_default_pipeline_config(self) -> dict[str, typing.Any]:
+        """Get the default pipeline config, rendering runner defaults from installed plugins."""
+        from ....utils import paths as path_utils
+
+        template_path = path_utils.get_resource_path('templates/default-pipeline-config.json')
+        with open(template_path, 'r', encoding='utf-8') as f:
+            config = json.load(f)
+
+        agent_runner_registry = getattr(self.ap, 'agent_runner_registry', None)
+        if agent_runner_registry is None:
+            return config
+
+        try:
+            runners = await agent_runner_registry.list_runners(bound_plugins=None)
+        except Exception as e:
+            logger = getattr(self.ap, 'logger', None)
+            if logger:
+                logger.warning(f'Failed to load plugin agent runners for default pipeline config: {e}')
+            return config
+
+        if not runners:
+            return config
+
+        selected_runner = next(
+            (runner for runner in runners if runner.id == PREFERRED_DEFAULT_RUNNER_ID),
+            runners[0],
+        )
+        ai_config = config.setdefault('ai', {})
+        runner_config = ai_config.setdefault('runner', {})
+        runner_config['id'] = selected_runner.id
+        runner_config.setdefault('expire-time', 0)
+
+        ai_config['runner_config'] = {
+            selected_runner.id: self._get_default_values_from_schema(selected_runner.config_schema),
+        }
+
+        return config
+
    async def get_pipeline_metadata(self) -> list[dict]:
+        """Get pipeline metadata with dynamically loaded plugin runners from registry"""
+        import copy
+
+        # Deep copy AI metadata to avoid modifying the original
+        ai_metadata = copy.deepcopy(self.ap.pipeline_config_meta_ai)
+
+        # Find the runner stage
+        runner_stage = None
+        for stage in ai_metadata.get('stages', []):
+            if stage.get('name') == 'runner':
+                runner_stage = stage
+                break
+
+        if runner_stage:
+            # Find the runner select config (now uses 'id' field)
+            for config_item in runner_stage.get('config', []):
+                if config_item.get('name') == 'id':
+                    # Get plugin agent runners from registry
+                    try:
+                        (
+                            runner_options,
+                            runner_stages,
+                        ) = await self.ap.agent_runner_registry.get_runner_metadata_for_pipeline()
+
+                        # Replace options entirely with registry options
+                        # Only installed/available runners should be shown
+                        config_item['options'] = runner_options
+
+                        # Prefer the official local-agent plugin when installed; otherwise use the first
+                        # discoverable runner. If no runner is available, leave the default unset so the
+                        # UI can recommend installing an AgentRunner plugin, similar to the RAG flow.
+                        if runner_options and 'default' not in config_item:
+                            default_option = next(
+                                (option for option in runner_options if option['name'] == PREFERRED_DEFAULT_RUNNER_ID),
+                                runner_options[0],
+                            )
+                            config_item['default'] = default_option['name']
+
+                        # Add corresponding stage configuration for each runner
+                        for stage_config in runner_stages:
+                            # Avoid duplicate stages
+                            existing_stage_names = {s.get('name') for s in ai_metadata.get('stages', [])}
+                            if stage_config['name'] not in existing_stage_names:
+                                ai_metadata['stages'].append(stage_config)
+
+                    except Exception as e:
+                        self.ap.logger.warning(f'Failed to load plugin agent runners from registry: {e}')
+
        return [
            self.ap.pipeline_config_meta_trigger,
            self.ap.pipeline_config_meta_safety,
-            self.ap.pipeline_config_meta_ai,
+            ai_metadata,
            self.ap.pipeline_config_meta_output,
        ]

@@ -74,8 +176,6 @@ class PipelineService:
        return self.ap.persistence_mgr.serialize_model(persistence_pipeline.LegacyPipeline, pipeline)

    async def create_pipeline(self, pipeline_data: dict, default: bool = False) -> str:
-        from ....utils import paths as path_utils
-
        # Check limitation
        limitation = self.ap.instance_config.data.get('system', {}).get('limitation', {})
        max_pipelines = limitation.get('max_pipelines', -1)
@@ -89,9 +189,7 @@ class PipelineService:
        pipeline_data['stages'] = default_stage_order.copy()
        pipeline_data['is_default'] = default

-        template_path = path_utils.get_resource_path('templates/default-pipeline-config.json')
-        with open(template_path, 'r', encoding='utf-8') as f:
-            pipeline_data['config'] = json.load(f)
+        pipeline_data['config'] = await self.get_default_pipeline_config()

        # Ensure extensions_preferences is set with enable_all_plugins and enable_all_mcp_servers=True by default
        if 'extensions_preferences' not in pipeline_data:
@@ -113,10 +211,16 @@ class PipelineService:
        return pipeline_data['uuid']

    async def update_pipeline(self, pipeline_uuid: str, pipeline_data: dict) -> None:
+        from ....agent.runner.config_migration import ConfigMigration
+
        pipeline_data = pipeline_data.copy()
        for protected_field in ('uuid', 'for_version', 'stages', 'is_default'):
            pipeline_data.pop(protected_field, None)

+        # Migrate config to new format before saving
+        if 'config' in pipeline_data:
+            pipeline_data['config'] = ConfigMigration.migrate_pipeline_config(pipeline_data['config'])
+
        await self.ap.persistence_mgr.execute_async(
            sqlalchemy.update(persistence_pipeline.LegacyPipeline)
            .where(persistence_pipeline.LegacyPipeline.uuid == pipeline_uuid)
@@ -215,8 +319,6 @@ class PipelineService:
        bound_mcp_servers: list[str] = None,
        enable_all_plugins: bool = True,
        enable_all_mcp_servers: bool = True,
-        bound_skills: list[str] = None,
-        enable_all_skills: bool = True,
    ) -> None:
        """Update the bound plugins and MCP servers for a pipeline"""
        # Get current pipeline
@@ -234,12 +336,9 @@ class PipelineService:
        extensions_preferences = pipeline.extensions_preferences or {}
        extensions_preferences['enable_all_plugins'] = enable_all_plugins
        extensions_preferences['enable_all_mcp_servers'] = enable_all_mcp_servers
-        extensions_preferences['enable_all_skills'] = enable_all_skills
        extensions_preferences['plugins'] = bound_plugins
        if bound_mcp_servers is not None:
            extensions_preferences['mcp_servers'] = bound_mcp_servers
-        if bound_skills is not None:
-            extensions_preferences['skills'] = bound_skills

        await self.ap.persistence_mgr.execute_async(
            sqlalchemy.update(persistence_pipeline.LegacyPipeline)
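
Note on the metadata hunk above: plugin runner stages are appended to the AI metadata only when no stage with the same name is already present. A minimal sketch of that merge follows, assuming stages are plain dicts with a `name` key. The helper name `merge_runner_stages`, the deep copy, and the sample stage names are illustrative assumptions, not part of this change.

```python
from copy import deepcopy


def merge_runner_stages(ai_metadata: dict, runner_stages: list[dict]) -> dict:
    """Return a copy of ai_metadata with each runner stage appended at most once."""
    # Assumption: copy first so the shared metadata object is not mutated.
    merged = deepcopy(ai_metadata)
    stages = merged.setdefault('stages', [])
    # Build the name set once instead of on every loop iteration.
    seen = {stage.get('name') for stage in stages}
    for stage_config in runner_stages:
        name = stage_config['name']
        if name in seen:
            continue  # Avoid duplicate stages
        stages.append(stage_config)
        seen.add(name)
    return merged


if __name__ == '__main__':
    # Hypothetical stage names, for illustration only.
    base = {'stages': [{'name': 'local-agent'}]}
    extra = [{'name': 'local-agent'}, {'name': 'plugin-x'}]
    print([s['name'] for s in merge_runner_stages(base, extra)['stages']])
```

Using `setdefault` also covers metadata that has no `stages` key yet, a case where the hunk's `ai_metadata['stages'].append(...)` would raise `KeyError` and fall through to the warning log.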
--- a/src/langbot/pkg/api/http/service/skill.py
+++ b/src/langbot/pkg/api/http/service/skill.py
@@ -1,428 +0,0 @@
-from __future__ import annotations
-
-import io
-import inspect
-import os
-import posixpath
-import zipfile
-from typing import Optional
-from urllib.parse import quote, unquote, urlparse
-
-import httpx
-
-from ....core import app
-from ....skill.utils import parse_frontmatter
-
-
-_PUBLIC_SKILL_FIELDS = (
-    'name',
-    'display_name',
-    'description',
-    'instructions',
-    'package_root',
-    'created_at',
-    'updated_at',
-)
-
-_GITHUB_ASSET_HOSTS = {
-    'github.com',
-    'api.github.com',
-    'objects.githubusercontent.com',
-    'githubusercontent.com',
-    'raw.githubusercontent.com',
-    'codeload.github.com',
-}
-
-
-class SkillService:
-    """Filesystem-backed skill management service."""
-
-    ap: app.Application
-
-    def __init__(self, ap: app.Application) -> None:
-        self.ap = ap
-
-    def _box_service(self):
-        box_service = getattr(self.ap, 'box_service', None)
-        if box_service is not None and getattr(box_service, 'available', False):
-            return box_service
-        return None
-
-    def _require_box(self, action: str):
-        """Return the Box service or raise if it is not available.
-
-        Box is the only source of truth for skills. Every read and write
-        operation goes through it — there is no local-filesystem fallback.
-        """
-        box_service = self._box_service()
-        if box_service is not None:
-            return box_service
-        ap_box = getattr(self.ap, 'box_service', None)
-        if ap_box is None:
-            reason = 'not initialised'
-        elif not getattr(ap_box, 'enabled', True):
-            reason = 'disabled in config (box.enabled = false)'
-        else:
-            connector_error = getattr(ap_box, '_connector_error', '') or 'currently unavailable'
-            reason = f'unavailable: {connector_error}'
-        raise ValueError(
-            f'{action} requires the Box runtime, which is {reason}. '
-            f'Enable Box in config.yaml (box.enabled = true) and ensure the '
-            f'runtime is reachable before retrying.'
-        )
-
-    def _require_box_for_write(self, action: str) -> None:
-        """Backwards-compatible alias preserved for clarity at call sites."""
-        self._require_box(action)
-
-    @staticmethod
-    def _serialize_skill(skill: dict) -> dict:
-        return {field: skill.get(field) for field in _PUBLIC_SKILL_FIELDS if field in skill}
-
-    async def list_skills(self) -> list[dict]:
-        # When Box is unavailable, surface an empty list rather than raising —
-        # the skills page should render cleanly, and the UI separately renders
-        # a "Box disabled / unavailable" banner via useBoxStatus.
-        box_service = self._box_service()
-        if box_service is None:
-            return []
-        return [self._serialize_skill(skill) for skill in await box_service.list_skills()]
-
-    async def get_skill(self, skill_name: str) -> Optional[dict]:
-        box_service = self._box_service()
-        if box_service is None:
-            return None
-        skill = await box_service.get_skill(skill_name)
-        return self._serialize_skill(skill) if skill else None
-
-    async def get_skill_by_name(self, name: str) -> Optional[dict]:
-        return await self.get_skill(name)
-
-    async def create_skill(self, data: dict) -> dict:
-        box_service = self._require_box('Creating a skill')
-        created = await box_service.create_skill(data)
-        await self._reload_skills()
-        return self._serialize_skill(created)
-
-    async def update_skill(self, skill_name: str, data: dict) -> dict:
-        box_service = self._require_box('Editing a skill')
-        updated = await box_service.update_skill(skill_name, data)
-        await self._reload_skills()
-        return self._serialize_skill(updated)
-
-    async def delete_skill(self, skill_name: str) -> bool:
-        box_service = self._require_box('Deleting a skill')
-        await box_service.delete_skill(skill_name)
-        await self._reload_skills()
-        return True
-
-    async def list_skill_files(
-        self,
-        skill_name: str,
-        path: str = '.',
-        include_hidden: bool = False,
-        max_entries: int = 200,
-    ) -> dict:
-        box_service = self._require_box('Browsing skill files')
-        return await box_service.list_skill_files(skill_name, path, include_hidden, max_entries)
-
-    async def read_skill_file(self, skill_name: str, path: str) -> dict:
-        box_service = self._require_box('Reading a skill file')
-        return await box_service.read_skill_file(skill_name, path)
-
-    async def write_skill_file(self, skill_name: str, path: str, content: str) -> dict:
-        box_service = self._require_box('Editing skill files')
-        result = await box_service.write_skill_file(skill_name, path, content)
-        await self._reload_skills()
-        return result
-
-    async def install_from_github(self, data: dict) -> list[dict]:
-        box_service = self._require_box('Installing a skill from GitHub')
-        owner = str(data['owner']).strip()
-        repo = str(data['repo']).strip()
-        release_tag = str(data.get('release_tag', '')).strip()
-        raw_asset_url = str(data['asset_url']).strip()
-        if self._is_github_skill_md_url(raw_asset_url):
-            return await self._install_github_skill_md(raw_asset_url, owner=owner, repo=repo, data=data)
-
-        asset_url = self._validate_github_asset_url(raw_asset_url, owner=owner, repo=repo, release_tag=release_tag)
-        source_subdir = str(data.get('source_subdir', '') or '').strip()
-
-        zip_bytes = await self._download_github_asset(asset_url)
-        filename = f'{repo}-{release_tag.lstrip("v").replace("/", "-") or "source"}.zip'
-        installed = await box_service.install_skill_zip(
-            zip_bytes,
-            filename,
-            source_paths=data.get('source_paths') or [],
-            source_path=str(data.get('source_path', '') or ''),
-            source_subdir=source_subdir,
-        )
-        await self._reload_skills()
-        return [self._serialize_skill(skill) for skill in installed]
-
-    async def preview_install_from_github(self, data: dict) -> list[dict]:
-        box_service = self._require_box('Previewing a skill from GitHub')
-        owner = str(data['owner']).strip()
-        repo = str(data['repo']).strip()
-        release_tag = str(data.get('release_tag', '')).strip()
-        raw_asset_url = str(data['asset_url']).strip()
-        if self._is_github_skill_md_url(raw_asset_url):
-            return await self._preview_github_skill_md(raw_asset_url, owner=owner, repo=repo)
-
-        asset_url = self._validate_github_asset_url(raw_asset_url, owner=owner, repo=repo, release_tag=release_tag)
-        source_subdir = str(data.get('source_subdir', '') or '').strip()
-
-        zip_bytes = await self._download_github_asset(asset_url)
-        return await box_service.preview_skill_zip(
-            zip_bytes,
-            f'{repo}-{release_tag.lstrip("v").replace("/", "-") or "source"}.zip',
-            source_subdir=source_subdir,
-        )
-
-    async def install_from_zip_upload(
-        self,
-        *,
-        file_bytes: bytes,
-        filename: str,
-        source_paths: list[str] | None = None,
-        source_path: str = '',
-    ) -> list[dict]:
-        box_service = self._require_box('Installing a skill from upload')
-        installed = await box_service.install_skill_zip(
-            file_bytes,
-            filename,
-            source_paths=source_paths or [],
-            source_path=source_path,
-        )
-        await self._reload_skills()
-        return [self._serialize_skill(skill) for skill in installed]
-
-    async def preview_install_from_zip_upload(self, *, file_bytes: bytes, filename: str) -> list[dict]:
-        box_service = self._require_box('Previewing a skill upload')
-        return await box_service.preview_skill_zip(file_bytes, filename)
-
-    async def _install_github_skill_md(self, asset_url: str, *, owner: str, repo: str, data: dict) -> list[dict]:
-        box_service = self._require_box('Installing a skill from GitHub')
-        zip_bytes, filename, _package_name = await self._download_github_skill_directory_as_zip(
-            asset_url,
-            owner=owner,
-            repo=repo,
-        )
-
-        installed = await box_service.install_skill_zip(
-            zip_bytes,
-            filename,
-            source_paths=data.get('source_paths') or [],
-            source_path=str(data.get('source_path', '') or ''),
-            target_suffix='',
-        )
-        await self._reload_skills()
-        return [self._serialize_skill(skill) for skill in installed]
-
-    async def _preview_github_skill_md(self, asset_url: str, *, owner: str, repo: str) -> list[dict]:
-        box_service = self._require_box('Previewing a skill from GitHub')
-        zip_bytes, _filename, package_name = await self._download_github_skill_directory_as_zip(
-            asset_url,
-            owner=owner,
-            repo=repo,
-        )
-        return await box_service.preview_skill_zip(zip_bytes, f'{package_name}.zip', target_suffix='')
-
-    async def reload_skills(self) -> list[dict]:
-        await self._reload_skills()
-        return await self.list_skills()
-
-    async def scan_directory_async(self, path: str) -> dict:
-        box_service = self._require_box('Scanning a skill directory')
-        return await box_service.scan_skill_directory(path)
-
-    async def _reload_skills(self) -> None:
-        skill_mgr = getattr(self.ap, 'skill_mgr', None)
-        reload_skills = getattr(skill_mgr, 'reload_skills', None)
-        if not callable(reload_skills):
-            return
-        result = reload_skills()
-        if inspect.isawaitable(result):
-            await result
-
-    async def _download_github_asset(self, asset_url: str) -> bytes:
-        async with httpx.AsyncClient(follow_redirects=True, timeout=120) as client:
-            resp = await client.get(asset_url)
-            resp.raise_for_status()
-            return resp.content
-
-    async def _download_github_skill_directory_as_zip(
-        self, asset_url: str, *, owner: str, repo: str
-    ) -> tuple[bytes, str, str]:
-        info = self._parse_github_skill_md_url(asset_url, owner=owner, repo=repo)
-        archive_url = f'https://codeload.github.com/{owner}/{repo}/zip/{quote(info["ref"], safe="/")}'
-        archive_bytes = await self._download_github_asset(archive_url)
-
-        try:
-            source_archive = zipfile.ZipFile(io.BytesIO(archive_bytes), 'r')
-        except zipfile.BadZipFile as exc:
-            raise ValueError('GitHub repository archive must be a valid .zip archive') from exc
-
-        with source_archive as source_zip:
-            skill_entry = self._find_github_skill_archive_entry(source_zip, info['file_path'])
-            try:
-                skill_md_content = source_zip.read(skill_entry).decode('utf-8')
-            except UnicodeDecodeError as exc:
-                raise ValueError('GitHub SKILL.md must be valid UTF-8 text') from exc
-
-            package_name = self._resolve_github_skill_md_package_name(skill_md_content, info['package_name'])
-            source_skill_dir = posixpath.dirname(posixpath.normpath(skill_entry.filename))
-
-            buffer = io.BytesIO()
-            with zipfile.ZipFile(buffer, 'w', zipfile.ZIP_DEFLATED) as target_zip:
-                self._copy_github_skill_directory_to_zip(source_zip, target_zip, source_skill_dir, package_name)
-        return buffer.getvalue(), f'{package_name}.zip', package_name
-
-    def _find_github_skill_archive_entry(self, archive: zipfile.ZipFile, file_path: str) -> zipfile.ZipInfo:
-        normalized_file_path = posixpath.normpath(file_path).lower()
-        for member in archive.infolist():
-            if member.is_dir():
-                continue
-            normalized_member = posixpath.normpath(member.filename)
-            path_parts = normalized_member.split('/', 1)
-            if len(path_parts) != 2:
-                continue
-            archive_relative_path = path_parts[1].lower()
-            if archive_relative_path == normalized_file_path:
-                return member
-        raise ValueError(f'GitHub archive does not contain requested SKILL.md: {file_path}')
-
-    def _copy_github_skill_directory_to_zip(
-        self,
-        source_zip: zipfile.ZipFile,
-        target_zip: zipfile.ZipFile,
-        source_skill_dir: str,
-        package_name: str,
-    ) -> None:
-        normalized_source_dir = posixpath.normpath(source_skill_dir)
-        source_prefix = f'{normalized_source_dir}/'
-        copied_files = 0
-
-        for member in source_zip.infolist():
-            normalized_member = posixpath.normpath(member.filename)
-            if normalized_member != normalized_source_dir and not normalized_member.startswith(source_prefix):
-                continue
-
-            relative_path = posixpath.relpath(normalized_member, normalized_source_dir)
-            if relative_path in ('', '.'):
-                continue
-            if relative_path.startswith('../') or relative_path == '..' or posixpath.isabs(relative_path):
-                raise ValueError(f'GitHub archive contains an unsafe skill path: {member.filename}')
-
-            target_name = f'{package_name}/{relative_path}'
-            if member.is_dir() and not target_name.endswith('/'):
-                target_name = f'{target_name}/'
-            target_info = zipfile.ZipInfo(target_name, date_time=member.date_time)
-            target_info.external_attr = member.external_attr
-            target_info.compress_type = zipfile.ZIP_DEFLATED
-
-            if member.is_dir():
-                target_zip.writestr(target_info, b'')
-                continue
-
-            target_zip.writestr(target_info, source_zip.read(member))
-            copied_files += 1
-
-        if copied_files == 0:
-            raise ValueError('GitHub skill directory is empty')
-
-    def _uploaded_skill_target_stem(self, filename: str) -> str:
-        stem = os.path.splitext(os.path.basename(str(filename or '').strip()))[0]
-        safe_stem = ''.join(ch if ch.isalnum() or ch in ('-', '_') else '-' for ch in stem).strip('-_')
-        if not safe_stem:
-            safe_stem = 'uploaded-skill'
-        return safe_stem
-
-    @staticmethod
-    def _is_github_skill_md_url(asset_url: str) -> bool:
-        parsed = urlparse(str(asset_url or '').strip())
-        normalized_path = posixpath.normpath(parsed.path or '/')
-        return normalized_path.lower().endswith('/skill.md')
-
-    def _parse_github_skill_md_url(self, asset_url: str, *, owner: str, repo: str) -> dict:
-        parsed = urlparse(str(asset_url or '').strip())
-        if parsed.scheme != 'https' or not parsed.netloc:
-            raise ValueError('asset_url must be a valid HTTPS GitHub SKILL.md URL')
-
-        host = parsed.netloc.lower()
-        path_parts = [unquote(part) for part in (parsed.path or '').split('/') if part]
-        if host == 'github.com':
-            if (
-                len(path_parts) < 5
-                or path_parts[0] != owner
-                or path_parts[1] != repo
-                or path_parts[2]
-                not in (
-                    'blob',
-                    'raw',
-                )
-            ):
-                raise ValueError('GitHub SKILL.md URL must point to the requested owner/repo blob path')
-            ref = path_parts[3]
-            file_path = '/'.join(path_parts[4:])
-        elif host == 'raw.githubusercontent.com':
-            if len(path_parts) < 4 or path_parts[0] != owner or path_parts[1] != repo:
-                raise ValueError('GitHub SKILL.md URL must point to the requested owner/repo raw path')
-            ref = path_parts[2]
-            file_path = '/'.join(path_parts[3:])
-        else:
-            raise ValueError('asset_url must point to a GitHub SKILL.md file')
-
-        normalized_file_path = posixpath.normpath(file_path)
-        normalized_file_path_lower = normalized_file_path.lower()
-        if normalized_file_path_lower != 'skill.md' and not normalized_file_path_lower.endswith('/skill.md'):
-            raise ValueError('GitHub skill import requires a URL ending with SKILL.md')
-
-        parent_dir = posixpath.basename(posixpath.dirname(normalized_file_path)) or repo
-        return {
-            'ref': ref,
-            'file_path': normalized_file_path,
-            'package_name': self._uploaded_skill_target_stem(parent_dir),
-        }
-
-    def _resolve_github_skill_md_package_name(self, content: str, fallback: str) -> str:
-        metadata, _instructions = parse_frontmatter(content)
-        candidate = str(metadata.get('name') or fallback or '').strip()
-        try:
-            return self._validate_skill_name(candidate)
-        except ValueError:
-            return self._validate_skill_name(fallback)
-
-    @staticmethod
-    def _validate_github_asset_url(asset_url: str, *, owner: str, repo: str, release_tag: str) -> str:
-        parsed = urlparse(str(asset_url).strip())
-        if parsed.scheme != 'https' or not parsed.netloc:
-            raise ValueError('asset_url must be a valid HTTPS GitHub asset URL')
-
-        host = parsed.netloc.lower()
-        if host not in _GITHUB_ASSET_HOSTS:
-            raise ValueError('asset_url must point to a GitHub-hosted release asset or archive')
-
-        normalized_path = posixpath.normpath(parsed.path or '/')
-        allowed_prefixes = [
-            f'/repos/{owner}/{repo}/',
-            f'/{owner}/{repo}/',
-        ]
-        if not any(normalized_path.startswith(prefix) for prefix in allowed_prefixes):
-            raise ValueError('asset_url does not match the requested owner/repo')
-
-        if release_tag and release_tag not in parsed.path and release_tag not in parsed.query:
-            raise ValueError('asset_url does not match the requested release_tag')
-
-        return parsed.geturl()
-
-    @staticmethod
-    def _validate_skill_name(name: str) -> str:
-        name = str(name or '').strip()
-        if not name:
-            raise ValueError('Skill name is required')
-        if not name.replace('-', '').replace('_', '').isalnum():
-            raise ValueError('Skill name can only contain letters, numbers, hyphens and underscores')
-        if len(name) > 64:
-            raise ValueError('Skill name cannot exceed 64 characters')
-        return name
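
Note on the removed `_parse_github_skill_md_url`: it accepted only HTTPS URLs on `github.com` (`blob` or `raw` paths) and `raw.githubusercontent.com`, and split the path into a ref and a file path. The sketch below restates that rule as a standalone function for anyone porting it elsewhere. The function name `parse_skill_md_url` and the sample owner/repo values are assumptions made for illustration.

```python
import posixpath
from urllib.parse import unquote, urlparse


def parse_skill_md_url(asset_url: str, owner: str, repo: str) -> tuple[str, str]:
    """Return (ref, file_path) for a GitHub SKILL.md URL, or raise ValueError."""
    parsed = urlparse(asset_url.strip())
    if parsed.scheme != 'https' or not parsed.netloc:
        raise ValueError('asset_url must be a valid HTTPS URL')

    parts = [unquote(part) for part in parsed.path.split('/') if part]
    host = parsed.netloc.lower()
    if host == 'github.com':
        # Expected layout: /<owner>/<repo>/(blob|raw)/<ref>/<path...>
        if len(parts) < 5 or parts[:2] != [owner, repo] or parts[2] not in ('blob', 'raw'):
            raise ValueError('URL must point to the requested owner/repo blob path')
        ref, file_path = parts[3], '/'.join(parts[4:])
    elif host == 'raw.githubusercontent.com':
        # Expected layout: /<owner>/<repo>/<ref>/<path...>
        if len(parts) < 4 or parts[:2] != [owner, repo]:
            raise ValueError('URL must point to the requested owner/repo raw path')
        ref, file_path = parts[2], '/'.join(parts[3:])
    else:
        raise ValueError('asset_url must point to a GitHub SKILL.md file')

    file_path = posixpath.normpath(file_path)
    if posixpath.basename(file_path).lower() != 'skill.md':
        raise ValueError('URL must end with SKILL.md')
    return ref, file_path


if __name__ == '__main__':
    # Hypothetical repository, for illustration only.
    print(parse_skill_md_url('https://github.com/acme/tools/blob/main/skills/demo/SKILL.md', 'acme', 'tools'))
```

Like the removed code, this treats the ref as a single path segment, so a branch name containing a slash would likely be split between the ref and the file path.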
--- a/src/langbot/pkg/box/__init__.py
+++ b/src/langbot/pkg/box/__init__.py
@@ -1,5 +0,0 @@
-"""LangBot Box runtime package."""
-
-from .workspace import BoxWorkspaceSession
-
-__all__ = ['BoxWorkspaceSession']
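
Note on the `connector.py` removal that follows: `resolve_box_ws_relay_url` mapped a configured runtime endpoint to an HTTP(S) base URL, falling back to the Docker service host or loopback on port 5410. The sketch below mirrors that mapping as a pure function. The name `relay_base_url` and the `in_docker` flag are assumptions standing in for the config lookup and `platform.get_platform()` check in the removed code.

```python
from urllib.parse import urlparse

DEFAULT_PORT = 5410
DOCKER_BOX_HOST = 'langbot_box'


def relay_base_url(endpoint: str, in_docker: bool = False) -> str:
    """Derive the relay base URL from an optional runtime endpoint."""
    if endpoint:
        parsed = urlparse(endpoint)
        scheme = parsed.scheme or 'ws'
        # WebSocket schemes map onto their HTTP equivalents.
        if scheme == 'ws':
            scheme = 'http'
        elif scheme == 'wss':
            scheme = 'https'
        host = parsed.hostname or '127.0.0.1'
        port = parsed.port or DEFAULT_PORT
        # Any path on the endpoint is dropped; callers append their own.
        return f'{scheme}://{host}:{port}'

    # No explicit endpoint: Docker uses the Compose service name.
    host = DOCKER_BOX_HOST if in_docker else '127.0.0.1'
    return f'http://{host}:{DEFAULT_PORT}'


if __name__ == '__main__':
    # Hypothetical endpoint, for illustration only.
    print(relay_base_url('wss://box.example:9000/rpc/ws'))
```

An endpoint written without a scheme (for example a bare `host:port`) may not be split into host and port by `urlparse`, so configured endpoints are probably best given with an explicit `ws://` or `wss://` prefix.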
--- a/src/langbot/pkg/box/connector.py
+++ b/src/langbot/pkg/box/connector.py
@@ -1,354 +0,0 @@
-from __future__ import annotations
-
-import asyncio
-import json
-import os
-import sys
-import typing
-from typing import TYPE_CHECKING
-from urllib.parse import urlparse
-
-from langbot_plugin.entities.io.actions.enums import CommonAction
-from langbot_plugin.runtime.io.handler import Handler
-from langbot_plugin.runtime.io.connection import Connection
-
-from langbot_plugin.box.client import ActionRPCBoxClient
-from langbot_plugin.box.errors import BoxRuntimeUnavailableError
-from langbot_plugin.box.actions import LangBotToBoxAction
-
-from ..utils import platform
-from ..utils.managed_runtime import ManagedRuntimeConnector
-
-if TYPE_CHECKING:
-    from ..core import app as core_app
-
-
-# Default Docker Compose service name for the standalone Box container.
-_DOCKER_BOX_HOST = 'langbot_box'
-_DEFAULT_PORT = 5410
-
-_HEARTBEAT_INTERVAL_SEC = 20
-
-# Top-level keys under ``box`` that are LangBot-internal and should not be
-# forwarded to the Box runtime.
-_INTERNAL_BOX_CONFIG_KEYS = frozenset({'runtime'})
-
-
-def _get_box_config(ap) -> dict:
-    """Return the 'box' section from instance config.
-
-    Environment-variable overrides are handled uniformly by
-    ``LoadConfigStage._apply_env_overrides_to_config`` using the
-    ``SECTION__SUBSECTION__KEY`` convention (e.g. ``BOX__LOCAL__HOST_ROOT``,
-    ``BOX__LOCAL__ALLOWED_MOUNT_ROOTS="/a,/b"``) before this is read, so no
-    box-specific env parsing is needed here.
-    """
-    instance_config = getattr(ap, 'instance_config', None)
-    config_data = getattr(instance_config, 'data', {}) if instance_config is not None else {}
-    return dict(config_data.get('box', {}) or {})
-
-
-def _get_runtime_endpoint(box_cfg: dict) -> str:
-    runtime_cfg = box_cfg.get('runtime') or {}
-    return str(runtime_cfg.get('endpoint', '')).strip()
-
-
-def _filter_config_for_runtime(box_cfg: dict) -> dict:
-    return {k: v for k, v in box_cfg.items() if k not in _INTERNAL_BOX_CONFIG_KEYS}
-
-
-def resolve_box_ws_relay_url(ap: core_app.Application) -> str:
-    """Derive the WS relay base URL used for managed-process attach.
-
-    The WS relay serves the ``/v1/sessions/{id}/managed-process/ws`` endpoint
-    on the *relay* port (default 5410).
-    """
-    box_cfg = _get_box_config(ap)
-
-    # Explicit runtime endpoint takes precedence. The config value is a base
-    # URL; endpoint-specific paths are appended by the SDK client.
-    endpoint = _get_runtime_endpoint(box_cfg)
-    if endpoint:
-        parsed = urlparse(endpoint)
-        scheme = parsed.scheme or 'ws'
-        if scheme == 'ws':
-            scheme = 'http'
-        elif scheme == 'wss':
-            scheme = 'https'
-        host = parsed.hostname or '127.0.0.1'
-        port = parsed.port or _DEFAULT_PORT
-        return f'{scheme}://{host}:{port}'
-
-    # In Docker, relay lives on the box runtime container.
-    if platform.get_platform() == 'docker':
-        return f'http://{_DOCKER_BOX_HOST}:{_DEFAULT_PORT}'
-
-    return f'http://127.0.0.1:{_DEFAULT_PORT}'
-
-
-class BoxRuntimeConnector(ManagedRuntimeConnector):
-    """Connect to the Box runtime via action RPC.
-
-    Transport decision (mirrors Plugin runtime logic):
-      1. Docker / --standalone-box / explicit runtime.endpoint -> WebSocket to external Box process
-      2. Windows (non-Docker)                              -> subprocess + WebSocket (Windows lacks async stdio pipe)
-      3. Unix / macOS                                      -> subprocess + stdio pipe
-    """
-
-    def __init__(
-        self,
-        ap: core_app.Application,
-        runtime_disconnect_callback: typing.Callable[
-            ['BoxRuntimeConnector'], typing.Coroutine[typing.Any, typing.Any, None]
-        ]
-        | None = None,
-    ):
-        super().__init__(ap)
-        self.runtime_disconnect_callback = runtime_disconnect_callback
-        self.configured_runtime_endpoint = self._load_configured_runtime_endpoint()
-        self.ws_relay_base_url = resolve_box_ws_relay_url(ap)
-        self.client = ActionRPCBoxClient(logger=ap.logger)
-
-        self._handler: Handler | None = None
-        self._handler_task: asyncio.Task | None = None
-        self._ctrl_task: asyncio.Task | None = None
-        self._heartbeat_task: asyncio.Task | None = None
-
-        # Parse the relay URL once for reuse.
-        parsed = urlparse(self.ws_relay_base_url)
-        self._relay_host = parsed.hostname or '127.0.0.1'
-        self._relay_port = parsed.port or _DEFAULT_PORT
-        self._filtered_box_config = _filter_config_for_runtime(_get_box_config(ap))
-
-    def _uses_websocket(self) -> bool:
-        """Whether the connector should use WebSocket to reach the Box runtime.
-
-        True when:
-          - Running inside Docker (Box runtime is a separate container)
-          - The ``--standalone-box`` CLI flag was passed
-          - An explicit ``runtime.endpoint`` was configured
-        """
-        return bool(
-            self.configured_runtime_endpoint
-            or platform.get_platform() == 'docker'
-            or platform.use_websocket_to_connect_box_runtime()
-        )
-
-    async def initialize(self) -> None:
-        if self._uses_websocket():
-            if platform.get_platform() == 'win32' and not self.configured_runtime_endpoint:
-                await self._start_subprocess_then_ws()
-            else:
-                await self._connect_remote_ws()
-        else:
-            await self._start_local_stdio()
-
-        # Start heartbeat after successful connection
-        if self._heartbeat_task is None:
-            self._heartbeat_task = asyncio.create_task(self._heartbeat_loop())
-
-    # -- heartbeat -----------------------------------------------------------
-
-    async def _heartbeat_loop(self) -> None:
-        """Periodically ping the Box runtime to detect silent disconnections."""
-        while True:
-            await asyncio.sleep(_HEARTBEAT_INTERVAL_SEC)
-            try:
-                await self.ping()
-                self.ap.logger.debug('Heartbeat to Box runtime success.')
-            except Exception as e:
-                self.ap.logger.debug(f'Failed to heartbeat to Box runtime: {e}')
-
-    async def ping(self) -> None:
-        if self._handler is None:
-            raise BoxRuntimeUnavailableError('Box runtime is not connected')
-        await self._handler.call_action(CommonAction.PING, {})
-
-    # -- transport paths -----------------------------------------------------
-
-    async def _start_local_stdio(self) -> None:
-        """Launch box server as subprocess and connect via stdio (Unix/macOS)."""
-        from langbot_plugin.runtime.io.controllers.stdio.client import StdioClientController
-
-        self.ap.logger.info('Use stdio to connect to box runtime')
-        python_path = sys.executable
-        env = os.environ.copy()
-        if self._filtered_box_config:
-            env['LANGBOT_BOX_CONFIG'] = json.dumps(self._filtered_box_config)
-
-        connected = asyncio.Event()
-        connect_error: list[Exception] = []
-
-        ctrl = StdioClientController(
-            command=python_path,
-            # Launched through the same CLI entry point as the plugin runtime
-            # (cli.__init__ <subcommand>); `-s` selects the stdio transport,
-            # mirroring `rt -s`.
-            args=['-m', 'langbot_plugin.cli.__init__', 'box', '-s', '--ws-control-port', str(self._relay_port)],
-            env=env,
-        )
-        self._ctrl_task = asyncio.create_task(
-            ctrl.run(self._make_connection_callback('stdio', connected, connect_error))
-        )
-
-        try:
-            await asyncio.wait_for(connected.wait(), timeout=30.0)
-        except asyncio.TimeoutError:
-            raise BoxRuntimeUnavailableError('box runtime subprocess did not connect in time')
-
-        if connect_error:
-            raise BoxRuntimeUnavailableError(f'box runtime connection failed: {connect_error[0]}')
-
-        self._subprocess = ctrl.process
-
-    async def _start_subprocess_then_ws(self) -> None:
-        """Launch box server as detached subprocess, then connect via WS (Windows)."""
-        self.ap.logger.info('(windows) Use cmd to launch box runtime and communicate via ws')
-
-        env = os.environ.copy()
-        if self._filtered_box_config:
-            env['LANGBOT_BOX_CONFIG'] = json.dumps(self._filtered_box_config)
-
-        python_path = sys.executable
-        # Launched through the same CLI entry point as the plugin runtime
-        # (cli.__init__ <subcommand>); no flag => WebSocket transport.
-        self.runtime_subprocess = await asyncio.create_subprocess_exec(
-            python_path,
-            '-m',
-            'langbot_plugin.cli.__init__',
-            'box',
-            '--ws-control-port',
-            str(self._relay_port),
-            env=env,
-        )
-        self.runtime_subprocess_task = asyncio.create_task(self.runtime_subprocess.wait())
-
-        ws_url = f'ws://localhost:{self._relay_port}/rpc/ws'
-        await self._connect_ws(ws_url, '(windows) WebSocket')
-
-    async def _connect_remote_ws(self) -> None:
-        """Connect to a remote (or Docker) box server via WebSocket."""
-        ws_url = self._resolve_rpc_ws_url()
-        self.ap.logger.info(f'Use WebSocket to connect to box runtime ({ws_url})')
-        await self._connect_ws(ws_url, 'WebSocket')
-
-    # -- helpers -------------------------------------------------------------
-
-    def _resolve_rpc_ws_url(self) -> str:
-        """Determine the action-RPC WebSocket URL.
-
-        All endpoints share a single port; action RPC is at ``/rpc/ws``.
-        """
-        if self.configured_runtime_endpoint:
-            base = self.configured_runtime_endpoint.rstrip('/')
-            parsed = urlparse(base)
-            scheme = parsed.scheme or 'ws'
-            if scheme in ('http', 'https'):
-                scheme = 'wss' if scheme == 'https' else 'ws'
-            host = parsed.hostname or '127.0.0.1'
-            port = parsed.port or _DEFAULT_PORT
-            return f'{scheme}://{host}:{port}/rpc/ws'
-
-        if platform.get_platform() == 'docker':
-            return f'ws://{_DOCKER_BOX_HOST}:{_DEFAULT_PORT}/rpc/ws'
-
-        return f'ws://localhost:{self._relay_port}/rpc/ws'
-
-    async def _connect_ws(self, ws_url: str, transport_name: str) -> None:
-        """Shared WebSocket connection procedure."""
-        from langbot_plugin.runtime.io.controllers.ws.client import WebSocketClientController
-
-        connected = asyncio.Event()
-        connect_error: list[Exception] = []
-
-        async def on_connect_failed(ctrl, exc):
-            if exc is not None:
-                self.ap.logger.error(f'Failed to connect to Box runtime ({ws_url}): {exc}')
-            else:
-                self.ap.logger.error(f'Failed to connect to Box runtime ({ws_url}), trying to reconnect...')
-            connect_error.append(exc or BoxRuntimeUnavailableError('ws connection failed'))
-            connected.set()
-            if self.runtime_disconnect_callback is not None:
-                await self.runtime_disconnect_callback(self)
-
-        ctrl = WebSocketClientController(ws_url=ws_url, make_connection_failed_callback=on_connect_failed)
-        self._ctrl_task = asyncio.create_task(
-            ctrl.run(self._make_connection_callback(transport_name, connected, connect_error))
-        )
-
-        try:
-            await asyncio.wait_for(connected.wait(), timeout=30.0)
-        except asyncio.TimeoutError:
-            raise BoxRuntimeUnavailableError(f'box runtime ws connection timed out ({ws_url})')
-
-        if connect_error:
-            raise BoxRuntimeUnavailableError(f'box runtime connection failed: {connect_error[0]}')
-
-    def _make_connection_callback(
-        self,
-        transport_name: str,
-        connected: asyncio.Event,
-        connect_error: list[Exception],
-    ):
-        async def new_connection_callback(connection: Connection) -> None:
-            handler = Handler(connection)
-            self._handler = handler
-            self.client.set_handler(handler)
-            self._handler_task = asyncio.create_task(handler.run())
-            try:
-                await handler.call_action(CommonAction.PING, {})
-                if self._filtered_box_config:
-                    await handler.call_action(LangBotToBoxAction.INIT, self._filtered_box_config)
-                    self.ap.logger.debug('Sent box configuration to Box runtime via INIT.')
-                self.ap.logger.info(f'Connected to Box runtime via {transport_name}.')
-                connected.set()
-                await self._handler_task
-            except Exception as exc:
-                if not connected.is_set():
-                    connect_error.append(exc)
-                    connected.set()
-                    return
-
-            # If we reach here, handler.run() returned normally (connection
-            # closed) or raised after the initial handshake succeeded.
-            # Either way, treat it as a disconnect.
-            if connected.is_set():
-                if self._uses_websocket():
-                    self.ap.logger.error('Disconnected from Box runtime, trying to reconnect...')
-                    if self.runtime_disconnect_callback is not None:
-                        await self.runtime_disconnect_callback(self)
-                else:
-                    self.ap.logger.error(
-                        'Disconnected from Box runtime via stdio. '
-                        'Cannot automatically reconnect — please restart LangBot.'
-                    )
-
-        return new_connection_callback
-
-    # -- lifecycle -----------------------------------------------------------
-
-    def dispose(self) -> None:
-        if self._heartbeat_task is not None:
-            self._heartbeat_task.cancel()
-            self._heartbeat_task = None
-
-        if self._handler_task is not None:
-            self._handler_task.cancel()
-            self._handler_task = None
-
-        if self._ctrl_task is not None:
-            self._ctrl_task.cancel()
-            self._ctrl_task = None
-
-        # stdio-managed subprocess (stored as self._subprocess by _start_local_stdio)
-        if hasattr(self, '_subprocess') and self._subprocess is not None and self._subprocess.returncode is None:
-            self.ap.logger.info('Terminating managed box runtime process...')
-            self._subprocess.terminate()
-
-        # Subprocess launched by ManagedRuntimeConnector._start_runtime_subprocess (Windows path)
-        self._dispose_subprocess()
-
-    # -- config helpers ------------------------------------------------------
-
-    def _load_configured_runtime_endpoint(self) -> str:
-        return _get_runtime_endpoint(_get_box_config(self.ap))
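Review note: the removed `_resolve_rpc_ws_url` helper normalized a configured runtime endpoint into the action-RPC WebSocket URL. The sketch below restates only that configured-endpoint branch as a standalone function, so the mapping is easy to check when porting it elsewhere. `resolve_rpc_ws_url` and `DEFAULT_PORT` are illustrative names, and the value 5410 is an assumption standing in for `_DEFAULT_PORT`, whose definition is not shown in this hunk.

```python
from urllib.parse import urlparse

# Assumption: placeholder for _DEFAULT_PORT, which is defined outside this hunk.
DEFAULT_PORT = 5410


def resolve_rpc_ws_url(endpoint: str, default_port: int = DEFAULT_PORT) -> str:
    """Map a configured runtime endpoint to its action-RPC WebSocket URL."""
    parsed = urlparse(endpoint.rstrip('/'))
    scheme = parsed.scheme or 'ws'
    # http/https endpoints are translated to their WebSocket equivalents.
    if scheme in ('http', 'https'):
        scheme = 'wss' if scheme == 'https' else 'ws'
    host = parsed.hostname or '127.0.0.1'
    port = parsed.port or default_port
    # Any path on the configured endpoint is discarded; RPC lives at /rpc/ws.
    return f'{scheme}://{host}:{port}/rpc/ws'


if __name__ == '__main__':
    print(resolve_rpc_ws_url('https://box.example.com:8443/'))
    print(resolve_rpc_ws_url('http://10.0.0.5/some/path'))
```

Note that the path component of the configured endpoint is dropped, so an endpoint behind a path-prefixed reverse proxy would not survive this mapping.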
--- a/src/langbot/pkg/box/policy.py
+++ b/src/langbot/pkg/box/policy.py
@@ -1,98 +0,0 @@
-"""Three-layer security policy for LangBot Box.
-
-The design separates concerns into three independent layers, aligned with
-OpenCode / OpenClaw patterns:
-
-1. **SandboxPolicy** – *where* tools run (host vs sandbox).
-2. **ToolPolicy** – *which* tools are allowed (allow/deny lists).
-3. **ElevatedPolicy** – *whether* a single exec call may temporarily
-   escape the default sandbox boundary.
-
-These three layers are orthogonal:
-- ToolPolicy is a hard boundary; ``elevated`` cannot bypass a denied tool.
-- SandboxPolicy decides the default execution location.
-- ElevatedPolicy only affects ``exec`` and only when the framework allows it.
-"""
-
-from __future__ import annotations
-
-import enum
-from typing import Sequence
-
-
-# ── Layer 1: Sandbox Policy ──────────────────────────────────────────
-
-
-class SandboxMode(str, enum.Enum):
-    """Determines when agent execution is routed through the sandbox."""
-
-    OFF = 'off'
-    """Sandbox disabled; all exec runs on the host."""
-
-    NON_DEFAULT = 'non_default'
-    """Only non-default sessions are sandboxed (e.g. sub-agents, MCP)."""
-
-    ALL = 'all'
-    """Every agent exec call is routed through the sandbox."""
-
-
-class SandboxPolicy:
-    """Decides whether a given execution context should use the sandbox."""
-
-    def __init__(self, mode: SandboxMode = SandboxMode.ALL):
-        self.mode = mode
-
-    def should_sandbox(self, *, is_default_session: bool = True) -> bool:
-        if self.mode == SandboxMode.OFF:
-            return False
-        if self.mode == SandboxMode.ALL:
-            return True
-        # NON_DEFAULT: sandbox everything except the default session
-        return not is_default_session
-
-
-# ── Layer 2: Tool Policy ─────────────────────────────────────────────
-
-
-class ToolPolicy:
-    """Controls which tools are available to the current agent/session.
-
-    Rules:
-    - ``deny`` always takes precedence over ``allow``.
-    - An empty ``allow`` list means "all tools allowed" (no allowlist filter).
-    - ``elevated`` cannot bypass a denied tool.
-    """
-
-    def __init__(
-        self,
-        allow: Sequence[str] = (),
-        deny: Sequence[str] = (),
-    ):
-        self._allow: frozenset[str] = frozenset(allow)
-        self._deny: frozenset[str] = frozenset(deny)
-
-    def is_tool_allowed(self, tool_name: str) -> bool:
-        if tool_name in self._deny:
-            return False
-        if self._allow and tool_name not in self._allow:
-            return False
-        return True
-
-
-# ── Layer 3: Elevated Policy ─────────────────────────────────────────
-
-
-class ElevatedPolicy:
-    """Controls whether ``exec`` may request temporary privilege escalation.
-
-    ``elevated`` only applies to the ``exec`` tool.  It means "run this
-    command outside the default sandbox boundary" (e.g. with network, or
-    on the host).  The framework decides whether to honor the request.
-    """
-
-    def __init__(self, *, allow_elevated: bool = False, require_approval: bool = True):
-        self.allow_elevated = allow_elevated
-        self.require_approval = require_approval
-
-    def is_elevation_permitted(self) -> bool:
-        return self.allow_elevated
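Review note: the removed module defined the three policy layers but no code that combined them. The sketch below shows one way the documented rules could compose, assuming the order deny check, then sandbox placement, then elevation. The three classes are condensed copies of the removed ones; `resolve_execution` and its return labels are hypothetical and do not appear in the original code. `require_approval` is deliberately left out, since the diff does not show how approval was obtained.

```python
import enum
from typing import Sequence


class SandboxMode(str, enum.Enum):
    OFF = 'off'
    NON_DEFAULT = 'non_default'
    ALL = 'all'


class SandboxPolicy:
    def __init__(self, mode: SandboxMode = SandboxMode.ALL):
        self.mode = mode

    def should_sandbox(self, *, is_default_session: bool = True) -> bool:
        if self.mode == SandboxMode.OFF:
            return False
        if self.mode == SandboxMode.ALL:
            return True
        return not is_default_session


class ToolPolicy:
    def __init__(self, allow: Sequence[str] = (), deny: Sequence[str] = ()):
        self._allow = frozenset(allow)
        self._deny = frozenset(deny)

    def is_tool_allowed(self, tool_name: str) -> bool:
        if tool_name in self._deny:
            return False
        return not self._allow or tool_name in self._allow


class ElevatedPolicy:
    def __init__(self, *, allow_elevated: bool = False):
        self.allow_elevated = allow_elevated

    def is_elevation_permitted(self) -> bool:
        return self.allow_elevated


def resolve_execution(
    tool_name: str,
    *,
    sandbox: SandboxPolicy,
    tools: ToolPolicy,
    elevated: ElevatedPolicy,
    is_default_session: bool = True,
    wants_elevated: bool = False,
) -> str:
    """Hypothetical combiner: returns 'denied', 'host', 'elevated' or 'sandbox'."""
    # Layer 2 is a hard boundary: a denied tool stays denied, elevated or not.
    if not tools.is_tool_allowed(tool_name):
        return 'denied'
    # Layer 1 picks the default execution location.
    if not sandbox.should_sandbox(is_default_session=is_default_session):
        return 'host'
    # Layer 3 applies to ``exec`` only, and only when the framework permits it.
    if wants_elevated and tool_name == 'exec' and elevated.is_elevation_permitted():
        return 'elevated'
    return 'sandbox'
```

With `ToolPolicy(deny=['exec'])`, a request for elevated `exec` still resolves to `'denied'`, which matches the docstring's statement that `elevated` cannot bypass a denied tool.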
--- a/src/langbot/pkg/box/service.py
+++ b/src/langbot/pkg/box/service.py
@@ -1,797 +0,0 @@
-from __future__ import annotations
-
-import asyncio
-import collections
-import datetime as _dt
-import enum
-import json
-import os
-from typing import TYPE_CHECKING
-
-import pydantic
-
-from langbot_plugin.box.client import BoxRuntimeClient
-from .connector import BoxRuntimeConnector, _get_box_config
-from langbot_plugin.box.errors import BoxError, BoxValidationError
-from langbot_plugin.box.models import (
-    BUILTIN_PROFILES,
-    BoxExecutionResult,
-    BoxManagedProcessInfo,
-    BoxManagedProcessSpec,
-    BoxProfile,
-    BoxSpec,
-)
-
-_INT_ADAPTER = pydantic.TypeAdapter(int)
-_UTC = _dt.timezone.utc
-_MAX_RECENT_ERRORS = 50
-_MIB = 1024 * 1024
-
-
-def _is_path_under(path: str, root: str) -> bool:
-    """Check whether *path* equals *root* or is a child of *root*."""
-    return path == root or path.startswith(f'{root}{os.sep}')
-
-
-if TYPE_CHECKING:
-    from ..core import app as core_app
-    import langbot_plugin.api.entities.builtin.pipeline.query as pipeline_query
-
-
-class BoxService:
-    def __init__(
-        self,
-        ap: core_app.Application,
-        client: BoxRuntimeClient | None = None,
-        output_limit_chars: int = 4000,
-    ):
-        self.ap = ap
-        self._enabled = self._load_enabled()
-        self._runtime_connector: BoxRuntimeConnector | None = None
-        if client is None:
-            # Always construct a connector — its __init__ is side-effect free
-            # (no I/O, no subprocess). When ``box.enabled = false`` we simply
-            # skip ``connector.initialize()`` so no connection is attempted.
-            self._runtime_connector = BoxRuntimeConnector(ap, runtime_disconnect_callback=self._on_runtime_disconnect)
-            client = self._runtime_connector.client
-        self.client = client
-        self.output_limit_chars = output_limit_chars
-        self.host_root = self._load_host_root()
-        self.allowed_mount_roots = self._load_allowed_mount_roots()
-        self.default_workspace = self._load_default_workspace()
-        self.profile = self._load_profile()
-        self.custom_image = self._load_custom_image()
-        self.workspace_quota_mb = self._load_workspace_quota_mb()
-        self._recent_errors: collections.deque[dict] = collections.deque(maxlen=_MAX_RECENT_ERRORS)
-        self._shutdown_task = None
-        self._available = False
-        self._connector_error: str = ''
-        self._reconnecting = False
-
-    @property
-    def enabled(self) -> bool:
-        """Whether Box is enabled in config. False means the operator has
-        deliberately turned the sandbox off via ``box.enabled = false``.
-        Disabled and "enabled but unavailable" are reported as the same
-        ``available = False`` to consumers, but distinguished in get_status."""
-        return self._enabled
-
-    async def initialize(self):
-        self._ensure_default_workspace()
-        if not self._enabled:
-            # Disabled by config: do NOT connect to a remote runtime, do NOT
-            # fork a stdio subprocess. Every consumer of box_service should
-            # gate on ``available`` and degrade gracefully.
-            self._available = False
-            self._connector_error = 'Box runtime is disabled in config (box.enabled = false)'
-            self.ap.logger.info(
-                'Box runtime disabled by config; sandbox features (exec/read/write/edit, '
-                'skill add/edit, stdio MCP) will be unavailable.'
-            )
-            return
-        try:
-            if self._runtime_connector is not None:
-                await self._runtime_connector.initialize()
-            else:
-                await self.client.initialize()
-            self._available = True
-            self._connector_error = ''
-            self.ap.logger.info(
-                f'LangBot Box runtime initialized: profile={self.profile.name} '
-                f'default_workspace={self.default_workspace or "(none)"}'
-            )
-        except Exception as exc:
-            self.ap.logger.warning(f'LangBot Box runtime unavailable, sandbox features disabled: {exc}')
-            self._available = False
-            self._connector_error = str(exc)
-
-    async def _on_runtime_disconnect(self, connector: BoxRuntimeConnector) -> None:
-        """Called by the connector when the Box runtime connection drops.
-
-        Spawns a background reconnection loop so the caller is not blocked.
-        Skipped entirely when Box is disabled by config — that path should
-        never have connected in the first place.
-        """
-        if not self._enabled:
-            return
-        if self._reconnecting:
-            return  # Another reconnect loop is already running
-        self._reconnecting = True
-        self._available = False
-        self._connector_error = 'Disconnected from Box runtime'
-        self.ap.logger.warning('Box runtime disconnected, sandbox features temporarily disabled.')
-        asyncio.create_task(self._reconnect_loop(connector))
-
-    async def _reconnect_loop(self, connector: BoxRuntimeConnector) -> None:
-        """Retry reconnection with exponential backoff (3s → 60s max)."""
-        delay = 3
-        max_delay = 60
-        try:
-            while True:
-                self.ap.logger.info(f'Attempting to reconnect to Box runtime in {delay}s...')
-                await asyncio.sleep(delay)
-                try:
-                    connector.dispose()
-                    await connector.initialize()
-                    self._available = True
-                    self._connector_error = ''
-                    self.ap.logger.info('Box runtime reconnected, sandbox features restored.')
-                    return
-                except Exception as exc:
-                    self._connector_error = str(exc)
-                    self.ap.logger.warning(f'Box runtime reconnection failed: {exc}')
-                    delay = min(delay * 2, max_delay)
-        finally:
-            self._reconnecting = False
-
-    @property
-    def available(self) -> bool:
-        return self._available
-
-    async def execute_spec_payload(
-        self,
-        spec_payload: dict,
-        query: pipeline_query.Query,
-        *,
-        skip_host_mount_validation: bool = False,
-    ) -> dict:
-        if not self._available:
-            raise BoxError('Box runtime is not available. Install and start Docker to use sandbox features.')
-        try:
-            spec = self.build_spec(spec_payload, skip_host_mount_validation=skip_host_mount_validation)
-        except BoxError as exc:
-            self._record_error(exc, query)
-            raise
-        self.ap.logger.info(
-            'LangBot Box request: '
-            f'query_id={query.query_id} '
-            f'spec={json.dumps(self._summarize_spec(spec), ensure_ascii=False)}'
-        )
-        try:
-            await self._enforce_workspace_quota(spec, phase='before execution')
-        except BoxError as exc:
-            self._record_error(exc, query)
-            raise
-        try:
-            result = await self.client.execute(spec)
-        except BoxError as exc:
-            self._record_error(exc, query)
-            raise
-        try:
-            await self._enforce_workspace_quota(spec, phase='after execution')
-        except BoxError as exc:
-            await self._cleanup_exceeded_session(spec)
-            self._record_error(exc, query)
-            raise
-        self.ap.logger.info(
-            'LangBot Box result: '
-            f'query_id={query.query_id} '
-            f'summary={json.dumps(self._summarize_result(result), ensure_ascii=False)}'
-        )
-        return self._serialize_result(result)
-
-    def resolve_box_session_id(self, query: pipeline_query.Query) -> str:
-        """Resolve the Box session_id from the pipeline's template and query variables."""
-        template = (
-            (query.pipeline_config or {})
-            .get('ai', {})
-            .get('local-agent', {})
-            .get('box-session-id-template', '{launcher_type}_{launcher_id}')
-        )
-        variables = dict(query.variables or {})
-        launcher_type = getattr(query, 'launcher_type', None)
-        if hasattr(launcher_type, 'value'):
-            launcher_type = launcher_type.value
-        launcher_id = getattr(query, 'launcher_id', None)
-        sender_id = getattr(query, 'sender_id', None)
-        query_id = getattr(query, 'query_id', None)
-
-        variables.setdefault('query_id', str(query_id or 'unknown'))
-        variables.setdefault('launcher_type', str(launcher_type or 'query'))
-        variables.setdefault('launcher_id', str(launcher_id or query_id or 'unknown'))
-        variables.setdefault('sender_id', str(sender_id or launcher_id or query_id or 'unknown'))
-        variables.setdefault('global', 'global')
-        return template.format_map(collections.defaultdict(lambda: 'unknown', variables))
-
-    def build_skill_extra_mounts(self, query: pipeline_query.Query) -> list[dict]:
-        """Build extra_mounts entries for all pipeline-bound skills.
-
-        This ensures that when a container is first created it already has
-        all skill packages mounted, regardless of which skill is currently
-        activated.
-
-        Skills whose ``package_root`` is missing or no longer a directory on
-        the LangBot-visible filesystem are skipped with a warning instead of
-        being passed through to the backend. Without this guard the three
-        backends behave inconsistently on a stale mount: nsjail refuses to
-        start the sandbox (failing every exec in the session), Docker
-        silently auto-creates a root-owned empty directory on the host, and
-        E2B silently skips the upload — none of which surfaces an
-        actionable error to the agent or operator.
-        """
-        skill_mgr = getattr(self.ap, 'skill_mgr', None)
-        if skill_mgr is None:
-            return []
-
-        from ..provider.tools.loaders import skill as skill_loader
-
-        visible_skills = skill_loader.get_visible_skills(self.ap, query)
-        mounts: list[dict] = []
-        for skill_name, skill_data in visible_skills.items():
-            package_root = str(skill_data.get('package_root', '') or '').strip()
-            if not package_root:
-                continue
-            if not os.path.isdir(package_root):
-                self.ap.logger.warning(
-                    f'Skill "{skill_name}" package_root missing on filesystem '
-                    f'({package_root}); skipping mount to prevent sandbox failures. '
-                    f'The skill cache may be stale — consider reloading skills.'
-                )
-                continue
-            mounts.append(
-                {
-                    'host_path': package_root,
-                    'mount_path': f'/workspace/.skills/{skill_name}',
-                    'mode': 'rw',
-                }
-            )
-        return mounts
-
-    async def execute_tool(self, parameters: dict, query: pipeline_query.Query) -> dict:
-        """Execute an agent-facing ``exec`` tool call.
-
-        Translates the agent-facing ``command`` field to the internal
-        ``BoxSpec.cmd`` field and injects the session id from the query.
-        """
-        spec_payload: dict = {'cmd': parameters['command']}
-
-        # Pass through allowed agent-facing fields
-        for key in ('workdir', 'timeout_sec', 'env'):
-            if key in parameters:
-                spec_payload[key] = parameters[key]
-
-        # Inject context the agent must not control
-        spec_payload.setdefault('session_id', self.resolve_box_session_id(query))
-
-        # Mount all pipeline-bound skills so they are available in the container
-        if 'extra_mounts' not in spec_payload:
-            spec_payload['extra_mounts'] = self.build_skill_extra_mounts(query)
-
-        return await self.execute_spec_payload(spec_payload, query)
-
-    async def shutdown(self):
-        await self.client.shutdown()
-
-    def dispose(self):
-        if self._runtime_connector is not None:
-            self._runtime_connector.dispose()
-        loop = getattr(self.ap, 'event_loop', None)
-        if loop is not None and not loop.is_closed() and (self._shutdown_task is None or self._shutdown_task.done()):
-            self._shutdown_task = loop.create_task(self.shutdown())
-
-    async def get_sessions(self) -> list[dict]:
-        if not self._available:
-            return []
-        try:
-            return await self.client.get_sessions()
-        except Exception:
-            return []
-
-    def build_spec(self, spec_payload: dict, skip_host_mount_validation: bool = False) -> BoxSpec:
-        spec_payload = dict(spec_payload)
-        spec_payload.setdefault('env', {})
-        if spec_payload.get('host_path') in (None, '') and self.default_workspace is not None:
-            spec_payload['host_path'] = self.default_workspace
-        if spec_payload.get('workspace_quota_mb') in (None, '') and self.workspace_quota_mb is not None:
-            spec_payload['workspace_quota_mb'] = self.workspace_quota_mb
-
-        # Global custom image overrides profile default (but not caller-specified image)
-        if self.custom_image and 'image' not in spec_payload:
-            spec_payload['image'] = self.custom_image
-
-        self._apply_profile(spec_payload)
-
-        try:
-            spec = BoxSpec.model_validate(spec_payload)
-        except pydantic.ValidationError as exc:
-            first_error = exc.errors()[0]
-            raise BoxValidationError(first_error.get('msg', 'invalid box arguments')) from exc
-
-        if not skip_host_mount_validation:
-            self._validate_host_mount(spec)
-        return spec
-
-    async def create_session(self, spec_payload: dict, *, skip_host_mount_validation: bool = False) -> dict:
-        spec = self.build_spec(spec_payload, skip_host_mount_validation=skip_host_mount_validation)
-        return await self.client.create_session(spec)
-
-    async def start_managed_process(self, session_id: str, process_payload: dict) -> BoxManagedProcessInfo:
-        process_spec = BoxManagedProcessSpec.model_validate(process_payload)
-        return await self.client.start_managed_process(session_id, process_spec)
-
-    async def get_managed_process(self, session_id: str, process_id: str = 'default') -> BoxManagedProcessInfo:
-        return await self.client.get_managed_process(session_id, process_id)
-
-    async def stop_managed_process(self, session_id: str, process_id: str = 'default') -> None:
-        return await self.client.stop_managed_process(session_id, process_id)
-
-    def get_managed_process_websocket_url(self, session_id: str, process_id: str = 'default') -> str:
-        getter = getattr(self.client, 'get_managed_process_websocket_url', None)
-        if getter is None:
-            raise BoxValidationError('box runtime client does not support managed process websocket attach')
-        ws_relay_base_url = (
-            self._runtime_connector.ws_relay_base_url
-            if self._runtime_connector is not None
-            else 'http://127.0.0.1:5410'
-        )
-        return getter(session_id, ws_relay_base_url, process_id)
-
-    async def list_skills(self) -> list[dict]:
-        return await self.client.list_skills()
-
-    async def get_skill(self, name: str) -> dict | None:
-        return await self.client.get_skill(name)
-
-    async def create_skill(self, skill: dict) -> dict:
-        return await self.client.create_skill(skill)
-
-    async def update_skill(self, name: str, skill: dict) -> dict:
-        return await self.client.update_skill(name, skill)
-
-    async def delete_skill(self, name: str) -> None:
-        await self.client.delete_skill(name)
-
-    async def scan_skill_directory(self, path: str) -> dict:
-        return await self.client.scan_skill_directory(path)
-
-    async def list_skill_files(
-        self,
-        name: str,
-        path: str = '.',
-        include_hidden: bool = False,
-        max_entries: int = 200,
-    ) -> dict:
-        return await self.client.list_skill_files(name, path, include_hidden, max_entries)
-
-    async def read_skill_file(self, name: str, path: str) -> dict:
-        return await self.client.read_skill_file(name, path)
-
-    async def write_skill_file(self, name: str, path: str, content: str) -> dict:
-        return await self.client.write_skill_file(name, path, content)
-
-    async def preview_skill_zip(
-        self,
-        file_bytes: bytes,
-        filename: str,
-        source_subdir: str = '',
-        target_suffix: str = 'upload',
-    ) -> list[dict]:
-        return await self.client.preview_skill_zip(file_bytes, filename, source_subdir, target_suffix)
-
-    async def install_skill_zip(
-        self,
-        file_bytes: bytes,
-        filename: str,
-        source_paths: list[str] | None = None,
-        source_path: str = '',
-        source_subdir: str = '',
-        target_suffix: str = 'upload',
-    ) -> list[dict]:
-        return await self.client.install_skill_zip(
-            file_bytes,
-            filename,
-            source_paths,
-            source_path,
-            source_subdir,
-            target_suffix,
-        )
-
-    def _serialize_result(self, result: BoxExecutionResult) -> dict:
-        stdout, stdout_truncated = self._truncate(result.stdout)
-        stderr, stderr_truncated = self._truncate(result.stderr)
-
-        return {
-            'session_id': result.session_id,
-            'backend': result.backend_name,
-            'status': result.status.value,
-            'ok': result.ok,
-            'exit_code': result.exit_code,
-            'stdout': stdout,
-            'stderr': stderr,
-            'stdout_truncated': stdout_truncated,
-            'stderr_truncated': stderr_truncated,
-            'duration_ms': result.duration_ms,
-        }
-
-    def _truncate(self, text: str) -> tuple[str, bool]:
-        if len(text) <= self.output_limit_chars:
-            return text, False
-        if self.output_limit_chars <= 0:
-            return '', True
-
-        head_size = 0
-        tail_size = 0
-        notice = ''
-        # Recompute once the omitted count is known so the final payload
-        # stays within output_limit_chars even after adding the notice.
-        for _ in range(4):
-            omitted = max(len(text) - head_size - tail_size, 0)
-            notice = f'\n\n... [{omitted} characters truncated] ...\n\n'
-            available = self.output_limit_chars - len(notice)
-            if available <= 0:
-                return notice[: self.output_limit_chars], True
-
-            new_head_size = int(available * 0.6)
-            new_tail_size = available - new_head_size
-            if new_head_size == head_size and new_tail_size == tail_size:
-                break
-            head_size = new_head_size
-            tail_size = new_tail_size
-
-        head = text[:head_size]
-        tail = text[-tail_size:] if tail_size else ''
-        truncated = f'{head}{notice}{tail}'
-        return truncated[: self.output_limit_chars], True
-
-    def _summarize_spec(self, spec: BoxSpec) -> dict:
-        cmd = spec.cmd.strip()
-        if len(cmd) > 400:
-            cmd = f'{cmd[:397]}...'
-
-        return {
-            'session_id': spec.session_id,
-            'workdir': spec.workdir,
-            'mount_path': spec.mount_path,
-            'timeout_sec': spec.timeout_sec,
-            'network': spec.network.value,
-            'image': spec.image,
-            'host_path': spec.host_path,
-            'host_path_mode': spec.host_path_mode.value,
-            'cpus': spec.cpus,
-            'memory_mb': spec.memory_mb,
-            'pids_limit': spec.pids_limit,
-            'read_only_rootfs': spec.read_only_rootfs,
-            'workspace_quota_mb': spec.workspace_quota_mb,
-            'env_keys': sorted(spec.env.keys()),
-            'cmd': cmd,
-        }
-
-    def _summarize_result(self, result: BoxExecutionResult) -> dict:
-        stdout_preview = result.stdout[:200]
-        stderr_preview = result.stderr[:200]
-        if len(result.stdout) > 200:
-            stdout_preview = f'{stdout_preview}...'
-        if len(result.stderr) > 200:
-            stderr_preview = f'{stderr_preview}...'
-
-        return {
-            'session_id': result.session_id,
-            'backend': result.backend_name,
-            'status': result.status.value,
-            'exit_code': result.exit_code,
-            'duration_ms': result.duration_ms,
-            'stdout_preview': stdout_preview,
-            'stderr_preview': stderr_preview,
-        }
-
-    def _local_config(self) -> dict:
-        """Return ``box.local`` from instance config.
-
-        Environment overrides are applied uniformly by
-        ``LoadConfigStage._apply_env_overrides_to_config`` (e.g.
-        ``BOX__LOCAL__HOST_ROOT``) before this is read, so no box-specific
-        env parsing happens here.
-        """
-        return dict(_get_box_config(self.ap).get('local') or {})
-
-    def _load_allowed_mount_roots(self) -> list[str]:
-        configured_roots = self._local_config().get('allowed_mount_roots', [])
-        # The unified env-override mechanism stores a brand-new key as a raw
-        # string when the key is absent from config.yaml. Accept a
-        # comma-separated string as well as a list so that
-        # ``BOX__LOCAL__ALLOWED_MOUNT_ROOTS="/a,/b"`` keeps working even when
-        # the config file has no ``box.local.allowed_mount_roots`` entry.
-        if isinstance(configured_roots, str):
-            configured_roots = [item.strip() for item in configured_roots.split(',') if item.strip()]
-
-        normalized_roots: list[str] = []
-        for root in configured_roots:
-            root_value = str(root).strip()
-            if not root_value:
-                continue
-            normalized_roots.append(os.path.realpath(os.path.abspath(root_value)))
-
-        if not normalized_roots and self.host_root is not None:
-            normalized_roots.append(self.host_root)
-
-        return normalized_roots
-
-    def _load_host_root(self) -> str | None:
-        host_root = str(self._local_config().get('host_root', '')).strip()
-        if not host_root:
-            return None
-        return os.path.realpath(os.path.abspath(host_root))
-
-    def _load_default_workspace(self) -> str | None:
-        default_workspace = str(self._local_config().get('default_workspace', '')).strip()
-        if not default_workspace:
-            if self.host_root is None:
-                return None
-            default_workspace = os.path.join(self.host_root, 'default')
-        elif not os.path.isabs(default_workspace) and self.host_root is not None:
-            default_workspace = os.path.join(self.host_root, default_workspace)
-        return os.path.realpath(os.path.abspath(default_workspace))
-
-    def get_skills_root(self) -> str | None:
-        skills_root = str(self._local_config().get('skills_root', '') or 'skills').strip()
-        if not skills_root:
-            skills_root = 'skills'
-        if not os.path.isabs(skills_root) and self.host_root is not None:
-            skills_root = os.path.join(self.host_root, skills_root)
-        return os.path.realpath(os.path.abspath(skills_root))
-
-    def _load_enabled(self) -> bool:
-        """Read ``box.enabled`` (top-level, not ``box.local.*``). Default True
-        — disabling is opt-in. Accepts bool, ``'true'``/``'false'`` strings,
-        and the standard env-overridden truthy values that
-        ``LoadConfigStage._apply_env_overrides_to_config`` produces."""
-        raw = _get_box_config(self.ap).get('enabled', True)
-        if isinstance(raw, bool):
-            return raw
-        return str(raw).strip().lower() not in ('false', '0', 'no', 'off', '')
-
-    def _load_custom_image(self) -> str | None:
-        raw = str(self._local_config().get('image', '') or '').strip()
-        return raw or None
-
-    def _load_workspace_quota_mb(self) -> int | None:
-        raw_value = self._local_config().get('workspace_quota_mb')
-        if raw_value in (None, ''):
-            return None
-        try:
-            value = _INT_ADAPTER.validate_python(raw_value)
-        except pydantic.ValidationError as exc:
-            raise BoxValidationError('workspace_quota_mb must be an integer greater than or equal to 0') from exc
-        if value < 0:
-            raise BoxValidationError('workspace_quota_mb must be greater than or equal to 0')
-        return value
-
-    def _ensure_default_workspace(self):
-        if self.default_workspace is None:
-            return
-
-        if os.path.isdir(self.default_workspace):
-            return
-
-        if os.path.exists(self.default_workspace):
-            raise BoxValidationError('box.local.default_workspace must point to a directory on the host')
-
-        if not self.allowed_mount_roots:
-            raise BoxValidationError(
-                'box.local.default_workspace cannot be created because no allowed_mount_roots are configured'
-            )
-
-        for allowed_root in self.allowed_mount_roots:
-            if _is_path_under(self.default_workspace, allowed_root):
-                os.makedirs(self.default_workspace, exist_ok=True)
-                return
-
-        allowed_roots = ', '.join(self.allowed_mount_roots)
-        raise BoxValidationError(f'box.local.default_workspace is outside allowed_mount_roots: {allowed_roots}')
-
-    def _validate_host_mount(self, spec: BoxSpec):
-        if spec.host_path is None:
-            return
-
-        host_path = os.path.realpath(spec.host_path)
-        if not os.path.isdir(host_path):
-            raise BoxValidationError('host_path must point to an existing directory on the host')
-
-        if not self.allowed_mount_roots:
-            raise BoxValidationError('host_path mounting is disabled because no allowed_mount_roots are configured')
-
-        for allowed_root in self.allowed_mount_roots:
-            if _is_path_under(host_path, allowed_root):
-                return
-
-        allowed_roots = ', '.join(self.allowed_mount_roots)
-        raise BoxValidationError(f'host_path is outside allowed_mount_roots: {allowed_roots}')
-
-    def _load_profile(self) -> BoxProfile:
-        profile_name = str(self._local_config().get('profile', 'default')).strip() or 'default'
-
-        profile = BUILTIN_PROFILES.get(profile_name)
-        if profile is None:
-            available = ', '.join(sorted(BUILTIN_PROFILES))
-            raise BoxValidationError(f"unknown box profile '{profile_name}', available profiles: {available}")
-        return profile
-
-    def _apply_profile(self, params: dict):
-        """Merge profile defaults into *params* in-place, enforce locked fields and clamp timeout."""
-        profile = self.profile
-        _PROFILE_FIELDS = (
-            'image',
-            'network',
-            'timeout_sec',
-            'host_path_mode',
-            'cpus',
-            'memory_mb',
-            'pids_limit',
-            'read_only_rootfs',
-            'workspace_quota_mb',
-        )
-
-        for field in _PROFILE_FIELDS:
-            profile_value = getattr(profile, field)
-            raw_value = profile_value.value if isinstance(profile_value, enum.Enum) else profile_value
-
-            if field in profile.locked:
-                params[field] = raw_value
-            elif field not in params:
-                params[field] = raw_value
-
-        timeout = params.get('timeout_sec')
-        try:
-            normalized_timeout = _INT_ADAPTER.validate_python(timeout)
-        except pydantic.ValidationError:
-            return
-
-        if normalized_timeout > profile.max_timeout_sec:
-            params['timeout_sec'] = profile.max_timeout_sec
-
-    def _get_workspace_size_bytes(self, root: str) -> int:
-        total = 0
-
-        def _walk(path: str):
-            nonlocal total
-            try:
-                with os.scandir(path) as entries:
-                    for entry in entries:
-                        try:
-                            if entry.is_symlink():
-                                total += entry.stat(follow_symlinks=False).st_size
-                                continue
-                            if entry.is_dir(follow_symlinks=False):
-                                _walk(entry.path)
-                                continue
-                            total += entry.stat(follow_symlinks=False).st_size
-                        except FileNotFoundError:
-                            continue
-            except FileNotFoundError:
-                return
-
-        _walk(root)
-        return total
-
-    async def _enforce_workspace_quota(self, spec: BoxSpec, *, phase: str) -> None:
-        if spec.host_path is None or spec.workspace_quota_mb <= 0:
-            return
-
-        host_path = os.path.realpath(spec.host_path)
-        if not os.path.isdir(host_path):
-            return
-
-        # Walk the workspace off the event loop — this runs on every
-        # quota-enforced exec, and a large tree would otherwise block the whole
-        # asyncio runtime (all bots/pipelines) for the duration of the scan.
-        used_bytes = await asyncio.to_thread(self._get_workspace_size_bytes, host_path)
-        limit_bytes = spec.workspace_quota_mb * _MIB
-        if used_bytes <= limit_bytes:
-            return
-
-        raise BoxValidationError(
-            f'workspace quota exceeded {phase}: '
-            f'used={used_bytes} bytes limit={limit_bytes} bytes '
-            f'host_path={host_path} session_id={spec.session_id}'
-        )
-
-    async def _cleanup_exceeded_session(self, spec: BoxSpec) -> None:
-        try:
-            await self.client.delete_session(spec.session_id)
-        except Exception as exc:
-            self.ap.logger.warning(
-                'Failed to clean up Box session after workspace quota was exceeded: '
-                f'session_id={spec.session_id} error={exc}'
-            )
-
-    # ── Observability ─────────────────────────────────────────────────
-
-    def _record_error(self, exc: Exception, query: pipeline_query.Query):
-        self._recent_errors.append(
-            {
-                'timestamp': _dt.datetime.now(_UTC).isoformat(),
-                'type': type(exc).__name__,
-                'message': str(exc),
-                'query_id': str(query.query_id),
-            }
-        )
-
-    def get_recent_errors(self) -> list[dict]:
-        return list(self._recent_errors)
-
-    def get_system_guidance(self) -> str:
-        """Return LLM system-prompt guidance for the exec tool.
-
-        All execution-specific prompt text is kept here so that callers
-        (e.g. LocalAgentRunner) stay free of box domain knowledge.
-        """
-        guidance = (
-            'When the exec tool is available, use it for exact calculations, statistics, structured data parsing, '
-            'and code execution instead of estimating mentally. If the user provides numbers, tables, CSV-like text, '
-            'JSON, or other data and asks for a computed answer, prefer running a short Python script via exec '
-            'and then answer from the tool result. Unless the user explicitly asks for the script, code, or implementation '
-            'details, do not include the generated script in the final answer; return the result and a brief explanation only.'
-        )
-        if self.default_workspace:
-            guidance += (
-                ' A default workspace is mounted at /workspace for file tasks. When the user asks to read, create, or '
-                'modify local files in the working directory, use exec with /workspace paths directly; do not ask the '
-                'user for directory parameters unless they explicitly need a different directory.'
-            )
-        return guidance
-
-    async def get_status(self) -> dict:
-        if not self._available:
-            return {
-                'available': False,
-                'enabled': self._enabled,
-                'profile': self.profile.name,
-                'recent_error_count': len(self._recent_errors),
-                'connector_error': self._connector_error,
-            }
-        try:
-            runtime_status = await self.client.get_status()
-        except Exception as exc:
-            # RPC failed — the runtime likely just disconnected and the
-            # heartbeat hasn't flipped _available yet.
-            return {
-                'available': False,
-                'enabled': self._enabled,
-                'profile': self.profile.name,
-                'recent_error_count': len(self._recent_errors),
-                'connector_error': str(exc),
-            }
-        # Backend state can be unavailable even when the connector is healthy
-        # (operator selected nsjail but the binary is missing, Docker daemon
-        # went down after the runtime started, E2B credentials wrong, ...).
-        # Report the combined state in the top-level ``available`` so the
-        # frontend banner / ``useBoxStatus`` hook / native-tool gate all
-        # agree on "actually usable" rather than "connector alive". The
-        # detailed ``backend`` object stays in the payload so the dialog
-        # can still show which backend was tried.
-        backend_info = runtime_status.get('backend') if isinstance(runtime_status, dict) else None
-        backend_ok = bool(backend_info and backend_info.get('available', False))
-        payload = {
-            **runtime_status,
-            'available': backend_ok,
-            'enabled': self._enabled,
-            'profile': self.profile.name,
-            'recent_error_count': len(self._recent_errors),
-        }
-        if not backend_ok and 'connector_error' not in payload:
-            backend_name = backend_info.get('name') if backend_info else None
-            if backend_name:
-                payload['connector_error'] = f'Configured sandbox backend "{backend_name}" is unavailable'
-            else:
-                payload['connector_error'] = 'No supported sandbox backend (Docker / nsjail / E2B) is available'
-        return payload
--- a/src/langbot/pkg/box/workspace.py
+++ b/src/langbot/pkg/box/workspace.py
@@ -1,413 +0,0 @@
-"""Reusable workspace/session helpers built on top of Box.
-
-This module is the middle layer between the raw Box runtime primitives and
-application-specific flows such as skills or MCP stdio.
-
-It intentionally stays generic:
-- path and virtualenv rewriting are workspace concerns
-- Python project detection/bootstrap are workspace concerns
-- session exec / managed-process helpers are workspace concerns
-
-Higher layers add their own semantics on top, for example:
-- skills choose a stable per-skill session id and use repeated exec
-- MCP stdio chooses how to prepare dependencies and attaches to a managed process
-"""
-
-from __future__ import annotations
-
-import os
-import textwrap
-from typing import Any
-
-PYTHON_MANIFEST_FILES = (
-    'requirements.txt',
-    'pyproject.toml',
-    'setup.py',
-    'setup.cfg',
-)
-_VENV_DIRS = frozenset({'.venv', 'venv', 'env', '.env'})
-_VENV_BIN_DIRS = frozenset({'bin', 'Scripts'})
-
-
-def normalize_host_path(path: str | None) -> str:
-    if path is None:
-        return ''
-    stripped = str(path).strip()
-    if not stripped:
-        return ''
-    return os.path.realpath(os.path.abspath(stripped))
-
-
-def rewrite_mounted_path(path: str, host_path: str | None, *, mount_path: str = '/workspace') -> str:
-    """Translate a host path into the path visible inside the sandbox mount."""
-    if not host_path or not path:
-        return path
-    normalized_host = os.path.realpath(host_path)
-    normalized_path = os.path.realpath(path)
-    if normalized_path.startswith(normalized_host + '/'):
-        return mount_path + normalized_path[len(normalized_host) :]
-    if normalized_path == normalized_host:
-        return mount_path
-    return path
-
-
-def unwrap_venv_path(directory: str) -> str:
-    """Collapse ``.../.venv/bin`` style paths back to the project root."""
-    parts = directory.replace('\\', '/').split('/')
-    for i in range(len(parts) - 1, 0, -1):
-        if parts[i] in _VENV_BIN_DIRS and i >= 1:
-            venv_dir = parts[i - 1]
-            if venv_dir in _VENV_DIRS:
-                project_root = '/'.join(parts[: i - 1])
-                return project_root if project_root else '/'
-    return directory
-
-
-def infer_workspace_host_path(command: str, args: list[str] | None = None) -> str | None:
-    """Infer the project/workspace root from absolute command/arg paths."""
-    candidates: list[str] = []
-    for part in [command, *(args or [])]:
-        if not os.path.isabs(part):
-            continue
-        if os.path.exists(part):
-            directory = os.path.dirname(part)
-            candidates.append(os.path.realpath(unwrap_venv_path(directory)))
-    if not candidates:
-        return None
-    common = os.path.commonpath(candidates)
-    return common if common != '/' else None
-
-
-def rewrite_venv_command(command: str, host_path: str | None, *, mount_path: str = '/workspace') -> str:
-    """Rewrite host venv interpreters to plain ``python`` inside the sandbox.
-
-    Once a project is mounted into the sandbox, host virtualenv paths are no
-    longer valid. For those paths we intentionally drop down to ``python`` and
-    let the sandbox-side environment/bootstrap decide what interpreter to use.
-    """
-    if not host_path or not command:
-        return command
-    normalized_host = os.path.realpath(host_path)
-    normalized_command = os.path.realpath(command)
-    if not normalized_command.startswith(normalized_host + '/'):
-        return command
-    rel = normalized_command[len(normalized_host) + 1 :]
-    parts = rel.replace('\\', '/').split('/')
-    if len(parts) >= 3 and parts[0] in _VENV_DIRS and parts[1] in _VENV_BIN_DIRS and parts[2].startswith('python'):
-        return 'python'
-    return rewrite_mounted_path(normalized_command, host_path, mount_path=mount_path)
-
-
-def list_python_manifest_files(host_path: str | None) -> list[str]:
-    normalized_root = normalize_host_path(host_path)
-    if not normalized_root:
-        return []
-    return [filename for filename in PYTHON_MANIFEST_FILES if os.path.isfile(os.path.join(normalized_root, filename))]
-
-
-def classify_python_workspace(host_path: str | None) -> str | None:
-    """Return the generic Python workspace shape, without app-specific policy."""
-    manifest_files = set(list_python_manifest_files(host_path))
-    if not manifest_files:
-        return None
-    if {'pyproject.toml', 'setup.py', 'setup.cfg'} & manifest_files:
-        return 'package'
-    if 'requirements.txt' in manifest_files:
-        return 'requirements'
-    return None
-
-
-def should_prepare_python_env(host_path: str | None) -> bool:
-    normalized_root = normalize_host_path(host_path)
-    if not normalized_root:
-        return False
-    if os.path.isdir(os.path.join(normalized_root, '.venv')):
-        return True
-    return bool(list_python_manifest_files(normalized_root))
-
-
-def wrap_python_command_with_env(command: str, *, mount_path: str = '/workspace') -> str:
-    """Wrap a command with a reusable sandbox-local Python env bootstrap.
-
-    This is the generic "workspace is a Python project" path used by mutable
-    workspaces such as skills. Read-only installation strategies stay in the
-    higher-level caller because they are application policy, not workspace
-    semantics.
-    """
-    bootstrap = textwrap.dedent(
-        f"""
-        set -e
-
-        _LB_VENV_DIR="{mount_path}/.venv"
-        _LB_META_DIR="{mount_path}/.langbot"
-        _LB_META_FILE="$_LB_META_DIR/python-env.json"
-        _LB_LOCK_DIR="$_LB_META_DIR/python-env.lock"
-        _LB_TMP_DIR="{mount_path}/.tmp"
-        _LB_PIP_CACHE_DIR="{mount_path}/.cache/pip"
-
-        mkdir -p "$_LB_META_DIR" "$_LB_TMP_DIR" "$_LB_PIP_CACHE_DIR"
-        export TMPDIR="$_LB_TMP_DIR"
-        export TEMP="$_LB_TMP_DIR"
-        export TMP="$_LB_TMP_DIR"
-        export PIP_CACHE_DIR="$_LB_PIP_CACHE_DIR"
-
-        _lb_python_meta() {{
-          python - <<'PY'
-        import hashlib
-        import json
-        import os
-        import sys
-
-        root = "{mount_path}"
-        digest = hashlib.sha256()
-        manifest_files = []
-        for rel in ("requirements.txt", "pyproject.toml", "setup.py", "setup.cfg"):
-            path = os.path.join(root, rel)
-            if not os.path.isfile(path):
-                continue
-            manifest_files.append(rel)
-            with open(path, "rb") as handle:
-                digest.update(rel.encode("utf-8"))
-                digest.update(b"\\0")
-                digest.update(handle.read())
-                digest.update(b"\\0")
-
-        print(
-            json.dumps(
-                {{
-                    "python_executable": sys.executable,
-                    "python_version": list(sys.version_info[:3]),
-                    "manifest_files": manifest_files,
-                    "manifest_sha256": digest.hexdigest(),
-                }},
-                sort_keys=True,
-            )
-        )
-        PY
-        }}
-
-        _LB_CURRENT_META="$(_lb_python_meta)"
-        _LB_NEEDS_BOOTSTRAP=0
-
-        if [ ! -x "$_LB_VENV_DIR/bin/python" ]; then
-          _LB_NEEDS_BOOTSTRAP=1
-        elif [ ! -f "$_LB_META_FILE" ]; then
-          _LB_NEEDS_BOOTSTRAP=1
-        elif [ "$(cat "$_LB_META_FILE")" != "$_LB_CURRENT_META" ]; then
-          _LB_NEEDS_BOOTSTRAP=1
-        fi
-
-        if [ "$_LB_NEEDS_BOOTSTRAP" -eq 1 ]; then
-          _LB_LOCK_WAIT=0
-          while ! mkdir "$_LB_LOCK_DIR" 2>/dev/null; do
-            if [ "$_LB_LOCK_WAIT" -ge 120 ]; then
-              echo "Timed out waiting for Python environment lock: $_LB_LOCK_DIR" >&2
-              exit 1
-            fi
-            sleep 1
-            _LB_LOCK_WAIT=$((_LB_LOCK_WAIT + 1))
-          done
-
-          _lb_cleanup_lock() {{
-            rmdir "$_LB_LOCK_DIR" >/dev/null 2>&1 || true
-          }}
-          trap _lb_cleanup_lock EXIT INT TERM
-
-          _LB_CURRENT_META="$(_lb_python_meta)"
-          _LB_NEEDS_BOOTSTRAP=0
-          if [ ! -x "$_LB_VENV_DIR/bin/python" ]; then
-            _LB_NEEDS_BOOTSTRAP=1
-          elif [ ! -f "$_LB_META_FILE" ]; then
-            _LB_NEEDS_BOOTSTRAP=1
-          elif [ "$(cat "$_LB_META_FILE")" != "$_LB_CURRENT_META" ]; then
-            _LB_NEEDS_BOOTSTRAP=1
-          fi
-
-          if [ "$_LB_NEEDS_BOOTSTRAP" -eq 1 ]; then
-            rm -rf "$_LB_VENV_DIR"
-            python -m venv "$_LB_VENV_DIR"
-            . "$_LB_VENV_DIR/bin/activate"
-            python -m pip install --upgrade pip setuptools wheel
-            if [ -f "{mount_path}/requirements.txt" ]; then
-              python -m pip install -r "{mount_path}/requirements.txt"
-            elif [ -f "{mount_path}/pyproject.toml" ] || [ -f "{mount_path}/setup.py" ] || [ -f "{mount_path}/setup.cfg" ]; then
-              python -m pip install "{mount_path}"
-            fi
-            printf '%s' "$_LB_CURRENT_META" > "$_LB_META_FILE"
-          fi
-        fi
-
-        export VIRTUAL_ENV="$_LB_VENV_DIR"
-        export PATH="$_LB_VENV_DIR/bin:$PATH"
-        {command}
-        """
-    ).strip()
-    return bootstrap + '\n'
-
-
-class BoxWorkspaceSession:
-    """High-level handle for one reusable workspace-backed Box session.
-
-    The Box runtime already understands sessions and managed processes. This
-    wrapper adds LangBot's workspace-centric view on top: a mounted host path,
-    a stable ``session_id``, optional environment defaults, and convenience
-    helpers for exec or long-running processes inside that workspace.
-    """
-
-    def __init__(
-        self,
-        box_service,
-        session_id: str,
-        *,
-        host_path: str | None = None,
-        host_path_mode: str = 'rw',
-        workdir: str = '/workspace',
-        env: dict[str, str] | None = None,
-        mount_path: str = '/workspace',
-        network: str | None = None,
-        read_only_rootfs: bool | None = None,
-        image: str | None = None,
-        cpus: float | None = None,
-        memory_mb: int | None = None,
-        pids_limit: int | None = None,
-        persistent: bool = False,
-    ):
-        self.box_service = box_service
-        self.session_id = session_id
-        self.host_path = host_path
-        self.host_path_mode = host_path_mode
-        self.workdir = workdir
-        self.env = dict(env or {})
-        self.mount_path = mount_path
-        self.network = network
-        self.read_only_rootfs = read_only_rootfs
-        self.image = image
-        self.cpus = cpus
-        self.memory_mb = memory_mb
-        self.pids_limit = pids_limit
-        self.persistent = persistent
-
-    def rewrite_path(self, path: str) -> str:
-        return rewrite_mounted_path(path, self.host_path, mount_path=self.mount_path)
-
-    def rewrite_venv_command(self, command: str) -> str:
-        return rewrite_venv_command(command, self.host_path, mount_path=self.mount_path)
-
-    def build_session_payload(self) -> dict[str, Any]:
-        # Keep this payload generic so callers can reuse the same workspace
-        # handle for plain exec, file-producing tasks, or managed processes.
-        payload: dict[str, Any] = {
-            'session_id': self.session_id,
-            'workdir': self.workdir,
-            'env': self.env,
-            'persistent': self.persistent,
-        }
-        if self.network is not None:
-            payload['network'] = self.network
-        if self.read_only_rootfs is not None:
-            payload['read_only_rootfs'] = self.read_only_rootfs
-        if self.host_path:
-            payload['host_path'] = self.host_path
-            payload['host_path_mode'] = self.host_path_mode
-        for key in ('image', 'cpus', 'memory_mb', 'pids_limit'):
-            value = getattr(self, key)
-            if value is not None:
-                payload[key] = value
-        return payload
-
-    def build_exec_payload(
-        self,
-        cmd: str,
-        *,
-        workdir: str | None = None,
-        env: dict[str, str] | None = None,
-        timeout_sec: int | None = None,
-    ) -> dict[str, Any]:
-        # Exec payloads inherit the session-level workspace config, then layer
-        # per-call command/workdir/env overrides on top.
-        payload = self.build_session_payload()
-        payload['cmd'] = cmd
-        payload['workdir'] = workdir or self.workdir
-        if timeout_sec is not None:
-            payload['timeout_sec'] = timeout_sec
-        resolved_env = self.env if env is None else env
-        if resolved_env:
-            payload['env'] = resolved_env
-        elif 'env' in payload and not payload['env']:
-            payload.pop('env')
-        return payload
-
-    async def execute_raw(
-        self,
-        cmd: str,
-        *,
-        workdir: str | None = None,
-        env: dict[str, str] | None = None,
-        timeout_sec: int | None = None,
-    ):
-        payload = self.build_exec_payload(cmd, workdir=workdir, env=env, timeout_sec=timeout_sec)
-        return await self.box_service.client.execute(self.box_service.build_spec(payload))
-
-    async def execute_for_query(
-        self,
-        query,
-        cmd: str,
-        *,
-        workdir: str | None = None,
-        env: dict[str, str] | None = None,
-        timeout_sec: int | None = None,
-    ) -> dict:
-        payload = self.build_exec_payload(cmd, workdir=workdir, env=env, timeout_sec=timeout_sec)
-        return await self.box_service.execute_spec_payload(payload, query)
-
-    async def create_session(self):
-        return await self.box_service.create_session(self.build_session_payload())
-
-    def build_process_payload(
-        self,
-        command: str,
-        args: list[str] | None = None,
-        *,
-        env: dict[str, str] | None = None,
-        cwd: str = '/workspace',
-    ) -> dict[str, Any]:
-        # Managed processes run inside the same workspace model as one-shot
-        # execs, so path/venv rewriting is shared here.
-        normalized_command = command
-        normalized_args = list(args or [])
-        normalized_cwd = cwd
-        if self.host_path:
-            normalized_command = self.rewrite_venv_command(command)
-            normalized_args = [self.rewrite_path(arg) for arg in normalized_args]
-            normalized_cwd = self.rewrite_path(cwd)
-        return {
-            'command': normalized_command,
-            'args': normalized_args,
-            'env': dict(env or {}),
-            'cwd': normalized_cwd,
-        }
-
-    async def start_managed_process(
-        self,
-        command: str,
-        args: list[str] | None = None,
-        *,
-        process_id: str = 'default',
-        env: dict[str, str] | None = None,
-        cwd: str = '/workspace',
-    ):
-        payload = self.build_process_payload(command, args, env=env, cwd=cwd)
-        payload['process_id'] = process_id
-        return await self.box_service.start_managed_process(self.session_id, payload)
-
-    async def get_managed_process(self, process_id: str = 'default'):
-        return await self.box_service.get_managed_process(self.session_id, process_id)
-
-    async def stop_managed_process(self, process_id: str = 'default') -> None:
-        await self.box_service.stop_managed_process(self.session_id, process_id)
-
-    def get_managed_process_websocket_url(self, process_id: str = 'default') -> str:
-        return self.box_service.get_managed_process_websocket_url(self.session_id, process_id)
-
-    async def cleanup(self) -> None:
-        await self.box_service.client.delete_session(self.session_id)
--- a/src/langbot/pkg/core/app.py
+++ b/src/langbot/pkg/core/app.py
@@ -4,12 +4,12 @@ import logging
 import asyncio
 import traceback
 import os
+from typing import TYPE_CHECKING

 from ..platform import botmgr as im_mgr
 from ..platform.webhook_pusher import WebhookPusher
 from ..provider.session import sessionmgr as llm_session_mgr
 from ..provider.modelmgr import modelmgr as llm_model_mgr
-from ..box import service as box_service_module

 from langbot.pkg.provider.tools import toolmgr as llm_tool_mgr
 from ..config import manager as config_mgr
@@ -32,8 +32,8 @@ from ..api.http.service import mcp as mcp_service
 from ..api.http.service import apikey as apikey_service
 from ..api.http.service import webhook as webhook_service
 from ..api.http.service import monitoring as monitoring_service
-from ..api.http.service import skill as skill_service
 from ..api.http.service import maintenance as maintenance_service
+
 from ..discover import engine as discover_engine
 from ..storage import mgr as storagemgr
 from ..utils import logcache
@@ -44,7 +44,9 @@ from ..rag.service import RAGRuntimeService
 from ..vector import mgr as vectordb_mgr
 from ..telemetry import telemetry as telemetry_module
 from ..survey import manager as survey_module
-from ..skill import manager as skill_mgr
+
+if TYPE_CHECKING:
+    from ..agent.runner import AgentRunnerRegistry, AgentRunOrchestrator


 class Application:
@@ -72,7 +74,6 @@ class Application:

    # TODO move to pipeline
    tool_mgr: llm_tool_mgr.ToolManager = None
-    box_service: box_service_module.BoxService = None

    # ======= Config manager =======

@@ -159,12 +160,13 @@ class Application:

    monitoring_service: monitoring_service.MonitoringService = None

-    skill_service: skill_service.SkillService = None
-
-    skill_mgr: skill_mgr.SkillManager = None
-
    maintenance_service: maintenance_service.MaintenanceService = None

+    # Agent runner subsystem
+    agent_runner_registry: AgentRunnerRegistry = None
+
+    agent_run_orchestrator: AgentRunOrchestrator = None
+
    def __init__(self):
        pass

@@ -308,10 +310,7 @@ class Application:
        return parsed

    def dispose(self):
-        if self.plugin_connector is not None:
-            self.plugin_connector.dispose()
-        if self.box_service is not None:
-            self.box_service.dispose()
+        self.plugin_connector.dispose()

    async def print_web_access_info(self):
        """Print access webui tips"""
--- a/src/langbot/pkg/core/boot.py
+++ b/src/langbot/pkg/core/boot.py
@@ -62,6 +62,4 @@ async def main(loop: asyncio.AbstractEventLoop):
        app_inst = await make_app(loop)
        await app_inst.run()
    except Exception:
-        if app_inst is not None:
-            app_inst.dispose()
        traceback.print_exc()
--- a/src/langbot/pkg/core/migrations/m009_msg_truncator_cfg.py
+++ b/src/langbot/pkg/core/migrations/m009_msg_truncator_cfg.py
@@ -1,22 +0,0 @@
-from __future__ import annotations
-
-from .. import migration
-
-
-@migration.migration_class('msg-truncator-cfg-migration', 9)
-class MsgTruncatorConfigMigration(migration.Migration):
-    """Migration"""
-
-    async def need_migrate(self) -> bool:
-        """Determine whether the current environment needs to run this migration"""
-        return 'msg-truncate' not in self.ap.pipeline_cfg.data
-
-    async def run(self):
-        """Run the migration"""
-
-        self.ap.pipeline_cfg.data['msg-truncate'] = {
-            'method': 'round',
-            'round': {'max-round': 10},
-        }
-
-        await self.ap.pipeline_cfg.dump_config()
--- a/src/langbot/pkg/core/stages/build_app.py
+++ b/src/langbot/pkg/core/stages/build_app.py
@@ -6,7 +6,6 @@ from .. import stage, app
 from ...utils import version, proxy
 from ...pipeline import pool, controller, pipelinemgr
 from ...pipeline import aggregator as message_aggregator
-from ...box import service as box_service
 from ...plugin import connector as plugin_connector
 from ...command import cmdmgr
 from ...provider.session import sessionmgr as llm_session_mgr
@@ -29,8 +28,6 @@ from ...api.http.service import mcp as mcp_service
 from ...api.http.service import apikey as apikey_service
 from ...api.http.service import webhook as webhook_service
 from ...api.http.service import monitoring as monitoring_service
-from ...api.http.service import skill as skill_service
-from ...skill import manager as skill_mgr
 from ...api.http.service import maintenance as maintenance_service
 from ...discover import engine as discover_engine
 from ...storage import mgr as storagemgr
@@ -39,6 +36,7 @@ from ...vector import mgr as vectordb_mgr
 from .. import taskmgr
 from ...telemetry import telemetry as telemetry_module
 from ...survey import manager as survey_module
+from ...agent.runner import AgentRunnerRegistry, AgentRunOrchestrator


@stage.stage_class('BuildAppStage')
@@ -89,9 +87,6 @@ class BuildAppStage(stage.BootingStage):
        webhook_service_inst = webhook_service.WebhookService(ap)
        ap.webhook_service = webhook_service_inst

-        skill_service_inst = skill_service.SkillService(ap)
-        ap.skill_service = skill_service_inst
-
        proxy_mgr = proxy.ProxyManager(ap)
        await proxy_mgr.initialize()
        ap.proxy_mgr = proxy_mgr
@@ -135,10 +130,6 @@ class BuildAppStage(stage.BootingStage):
        await llm_session_mgr_inst.initialize()
        ap.sess_mgr = llm_session_mgr_inst

-        box_service_inst = box_service.BoxService(ap)
-        await box_service_inst.initialize()
-        ap.box_service = box_service_inst
-
        llm_tool_mgr_inst = llm_tool_mgr.ToolManager(ap)
        await llm_tool_mgr_inst.initialize()
        ap.tool_mgr = llm_tool_mgr_inst
@@ -159,11 +150,6 @@ class BuildAppStage(stage.BootingStage):
        msg_aggregator_inst = message_aggregator.MessageAggregator(ap)
        ap.msg_aggregator = msg_aggregator_inst

-        # Initialize skill manager
-        skill_mgr_inst = skill_mgr.SkillManager(ap)
-        await skill_mgr_inst.initialize()
-        ap.skill_mgr = skill_mgr_inst
-
        rag_mgr_inst = rag_mgr.RAGManager(ap)
        await rag_mgr_inst.initialize()
        ap.rag_mgr = rag_mgr_inst
@@ -194,5 +180,12 @@ class BuildAppStage(stage.BootingStage):
        await plugin_connector_inst.initialize()
        ap.plugin_connector = plugin_connector_inst

+        # Initialize agent runner subsystem
+        agent_runner_registry_inst = AgentRunnerRegistry(ap)
+        ap.agent_runner_registry = agent_runner_registry_inst
+
+        agent_run_orchestrator_inst = AgentRunOrchestrator(ap, agent_runner_registry_inst)
+        ap.agent_run_orchestrator = agent_run_orchestrator_inst
+
        ctrl = controller.Controller(ap)
        ap.ctrl = ctrl
--- a/src/langbot/pkg/entity/persistence/agent_runner_state.py
+++ b/src/langbot/pkg/entity/persistence/agent_runner_state.py
@@ -0,0 +1,88 @@
+"""Agent runner state persistence entity for host-owned state."""
+from __future__ import annotations
+
+import sqlalchemy
+import datetime
+
+from .base import Base
+
+
+class AgentRunnerState(Base):
+    """AgentRunnerState stores host-owned state for AgentRunner protocol.
+
+    State is:
+    - Host-owned: Managed by LangBot, not by plugin instances
+    - Scope-isolated: Separated by runner_id + binding_identity + scope
+    - Policy-enforced: Controlled by StatePolicy (enable_state, state_scopes)
+
+    Scope key design:
+    - conversation: runner_id + binding_id + conversation_id [+ thread_id]
+    - actor: runner_id + binding_id + actor_type + actor_id
+    - subject: runner_id + binding_id + subject_type + subject_id
+    - runner: runner_id + binding_id
+
+    This table is the production store for AgentRunner state.
+    """
+
+    __tablename__ = 'agent_runner_state'
+
+    id = sqlalchemy.Column(sqlalchemy.Integer, primary_key=True, autoincrement=True)
+    """Auto-increment ID for sequencing."""
+
+    # Identity
+    runner_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=False, index=True)
+    """Runner descriptor ID (plugin:author/name/runner)."""
+
+    binding_identity = sqlalchemy.Column(sqlalchemy.String(255), nullable=False, index=True)
+    """Binding identity for isolation (binding_id or scope_type:scope_id)."""
+
+    scope = sqlalchemy.Column(sqlalchemy.String(50), nullable=False, index=True)
+    """State scope: 'conversation', 'actor', 'subject', or 'runner'."""
+
+    scope_key = sqlalchemy.Column(sqlalchemy.String(512), nullable=False, index=True)
+    """Full scope key for unique lookup (includes all identity parts)."""
+
+    state_key = sqlalchemy.Column(sqlalchemy.String(255), nullable=False)
+    """State key within scope (should use namespace prefix like external.*)."""
+
+    value_json = sqlalchemy.Column(sqlalchemy.Text, nullable=True)
+    """State value as JSON string (size-limited by host)."""
+
+    # Context fields for querying/filtering
+    bot_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True, index=True)
+    """Bot UUID if applicable."""
+
+    workspace_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True)
+    """Workspace ID for multi-tenant."""
+
+    conversation_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True, index=True)
+    """Conversation ID for conversation scope."""
+
+    thread_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True)
+    """Thread ID for thread-scoped conversation state."""
+
+    actor_type = sqlalchemy.Column(sqlalchemy.String(50), nullable=True)
+    """Actor type for actor scope."""
+
+    actor_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True, index=True)
+    """Actor ID for actor scope."""
+
+    subject_type = sqlalchemy.Column(sqlalchemy.String(50), nullable=True)
+    """Subject type for subject scope."""
+
+    subject_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True)
+    """Subject ID for subject scope."""
+
+    # Lifecycle
+    created_at = sqlalchemy.Column(sqlalchemy.DateTime, nullable=False, default=datetime.datetime.utcnow)
+    """When this state entry was created."""
+
+    updated_at = sqlalchemy.Column(sqlalchemy.DateTime, nullable=False, default=datetime.datetime.utcnow, onupdate=datetime.datetime.utcnow)
+    """When this state entry was last updated."""
+
+    # Unique constraint: scope_key + state_key
+    __table_args__ = (
+        sqlalchemy.UniqueConstraint('scope_key', 'state_key', name='uq_agent_runner_state_scope_key_state_key'),
+        sqlalchemy.Index('ix_agent_runner_state_runner_binding', 'runner_id', 'binding_identity'),
+        sqlalchemy.Index('ix_agent_runner_state_scope_key_lookup', 'scope_key'),
+    )
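Review note: the docstring's "Scope key design" list can be read as a recipe for composing `scope_key`. The sketch below is one possible reading, not the code from this change. The helper name `build_scope_key`, the `|` separator, and the leading scope prefix are assumptions made for illustration.

```python
from __future__ import annotations


def build_scope_key(
    scope: str,
    runner_id: str,
    binding_identity: str,
    *,
    conversation_id: str | None = None,
    thread_id: str | None = None,
    actor_type: str | None = None,
    actor_id: str | None = None,
    subject_type: str | None = None,
    subject_id: str | None = None,
) -> str:
    """Hypothetical helper: join the identity parts listed in the docstring."""
    # Assumption: prefix with the scope so keys from different scopes cannot collide.
    parts = [scope, runner_id, binding_identity]
    if scope == 'conversation':
        if not conversation_id:
            raise ValueError('conversation scope requires conversation_id')
        parts.append(conversation_id)
        if thread_id:  # thread_id is the optional trailing part
            parts.append(thread_id)
    elif scope == 'actor':
        if not (actor_type and actor_id):
            raise ValueError('actor scope requires actor_type and actor_id')
        parts += [actor_type, actor_id]
    elif scope == 'subject':
        if not (subject_type and subject_id):
            raise ValueError('subject scope requires subject_type and subject_id')
        parts += [subject_type, subject_id]
    elif scope != 'runner':
        raise ValueError(f'unknown scope: {scope}')
    # Assumption: '|' as separator, since runner IDs already contain ':' and '/'.
    return '|'.join(parts)
```

Whatever the real format is, the composed key has to fit the `String(512)` column and stay stable across releases, because the unique constraint on `(scope_key, state_key)` is what makes lookups and upserts unambiguous.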
--- a/src/langbot/pkg/entity/persistence/artifact.py
+++ b/src/langbot/pkg/entity/persistence/artifact.py
@@ -0,0 +1,77 @@
+"""Artifact persistence entity for Host-owned artifact store."""
+from __future__ import annotations
+
+import sqlalchemy
+import datetime
+
+from .base import Base
+
+
+class AgentArtifact(Base):
+    """AgentArtifact stores metadata for large files, images, tool results, etc.
+
+    This table only stores metadata. The actual blob content is stored in
+    BinaryStorage or external storage, referenced by storage_key.
+
+    Artifacts are accessed via artifact_metadata and artifact_read APIs
+    with run_id authorization.
+    """
+
+    __tablename__ = 'agent_artifact'
+
+    id = sqlalchemy.Column(sqlalchemy.Integer, primary_key=True, autoincrement=True)
+    """Auto-increment ID for sequencing."""
+
+    artifact_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=False, unique=True, index=True)
+    """Unique artifact identifier."""
+
+    artifact_type = sqlalchemy.Column(sqlalchemy.String(50), nullable=False)
+    """Artifact type: 'image', 'file', 'voice', 'tool_result', 'platform_attachment', etc."""
+
+    mime_type = sqlalchemy.Column(sqlalchemy.String(255), nullable=True)
+    """MIME type of the content."""
+
+    name = sqlalchemy.Column(sqlalchemy.String(255), nullable=True)
+    """Original file name (if applicable)."""
+
+    size_bytes = sqlalchemy.Column(sqlalchemy.BigInteger, nullable=True)
+    """Size in bytes."""
+
+    sha256 = sqlalchemy.Column(sqlalchemy.String(64), nullable=True)
+    """SHA256 hash of content (for integrity verification)."""
+
+    source = sqlalchemy.Column(sqlalchemy.String(50), nullable=False)
+    """Source of artifact: 'platform', 'runner', 'tool', 'system'."""
+
+    # Storage reference (points to BinaryStorage or external storage)
+    storage_key = sqlalchemy.Column(sqlalchemy.String(255), nullable=True)
+    """Key in BinaryStorage or external storage reference."""
+
+    storage_type = sqlalchemy.Column(sqlalchemy.String(50), nullable=False, default='binary_storage')
+    """Storage type: 'binary_storage', 'file', 'url', etc."""
+
+    # Context
+    conversation_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True, index=True)
+    """Conversation this artifact belongs to."""
+
+    run_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True, index=True)
+    """Run ID that created this artifact."""
+
+    runner_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True)
+    """Runner ID that created this artifact."""
+
+    bot_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True)
+    """Bot UUID that handled this artifact."""
+
+    workspace_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True)
+    """Workspace ID for multi-tenant deployments."""
+
+    # Lifecycle
+    created_at = sqlalchemy.Column(sqlalchemy.DateTime, nullable=False, default=datetime.datetime.utcnow)
+    """When this artifact was created."""
+
+    expires_at = sqlalchemy.Column(sqlalchemy.DateTime, nullable=True)
+    """When this artifact expires (optional)."""
+
+    metadata_json = sqlalchemy.Column(sqlalchemy.Text, nullable=True)
+    """Additional metadata as JSON string."""
--- a/src/langbot/pkg/entity/persistence/event_log.py
+++ b/src/langbot/pkg/entity/persistence/event_log.py
@@ -0,0 +1,85 @@
+"""EventLog persistence entity for storing auditable event facts."""
+from __future__ import annotations
+
+import sqlalchemy
+import datetime
+
+from .base import Base
+
+
+class EventLog(Base):
+    """EventLog stores auditable event records for AgentRunner.
+
+    This is the fact source for events - messages, tool calls, system events, etc.
+    Large payloads are stored separately as artifacts; this table stores
+    references and summaries.
+    """
+
+    __tablename__ = 'event_log'
+
+    id = sqlalchemy.Column(sqlalchemy.Integer, primary_key=True, autoincrement=True)
+    """Auto-increment ID for sequencing."""
+
+    event_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=False, unique=True, index=True)
+    """Unique event identifier."""
+
+    event_type = sqlalchemy.Column(sqlalchemy.String(100), nullable=False, index=True)
+    """Event type (message.received, tool.call.started, etc.)."""
+
+    event_time = sqlalchemy.Column(sqlalchemy.DateTime, nullable=True)
+    """When the event occurred."""
+
+    source = sqlalchemy.Column(sqlalchemy.String(50), nullable=False)
+    """Event source (platform, webui, api, scheduler, system, pipeline_adapter)."""
+
+    bot_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True, index=True)
+    """Bot UUID that handled this event."""
+
+    workspace_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True)
+    """Workspace ID for multi-tenant deployments."""
+
+    conversation_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True, index=True)
+    """Conversation ID this event belongs to."""
+
+    thread_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True)
+    """Thread ID if platform supports threads."""
+
+    # Actor information
+    actor_type = sqlalchemy.Column(sqlalchemy.String(50), nullable=True)
+    """Actor type (user, system, runner)."""
+
+    actor_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True)
+    """Actor identifier."""
+
+    actor_name = sqlalchemy.Column(sqlalchemy.String(255), nullable=True)
+    """Actor display name."""
+
+    # Subject information
+    subject_type = sqlalchemy.Column(sqlalchemy.String(50), nullable=True)
+    """Subject type (message, tool_call, artifact)."""
+
+    subject_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True)
+    """Subject identifier."""
+
+    # Input information
+    input_summary = sqlalchemy.Column(sqlalchemy.Text, nullable=True)
+    """Brief summary of input (truncated text, max 1000 chars)."""
+
+    input_json = sqlalchemy.Column(sqlalchemy.Text, nullable=True)
+    """Full input JSON if reasonably sized (AgentInput as JSON string)."""
+
+    # Raw event reference
+    raw_ref = sqlalchemy.Column(sqlalchemy.String(255), nullable=True)
+    """Reference to raw event payload in ArtifactStore."""
+
+    run_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True, index=True)
+    """Run ID that processed this event."""
+
+    runner_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True)
+    """Runner ID that processed this event."""
+
+    created_at = sqlalchemy.Column(sqlalchemy.DateTime, nullable=False, default=datetime.datetime.utcnow)
+    """When this record was created."""
+
+    metadata_json = sqlalchemy.Column(sqlalchemy.Text, nullable=True)
+    """Additional metadata as JSON string."""
--- a/src/langbot/pkg/entity/persistence/transcript.py
+++ b/src/langbot/pkg/entity/persistence/transcript.py
@@ -0,0 +1,72 @@
+"""Transcript persistence entity for conversation history projection."""
+from __future__ import annotations
+
+import sqlalchemy
+import datetime
+
+from .base import Base
+
+
+class Transcript(Base):
+    """Transcript stores conversation-oriented message projection for history API.
+
+    This is a projection of EventLog, optimized for agent history retrieval.
+    It includes message content and artifact refs, but not raw platform payloads.
+    """
+
+    __tablename__ = 'transcript'
+
+    id = sqlalchemy.Column(sqlalchemy.Integer, primary_key=True, autoincrement=True)
+    """Auto-increment ID for sequencing."""
+
+    transcript_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=False, unique=True, index=True)
+    """Unique transcript item identifier."""
+
+    event_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=False, index=True)
+    """Reference to the source event in EventLog."""
+
+    conversation_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=False, index=True)
+    """Conversation this item belongs to."""
+
+    thread_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True)
+    """Thread ID if platform supports threads."""
+
+    role = sqlalchemy.Column(sqlalchemy.String(50), nullable=False)
+    """Message role: 'user', 'assistant', 'system', or 'tool'."""
+
+    item_type = sqlalchemy.Column(sqlalchemy.String(50), nullable=False, default='message')
+    """Item type: 'message', 'tool_call', 'tool_result', 'system'."""
+
+    # Content
+    content = sqlalchemy.Column(sqlalchemy.Text, nullable=True)
+    """Text content summary (may be truncated for large messages, max 4000 chars)."""
+
+    content_json = sqlalchemy.Column(sqlalchemy.Text, nullable=True)
+    """Full structured content as JSON string (Message model dump)."""
+
+    # Artifact references
+    artifact_refs_json = sqlalchemy.Column(sqlalchemy.Text, nullable=True)
+    """Artifact references as JSON string (list of ArtifactRef)."""
+
+    # Sequence for cursor-based pagination
+    seq = sqlalchemy.Column(sqlalchemy.Integer, nullable=False, index=True)
+    """Monotonic cursor sequence for pagination."""
+
+    # Context
+    run_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True, index=True)
+    """Run ID that generated this item (for assistant messages)."""
+
+    runner_id = sqlalchemy.Column(sqlalchemy.String(255), nullable=True)
+    """Runner ID that generated this item."""
+
+    created_at = sqlalchemy.Column(sqlalchemy.DateTime, nullable=False, default=datetime.datetime.utcnow)
+    """When this item was created."""
+
+    metadata_json = sqlalchemy.Column(sqlalchemy.Text, nullable=True)
+    """Additional metadata as JSON string (sender_id, platform, etc.)."""
+
+    # Indexes
+    __table_args__ = (
+        sqlalchemy.Index('ix_transcript_conversation_seq', 'conversation_id', 'seq'),
+        sqlalchemy.Index('ix_transcript_conversation_created', 'conversation_id', 'created_at'),
+    )
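Review note: `seq` is documented as a monotonic cursor, and `ix_transcript_conversation_seq` covers `(conversation_id, seq)`. The sketch below shows the keyset pagination such an index supports, using the standard-library `sqlite3` module and a trimmed-down table. The function `fetch_page`, the ascending direction, and the reduced column set are assumptions for illustration. The actual history API is not shown in this part of the diff.

```python
import sqlite3


def fetch_page(conn, conversation_id, after_seq=0, limit=2):
    """Return up to `limit` rows with seq > after_seq, plus the next cursor."""
    rows = conn.execute(
        'SELECT seq, role, content FROM transcript '
        'WHERE conversation_id = ? AND seq > ? '
        'ORDER BY seq ASC LIMIT ?',
        (conversation_id, after_seq, limit),
    ).fetchall()
    # The cursor is the last seq seen; None signals that no rows were returned.
    next_cursor = rows[-1][0] if rows else None
    return rows, next_cursor


conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE transcript ('
    'conversation_id TEXT NOT NULL, seq INTEGER NOT NULL, '
    'role TEXT NOT NULL, content TEXT)'
)
conn.execute('CREATE INDEX ix_transcript_conversation_seq ON transcript (conversation_id, seq)')
conn.executemany(
    'INSERT INTO transcript VALUES (?, ?, ?, ?)',
    [
        ('c1', 1, 'user', 'hi'),
        ('c1', 2, 'assistant', 'hello'),
        ('c1', 3, 'user', 'bye'),
        ('c2', 1, 'user', 'other conversation'),
    ],
)

page1, cursor = fetch_page(conn, 'c1')
page2, cursor2 = fetch_page(conn, 'c1', after_seq=cursor)
```

Compared with `OFFSET`-based paging, a `seq > ?` filter stays cheap on long conversations and does not skip or repeat rows when new items are appended between requests.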
--- a/src/langbot/pkg/persistence/alembic/env.py
+++ b/src/langbot/pkg/persistence/alembic/env.py
@@ -13,6 +13,28 @@ from sqlalchemy.engine import Connection

 from langbot.pkg.entity.persistence.base import Base

+# Import all ORM models so they are registered with Base.metadata
+# This is required for autogenerate to detect model changes
+from langbot.pkg.entity.persistence import (
+    agent_runner_state,
+    apikey,
+    artifact,
+    bot,
+    bstorage,
+    event_log,
+    mcp,
+    metadata,
+    model,
+    monitoring,
+    pipeline,
+    plugin,
+    rag,
+    transcript,
+    user,
+    vector,
+    webhook,
+)
+
 target_metadata = Base.metadata


--- a/src/langbot/pkg/persistence/alembic/versions/0004_migrate_runner_config.py
+++ b/src/langbot/pkg/persistence/alembic/versions/0004_migrate_runner_config.py
@@ -0,0 +1,145 @@
+"""Migrate pipeline config to new runner format
+
+Revision ID: 0004_migrate_runner_config
+Revises: 0003_add_rerank_models
+Create Date: 2026-05-10
+"""
+
+import json
+import sqlalchemy as sa
+from alembic import op
+
+revision = '0004_migrate_runner_config'
+down_revision = '0003_add_rerank_models'
+branch_labels = None
+depends_on = None
+
+# Mapping from old built-in runner names to official plugin runner IDs
+OLD_RUNNER_TO_PLUGIN_RUNNER_ID = {
+    'local-agent': 'plugin:langbot/local-agent/default',
+    'dify-service-api': 'plugin:langbot/dify-agent/default',
+    'n8n-service-api': 'plugin:langbot/n8n-agent/default',
+    'coze-api': 'plugin:langbot/coze-agent/default',
+    'dashscope-app-api': 'plugin:langbot/dashscope-agent/default',
+    'langflow-api': 'plugin:langbot/langflow-agent/default',
+    'tbox-app-api': 'plugin:langbot/tbox-agent/default',
+}
+
+
+def is_plugin_runner_id(runner_id: str) -> bool:
+    """Check if runner ID is in plugin:* format."""
+    return runner_id.startswith('plugin:')
+
+
+def normalize_runner_config_for_migration(runner_id: str, runner_config: dict) -> dict:
+    """Normalize released legacy runner fields before storing binding config."""
+    normalized = dict(runner_config)
+
+    if runner_id == OLD_RUNNER_TO_PLUGIN_RUNNER_ID['local-agent']:
+        legacy_kb = normalized.pop('knowledge-base', None)
+        if 'knowledge-bases' not in normalized:
+            if isinstance(legacy_kb, str) and legacy_kb and legacy_kb not in {'__none__', '__none'}:
+                normalized['knowledge-bases'] = [legacy_kb]
+            elif legacy_kb is not None:
+                normalized['knowledge-bases'] = []
+
+    return normalized
+
+
+def migrate_pipeline_config(config: dict) -> dict:
+    """Migrate pipeline config to new format."""
+    new_config = json.loads(json.dumps(config))  # deep copy: nested dicts must not alias the caller's config
+    ai_config = new_config.get('ai', {})
+    if not ai_config:
+        return new_config
+
+    runner_config = ai_config.get('runner', {})
+    runner_configs = ai_config.get('runner_config', {})
+
+    # Check for new format first
+    runner_id = runner_config.get('id')
+    if runner_id and is_plugin_runner_id(runner_id):
+        if runner_id in runner_configs:
+            runner_configs[runner_id] = normalize_runner_config_for_migration(
+                runner_id,
+                runner_configs[runner_id],
+            )
+            ai_config['runner_config'] = runner_configs
+            new_config['ai'] = ai_config
+        return new_config
+
+    # Check for old format
+    old_runner_name = runner_config.get('runner')
+    if old_runner_name:
+        # Map to new runner ID
+        if is_plugin_runner_id(old_runner_name):
+            runner_id = old_runner_name
+        else:
+            runner_id = OLD_RUNNER_TO_PLUGIN_RUNNER_ID.get(old_runner_name, old_runner_name)
+
+        # Set new format
+        runner_config['id'] = runner_id
+
+        # Remove old runner field if it's a mapped built-in runner
+        if old_runner_name in OLD_RUNNER_TO_PLUGIN_RUNNER_ID:
+            del runner_config['runner']
+
+        # Migrate runner-specific config and remove old config blocks
+        if old_runner_name in ai_config:
+            old_runner_config = ai_config[old_runner_name]
+            if old_runner_config:
+                runner_configs[runner_id] = normalize_runner_config_for_migration(runner_id, old_runner_config)
+            # Remove old config block after migration
+            del ai_config[old_runner_name]
+
+        # Also check if runner_id has config under other old name formats
+        for old_name, mapped_id in OLD_RUNNER_TO_PLUGIN_RUNNER_ID.items():
+            if mapped_id == runner_id and old_name in ai_config:
+                runner_configs[runner_id] = normalize_runner_config_for_migration(runner_id, ai_config[old_name])
+                # Remove old config block after migration
+                del ai_config[old_name]
+
+    # Update configs
+    ai_config['runner'] = runner_config
+    ai_config['runner_config'] = runner_configs
+    new_config['ai'] = ai_config
+
+    return new_config
+
+
+def upgrade() -> None:
+    """Migrate existing pipeline configs to new runner format."""
+    conn = op.get_bind()
+    inspector = sa.inspect(conn)
+
+    # Check if pipelines table exists (may not exist in fresh install)
+    if 'pipelines' not in inspector.get_table_names():
+        return
+
+    # Get all pipelines
+    result = conn.execute(sa.text('SELECT uuid, config FROM pipelines'))
+    pipelines = result.fetchall()
+
+    for pipeline_uuid, config_json in pipelines:
+        if not config_json:
+            continue
+
+        try:
+            config = json.loads(config_json)
+            migrated_config = migrate_pipeline_config(config)
+
+            # Only update if config changed
+            if json.dumps(config, sort_keys=True) != json.dumps(migrated_config, sort_keys=True):
+                conn.execute(
+                    sa.text('UPDATE pipelines SET config = :config WHERE uuid = :uuid'),
+                    {'config': json.dumps(migrated_config), 'uuid': pipeline_uuid},
+                )
+        except Exception:
+            # Skip invalid configs
+            continue
+
+
+def downgrade() -> None:
+    """Downgrade is not supported for data migration."""
+    # No downgrade - keep configs in new format
+    pass
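Review note: a before/after example makes the intent of this data migration easier to check. The sketch below is a condensed illustration of the legacy-to-plugin rewrite for a single runner, not the migration itself. `migrate_sketch` and the sample config values (`m-1`, `kb-1`) are made up, and only two mapping entries are copied from the table above. It works on a deep copy, which keeps the input untouched when the before and after configs are compared to decide whether a row needs an `UPDATE`.

```python
import copy

# Subset of the mapping from the migration above.
LEGACY_TO_PLUGIN = {
    'local-agent': 'plugin:langbot/local-agent/default',
    'dify-service-api': 'plugin:langbot/dify-agent/default',
}
LOCAL_AGENT_ID = LEGACY_TO_PLUGIN['local-agent']


def migrate_sketch(config: dict) -> dict:
    """Condensed illustration of the legacy -> plugin runner rewrite."""
    new_config = copy.deepcopy(config)  # never mutate the caller's dict
    ai = new_config.get('ai') or {}
    runner = ai.get('runner') or {}
    old_name = runner.get('runner')
    if old_name not in LEGACY_TO_PLUGIN:
        return new_config

    runner_id = LEGACY_TO_PLUGIN[old_name]
    runner.pop('runner')
    runner['id'] = runner_id

    # Move ai[<old name>] to ai['runner_config'][<plugin runner id>].
    block = ai.pop(old_name, None) or {}
    if runner_id == LOCAL_AGENT_ID:
        legacy_kb = block.pop('knowledge-base', None)
        if 'knowledge-bases' not in block and legacy_kb is not None:
            is_real = isinstance(legacy_kb, str) and legacy_kb not in ('', '__none__', '__none')
            block['knowledge-bases'] = [legacy_kb] if is_real else []

    ai['runner'] = runner
    ai.setdefault('runner_config', {})[runner_id] = block
    new_config['ai'] = ai
    return new_config


legacy = {
    'ai': {
        'runner': {'runner': 'local-agent'},
        'local-agent': {'model': 'm-1', 'knowledge-base': 'kb-1'},
    }
}
migrated = migrate_sketch(legacy)
```

After the call, `migrated['ai']['runner']` holds only the new `id`, the old `local-agent` block lives under `runner_config`, and `legacy` still has its original shape.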
--- a/src/langbot/pkg/persistence/alembic/versions/58846a8d7a81_add_event_log_and_transcript_tables.py
+++ b/src/langbot/pkg/persistence/alembic/versions/58846a8d7a81_add_event_log_and_transcript_tables.py
@@ -0,0 +1,102 @@
+"""add_event_log_and_transcript_tables
+
+Revision ID: 58846a8d7a81
+Revises: 0004_migrate_runner_config
+Create Date: 2026-05-23 15:41:47.030841
+"""
+from alembic import op
+import sqlalchemy as sa
+
+# revision identifiers
+revision = '58846a8d7a81'
+down_revision = '0004_migrate_runner_config'
+branch_labels = None
+depends_on = None
+
+
+def upgrade() -> None:
+    # Create event_log table
+    op.create_table(
+        'event_log',
+        sa.Column('id', sa.Integer(), primary_key=True, autoincrement=True),
+        sa.Column('event_id', sa.String(255), nullable=False, unique=True),
+        sa.Column('event_type', sa.String(100), nullable=False),
+        sa.Column('event_time', sa.DateTime(), nullable=True),
+        sa.Column('source', sa.String(50), nullable=False),
+        sa.Column('bot_id', sa.String(255), nullable=True),
+        sa.Column('workspace_id', sa.String(255), nullable=True),
+        sa.Column('conversation_id', sa.String(255), nullable=True),
+        sa.Column('thread_id', sa.String(255), nullable=True),
+        sa.Column('actor_type', sa.String(50), nullable=True),
+        sa.Column('actor_id', sa.String(255), nullable=True),
+        sa.Column('actor_name', sa.String(255), nullable=True),
+        sa.Column('subject_type', sa.String(50), nullable=True),
+        sa.Column('subject_id', sa.String(255), nullable=True),
+        sa.Column('input_summary', sa.Text(), nullable=True),
+        sa.Column('input_json', sa.Text(), nullable=True),
+        sa.Column('raw_ref', sa.String(255), nullable=True),
+        sa.Column('run_id', sa.String(255), nullable=True),
+        sa.Column('runner_id', sa.String(255), nullable=True),
+        sa.Column('created_at', sa.DateTime(), nullable=False, server_default=sa.text('(CURRENT_TIMESTAMP)')),
+        sa.Column('metadata_json', sa.Text(), nullable=True),
+    )
+
+    # Create indexes for event_log
+    with op.batch_alter_table('event_log', schema=None) as batch_op:
+        batch_op.create_index('ix_event_log_event_id', ['event_id'], unique=True)
+        batch_op.create_index('ix_event_log_event_type', ['event_type'], unique=False)
+        batch_op.create_index('ix_event_log_bot_id', ['bot_id'], unique=False)
+        batch_op.create_index('ix_event_log_conversation_id', ['conversation_id'], unique=False)
+        batch_op.create_index('ix_event_log_run_id', ['run_id'], unique=False)
+
+    # Create transcript table
+    op.create_table(
+        'transcript',
+        sa.Column('id', sa.Integer(), primary_key=True, autoincrement=True),
+        sa.Column('transcript_id', sa.String(255), nullable=False, unique=True),
+        sa.Column('event_id', sa.String(255), nullable=False),
+        sa.Column('conversation_id', sa.String(255), nullable=False),
+        sa.Column('thread_id', sa.String(255), nullable=True),
+        sa.Column('role', sa.String(50), nullable=False),
+        sa.Column('item_type', sa.String(50), nullable=False, server_default='message'),
+        sa.Column('content', sa.Text(), nullable=True),
+        sa.Column('content_json', sa.Text(), nullable=True),
+        sa.Column('artifact_refs_json', sa.Text(), nullable=True),
+        sa.Column('seq', sa.Integer(), nullable=False),
+        sa.Column('run_id', sa.String(255), nullable=True),
+        sa.Column('runner_id', sa.String(255), nullable=True),
+        sa.Column('created_at', sa.DateTime(), nullable=False, server_default=sa.text('(CURRENT_TIMESTAMP)')),
+        sa.Column('metadata_json', sa.Text(), nullable=True),
+    )
+
+    # Create indexes for transcript
+    with op.batch_alter_table('transcript', schema=None) as batch_op:
+        batch_op.create_index('ix_transcript_transcript_id', ['transcript_id'], unique=True)
+        batch_op.create_index('ix_transcript_event_id', ['event_id'], unique=False)
+        batch_op.create_index('ix_transcript_conversation_id', ['conversation_id'], unique=False)
+        batch_op.create_index('ix_transcript_conversation_seq', ['conversation_id', 'seq'], unique=False)
+        batch_op.create_index('ix_transcript_conversation_created', ['conversation_id', 'created_at'], unique=False)
+        batch_op.create_index('ix_transcript_run_id', ['run_id'], unique=False)
+
+
+def downgrade() -> None:
+    # Drop transcript table
+    with op.batch_alter_table('transcript', schema=None) as batch_op:
+        batch_op.drop_index('ix_transcript_run_id')
+        batch_op.drop_index('ix_transcript_conversation_created')
+        batch_op.drop_index('ix_transcript_conversation_seq')
+        batch_op.drop_index('ix_transcript_conversation_id')
+        batch_op.drop_index('ix_transcript_event_id')
+        batch_op.drop_index('ix_transcript_transcript_id')
+
+    op.drop_table('transcript')
+
+    # Drop event_log table
+    with op.batch_alter_table('event_log', schema=None) as batch_op:
+        batch_op.drop_index('ix_event_log_run_id')
+        batch_op.drop_index('ix_event_log_conversation_id')
+        batch_op.drop_index('ix_event_log_bot_id')
+        batch_op.drop_index('ix_event_log_event_type')
+        batch_op.drop_index('ix_event_log_event_id')
+
+    op.drop_table('event_log')
--- a/src/langbot/pkg/persistence/alembic/versions/6dfd3dd7f0c7_add_agent_runner_state_table_for_host_.py
+++ b/src/langbot/pkg/persistence/alembic/versions/6dfd3dd7f0c7_add_agent_runner_state_table_for_host_.py
@@ -0,0 +1,68 @@
+# Alembic script.py.mako — template for auto-generated revisions
+"""add agent_runner_state table for host-owned persistent state
+
+Revision ID: 6dfd3dd7f0c7
+Revises: a1b2c3d4e5f6
+Create Date: 2026-05-23 19:49:08.529110
+"""
+from alembic import op
+import sqlalchemy as sa
+
+
+# revision identifiers
+revision = '6dfd3dd7f0c7'
+down_revision = 'a1b2c3d4e5f6'
+branch_labels = None
+depends_on = None
+
+
+def upgrade() -> None:
+    # ### commands auto generated by Alembic - please adjust! ###
+    op.create_table('agent_runner_state',
+    sa.Column('id', sa.Integer(), autoincrement=True, nullable=False),
+    sa.Column('runner_id', sa.String(length=255), nullable=False),
+    sa.Column('binding_identity', sa.String(length=255), nullable=False),
+    sa.Column('scope', sa.String(length=50), nullable=False),
+    sa.Column('scope_key', sa.String(length=512), nullable=False),
+    sa.Column('state_key', sa.String(length=255), nullable=False),
+    sa.Column('value_json', sa.Text(), nullable=True),
+    sa.Column('bot_id', sa.String(length=255), nullable=True),
+    sa.Column('workspace_id', sa.String(length=255), nullable=True),
+    sa.Column('conversation_id', sa.String(length=255), nullable=True),
+    sa.Column('thread_id', sa.String(length=255), nullable=True),
+    sa.Column('actor_type', sa.String(length=50), nullable=True),
+    sa.Column('actor_id', sa.String(length=255), nullable=True),
+    sa.Column('subject_type', sa.String(length=50), nullable=True),
+    sa.Column('subject_id', sa.String(length=255), nullable=True),
+    sa.Column('created_at', sa.DateTime(), nullable=False),
+    sa.Column('updated_at', sa.DateTime(), nullable=False),
+    sa.PrimaryKeyConstraint('id'),
+    sa.UniqueConstraint('scope_key', 'state_key', name='uq_agent_runner_state_scope_key_state_key')
+    )
+    with op.batch_alter_table('agent_runner_state', schema=None) as batch_op:
+        batch_op.create_index(batch_op.f('ix_agent_runner_state_actor_id'), ['actor_id'], unique=False)
+        batch_op.create_index(batch_op.f('ix_agent_runner_state_binding_identity'), ['binding_identity'], unique=False)
+        batch_op.create_index(batch_op.f('ix_agent_runner_state_bot_id'), ['bot_id'], unique=False)
+        batch_op.create_index(batch_op.f('ix_agent_runner_state_conversation_id'), ['conversation_id'], unique=False)
+        batch_op.create_index('ix_agent_runner_state_runner_binding', ['runner_id', 'binding_identity'], unique=False)
+        batch_op.create_index(batch_op.f('ix_agent_runner_state_runner_id'), ['runner_id'], unique=False)
+        batch_op.create_index(batch_op.f('ix_agent_runner_state_scope'), ['scope'], unique=False)
+        batch_op.create_index(batch_op.f('ix_agent_runner_state_scope_key'), ['scope_key'], unique=False)
+
+    # ### end Alembic commands ###
+
+
+def downgrade() -> None:
+    # ### commands auto generated by Alembic - please adjust! ###
+    with op.batch_alter_table('agent_runner_state', schema=None) as batch_op:
+        batch_op.drop_index(batch_op.f('ix_agent_runner_state_scope_key'))
+        batch_op.drop_index(batch_op.f('ix_agent_runner_state_scope'))
+        batch_op.drop_index(batch_op.f('ix_agent_runner_state_runner_id'))
+        batch_op.drop_index('ix_agent_runner_state_runner_binding')
+        batch_op.drop_index(batch_op.f('ix_agent_runner_state_conversation_id'))
+        batch_op.drop_index(batch_op.f('ix_agent_runner_state_bot_id'))
+        batch_op.drop_index(batch_op.f('ix_agent_runner_state_binding_identity'))
+        batch_op.drop_index(batch_op.f('ix_agent_runner_state_actor_id'))
+
+    op.drop_table('agent_runner_state')
+    # ### end Alembic commands ###
--- a/src/langbot/pkg/persistence/alembic/versions/a1b2c3d4e5f6_add_agent_artifact_table.py
+++ b/src/langbot/pkg/persistence/alembic/versions/a1b2c3d4e5f6_add_agent_artifact_table.py
@@ -0,0 +1,55 @@
+"""add_agent_artifact_table
+
+Revision ID: a1b2c3d4e5f6
+Revises: 58846a8d7a81
+Create Date: 2026-05-23 20:00:00.000000
+"""
+from alembic import op
+import sqlalchemy as sa
+
+# revision identifiers
+revision = 'a1b2c3d4e5f6'
+down_revision = '58846a8d7a81'
+branch_labels = None
+depends_on = None
+
+
+def upgrade() -> None:
+    # Create agent_artifact table
+    op.create_table(
+        'agent_artifact',
+        sa.Column('id', sa.Integer(), primary_key=True, autoincrement=True),
+        sa.Column('artifact_id', sa.String(255), nullable=False, unique=True),
+        sa.Column('artifact_type', sa.String(50), nullable=False),
+        sa.Column('mime_type', sa.String(255), nullable=True),
+        sa.Column('name', sa.String(255), nullable=True),
+        sa.Column('size_bytes', sa.BigInteger(), nullable=True),
+        sa.Column('sha256', sa.String(64), nullable=True),
+        sa.Column('source', sa.String(50), nullable=False),
+        sa.Column('storage_key', sa.String(255), nullable=True),
+        sa.Column('storage_type', sa.String(50), nullable=False, server_default='binary_storage'),
+        sa.Column('conversation_id', sa.String(255), nullable=True),
+        sa.Column('run_id', sa.String(255), nullable=True),
+        sa.Column('runner_id', sa.String(255), nullable=True),
+        sa.Column('bot_id', sa.String(255), nullable=True),
+        sa.Column('workspace_id', sa.String(255), nullable=True),
+        sa.Column('created_at', sa.DateTime(), nullable=False, server_default=sa.text('(CURRENT_TIMESTAMP)')),
+        sa.Column('expires_at', sa.DateTime(), nullable=True),
+        sa.Column('metadata_json', sa.Text(), nullable=True),
+    )
+
+    # Create indexes for agent_artifact
+    with op.batch_alter_table('agent_artifact', schema=None) as batch_op:
+        batch_op.create_index('ix_agent_artifact_artifact_id', ['artifact_id'], unique=True)
+        batch_op.create_index('ix_agent_artifact_conversation_id', ['conversation_id'], unique=False)
+        batch_op.create_index('ix_agent_artifact_run_id', ['run_id'], unique=False)
+
+
+def downgrade() -> None:
+    # Drop agent_artifact table
+    with op.batch_alter_table('agent_artifact', schema=None) as batch_op:
+        batch_op.drop_index('ix_agent_artifact_run_id')
+        batch_op.drop_index('ix_agent_artifact_conversation_id')
+        batch_op.drop_index('ix_agent_artifact_artifact_id')
+
+    op.drop_table('agent_artifact')
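Review note: the four new revisions appear here in file-name order, which is not the order Alembic applies them. The sketch below recovers the upgrade order from the `revision` / `down_revision` pairs in the headers above. The helper `upgrade_order` is hypothetical and only handles a single linear chain.

```python
# revision -> down_revision, copied from the four migration headers in this diff
DOWN_REVISION = {
    '0004_migrate_runner_config': '0003_add_rerank_models',
    '58846a8d7a81': '0004_migrate_runner_config',
    'a1b2c3d4e5f6': '58846a8d7a81',
    '6dfd3dd7f0c7': 'a1b2c3d4e5f6',
}


def upgrade_order(down_revision: dict) -> list:
    """Order revisions from oldest to newest by following down_revision links."""
    parents = set(down_revision.values())
    # The head is the one revision that no other revision points back to.
    heads = [rev for rev in down_revision if rev not in parents]
    if len(heads) != 1:
        raise ValueError(f'expected exactly one head, found {heads}')

    chain = []
    rev = heads[0]
    while rev in down_revision:  # stop at the first revision not listed here
        chain.append(rev)
        rev = down_revision[rev]
    chain.reverse()
    return chain


order = upgrade_order(DOWN_REVISION)
```

Read this way, the chain is config migration, then `event_log` and `transcript`, then `agent_artifact`, then `agent_runner_state`. Note that `6dfd3dd7f0c7` has an earlier Create Date (19:49) than its parent `a1b2c3d4e5f6` (20:00), so the timestamps alone do not give the apply order.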
--- a/src/langbot/pkg/persistence/migrations/dbm001_migrate_v3_config.py
+++ b/src/langbot/pkg/persistence/migrations/dbm001_migrate_v3_config.py
@@ -118,9 +118,6 @@ class DBMigrateV3Config(migration.DBMigration):
                'runner': self.ap.provider_cfg.data['runner'],
            }
            pipeline_config['ai']['local-agent']['model'] = model_uuid
-            pipeline_config['ai']['local-agent']['max-round'] = self.ap.pipeline_cfg.data['msg-truncate']['round'][
-                'max-round'
-            ]

            pipeline_config['ai']['local-agent']['prompt'] = [
                {
--- a/src/langbot/pkg/pipeline/msgtrun/__init__.py
+++ b/src/langbot/pkg/pipeline/msgtrun/__init__.py
--- a/src/langbot/pkg/pipeline/msgtrun/msgtrun.py
+++ b/src/langbot/pkg/pipeline/msgtrun/msgtrun.py
@@ -1,35 +0,0 @@
-from __future__ import annotations
-
-from .. import stage, entities
-from . import truncator
-from ...utils import importutil
-import langbot_plugin.api.entities.builtin.pipeline.query as pipeline_query
-from . import truncators
-
-importutil.import_modules_in_pkg(truncators)
-
-
-@stage.stage_class('ConversationMessageTruncator')
-class ConversationMessageTruncator(stage.PipelineStage):
-    """Conversation message truncator
-
-    Used to truncate the conversation message chain to adapt to the LLM message length limit.
-    """
-
-    trun: truncator.Truncator
-
-    async def initialize(self, pipeline_config: dict):
-        use_method = 'round'
-
-        for trun in truncator.preregistered_truncators:
-            if trun.name == use_method:
-                self.trun = trun(self.ap)
-                break
-        else:
-            raise ValueError(f'Unknown truncator: {use_method}')
-
-    async def process(self, query: pipeline_query.Query, stage_inst_name: str) -> entities.StageProcessResult:
-        """Process"""
-        query = await self.trun.truncate(query)
-
-        return entities.StageProcessResult(result_type=entities.ResultType.CONTINUE, new_query=query)
--- a/src/langbot/pkg/pipeline/msgtrun/truncator.py
+++ b/src/langbot/pkg/pipeline/msgtrun/truncator.py
@@ -1,56 +0,0 @@
-from __future__ import annotations
-
-import typing
-import abc
-
-from ...core import app
-import langbot_plugin.api.entities.builtin.pipeline.query as pipeline_query
-
-preregistered_truncators: list[typing.Type[Truncator]] = []
-
-
-def truncator_class(
-    name: str,
-) -> typing.Callable[[typing.Type[Truncator]], typing.Type[Truncator]]:
-    """Truncator class decorator
-
-    Args:
-        name (str): Truncator name
-
-    Returns:
-        typing.Callable[[typing.Type[Truncator]], typing.Type[Truncator]]: The decorator
-    """
-
-    def decorator(cls: typing.Type[Truncator]) -> typing.Type[Truncator]:
-        assert issubclass(cls, Truncator)
-
-        cls.name = name
-
-        preregistered_truncators.append(cls)
-
-        return cls
-
-    return decorator
-
-
-class Truncator(abc.ABC):
-    """Base class for message truncators"""
-
-    name: str
-
-    ap: app.Application
-
-    def __init__(self, ap: app.Application):
-        self.ap = ap
-
-    async def initialize(self):
-        pass
-
-    @abc.abstractmethod
-    async def truncate(self, query: pipeline_query.Query) -> pipeline_query.Query:
-        """Truncate
-
-        Usually only query.messages needs to be modified; query.prompt and query.user_message may also be modified if needed.
-        Do not modify any other fields.
-        """
-        pass
--- a/src/langbot/pkg/pipeline/msgtrun/truncators/__init__.py
+++ b/src/langbot/pkg/pipeline/msgtrun/truncators/__init__.py
--- a/src/langbot/pkg/pipeline/msgtrun/truncators/round.py
+++ b/src/langbot/pkg/pipeline/msgtrun/truncators/round.py
@@ -1,30 +0,0 @@
-from __future__ import annotations
-
-from .. import truncator
-import langbot_plugin.api.entities.builtin.pipeline.query as pipeline_query
-
-
-@truncator.truncator_class('round')
-class RoundTruncator(truncator.Truncator):
-    """Truncate the conversation message chain to adapt to the LLM message length limit."""
-
-    async def truncate(self, query: pipeline_query.Query) -> pipeline_query.Query:
-        """Truncate"""
-        max_round = query.pipeline_config['ai']['local-agent']['max-round']
-
-        temp_messages = []
-
-        current_round = 0
-
-        # Traverse from back to front
-        for msg in query.messages[::-1]:
-            if current_round < max_round:
-                temp_messages.append(msg)
-                if msg.role == 'user':
-                    current_round += 1
-            else:
-                break
-
-        query.messages = temp_messages[::-1]
-
-        return query
--- a/src/langbot/pkg/pipeline/pipelinemgr.py
+++ b/src/langbot/pkg/pipeline/pipelinemgr.py
@@ -28,7 +28,6 @@ from . import (
    wrapper,
    preproc,
    ratelimit,
-    msgtrun,
 )

 importutil.import_modules_in_pkgs(
@@ -42,7 +41,6 @@ importutil.import_modules_in_pkgs(
        wrapper,
        preproc,
        ratelimit,
-        msgtrun,
    ]
 )

@@ -438,6 +436,9 @@ class PipelineManager:
        # initialize stage containers according to pipeline_entity.stages
        stage_containers: list[StageInstContainer] = []
        for stage_name in pipeline_entity.stages:
+            if stage_name not in self.stage_dict:
+                self.ap.logger.warning(f'Pipeline stage {stage_name} is not registered; skipping')
+                continue
            stage_containers.append(StageInstContainer(inst_name=stage_name, inst=self.stage_dict[stage_name](self.ap)))

        for stage_container in stage_containers:
--- a/src/langbot/pkg/pipeline/preproc/preproc.py
+++ b/src/langbot/pkg/pipeline/preproc/preproc.py
@@ -1,6 +1,7 @@
 from __future__ import annotations

 import datetime
+import typing

 from .. import stage, entities
 from langbot_plugin.api.entities.builtin.provider import message as provider_message
@@ -9,6 +10,14 @@ import langbot_plugin.api.entities.builtin.platform.message as platform_message
 import langbot_plugin.api.entities.builtin.pipeline.query as pipeline_query
 import langbot_plugin.api.entities.builtin.platform.events as platform_events

+from ...agent.runner.descriptor import AgentRunnerDescriptor
+from ...agent.runner.config_migration import ConfigMigration
+from ...agent.runner import config_schema
+
+
+DEFAULT_PROMPT_CONFIG = [
+    {'role': 'system', 'content': 'You are a helpful assistant.'},
+]

 @stage.stage_class('PreProcessor')
 class PreProcessor(stage.PipelineStage):
@@ -25,55 +34,109 @@ class PreProcessor(stage.PipelineStage):
        - use_funcs
    """

+    async def _get_runner_descriptor(
+        self,
+        runner_id: str | None,
+        bound_plugins: list[str] | None,
+    ) -> AgentRunnerDescriptor | None:
+        if not runner_id:
+            return None
+
+        registry = getattr(self.ap, 'agent_runner_registry', None)
+        if registry is None:
+            return None
+
+        try:
+            return await registry.get(runner_id, bound_plugins)
+        except Exception as e:
+            self.ap.logger.debug(f'Unable to load AgentRunner descriptor for {runner_id}: {e}')
+            return None
+
+    async def _resolve_llm_model(
+        self,
+        primary_uuid: str,
+    ) -> typing.Any | None:
+        if primary_uuid in config_schema.NONE_SENTINELS:
+            return None
+        try:
+            return await self.ap.model_mgr.get_model_by_uuid(primary_uuid)
+        except ValueError:
+            self.ap.logger.warning(f'LLM model {primary_uuid} not found or not configured')
+            return None
+
+    async def _resolve_fallback_models(self, fallback_uuids: list[str]) -> list[str]:
+        valid_fallbacks = []
+        for fallback_uuid in fallback_uuids:
+            if fallback_uuid in config_schema.NONE_SENTINELS:
+                continue
+            try:
+                await self.ap.model_mgr.get_model_by_uuid(fallback_uuid)
+                valid_fallbacks.append(fallback_uuid)
+            except ValueError:
+                self.ap.logger.warning(f'Fallback model {fallback_uuid} not found, skipping')
+        return valid_fallbacks
+
+    def _runner_accepts_multimodal_input(self, descriptor: AgentRunnerDescriptor | None) -> bool:
+        if descriptor is None:
+            return True
+        return descriptor.capabilities.get('multimodal_input', False)
+
+    def _model_supports_vision(self, llm_model: typing.Any | None) -> bool:
+        if not llm_model:
+            return False
+        abilities = getattr(getattr(llm_model, 'model_entity', None), 'abilities', [])
+        return 'vision' in abilities
+
+    def _should_keep_image_inputs(
+        self,
+        descriptor: AgentRunnerDescriptor | None,
+        uses_host_models: bool,
+        llm_model: typing.Any | None,
+    ) -> bool:
+        if not self._runner_accepts_multimodal_input(descriptor):
+            return False
+        if uses_host_models:
+            return self._model_supports_vision(llm_model)
+        return True
+
+    def _strip_images_from_history(self, query: pipeline_query.Query) -> None:
+        for msg in query.messages:
+            if isinstance(msg.content, list):
+                msg.content = [elem for elem in msg.content if elem.type != 'image_url']
+
    async def process(
        self,
        query: pipeline_query.Query,
        stage_inst_name: str,
    ) -> entities.StageProcessResult:
        """Process"""
-        selected_runner = query.pipeline_config['ai']['runner']['runner']
-        include_skill_authoring = (
-            selected_runner == 'local-agent' and getattr(self.ap, 'skill_service', None) is not None
-        )
+        # Resolve runner ID using ConfigMigration (supports both new and old formats)
+        runner_id = ConfigMigration.resolve_runner_id(query.pipeline_config)
+
+        # Get runner config from ai.runner_config[runner_id].
+        runner_config = ConfigMigration.resolve_runner_config(query.pipeline_config, runner_id) if runner_id else {}
+        query.variables = query.variables or {}
+        bound_plugins = query.variables.get('_pipeline_bound_plugins', None)
+        bound_mcp_servers = query.variables.get('_pipeline_bound_mcp_servers', None)
+        descriptor = await self._get_runner_descriptor(runner_id, bound_plugins)

        session = await self.ap.sess_mgr.get_session(query)

-        # When not local-agent, llm_model is None
+        uses_host_models = config_schema.uses_host_models(descriptor)
        llm_model = None
-        if selected_runner == 'local-agent':
-            # Read model config — new format is { primary: str, fallbacks: [str] },
-            # but handle legacy plain string for backward compatibility
-            model_config = query.pipeline_config['ai']['local-agent'].get('model', {})
-            if isinstance(model_config, str):
-                # Legacy format: plain UUID string
-                primary_uuid = model_config
-                fallback_uuids = []
-            else:
-                primary_uuid = model_config.get('primary', '')
-                fallback_uuids = model_config.get('fallbacks', [])
+        if uses_host_models:
+            primary_uuid, fallback_uuids = config_schema.extract_model_selection(descriptor, runner_config)
+            llm_model = await self._resolve_llm_model(primary_uuid)
+            valid_fallbacks = await self._resolve_fallback_models(fallback_uuids)
+            if valid_fallbacks:
+                query.variables['_fallback_model_uuids'] = valid_fallbacks

-            if primary_uuid:
-                try:
-                    llm_model = await self.ap.model_mgr.get_model_by_uuid(primary_uuid)
-                except ValueError:
-                    self.ap.logger.warning(f'LLM model {primary_uuid} not found or not configured')
-
-            # Resolve fallback model UUIDs
-            if fallback_uuids:
-                valid_fallbacks = []
-                for fb_uuid in fallback_uuids:
-                    try:
-                        await self.ap.model_mgr.get_model_by_uuid(fb_uuid)
-                        valid_fallbacks.append(fb_uuid)
-                    except ValueError:
-                        self.ap.logger.warning(f'Fallback model {fb_uuid} not found, skipping')
-                if valid_fallbacks:
-                    query.variables['_fallback_model_uuids'] = valid_fallbacks
+        prompt_config = config_schema.extract_prompt_config(descriptor, runner_config, DEFAULT_PROMPT_CONFIG)

        conversation = await self.ap.sess_mgr.get_conversation(
            query,
            session,
-            query.pipeline_config['ai']['local-agent']['prompt'],
+            prompt_config,
            query.pipeline_uuid,
            query.bot_uuid,
        )
@@ -82,7 +145,7 @@ class PreProcessor(stage.PipelineStage):
        # been idle for longer than the configured conversation expire time.
        # The idle window is measured from the last preprocess/update time, not
        # from the conversation creation time.
-        conversation_expire_time = query.pipeline_config.get('ai', {}).get('runner', {}).get('expire-time', None)
+        conversation_expire_time = ConfigMigration.get_expire_time(query.pipeline_config)
        now = datetime.datetime.now()
        if conversation_expire_time is not None and conversation_expire_time > 0:
            last_update_time = getattr(conversation, 'update_time', None) or getattr(conversation, 'create_time', None)
@@ -104,20 +167,15 @@ class PreProcessor(stage.PipelineStage):
        query.prompt = conversation.prompt.copy()
        query.messages = conversation.messages.copy()

-        if selected_runner == 'local-agent':
+        if uses_host_models:
            query.use_funcs = []
            if llm_model:
                query.use_llm_model_uuid = llm_model.model_entity.uuid

-                if llm_model.model_entity.abilities.__contains__('func_call'):
-                    # Get bound plugins and MCP servers for filtering tools
-                    bound_plugins = query.variables.get('_pipeline_bound_plugins', None)
-                    bound_mcp_servers = query.variables.get('_pipeline_bound_mcp_servers', None)
-                    query.use_funcs = await self.ap.tool_mgr.get_all_tools(
-                        bound_plugins,
-                        bound_mcp_servers,
-                        include_skill_authoring=include_skill_authoring,
-                    )
+                if config_schema.uses_host_tools(descriptor) and llm_model.model_entity.abilities.__contains__(
+                    'func_call'
+                ):
+                    query.use_funcs = await self.ap.tool_mgr.get_all_tools(bound_plugins, bound_mcp_servers)

                    self.ap.logger.debug(f'Bound plugins: {bound_plugins}')
                    self.ap.logger.debug(f'Bound MCP servers: {bound_mcp_servers}')
@@ -125,14 +183,18 @@ class PreProcessor(stage.PipelineStage):

            # If primary model doesn't support func_call but fallback models exist,
            # load tools anyway since fallback models may support them
-            if not query.use_funcs and query.variables.get('_fallback_model_uuids'):
-                bound_plugins = query.variables.get('_pipeline_bound_plugins', None)
-                bound_mcp_servers = query.variables.get('_pipeline_bound_mcp_servers', None)
-                query.use_funcs = await self.ap.tool_mgr.get_all_tools(
-                    bound_plugins,
-                    bound_mcp_servers,
-                    include_skill_authoring=include_skill_authoring,
-                )
+            if (
+                config_schema.uses_host_tools(descriptor)
+                and not query.use_funcs
+                and query.variables.get('_fallback_model_uuids')
+            ):
+                query.use_funcs = await self.ap.tool_mgr.get_all_tools(bound_plugins, bound_mcp_servers)
+        elif config_schema.uses_host_tools(descriptor):
+            query.use_funcs = await self.ap.tool_mgr.get_all_tools(bound_plugins, bound_mcp_servers)
+
+            self.ap.logger.debug(f'Bound plugins: {bound_plugins}')
+            self.ap.logger.debug(f'Bound MCP servers: {bound_mcp_servers}')
+            self.ap.logger.debug(f'Use funcs: {query.use_funcs}')

        sender_name = ''

@@ -157,32 +219,21 @@ class PreProcessor(stage.PipelineStage):
        }
        query.variables.update(variables)

-        # Check if this model supports vision, if not, remove all images
-        # TODO this checking should be performed in runner, and in this stage, the image should be reserved
-        if (
-            selected_runner == 'local-agent'
-            and llm_model
-            and not llm_model.model_entity.abilities.__contains__('vision')
-        ):
-            for msg in query.messages:
-                if isinstance(msg.content, list):
-                    for me in msg.content:
-                        if me.type == 'image_url':
-                            msg.content.remove(me)
+        keep_image_inputs = self._should_keep_image_inputs(descriptor, uses_host_models, llm_model)
+        if not keep_image_inputs:
+            self._strip_images_from_history(query)

        content_list: list[provider_message.ContentElement] = []

        plain_text = ''
-        quote_msg = query.pipeline_config['trigger'].get('misc', '').get('combine-quote-message')
+        quote_msg = query.pipeline_config['trigger'].get('misc', {}).get('combine-quote-message', False)

        for me in query.message_chain:
            if isinstance(me, platform_message.Plain):
                content_list.append(provider_message.ContentElement.from_text(me.text))
                plain_text += me.text
            elif isinstance(me, platform_message.Image):
-                if selected_runner != 'local-agent' or (
-                    llm_model and llm_model.model_entity.abilities.__contains__('vision')
-                ):
+                if keep_image_inputs:
                    if me.base64 is not None:
                        content_list.append(provider_message.ContentElement.from_image_base64(me.base64))
            elif isinstance(me, platform_message.Voice):
@@ -201,9 +252,7 @@ class PreProcessor(stage.PipelineStage):
                    if isinstance(msg, platform_message.Plain):
                        content_list.append(provider_message.ContentElement.from_text(msg.text))
                    elif isinstance(msg, platform_message.Image):
-                        if selected_runner != 'local-agent' or (
-                            llm_model and llm_model.model_entity.abilities.__contains__('vision')
-                        ):
+                        if keep_image_inputs:
                            if msg.base64 is not None:
                                content_list.append(provider_message.ContentElement.from_image_base64(msg.base64))
                    elif isinstance(msg, platform_message.File):
@@ -223,14 +272,12 @@ class PreProcessor(stage.PipelineStage):

        query.user_message = provider_message.Message(role='user', content=content_list)

-        # Extract knowledge base UUIDs into query variables so plugins can modify them
-        # during PromptPreProcessing before the runner performs retrieval.
-        kb_uuids = query.pipeline_config['ai']['local-agent'].get('knowledge-bases', [])
-        if not kb_uuids:
-            old_kb_uuid = query.pipeline_config['ai']['local-agent'].get('knowledge-base', '')
-            if old_kb_uuid and old_kb_uuid != '__none__':
-                kb_uuids = [old_kb_uuid]
-        query.variables['_knowledge_base_uuids'] = list(kb_uuids)
+        # Extract configured KB UUIDs into query variables so PromptPreProcessing
+        # plugins can still adjust the authorized retrieval set before run_agent.
+        query.variables['_knowledge_base_uuids'] = config_schema.extract_knowledge_base_uuids(
+            descriptor,
+            runner_config,
+        )

        # =========== Trigger event: PromptPreProcessing

@@ -248,67 +295,4 @@ class PreProcessor(stage.PipelineStage):
        query.prompt.messages = event_ctx.event.default_prompt
        query.messages = event_ctx.event.prompt

-        # =========== Skill awareness for the local-agent runner ===========
-        # The actual activation goes through the ``activate`` Tool Call so the
-        # LLM doesn't see full SKILL.md instructions until it commits to a
-        # skill (Claude Code's progressive disclosure). But the LLM still has
-        # to KNOW which skills exist to make that choice, so we:
-        #   1. resolve the pipeline's bound skills and stash them in
-        #      ``query.variables['_pipeline_bound_skills']`` for downstream
-        #      visibility checks (skill loader, native exec workdir);
-        #   2. inject a short ``Available Skills`` index (name + description
-        #      only) into the system prompt. The contributor's original PR
-        #      relied on this injection; without it the LLM never discovers
-        #      the skills are there and just calls native tools instead.
-        if selected_runner == 'local-agent' and self.ap.skill_mgr:
-            pipeline_data = await self.ap.pipeline_service.get_pipeline(query.pipeline_uuid)
-            extensions_prefs = (pipeline_data or {}).get('extensions_preferences', {})
-            enable_all_skills = extensions_prefs.get('enable_all_skills', True)
-
-            if enable_all_skills:
-                bound_skills = None  # None = all loaded skills are visible
-            else:
-                bound_skills = extensions_prefs.get('skills', [])
-
-            query.variables['_pipeline_bound_skills'] = bound_skills
-
-            skill_addition = self.ap.skill_mgr.build_skill_aware_prompt_addition(
-                bound_skills=bound_skills,
-            )
-            if skill_addition:
-                # Append to the first system message; create one if the
-                # prompt has none. Handles both plain-string and
-                # content-element (list) message bodies.
-                if query.prompt.messages and query.prompt.messages[0].role == 'system':
-                    head = query.prompt.messages[0]
-                    if isinstance(head.content, str):
-                        head.content = head.content + skill_addition
-                    elif isinstance(head.content, list):
-                        appended = False
-                        for ce in head.content:
-                            if getattr(ce, 'type', None) == 'text':
-                                ce.text = (ce.text or '') + skill_addition
-                                appended = True
-                                break
-                        if not appended:
-                            head.content.append(provider_message.ContentElement(type='text', text=skill_addition))
-                else:
-                    query.prompt.messages.insert(
-                        0,
-                        provider_message.Message(role='system', content=skill_addition.strip()),
-                    )
-                self.ap.logger.debug(
-                    f'Skill index injected into system prompt: '
-                    f'pipeline={query.pipeline_uuid} '
-                    f'bound_skills={bound_skills or "all"} '
-                    f'loaded_skills={len(self.ap.skill_mgr.skills)}'
-                )
-            else:
-                self.ap.logger.debug(
-                    f'No skills available for prompt injection: '
-                    f'pipeline={query.pipeline_uuid} '
-                    f'loaded_skills={len(self.ap.skill_mgr.skills)} '
-                    f'bound_skills={bound_skills}'
-                )
-
        return entities.StageProcessResult(result_type=entities.ResultType.CONTINUE, new_query=query)
--- a/src/langbot/pkg/pipeline/process/handler.py
+++ b/src/langbot/pkg/pipeline/process/handler.py
@@ -5,7 +5,6 @@ import abc
 from ...core import app
 from .. import entities
 import langbot_plugin.api.entities.builtin.pipeline.query as pipeline_query
-import langbot_plugin.api.entities.builtin.provider.message as provider_message


 class MessageHandler(metaclass=abc.ABCMeta):
@@ -32,29 +31,3 @@ class MessageHandler(metaclass=abc.ABCMeta):
        if len(s0) > 20 or '\n' in s:
            s0 = s0[:20] + '...'
        return s0
-
-    def format_result_log(
-        self,
-        result: provider_message.Message | provider_message.MessageChunk,
-    ) -> str | None:
-        if result.tool_calls:
-            tool_names = [tc.function.name for tc in result.tool_calls if tc.function and tc.function.name]
-            if tool_names:
-                return f'{result.role}: requested tools: {", ".join(tool_names)}'
-            return f'{result.role}: requested tool calls'
-
-        content = result.content
-        if isinstance(content, str):
-            if not content.strip():
-                return None
-
-            if result.role == 'tool':
-                if content.startswith('err:'):
-                    return f'tool error: {self.cut_str(content)}'
-
-            return self.cut_str(result.readable_str())
-
-        if isinstance(content, list) and len(content) == 0:
-            return None
-
-        return self.cut_str(result.readable_str())
--- a/src/langbot/pkg/pipeline/process/handlers/chat.py
+++ b/src/langbot/pkg/pipeline/process/handlers/chat.py
@@ -9,29 +9,28 @@ from datetime import datetime

 from .. import handler
 from ... import entities
-from ....provider import runner as runner_module

 import langbot_plugin.api.entities.events as events
-from ....utils import importutil, constants, runner as runner_utils
-from ....provider import runners
+from ....utils import constants, runner as runner_utils
 import langbot_plugin.api.entities.builtin.provider.session as provider_session
 import langbot_plugin.api.entities.builtin.pipeline.query as pipeline_query
 import langbot_plugin.api.entities.builtin.provider.message as provider_message


-importutil.import_modules_in_pkg(runners)
-
-
 class ChatMessageHandler(handler.MessageHandler):
+    """Chat message handler using AgentRunOrchestrator.
+
+    This handler delegates all runner execution to the agent_run_orchestrator,
+    which resolves runner ID, builds context, invokes plugin runtime,
+    and normalizes results.
+    """
+
    async def handle(
        self,
        query: pipeline_query.Query,
    ) -> typing.AsyncGenerator[entities.StageProcessResult, None]:
-        """Process"""
-        # Call the API
-        #   generator
-
-        # Trigger plugin event
+        """Handle chat message by delegating to AgentRunOrchestrator."""
+        # Trigger plugin event
        event_class = (
            events.PersonNormalMessageReceived
            if query.launcher_type == provider_session.LauncherTypes.PERSON
@@ -52,7 +51,7 @@ class ChatMessageHandler(handler.MessageHandler):
        bound_plugins = query.variables.get('_pipeline_bound_plugins', None)
        event_ctx = await self.ap.plugin_connector.emit_event(event, bound_plugins)

-        is_create_card = False  # Whether a streaming card needs to be created
+        is_create_card = False  # Track if streaming card was created

        if event_ctx.is_prevented_default():
            if event_ctx.event.reply_message_chain is not None:
@@ -83,85 +82,85 @@ class ChatMessageHandler(handler.MessageHandler):
                is_stream = False

            try:
-                for r in runner_module.preregistered_runners:
-                    if r.name == query.pipeline_config['ai']['runner']['runner']:
-                        runner = r(self.ap, query.pipeline_config)
-                        break
-                else:
-                    raise ValueError(f'Request Runner not found: {query.pipeline_config["ai"]["runner"]["runner"]}')
                # Mark start time for telemetry
                start_ts = time.time()

-                if is_stream:
-                    resp_message_id = uuid.uuid4()
-                    chunk_count = 0  # Track streaming chunks to reduce excessive logging
+                # Create a single resp_message_id for the entire streaming response
+                resp_message_id = uuid.uuid4()

-                    async for result in runner.run(query):
-                        result.resp_message_id = str(resp_message_id)
+                # Use AgentRunOrchestrator to run the agent
+                # This replaces direct runner lookup and PluginAgentRunnerWrapper
+                async for result in self.ap.agent_run_orchestrator.run_from_query(query):
+                    result.resp_message_id = str(resp_message_id)
+
+                    # For streaming mode, pop previous response before adding new chunk
+                    # This allows incremental card updates
+                    if is_stream:
                        if query.resp_messages:
                            query.resp_messages.pop()
                        if query.resp_message_chain:
                            query.resp_message_chain.pop()
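-                        # The connection to the external AI service is working at this point, so create the card
-                        if not is_create_card:  # Create the card only if it has not been created yet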
-                            await query.adapter.create_message_card(str(resp_message_id), query.message_event)
-                            is_create_card = True
-                        query.resp_messages.append(result)

-                        chunk_count += 1
-                        # Only log every 10th chunk to reduce excessive logging during streaming
-                        # This prevents memory overflow from thousands of log entries per conversation
-                        # First chunk uses INFO level to confirm connection establishment
-                        if chunk_count == 1:
-                            summary = self.format_result_log(result)
-                            if summary is not None:
-                                self.ap.logger.info(f'Conversation({query.query_id}) Streaming started: {summary}')
-                            else:
-                                self.ap.logger.info(f'Conversation({query.query_id}) Streaming started')
-                        elif chunk_count % 10 == 0:
-                            self.ap.logger.debug(
-                                f'Conversation({query.query_id}) Streaming chunk {chunk_count}: {self.cut_str(result.readable_str())}'
-                            )
+                    # Create streaming card on first result (connection established)
+                    if is_stream and not is_create_card:
+                        await query.adapter.create_message_card(str(resp_message_id), query.message_event)
+                        is_create_card = True

-                        if result.content is not None:
-                            text_length += len(result.content)
+                    query.resp_messages.append(result)

-                        yield entities.StageProcessResult(result_type=entities.ResultType.CONTINUE, new_query=query)
+                    # Logging (reduce verbosity for streaming chunks)
+                    if not is_stream:
+                        self.ap.logger.info(
+                            f'Conversation({query.query_id}) Response: {self.cut_str(result.readable_str())}'
+                        )

-                    # Log final summary after streaming completes
+                    if result.content is not None:
+                        text_length += len(result.content)
+
+                    yield entities.StageProcessResult(result_type=entities.ResultType.CONTINUE, new_query=query)
+
+                # Log final summary after streaming completes
+                if is_stream:
+                    chunk_count = len(query.resp_messages)
                    self.ap.logger.info(
                        f'Conversation({query.query_id}) Streaming completed: {chunk_count} chunks, {text_length} chars'
                    )

-                else:
-                    async for result in runner.run(query):
-                        query.resp_messages.append(result)
-
-                        summary = self.format_result_log(result)
-                        if summary is not None:
-                            self.ap.logger.info(f'Conversation({query.query_id}) Response: {summary}')
-
-                        if result.content is not None:
-                            text_length += len(result.content)
-
-                        yield entities.StageProcessResult(result_type=entities.ResultType.CONTINUE, new_query=query)
-
+                # Update conversation history
                query.session.using_conversation.messages.append(query.user_message)
-
                query.session.using_conversation.messages.extend(query.resp_messages)
+
            except Exception as e:
+                # Import orchestrator errors for specific handling
+                from ....agent.runner.errors import (
+                    RunnerNotFoundError,
+                    RunnerNotAuthorizedError,
+                    RunnerExecutionError,
+                )
+
                error_info = f'{traceback.format_exc()}'
                self.ap.logger.error(f'Conversation({query.query_id}) Request Failed: {error_info}')
-                traceback.print_exc()

-                exception_handling = query.pipeline_config['output']['misc'].get('exception-handling', 'show-hint')
+                # Handle specific runner errors with appropriate messages
+                if isinstance(e, RunnerNotFoundError):
+                    user_notice = f'Agent runner not found: {e.runner_id}'
+                elif isinstance(e, RunnerNotAuthorizedError):
+                    user_notice = 'Agent runner not authorized for this pipeline'
+                elif isinstance(e, RunnerExecutionError):
+                    if e.retryable:
+                        user_notice = 'Agent runner temporarily unavailable. Please try again.'
+                    else:
+                        user_notice = 'Agent runner execution failed.'
+                else:
+                    # Use existing exception handling
+                    exception_handling = query.pipeline_config['output']['misc'].get('exception-handling', 'show-hint')

-                if exception_handling == 'show-error':
-                    user_notice = f'{e}'
-                elif exception_handling == 'show-hint':
-                    user_notice = query.pipeline_config['output']['misc'].get('failure-hint', 'Request failed.')
-                else:  # hide
-                    user_notice = None
+                    if exception_handling == 'show-error':
+                        user_notice = f'{e}'
+                    elif exception_handling == 'show-hint':
+                        user_notice = query.pipeline_config['output']['misc'].get('failure-hint', 'Request failed.')
+                    else:  # hide
+                        user_notice = None

                yield entities.StageProcessResult(
                    result_type=entities.ResultType.INTERRUPT,
@@ -171,7 +170,7 @@ class ChatMessageHandler(handler.MessageHandler):
                    debug_notice=traceback.format_exc(),
                )
            finally:
-                # Telemetry reporting: collect minimal per-query execution info and send asynchronously
+                # Telemetry reporting
                try:
                    end_ts = time.time()
                    duration_ms = None
@@ -179,16 +178,14 @@ class ChatMessageHandler(handler.MessageHandler):
                        duration_ms = int((end_ts - start_ts) * 1000)

                    adapter_name = query.adapter.__class__.__name__ if hasattr(query, 'adapter') else None
-                    runner_name = (
-                        query.pipeline_config.get('ai', {}).get('runner', {}).get('runner')
-                        if query.pipeline_config
-                        else None
-                    )

-                    # Model name if using localagent
+                    # Use orchestrator to resolve runner ID for telemetry
+                    runner_name = self.ap.agent_run_orchestrator.resolve_runner_id_for_telemetry(query)
+
+                    # Model name if available
                    model_name = None
                    try:
-                        if runner_name == 'local-agent' and getattr(query, 'use_llm_model_uuid', None):
+                        if getattr(query, 'use_llm_model_uuid', None):
                            m = await self.ap.model_mgr.get_model_by_uuid(query.use_llm_model_uuid)
                            if m and getattr(m, 'model_entity', None):
                                model_name = getattr(m.model_entity, 'name', None)
@@ -198,7 +195,7 @@ class ChatMessageHandler(handler.MessageHandler):
                    pipeline_plugins = query.variables.get('_pipeline_bound_plugins', None)

                    runner_category = runner_utils.get_runner_category_from_runner(
-                        runner_name, runner, query.pipeline_config
+                        runner_name, None, query.pipeline_config
                    )

                    payload = {
@@ -216,7 +213,6 @@ class ChatMessageHandler(handler.MessageHandler):
                        'timestamp': datetime.utcnow().isoformat(),
                    }

-                    # Send telemetry asynchronously and do not block pipeline via app's telemetry manager
                    await self.ap.telemetry.start_send_task(payload)

                    # Trigger survey event on first successful non-WebSocket response
@@ -224,5 +220,4 @@ class ChatMessageHandler(handler.MessageHandler):
                        if self.ap.survey:
                            await self.ap.survey.trigger_event('first_bot_response_success')
                except Exception as ex:
-                    # Ensure telemetry issues do not affect normal flow
-                    self.ap.logger.warning(f'Failed to send telemetry: {ex}')
+                    self.ap.logger.warning(f'Failed to send telemetry: {ex}')
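
Review note: the new `except` branch above maps runner errors to user-facing notices before falling back to the pipeline's `exception-handling` setting. The sketch below restates that mapping as a standalone function so the branches can be read and tested in isolation. The three exception classes are hypothetical stand-ins (the real ones live in `agent.runner.errors`, and their constructors are not shown in this diff); `resolve_user_notice` is likewise an illustrative name, not a function in the codebase.

```python
from typing import Optional


class RunnerNotFoundError(Exception):
    """Hypothetical stand-in; only the `runner_id` attribute comes from the diff."""

    def __init__(self, runner_id: str):
        super().__init__(f'runner not found: {runner_id}')
        self.runner_id = runner_id


class RunnerNotAuthorizedError(Exception):
    """Hypothetical stand-in."""


class RunnerExecutionError(Exception):
    """Hypothetical stand-in; only the `retryable` attribute comes from the diff."""

    def __init__(self, message: str, retryable: bool = False):
        super().__init__(message)
        self.retryable = retryable


def resolve_user_notice(e: Exception, misc_config: dict) -> Optional[str]:
    """Return the notice shown to the user, or None when errors are hidden."""
    if isinstance(e, RunnerNotFoundError):
        return f'Agent runner not found: {e.runner_id}'
    if isinstance(e, RunnerNotAuthorizedError):
        return 'Agent runner not authorized for this pipeline'
    if isinstance(e, RunnerExecutionError):
        if e.retryable:
            return 'Agent runner temporarily unavailable. Please try again.'
        return 'Agent runner execution failed.'

    # Any other exception follows the pre-existing pipeline setting.
    mode = misc_config.get('exception-handling', 'show-hint')
    if mode == 'show-error':
        return f'{e}'
    if mode == 'show-hint':
        return misc_config.get('failure-hint', 'Request failed.')
    return None  # 'hide'
```

One consequence worth noting: in this shape the runner-specific notices are returned regardless of the `exception-handling` setting, so a pipeline configured to hide errors would still surface them. That matches the branch order in the hunk as written.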
--- a/src/langbot/pkg/platform/sources/aiocqhttp.py
+++ b/src/langbot/pkg/platform/sources/aiocqhttp.py
@@ -3,7 +3,6 @@ import typing
 import asyncio
 import traceback
 import datetime
-import json

 import aiocqhttp
 import pydantic
@@ -294,29 +293,6 @@ class AiocqhttpMessageConverter(abstract_platform_adapter.AbstractMessageConvert
            elif msg.type == 'dice':
                face_id = msg.data['result']
                yiri_msg_list.append(platform_message.Face(face_type='dice', face_id=int(face_id), face_name='骰子'))
-            elif msg.type == 'json':
-                try:
-                    raw = msg.data.get('data', {})
-                    if isinstance(raw, str):
-                        raw = json.loads(raw)
-                    if isinstance(raw, dict):
-                        _meta = raw.get('meta', {}) or {}
-                        if isinstance(_meta, dict):
-                            _detail = _meta.get('detail_1') or _meta.get('music') or _meta.get('news') or {}
-                        else:
-                            _detail = {}
-                        if isinstance(_detail, dict):
-                            preview = _detail.get('preview', '')
-                            title = _detail.get('desc', '') or _detail.get('title', '')
-                            url = _detail.get('qqdocurl', '') or _detail.get('jumpUrl', '')
-                        else:
-                            preview = title = url = ''
-                        text = ' '.join([f'[{raw.get("app", "")}]', preview, title, url]).strip()
-                        yiri_msg_list.append(platform_message.Plain(text=text or '[收到一张JSON卡片]'))
-                    else:
-                        yiri_msg_list.append(platform_message.Plain(text=str(raw)))
-                except Exception:
-                    yiri_msg_list.append(platform_message.Plain(text='[收到一张JSON卡片]'))

        chain = platform_message.MessageChain(yiri_msg_list)

--- a/src/langbot/pkg/platform/sources/web_page_bot_adapter.py
+++ b/src/langbot/pkg/platform/sources/web_page_bot_adapter.py
@@ -84,6 +84,20 @@ class WebPageBotAdapter(abstract_platform_adapter.AbstractMessagePlatformAdapter
    ):
        self.listeners.pop(event_type, None)

+    async def is_stream_output_supported(self) -> bool:
+        """Delegate stream output check to ws_adapter."""
+        if self._ws_adapter is not None:
+            return await self._ws_adapter.is_stream_output_supported()
+        return False
+
+    async def create_message_card(
+        self, message_id: str | int, event: platform_events.MessageEvent
+    ) -> bool:
+        """Delegate create_message_card to ws_adapter."""
+        if self._ws_adapter is not None:
+            return await self._ws_adapter.create_message_card(message_id, event)
+        return False
+
    async def is_muted(self, group_id: int) -> bool:
        return False

--- a/src/langbot/pkg/plugin/connector.py
+++ b/src/langbot/pkg/plugin/connector.py
@@ -18,7 +18,6 @@ from langbot_plugin.api.entities.builtin.pipeline.query import provider_session
 from ..core import app
 from . import handler
 from ..utils import platform
-from ..utils.managed_runtime import ManagedRuntimeConnector
 from langbot_plugin.runtime.io.controllers.stdio import (
    client as stdio_client_controller,
 )
@@ -40,9 +39,11 @@ class PluginRuntimeNotConnectedError(RuntimeError):
    """Raised when plugin runtime operations are requested before connection."""


-class PluginRuntimeConnector(ManagedRuntimeConnector):
+class PluginRuntimeConnector:
    """Plugin runtime connector"""

+    ap: app.Application
+
    handler: handler.RuntimeConnectionHandler

    handler_task: asyncio.Task
@@ -53,6 +54,10 @@ class PluginRuntimeConnector(ManagedRuntimeConnector):

    ctrl: stdio_client_controller.StdioClientController | ws_client_controller.WebSocketClientController

+    runtime_subprocess_on_windows: asyncio.subprocess.Process | None = None
+
+    runtime_subprocess_on_windows_task: asyncio.Task | None = None
+
    runtime_disconnect_callback: typing.Callable[
        [PluginRuntimeConnector], typing.Coroutine[typing.Any, typing.Any, None]
    ]
@@ -67,7 +72,7 @@ class PluginRuntimeConnector(ManagedRuntimeConnector):
            [PluginRuntimeConnector], typing.Coroutine[typing.Any, typing.Any, None]
        ],
    ):
-        super().__init__(ap)
+        self.ap = ap
        self.runtime_disconnect_callback = runtime_disconnect_callback
        self.is_enable_plugin = self.ap.instance_config.data.get('plugin', {}).get('enable', True)

@@ -103,16 +108,6 @@ class PluginRuntimeConnector(ManagedRuntimeConnector):

            self.handler_task = asyncio.create_task(self.handler.run())
            _ = await self.handler.ping()
-            # Push the configured marketplace (Space) URL to the runtime so it
-            # downloads plugins from the same Space LangBot is bound to, rather
-            # than relying on the runtime's own env/default.
-            space_url = self.ap.instance_config.data.get('space', {}).get('url', '').rstrip('/')
-            if space_url:
-                try:
-                    await self.handler.set_runtime_config(cloud_service_url=space_url)
-                    self.ap.logger.info(f'Pushed marketplace URL to plugin runtime: {space_url}')
-                except Exception as e:
-                    self.ap.logger.warning(f'Failed to push runtime config: {e}')
            self.ap.logger.info('Connected to plugin runtime.')
            await self.handler_task

@@ -145,7 +140,19 @@ class PluginRuntimeConnector(ManagedRuntimeConnector):
            # We have to launch runtime via cmd but communicate via ws.
            self.ap.logger.info('(windows) use cmd to launch plugin runtime and communicate via ws')

-            await self._start_runtime_subprocess('-m', 'langbot_plugin.cli.__init__', 'rt')
+            if self.runtime_subprocess_on_windows is None:  # only launch once
+                python_path = sys.executable
+                env = os.environ.copy()
+                self.runtime_subprocess_on_windows = await asyncio.create_subprocess_exec(
+                    python_path,
+                    '-m',
+                    'langbot_plugin.cli.__init__',
+                    'rt',
+                    env=env,
+                )
+
+                # hold the process
+                self.runtime_subprocess_on_windows_task = asyncio.create_task(self.runtime_subprocess_on_windows.wait())

            ws_url = 'ws://localhost:5400/control/ws'

@@ -187,6 +194,15 @@ class PluginRuntimeConnector(ManagedRuntimeConnector):
    async def initialize_plugins(self):
        pass

+    async def _refresh_agent_runner_registry(self) -> None:
+        registry = getattr(self.ap, 'agent_runner_registry', None)
+        if registry is None:
+            return
+        try:
+            await registry.refresh()
+        except Exception as e:
+            self.ap.logger.warning(f'Failed to refresh agent runner registry: {e}')
+
    async def ping_plugin_runtime(self):
        if not hasattr(self, 'handler'):
            raise PluginRuntimeNotConnectedError('Plugin runtime is not connected')
@@ -229,81 +245,6 @@ class PluginRuntimeConnector(ManagedRuntimeConnector):

        return plugin_author, plugin_name

-    async def _install_mcp_from_marketplace(
-        self,
-        mcp_data: dict[str, Any],
-        task_context: taskmgr.TaskContext | None = None,
-    ):
-        """Install an MCP server from marketplace data.
-
-        Marketplace MCP records carry the runtime-ready ``mode`` and
-        ``extra_args`` directly (the same shape LangBot stores in
-        ``mcp_servers``), so they are used as-is rather than reconstructed.
-        For ``stdio`` this preserves ``command``/``args``/``env``/``box``;
-        for ``http``/``sse`` it preserves ``url``/``headers``/``timeout``/
-        ``ssereadtimeout``.
-        """
-        from ..entity.persistence import mcp as persistence_mcp
-        import uuid
-
-        mode = mcp_data.get('mode') or 'stdio'
-        extra_args = mcp_data.get('extra_args') or {}
-        # Use __ instead of / to avoid URL routing issues with slashes
-        name = f'{mcp_data.get("author", "")}__{mcp_data.get("name", "")}'
-
-        # Check if MCP server already exists
-        existing = await self.ap.persistence_mgr.execute_async(
-            sqlalchemy.select(persistence_mcp.MCPServer).where(persistence_mcp.MCPServer.name == name)
-        )
-        if existing.scalar_one_or_none():
-            self.ap.logger.info(f'MCP server {name} already exists, skipping installation')
-            return
-
-        # Create MCP server record
-        server_uuid = str(uuid.uuid4())
-        server_data = {
-            'uuid': server_uuid,
-            'name': name,
-            'enable': True,
-            'mode': mode,
-            'extra_args': extra_args,
-        }
-
-        await self.ap.persistence_mgr.execute_async(sqlalchemy.insert(persistence_mcp.MCPServer).values(server_data))
-
-        # Start the MCP server
-        result = await self.ap.persistence_mgr.execute_async(
-            sqlalchemy.select(persistence_mcp.MCPServer).where(persistence_mcp.MCPServer.uuid == server_uuid)
-        )
-        server_entity = result.first()
-        if server_entity:
-            server_config = self.ap.persistence_mgr.serialize_model(persistence_mcp.MCPServer, server_entity)
-            if self.ap.tool_mgr.mcp_tool_loader:
-                mcp_task = asyncio.create_task(self.ap.tool_mgr.mcp_tool_loader.host_mcp_server(server_config))
-                self.ap.tool_mgr.mcp_tool_loader._hosted_mcp_tasks.append(mcp_task)
-
-        self.ap.logger.info(f'Installed MCP server {name} from marketplace')
-
-    async def _install_skill_from_zip(
-        self,
-        file_bytes: bytes,
-        filename: str,
-        task_context: taskmgr.TaskContext | None = None,
-    ):
-        """Install a skill from marketplace ZIP data."""
-        from ..api.http.service.skill import SkillService
-
-        skill_service = SkillService(self.ap)
-
-        self.ap.logger.info(f'Installing skill from marketplace ZIP ({len(file_bytes)} bytes)')
-
-        # Install from ZIP using skill service
-        result = await skill_service.install_from_zip_upload(
-            file_bytes=file_bytes,
-            filename=filename + '.zip',
-        )
-        self.ap.logger.info(f'Skill installed successfully: {result}')
-
    def _build_plugin_startup_failure_message(
        self,
        plugin_author: str,
@@ -366,117 +307,6 @@ class PluginRuntimeConnector(ManagedRuntimeConnector):
        plugin_author = install_info.get('plugin_author')
        plugin_name = install_info.get('plugin_name')

-        if install_source == PluginInstallSource.MARKETPLACE:
-            # Handle marketplace plugin/mcp/skill installation
-            plugin_author = install_info.get('plugin_author', '')
-            plugin_name = install_info.get('plugin_name', '')
-            space_url = (
-                self.ap.instance_config.data.get('space', {}).get('url', 'https://space.langbot.app').rstrip('/')
-            )
-
-            # Try MCP endpoint first
-            async with httpx.AsyncClient(trust_env=True, timeout=15) as client:
-                mcp_resp = await client.get(f'{space_url}/api/v1/marketplace/mcps/{plugin_author}/{plugin_name}')
-                if mcp_resp.status_code == 200:
-                    mcp_data = mcp_resp.json().get('data', {}).get('mcp', {})
-                    if mcp_data.get('mode'):
-                        # It's an MCP - create server locally
-                        self.ap.logger.info(f'Installing MCP from marketplace: {plugin_author}/{plugin_name}')
-                        if task_context:
-                            task_context.set_current_action('installing mcp server')
-                        await self._install_mcp_from_marketplace(mcp_data, task_context)
-                        # Best-effort install report (bumps marketplace install_count).
-                        try:
-                            await client.post(
-                                f'{space_url}/api/v1/marketplace/mcps/{plugin_author}/{plugin_name}/install'
-                            )
-                        except Exception as report_err:
-                            self.ap.logger.debug(f'Failed to report MCP install: {report_err}')
-                        return
-                    else:
-                        raise Exception(f'MCP {plugin_author}/{plugin_name} has no mode')
-                elif mcp_resp.status_code == 404:
-                    # Try skill endpoint - download ZIP and install
-                    self.ap.logger.info(f'Trying skill endpoint for: {plugin_author}/{plugin_name}')
-                    if task_context:
-                        task_context.set_current_action('checking skill marketplace')
-
-                    # Get skill detail to find version
-                    skill_resp = await client.get(
-                        f'{space_url}/api/v1/marketplace/skills/{plugin_author}/{plugin_name}'
-                    )
-                    if skill_resp.status_code == 200:
-                        self.ap.logger.info(f'Installing skill from marketplace: {plugin_author}/{plugin_name}')
-                        if task_context:
-                            task_context.set_current_action('installing skill from marketplace')
-
-                        # Download the skill ZIP (no version needed - uses latest)
-                        if task_context:
-                            task_context.set_current_action('downloading skill package')
-
-                        download_resp = await client.get(
-                            f'{space_url}/api/v1/marketplace/skills/download/{plugin_author}/{plugin_name}'
-                        )
-                        if download_resp.status_code != 200:
-                            raise Exception(
-                                f'Failed to download skill {plugin_author}/{plugin_name}: {download_resp.status_code}'
-                            )
-
-                        file_bytes = download_resp.content
-                        file_size = len(file_bytes)
-                        self.ap.logger.info(f'Downloaded skill ZIP ({file_size} bytes)')
-
-                        # Install skill from ZIP using skill service
-                        await self._install_skill_from_zip(file_bytes, f'{plugin_author}-{plugin_name}', task_context)
-                        return
-                    elif skill_resp.status_code == 404:
-                        # Try plugin endpoint - get versions and download
-                        self.ap.logger.info(f'Trying plugin endpoint for: {plugin_author}/{plugin_name}')
-                        if task_context:
-                            task_context.set_current_action('checking plugin marketplace')
-
-                        # Get plugin versions to find latest
-                        versions_resp = await client.get(
-                            f'{space_url}/api/v1/marketplace/plugins/{plugin_author}/{plugin_name}/versions'
-                        )
-                        if versions_resp.status_code == 200:
-                            versions_data = versions_resp.json().get('data', {}).get('versions', [])
-                            if versions_data:
-                                latest_version = versions_data[0].get('version', '')
-                                if latest_version:
-                                    self.ap.logger.info(
-                                        f'Installing plugin from marketplace: {plugin_author}/{plugin_name} v{latest_version}'
-                                    )
-                                    if task_context:
-                                        task_context.set_current_action('downloading plugin package')
-
-                                    download_resp = await client.get(
-                                        f'{space_url}/api/v1/marketplace/plugins/download/{plugin_author}/{plugin_name}/{latest_version}'
-                                    )
-                                    if download_resp.status_code != 200:
-                                        raise Exception(
-                                            f'Failed to download plugin {plugin_author}/{plugin_name}: {download_resp.status_code}'
-                                        )
-
-                                    file_bytes = download_resp.content
-                                    self._extract_deps_metadata(file_bytes, task_context)
-                                    file_key = await self.handler.send_file(file_bytes, 'lbpkg')
-                                    install_info['plugin_file_key'] = file_key
-                                    self.ap.logger.info(f'Transfered file {file_key} to plugin runtime')
-                                    # Continue to install via runtime
-                                else:
-                                    raise Exception(f'No version found for plugin {plugin_author}/{plugin_name}')
-                            else:
-                                raise Exception(f'Plugin {plugin_author}/{plugin_name} has no versions')
-                        else:
-                            raise Exception(f'Plugin {plugin_author}/{plugin_name} not found in marketplace')
-                    else:
-                        skill_resp.raise_for_status()
-                        raise Exception(f'Failed to get skill {plugin_author}/{plugin_name}')
-                else:
-                    mcp_resp.raise_for_status()
-                    raise Exception(f'Failed to get MCP {plugin_author}/{plugin_name}')
-
        if install_source == PluginInstallSource.LOCAL:
            # transfer file before install
            file_bytes = install_info['plugin_file']
@@ -546,6 +376,7 @@ class PluginRuntimeConnector(ManagedRuntimeConnector):
                task_context.metadata.update(metadata)

        await self._wait_for_installed_plugin_ready(plugin_author, plugin_name, task_context)
+        await self._refresh_agent_runner_registry()

    async def upgrade_plugin(
        self,
@@ -564,6 +395,8 @@ class PluginRuntimeConnector(ManagedRuntimeConnector):
                if task_context is not None:
                    task_context.trace(trace)

+        await self._refresh_agent_runner_registry()
+
    async def delete_plugin(
        self,
        plugin_author: str,
@@ -588,6 +421,8 @@ class PluginRuntimeConnector(ManagedRuntimeConnector):
                task_context.trace('Cleaning up plugin configuration and storage...')
            await self.handler.cleanup_plugin_data(plugin_author, plugin_name)

+        await self._refresh_agent_runner_registry()
+
    async def list_plugins(self, component_kinds: list[str] | None = None) -> list[dict[str, Any]]:
        """List plugins, optionally filtered by component kinds.

@@ -778,6 +613,53 @@ class PluginRuntimeConnector(ManagedRuntimeConnector):

            yield cmd_ret

+    # AgentRunner methods
+    async def list_agent_runners(self, bound_plugins: list[str] | None = None) -> list[dict[str, Any]]:
+        """List all available AgentRunner components.
+
+        Returns list of dicts with plugin_author, plugin_name, runner_name, manifest, etc.
+        """
+        if not self.is_enable_plugin:
+            return []
+
+        runners_data = await self.handler.list_agent_runners(include_plugins=bound_plugins)
+        return runners_data
+
+    async def run_agent(
+        self,
+        plugin_author: str,
+        plugin_name: str,
+        runner_name: str,
+        context: dict[str, Any],
+    ) -> typing.AsyncGenerator[dict[str, Any], None]:
+        """Run an AgentRunner from a plugin.
+
+        Args:
+            plugin_author: Plugin author
+            plugin_name: Plugin name
+            runner_name: AgentRunner component name
+            context: AgentRunContext as dict
+
+        Yields:
+            AgentRunResult dicts
+        """
+        if not self.is_enable_plugin:
+            # Return a protocol-level failure result.
+            yield {
+                'type': 'run.failed',
+                'data': {
+                    'error': 'Plugin system is disabled',
+                    'code': 'plugin.disabled',
+                    'retryable': False,
+                },
+            }
+            return
+
+        gen = self.handler.run_agent(plugin_author, plugin_name, runner_name, context)
+
+        async for ret in gen:
+            yield ret
+
    async def retrieve_knowledge(
        self,
        plugin_author: str,
@@ -792,18 +674,13 @@ class PluginRuntimeConnector(ManagedRuntimeConnector):
        return await self.handler.retrieve_knowledge(plugin_author, plugin_name, retriever_name, retrieval_context)

    def dispose(self):
-        # On non-Windows stdio mode, terminate via the controller's process handle.
-        # On Windows, the managed subprocess is cleaned up by the base class.
-        if (
-            self.is_enable_plugin
-            and hasattr(self, 'ctrl')
-            and isinstance(self.ctrl, stdio_client_controller.StdioClientController)
-        ):
+        # No special shutdown handling is needed on Windows,
+        # because killing a process there also kills its subprocesses
+
+        if self.is_enable_plugin and isinstance(self.ctrl, stdio_client_controller.StdioClientController):
            self.ap.logger.info('Terminating plugin runtime process...')
            self.ctrl.process.terminate()

-        self._dispose_subprocess()
-
        if self.heartbeat_task is not None:
            self.heartbeat_task.cancel()
            self.heartbeat_task = None
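
Review note: `run_agent` in the connector diff above is an async generator that yields a single protocol-level `run.failed` result when the plugin system is disabled, instead of raising. The sketch below shows that pattern in isolation, under assumptions: `fake_handler_stream` is a hypothetical stand-in for `self.handler.run_agent(...)`, and the `example.event` type is invented purely for illustration. Only the `run.failed` payload shape is taken from the diff.

```python
import asyncio
from typing import Any, AsyncIterator, Callable


async def fake_handler_stream() -> AsyncIterator[dict[str, Any]]:
    # Hypothetical stand-in for the runtime handler's result stream.
    yield {'type': 'example.event', 'data': {}}


async def run_agent(
    is_enable_plugin: bool,
    make_stream: Callable[[], AsyncIterator[dict[str, Any]]],
) -> AsyncIterator[dict[str, Any]]:
    if not is_enable_plugin:
        # Report the failure in-band so callers handle it like any other result.
        yield {
            'type': 'run.failed',
            'data': {
                'error': 'Plugin system is disabled',
                'code': 'plugin.disabled',
                'retryable': False,
            },
        }
        return

    # Otherwise forward every result from the underlying stream unchanged.
    async for ret in make_stream():
        yield ret


async def collect(gen: AsyncIterator[dict[str, Any]]) -> list[dict[str, Any]]:
    return [item async for item in gen]


disabled_results = asyncio.run(collect(run_agent(False, fake_handler_stream)))
enabled_results = asyncio.run(collect(run_agent(True, fake_handler_stream)))
```

With this shape a consumer only needs one code path: it iterates the stream and checks each item's `type`, rather than wrapping the call in a separate `try`/`except` for the disabled case.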
--- a/src/langbot/pkg/plugin/handler.py
+++ b/src/langbot/pkg/plugin/handler.py
--- a/src/langbot/pkg/provider/modelmgr/requesters/bailianchatcmpl.py
+++ b/src/langbot/pkg/provider/modelmgr/requesters/bailianchatcmpl.py
@@ -171,7 +171,8 @@ class BailianChatCompletions(modelscopechatcmpl.ModelScopeChatCompletions):
                 # Parse chunk data
                if hasattr(chunk, 'choices') and chunk.choices:
                    choice = chunk.choices[0]
-                    delta = choice.delta.model_dump() if hasattr(choice, 'delta') else {}
+                    delta_obj = getattr(choice, 'delta', None)
+                    delta = delta_obj.model_dump() if delta_obj is not None else {}
                    finish_reason = getattr(choice, 'finish_reason', None)
                else:
                    delta = {}
--- a/src/langbot/pkg/provider/modelmgr/requesters/chatcmpl.py
+++ b/src/langbot/pkg/provider/modelmgr/requesters/chatcmpl.py
@@ -359,7 +359,8 @@ class OpenAIChatCompletions(requester.ProviderAPIRequester):

            if hasattr(chunk, 'choices') and chunk.choices:
                choice = chunk.choices[0]
-                delta = choice.delta.model_dump() if hasattr(choice, 'delta') else {}
+                delta_obj = getattr(choice, 'delta', None)
+                delta = delta_obj.model_dump() if delta_obj is not None else {}

                finish_reason = getattr(choice, 'finish_reason', None)
            else:
--- a/src/langbot/pkg/provider/modelmgr/requesters/geminichatcmpl.py
+++ b/src/langbot/pkg/provider/modelmgr/requesters/geminichatcmpl.py
@@ -132,7 +132,8 @@ class GeminiChatCompletions(chatcmpl.OpenAIChatCompletions):

            if hasattr(chunk, 'choices') and chunk.choices:
                choice = chunk.choices[0]
-                delta = choice.delta.model_dump() if hasattr(choice, 'delta') else {}
+                delta_obj = getattr(choice, 'delta', None)
+                delta = delta_obj.model_dump() if delta_obj is not None else {}

                finish_reason = getattr(choice, 'finish_reason', None)
            else:
--- a/src/langbot/pkg/provider/modelmgr/requesters/jiekouaichatcmpl.py
+++ b/src/langbot/pkg/provider/modelmgr/requesters/jiekouaichatcmpl.py
@@ -144,7 +144,8 @@ class JieKouAIChatCompletions(chatcmpl.OpenAIChatCompletions):
             # Parse chunk data
            if hasattr(chunk, 'choices') and chunk.choices:
                choice = chunk.choices[0]
-                delta = choice.delta.model_dump() if hasattr(choice, 'delta') else {}
+                delta_obj = getattr(choice, 'delta', None)
+                delta = delta_obj.model_dump() if delta_obj is not None else {}
                finish_reason = getattr(choice, 'finish_reason', None)
            else:
                delta = {}
@@ -159,7 +160,7 @@ class JieKouAIChatCompletions(chatcmpl.OpenAIChatCompletions):
            # reasoning_content = delta.get('reasoning_content', '')

            if remove_think:
-                if delta['content'] is not None:
+                if delta.get('content') is not None:
                    if '<think>' in delta['content'] and not thinking_started and not thinking_ended:
                        thinking_started = True
                        continue
--- a/src/langbot/pkg/provider/modelmgr/requesters/modelscopechatcmpl.py
+++ b/src/langbot/pkg/provider/modelmgr/requesters/modelscopechatcmpl.py
@@ -391,7 +391,8 @@ class ModelScopeChatCompletions(requester.ProviderAPIRequester):
             # Parse chunk data
            if hasattr(chunk, 'choices') and chunk.choices:
                choice = chunk.choices[0]
-                delta = choice.delta.model_dump() if hasattr(choice, 'delta') else {}
+                delta_obj = getattr(choice, 'delta', None)
+                delta = delta_obj.model_dump() if delta_obj is not None else {}
                finish_reason = getattr(choice, 'finish_reason', None)
            else:
                delta = {}
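
Review note: the same one-line change is applied to five requesters above, replacing a `hasattr(choice, 'delta')` check with `getattr(choice, 'delta', None)` plus an `is not None` test. The diff does not say which provider triggered it, but the likely motivation is a chunk whose `delta` attribute exists and is `None`: `hasattr` returns `True` in that case, so the old code calls `.model_dump()` on `None`. A minimal reproduction, using `SimpleNamespace` and a hypothetical `Delta` class in place of the real SDK objects:

```python
from types import SimpleNamespace


class Delta:
    """Hypothetical stand-in for an SDK delta object exposing model_dump()."""

    def __init__(self, content):
        self.content = content

    def model_dump(self):
        return {'content': self.content}


def dump_delta_old(choice):
    # Old behaviour: only checks that the attribute exists.
    return choice.delta.model_dump() if hasattr(choice, 'delta') else {}


def dump_delta_new(choice):
    # New behaviour: also tolerates delta being present but None.
    delta_obj = getattr(choice, 'delta', None)
    return delta_obj.model_dump() if delta_obj is not None else {}


choice_with_none = SimpleNamespace(delta=None, finish_reason='stop')

try:
    dump_delta_old(choice_with_none)
    old_failed = False
except AttributeError:
    # 'NoneType' object has no attribute 'model_dump'
    old_failed = True

new_result = dump_delta_new(choice_with_none)
```

The related `delta.get('content')` change in `jiekouaichatcmpl.py` follows from this: once `delta` can be an empty dict, indexing `delta['content']` directly would raise `KeyError`.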
Author	SHA1	Message	Date
huanghuoguoguo	fac52f3b9b	refactor(agent-runner): remove host context windowing	2026-06-02 17:01:45 +08:00
huanghuoguoguo	9fbc2432e0	feat(agent-runner): normalize binding config boundaries	2026-06-02 15:40:57 +08:00
huanghuoguoguo	0b83b0c623	fix: enforce agent run API permissions	2026-05-30 20:14:06 +08:00
huanghuoguoguo	95b859c55d	fix(agent-runner): authorize external runner tools	2026-05-30 09:48:27 +08:00
huanghuoguoguo	768d52f509	docs(agent-runner): document external MCP bridge	2026-05-30 09:10:51 +08:00
huanghuoguoguo	9e9bfbfb3d	docs(agent-runner): align runner protocol boundaries	2026-05-29 22:41:10 +08:00
huanghuoguoguo	471d9d68b2	docs(agent-runner): record codex runner smoke	2026-05-29 21:37:15 +08:00
huanghuoguoguo	58e4b35770	fix(agent-runner): stabilize event context and streams	2026-05-29 21:05:20 +08:00
huanghuoguoguo	056e62aa03	docs(agent-runner): update pluginization design status	2026-05-29 21:03:21 +08:00
huanghuoguoguo	9330a684fe	refactor(agent-runner): tighten protocol v1 runtime boundaries	2026-05-25 10:34:16 +08:00
huanghuoguoguo	90dffa7cd8	feat(agent-runner): align protocol adapter terminology	2026-05-24 09:13:15 +08:00
huanghuoguoguo	ea6c8fba57	feat(agent-runner): route pipeline runs through event-first flow - run_from_query() now delegates to run(event, binding) instead of maintaining a separate legacy execution path - Pipeline Query is converted to AgentEventEnvelope via PipelineCompatAdapter - Pipeline config is converted to AgentBinding with StatePolicy - bound_plugins authorization preserved from Pipeline - Legacy compatibility fields preserved: - query_id → context.runtime.query_id → session registry - prompt → context.compatibility.extra.prompt (not top-level) - params → context.compatibility.extra.params (with proper filtering) - max-round → bootstrap.messages and compatibility.legacy_messages - Pipeline path gains event-first host capabilities: - EventLog and Transcript writing - ArtifactStore registration - PersistentStateStore for state.updated - Removed legacy handlers: - _handle_artifact_created_query() (replaced by _handle_artifact_created) - _handle_state_updated() (replaced by _handle_state_updated_event) This change unifies the execution path while preserving backward compatibility for Pipeline-based runners. EventGateway is not implemented in this branch; only the event-first entry point is reserved.	2026-05-23 22:26:15 +08:00
huanghuoguoguo	ce007c49c8	feat(agent-runner): add persistent state APIs	2026-05-23 21:45:11 +08:00
huanghuoguoguo	4e68a93df7	feat(agent-runner): scope event-first state by binding	2026-05-23 19:45:57 +08:00
huanghuoguoguo	7247d8f221	feat(agent-runner): persist created artifacts	2026-05-23 18:13:53 +08:00
huanghuoguoguo	e0e321251e	feat(agent-runner): add artifact store pull APIs	2026-05-23 17:29:18 +08:00
huanghuoguoguo	8db23bf950	feat(agent-runner): add event-first context facts and pull APIs Add EventLog and Transcript persistence entities for storing auditable event facts and conversation history projection. Implement event-first AgentRunContext builder that produces Protocol v1 compliant context payloads with required fields: event, delivery, context (ContextAccess). Key changes: - EventLog ORM: auditable event records with indexes - Transcript ORM: conversation history projection with composite indexes - AgentRunContextBuilder: Protocol v1 payload with delivery, context, bootstrap - EventLogStore/TranscriptStore: async stores for fact sources - Host action handlers: HISTORY_PAGE, HISTORY_SEARCH, EVENT_GET, EVENT_PAGE - Context validation: build_context output validates via SDK AgentRunContext - Alembic migration for event_log and transcript tables - Alembic env.py imports all ORM models for autogenerate discovery Legacy compatibility: max-round messages go into bootstrap.messages and compatibility.legacy_messages, not top-level messages field.	2026-05-23 16:07:46 +08:00
huanghuoguoguo	8063303cfa	docs(agent-runner): split protocol and context design	2026-05-23 13:07:57 +08:00
huanghuoguoguo	094b87e578	fix(agent-runner): package context for plugin execution	2026-05-21 13:56:17 +08:00
huanghuoguoguo	26923c66c0	feat: make agent runner config schema driven	2026-05-19 12:20:28 +08:00
huanghuoguoguo	146694539e	chore(pipeline): clarify preferred default runner	2026-05-19 10:36:19 +08:00
huanghuoguoguo	7d6f635664	chore(agent): remove v1 wording from runner internals	2026-05-19 10:27:40 +08:00
huanghuoguoguo	641b15c74d	Revert "chore: update uv lock registry urls" This reverts commit `0cf29930a8`.	2026-05-19 10:15:34 +08:00
huanghuoguoguo	0cf29930a8	chore: update uv lock registry urls	2026-05-19 10:15:05 +08:00
huanghuoguoguo	927388c1f7	feat(agent): reserve stable runner event names	2026-05-19 10:15:00 +08:00
huanghuoguoguo	760baa24a3	docs: add phase1 qa report	2026-05-19 10:07:26 +08:00
huanghuoguoguo	036affe01f	feat(agent-runner): enrich plugin runner host context	2026-05-17 23:26:52 +08:00
huanghuoguoguo	19557c3227	fix: log agent runner best-effort failures	2026-05-17 11:07:52 +08:00
huanghuoguoguo	b9ecb27560	test: address agent runner review comments	2026-05-17 11:07:52 +08:00
huanghuoguoguo	b96dd8edc7	fix: stabilize dynamic forms and mcp testing	2026-05-17 11:07:52 +08:00
huanghuoguoguo	423fa0f942	refactor(modelmgr): simplify model sync logic and remove timeout configuration	2026-05-17 11:07:52 +08:00
huanghuoguoguo	948591d439	fix(rag): align knowledge engine plugin actions	2026-05-17 11:07:52 +08:00
huanghuoguoguo	ac3989d3ba	feat: support dynamic agent runner defaults	2026-05-17 11:07:52 +08:00
huanghuoguoguo	1e5acb947b	feat(toolmgr): add get_tool_by_name for unified tool lookup Add unified tool lookup method that searches both plugin and MCP loaders. Also add _get_tool method to MCPLoader for consistency with PluginToolLoader.	2026-05-17 11:07:52 +08:00
huanghuoguoguo	74b829a288	docs: update PROGRESS.md - rerank support completed	2026-05-17 11:07:52 +08:00
huanghuoguoguo	6e982ff49d	feat(plugin): implement INVOKE_RERANK handler with run-scoped authorization - Add invoke_rerank action handler in plugin handler - Validate rerank model access via run session - Cap documents at 64 for API limit - Return sorted results by relevance score	2026-05-17 11:07:52 +08:00
huanghuoguoguo	b220cf02e5	docs(runner): mark legacy runners and add PROGRESS.md - Add DEPRECATED docstring to all legacy runners in pkg/provider/runners/ - Mark migration target for each runner (local-agent, dify, n8n, coze, dashscope, langflow, tbox) - Add PROGRESS.md to track agent-runner-pluginization implementation status - Remove completed PHASE0_INTEGRATION_RECORD.md	2026-05-17 11:07:52 +08:00
huanghuoguoguo	66eaa99887	perf(agent-runner): improve session registry and orchestrator efficiency - Add pre-computed _authorized_ids (frozenset) at session registration for O(1) lookup - Refactor is_resource_allowed() from linear search to set membership check - Add thread-safe locking to get_session_registry() singleton - Cache _session_registry and _state_store references in orchestrator __init__ - Add asyncio.gather() for parallel resource building in AgentResourceBuilder - Create shared test fixtures in tests/unit_tests/agent/conftest.py - Update test files to import from shared conftest.py Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-17 11:07:52 +08:00
huanghuoguoguo	5aaa422250	feat(agent-runner): integrate AgentRunner Protocol v1 with plugin system Phase 0 integration complete - verified minimal loop with local-agent stub runner. Changes: - Add AgentRunOrchestrator for plugin-based agent execution - Add AgentResultNormalizer for Protocol v1 result conversion - Add AgentRunnerDescriptor for runner ID parsing (plugin:author/name/runner) - Update chat handler to use new orchestrator instead of direct runner lookup - Add plugin handler methods for list_agent_runners and run_agent - Add connector methods for AgentRunner protocol forwarding - Update pipeline API to include runner options in metadata - Add integration docs and implementation plan Integration verified: - Runner: plugin:langbot/local-agent/default - Input: "你好" - Output: [stub] Echo: 你好 - Date: 2026-05-10 10:09 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-17 11:05:27 +08:00
Junyan Qin	b7dcda8b23	docs: record agent runner design decisions	2026-05-17 11:05:27 +08:00
Junyan Qin	3c58b9141b	docs: design agent runner pluginization	2026-05-17 11:05:27 +08:00
Junyan Qin	ddbf390d56	chore: stash code	2026-05-17 11:05:27 +08:00