refactor(agent-runner): remove host context windowing

2026-07-22 12:26:08 +00:00 · 2026-06-02 17:01:45 +08:00
parent afaf09ccc7
commit d0383e146e
26 changed files with 79 additions and 815 deletions
@@ -14,7 +14,7 @@
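 - ✅ `AgentRunAPIProxy.state`: get/set/delete API
 - ✅ EventLog / Transcript / ArtifactStore: host sources of truth
 - ✅ PersistentStateStore: persistent state storage
-- ✅ `max-round` has been removed from the protocol entities; if a runner still needs a similar history-window parameter, it should be exposed by the plugin manifest as runner binding config, not as a Host / Pipeline protocol field
+- ✅ `max-round` / host-side history window has been removed from LangBot Host/Pipeline semantics; if a runner still needs a similar parameter, the runner itself should interpret the config
 - ✅ External harness context projection has been validated as an MVP with the Claude Code runner: context files, skill projection, MCP config, and host-owned resume state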

 ## 1. Design principles
@@ -41,7 +41,7 @@

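 If a runner still needs a policy parameter such as "read at most N rounds of history", the runner should declare it in its own manifest/config schema and store it as binding config in `ctx.config` / `runner_config`. The Host only provides the history pull API, cursors, hard caps, and permission boundaries; the runner decides whether to read, how much to read, and how to truncate and compress.

-The current official local-agent direction is to pull the transcript through the Host history API and let the runner manage the model context itself. It does not depend on the `max-round` / bootstrap window delivered by the Pipeline adapter.
+The current official local-agent direction is to pull the transcript through the Host history API and let the runner manage the model context itself. It does not depend on the Pipeline adapter delivering a history window.

 The new protocol should not ask "how many rounds of history does LangBot trim for the agent each turn"; it should ask instead: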

@@ -58,7 +58,7 @@
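 - `Transcript`: the conversation view the Host projects from the EventLog, used for UI, auditing, and on-demand history reads.
 - `Working context`: the context the agent actually sends to the model or runtime this turn, decided by the AgentRunner.

-LangBot may provide a bootstrap window for simple runners, but this is only a convenience, not the main architecture.
+LangBot no longer provides a host-side bootstrap window. A simple runner that needs a history window should pull and trim it inside the runner through the Host history API.

 ## 2. What to pass when an event arrives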

@@ -117,22 +117,11 @@ class AgentRunContext(BaseModel):

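 These would worsen cross-process serialization cost, widen the leak surface, and hurt KV cache stability, and would also force the host to make context-policy decisions for the agent.

-### 2.3 Optional bootstrap
+### 2.3 No Host Bootstrap Window

-Depending on the runner manifest, an optional bootstrap can be provided:
+`AgentRunContext.bootstrap` may be kept as an optional extension field in the protocol, but the LangBot Host does not fill in a history window by default, nor does it decide the window size through Pipeline configuration.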

-```yaml
-context:
-  bootstrap: none | current_event | recent_tail | summary_tail
-  max_inline_events: 0
-  max_inline_bytes: 0
-```
-
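-Suggested defaults:
-
-- Self-managed runtime: `bootstrap: current_event`
-- Simple HTTP runner: `bootstrap: recent_tail`
-- If a runner needs a `recent_tail` policy, it should declare the window size through its own binding config; the Host does not extend `max-round` as a general protocol field.
+If a runner needs a `recent_tail`-like policy, it should declare the parameters in its own manifest/config schema and read, trim, and compress history inside the runner through `history_page` / `history_search`. The Host is responsible only for permissions, pagination, hard caps, and the source of truth.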

 ## 3. ContextAccess

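@@ -335,7 +324,7 @@ LangBot core should not build in the official agent's business flow:

 **Completed (current branch)**:

-- ✅ `max-round` is no longer a protocol field; similar history-window policies belong to runner binding config, not to general Host / Pipeline semantics
+- ✅ `max-round` is no longer a protocol field, nor is it part of general Host / Pipeline semantics
 - ✅ New runners do not receive a history window by default
 - ✅ `AgentRunContext` gains `context` / cursor / access capabilities
 - ✅ `AgentRunAPIProxy` gains history / events / artifacts / state APIs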
@@ -156,7 +156,7 @@ class AgentRunnerDescriptor(BaseModel):

 ### 3.4 context_builder.py / pipeline_adapter.py

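-`context_builder.py` is only responsible for constructing the SDK v1 `AgentRunContext` from `AgentEventEnvelope + AgentBinding`. Reading the Pipeline Query, parameter filtering, prompt extraction, and the `max-round` bootstrap mapping all belong to `PipelineAdapter` and are no longer placed in the context builder.
+`context_builder.py` is only responsible for constructing the SDK v1 `AgentRunContext` from `AgentEventEnvelope + AgentBinding`. Reading the Pipeline Query, parameter filtering, and prompt extraction belong to `PipelineAdapter`, but `PipelineAdapter` no longer does history-window trimming or bootstrap packaging.

 The path by which the current message Pipeline enters the agent runner: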

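@@ -183,7 +183,7 @@ Stable fields of the Protocol v1 context:
 - `state`: the host-managed scoped state snapshot read from `PersistentStateStore`
 - `runtime`: host/version/workspace/bot/query/trace/deadline
 - `config`: the current binding's config for this runner id, i.e. `runner_config`
-- `bootstrap`: optional small window, not the full history
+- `bootstrap`: optional extension field; the LangBot Host does not fill in a history window by default
 - `adapter`: metadata of the Pipeline or other entry adapter

 The Pipeline adapter's `prompt` and public business variables do not enter top-level protocol fields:
@@ -191,58 +191,36 @@ The Pipeline adapter's `prompt` and public business variables do not enter top-level protocol fields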
 - filtered params -> `ctx.adapter.extra["params"]`
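 - the legacy/effective prompt may be stashed temporarily in `ctx.adapter.extra["prompt"]`, but the official
   runner should not treat it as a behavioral contract
-- the `max-round` working window may remain in the Pipeline adapter compatibility layer, but the official
-  `local-agent` does not consume that bootstrap/window
-- packaging metadata -> `ctx.runtime.metadata.context_packaging`
+- The LangBot Host does not generate `bootstrap.messages`, `adapter_messages`, or context packaging metadata

 For now, do not push new compression or token-budget trimming back into a Pipeline stage. The Pipeline is only responsible for entry adaptation; full history and long-term context are backed by EventLog / Transcript / pull APIs / a future ContextCompressor.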

 ### 3.4.1 Agentic context plan

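-This round keeps the `max-round` working window only in `PipelineAdapter` and does not change the user-round selection rule.
 EventLog / Transcript / Host pull APIs have landed; `ContextCompressor` is still a design placeholder.
 The goal is to let the Pipeline gradually reduce to an entry adapter and let the AgentRunner layer own context packaging.

-The suggested end state splits into four host-side services:
+The suggested approach is for the Host to keep three kinds of sources of truth plus restricted APIs: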

 ```text
 ConversationStore / EventLog
  -> durable append-only raw messages, events, tool results, artifact refs
 ConversationProjection
  -> converts events into agent-readable conversation history
-PipelineAdapter bootstrap policy
-  -> builds the bounded working context for one run
 ContextCompressor
-  -> creates and updates summaries/checkpoints when thresholds are exceeded
+  -> future optional service for summaries/checkpoints, requested and consumed by runners
 ```

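 Key principles:

 - Full history belongs to the LangBot host, not to the plugin instance. Plugins remain singleton/stateless.
-- `ctx.bootstrap.messages` is an optional working context window, not a full conversation dump.
+- `ctx.bootstrap.messages` is not a working context that the Host delivers by default.
 - Full history must not be copied/serialized to the plugin runtime every turn; otherwise long conversations incur O(n) cost and cross-process payload bloat.
-- The user-round rule of `max-round` belongs only to the Pipeline adapter's bootstrap policy.
-- After LiteLLM is integrated, context packaging should be upgraded to a cooperative token budget / summary / pull API strategy.
+- `max-round` or similar window rules are not part of LangBot Host / Pipeline semantics.
+- After LiteLLM is integrated, model-window metadata should be exposed to the runner as resource/runtime metadata, and the runner decides the budget and compression strategy.
 - What `ContextCompressor` produces are derived summaries/checkpoints; they must not overwrite or delete raw history.
 - Restart recovery relies on the persistent store and summary checkpoints, not on the in-process conversation list in `SessionManager`.

-`AgentRunContext` could later add: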
-
-```python
-context_request: AgentContextRequest | None
-context_packaging: ContextPackagingMetadata
-```
-
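-Suggested semantics:
-
-- `context_request.mode`: the `max_round`, `token_budget`, `summary_hybrid`, or `external_session` mode requested by the AgentRunner manifest / binding config
-- `context_request.budget`: preferences such as model window, reserved output tokens, and tool/RAG budget
-- `context_packaging.policy`: the packaging policy the Host actually applied for this run
-- `context_packaging.delivered_count`: number of history messages delivered this time
-- `context_packaging.source_total_count`: number of raw history messages visible to the packager
-- `context_packaging.messages_complete`: whether this window already contains the full history
-- `context_packaging.cursor_before`: cursor for reading earlier history through the host API later
-
 Restricted APIs needed in the future: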

 ```python
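@@ -256,7 +234,7 @@ page size, total bytes, deadline, and accessible conversations.

 ### 3.4.2 Large artifacts and tool collaboration

-Do not inline large files, multimodal inputs, or tool outputs into bootstrap messages or tool results. Going forward, collaborate uniformly through
+Do not inline large files, multimodal inputs, or tool outputs into the prompt, bootstrap, or tool results. Going forward, collaborate uniformly through
 artifact/resource refs:

 - Put only small text and necessary summaries in message/content.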
@@ -512,7 +490,7 @@ async def run_from_query(query: pipeline_query.Query) -> AsyncGenerator[Message
 ### Step 4：local-agent parity

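 - Use the static binding config `ctx.config["prompt"]`; do not read `ctx.adapter.extra["prompt"]`.
-- Pull the transcript through the Host history API; do not read `ctx.bootstrap.messages` or `ctx.adapter.adapter_messages`.
+- Pull the transcript through the Host history API; do not read `ctx.bootstrap.messages` or adapter window fields.
 - Build the current user message from `ctx.input.contents`, preserving multimodal content.
 - RAG only replaces/inserts the text parts; it does not drop images/files.
 - streaming/non-streaming follows `runtime.metadata.streaming_supported` by default.
@@ -183,10 +183,9 @@ LangBot core should not keep business orchestration logic for the sake of local-agent. local-agent's
 - `ctx.runtime.metadata.streaming_supported`: whether the current adapter can consume streaming output.
 - Host proxy actions: model, tool, knowledge base, and rerank calls must validate resource permissions through `run_id`.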

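-`local-agent` should not consume the `max-round` / `bootstrap` window generated by the
-Pipeline adapter, nor should it read `ctx.adapter.extra.prompt`. It should read the static
-`prompt` from the binding config and pull the transcript through the Host history API. The Pipeline adapter may keep
-`max-round` compatibility logic for legacy entries, but this is not the behavioral contract of the official local-agent.
+`local-agent` should not consume a history window generated by the Pipeline adapter, nor should it read
+`ctx.adapter.extra.prompt`. It should read the static `prompt` from the binding config and pull the transcript through the Host
+history API. The Pipeline adapter keeps no Host-side window compatibility logic.

 It is suggested that the local-agent manifest use hybrid or self-managed context: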

@@ -11,7 +11,7 @@
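 - ✅ Host supports `run_id` session authorization
 - ✅ Host can generate an event-first context from the current Pipeline entry
 - ✅ `messages` is demoted to an optional bootstrap
-- ✅ `max-round` does not appear in the protocol entities; a similar history-window parameter, if any, should come from the runner manifest/config schema and enter `ctx.config` as binding config
+- ✅ `max-round` does not appear in the protocol entities and is not part of Host / Pipeline semantics; a similar parameter, if any, is interpreted by the runner itself from `ctx.config`
 - ✅ Proxy covers model, tool, knowledge, state/storage
 - ✅ History / Event / Artifact / State APIs have landed
 - ✅ EventLog / Transcript / ArtifactStore / PersistentStateStore have landed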
@@ -142,13 +142,13 @@ class AgentRunnerContextPolicy(BaseModel):
    wants_static_context_refs: bool = True
 ```

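-The Host uses this declaration to decide whether to give the runner inline bootstrap history. Default principles:
+The Host does not use this declaration to inline a history window for the runner. Default principles:

 - The Host must not inline full history by default.
-- By default the Host inlines only the current event / input and context handles.
+- The Host inlines only the current event / input and context handles.
 - The runner owns working context assembly.
 - Once authorized, the runner can pull more context through the Host history / event / artifact / state APIs.
-- `max-round` is not a Protocol v1 field, nor part of general Pipeline / Host semantics.
+- `max-round` or similar window parameters are not Protocol v1 fields, nor part of general Pipeline / Host semantics; if a runner needs one, the runner itself should interpret `ctx.config`.

 ## 4. Run protocol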

@@ -193,7 +193,7 @@ class AgentRunContext(BaseModel):

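 - `event` is a required field; Protocol v1 is event-first.
 - `input` represents the primary input of the current event; it is not the same as history messages.
-- `bootstrap` is an optional field, not the full history.
+- `bootstrap` is an optional field; the LangBot Host does not fill in a history window by default.
 - `adapter` holds only Pipeline adapter fields; runners should not depend on it for long-term capabilities.
 - `config` is the Host binding config, not plugin instance state.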

@@ -342,10 +342,10 @@ class BootstrapContext(BaseModel):

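 Constraints:

-- `bootstrap.messages` is a host convenience, not the protocol core.
-- A self-managed context runner should by default receive an empty bootstrap or only the current event.
+- `bootstrap.messages` is not the LangBot Host's default behavior.
+- A self-managed context runner should receive an empty bootstrap by default.
 - The Host should not automatically stitch together the full transcript just to "help the agent be smarter".
-- Similar history-window policies should be expressed by the specific runner's binding config; new/official runners should not depend on a bootstrap window delivered by the Pipeline adapter.
+- Similar history-window policies should be handled by the specific runner interpreting its own binding config and pulling history through the Host history API; new/official runners should not depend on the Pipeline adapter delivering a history window.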

 ### 4.10 RuntimeContext

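@@ -685,7 +685,7 @@ Protocol v1 is complete on the current branch:
 - ✅ Host supports `run_id` session authorization
 - ✅ Host can generate an event-first context from the current Pipeline entry
 - ✅ `messages` is demoted to an optional bootstrap
-- ✅ `max-round` does not appear in the protocol entities; similar parameters belong to the specific runner's binding config
+- ✅ `max-round` does not appear in the protocol entities and is not part of Host / Pipeline semantics
 - ✅ Proxy covers at least model, tool, knowledge, state/storage
 - ✅ History / event / artifact APIs have landed
 - ✅ EventLog / Transcript / ArtifactStore / PersistentStateStore have landed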
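
Note: a minimal runner-side sketch of the pattern this commit moves toward, where the runner reads a window size from its own binding config, pages backwards through the Host history API, and trims to the last N user rounds. The diff names `history_page` but does not show its signature, so `FakeHistoryAPI`, the `history_page(before, limit)` parameters, the `(items, next_cursor)` return shape, the `Message` type, and the `max_rounds` config key are all assumptions made for illustration.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Message:
    role: str  # "user" or "assistant" (hypothetical shape)
    text: str


class FakeHistoryAPI:
    """Stand-in for the Host history API; the real signature is an assumption."""

    def __init__(self, messages, hard_cap=4):
        self._messages = list(messages)
        self._hard_cap = hard_cap  # Host-enforced page size limit

    async def history_page(self, before=None, limit=50):
        """Return (items, next_cursor), paging backwards from `before`."""
        end = len(self._messages) if before is None else before
        start = max(0, end - min(limit, self._hard_cap))
        next_cursor = start if start > 0 else None
        return self._messages[start:end], next_cursor


async def pull_recent_rounds(api, max_rounds):
    """Runner-side window: keep the last `max_rounds` user rounds, oldest first."""
    if max_rounds <= 0:
        return []
    collected = []  # newest first while paging
    cursor = None
    rounds = 0
    while True:
        items, cursor = await api.history_page(before=cursor, limit=50)
        for msg in reversed(items):
            collected.append(msg)
            if msg.role == "user":
                rounds += 1
                if rounds >= max_rounds:
                    # A round starts at its user message, so stop here.
                    return collected[::-1]
        if cursor is None:
            # Reached the start of the transcript.
            return collected[::-1]
```

In a runner, the window size would come from something like `ctx.config.get("max_rounds")` (a hypothetical key), which keeps the Host unaware of the window policy and limits its role to pagination and hard caps.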