LangBot/skills/qa-agent-docs/qa-agent/02-log-guard-plan.md
Junyan Chin e9dd584792 feat: MCP server + in-repo skills (agent-friendly platform) (#2269)
2026-06-20 15:14:47 +08:00


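Log guard plan

Status

This is the currently active design, and a first file-scanning MVP already exists. The implementation boundary must stay consistent with the black-box E2E roadmap:

  • The log guard serves lbs test report.
  • It does not replace browser/UI judgments.
  • It does not grow into a standalone backend API testing framework.
  • By default, the first version scans the latest langbot-*.log under LANGBOT_REPO/data/logs/; it can also scan backend/frontend/console log files that the agent provides explicitly.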
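
The default discovery step could look roughly like the following Python sketch. It is not the lbs implementation: the function name find_latest_backend_log is hypothetical, and treating "latest" as "newest modification time" is an assumption.

```python
from pathlib import Path
from typing import Optional


def find_latest_backend_log(langbot_repo: str) -> Optional[Path]:
    """Return the newest langbot-*.log under <repo>/data/logs/, or None."""
    log_dir = Path(langbot_repo) / "data" / "logs"
    if not log_dir.is_dir():
        return None
    candidates = [p for p in log_dir.glob("langbot-*.log") if p.is_file()]
    if not candidates:
        return None
    # Assumption: "latest" means the most recently modified file.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Returning None instead of raising lets the caller fall back to log files supplied explicitly by the agent.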

For the current overall roadmap, see:

docs/qa-agent/04-black-box-e2e-roadmap.md

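Goal

The log guard is part of lbs test report. It captures runtime problems that fall outside UI assertions while an agent runs tests.

The command direction has converged on lbs test plan / lbs test report. The log guard serves agent-browser QA; it is not a standalone entry point for backend API testing.

LangBot is an asynchronous, highly integrated system, and some problems do not surface directly as page failures:

  • Background task exceptions
  • Coroutines that were never awaited
  • Provider streaming call failures
  • Plugin runtime timeouts
  • Platform send failures
  • Database connection problems
  • Sensitive information leaks

The log guard puts these signals into the test report in structured form and links them to troubleshooting assets.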

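Inputs

The log guard should read its configuration from the environment and the run context:

  • LANGBOT_BACKEND_URL in skills/.env
  • LANGBOT_REPO in skills/.env, used to auto-discover the LangBot backend log
  • The case id recorded by lbs test plan / report
  • LangBot backend process output
  • Frontend dev server output
  • Browser console/network errors
  • The success/failure patterns and expected failures declared by the case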

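MVP scope

  • Read one or more log streams or log files.
  • Detect error patterns.
  • Support allowlisting by case id or pattern.
  • Output a JSON/Markdown summary.
  • Mark the test report as failed when unexpected errors are found; if an automated runner exists in the future, also return a non-zero exit code.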
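
A minimal Python sketch of this scope, under stated assumptions: scan_lines, render_json, and the summary field names are hypothetical, the pattern list is only a subset of the rules in this plan, and a case-id allowlist is assumed to resolve to a list of patterns before the scan.

```python
import json
import re
from typing import Dict, Iterable, List

# Illustrative subset of the error patterns named in this plan.
ERROR_PATTERNS = [
    r"Traceback",
    r"Task exception was never retrieved",
    r"RuntimeWarning: coroutine .* was never awaited",
]


def scan_lines(lines: Iterable[str], allowlist: Iterable[str] = ()) -> Dict:
    """Scan log lines; allowlisted hits are reported but do not fail."""
    errors = [re.compile(p) for p in ERROR_PATTERNS]
    allowed = [re.compile(p) for p in allowlist]
    findings: List[Dict] = []
    for number, line in enumerate(lines, start=1):
        for pattern in errors:
            if pattern.search(line):
                findings.append({
                    "line": number,
                    "pattern": pattern.pattern,
                    "allowed": any(a.search(line) for a in allowed),
                    "text": line.rstrip("\n"),
                })
                break  # record at most one finding per line
    unexpected = sum(1 for f in findings if not f["allowed"])
    return {"passed": unexpected == 0, "unexpected": unexpected, "findings": findings}


def render_json(summary: Dict) -> str:
    """Serialize the summary as the machine-readable output."""
    return json.dumps(summary, indent=2, ensure_ascii=False)
```

A report step would mark the case as failed when summary["passed"] is false; a future automated runner could map the same flag to a non-zero exit code.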

Error classification

Always unexpected

Should fail unless the case explicitly declares otherwise:

  • Traceback
  • Task exception was never retrieved
  • RuntimeWarning: coroutine .* was never awaited
  • Unclosed client session
  • Unclosed connector
  • KeyError
  • TypeError
  • AttributeError
  • Plaintext leaks of keys, tokens, or secrets

Case-expected errors

Allowed only when the current case declares them:

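  • Invalid provider key
  • Provider authentication failure
  • Invalid webhook payload
  • Plugin test that raises deliberately
  • Timeout tests
  • Rate-limit tests

Warning only

Reported, but does not fail by default:

  • Recoverable retries
  • Timeouts that recovered
  • Deprecated configuration
  • Slow requests
  • Version check failures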
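
The three tiers above can be sketched as a single classification function. This is an illustration in Python, not the shipped rule set: classify and declared_expected are hypothetical names, and the log wordings in CASE_EXPECTED_ONLY and WARNING_ONLY are assumed, since this plan names the categories but not their exact log text.

```python
import re
from typing import Optional, Sequence

# Patterns this plan lists as always unexpected.
ALWAYS_UNEXPECTED = [
    r"Traceback",
    r"Task exception was never retrieved",
    r"RuntimeWarning: coroutine .* was never awaited",
    r"Unclosed client session",
    r"Unclosed connector",
    r"\bKeyError\b",
    r"\bTypeError\b",
    r"\bAttributeError\b",
]
# Hypothetical wordings for errors that only a declaring case may produce.
CASE_EXPECTED_ONLY = [r"(?i)authentication failed", r"(?i)rate limit"]
# Hypothetical wordings for warning-only signals.
WARNING_ONLY = [r"(?i)deprecated", r"(?i)retrying", r"(?i)slow request"]


def _matches(patterns: Sequence[str], line: str) -> bool:
    return any(re.search(p, line) for p in patterns)


def classify(line: str, declared_expected: Sequence[str] = ()) -> Optional[str]:
    """Return 'fail', 'expected', 'warn', or None for an unremarkable line."""
    if _matches(ALWAYS_UNEXPECTED, line) or _matches(CASE_EXPECTED_ONLY, line):
        # Both tiers fail unless the current case declared the error.
        return "expected" if _matches(declared_expected, line) else "fail"
    if _matches(WARNING_ONLY, line):
        return "warn"
    return None
```

For example, classify("Provider authentication failed: 401") yields "fail", while the same line with declared_expected=[r"authentication failed"] yields "expected".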

Integration with Troubleshooting

The log guard should not only output error text; it should also match known troubleshooting ids where possible.

Examples:

Action list_plugins call timed out
Action list_agent_runners call timed out
Action invoke_llm_stream call timed out

can be mapped to:

plugin-runtime-timeout

uppercase proxy points to one host, lowercase proxy points to another

can be mapped to:

proxy-env-mismatch

proxy-env-mismatch
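
The mapping above can be sketched as a small rule table in Python. The two ids and the sample log lines come from this plan; the names TROUBLESHOOTING_RULES and match_troubleshooting_ids are hypothetical, and generalizing the three timeout examples to "Action <name> call timed out" is an assumption.

```python
import re
from typing import Iterable, List

# (troubleshooting id, pattern) pairs, checked in order.
TROUBLESHOOTING_RULES = [
    # Assumption: any "Action <name> call timed out" line maps to this id.
    ("plugin-runtime-timeout", re.compile(r"Action \w+ call timed out")),
    ("proxy-env-mismatch", re.compile(
        r"uppercase proxy points to one host, lowercase proxy points to another")),
]


def match_troubleshooting_ids(lines: Iterable[str]) -> List[str]:
    """Return matched troubleshooting ids, deduplicated, in first-seen order."""
    matched: List[str] = []
    for line in lines:
        for troubleshooting_id, pattern in TROUBLESHOOTING_RULES:
            if pattern.search(line) and troubleshooting_id not in matched:
                matched.append(troubleshooting_id)
    return matched
```

Keeping the rules as data rather than code makes it easier to add an entry whenever a new troubleshooting asset is written.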

Future commands

bin/lbs test plan pipeline-debug-chat
bin/lbs test start pipeline-debug-chat
bin/lbs test run pipeline-debug-chat --dry-run
bin/lbs test report pipeline-debug-chat
bin/lbs test report --output report.md
bin/lbs test report pipeline-debug-chat --backend-log /path/to/backend.log --console-log /path/to/console.log
bin/lbs test report pipeline-debug-chat --since "2026-05-21T10:30:00+08:00"
bin/lbs test report pipeline-debug-chat --tail-lines 2000
bin/lbs test report pipeline-debug-chat --since "2026-05-21T10:30:00+08:00" --tail-lines 2000
bin/lbs test report pipeline-debug-chat --no-auto-log
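
How --since and --tail-lines might narrow the scan can be illustrated with a short Python sketch. Several things here are assumptions rather than documented behavior: the function name apply_window, that each log record starts with an ISO 8601 timestamp carrying a UTC offset, that lines without a timestamp (for example traceback continuations) belong to the preceding record, and that the tail limit is applied before the time filter.

```python
from datetime import datetime
from typing import Iterable, List, Optional


def apply_window(lines: Iterable[str], since: Optional[str] = None,
                 tail_lines: Optional[int] = None) -> List[str]:
    """Keep only the lines inside the requested scan window."""
    selected = list(lines)
    if tail_lines is not None:
        selected = selected[-tail_lines:] if tail_lines > 0 else []
    if since is None:
        return selected
    cutoff = datetime.fromisoformat(since)
    kept: List[str] = []
    keep_current = False
    for line in selected:
        stamp = line.split(" ", 1)[0]
        try:
            keep_current = datetime.fromisoformat(stamp) >= cutoff
        except (ValueError, TypeError):
            # No parsable, comparable timestamp: treat the line as a
            # continuation and reuse the decision for the previous record.
            pass
        if keep_current:
            kept.append(line)
    return kept
```

With a chronological log, both filters select a suffix of the file, so combining them keeps the shorter of the two suffixes.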

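The run report should include:

  • case id
  • A summary of URLs and environment variables, which must not contain secrets
  • Browser-visible results
  • Backend log summary
  • console/network errors
  • Matched troubleshooting ids
  • Pass/fail verdict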
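
One way to keep secrets out of the environment summary is to mask values by variable name before they reach the report. The Python sketch below is an assumption-laden illustration: summarize_env and the name heuristic are hypothetical, and a real guard would likely also scan values for token-shaped strings.

```python
import re
from typing import Dict, Mapping

# Hypothetical heuristic: variable names that suggest a credential.
SENSITIVE_NAME = re.compile(r"(key|token|secret|password)", re.IGNORECASE)


def summarize_env(env: Mapping[str, str]) -> Dict[str, str]:
    """Copy env for the report, masking values of sensitive-looking names."""
    summary: Dict[str, str] = {}
    for name, value in env.items():
        if SENSITIVE_NAME.search(name) and value:
            summary[name] = "***"  # never echo the original value
        else:
            summary[name] = value
    return summary
```

Masking by name is deliberately conservative: a harmless variable whose name contains "key" is hidden too, which is preferable to leaking a real credential into a shared report.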

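MVP completion criteria

  • Can automatically scan the latest LangBot backend log, and can also scan frontend log and console log files.
  • Can use --since and --tail-lines to limit the scan to the current test window.
  • Can detect obvious Python/runtime errors and secret-leak risks.
  • Can recognize the success/failure patterns declared by a case.
  • Can recognize troubleshooting patterns, including plugin-runtime-timeout and proxy-env-mismatch.
  • Supports case-level allowlists.
  • Outputs a machine-readable summary.
  • At least one langbot-testing case uses it.

The current MVP already covers automatic discovery of the LangBot backend log, file scanning, the --since/--tail-lines scan window, basic error detection, case success/failure signals, troubleshooting matching, secret redaction, and --json output. Still to be improved: live log collection, more rules, turning case-level expected failures into assets, and real E2E report samples.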