# LangBot Skills Test Asset Library Plan
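## Status

This is an early planning document for the test asset library. It is kept to explain where the layering of `langbot-skills` came from.

The current direction has converged on black-box E2E QA: developers use an agent to test LangBot through the browser, stable paths are captured as cases, and failure knowledge is captured as troubleshooting entries. The `lbs test report` log guard already has an MVP; the next focus is report evidence, case metadata, and a small amount of automation for stable paths. For current priorities, see:

```text
docs/qa-agent/04-black-box-e2e-roadmap.md
```

The "planned implementation" material in this document about `case list/show`, `trouble show/search`, and `test plan` is partly outdated, because these capabilities have already landed.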
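## Goals

After cloning `langbot-skills`, a developer can hand a testing intent to an agent, and the agent reuses the existing environment configuration, test paths, and failure knowledge to verify LangBot functionality.

Typical scenarios:

- Smoke testing: verify that pipeline Debug Chat, providers, and common pages work.
- Provider testing: add providers such as DeepSeek/OpenAI/Claude and verify that the models are usable.
- New feature testing: explore a new UI path and, once it is stable, capture it as a case/reference.
- Regression testing: reuse existing paths so that each new window does not have to re-explore login, model configuration, and pipeline debugging.
- Failure capture: record runtime timeouts, proxy mismatches, and WebSocket problems as searchable assets.

The core direction is described in `03-agent-browser-qa-principles.md`: the agent must use the browser/UI as its primary path, and API/curl may only be used for diagnostics.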
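## Current Repository Structure

```text
skills/
  .env                       # shared default variables
  langbot-env-setup/         # environment preparation, browser control paths, proxy, login state
  langbot-testing/           # WebUI / provider / pipeline testing entry point
  langbot-plugin-dev/        # plugin development and testing
  langbot-eba-adapter-dev/   # platform adapter development and testing
src/
  lbs.ts                     # CLI source
bin/
  lbs                        # CLI entry point
docs/
  qa-agent/                  # planning documents; historical directory name retained
```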
## Design Layers

### 1. Skill layer

`SKILL.md` only handles triggering and routing; it does not carry long procedures.

Example:

```text
langbot-env-setup -> choose Computer Use / Playwright MCP / OAuth profile / proxy
langbot-testing -> choose WebUI / pipeline / provider / troubleshooting
```
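### 2. Reference layer

Markdown files record procedures that both humans and agents can read.

Suitable content:

- How to choose a browser control method
- How to start and check services
- How to run pipeline Debug Chat
- How to handle OAuth login state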
### 3. Case layer

Repeatable test paths are recorded in YAML.

Suggested structure:

```text
skills/langbot-testing/cases/
  pipeline-debug-chat.yaml
  provider-deepseek.yaml
```

Suggested format:
```yaml
id: pipeline-debug-chat
title: Pipeline Debug Chat returns a bot response
mode: agent-browser
area: pipeline
type: smoke
skills:
- langbot-env-setup
- langbot-testing
env:
- LANGBOT_FRONTEND_URL
- LANGBOT_BACKEND_URL
steps:
- Open LANGBOT_FRONTEND_URL
- Navigate to Pipelines
- Open target pipeline
- Select Debug Chat
- Send deterministic prompt
checks:
- "UI: User message appears"
- "UI: Bot message appears"
- "Console: No unexpected frontend errors"
- "Logs: Backend log includes Conversation(0) Streaming completed"
diagnostics:
- "Use API/curl only after the UI path is attempted, to distinguish frontend display failure from backend/runtime failure."
troubleshooting:
- plugin-runtime-timeout
- proxy-env-mismatch
```
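The field checks implied by the suggested format can be sketched in a few lines of Python. This is only an illustration, not the `lbs validate` implementation: the `validate_case` helper and the `REQUIRED_FIELDS` tuple are hypothetical, and the case is passed as an already-parsed dict so the sketch needs no YAML library.

```python
from typing import Any

# Assumption: these fields are treated as required for illustration only;
# the real schema used by `lbs validate` may differ.
REQUIRED_FIELDS = ("id", "title", "mode", "area", "type", "steps", "checks")


def validate_case(case: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the case looks well-formed."""
    problems = []
    for name in REQUIRED_FIELDS:
        if not case.get(name):
            problems.append(f"missing or empty field: {name}")
    mode = case.get("mode")
    if mode is not None and mode != "agent-browser":
        # The execution rules in this document fix `mode` to agent-browser.
        problems.append(f"mode must be 'agent-browser', got {mode!r}")
    for name in ("steps", "checks"):
        value = case.get(name)
        if value and not isinstance(value, list):
            problems.append(f"{name} must be a list")
    return problems


# A trimmed copy of the pipeline-debug-chat case from the format above.
example_case = {
    "id": "pipeline-debug-chat",
    "title": "Pipeline Debug Chat returns a bot response",
    "mode": "agent-browser",
    "area": "pipeline",
    "type": "smoke",
    "steps": ["Open LANGBOT_FRONTEND_URL", "Navigate to Pipelines"],
    "checks": ["UI: User message appears", "UI: Bot message appears"],
}

print(validate_case(example_case))
print(validate_case({**example_case, "mode": "api"}))
```

Returning a list of problems instead of raising on the first one lets an agent report every issue in a case file in a single pass.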
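### 4. Troubleshooting layer

Failure assets will grow over time, so they are better recorded in a structured form.

The historical Markdown entry point is kept at:

```text
skills/langbot-testing/references/troubleshooting.md
```

The current canonical structured failure assets live in:

```text
skills/langbot-testing/troubleshooting/
  plugin-runtime-timeout.yaml
  proxy-env-mismatch.yaml
```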
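To make "searchable assets" concrete, here is a minimal Python sketch of a keyword search over structured entries. It is not how `lbs trouble search` is implemented. The `TroubleEntry` type and `search_entries` function are hypothetical, the field names are borrowed from the `--title/--symptom/--cause/--fix` options of `lbs trouble add`, and the symptom, cause, and fix texts are placeholders rather than real asset content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TroubleEntry:
    # Hypothetical in-memory shape; fields mirror the `lbs trouble add` options.
    id: str
    title: str
    symptom: str
    cause: str
    fix: str


def search_entries(entries: list[TroubleEntry], keyword: str) -> list[TroubleEntry]:
    """Case-insensitive substring search across every text field of each entry."""
    needle = keyword.casefold()
    return [
        entry
        for entry in entries
        if any(
            needle in text.casefold()
            for text in (entry.id, entry.title, entry.symptom, entry.cause, entry.fix)
        )
    ]


# Placeholder entries; only the ids come from this document.
ENTRIES = [
    TroubleEntry(
        id="plugin-runtime-timeout",
        title="Plugin runtime timeout",
        symptom="Debug Chat never shows a bot reply (placeholder)",
        cause="The plugin side did not answer in time (placeholder)",
        fix="Restart the plugin side and retry (placeholder)",
    ),
    TroubleEntry(
        id="proxy-env-mismatch",
        title="Proxy environment mismatch",
        symptom="Provider requests fail (placeholder)",
        cause="Frontend and backend use different proxy settings (placeholder)",
        fix="Align the proxy variables (placeholder)",
    ),
]

print([entry.id for entry in search_entries(ENTRIES, "runtime")])
```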
### 5. CLI layer

`lbs` is the single entry point; no separate `qa` command will be introduced.

Implemented or currently available:
```bash
bin/lbs list
bin/lbs validate
bin/lbs index
bin/lbs new-skill <name>
bin/lbs new-ref <skill> <name>
bin/lbs case new pipeline-debug-chat --title "Pipeline Debug Chat"
bin/lbs case list
bin/lbs case show pipeline-debug-chat
bin/lbs trouble list <skill>
bin/lbs trouble show plugin-runtime-timeout
bin/lbs trouble search runtime
bin/lbs trouble add <skill> --title ... --symptom ... --cause ... --fix ...
bin/lbs test plan pipeline-debug-chat
bin/lbs test start pipeline-debug-chat
bin/lbs test run pipeline-debug-chat --dry-run
bin/lbs test report pipeline-debug-chat
bin/lbs test report pipeline-debug-chat --backend-log /path/to/backend.log
```
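## Test Library Location

Do not use a hidden `.qa/` directory as the main test library. Test assets should live next to the skills so that they are easy to trigger and maintain:

```text
skills/langbot-testing/
  references/
  cases/
  troubleshooting/
  reports/   # optional; local run artifacts can be ignored or written to an external directory as needed
```

If a project-local test library is needed in the future, `lbs` could support `--workspace` or a project-root configuration, but the canonical assets stay in `langbot-skills`.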
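## Phase Plan

### Phase 1: Capture environment and test paths

Status: mostly complete; maintained continuously.

- `skills/.env` holds the shared default variables.
- `langbot-env-setup` is split into Computer Use, Playwright MCP, OAuth profile, proxy, and service startup.
- `langbot-testing` records the WebUI, pipeline, and provider test paths.
- `lbs validate/index` maintains the structure.

Completion criteria:

- The agent can find the current test entry points from `skills/.env` and the references.
- Paths such as pipeline Debug Chat no longer need to be explored from scratch.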
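### Phase 2: Structured cases and troubleshooting

Status: mostly complete; metadata and asset quality are still being filled in.

Goals:

- `lbs case new/list/show`
- `lbs trouble show/search`
- Case id deduplication, field validation, and index generation

Completion criteria:

- Smoke test paths can be expressed as structured cases.
- The next agent window can read a case directly and execute it.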
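The case id deduplication and index generation named in the Phase 2 goals could look roughly like the Python sketch below. It is an assumption-laden illustration, not the behaviour of `lbs index`: `find_duplicate_ids`, `build_index`, and the shape of the index are hypothetical, and the second sample case is invented for the demo.

```python
from collections import Counter
from typing import Any


def find_duplicate_ids(cases: list[dict[str, Any]]) -> list[str]:
    """Return the ids that appear more than once, sorted for stable output."""
    counts = Counter(case["id"] for case in cases)
    return sorted(case_id for case_id, count in counts.items() if count > 1)


def build_index(cases: list[dict[str, Any]]) -> dict[str, dict[str, str]]:
    """Build an id -> summary mapping, refusing to index ambiguous ids."""
    duplicates = find_duplicate_ids(cases)
    if duplicates:
        raise ValueError(f"duplicate case ids: {', '.join(duplicates)}")
    return {
        case["id"]: {"title": case["title"], "area": case["area"], "type": case["type"]}
        for case in sorted(cases, key=lambda case: case["id"])
    }


CASES = [
    # Second entry is a made-up example; only its id appears in this document.
    {"id": "provider-deepseek", "title": "DeepSeek provider (example)", "area": "provider", "type": "smoke"},
    {"id": "pipeline-debug-chat", "title": "Pipeline Debug Chat returns a bot response", "area": "pipeline", "type": "smoke"},
]

index = build_index(CASES)
print(list(index))
```

Failing loudly on duplicates keeps the index trustworthy: an agent that looks up a case by id should never have to guess which file was meant.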
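### Phase 3: Plans and reports

Status: an MVP exists; refinement continues.

Goals:

- `lbs test plan <case>`
- The agent follows the plan and performs UI QA in the browser
- `lbs test report`
- Log guard integration
- Conventions for report artifacts and evidence

Completion criteria:

- The agent can run browser tests according to a case plan.
- The result report includes the UI results, backend logs, console errors, and troubleshooting suggestions.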
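A log guard of the kind mentioned above can be pictured as a scan over backend log lines. The sketch below is a guess at the idea, not the `lbs test report` implementation: `guard_backend_log`, `GuardResult`, and the error pattern are hypothetical, and the sample log lines use a made-up format. Only the `Streaming completed` marker comes from the example case in this document.

```python
import re
from dataclasses import dataclass, field

# Taken from the example check "Backend log includes Conversation(0) Streaming completed".
EXPECTED_MARKER = "Streaming completed"
# Assumption: lines containing these tokens count as failures.
ERROR_PATTERN = re.compile(r"\b(ERROR|CRITICAL|Traceback)\b")


@dataclass
class GuardResult:
    marker_found: bool
    error_lines: list[str] = field(default_factory=list)

    @property
    def passed(self) -> bool:
        # Pass only when the success marker is present and no error line was seen.
        return self.marker_found and not self.error_lines


def guard_backend_log(lines: list[str]) -> GuardResult:
    """Scan log lines for the expected marker and for error-looking lines."""
    marker_found = False
    error_lines = []
    for line in lines:
        if EXPECTED_MARKER in line:
            marker_found = True
        if ERROR_PATTERN.search(line):
            error_lines.append(line.rstrip("\n"))
    return GuardResult(marker_found, error_lines)


# Made-up log lines for illustration.
clean_log = ["INFO Conversation(0) Streaming completed\n"]
broken_log = ["INFO request received\n", "ERROR runtime did not respond\n"]

print(guard_backend_log(clean_log).passed)
print(guard_backend_log(broken_log).error_lines)
```

In a real run the lines would presumably come from the file passed via `--backend-log`, and the collected error lines would feed the troubleshooting suggestions in the report.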
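## Execution Rules

- The agent may edit Markdown references directly.
- When adding structured cases or troubleshooting entries, prefer `lbs`.
- Run `bin/lbs validate` after every structural change.
- Run `bin/lbs index` after every index-related change.
- Test documents must not hard-code ports; use the URL variables in `skills/.env`.
- The `mode` of a test case is fixed to `agent-browser`.
- API/curl may only appear under `diagnostics`; it cannot replace UI steps or UI checks.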
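Two of these rules lend themselves to an automated check. The Python sketch below shows one way to flag API/curl usage and hard-coded ports inside `steps` and `checks`. The `lint_case` helper and both regular expressions are hypothetical, and nothing here is claimed to be part of `lbs validate`.

```python
import re
from typing import Any

# Assumption: a bare "curl" or "api" token signals a non-UI action.
API_HINT = re.compile(r"\b(curl|api)\b", re.IGNORECASE)
# Assumption: a literal local host:port is what "hard-coded port" means here.
HARDCODED_PORT = re.compile(r"(localhost|127\.0\.0\.1):\d+")


def lint_case(case: dict[str, Any]) -> list[str]:
    """Flag rule violations in steps/checks; diagnostics are deliberately not scanned."""
    problems = []
    for section in ("steps", "checks"):
        for number, text in enumerate(case.get(section, []), start=1):
            if API_HINT.search(text):
                problems.append(f"{section}[{number}]: API/curl belongs under diagnostics")
            if HARDCODED_PORT.search(text):
                problems.append(f"{section}[{number}]: hard-coded port; use a URL variable from skills/.env")
    return problems


good_case = {
    "steps": ["Open LANGBOT_FRONTEND_URL", "Select Debug Chat"],
    "checks": ["UI: Bot message appears"],
    "diagnostics": ["Use API/curl only after the UI path is attempted."],
}
# The port below is arbitrary and only serves the demo.
bad_case = {"steps": ["curl http://localhost:1234/status"], "checks": []}

print(lint_case(good_case))
print(lint_case(bad_case))
```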