fix(api): guard /set-password with allow_modify_login_info (#2288)

The /change-password and /bind-space endpoints already refuse requests when system.allow_modify_login_info is false, but /set-password did not, leaving a path to alter login credentials on locked-down deployments (e.g. public demo instances). Apply the same guard.

Co-authored-by: dadachann <185672915+dadachann@users.noreply.github.com>
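A minimal sketch of the guard this commit message describes, written as plain functions so it runs without Quart. The helper name `login_info_change_blocked`, the nested-dict config shape, the choice to allow changes when the key is absent, and the 403 response body are all assumptions for illustration, not the project's actual handler code.

```python
from typing import Any, Optional


def login_info_change_blocked(
    instance_config: dict[str, Any],
) -> Optional[tuple[int, dict[str, Any]]]:
    """Return an error response when login info may not be modified, else None.

    Hypothetical helper: config shape and response payload are assumptions.
    """
    system_cfg = instance_config.get("system", {})
    # Assumption: a missing key means modification is allowed.
    if not system_cfg.get("allow_modify_login_info", True):
        return 403, {"code": -1, "msg": "modifying login info is disabled"}
    return None


def set_password(instance_config: dict[str, Any], new_password: str) -> tuple[int, dict[str, Any]]:
    """Stand-in for the /set-password handler; storage is out of scope here."""
    # Apply the same check the other credential endpoints are described as using.
    blocked = login_info_change_blocked(instance_config)
    if blocked is not None:
        return blocked
    return 200, {"code": 0, "msg": "ok"}
```

Keeping the check in one shared helper means every credential-changing endpoint refuses in the same way, which is the gap this commit closes.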
Add performance and reliability QA gates (#2283)
2026-06-26 15:34:26 +00:00 · 2026-06-26 16:35:50 +08:00 · 2026-06-25 21:02:44 +08:00 · 2026-06-25 20:31:44 +08:00 · 2026-06-25 20:26:25 +08:00 · 2026-06-25 08:22:01 -04:00
138 changed files with 12234 additions and 2360 deletions
@@ -48,6 +48,7 @@ coverage.xml
 .coverage
 src/langbot/web/
 testsdk/
 .qa/
 # Build artifacts
 /dist
@@ -1,160 +1,105 @@
 # AGENTS.md
-This file guides code agents (Claude Code, GitHub Copilot, OpenAI Codex, etc.) working in the LangBot project. `CLAUDE.md` is a symlink to this file.
+This file guides code agents working in the LangBot main repository. `CLAUDE.md` is a symlink to this file.
-## Project Overview
+Read `ARCHITECTURE.md` before non-trivial backend, frontend, runtime, plugin, Box, MCP, persistence, or cross-repo SDK changes. This file is the working checklist; `ARCHITECTURE.md` is the system map.
-LangBot is an open-source, LLM-native instant-messaging bot development platform. It aims to provide an out-of-the-box IM bot development experience with Agent, RAG, MCP and other LLM application capabilities, supporting mainstream global IM platforms and exposing rich APIs for custom development.
+## Quick Facts
-LangBot has a comprehensive web frontend — almost every operation can be performed through it.
+- Python backend: `>=3.11,<4.0`, dependencies managed by `uv`.
 - Frontend: `web/` is Vite + React Router 7 + shadcn/ui + Tailwind, managed by `pnpm`.
 - Backend framework: Quart served by Hypercorn on `api.port`, default `5300`.
 - Frontend dev server: `web/` on `3000`, with `VITE_API_BASE_URL` pointing at the backend.
 - Plugin/Box/runtime contracts live in sibling repo `langbot-plugin-sdk`, pinned as `langbot-plugin` in `pyproject.toml`.
- **Python**: `>=3.11,<4.0`, dependencies managed by `uv`. Package version is in `pyproject.toml`.
+## Essential Commands
 - **Frontend**: `web/` is a **Vite + React Router 7 + shadcn/ui + Tailwind CSS** SPA, managed by `pnpm`. (Note: this is NOT Next.js — the `dev` script is `vite`.)
 - **Backend framework**: Quart (the async flavour of Flask). The HTTP API and the pre-built web UI are both served by the backend on `http://127.0.0.1:5300`.
 ## Repository Layout
 ```
 LangBot/
 ├── main.py                     # Entrypoint shim -> langbot.__main__.main()
 ├── pyproject.toml              # Python project + deps (uv), pins langbot-plugin==<x.y.z>
 ├── src/langbot/
 │   ├── __main__.py             # Real entrypoint, CLI args (--standalone-runtime, --standalone-box, --debug)
 │   ├── pkg/                    # Core backend package
 │   │   ├── api/                # HTTP API controllers + services (Quart)
 │   │   ├── core/               # App bootstrap, stages, task manager
 │   │   ├── platform/           # IM platform adapters, bot managers, session managers
 │   │   ├── provider/           # LLM providers, requesters, tool providers
 │   │   ├── pipeline/           # Pipelines, stages, query pool
 │   │   ├── plugin/             # Bridge connecting LangBot to the plugin runtime (see below)
 │   │   ├── box/                # Code-sandbox subsystem (Docker / nsjail / E2B backends)
 │   │   ├── skill/              # Skill subsystem
 │   │   ├── rag/ , vector/      # RAG + vector store
 │   │   ├── command/            # Built-in commands
 │   │   ├── persistence/        # ORM models + Alembic migrations (SQLite & PostgreSQL)
 │   │   ├── storage/            # Object/file storage abstractions
 │   │   ├── config/, entity/, discover/, utils/, telemetry/, survey/
 │   ├── libs/                   # Vendored SDKs (qq_official_api, wecom_api, etc.)
 │   └── templates/              # Config/component templates (e.g. templates/config.yaml)
 ├── web/                        # Frontend SPA (Vite + React Router 7 + shadcn + Tailwind)
 └── docker/                     # docker-compose deployment files
 ```
 ## Development Environment Setup
 Full guide lives in the wiki: **["开发配置" / Dev Config](https://docs.langbot.app/zh/develop/dev-config)**. Summary:
 ### Backend
 ```bash
 pip install uv
 uv sync --dev          # uv creates a .venv/ for you; point your editor's interpreter at it
 uv run main.py         # serves API + web UI on http://127.0.0.1:5300
 ```
 On first run the config file is generated at `data/config.yaml`. DB is SQLite by default (zero setup); PostgreSQL is supported. Migrations run automatically on startup.
 ### Frontend
 Requires Node.js + [pnpm](https://pnpm.io/installation).
 ```bash
 cd web
 cp .env.example .env   # Windows: copy .env.example .env
 pnpm install
 pnpm dev               # http://127.0.0.1:3000  (npm install / npm run dev also work)
 ```
 `pnpm dev` reads `VITE_API_BASE_URL` from `web/.env` so the dev frontend can reach the backend on port `5300`. In production the frontend is pre-built into static files served by the backend on the same origin.
 ### Code formatting
 The repo runs lint + format checks in CI. Install the pre-commit hooks so the same checks run locally before each commit:
 ```bash
 uv sync --dev
 uv run main.py
 uv run pre-commit install
 cd web
 pnpm install
 pnpm dev
 pnpm build
 ```
-## Plugin System
+Useful focused tests:
 LangBot's plugin system (Plugin SDK, CLI `lbp`, Plugin Runtime, and the shared entity/API definitions) lives in a **separate repository**: [`langbot-plugin-sdk`](https://github.com/langbot-app/langbot-plugin-sdk). LangBot depends on it via the pinned `langbot-plugin` package in `pyproject.toml`.
 ### Architecture (what to know inside this repo)
 - Plugins run as independent processes managed by the **Plugin Runtime**. The Runtime supports two control transports: `stdio` and `websocket`.
 - When LangBot is started directly by a user (not in a container), it spawns and connects to the Runtime over **stdio** (lightweight/personal use).
 - When LangBot runs in a container, it connects to a standalone Runtime over **WebSocket** (production).
 - The bridge code lives in `src/langbot/pkg/plugin/` (`connector.py`, `handler.py`).
 - Relevant config (`data/config.yaml`): `plugin.runtime_ws_url` (e.g. `ws://langbot_plugin_runtime:5400/control/ws`). Start LangBot with `--standalone-runtime` to make it connect to an externally-launched Runtime over WebSocket instead of spawning one over stdio.
 ### Debugging the Plugin Runtime / CLI / SDK
 This is documented in detail in the **SDK repo's `AGENTS.md`** and in the wiki page **["调试插件运行时、CLI、SDK" / Plugin Runtime](https://docs.langbot.app/zh/develop/plugin-runtime)**. The short version:
 - Clone `LangBot` and `langbot-plugin-sdk` as siblings under one parent dir so the editor resolves shared entities.
 - Start a standalone Runtime from the SDK repo: `uv run --no-sync lbp rt` (control port `5400`, debug port `5401`).
 - To make LangBot use a locally-modified SDK: from the SDK dir, with LangBot's `.venv` active, run `uv pip install .`, then launch LangBot with `uv run --no-sync main.py --standalone-runtime` (keep `--no-sync` so your local SDK isn't overwritten).
 ### Debugging the Box (sandbox) runtime
 The Box subsystem (`src/langbot/pkg/box/`) is the code sandbox. It picks the first available backend among **Docker / nsjail / E2B**. The standalone Box runtime is launched via the SDK CLI: `lbp box`. Backend selection details, the `lbp box` flags, and the SDK-side architecture are documented in the SDK repo's `AGENTS.md`.
 Relevant config (`data/config.yaml`, `box:` section): `box.enabled` (master switch — disabling it also disables the native sandbox tools, skill add/edit, and stdio-mode MCP servers), `box.backend` (`'local'` = Docker/nsjail auto-pick, or `'docker'` / `'nsjail'` / `'e2b'`; also settable via `BOX__BACKEND`), and `box.runtime.endpoint` (external Box runtime base URL, e.g. `ws://127.0.0.1:5410`; empty = local auto-managed runtime). Like the plugin runtime, LangBot can connect to an externally-launched Box runtime by setting that endpoint and starting with `--standalone-box`.
 > A common false "No supported sandbox backend (Docker / nsjail / E2B) is available" comes from Docker being installed and running but the current user not being in the `docker` group → `docker info` gets `permission denied` on the socket. Fix: `sudo usermod -aG docker <user>` and restart the backend in a shell that has the new group.
 ## Development Standards
 - LangBot is a global project: **all code comments and docstrings must be in English**, and every user-facing string must support **i18n** (`en_US` + `zh_Hans` at minimum, plus `ja_JP` where the repo already has it).
 - LangBot is adopted in both toC and toB scenarios — always consider compatibility and security.
 - **Commit message format**: `<type>(<scope>): <subject>`
  - `type`: one of `feat`, `fix`, `docs`, `style`, `refactor`, `perf`, `test`, `chore`, etc.
  - `scope`: the affected package/module/file/class.
  - `subject`: concise description of the change.
 ### Database migrations (Alembic)
 LangBot uses [Alembic](https://alembic.sqlalchemy.org/) for migrations, supporting both SQLite and PostgreSQL from a single set of scripts. Migration files live in `src/langbot/pkg/persistence/alembic/versions/`.
 If you change ORM model definitions, generate a migration:
 ```bash
-# Run from the project root (requires data/config.yaml to exist)
+uv run pytest tests/unit_tests -q
-uv run python -m langbot.pkg.persistence.alembic_runner autogenerate "description of your change"
+uv run pytest tests/integration -q
 uv run pytest tests/integration/persistence -q
 uv run pytest tests/manual/mcp_smoke.py
 cd web
 pnpm lint
 pnpm test:e2e
 ```
-Review and edit the generated script before committing. Migrations execute automatically on startup. `autogenerate` detects schema changes (add/drop columns, tables, type changes) but **data migrations** (e.g. mutating JSON field contents) must be hand-written into the generated script. `env.py` sets `render_as_batch=True`, so SQLite's ALTER TABLE limits are handled automatically — no need to branch per database. More in the wiki ["开发配置"](https://docs.langbot.app/zh/develop/dev-config#数据库迁移).
+Run the narrowest useful test first, then broader checks when confidence is needed.
-When writing a migration, follow these rules:
+## Where to Look
- **Revision id ≤ 32 characters.** PostgreSQL stores `alembic_version.version_num` as `varchar(32)`; a longer id raises `StringDataRightTruncationError` at runtime. Prefer short, descriptive ids like `0005_add_llm_context_length`.
+- Architecture map: `ARCHITECTURE.md`.
- **Guard every operation against missing tables/columns.** Fresh installs build the schema via `create_all()` and then stamp the Alembic baseline, so a migration may run against a table that already has the change — or, in tests, against an empty database. Check `inspector.get_table_names()` / `inspector.get_columns(...)` before `add_column` / `drop_column`, mirroring the existing migrations.
+- Dev environment guide: https://docs.langbot.app/zh/develop/dev-config.
- **Keep a single linear head.** Chain `down_revision` to the current head; do not create branches. Run the migration tests after adding one: `uv run pytest tests/integration/persistence/ -q` (the PostgreSQL test needs a running PG via `TEST_POSTGRES_URL`).
+- Plugin runtime / CLI / SDK debugging: https://docs.langbot.app/zh/develop/plugin-runtime.
 - API-key auth: `docs/API_KEY_AUTH.md`.
 - Box deep-dive notes: `docs/review/box-architecture.md` and related files.
 - In-repo skills: `skills/` is the single source of truth for LangBot agent skills.
 - SDK repo: `../langbot-plugin-sdk/` when changing shared entities, plugin APIs, action protocol, `lbp rt`, or `lbp box`.
-> **Legacy migration system (deprecated — do not extend).** The old 3.x migration system under `src/langbot/pkg/persistence/migrations/` (`DBMigration` subclasses in `dbmXXX_*.py`, run from `pkg/persistence/mgr.py`) is **frozen**. Do **not** add new `dbmXXX_*.py` files. The chain is capped at `required_database_version = 25` (`pkg/utils/constants.py`); those files only exist to upgrade pre-existing 3.x databases up to the Alembic baseline and are kept read-only. All new schema changes go through Alembic.
+## Cross-Repo SDK Work
-## Agent-Facing Surfaces (MCP + Skills)
+When changing SDK contracts used by LangBot:
-LangBot is built to be **agent-friendly**. Three surfaces let AI agents work
+```bash
-with LangBot, and they MUST be kept in lockstep with the HTTP API:
+# from langbot-plugin-sdk, with LangBot's .venv active
 uv pip install .
-1. **MCP server** — `src/langbot/pkg/api/mcp/` exposes a curated subset of the
+# from LangBot, preserve the locally installed SDK
-   API as MCP tools at `/mcp` (API-key authenticated, including the
+uv run --no-sync main.py
-   `api.global_api_key` from config.yaml). `server.py` defines the tools (they
+```
   call the service layer directly); `mount.py` is the ASGI dispatcher.
 2. **In-repo skills** — `skills/` is the **single source of truth** for agent
   skills (plugin/core/deploy/e2e/MCP-ops). Docs and the landing page link here
   rather than embedding their own copies.
 3. **API-key auth** — `api.global_api_key` (config.yaml) authenticates the API
   and MCP without a login session; see `docs/API_KEY_AUTH.md`.
-> **Maintenance rule (important).** When you add, remove, or change an HTTP API
+For standalone runtime debugging:
 > endpoint that should be agent-accessible, you MUST update **both** the matching
 > MCP tool in `src/langbot/pkg/api/mcp/server.py` **and** the relevant skill under
 > `skills/` (especially `skills/skills/langbot-mcp-ops`). The API, the MCP tool
 > surface, and the skills are one system — drift between them is a bug.
-## Some Principles
+```bash
 # in langbot-plugin-sdk
 uv run --no-sync lbp rt
 uv run --no-sync lbp box
 # in LangBot
 uv run --no-sync main.py --standalone-runtime
 uv run --no-sync main.py --standalone-box
 ```
 Config keys to verify in `data/config.yaml` / `src/langbot/templates/config.yaml`:
 - Plugin runtime: `plugin.runtime_ws_url`, default Docker host `langbot_plugin_runtime:5400/control/ws`.
 - Box runtime: `box.enabled`, `box.backend`, `box.runtime.endpoint`, Docker host `langbot_box:5410`.
 - API/MCP auth: `api.global_api_key`.
 ## Change Rules
 - HTTP API changes that should be agent-accessible must update the matching MCP tool in `src/langbot/pkg/api/mcp/server.py` and the relevant skill under `skills/` in the same pass.
 - New schema changes use Alembic under `src/langbot/pkg/persistence/alembic/versions/`; do not add legacy `dbmXXX` migrations.
 - New platform behavior belongs in platform adapters only for platform translation; pipeline/business logic belongs in `pkg/pipeline/` or services.
 - User-facing strings must support i18n (`en_US`, `zh_Hans`; include `ja_JP` where the repo already does).
 - Code comments and docstrings must be English.
 - Keep compatibility and security in mind; LangBot is used in both self-hosted/community and toB deployments.
 - Commit message format: `<type>(<scope>): <subject>`.
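The migration rules that this hunk trims from AGENTS.md (revision ids of at most 32 characters, and guarding each operation because the change may already be present) can be illustrated without Alembic. The sketch below swaps in the standard-library `sqlite3` module and `PRAGMA table_info` for Alembic's inspector; the function names and the `llm_models` / `context_length` identifiers in the usage note are hypothetical.

```python
import sqlite3

# PostgreSQL stores alembic_version.version_num as varchar(32).
MAX_REVISION_ID_LEN = 32


def check_revision_id(revision: str) -> str:
    """Reject revision ids that would not fit in a varchar(32) column."""
    if len(revision) > MAX_REVISION_ID_LEN:
        raise ValueError(f"revision id too long ({len(revision)} > {MAX_REVISION_ID_LEN})")
    return revision


def table_exists(conn: sqlite3.Connection, table: str) -> bool:
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (table,)
    ).fetchone()
    return row is not None


def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    return any(row[1] == column for row in conn.execute(f"PRAGMA table_info({table})"))


def add_column_if_missing(
    conn: sqlite3.Connection, table: str, column: str, ddl_type: str
) -> bool:
    """Add the column only when the table exists and lacks it.

    Identifiers are interpolated directly, so they must be trusted
    migration constants, never user input.
    """
    if not table_exists(conn, table) or column_exists(conn, table, column):
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl_type}")
    return True
```

Calling `add_column_if_missing(conn, "llm_models", "context_length", "INTEGER")` twice is safe: the second call finds the column and does nothing, which mirrors the guarded style the removed text asks migrations to follow.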
 ## Runtime Pitfalls
 - Local stdio Plugin Runtime disconnects do not auto-reconnect; restart LangBot if that path breaks.
 - Orphan runtime processes on `5400`/`5401` commonly break plugin debugging.
 - Use `uv run --no-sync` after locally installing the SDK, or `uv` may restore the pinned package.
 - A false Box “no backend” often means Docker is running but the current user lacks Docker socket permission.
 - Do not confuse external MCP servers LangBot connects to (`pkg/provider/tools/loaders/mcp.py`) with LangBot's own `/mcp` server (`pkg/api/mcp/`).
 - `CLAUDE.md` is a symlink to this file; edit `AGENTS.md`, not the symlink.
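The orphan-process pitfall above can be checked before starting a debug session. This is a small sketch, assuming the runtime listens on TCP at `127.0.0.1`; the function names are hypothetical and only the port numbers `5400` and `5401` come from the text.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True when something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising.
        return sock.connect_ex((host, port)) == 0


def busy_runtime_ports(ports: tuple[int, ...] = (5400, 5401)) -> list[int]:
    """List the plugin runtime ports that already have a listener."""
    return [port for port in ports if port_in_use(port)]


if __name__ == "__main__":
    busy = busy_runtime_ports()
    if busy:
        print(f"ports already in use: {busy}; look for a leftover runtime process")
    else:
        print("runtime ports are free")
```

A non-empty result before `lbp rt` has been started suggests a leftover process from an earlier session.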
 ## Principles
 - Keep it simple, stupid.
 - Entities should not be multiplied unnecessarily.
@@ -0,0 +1,250 @@
 # Architecture
 This document is a map of LangBot's moving parts. It is intentionally more stable than a feature guide and more concrete than the README: when you need to change behavior, start here, then follow the file references into the code.
 For agent-specific working rules, see `AGENTS.md`. For plugin-runtime and Box-runtime implementation details, also read the sibling SDK repo: [`langbot-plugin-sdk`](https://github.com/langbot-app/langbot-plugin-sdk).
 ## What LangBot Is
 LangBot is an open-source platform for building production IM bots backed by LLMs, agents, RAG, plugins, MCP tools, and a web management panel.
 At runtime, one LangBot process owns:
 - a Quart/Hypercorn HTTP service and the built web UI on `:5300`;
 - messaging-platform adapters such as Discord, Telegram, Slack, WeChat, QQ, WeCom, Lark, DingTalk, KOOK, LINE, Satori, Matrix, and HTTP/WebSocket bots;
 - a pipeline engine that turns inbound platform messages into LLM/tool/plugin work and replies;
 - persistence, storage, vector database, telemetry, monitoring, and configuration managers;
 - bridges to the Plugin Runtime and Box Runtime provided by `langbot-plugin-sdk`;
 - an MCP server at `/mcp` exposing a curated agent-facing subset of the service layer.
 ## Repository Boundary
 LangBot is not a single-repo system.
 - `LangBot/` is the main product: backend, web UI, platform adapters, pipeline engine, HTTP API, MCP server, RAG, persistence, skills integration, and the bridge code that talks to runtimes.
 - `langbot-plugin-sdk/` is published as `langbot-plugin` and pinned in `LangBot/pyproject.toml`. It contains plugin developer APIs, shared entities, `lbp`, the Plugin Runtime (`lbp rt`), and the Box Runtime (`lbp box`).
 - Plugins import SDK APIs from `langbot_plugin.*`; the LangBot main process imports the same package for shared entities and runtime protocols.
 This split matters. If a change modifies SDK entities, component APIs, action protocols, `lbp rt`, or `lbp box`, verify the sibling SDK repo and install the local SDK into LangBot's virtualenv when testing cross-repo behavior.
 ## Startup Path
 The process entrypoint is small and layered:
 1. `main.py` delegates to `langbot.__main__.main()`.
 2. `src/langbot/__main__.py` parses `--standalone-runtime`, `--standalone-box`, and `--debug`, checks dependencies, generates missing config/data files, and calls `pkg.core.boot.main()`.
 3. `pkg/core/boot.py` executes startup stages in order: `LoadConfigStage`, `GenKeysStage`, `SetupLoggerStage`, `BuildAppStage`, `ShowNotesStage`.
 4. `BuildAppStage` constructs the `Application` object by wiring managers, services, runtime connectors, and controllers.
 5. `Application.run()` starts the platform manager, query controller, HTTP controller, telemetry/cleanup loops, and plugin initialization.
 The central runtime object is `pkg/core/app.py::Application`. It is a service locator for long-lived managers. That is not elegant, but it is the current architectural center; most subsystems receive `ap: Application` and collaborate through it.
 ## Top-Level Layout
 ```text
 LangBot/
 ├── main.py                         # Entrypoint shim
 ├── pyproject.toml                  # Python package, deps, pinned langbot-plugin
 ├── src/langbot/
 │   ├── __main__.py                 # CLI entrypoint and boot handoff
 │   ├── pkg/
 │   │   ├── core/                   # Application, boot stages, task manager
 │   │   ├── api/                    # HTTP API + MCP server mount
 │   │   ├── platform/               # IM adapters and runtime bot manager
 │   │   ├── pipeline/               # Message routing and pipeline stages
 │   │   ├── provider/               # LLM runners, model manager, tools
 │   │   ├── plugin/                 # LangBot-side Plugin Runtime connector/handler
 │   │   ├── box/                    # LangBot-side Box service/connector
 │   │   ├── skill/                  # Skill metadata/activation integration
 │   │   ├── rag/ , vector/          # Knowledge-base and vector DB integration
 │   │   ├── persistence/            # SQLAlchemy/SQLModel, Alembic, legacy migrations
 │   │   ├── storage/                # Local/S3 file storage abstraction
 │   │   └── config/, entity/, utils/, telemetry/, survey/
 │   ├── libs/                       # Vendored third-party platform SDKs
 │   └── templates/                  # Default config and component metadata
 ├── web/                            # Vite + React Router + shadcn/ui + Tailwind SPA
 ├── docker/                         # Deployment manifests
 ├── skills/                         # In-repo agent skills, single source of truth
 └── tests/                          # Unit/integration/e2e/manual tests
 ```
 ## The Runtime Graph
 The most useful mental model is this graph:
 ```text
 Platform adapter
  → RuntimeBot
  → MessageAggregator
  → QueryPool
  → Controller
  → RuntimePipeline
  → PipelineStage chain
  → RequestRunner / ToolManager / PluginRuntimeConnector / BoxService
  → response via adapter
 ```
 The HTTP and MCP surfaces are parallel entrypoints into the same service layer:
 ```text
 HTTP client / Web UI
  → Quart route group
  → api/http/service/*
  → Application managers / persistence / runtime connectors
 MCP client
  → /mcp mount
  → api/mcp/server.py tools
  → the same service layer directly
 ```
 ## Message Flow
 Inbound platform messages enter through adapter-specific SDK callbacks. The common path is:
 1. A platform adapter under `pkg/platform/sources/` converts platform-specific events into SDK message/event entities.
 2. `RuntimeBot` in `pkg/platform/botmgr.py` applies pipeline routing rules and either discards the message, pushes it to webhooks, or sends it to the message aggregator.
 3. `MessageAggregator` batches/normalizes messages before adding a `Query` to `QueryPool`.
 4. `Controller` in `pkg/pipeline/controller.py` selects queries subject to global pipeline concurrency and per-session concurrency.
 5. `RuntimePipeline` in `pkg/pipeline/pipelinemgr.py` runs configured pipeline stages using a responsibility-chain style executor that supports generator stages.
 6. The chat stage emits plugin events, calls a configured `RequestRunner`, handles streaming/non-streaming responses, records telemetry, and appends conversation history.
 7. Output stages send text, cards, chunks, files, or error notices back through the original platform adapter.
 Pipeline components are registered by decorators and package import side effects. When adding a new stage, loader, runner, or adapter, check the corresponding preregistration mechanism instead of inventing a second registry.
 ## Platform Layer
 Platform code lives under `pkg/platform/`.
 - `botmgr.py` owns runtime bots, routing rules, event logging, webhook pushing, and adapter lifecycle.
 - `sources/` contains adapter implementations. Each adapter subclasses `langbot_plugin.api.definition.abstract.platform.adapter.AbstractMessagePlatformAdapter` from the SDK.
 - Platform entities such as `MessageChain`, `Image`, `At`, `Voice`, and events come from `langbot-plugin-sdk`, not from this repo.
 The platform layer should translate between external platform APIs and LangBot's shared message/event model. It should not contain LLM-provider logic or pipeline business logic.
 ## Pipeline Layer
 Pipeline code lives under `pkg/pipeline/`.
 Important pieces:
 - `pool.py::QueryPool` stores pending queries and cached in-flight queries for plugin backward-compatible calls.
 - `controller.py::Controller` schedules query processing and enforces concurrency.
 - `pipelinemgr.py::RuntimePipeline` materializes database pipeline config into a runtime stage chain.
 - `process/handlers/chat.py::ChatMessageHandler` is the main LLM conversation handler.
 - Stage families include response rules, banned sessions, content filters, preprocessors, rate limits, message truncation, long text handling, response-back, command handling, and wrappers.
 Pipelines are configuration-driven. Prefer adding a stage or extending an existing stage family over hard-coding behavior in platform adapters.
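The two concurrency limits the controller is described as enforcing (a global pipeline limit and a per-session limit) can be sketched with `asyncio.Semaphore`. This is an illustration under assumptions, not the actual `Controller` code: the class name `QueryScheduler`, its parameters, and the lock ordering are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class QueryScheduler:
    """Hypothetical scheduler with a global cap and a per-session cap.

    Construct it inside a running event loop.
    """

    def __init__(self, global_limit: int, per_session_limit: int = 1) -> None:
        self._global = asyncio.Semaphore(global_limit)
        self._per_session_limit = per_session_limit
        self._sessions: dict[str, asyncio.Semaphore] = {}

    def _session_semaphore(self, session_id: str) -> asyncio.Semaphore:
        if session_id not in self._sessions:
            self._sessions[session_id] = asyncio.Semaphore(self._per_session_limit)
        return self._sessions[session_id]

    async def run(self, session_id: str, handler: Callable[[], Awaitable[T]]) -> T:
        # Take the session slot first so a query waiting on its own session
        # does not occupy a global slot while it waits.
        async with self._session_semaphore(session_id):
            async with self._global:
                return await handler()
```

Acquiring the per-session semaphore before the global one is a design choice of this sketch: it keeps a busy session from starving other sessions of global capacity.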
 ## Provider, RAG, and Tools
 Provider code lives under `pkg/provider/`.
 - `modelmgr/` manages configured model providers and requesters.
 - `runners/` implements request runners such as the local agent runner and external workflow integrations.
 - `tools/toolmgr.py` aggregates tools from native tools, plugin tools, external MCP servers, and skill-authoring tools.
 - `tools/loaders/mcp.py` is the MCP client side: external MCP servers that LangBot connects to for agent tools.
 - RAG lives across `pkg/rag/`, `pkg/vector/`, model services, and plugin KnowledgeEngine actions.
 Do not confuse LangBot's MCP client side with LangBot's own MCP server at `/mcp`; they are different surfaces.
 ## Plugin System
 The plugin system crosses the repo boundary.
 In this repo:
 - `pkg/plugin/connector.py` connects LangBot to the Plugin Runtime over stdio or WebSocket.
 - `pkg/plugin/handler.py` exposes LangBot actions to the runtime and calls runtime actions for plugin operations.
 - `pkg/provider/tools/loaders/plugin.py` exposes plugin Tool components to LLM runners.
 - Pipeline handlers emit SDK events such as normal-message events and prompt-processing events.
 In `langbot-plugin-sdk`:
 - `src/langbot_plugin/api/` defines `BasePlugin`, component base classes, message/event entities, contexts, proxies, and manifests.
 - `src/langbot_plugin/runtime/` implements `lbp rt`, plugin discovery, dependency installation, process launching, and control/debug connections.
 - `src/langbot_plugin/entities/io/` defines the action protocol shared by LangBot, runtime, and plugin processes.
 The Plugin Runtime supports stdio and WebSocket control transports. Direct local LangBot runs usually spawn the runtime over stdio. Containerized/standalone deployments connect over WebSocket using `plugin.runtime_ws_url` and `--standalone-runtime`.
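The transport choice described above can be summarized in a small sketch. It assumes the `--standalone-runtime` flag alone decides the transport and that a missing URL should be an error; the real connector may consider other signals, and the names `RuntimeTransport` and `choose_plugin_transport` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RuntimeTransport:
    kind: str  # "stdio" or "websocket"
    url: Optional[str] = None


def choose_plugin_transport(standalone_runtime: bool, runtime_ws_url: str = "") -> RuntimeTransport:
    """Pick the control transport for the Plugin Runtime.

    standalone_runtime mirrors the --standalone-runtime flag;
    runtime_ws_url mirrors plugin.runtime_ws_url from the config file.
    """
    if standalone_runtime:
        if not runtime_ws_url:
            # Assumption: fail fast instead of silently falling back to stdio.
            raise ValueError("plugin.runtime_ws_url must be set with --standalone-runtime")
        return RuntimeTransport("websocket", runtime_ws_url)
    # Default local run: LangBot spawns the runtime and talks over stdio.
    return RuntimeTransport("stdio")
```

For example, `choose_plugin_transport(True, "ws://langbot_plugin_runtime:5400/control/ws")` selects the WebSocket transport with that URL, while `choose_plugin_transport(False)` selects stdio.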
 ## Box Runtime and Skills
 Box is the sandbox subsystem used by native agent tools, stdio MCP servers, skill authoring, and managed processes.
 In this repo:
 - `pkg/box/service.py` is the application-facing facade for exec, sessions, managed processes, skill CRUD, status, reconnects, quotas, mounts, and sandbox profiles.
 - `pkg/box/connector.py` connects to the Box Runtime over stdio, Windows subprocess+WebSocket, or remote WebSocket.
 - `pkg/provider/tools/loaders/native.py`, `mcp_stdio.py`, and skill loaders depend on Box availability.
 - `pkg/skill/manager.py` loads skills from the Box runtime, falling back to local `data/skills` when needed.
 In `langbot-plugin-sdk`:
 - `src/langbot_plugin/box/server.py` implements `lbp box` and the WebSocket endpoints on `:5410`.
 - `src/langbot_plugin/box/runtime.py` owns sandbox sessions and managed processes.
 - `backend.py`, `nsjail_backend.py`, and `e2b_backend.py` implement sandbox backends.
 - `skill_store.py` manages skill packages from the Box side.
 Important config keys live under `box:` in `src/langbot/templates/config.yaml`: `box.enabled`, `box.backend`, `box.runtime.endpoint`, and `box.local.*`. Start LangBot with `--standalone-box` when connecting to an externally launched Box runtime.
 ## HTTP API, Web UI, and MCP Server
 `pkg/api/http/controller/main.py` builds a Quart app, registers route groups, serves the built SPA, and wraps the ASGI app with the MCP dispatcher.
 - HTTP route groups live under `pkg/api/http/controller/groups/`.
 - Service-layer logic lives under `pkg/api/http/service/`.
 - The built web UI is served from the frontend build path with SPA fallback.
 - The MCP server lives under `pkg/api/mcp/` and is mounted at `/mcp`.
 The MCP server intentionally exposes a curated subset of the API. Tools call service classes directly rather than making HTTP requests back into LangBot.
 Maintenance rule: when adding, removing, or changing an HTTP endpoint that should be agent-accessible, update the matching MCP tool and the relevant in-repo skill under `skills/` in the same pass.
 ## Persistence and Configuration
 Persistence is centered on `pkg/persistence/mgr.py`.
 - SQLite is the default database; PostgreSQL is supported.
 - Models live under `pkg/entity/persistence/`.
 - Fresh schemas are created from metadata, then legacy migrations run up to the frozen 3.x baseline, then Alembic migrations run to head.
 - New schema changes should use Alembic under `pkg/persistence/alembic/versions/`; do not extend the frozen legacy migration chain.
 Configuration starts from `src/langbot/templates/config.yaml` and is generated into `data/config.yaml` on first run. Most long-lived managers read from `ap.instance_config.data`.
 ## Frontend
 The frontend lives in `web/` and is a Vite SPA using React Router 7, shadcn/ui, Tailwind CSS, and pnpm. It is not Next.js, despite some historical filenames.
 In development, `pnpm dev` serves the UI on `:3000` and reads `VITE_API_BASE_URL` to call the backend on `:5300`. In production, the built frontend is packaged into the Python distribution and served by the backend.
 Keep frontend API behavior aligned with `pkg/api/http/service/` and route groups. User-facing strings must go through the existing i18n setup.
 ## Agent-Facing Surfaces
 LangBot is deliberately agent-friendly. The agent-facing surfaces are part of the architecture, not extra docs.
 - `skills/` is the single source of truth for in-repo skills.
 - `pkg/api/mcp/server.py` exposes the LangBot MCP server at `/mcp`.
 - `api.global_api_key` authenticates API/MCP access without a browser login.
 - `AGENTS.md` and `ARCHITECTURE.md` tell coding agents how the repo works.
 When one of these changes, update the others if the behavior or contract changed. API, MCP tools, and skills are one system; drift is a bug.
 ## Where to Change Things
 - New HTTP API: add/adjust a service in `pkg/api/http/service/`, a route group in `pkg/api/http/controller/groups/`, tests, and MCP/skills if agent-accessible.
 - New platform adapter: add a `pkg/platform/sources/*` adapter, component metadata/templates as needed, i18n, docs, and tests/smoke coverage.
 - New pipeline behavior: add or extend a pipeline stage family under `pkg/pipeline/`; avoid putting pipeline rules in adapters.
 - New LLM provider/requester: work under `pkg/provider/modelmgr/` and related service/UI surfaces.
 - New LLM tool source: extend `pkg/provider/tools/loaders/` and `ToolManager` intentionally.
 - New plugin component/API/protocol: change `langbot-plugin-sdk` first or in lockstep, then update LangBot bridge code.
 - New Box capability: change both `pkg/box/` and `langbot-plugin-sdk/src/langbot_plugin/box/`, plus config and tests.
 - New database schema: add an Alembic migration, not a legacy `dbmXXX` migration.
 ## Design Biases
 - Keep platform translation, pipeline orchestration, provider execution, and runtime protocols separate.
 - Reuse existing registries and service layers instead of adding parallel paths.
 - Prefer small, explicit agent surfaces over exposing every internal API.
 - Treat cross-repo contracts with the SDK as public interfaces.
 - Test behavior at the narrowest useful layer first, then add integration/e2e coverage for runtime or platform changes.
@@ -52,6 +52,15 @@ RUN apt-get update \
    && echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian $(. /etc/os-release && echo \"$VERSION_CODENAME\") stable" > /etc/apt/sources.list.d/docker.list \
    && apt-get update \
    && apt-get install -y --no-install-recommends docker-ce-cli \
    # Install Node.js LTS so the sandbox (nsjail/Docker box) can run npx-based
    # stdio MCP servers. node/npx land in /usr/bin, which is on the nsjail
    # read-only mount whitelist (_READONLY_SYSTEM_MOUNTS), so they are bound
    # into the sandbox chroot automatically. Without node, any npx-launched
    # MCP server exits with return_code=127 (command not found).
    && curl -fsSL https://deb.nodesource.com/setup_22.x -o /tmp/nodesource_setup.sh \
    && bash /tmp/nodesource_setup.sh \
    && apt-get install -y --no-install-recommends nodejs \
    && rm -f /tmp/nodesource_setup.sh \
    && python -m pip install --no-cache-dir uv \
    && uv sync \
    && apt-get purge -y --auto-remove curl gnupg \
@@ -55,6 +55,12 @@ LangBot is an **open-source, production-grade platform** for building AI-powered
 ---
 ## 😎 Stay Updated
 Click the Star and Watch buttons in the top-right corner of the repository to get the latest updates.
 ![star gif](https://langbot.app/star.gif)
 ## Quick Start
 ### ☁️ LangBot Cloud (Recommended)
@@ -74,7 +80,7 @@ uvx langbot
 ```bash
 git clone https://github.com/langbot-app/LangBot
 cd LangBot/docker
-docker compose up -d
+docker compose --profile all up -d
 ```
 ### One-Click Cloud Deploy
@@ -55,6 +55,12 @@ LangBot 是一个**开源的生产级平台**，用于构建 AI 驱动的即时
 ---
 ## 😎 保持更新
 点击[仓库首页](https://github.com/langbot-app/LangBot)右上角 Star 和 Watch 按钮，获取最新动态。
 ![star gif](https://langbot.app/star.gif)
 ## 快速开始
 ### ☁️ LangBot Cloud（推荐）
@@ -74,7 +80,7 @@ uvx langbot
 ```bash
 git clone https://github.com/langbot-app/LangBot
 cd LangBot/docker
-docker compose up -d
+docker compose --profile all up -d
 ```
 ### 一键云部署
@@ -54,6 +54,12 @@ LangBot es una **plataforma de código abierto y grado de producción** para con
 ---
 ## 😎 Manténgase Actualizado
 Haga clic en los botones Star y Watch en la esquina superior derecha del repositorio para obtener las últimas actualizaciones.
 ![star gif](https://langbot.app/star.gif)
 ## Inicio Rápido
 ### ☁️ LangBot Cloud (Recomendado)
@@ -73,7 +79,7 @@ uvx langbot
 ```bash
 git clone https://github.com/langbot-app/LangBot
 cd LangBot/docker
-docker compose up -d
+docker compose --profile all up -d
 ```
 ### Despliegue en la Nube con un Clic
@@ -54,6 +54,12 @@ LangBot est une **plateforme open-source de niveau production** pour créer des
 ---
 ## 😎 Restez à Jour
 Cliquez sur les boutons Star et Watch dans le coin supérieur droit du dépôt pour obtenir les dernières mises à jour.
 ![star gif](https://langbot.app/star.gif)
 ## Démarrage Rapide
 ### ☁️ LangBot Cloud (Recommandé)
@@ -73,7 +79,7 @@ uvx langbot
 ```bash
 git clone https://github.com/langbot-app/LangBot
 cd LangBot/docker
-docker compose up -d
+docker compose --profile all up -d
 ```
 ### Déploiement Cloud en un Clic
@@ -54,6 +54,12 @@ LangBot は、AI搭載のインスタントメッセージングボットを構
 ---
 ## 😎 最新情報を入手
 リポジトリの右上にある Star と Watch ボタンをクリックして、最新の更新を取得してください。
 ![star gif](https://langbot.app/star.gif)
 ## クイックスタート
 ### ☁️ LangBot Cloud（推奨）
@@ -73,7 +79,7 @@ uvx langbot
 ```bash
 git clone https://github.com/langbot-app/LangBot
 cd LangBot/docker
-docker compose up -d
+docker compose --profile all up -d
 ```
 ### ワンクリッククラウドデプロイ
@@ -54,6 +54,12 @@ LangBot은 AI 기반 인스턴트 메시징 봇을 구축하기 위한 **오픈
 ---
 ## 😎 최신 정보 받기
 리포지토리 오른쪽 상단의 Star 및 Watch 버튼을 클릭하여 최신 업데이트를 받으세요.
 ![star gif](https://langbot.app/star.gif)
 ## 빠른 시작
 ### ☁️ LangBot Cloud (추천)
@@ -73,7 +79,7 @@ uvx langbot
 ```bash
 git clone https://github.com/langbot-app/LangBot
 cd LangBot/docker
-docker compose up -d
+docker compose --profile all up -d
 ```
 ### 원클릭 클라우드 배포
@@ -54,6 +54,12 @@ LangBot — это **платформа с открытым исходным к
 ---
 ## 😎 Оставайтесь в курсе
 Нажмите кнопки Star и Watch в правом верхнем углу репозитория, чтобы получать последние обновления.
 ![star gif](https://langbot.app/star.gif)
 ## Быстрый старт
 ### ☁️ LangBot Cloud (Рекомендуется)
@@ -73,7 +79,7 @@ uvx langbot
 ```bash
 git clone https://github.com/langbot-app/LangBot
 cd LangBot/docker
-docker compose up -d
+docker compose --profile all up -d
 ```
 ### Облачное развертывание одним кликом
@@ -56,6 +56,12 @@ LangBot 是一個**開源的生產級平台**，用於建構 AI 驅動的即時
 ---
 ## 😎 保持更新
 點擊倉庫右上角 Star 和 Watch 按鈕，獲取最新動態。
 ![star gif](https://langbot.app/star.gif)
 ## 快速開始
 ### ☁️ LangBot Cloud（推薦）
@@ -75,7 +81,7 @@ uvx langbot
 ```bash
 git clone https://github.com/langbot-app/LangBot
 cd LangBot/docker
-docker compose up -d
+docker compose --profile all up -d
 ```
 ### 一鍵雲端部署
@@ -54,6 +54,12 @@ LangBot là một **nền tảng mã nguồn mở, cấp sản xuất** để x
 ---
 ## 😎 Cập nhật Mới nhất
 Nhấp vào các nút Star và Watch ở góc trên bên phải của kho lưu trữ để nhận các bản cập nhật mới nhất.
 ![star gif](https://langbot.app/star.gif)
 ## Bắt đầu nhanh
 ### ☁️ LangBot Cloud (Khuyên dùng)
@@ -73,7 +79,7 @@ uvx langbot
 ```bash
 git clone https://github.com/langbot-app/LangBot
 cd LangBot/docker
-docker compose up -d
+docker compose --profile all up -d
 ```
 ### Triển khai đám mây một cú nhấp
@@ -62,11 +62,12 @@ services:
      - TZ=Asia/Shanghai
      # Unified env-override convention: SECTION__SUBSECTION__KEY overrides the
      # matching config.yaml field (see LoadConfigStage). These map onto
-      # box.local.* and are forwarded to the Box runtime via INIT RPC.
+      # box.* and are forwarded to the Box runtime via INIT RPC.
      - BOX__LOCAL__HOST_ROOT=${LANGBOT_BOX_ROOT:-${PWD}/data/box}
      - BOX__LOCAL__DEFAULT_WORKSPACE=default
      - BOX__LOCAL__SKILLS_ROOT=skills
      - BOX__LOCAL__ALLOWED_MOUNT_ROOTS=${LANGBOT_BOX_ROOT:-${PWD}/data/box}
      - BOX__DOCKER__CPU_LIMIT_ENABLED=${LANGBOT_BOX_DOCKER_CPU_LIMIT_ENABLED:-true}
    ports:
      - 5300:5300  # For web ui and webhook callback
      - 2280-2285:2280-2285  # For platform reverse connection
@@ -0,0 +1,575 @@
 # HTTP Bot Adapter — Design Document
 > Status: **Implemented** · Branch: `feat/http-bot-adapter` · Author: LangBot core
 >
 > A first-class, **standalone** message-platform adapter (`http_bot`) that lets
 > any external system (e.g. LangBot Space ticketing, an internal back-office, a
 > CRM, a custom web app) talk to a LangBot pipeline over plain HTTP — **inbound**
 > by POSTing messages in, **outbound** by receiving replies on a callback URL —
 > with full support for the pipeline's native N→1 aggregation and 1→M
 > multi-reply semantics, and **without** holding a long-lived WebSocket
 > connection.
 >
 > **Shipped in this branch:**
 > - `src/langbot/pkg/platform/sources/http_bot.yaml` — adapter manifest (auto-discovered)
 > - `src/langbot/pkg/platform/sources/http_bot.py` — `HttpBotAdapter`
 > - `src/langbot/pkg/platform/sources/http_bot_signing.py` — HMAC helpers
 > - `src/langbot/pkg/platform/sources/http_bot.svg` — icon
 > - `docs/platforms/http-bot.md` — integration guide
 > - `docs/http-bot-openapi.json` — machine-readable contract
 > - `examples/http-bot/` — Python + TypeScript reference clients
 >
 > **Final decisions (resolving the original open questions):**
 > 1. Callback URL is **config-only** — never accepted per-message (SSRF closed).
 > 2. **Session reset is provided** — `POST /bots/<uuid>/reset` keyed by `session_id`.
 > 3. Reference **clients are provided** — `examples/http-bot/client.py` + `client.ts`.
 > 4. **Sync convenience mode is included** — `POST /bots/<uuid>/sync` (opt-in, lossy).
 ---
 ## 1. Background & Motivation
 ### 1.1 The concrete need
 LangBot Space wants to use a LangBot pipeline as the brain for **ticket
 handling**. The integration is **server-to-server**: Space's backend pushes a
 user's ticket messages into LangBot and renders LangBot's replies back into the
 ticket thread.
 This interaction is **not** request/response shaped:
 - **N → 1**: a user may fire several messages in a row ("the app crashed" …
  "when I click export" … "here's a screenshot"). The pipeline's
  **message aggregation** feature should debounce and merge these into one turn.
 - **1 → M**: a single turn may yield **multiple** outbound messages — a tool/
  function call narrating progress, a plugin emitting several cards, a streamed
  answer split into chunks.
 ### 1.2 Why the existing options don't fit
 LangBot today exposes exactly one externally-reachable way to drive a pipeline
 that is **not** tied to a specific IM vendor: the **WebSocket** path
 (`/api/v1/pipelines/<uuid>/ws/connect` for dashboard debug, and
 `/api/v1/embed/<bot_uuid>/ws/connect` for the embeddable web widget).
 For a server-to-server integration the WebSocket path has real friction:
 | Problem | Detail |
 |---|---|
 | Long-lived connection | Caller must maintain a socket, heartbeats, and reconnect logic for what is fundamentally a fire-and-collect workload. |
 | Session identity | Inbound messages are keyed by the transient `connection_id` (`websocket_{connection_id}`); the caller **cannot supply a stable, business-meaningful session id** (e.g. a ticket number). Multi-ticket isolation is not expressible. |
 | Auth mismatch | The debug socket is gated by the **dashboard JWT** (must not be handed to an external service); the embed socket is gated by **Cloudflare Turnstile** (a *browser* human-check that a backend cannot satisfy). Neither is a server-to-server credential. |
 | In-memory, single-process state | Session history lives in process memory and is lost on restart. |
 > **Key realisation.** The N→1 / 1→M behaviour the caller wants is **not**
 > provided by WebSocket — it is provided by the **pipeline** (aggregation +
 > the adapter being free to call `reply_message` any number of times). It is
 > therefore **transport-independent**. We can deliver the exact same semantics
 > over a far lighter HTTP transport.
 ### 1.3 Why a *new, standalone* adapter (not a refactor of an existing one)
 The brief is explicit: **do not reuse / fork an existing vendor adapter.** The
 vendor adapters (`lark`, `wecom`, `qqofficial`, `slack`, …) carry vendor-specific
 signature schemes, payload shapes, and message-segment mappings. Bending one of
 them into a "generic" mode would couple a public integration surface to one
 vendor's quirks and make the developer experience worse for everyone.
 Instead we ship `http_bot` as a clean, independent adapter whose **entire
 contract is LangBot's own** — documented, versioned, and designed front-to-back
 around *integrator* developer experience.
 ---
 ## 2. Goals & Non-Goals
 ### Goals
 - **G1** A standalone `http_bot` adapter, selectable like any other platform
  adapter in the dashboard, with its own config schema and docs.
 - **G2** **Inbound**: external systems POST messages to a stable LangBot URL,
  carrying a **caller-defined `session_id`** that maps 1:1 to a LangBot session.
 - **G3** **Outbound**: LangBot delivers each reply by POSTing to a
  caller-configured **callback URL**; one turn may produce **many** callbacks.
 - **G4** Preserve pipeline-native **N→1 aggregation** and **1→M multi-reply**.
 - **G5** Server-to-server **auth**: shared-secret HMAC request signing both
  directions (no JWT, no Turnstile, no long-lived socket).
 - **G6** **Great DX**: copy-pasteable curl, a tiny reference client, an OpenAPI
  fragment, idempotency, clear error envelope, and a local echo-server recipe.
 ### Non-Goals
 - Not replacing or deprecating the WebSocket / embed widget path (that remains
  the right tool for *browser*, real-time, streaming chat UIs).
 - Not a synchronous "one request → one response" RPC (explicitly rejected: it
  cannot express 1→M; see §9 for the optional sync convenience mode).
 - No built-in message **persistence/replay** in v1 (callbacks are at-least-once
  best-effort; durability is the caller's responsibility — see §8).
 - No multi-tenant API-key management UI in v1 (one secret per bot; see §11).
 ---
 ## 3. How LangBot routes a message (the parts we plug into)
 Understanding the existing flow is what makes this adapter cheap. A message
 flows through these stages (verified against current `master`):
 ```
                INBOUND                                         OUTBOUND
 external POST ─┐                                       ┌─ reply_message()
               ▼                                        │  reply_message_chunk()
  POST /bots/<bot_uuid>            (unified webhook router, AuthType.NONE)
               │  webhooks.py → adapter.handle_unified_webhook(bot_uuid, path, request)
               ▼                                        │
  HttpBotAdapter.handle_unified_webhook                 │  (called 0..N times
   • verify HMAC signature                              │   per turn by the
   • parse {session_id, message[]}                      │   pipeline / plugins)
   • build FriendMessage / GroupMessage                 │
   • fire registered listener  ───────────────┐        │
               │                               │        │
               ▼                               ▼        │
  botmgr.on_friend_message / on_group_message           │
   • (optional) webhook_pusher fan-out                  │
   • msg_aggregator.add_message(...) ── N→1 debounce ──►│
               │                                        │
               ▼                                        │
  query_pool → pipeline.run()  ─── invokes adapter ─────┘
                                    reply methods 1..M times
 ```
 Three framework facts we rely on:
 1. **N→1 aggregation is free.** `botmgr` hands every inbound event to
   `self.ap.msg_aggregator.add_message(...)`, which debounces per
   `session_id` and merges consecutive messages into one pipeline turn
   (`pkg/pipeline/aggregator.py`). The adapter does nothing special.
 2. **1→M is free.** The pipeline (and any plugin in the chain) calls
   `adapter.reply_message()` / `reply_message_chunk()` **as many times as it
   wants** per turn. The adapter's only job is to deliver each call outward.
   For `http_bot` that means: **one outbound callback POST per call.**
 3. **A unified inbound route already exists.** `WebhookRouterGroup`
   (`pkg/api/http/controller/groups/webhooks.py`) maps
   `POST /bots/<bot_uuid>[/<path>]` (auth `NONE`) to
   `adapter.handle_unified_webhook(bot_uuid, path, request)`. `http_bot`
   implements that method and is reachable **without registering any new
   route** — it does its own signature verification, exactly like the vendor
   webhook adapters do.
 > Net new code is essentially: one `http_bot.py` adapter, one `http_bot.yaml`
 > schema, signing helpers, and docs. No router, aggregator, or pipeline changes.
 ---
 ## 4. Architecture Overview
 ```
 ┌────────────────────┐         (1) inbound: POST signed message
 │  External system   │  ──────────────────────────────────────────────►  ┌──────────────────────┐
 │ (LangBot Space,    │         POST /bots/<bot_uuid>                      │      LangBot         │
 │  CRM, web app …)   │         X-LB-Signature, X-LB-Timestamp             │                      │
 │                    │         { session_id, message:[...] }              │  HttpBotAdapter      │
 │  - callback server │  ◄──────────────────────────────────────────────  │   (platform/sources) │
 │    (receives       │         (4) outbound: POST signed reply(s)         │                      │
 │     replies)       │         POST <callback_url>                        │  pipeline + aggregator│
 └────────────────────┘         X-LB-Signature, X-LB-Timestamp            └──────────────────────┘
                               { session_id, sequence, is_final,
                                 message:[...] }      (sent 1..M times)
 ```
 - The adapter is **stateless across requests** at the HTTP layer; session
  continuity is carried by `session_id` and resolved by LangBot's normal
  session manager.
 - **Inbound** and **outbound** are **independent HTTP exchanges**. LangBot does
  not answer the inbound POST with the pipeline result; it `202 Accepts` it and
  later POSTs the reply(s) to the callback URL. This is what makes 1→M natural.
 ---
 ## 5. Configuration Schema (`http_bot.yaml`)
 Follows the existing `MessagePlatformAdapter` manifest convention (cf.
 `slack.yaml`). Fields:
 | field | type | required | purpose |
 |---|---|---|---|
 | `inbound_secret` | string (secret) | yes | HMAC key the **caller** uses to sign inbound POSTs; LangBot verifies. |
 | `callback_url` | string (url) | no* | Where LangBot POSTs replies. *Not marked required in the manifest, but replies have no other destination: per the final decisions above, a per-message `callback_url` is not accepted. |
 | `outbound_secret` | string (secret) | no | HMAC key LangBot uses to sign outbound callbacks; caller verifies. Defaults to `inbound_secret` if empty. |
 | `default_session_type` | enum `person`/`group` | no | Default when a message omits `session_type`. Default `person`. |
 | `signature_required` | bool | no | If `false`, skip inbound signature check (dev only; logs a warning). Default `true`. |
 | `callback_timeout` | int (seconds) | no | Per-callback HTTP timeout. Default `15`. |
 | `callback_max_retries` | int | no | Retries on 5xx/timeout with backoff. Default `3`. |
 | `webhook_url` | webhook-url (display) | — | Read-only field rendering the inbound URL `…/bots/<bot_uuid>` for copy-paste, like other webhook adapters. |
 Manifest sketch (i18n labels elided for brevity):
 ```yaml
 apiVersion: v1
 kind: MessagePlatformAdapter
 metadata:
  name: http_bot
  label: { en_US: "HTTP Bot", zh_Hans: "HTTP 通用接入" }
  description:
    en_US: "Integrate any backend over plain HTTP. Push messages in, receive replies on a callback URL. Server-to-server, no long-lived connection."
    zh_Hans: "通过 HTTP 接入任意后端系统。推入消息、在回调地址接收回复。面向服务间集成，无需长连接。"
  icon: http_bot.svg
 spec:
  categories: [popular, global]
  help_links:
    zh: https://docs.langbot.app/zh/platforms/http-bot
    en: https://docs.langbot.app/en/platforms/http-bot
  config:
    - { name: inbound_secret,       type: string, required: true,  default: "" }
    - { name: callback_url,         type: string, required: false, default: "" }
    - { name: outbound_secret,      type: string, required: false, default: "" }
    - { name: default_session_type, type: select, required: false, default: "person",
        options: [person, group] }
    - { name: signature_required,   type: boolean, required: false, default: true }
    - { name: callback_timeout,     type: integer, required: false, default: 15 }
    - { name: callback_max_retries, type: integer, required: false, default: 3 }
    - { name: webhook_url,          type: webhook-url, required: false, default: "" }
 execution:
  python:
    path: ./http_bot.py
    attr: HttpBotAdapter
 ```
 ---
 ## 6. The HTTP Contract (this is the DX surface)
 ### 6.1 Inbound — push a message into LangBot
 ```
 POST /bots/{bot_uuid}
 Content-Type: application/json
 X-LB-Timestamp: 1718000000
 X-LB-Signature: sha256=<hex hmac>
 X-LB-Idempotency-Key: <uuid>        # optional, dedup window
 ```
 Body:
 ```jsonc
 {
  "session_id": "ticket-10293",        // REQUIRED. Caller-defined. Maps 1:1 to a LangBot session.
  "session_type": "person",            // optional, "person" | "group"; default from config
  "sender": {                          // optional metadata, surfaced to pipeline/plugins
    "id": "user-5567",
    "name": "Alice"
  },
  "message": [                         // REQUIRED. A LangBot MessageChain (list of segments).
    { "type": "Plain", "text": "Export keeps failing on the dashboard." },
    { "type": "Image", "url": "https://.../screenshot.png" }
  ]
 }
 ```
 Response (LangBot does **not** block on the pipeline):
 ```jsonc
 // 202 Accepted
 {
  "code": 0,
  "msg": "accepted",
  "data": {
    "session_id": "ticket-10293",
    "accepted_message_id": "in_01H....",   // server-assigned id for this inbound message
    "aggregating": true                    // true if buffered by the aggregator
  }
 }
 ```
 **N→1 in practice.** Fire three POSTs with the same `session_id` inside the
 aggregation window → the pipeline runs **once** with the three messages merged.
 No special flag needed; this is the aggregator's default behaviour when enabled
 on the pipeline.
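The debounce-and-merge behaviour described above can be pictured with a toy aggregator. This is only a sketch and not LangBot's `msg_aggregator`: `ToyAggregator`, `window`, and `on_turn` are hypothetical names, and the real aggregation window is whatever the pipeline is configured with.

```python
import asyncio
from collections import defaultdict


class ToyAggregator:
    """Hypothetical per-session debouncer (not LangBot's real aggregator)."""

    def __init__(self, window: float, on_turn):
        self.window = window          # quiet period, in seconds, before a turn fires
        self.on_turn = on_turn        # called as on_turn(session_id, merged_messages)
        self._buffers = defaultdict(list)
        self._timers = {}

    def add_message(self, session_id: str, message: str) -> None:
        self._buffers[session_id].append(message)
        timer = self._timers.get(session_id)
        if timer is not None:
            timer.cancel()            # a new message restarts the window
        self._timers[session_id] = asyncio.get_running_loop().call_later(
            self.window, self._flush, session_id
        )

    def _flush(self, session_id: str) -> None:
        self._timers.pop(session_id, None)
        merged = self._buffers.pop(session_id, [])
        if merged:
            self.on_turn(session_id, merged)


async def demo() -> list:
    turns = []
    agg = ToyAggregator(0.05, lambda sid, msgs: turns.append((sid, msgs)))
    for text in ("the app crashed", "when I click export", "here's a screenshot"):
        agg.add_message("ticket-10293", text)
    await asyncio.sleep(0.2)          # let the quiet period elapse
    return turns
```

Running `asyncio.run(demo())` yields a single turn containing all three messages, which is the N→1 shape a caller should expect when aggregation is enabled.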
 ### 6.2 Outbound — LangBot delivers replies to your callback
 For each `reply_message` / `reply_message_chunk` the pipeline emits, LangBot
 POSTs to `callback_url`:
 ```
 POST {callback_url}
 Content-Type: application/json
 X-LB-Timestamp: 1718000001
 X-LB-Signature: sha256=<hex hmac over body>
 ```
 Body:
 ```jsonc
 {
  "session_id": "ticket-10293",         // echoes the inbound session
  "reply_to": "in_01H....",             // the inbound message id this answers
  "sequence": 1,                        // 1-based ordinal within this turn (for 1→M ordering)
  "is_final": false,                    // false for intermediate/streamed parts
  "stream": false,                      // true when this is a streamed chunk
  "message": [
    { "type": "Plain", "text": "Looking into it — checking your export logs…" }
  ],
  "timestamp": "2026-06-22T09:00:01Z"
 }
 ```
 **1→M in practice.** A turn that fires a function call then a final answer
 produces e.g.:
 ```
 POST callback  → { sequence: 1, is_final: false, message: ["Checking logs…"] }
 POST callback  → { sequence: 2, is_final: false, message: ["Found 2 failed exports."] }
 POST callback  → { sequence: 3, is_final: true,  message: ["Fixed. Try again now."] }
 ```
 The caller stitches by `session_id` + `sequence`, and knows the turn is complete
 when `is_final: true` arrives.
 Your callback endpoint should return `200` quickly. A non-2xx triggers retry
 with backoff (`callback_max_retries`).
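On the receiving side, stitching by `session_id` + `sequence` might look like the sketch below. `TurnCollector` is a hypothetical helper, not part of the shipped reference clients. It assumes sequences start at 1 and that the `is_final` part carries the highest sequence of its turn, and it ignores retried duplicates because delivery is at-least-once.

```python
from collections import defaultdict


class TurnCollector:
    """Hypothetical receiver-side helper: groups callback payloads into turns."""

    def __init__(self):
        self._parts = defaultdict(dict)   # (session_id, reply_to) -> {sequence: payload}
        self._final_seq = {}              # (session_id, reply_to) -> sequence of the final part

    def add(self, payload: dict):
        """Record one callback body; return the ordered parts once the turn is complete."""
        key = (payload["session_id"], payload.get("reply_to"))
        seq = payload["sequence"]
        self._parts[key].setdefault(seq, payload)   # retried duplicates are ignored
        if payload.get("is_final"):
            self._final_seq[key] = seq
        final = self._final_seq.get(key)
        if final is None or len(self._parts[key]) < final:
            return None                             # still waiting for parts
        parts = self._parts.pop(key)
        del self._final_seq[key]
        return [parts[i] for i in sorted(parts)]
```

A callback handler would call `add()` with each decoded body and render the turn only when a list comes back.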
 ### 6.3 Error envelope (inbound)
 Consistent, machine-readable; never leak internals:
 ```jsonc
 { "code": 40101, "msg": "invalid signature", "data": null }
 ```
 | HTTP | code | meaning |
 |---|---|---|
 | 202 | 0 | accepted |
 | 400 | 40001 | malformed body / missing `session_id` or `message` |
 | 401 | 40101 | bad/expired signature |
 | 403 | 40301 | bot disabled |
 | 404 | 40401 | bot_uuid not found / not an `http_bot` adapter |
 | 409 | 40901 | duplicate idempotency key (already accepted) |
 | 413 | 41301 | message too large |
 | 500 | 50001 | internal error |
 ---
 ## 7. Signing scheme (both directions)
 Symmetric, dependency-free HMAC-SHA256 — trivial to implement in any language.
 ```
 signing_string = "{timestamp}.{raw_request_body}"
 signature      = "sha256=" + hex(HMAC_SHA256(secret, signing_string))
 ```
 Verification rules:
 - Reject if `|now - timestamp| > 300s` (replay window).
 - Constant-time compare (`hmac.compare_digest`).
 - Inbound verified with `inbound_secret`; outbound signed with
  `outbound_secret` (falls back to `inbound_secret`).
 - `signature_required: false` bypasses verification **and logs a warning** —
  intended only for local development behind a trusted network.
 Reference (Python, ~6 lines):
 ```python
 import hmac, hashlib, time
 def sign(secret: str, body: bytes, ts: int | None = None) -> tuple[str, str]:
    ts = int(time.time()) if ts is None else ts
    mac = hmac.new(secret.encode(), f"{ts}.".encode() + body, hashlib.sha256)
    return str(ts), "sha256=" + mac.hexdigest()
 ```
 ---
 ## 8. Delivery semantics & reliability
 - **Inbound**: `202 Accepted` means *queued*, not *processed*. Use
  `X-LB-Idempotency-Key` to make client retries safe (dedup window, e.g. 10 min).
 - **Outbound**: **at-least-once**, best-effort. Retries on timeout/5xx with
  exponential backoff up to `callback_max_retries`. Callbacks for one
  `session_id` are delivered **in `sequence` order** (serialised per session);
  across sessions they may interleave.
 - **No persistence in v1**: if LangBot restarts mid-turn, in-flight callbacks
  may be lost. Durable replay is deferred (see §13). Callers needing exactly-once
  should dedup on `(session_id, reply_to, sequence)`.
 - **Backpressure**: the adapter must not block the pipeline on slow callbacks —
  outbound POSTs run on a per-session ordered queue with the configured timeout.
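The outbound retry policy can be sketched as below. This is an illustration under assumptions rather than the adapter's code: `deliver` and `post` are hypothetical names, `post` stands in for whatever HTTP client performs the callback and returns a status code, and the base delay and doubling factor are made up.

```python
import asyncio


async def deliver(post, payload: dict, max_retries: int = 3, base_delay: float = 0.5) -> bool:
    """Try one callback POST, retrying on timeout or 5xx with exponential backoff.

    `post` is a hypothetical async callable that returns an HTTP status code.
    """
    for attempt in range(max_retries + 1):
        try:
            status = await post(payload)
        except asyncio.TimeoutError:
            status = None                   # treat a timeout like a retryable failure
        if status is not None and status < 500:
            return 200 <= status < 300      # delivered, or a non-retryable rejection
        if attempt < max_retries:
            await asyncio.sleep(base_delay * 2 ** attempt)
    return False
```

Because a retry can follow a callback that the receiver actually processed, the receiver still needs to dedup on `(session_id, reply_to, sequence)`.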
 ---
 ## 9. Optional: synchronous convenience mode (v1.1, behind a flag)
 Some simple callers genuinely want "POST a message, get the reply in the HTTP
 response" and don't care about streaming/multi-part. We can offer an **opt-in**
 sync endpoint that internally waits for `is_final` and **collapses** all 1→M
 parts into one array:
 ```
 POST /bots/{bot_uuid}/sync     →    200 { session_id, message: [ ...all parts concatenated... ] }
 ```
 Implemented by attaching a per-request future that resolves on the final reply,
 with a hard timeout. This is a **convenience wrapper** over the same machinery,
 explicitly documented as lossy for streaming/ordering. Not in v1 core.
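The collapsing step is where the sync mode loses information. A minimal sketch, assuming each part has the callback body shape from §6.2 (`collapse` is a hypothetical helper name):

```python
def collapse(parts: list[dict]) -> list[dict]:
    """Concatenate the `message` segments of all parts, ordered by `sequence`."""
    merged = []
    for part in sorted(parts, key=lambda p: p["sequence"]):
        merged.extend(part["message"])
    return merged
```

The result keeps the segments in order but drops `sequence`, `is_final`, and `stream`, so the caller can no longer tell which segments arrived together.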
 ---
 ## 10. Adapter implementation sketch (`platform/sources/http_bot.py`)
 Implements `AbstractMessagePlatformAdapter`. Key methods:
 ```python
 class HttpBotAdapter(AbstractMessagePlatformAdapter):
    listeners: dict = pydantic.Field(default_factory=dict, exclude=True)
    # --- inbound -------------------------------------------------------
    async def handle_unified_webhook(self, bot_uuid, path, request):
        body = await request.get_body()
        if self.config.get("signature_required", True):
            if not self._verify(request, body):
                return jsonify({"code": 40101, "msg": "invalid signature"}), 401
        data = json.loads(body)
        session_id    = data["session_id"]                 # caller-defined identity
        session_type  = data.get("session_type", self.config.get("default_session_type", "person"))
        chain         = MessageChain.model_validate(data["message"])
        event         = self._build_event(session_type, session_id, data.get("sender"), chain)
        # remember where to send replies for this session (config-only: a per-message
        # callback_url is not accepted, see §15)
        self._callback_for[session_id] = self.config.get("callback_url")
        # fire the registered listener → botmgr → msg_aggregator (N→1) → pipeline
        if type(event) in self.listeners:
            asyncio.create_task(self.listeners[type(event)](event, self))
        return jsonify({"code": 0, "msg": "accepted",
                        "data": {"session_id": session_id, "accepted_message_id": event.message_id}}), 202
    # --- outbound (called 1..M times per turn by the pipeline) ---------
    async def reply_message(self, message_source, message, quote_origin=False):
        return await self._post_callback(message_source, message, is_final=True, stream=False)
    async def reply_message_chunk(self, message_source, bot_message, message,
                                  quote_origin=False, is_final=False):
        return await self._post_callback(message_source, message, is_final=is_final, stream=True)
    async def is_stream_output_supported(self) -> bool:
        return True
    def register_listener(self, event_type, func):   self.listeners[event_type] = func
    def unregister_listener(self, event_type, func): self.listeners.pop(event_type, None)
    async def run_async(self): pass     # nothing to poll; purely webhook-driven
    async def kill(self): pass
 ```
 `_post_callback` resolves the session's callback URL, assigns the next
 `sequence`, signs the body, and enqueues an ordered, retrying POST.
 Session→callback mapping is kept in a small in-memory dict keyed by
 `session_id` (acceptable for v1; a turn's callback URL is captured at inbound
 time so replies always have a destination even if config later changes).
 ---
 ## 11. Security considerations
 - **Inbound route is `AuthType.NONE`** at the framework level (same as all
  webhook adapters) — the adapter **must** enforce HMAC itself. Default
  `signature_required: true`.
 - **Timestamp window** (±300s) + idempotency key blunt replay.
 - **SSRF on callback_url**: validate scheme (`https` in prod), and consider an
  allow-list / block of private CIDRs since LangBot initiates the POST. Document
  this; enforce in code where feasible.
 - **Secret storage**: secrets live in the bot's `adapter_config` like every
  other adapter credential; surfaced as `type: string`/secret in the dashboard.
 - **One secret per bot** in v1. Per-caller key rotation / multiple keys is a
  future enhancement (§13).
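A callback URL check along the lines of the SSRF bullet above might look like this. It is a sketch only: `callback_url_allowed` is a hypothetical name, and it inspects literal IP addresses without resolving hostnames, so a production guard would also need to validate the address that DNS actually returns.

```python
import ipaddress
from urllib.parse import urlsplit


def callback_url_allowed(url: str, require_https: bool = True) -> bool:
    """Reject non-HTTP(S) schemes and literal non-public IP addresses."""
    parts = urlsplit(url)
    allowed_schemes = {"https"} if require_https else {"https", "http"}
    if parts.scheme not in allowed_schemes:
        return False
    host = parts.hostname
    if not host:
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. DNS resolution is NOT checked in this sketch.
        return host.lower() != "localhost"
    return ip.is_global    # False for private, loopback, and link-local ranges
```

Running the check both when the config is saved and immediately before each POST narrows the window in which a changed DNS record could point the callback at an internal address.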
 ---
 ## 12. Developer Experience (explicit deliverables)
 The whole point of a standalone adapter is that **integrating is pleasant**. v1
 ships:
 1. **`docs/platforms/http-bot.md`** — task-oriented integration guide:
   create the bot → copy inbound URL → set secret → stand up a callback
   endpoint → send first message → handle 1→M.
 2. **Copy-paste curl** for the first message (with a working signing one-liner).
 3. **Reference clients** (≤50 LOC each) in `examples/http-bot/`:
   `client.py` (push + a Flask/Quart callback receiver) and `client.ts`.
 4. **OpenAPI fragment** `docs/http-bot-openapi.json` describing inbound +
   callback shapes, so integrators can codegen.
 5. **Local echo recipe**: a one-command callback server that prints every
   reply, so a developer sees N→1 and 1→M working in under five minutes.
 6. **Postman/Hoppscotch collection** (nice-to-have).
 DX acceptance check: *a developer who has never seen LangBot can, from the docs
 alone, push a message and observe a multi-part reply on their callback within
 10 minutes.*
 ### Quickstart (curl)
 ```bash
 BOT=https://your-langbot/bots/2f1c....
 SECRET=supersecret
 BODY='{"session_id":"ticket-10293","message":[{"type":"Plain","text":"hello"}]}'
 TS=$(date +%s)
 SIG="sha256=$(printf '%s.%s' "$TS" "$BODY" | openssl dgst -sha256 -hmac "$SECRET" -r | cut -d' ' -f1)"
 curl -sS -X POST "$BOT" \
  -H "Content-Type: application/json" \
  -H "X-LB-Timestamp: $TS" \
  -H "X-LB-Signature: $SIG" \
  -d "$BODY"
 ```
 ---
 ## 13. Future work
 - **Durable outbound queue** (persist + replay across restarts; exactly-once).
 - **Per-caller API keys** with rotation and scopes (multi-tenant Space usage).
 - **Sync convenience endpoint** (§9) once core is stable.
 - **Server-Sent Events outbound option** for callers that *do* want a stream but
  not a full duplex socket — single GET, server pushes chunks.
 - **Dashboard "test console"** for `http_bot` (send a message, watch callbacks)
  mirroring the existing WebSocket debug panel.
 ---
 ## 14. Rollout / task breakdown
 | # | Task | Touches |
 |---|---|---|
 | 1 | `http_bot.yaml` manifest + icon | `platform/sources/` |
 | 2 | `HttpBotAdapter` (inbound verify, event build, outbound queue) | `platform/sources/http_bot.py` |
 | 3 | Signing helper module (shared) | `platform/sources/` or `utils/` |
 | 4 | i18n strings (en/zh/ja) | adapter yaml + web locale |
 | 5 | Integration docs `docs/platforms/http-bot.md` | `docs/` |
 | 6 | OpenAPI fragment + reference clients | `docs/`, `examples/http-bot/` |
 | 7 | Tests: signature verify, N→1 aggregation, 1→M ordering, retry | `tests/` |
 | 8 | (opt) SSRF guard for callback_url | adapter |
 No changes required to: the unified webhook router, the aggregator, the query
 pool, or the pipeline. That is the design's main payoff.
 ---
 ## 15. Resolved decisions
 1. **Callback URL trust** — **config-only.** The inbound message may not carry a
   `callback_url`; replies always go to the bot-config URL. Closes the SSRF
   vector where a leaked inbound secret could redirect replies.
 2. **Session lifecycle** — **`POST /bots/<uuid>/reset`** (body `{session_id,
   session_type?}`) drops the matching session from the session manager; the
   next message starts a fresh conversation. Implemented via sub-path routing in
   `handle_unified_webhook`.
 3. **Group semantics** — for `session_type: group`, `session_id` is the group/
   launcher id; `sender.id` (and optional `sender.group_name`) identify the
   member. A Space ticket maps to one `session_id`.
 4. **Backpressure** — bounded per-session outbound queue (maxlen 1000); on
   overflow the oldest reply is dropped and a warning logged, so a persistently
   down callback can never exhaust memory.
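The drop-oldest behaviour in decision 4 maps naturally onto `collections.deque` with `maxlen`. The sketch below is an assumption about shape, not the adapter's code: `BoundedOutbox` and its method names are hypothetical.

```python
import logging
from collections import deque

log = logging.getLogger("http_bot.sketch")


class BoundedOutbox:
    """Hypothetical per-session outbound buffer that drops the oldest reply when full."""

    def __init__(self, maxlen: int = 1000):
        self._queues: dict[str, deque] = {}
        self._maxlen = maxlen

    def enqueue(self, session_id: str, reply: dict) -> None:
        q = self._queues.setdefault(session_id, deque(maxlen=self._maxlen))
        if len(q) == q.maxlen:
            log.warning("outbound queue full for %s; dropping oldest reply", session_id)
        q.append(reply)      # deque(maxlen=...) discards from the left when full

    def next_reply(self, session_id: str):
        """Pop the oldest pending reply for a session, or None if nothing is queued."""
        q = self._queues.get(session_id)
        return q.popleft() if q else None
```

Keeping one deque per `session_id` preserves per-session ordering while capping memory at `maxlen` replies per session.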
 ### Still open / deferred (see §13)
 - Durable outbound queue (persist + replay across restarts).
 - Per-caller API keys with rotation/scopes for multi-tenant Space usage.
 - SSE outbound option and a dashboard test console.
@@ -0,0 +1,198 @@
 {
  "openapi": "3.0.3",
  "info": {
    "title": "LangBot HTTP Bot Adapter",
    "version": "1.0.0",
    "description": "Server-to-server HTTP integration for a LangBot pipeline. Inbound messages are POSTed to the unified webhook route; replies are delivered to a configured callback URL (one POST per reply part). All requests are HMAC-SHA256 signed. See docs/platforms/http-bot.md."
  },
  "paths": {
    "/bots/{bot_uuid}": {
      "post": {
        "summary": "Push a message into the pipeline (fire-and-collect)",
        "description": "Returns 202 immediately. Replies arrive asynchronously on the configured callback URL. Reuse the same session_id within the aggregation window to merge multiple messages into one turn (N->1).",
        "parameters": [
          { "$ref": "#/components/parameters/BotUuid" },
          { "$ref": "#/components/parameters/Timestamp" },
          { "$ref": "#/components/parameters/Signature" },
          { "$ref": "#/components/parameters/Idempotency" }
        ],
        "requestBody": {
          "required": true,
          "content": { "application/json": { "schema": { "$ref": "#/components/schemas/InboundMessage" } } }
        },
        "responses": {
          "202": {
            "description": "Accepted (queued for the pipeline)",
            "content": { "application/json": { "schema": { "$ref": "#/components/schemas/AcceptedResponse" } } }
          },
          "400": { "$ref": "#/components/responses/Error" },
          "401": { "$ref": "#/components/responses/Error" },
          "409": { "$ref": "#/components/responses/Error" },
          "413": { "$ref": "#/components/responses/Error" }
        }
      }
    },
    "/bots/{bot_uuid}/sync": {
      "post": {
        "summary": "Push a message and wait for the collapsed reply",
        "description": "Blocking convenience mode. Waits for is_final and returns all reply parts collapsed into one array. Lossy (no sequence/streaming). One in-flight sync per session_id.",
        "parameters": [
          { "$ref": "#/components/parameters/BotUuid" },
          { "$ref": "#/components/parameters/Timestamp" },
          { "$ref": "#/components/parameters/Signature" }
        ],
        "requestBody": {
          "required": true,
          "content": { "application/json": { "schema": { "$ref": "#/components/schemas/InboundMessage" } } }
        },
        "responses": {
          "200": {
            "description": "The collapsed reply",
            "content": { "application/json": { "schema": { "$ref": "#/components/schemas/SyncResponse" } } }
          },
          "400": { "$ref": "#/components/responses/Error" },
          "401": { "$ref": "#/components/responses/Error" }
        }
      }
    },
    "/bots/{bot_uuid}/reset": {
      "post": {
        "summary": "Reset a session's conversation",
        "parameters": [
          { "$ref": "#/components/parameters/BotUuid" },
          { "$ref": "#/components/parameters/Timestamp" },
          { "$ref": "#/components/parameters/Signature" }
        ],
        "requestBody": {
          "required": true,
          "content": {
            "application/json": {
              "schema": {
                "type": "object",
                "required": ["session_id"],
                "properties": {
                  "session_id": { "type": "string" },
                  "session_type": { "type": "string", "enum": ["person", "group"] }
                }
              }
            }
          }
        },
        "responses": {
          "200": { "description": "Reset done" },
          "400": { "$ref": "#/components/responses/Error" },
          "401": { "$ref": "#/components/responses/Error" }
        }
      }
    }
  },
  "components": {
    "parameters": {
      "BotUuid": {
        "name": "bot_uuid", "in": "path", "required": true,
        "schema": { "type": "string", "format": "uuid" }
      },
      "Timestamp": {
        "name": "X-LB-Timestamp", "in": "header", "required": true,
        "description": "Unix seconds; rejected if more than +/-300s from server time.",
        "schema": { "type": "string" }
      },
      "Signature": {
        "name": "X-LB-Signature", "in": "header", "required": true,
        "description": "sha256=<hex> of HMAC-SHA256(secret, \"{timestamp}.\" + raw_body).",
        "schema": { "type": "string" }
      },
      "Idempotency": {
        "name": "X-LB-Idempotency-Key", "in": "header", "required": false,
        "description": "Dedup key; a repeat within the dedup window returns 409.",
        "schema": { "type": "string" }
      }
    },
    "schemas": {
      "Segment": {
        "type": "object",
        "required": ["type"],
        "properties": {
          "type": { "type": "string", "enum": ["Plain", "Image", "Voice", "File", "At", "Quote"] },
          "text": { "type": "string", "description": "For type=Plain." },
          "url": { "type": "string", "description": "For media types." },
          "base64": { "type": "string", "description": "For media types (data URI or raw base64)." }
        }
      },
      "InboundMessage": {
        "type": "object",
        "required": ["session_id", "message"],
        "properties": {
          "session_id": { "type": "string", "description": "Caller-defined; maps 1:1 to a LangBot session." },
          "session_type": { "type": "string", "enum": ["person", "group"], "default": "person" },
          "sender": {
            "type": "object",
            "properties": {
              "id": { "type": "string" },
              "name": { "type": "string" },
              "group_name": { "type": "string", "description": "For session_type=group." }
            }
          },
          "message": { "type": "array", "items": { "$ref": "#/components/schemas/Segment" } }
        }
      },
      "AcceptedResponse": {
        "type": "object",
        "properties": {
          "code": { "type": "integer", "example": 0 },
          "msg": { "type": "string", "example": "accepted" },
          "data": {
            "type": "object",
            "properties": {
              "session_id": { "type": "string" },
              "accepted_message_id": { "type": "string", "example": "in_01H..." },
              "aggregating": { "type": "boolean" }
            }
          }
        }
      },
      "SyncResponse": {
        "type": "object",
        "properties": {
          "code": { "type": "integer", "example": 0 },
          "msg": { "type": "string", "example": "ok" },
          "data": {
            "type": "object",
            "properties": {
              "session_id": { "type": "string" },
              "reply_to": { "type": "string" },
              "message": { "type": "array", "items": { "$ref": "#/components/schemas/Segment" } }
            }
          }
        }
      },
      "Callback": {
        "type": "object",
        "description": "Delivered by LangBot to your callback_url, one POST per reply part. Signed with the outbound secret.",
        "properties": {
          "session_id": { "type": "string" },
          "reply_to": { "type": "string", "description": "The accepted_message_id this answers." },
          "sequence": { "type": "integer", "description": "1-based ordinal within the turn." },
          "is_final": { "type": "boolean", "description": "True on the last part of the turn." },
          "stream": { "type": "boolean" },
          "message": { "type": "array", "items": { "$ref": "#/components/schemas/Segment" } },
          "timestamp": { "type": "string", "format": "date-time" }
        }
      },
      "ErrorEnvelope": {
        "type": "object",
        "properties": {
          "code": { "type": "integer", "example": 40101 },
          "msg": { "type": "string", "example": "invalid signature: signature_mismatch" },
          "data": { "nullable": true }
        }
      }
    },
    "responses": {
      "Error": {
        "description": "Error envelope",
        "content": { "application/json": { "schema": { "$ref": "#/components/schemas/ErrorEnvelope" } } }
      }
    }
  }
 }
@@ -0,0 +1,256 @@
 # HTTP Bot Adapter — Integration Guide
 Integrate **any backend system** with a LangBot pipeline over plain HTTP. Push
 messages in via a signed webhook; receive replies on a callback URL. No
 long-lived connection, full support for message **aggregation** (many inbound
 messages merged into one turn) and **multi-part replies** (one turn → many
 outbound messages).
 This is the right adapter for **server-to-server** integrations — ticketing
 systems, CRMs, internal tools, custom web backends. (For an in-browser,
 real-time chat widget, use the embeddable Web Page Bot instead.)
 > **5-minute goal:** stand up a callback receiver, send a message, and watch a
 > multi-part reply arrive — using the reference client in
 > [`examples/http-bot/`](../../examples/http-bot/).
 ---
 ## 1. Mental model
 ```
 Your backend  ──(1) POST signed message──►  LangBot   /bots/<bot_uuid>
                                            (pipeline runs: aggregate → think → reply)
 Your callback ◄─(2) POST signed reply(s)──  LangBot   one POST per reply part
 ```
 - **(1) Inbound** is *fire-and-collect*: LangBot answers `202 Accepted`
  immediately and does **not** return the pipeline result on that response.
 - **(2) Outbound** replies arrive later as separate signed POSTs to your
  `callback_url`. A single turn may produce **several** callbacks (e.g. a tool
  call narration followed by the final answer).
 - Everything is keyed by a **`session_id` you choose** (e.g. a ticket number).
  Each `session_id` maps to one isolated LangBot conversation.
 ---
 ## 2. Create the bot
 1. In the LangBot dashboard, add a bot and choose the **HTTP Bot** platform.
 2. Fill in the config:
   | Field | Required | Notes |
   |---|---|---|
   | **Inbound Signing Secret** | yes | Your backend signs inbound requests with this. |
   | **Outbound Callback URL** | yes | Where LangBot POSTs replies. **Config-only** — cannot be overridden per message (SSRF protection). |
   | **Outbound Signing Secret** | no | LangBot signs callbacks with this; defaults to the inbound secret. |
   | **Default Session Type** | no | `person` (default) or `group`. |
   | **Require Inbound Signature** | no | Keep `true` in production. |
   | **Callback Timeout / Max Retries** | no | Defaults: 15s, 3 retries. |
 3. Bind the bot to a **pipeline** and **enable** it.
 4. Copy the **Inbound Webhook URL** shown in the config — it looks like
   `https://your-langbot/bots/<bot_uuid>`.
 ---
 ## 3. The signature scheme
 Both directions use the same dependency-free HMAC-SHA256 scheme:
 ```
 signing_string = "{timestamp}." + raw_body_bytes
 signature      = "sha256=" + hex(HMAC_SHA256(secret, signing_string))
 ```
 Sent as headers:
 | Header | Meaning |
 |---|---|
 | `X-LB-Timestamp` | Unix seconds. Rejected if more than **±300s** from server time. |
 | `X-LB-Signature` | `sha256=<hex>` over `"{timestamp}." + body`. |
 | `X-LB-Idempotency-Key` | *(optional, inbound)* dedup key; retries with the same key return `409`. |
 Verify outbound callbacks the same way, using the **outbound** secret (or the
 inbound secret if you left the outbound secret blank).
 A compact reference implementation is in `examples/http-bot/client.py`
 (`sign()` / `verify()`); a Node/TS version is in `client.ts`.
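
For readers who want the scheme in front of them, here is a standard-library sketch of signing and verifying. The names `sign_request` and `verify_request` and the `now` parameter are illustrative and do not appear in the reference clients; the header names, the `"{timestamp}." + body` signing string, and the 300-second window are taken from the description above. The verifier signs with the raw header value, so it checks exactly the bytes the sender signed.

```python
import hashlib
import hmac
import time

REPLAY_WINDOW_SECONDS = 300  # the ±300s tolerance from the header table


def sign_request(secret, raw_body, timestamp=None):
    """Return the two signature headers for raw_body (bytes)."""
    ts = str(int(time.time()) if timestamp is None else timestamp)
    mac = hmac.new(secret.encode(), ts.encode() + b"." + raw_body, hashlib.sha256)
    return {"X-LB-Timestamp": ts, "X-LB-Signature": "sha256=" + mac.hexdigest()}


def verify_request(secret, raw_body, headers, now=None):
    """Check the replay window first, then the HMAC, in constant time."""
    ts = headers.get("X-LB-Timestamp", "")
    sig = headers.get("X-LB-Signature", "")
    try:
        sent_at = int(ts)
    except ValueError:
        return False  # missing or non-numeric timestamp
    current = int(time.time()) if now is None else now
    if abs(current - sent_at) > REPLAY_WINDOW_SECONDS:
        return False
    expected = sign_request(secret, raw_body, ts)["X-LB-Signature"]
    return hmac.compare_digest(expected.encode(), sig.encode())
```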
 ---
 ## 4. Send your first message (curl)
 ```bash
 BOT="https://your-langbot/bots/<bot_uuid>"
 SECRET="your-inbound-secret"
 BODY='{"session_id":"ticket-10293","message":[{"type":"Plain","text":"Export keeps failing on the dashboard."}]}'
 TS=$(date +%s)
 SIG="sha256=$(printf '%s.%s' "$TS" "$BODY" | openssl dgst -sha256 -hmac "$SECRET" -r | cut -d' ' -f1)"
 curl -sS -X POST "$BOT" \
  -H "Content-Type: application/json" \
  -H "X-LB-Timestamp: $TS" \
  -H "X-LB-Signature: $SIG" \
  -d "$BODY"
 # -> 202 {"code":0,"msg":"accepted","data":{"session_id":"ticket-10293","accepted_message_id":"in_...","aggregating":true}}
 ```
 The reply (or replies) is POSTed to your configured callback URL shortly after.
 ---
 ## 5. Inbound request format
 `POST /bots/{bot_uuid}`
 ```jsonc
 {
  "session_id": "ticket-10293",     // REQUIRED. Your stable id. Maps 1:1 to a LangBot session.
  "session_type": "person",         // optional: "person" | "group"; default from config
  "sender": {                       // optional metadata, surfaced to the pipeline/plugins
    "id": "user-5567",
    "name": "Alice"
  },
  "message": [                      // REQUIRED. A LangBot MessageChain (array of segments).
    { "type": "Plain", "text": "Export keeps failing on the dashboard." },
    { "type": "Image", "url": "https://example.com/screenshot.png" }
  ]
 }
 ```
 **Message segments.** Text uses `{"type":"Plain","text":"..."}`. Images use
 `{"type":"Image","url":"..."}` (or `base64`). Other supported types: `Voice`,
 `File`, `At`, `Quote`.
 > Note: the callback URL is **not** accepted in the body — it is taken only from
 > bot config. This is deliberate (prevents an attacker who obtains the inbound
 > secret from redirecting replies to an arbitrary host).
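
A caller can lint a payload against the rules above before signing it. The sketch below is a hypothetical client-side pre-check (`validate_inbound` is not part of LangBot); the required fields and the segment and session types come from this section, while the exact wording of each problem is invented. Flagging `callback_url` is a client-side convention only: this guide does not say whether the server rejects or ignores the field.

```python
SEGMENT_TYPES = ("Plain", "Image", "Voice", "File", "At", "Quote")
SESSION_TYPES = ("person", "group")


def validate_inbound(payload):
    """Return a list of human-readable problems; an empty list means the payload looks fine."""
    problems = []
    session_id = payload.get("session_id")
    if not isinstance(session_id, str) or not session_id:
        problems.append("session_id must be a non-empty string")
    if payload.get("session_type", "person") not in SESSION_TYPES:
        problems.append("session_type must be 'person' or 'group'")
    if "callback_url" in payload:
        problems.append("callback_url is config-only and should not be sent")
    message = payload.get("message")
    if not isinstance(message, list):
        problems.append("message must be an array of segments")
        return problems
    for index, segment in enumerate(message):
        if not isinstance(segment, dict) or segment.get("type") not in SEGMENT_TYPES:
            problems.append(f"message[{index}] has an unsupported type")
        elif segment["type"] == "Plain" and not isinstance(segment.get("text"), str):
            problems.append(f"message[{index}] is Plain but has no text")
    return problems
```

Running it on the example body above returns an empty list; a body without `session_id` returns one problem naming the field.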
 ### Aggregation (N → 1)
 If your pipeline has **message aggregation** enabled, send several messages with
 the **same `session_id`** within the aggregation window and they are merged into
 **one** pipeline turn. No special flag — just reuse the `session_id`.
 ---
 ## 6. Outbound callback format
 LangBot POSTs each reply part to your `callback_url`:
 ```jsonc
 {
  "session_id": "ticket-10293",     // echoes the inbound session
  "reply_to": "in_01H...",          // the accepted_message_id this answers
  "sequence": 1,                    // 1-based ordinal within this turn
  "is_final": false,                // true on the last part of the turn
  "stream": false,                  // true for streamed chunks
  "message": [ { "type": "Plain", "text": "Looking into it…" } ],
  "timestamp": "2026-06-22T09:00:01Z"
 }
 ```
 Your endpoint should return `2xx` quickly. On a non-2xx response or a timeout,
 LangBot retries with exponential backoff (up to `callback_max_retries` times).
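
Retries mean a receiver can see the same part twice, for example when it processed a callback but its `2xx` was lost to a timeout. The sketch below shows one way to make the receiver idempotent. `CallbackDeduper` is a hypothetical helper, and keying on `(session_id, reply_to, sequence)` assumes a retried delivery carries the same values for those fields, which this guide does not state explicitly.

```python
from collections import OrderedDict


class CallbackDeduper:
    """Remember recently seen callback parts so a retried delivery is processed once."""

    def __init__(self, max_entries=10_000):
        self._seen = OrderedDict()
        self._max = max_entries

    def first_time(self, callback):
        """Return True the first time a part is seen, False for a repeat."""
        key = (callback.get("session_id"), callback.get("reply_to"), callback.get("sequence"))
        if key in self._seen:
            self._seen.move_to_end(key)  # keep recently repeated keys alive
            return False
        self._seen[key] = True
        if len(self._seen) > self._max:
            self._seen.popitem(last=False)  # evict the oldest key to bound memory
        return True
```

A receiver would still answer `2xx` for a repeat and simply skip its own processing.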
 ### Multi-part replies (1 → M)
 One turn may emit multiple callbacks, delivered **in `sequence` order** for a
 given session:
 ```
 seq=1 is_final=false  "Checking your export logs…"
 seq=2 is_final=false  "Found 2 failed exports."
 seq=3 is_final=true   "Fixed — please try again."
 ```
 Stitch by `session_id` + `reply_to` + `sequence` (`sequence` restarts at 1 each
 turn, so `reply_to` tells turns apart); the turn is complete when
 `is_final: true` arrives.
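
The stitching step can be sketched as a small in-memory assembler. `TurnAssembler` is a hypothetical name; it groups parts by `(session_id, reply_to)` on the assumption that `reply_to` is stable within a turn, and it relies on the in-order delivery described above, so a turn is treated as complete as soon as the `is_final` part arrives.

```python
from collections import defaultdict


class TurnAssembler:
    """Collect reply parts per turn and hand back the merged segments on is_final."""

    def __init__(self):
        self._parts = defaultdict(dict)

    def add(self, callback):
        """Store one callback body; return the merged segment list once the turn ends."""
        key = (callback["session_id"], callback.get("reply_to"))
        self._parts[key][callback["sequence"]] = callback["message"]
        if not callback.get("is_final"):
            return None
        parts = self._parts.pop(key)
        # Sort by sequence so the result is ordered even if a retry arrived late.
        return [segment for seq in sorted(parts) for segment in parts[seq]]
```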
 ---
 ## 7. Reset a session
 Start a fresh conversation for a `session_id` (drops history):
 ```
 POST /bots/{bot_uuid}/reset
 { "session_id": "ticket-10293", "session_type": "person" }
 → 200 { "code":0, "msg":"reset", "data": { "session_id":"ticket-10293", "removed": true } }
 ```
 Signed exactly like an inbound message.
 ---
 ## 8. Synchronous convenience mode
 If you don't need streaming/multi-part and just want one reply back on the same
 HTTP call, POST to `/sync`. LangBot waits for the turn to finish and returns all
 parts **collapsed** into one array:
 ```
 POST /bots/{bot_uuid}/sync
 { "session_id": "ticket-10293", "message": [ { "type":"Plain", "text":"hi" } ] }
 → 200 { "code":0, "msg":"ok",
        "data": { "session_id":"ticket-10293", "reply_to":"in_...",
                  "message": [ {"type":"Plain","text":"..."}, ... ] } }
 ```
 This is **lossy** (you lose `sequence` / streaming boundaries) and blocks up to
 `callback_timeout × 4` seconds (60s with the default 15s timeout). Prefer the
 callback model for anything real-time or multi-part. Only one `/sync` request
 may be in flight per `session_id`.
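
Two client-side consequences follow from those limits: the HTTP client timeout should exceed the server's maximum wait, and callers should avoid issuing a second `/sync` for a session while one is pending. The sketch below illustrates both. `sync_http_timeout`, `SyncGate`, and the 5-second margin are hypothetical; only the `callback_timeout × 4` bound and the 15s default come from this guide.

```python
import threading

DEFAULT_CALLBACK_TIMEOUT = 15.0  # seconds, the default from the config table


def sync_http_timeout(callback_timeout=DEFAULT_CALLBACK_TIMEOUT, margin=5.0):
    """Client timeout that outlasts the server's worst-case /sync wait."""
    return callback_timeout * 4 + margin


class SyncGate:
    """Allow at most one in-flight /sync call per session_id within this process."""

    def __init__(self):
        self._guard = threading.Lock()
        self._busy = set()

    def try_acquire(self, session_id):
        with self._guard:
            if session_id in self._busy:
                return False
            self._busy.add(session_id)
            return True

    def release(self, session_id):
        with self._guard:
            self._busy.discard(session_id)
```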
 ---
 ## 9. Error envelope
 ```jsonc
 { "code": 40101, "msg": "invalid signature: signature_mismatch", "data": null }
 ```
 | HTTP | code | meaning |
 |---|---|---|
 | 202 | 0 | accepted |
 | 400 | 40001 | malformed body / missing `session_id` or `message` |
 | 401 | 40101 | bad/expired signature |
 | 409 | 40901 | duplicate idempotency key |
 | 413 | 41301 | message too large (>1 MiB) |
 | 500 | 50001 | internal error |
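
One way a caller might act on these statuses is sketched below. The policy is an assumption, not something LangBot prescribes: `next_action` and its return labels are hypothetical, and treating `409` as success assumes the earlier request with the same idempotency key was accepted.

```python
def next_action(http_status):
    """Map a response status from the table above to a suggested client action."""
    if http_status in (200, 202):
        return "done"
    if http_status == 409:
        # Duplicate idempotency key: an earlier attempt with this key was already seen.
        return "done"
    if http_status == 401:
        return "check_secret_and_clock"  # wrong secret, or timestamp outside ±300s
    if http_status in (400, 413):
        return "fix_request"  # retrying the same body will not help
    if http_status >= 500:
        return "retry_with_same_idempotency_key"
    return "inspect"
```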
 ---
 ## 10. Try it end-to-end in 5 minutes
 ```bash
 cd examples/http-bot
 pip install flask requests
 # Terminal 1 — your callback receiver (point the bot's callback_url here, e.g. via a tunnel):
 python client.py serve --port 8900 --secret SHARED_SECRET
 # Terminal 2 — push a message:
 python client.py push \
  --url https://your-langbot/bots/<bot_uuid> \
  --secret SHARED_SECRET \
  --session ticket-1 \
  --text "hello"
 ```
 Watch Terminal 1 print each reply part (`[part ]` / `[FINAL]`) with its
 sequence number — that's 1→M working, signatures verified.
 A machine-readable contract is in
 [`docs/http-bot-openapi.json`](../http-bot-openapi.json).
 ---
 ## 11. Security checklist
 - Keep **Require Inbound Signature** on in production.
 - Use **HTTPS** callback URLs; the URL is config-only (no per-message override).
 - Treat the secrets like passwords; rotate via the dashboard.
 - The inbound route is unauthenticated at the framework level **by design** —
  security comes entirely from the HMAC signature, so never disable it on a
  public deployment.
@@ -0,0 +1,75 @@
 # HTTP Bot Adapter — Reference Clients
 > English | [中文](./README.zh.md)
 Minimal, dependency-light clients for the LangBot **HTTP Bot** platform adapter.
 They show the whole loop: signing a request, pushing a message, and receiving
 multi-part replies on a callback endpoint.
 Full guide: [docs.langbot.app — HTTP Bot](https://docs.langbot.app/en/usage/platforms/http-bot).
 Machine-readable contract: [`docs/http-bot-openapi.json`](../../docs/http-bot-openapi.json).
 ## Files
 | File | What it is |
 |---|---|
 | `playground.py` | **Interactive browser debug console** — a single-file web app you open in a browser to chat with a running `http_bot` bot and watch signing / 202 / callbacks live. Zero extra deps. |
 | `client.py` | Python client + Flask callback receiver (`pip install flask requests`). |
 | `client.ts` | TypeScript/Node 18+ client + callback receiver, **zero deps** (`npx tsx client.ts`). |
 All three implement the identical HMAC-SHA256 scheme
 (`sha256=hex(HMAC(secret, "{timestamp}." + body))`) — verified byte-for-byte
 against the adapter.
 ## Interactive playground (recommended first run)
 A self-contained web console: type a message in your browser, it is signed and
 POSTed to a **running** `http_bot` bot, and the bot's replies stream back into
 the page — with a debug panel showing the signature, the `202` ack, and each
 callback's `sequence` / signature-verification.
 ```bash
 # From the LangBot repo root, with the backend already running:
 PUBLIC_IP=<your-host-ip> ./.venv/bin/python examples/http-bot/playground.py
 # then open  http://<your-host-ip>:8920/
 ```
 On startup it reads the LangBot API key + the `http_bot` bot from
 `data/langbot.db`, and configures that bot (inbound/outbound secret +
 `callback_url`) to point back at itself via the LangBot API — the bot reloads
 live, no restart needed. Requirements: an enabled `http_bot` bot bound to a
 working pipeline, and port `8920` reachable from your browser.
 Env knobs: `PUBLIC_IP` (default `127.0.0.1`), `PLAYGROUND_PORT` (default `8920`).
 ## Headless clients
 ```bash
 # Python — Terminal 1: callback receiver (your callback_url target)
 python client.py serve --port 8900 --secret SHARED_SECRET
 # Python — Terminal 2: push a message
 python client.py push --url https://your-langbot/bots/<BOT_UUID> \
    --secret SHARED_SECRET --session ticket-1 --text "hello"
 # blocking sync mode
 python client.py sync  --url https://your-langbot/bots/<BOT_UUID> \
    --secret SHARED_SECRET --session ticket-1 --text "hello"
 # reset a session
 python client.py reset --url https://your-langbot/bots/<BOT_UUID> \
    --secret SHARED_SECRET --session ticket-1
 ```
 ```bash
 # TypeScript (Node 18+)
 npx tsx client.ts serve 8900 SHARED_SECRET
 npx tsx client.ts push  https://your-langbot/bots/<BOT_UUID> SHARED_SECRET ticket-1 "hello"
 ```
 When the bot replies, the receiver prints each part with its `sequence` and an
 `[FINAL]` marker on the last one — that's the 1→M multi-reply model in action.
 > The bot's `callback_url` must be reachable from LangBot. For local testing,
 > expose your receiver with a tunnel (cloudflared / ngrok) and set that URL in
 > the bot config.
@@ -0,0 +1,71 @@
 # HTTP Bot Adapter: Reference Clients
 > [English](./README.md) | 中文
 Minimal, low-dependency example clients for the LangBot **HTTP Bot** platform adapter.
 They show the whole loop: signing a request, pushing a message, and receiving
 1→M multi-part replies on a callback endpoint.
 Full guide: [docs.langbot.app: HTTP Bot](https://docs.langbot.app/zh/usage/platforms/http-bot).
 Machine-readable contract: [`docs/http-bot-openapi.json`](../../docs/http-bot-openapi.json).
 ## Files
 | File | What it is |
 |---|---|
 | `playground.py` | **Interactive browser debug console**: a single-file web app for chatting in the browser with a running `http_bot` bot and watching signing / 202 / callbacks live. Zero extra dependencies. |
 | `client.py` | Python client + Flask callback receiver (`pip install flask requests`). |
 | `client.ts` | TypeScript/Node 18+ client + callback receiver, **zero dependencies** (`npx tsx client.ts`). |
 All three implement the identical HMAC-SHA256 signing scheme
 (`sha256=hex(HMAC(secret, "{timestamp}." + body))`), verified byte-for-byte against the adapter.
 ## Interactive playground (recommended first run)
 A self-contained web console: type a message in the browser, it is signed and POSTed to a
 **running** `http_bot` bot, and the bot's replies stream back into the page. The debug panel
 shows the signature, the `202` ack, and each callback's `sequence` / signature-verification result.
 ```bash
 # From the LangBot repo root, with the backend already running:
 PUBLIC_IP=<your-host-ip> ./.venv/bin/python examples/http-bot/playground.py
 # then open  http://<your-host-ip>:8920/
 ```
 On startup it reads the LangBot API key and the `http_bot` bot from `data/langbot.db`,
 and configures that bot (inbound/outbound secret + `callback_url`) via the LangBot API to point
 back at itself. The bot reloads live, no restart needed. Requirements: an enabled `http_bot` bot
 bound to a working pipeline, and port `8920` reachable from your browser.
 Env knobs: `PUBLIC_IP` (default `127.0.0.1`), `PLAYGROUND_PORT` (default `8920`).
 ## Headless clients
 ```bash
 # Python, Terminal 1: callback receiver (your callback_url points here)
 python client.py serve --port 8900 --secret SHARED_SECRET
 # Python, Terminal 2: push a message
 python client.py push --url https://your-langbot/bots/<BOT_UUID> \
    --secret SHARED_SECRET --session ticket-1 --text "hello"
 # blocking sync mode
 python client.py sync  --url https://your-langbot/bots/<BOT_UUID> \
    --secret SHARED_SECRET --session ticket-1 --text "hello"
 # reset a session
 python client.py reset --url https://your-langbot/bots/<BOT_UUID> \
    --secret SHARED_SECRET --session ticket-1
 ```
 ```bash
 # TypeScript (Node 18+)
 npx tsx client.ts serve 8900 SHARED_SECRET
 npx tsx client.ts push  https://your-langbot/bots/<BOT_UUID> SHARED_SECRET ticket-1 "hello"
 ```
 When the bot replies, the receiver prints each part with its `sequence` and marks the last one
 with `[FINAL]`. That is the 1→M multi-reply model in action.
 > The bot's `callback_url` must be reachable from LangBot. For local testing, expose your
 > receiver with a tunnel (cloudflared / ngrok) and set that URL in the bot config.
@@ -0,0 +1,167 @@
 #!/usr/bin/env python3
 """LangBot HTTP Bot adapter — reference client (Python).
 Two things in one file:
 1. ``push()`` / ``push_sync()`` — send a message into a LangBot ``http_bot`` bot.
 2. A tiny Flask callback receiver that verifies signatures and prints replies,
   so you can watch N->1 aggregation and 1->M multi-reply working live.
 Usage
 -----
    pip install flask requests
    # Terminal 1 — start the callback receiver (this is your callback_url):
    python client.py serve --port 8900 --secret SHARED_SECRET
    # Terminal 2 — push a message (async; reply lands on the receiver):
    python client.py push \
        --url   https://your-langbot/bots/<BOT_UUID> \
        --secret SHARED_SECRET \
        --session ticket-10293 \
        --text "Export keeps failing on the dashboard."
    # Or push and block for the collapsed reply (sync convenience mode):
    python client.py sync --url https://your-langbot/bots/<BOT_UUID> \
        --secret SHARED_SECRET --session ticket-10293 --text "hi"
 The signing scheme is HMAC-SHA256 over ``"{timestamp}." + raw_body``; see
 ``sign()`` below — it is intentionally tiny and easy to port.
 """
 from __future__ import annotations
 import argparse
 import hashlib
 import hmac
 import json
 import sys
 import time
 import uuid
 HEADER_TIMESTAMP = 'X-LB-Timestamp'
 HEADER_SIGNATURE = 'X-LB-Signature'
 HEADER_IDEMPOTENCY = 'X-LB-Idempotency-Key'
 REPLAY_WINDOW = 300
 def sign(secret: str, body: bytes, timestamp: int | None = None) -> tuple[str, str]:
    """Return (timestamp, signature) for *body*."""
    ts = str(timestamp if timestamp is not None else int(time.time()))
    mac = hmac.new(secret.encode(), f'{ts}.'.encode() + body, hashlib.sha256)
    return ts, 'sha256=' + mac.hexdigest()
 def verify(secret: str, body: bytes, timestamp: str | None, signature: str | None) -> bool:
    """Verify an inbound signature (used by the callback receiver)."""
    if not timestamp or not signature:
        return False
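    try:
        ts = int(float(timestamp))
    except (ValueError, OverflowError):
        # int(float(...)) raises ValueError for junk/NaN and OverflowError for inf.
        return False
    if abs(int(time.time()) - ts) > REPLAY_WINDOW:
        return False
    _, expected = sign(secret, body, ts)
    # Compare as bytes: compare_digest() raises TypeError on non-ASCII str input.
    return hmac.compare_digest(expected.encode(), signature.encode())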
 def _post(url: str, secret: str, payload: dict, idempotency: bool = True):
    import requests
    body = json.dumps(payload, ensure_ascii=False).encode()
    ts, sig = sign(secret, body)
    headers = {
        'Content-Type': 'application/json',
        HEADER_TIMESTAMP: ts,
        HEADER_SIGNATURE: sig,
    }
    if idempotency:
        headers[HEADER_IDEMPOTENCY] = uuid.uuid4().hex
    # /sync may block up to callback_timeout × 4 (60s with the default 15s), so allow a little more.
    resp = requests.post(url, data=body, headers=headers, timeout=65)
    print(f'-> {resp.status_code} {resp.text}')
    return resp
 def push(url: str, secret: str, session: str, text: str, session_type: str = 'person'):
    """Fire-and-collect: returns 202 immediately; reply arrives on your callback."""
    payload = {
        'session_id': session,
        'session_type': session_type,
        'message': [{'type': 'Plain', 'text': text}],
    }
    return _post(url.rstrip('/'), secret, payload)
 def push_sync(url: str, secret: str, session: str, text: str, session_type: str = 'person'):
    """Blocking convenience: POST to /sync and get the collapsed reply back."""
    payload = {
        'session_id': session,
        'session_type': session_type,
        'message': [{'type': 'Plain', 'text': text}],
    }
    resp = _post(url.rstrip('/') + '/sync', secret, payload, idempotency=False)
    return resp
 def reset(url: str, secret: str, session: str, session_type: str = 'person'):
    """Reset a session's conversation (next message starts fresh)."""
    payload = {'session_id': session, 'session_type': session_type}
    return _post(url.rstrip('/') + '/reset', secret, payload, idempotency=False)
 def serve(port: int, secret: str):
    """Run a callback receiver that verifies signatures and prints replies."""
    from flask import Flask, request
    app = Flask(__name__)
    @app.route('/', methods=['POST'])
    def recv():
        raw = request.get_data()
        ok = verify(secret, raw, request.headers.get(HEADER_TIMESTAMP), request.headers.get(HEADER_SIGNATURE))
        if not ok:
            print('!! signature verification FAILED — rejecting')
            return {'error': 'bad signature'}, 401
        data = json.loads(raw)
        text_parts = [c.get('text', '') for c in data.get('message', []) if c.get('type') == 'Plain']
        marker = 'FINAL' if data.get('is_final') else 'part '
        print(
            f'[{marker}] session={data["session_id"]} seq={data["sequence"]} '
            f'reply_to={data.get("reply_to")}: {" ".join(text_parts)}'
        )
        return {'ok': True}
    print(f'callback receiver listening on http://0.0.0.0:{port}/  (Ctrl-C to stop)')
    app.run(host='0.0.0.0', port=port)
 def main(argv=None):
    p = argparse.ArgumentParser(description='LangBot HTTP Bot reference client')
    sub = p.add_subparsers(dest='cmd', required=True)
    sp = sub.add_parser('serve', help='run the callback receiver')
    sp.add_argument('--port', type=int, default=8900)
    sp.add_argument('--secret', required=True)
    for name in ('push', 'sync', 'reset'):
        c = sub.add_parser(name)
        c.add_argument('--url', required=True, help='https://host/bots/<BOT_UUID>')
        c.add_argument('--secret', required=True)
        c.add_argument('--session', required=True)
        c.add_argument('--session-type', default='person', choices=['person', 'group'])
        if name != 'reset':
            c.add_argument('--text', required=True)
    args = p.parse_args(argv)
    if args.cmd == 'serve':
        serve(args.port, args.secret)
    elif args.cmd == 'push':
        push(args.url, args.secret, args.session, args.text, args.session_type)
    elif args.cmd == 'sync':
        push_sync(args.url, args.secret, args.session, args.text, args.session_type)
    elif args.cmd == 'reset':
        reset(args.url, args.secret, args.session, args.session_type)
 if __name__ == '__main__':
    sys.exit(main())
@@ -0,0 +1,123 @@
 /**
 * LangBot HTTP Bot adapter — reference client (TypeScript / Node 18+).
 *
 * Zero runtime dependencies (uses the global `fetch` plus the built-in `node:crypto` and `node:http` modules).
 *
 *   - `push()`      : fire-and-collect; reply lands on your callback URL.
 *   - `pushSync()`  : POST /sync and await the collapsed reply.
 *   - `reset()`     : reset a session's conversation.
 *   - `startReceiver()` : a callback server that verifies signatures and logs
 *                         replies, so you can watch N->1 and 1->M live.
 *
 * Run the demos:
 *   npx tsx client.ts serve   8900 SHARED_SECRET
 *   npx tsx client.ts push    https://host/bots/<UUID> SHARED_SECRET ticket-1 "hello"
 *   npx tsx client.ts sync    https://host/bots/<UUID> SHARED_SECRET ticket-1 "hello"
 *   npx tsx client.ts reset   https://host/bots/<UUID> SHARED_SECRET ticket-1
 */
 import { createHmac, randomUUID, timingSafeEqual } from 'node:crypto';
 import { createServer } from 'node:http';
 const HEADER_TIMESTAMP = 'X-LB-Timestamp';
 const HEADER_SIGNATURE = 'X-LB-Signature';
 const HEADER_IDEMPOTENCY = 'X-LB-Idempotency-Key';
 const REPLAY_WINDOW = 300;
 /** Compute the `sha256=<hex>` signature over `"{ts}." + body`. */
 export function sign(secret: string, body: Buffer | string, timestamp?: number): [string, string] {
  const ts = String(timestamp ?? Math.floor(Date.now() / 1000));
  const buf = typeof body === 'string' ? Buffer.from(body) : body;
  const mac = createHmac('sha256', secret).update(Buffer.concat([Buffer.from(`${ts}.`), buf])).digest('hex');
  return [ts, `sha256=${mac}`];
 }
 /** Verify an inbound signature (used by the callback receiver). */
 export function verify(secret: string, body: Buffer, timestamp?: string, signature?: string): boolean {
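  if (!timestamp || !signature) return false;
  const ts = Number(timestamp);
  // NaN compares false against the window, so reject non-numeric timestamps explicitly.
  if (!Number.isFinite(ts) || Math.abs(Math.floor(Date.now() / 1000) - ts) > REPLAY_WINDOW) return false;
  const [, expected] = sign(secret, body, ts);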
  const a = Buffer.from(expected);
  const b = Buffer.from(signature);
  return a.length === b.length && timingSafeEqual(a, b);
 }
 interface Segment { type: string; text?: string; url?: string; [k: string]: unknown }
 async function post(url: string, secret: string, payload: object, idempotency = true) {
  const body = Buffer.from(JSON.stringify(payload));
  const [ts, sig] = sign(secret, body);
  const headers: Record<string, string> = {
    'Content-Type': 'application/json',
    [HEADER_TIMESTAMP]: ts,
    [HEADER_SIGNATURE]: sig,
  };
  if (idempotency) headers[HEADER_IDEMPOTENCY] = randomUUID();
  const resp = await fetch(url, { method: 'POST', headers, body });
  const text = await resp.text();
  console.log(`-> ${resp.status} ${text}`);
  return { status: resp.status, text };
 }
 /** Fire-and-collect: 202 now, reply later on your callback URL. */
 export function push(url: string, secret: string, session: string, text: string, sessionType = 'person') {
  return post(url.replace(/\/$/, ''), secret, {
    session_id: session,
    session_type: sessionType,
    message: [{ type: 'Plain', text }] as Segment[],
  });
 }
 /** Blocking convenience: POST /sync, get the collapsed reply. */
 export function pushSync(url: string, secret: string, session: string, text: string, sessionType = 'person') {
  return post(`${url.replace(/\/$/, '')}/sync`, secret, {
    session_id: session,
    session_type: sessionType,
    message: [{ type: 'Plain', text }] as Segment[],
  }, false);
 }
 /** Reset a session's conversation. */
 export function reset(url: string, secret: string, session: string, sessionType = 'person') {
  return post(`${url.replace(/\/$/, '')}/reset`, secret, { session_id: session, session_type: sessionType }, false);
 }
 /** Run a callback receiver that verifies signatures and prints replies. */
 export function startReceiver(port: number, secret: string) {
  const server = createServer((req, res) => {
    if (req.method !== 'POST') { res.writeHead(405).end(); return; }
    const chunks: Buffer[] = [];
    req.on('data', (c) => chunks.push(c));
    req.on('end', () => {
      const raw = Buffer.concat(chunks);
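      const ts = req.headers[HEADER_TIMESTAMP.toLowerCase()];
      const sig = req.headers[HEADER_SIGNATURE.toLowerCase()];
      // A request without both signature headers fails verification outright,
      // instead of passing undefined into verify().
      const ok = typeof ts === 'string' && typeof sig === 'string' && verify(secret, raw, ts, sig);
      if (!ok) {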
        console.log('!! signature verification FAILED — rejecting');
        res.writeHead(401, { 'Content-Type': 'application/json' }).end(JSON.stringify({ error: 'bad signature' }));
        return;
      }
      const data = JSON.parse(raw.toString());
      const parts = (data.message as Segment[]).filter((c) => c.type === 'Plain').map((c) => c.text).join(' ');
      const marker = data.is_final ? 'FINAL' : 'part ';
      console.log(`[${marker}] session=${data.session_id} seq=${data.sequence} reply_to=${data.reply_to}: ${parts}`);
      res.writeHead(200, { 'Content-Type': 'application/json' }).end(JSON.stringify({ ok: true }));
    });
  });
  server.listen(port, () => console.log(`callback receiver listening on http://0.0.0.0:${port}/  (Ctrl-C to stop)`));
 }
 // --- CLI ---
 const [cmd, ...rest] = process.argv.slice(2);
 if (cmd === 'serve') {
  startReceiver(Number(rest[0] ?? 8900), rest[1] ?? 'SHARED_SECRET');
 } else if (cmd === 'push') {
  push(rest[0], rest[1], rest[2], rest[3]);
 } else if (cmd === 'sync') {
  pushSync(rest[0], rest[1], rest[2], rest[3]);
 } else if (cmd === 'reset') {
  reset(rest[0], rest[1], rest[2]);
 } else if (cmd) {
  console.error(`unknown command: ${cmd}`);
  process.exit(1);
 }
@@ -0,0 +1,349 @@
 #!/usr/bin/env python3
 """LangBot HTTP Bot — interactive playground (public, browser-based).
 This is a REAL end-to-end demo against the RUNNING LangBot instance on this
 host. It is NOT a mock and NOT an in-process import: every message you type in
 the browser is signed and POSTed to the live `http_bot` bot at
 http://127.0.0.1:5300/bots/<uuid>, and the bot's replies come back to this
 server's /callback endpoint over real HTTP, then stream to your browser via SSE.
 What it does on startup:
  1. Reads the LangBot API key + the http_bot bot from data/langbot.db.
  2. Configures the bot via the LangBot API (PUT /api/v1/platform/bots/<uuid>):
     sets inbound_secret + outbound_secret + callback_url to point back here.
     (LangBot reloads the bot live — no server restart needed.)
  3. Serves a chat page on 0.0.0.0:<PORT> so you can open it from the internet.
 Run:  ./.venv/bin/python examples/http-bot/playground.py
 Then open:  http://<this-host-public-ip>:<PORT>/
 """
 from __future__ import annotations
 import asyncio
 import json
 import os
 import sqlite3
 import sys
 REPO = os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..'))
 sys.path.insert(0, os.path.join(REPO, 'src'))
 from aiohttp import web  # noqa: E402
 import aiohttp  # noqa: E402
 from langbot.pkg.platform.sources import http_bot_signing as sg  # noqa: E402
 # ---- config -----------------------------------------------------------------
 LANGBOT_BASE = 'http://127.0.0.1:5300'
 DB_PATH = os.path.join(REPO, 'data', 'langbot.db')
 PUBLIC_IP = os.environ.get('PUBLIC_IP', '127.0.0.1')
 PORT = int(os.environ.get('PLAYGROUND_PORT', '8920'))
 SECRET = 'playground-shared-secret'
 # SSE subscribers: list of asyncio.Queue
 subscribers: list[asyncio.Queue] = []
 def db_lookup() -> tuple[str, str]:
    """Return (api_key, http_bot_uuid) from the LangBot DB."""
    db = sqlite3.connect(DB_PATH)
    db.row_factory = sqlite3.Row
    try:
        key_row = db.execute('SELECT key FROM api_keys LIMIT 1').fetchone()
        bot = db.execute("SELECT uuid FROM bots WHERE adapter='http_bot' LIMIT 1").fetchone()
    finally:
        # Rows stay readable after the connection is closed.
        db.close()
    if key_row is None:
        raise SystemExit('No API key found in the LangBot DB. Create one first.')
    if bot is None:
        raise SystemExit('No http_bot bot found. Create one in the WebUI first.')
    return key_row['key'], bot['uuid']
 async def configure_bot(api_key: str, bot_uuid: str, callback_url: str):
    """Point the live bot at this playground via the LangBot API.
    update_bot() runs a raw SQL UPDATE with whatever keys we send, so we send a
    MINIMAL payload: only adapter_config (built from scratch, not read back —
    the GET masks secrets). LangBot reloads + reruns the bot live.
    """
    cfg = {
        'inbound_secret': SECRET,
        'outbound_secret': SECRET,
        'callback_url': callback_url,
        'signature_required': True,
        'default_session_type': 'person',
        'callback_timeout': 15,
        'callback_max_retries': 3,
    }
    async with aiohttp.ClientSession() as s:
        async with s.put(
            f'{LANGBOT_BASE}/api/v1/platform/bots/{bot_uuid}',
            headers={'Authorization': f'Bearer {api_key}', 'Content-Type': 'application/json'},
            json={'adapter_config': cfg},
        ) as r:
            txt = await r.text()
            print(f'[configure] PUT adapter_config -> {r.status} {txt[:200]}')
            return r.status < 400
 async def broadcast(event: dict):
    for q in list(subscribers):
        try:
            q.put_nowait(event)
        except Exception:
            pass
 # ---- HTTP handlers ----------------------------------------------------------
 async def index(request: web.Request):
    return web.Response(text=PAGE, content_type='text/html')
 async def send(request: web.Request):
    """Browser -> here -> signed POST -> live LangBot bot."""
    body_in = await request.json()
    session_id = body_in.get('session_id') or 'playground-1'
    text = body_in.get('text', '')
    bot_uuid = request.app['bot_uuid']
    payload = {
        'session_id': session_id,
        'sender': {'id': 'browser-user', 'name': 'You'},
        'message': [{'type': 'Plain', 'text': text}],
    }
    raw = json.dumps(payload, ensure_ascii=False).encode()
    ts, sig = sg.sign(SECRET, raw)
    url = f'{LANGBOT_BASE}/bots/{bot_uuid}'
    # echo what we send to the browser timeline
    await broadcast(
        {'dir': 'out', 'kind': 'request', 'session_id': session_id, 'text': text, 'url': url, 'sig': sig[:24] + '…'}
    )
    async with aiohttp.ClientSession() as s:
        async with s.post(
            url,
            data=raw,
            headers={
                'Content-Type': 'application/json',
                sg.HEADER_TIMESTAMP: ts,
                sg.HEADER_SIGNATURE: sig,
            },
        ) as r:
            status = r.status
            try:
                jr = await r.json()
            except Exception:
                jr = {'raw': await r.text()}
    await broadcast({'dir': 'in', 'kind': 'ack', 'status': status, 'data': jr})
    return web.json_response({'status': status, 'data': jr})
 async def callback(request: web.Request):
    """Live LangBot bot -> here. Verify signature, stream to browser."""
    raw = await request.read()
    ok, why = sg.verify(SECRET, raw, request.headers.get(sg.HEADER_TIMESTAMP), request.headers.get(sg.HEADER_SIGNATURE))
    data = json.loads(raw)
    text = ' '.join(c.get('text', '') for c in data.get('message', []) if c.get('type') == 'Plain')
    await broadcast(
        {
            'dir': 'in',
            'kind': 'reply',
            'session_id': data.get('session_id'),
            'sequence': data.get('sequence'),
            'is_final': data.get('is_final'),
            'sig_ok': ok,
            'sig_why': why,
            'text': text,
        }
    )
    return web.json_response({'ok': True})
 async def events(request: web.Request):
    """SSE stream to the browser."""
    resp = web.StreamResponse(
        headers={
            'Content-Type': 'text/event-stream',
            'Cache-Control': 'no-cache',
            'Connection': 'keep-alive',
            'Access-Control-Allow-Origin': '*',
        }
    )
    await resp.prepare(request)
    q: asyncio.Queue = asyncio.Queue()
    subscribers.append(q)
    try:
        await resp.write(b': connected\n\n')
        while True:
            try:
                ev = await asyncio.wait_for(q.get(), timeout=15)
                await resp.write(f'data: {json.dumps(ev, ensure_ascii=False)}\n\n'.encode())
            except asyncio.TimeoutError:
                await resp.write(b': ping\n\n')
    except (asyncio.CancelledError, ConnectionResetError):
        pass
    finally:
        if q in subscribers:
            subscribers.remove(q)
    return resp
 PAGE = r"""<!doctype html>
 <html lang="zh"><head><meta charset="utf-8"/>
 <meta name="viewport" content="width=device-width,initial-scale=1"/>
 <title>LangBot HTTP Bot · 调试台</title>
 <style>
  :root{
    --bg:#f7f8fa; --panel:#ffffff; --line:#e8eaed; --ink:#1f2329; --mut:#8a909a;
    --brand:#2563eb; --brand-soft:#eef3ff; --ok:#16a34a; --bad:#dc2626; --code:#f3f4f6;
  }
  *{box-sizing:border-box}
  html,body{height:100%}
  body{margin:0;background:var(--bg);color:var(--ink);
    font:14px/1.6 -apple-system,BlinkMacSystemFont,"Segoe UI",Roboto,"PingFang SC","Microsoft YaHei",sans-serif}
  .top{height:52px;background:var(--panel);border-bottom:1px solid var(--line);
    display:flex;align-items:center;gap:10px;padding:0 18px}
  .logo{width:26px;height:26px;border-radius:7px;background:var(--brand);display:grid;place-items:center;color:#fff;font-weight:700;font-size:14px}
  .top b{font-size:15px} .top .ver{font-size:12px;color:var(--mut)}
  .dot{width:8px;height:8px;border-radius:50%;background:#cbd2dc;display:inline-block;margin-right:5px;vertical-align:middle}
  .dot.on{background:var(--ok)} .dot.off{background:var(--bad)}
  .conn{margin-left:auto;font-size:12px;color:var(--mut)}
  .wrap{max-width:1080px;margin:0 auto;padding:18px;display:grid;grid-template-columns:1fr 360px;gap:16px}
  @media(max-width:880px){.wrap{grid-template-columns:1fr}}
  .card{background:var(--panel);border:1px solid var(--line);border-radius:12px;display:flex;flex-direction:column;min-height:0}
  .card h3{margin:0;padding:12px 16px;font-size:13px;font-weight:600;color:#4b5563;border-bottom:1px solid var(--line);display:flex;align-items:center;gap:8px}
  .chat{height:62vh}
  .msgs{flex:1;overflow:auto;padding:16px;display:flex;flex-direction:column;gap:12px}
  .row{display:flex;flex-direction:column;gap:4px;max-width:82%}
  .row.me{align-self:flex-end;align-items:flex-end}
  .row.bot{align-self:flex-start}
  .bub{padding:9px 13px;border-radius:12px;white-space:pre-wrap;word-break:break-word}
  .me .bub{background:var(--brand);color:#fff;border-bottom-right-radius:3px}
  .bot .bub{background:#f1f3f6;color:var(--ink);border-bottom-left-radius:3px}
  .meta{font-size:11px;color:var(--mut)}
  .meta .ok{color:var(--ok)} .meta .bad{color:var(--bad)}
  .sys{align-self:center;font-size:12px;color:var(--mut);background:#f1f3f6;border-radius:8px;padding:4px 12px}
  .bar{display:flex;gap:8px;padding:12px;border-top:1px solid var(--line)}
  .bar input{flex:1;border:1px solid var(--line);border-radius:9px;padding:10px 12px;font-size:14px;outline:none}
  .bar input:focus{border-color:var(--brand);box-shadow:0 0 0 3px var(--brand-soft)}
  .bar button{background:var(--brand);color:#fff;border:0;border-radius:9px;padding:0 18px;font-size:14px;font-weight:500;cursor:pointer}
  .bar button:disabled{opacity:.5;cursor:default}
  .side{height:62vh}
  .kv{padding:12px 16px;border-bottom:1px solid var(--line);font-size:12px}
  .kv .k{color:var(--mut)} .kv .v{color:var(--ink);word-break:break-all}
  .kv code{background:var(--code);border-radius:5px;padding:1px 5px;font-size:11px}
  .sessrow{display:flex;align-items:center;gap:8px;padding:10px 16px;border-bottom:1px solid var(--line);font-size:12px}
  .sessrow input{flex:1;border:1px solid var(--line);border-radius:7px;padding:5px 8px;font-size:12px}
  .sessrow button{border:1px solid var(--line);background:#fff;border-radius:7px;padding:5px 9px;font-size:12px;cursor:pointer;color:#4b5563}
  .trace{flex:1;overflow:auto;padding:10px 12px;font:11px/1.55 ui-monospace,SFMono-Regular,Menlo,monospace}
  .ev{padding:6px 8px;border-radius:7px;margin-bottom:6px;border:1px solid var(--line)}
  .ev .t{font-weight:600;font-size:10px;letter-spacing:.3px;text-transform:uppercase}
  .ev.out{background:#f5f8ff;border-color:#dbe6ff}.ev.out .t{color:var(--brand)}
  .ev.ack{background:#f4f6f8}.ev.ack .t{color:#6b7280}
  .ev.reply{background:#f1faf3;border-color:#cdeed6}.ev.reply .t{color:var(--ok)}
  .ev pre{margin:3px 0 0;white-space:pre-wrap;word-break:break-all;color:#374151}
 </style></head>
 <body>
 <div class="top">
  <div class="logo">L</div>
  <b>HTTP Bot 调试台</b><span class="ver">examples/http-bot</span>
  <span class="conn"><span class="dot off" id="cdot"></span><span id="conn">连接中…</span></span>
 </div>
 <div class="wrap">
  <!-- chat -->
  <div class="card chat">
    <h3>对话 · 真实发往运行中的 http_bot</h3>
    <div class="msgs" id="msgs"></div>
    <div class="bar">
      <input id="msg" placeholder="输入消息,回车发送…" autofocus/>
      <button id="send">发送</button>
    </div>
  </div>
  <!-- debug -->
  <div class="card side">
    <h3>调试信息</h3>
    <div class="kv"><span class="k">入站地址</span><br><span class="v"><code id="endpoint">/bots/&lt;uuid&gt;</code></span></div>
    <div class="kv"><span class="k">签名</span> <span class="v">HMAC-SHA256 · <code>X-LB-Signature</code></span></div>
    <div class="sessrow">
      <span class="k">会话</span>
      <input id="sid" value="playground-1"/>
      <button id="reset">新会话</button>
    </div>
    <div class="trace" id="trace"></div>
  </div>
 </div>
 <script>
 const $=s=>document.querySelector(s);
 const msgs=$('#msgs'),trace=$('#trace'),inp=$('#msg'),btn=$('#send'),
      conn=$('#conn'),cdot=$('#cdot'),sidIn=$('#sid');
 function el(c){const d=document.createElement('div');d.className=c;return d}
 function atBottom(n){n.scrollTop=n.scrollHeight}
 function bubble(side,text,metaHtml){
  const r=el('row '+side),b=el('bub');b.textContent=text;r.appendChild(b);
  if(metaHtml){const m=el('meta');m.innerHTML=metaHtml;r.appendChild(m)}
  msgs.appendChild(r);atBottom(msgs)}
 function sys(t){const d=el('sys');d.textContent=t;msgs.appendChild(d);atBottom(msgs)}
 function logEv(kind,title,obj){
  const e=el('ev '+kind),t=el('t');t.textContent=title;e.appendChild(t);
  if(obj!==undefined){const p=document.createElement('pre');
    p.textContent=typeof obj==='string'?obj:JSON.stringify(obj,null,2);e.appendChild(p)}
  trace.appendChild(e);atBottom(trace)}
 const es=new EventSource('/events');
 es.onopen=()=>{conn.textContent='SSE 已连接';cdot.className='dot on'};
 es.onerror=()=>{conn.textContent='SSE 断开,重连…';cdot.className='dot off'};
 es.onmessage=e=>{const ev=JSON.parse(e.data);
  if(ev.kind==='request'){
    if(ev.url)$('#endpoint').textContent=ev.url;
    logEv('out','出站 · 已签名 POST',{url:ev.url,session_id:ev.session_id,'X-LB-Signature':ev.sig});
  }else if(ev.kind==='ack'){
    const id=ev.data&&ev.data.data&&ev.data.data.accepted_message_id;
    sys(`LangBot 已接收 · HTTP ${ev.status}`);
    logEv('ack','入站确认 202',{status:ev.status,accepted_message_id:id||'-'});
  }else if(ev.kind==='reply'){
    const sig=ev.sig_ok?'<span class=ok>验签通过</span>':'<span class=bad>验签失败</span>';
    bubble('bot',ev.text,`seq=${ev.sequence} · ${ev.is_final?'<b>FINAL</b>':'中间段'} · ${sig}`);
    logEv('reply',`回调 · seq ${ev.sequence}${ev.is_final?' · FINAL':''}`,
      {session_id:ev.session_id,sequence:ev.sequence,is_final:ev.is_final,sig_ok:ev.sig_ok,text:ev.text});
  }};
 async function send(){
  const t=inp.value.trim();if(!t)return;inp.value='';btn.disabled=true;
  bubble('me',t,'已签名 → POST /bots/&lt;uuid&gt;');
  try{await fetch('/send',{method:'POST',headers:{'Content-Type':'application/json'},
    body:JSON.stringify({session_id:sidIn.value.trim()||'playground-1',text:t})});}
  catch(e){sys('发送失败:'+e)}
  btn.disabled=false;inp.focus();}
 btn.onclick=send;inp.addEventListener('keydown',e=>{if(e.key==='Enter')send()});
 $('#reset').onclick=()=>{sidIn.value='playground-'+Math.random().toString(36).slice(2,7);
  sys('已切换到新会话 '+sidIn.value);};
 sys('调试台就绪 · 每条消息都会真实发往运行中的 http_bot,右侧可观察签名 / 202 / 回调全过程。');
 </script>
 </body></html>"""
 async def main():
    api_key, bot_uuid = db_lookup()
    callback_url = f'http://{PUBLIC_IP}:{PORT}/callback'
    print(f'[init] http_bot uuid = {bot_uuid}')
    print(f'[init] callback_url  = {callback_url}')
    ok = await configure_bot(api_key, bot_uuid, callback_url)
    if not ok:
        print('[warn] bot config update failed; check the API key / payload shape')
    app = web.Application()
    app['bot_uuid'] = bot_uuid
    app.router.add_get('/', index)
    app.router.add_post('/send', send)
    app.router.add_post('/callback', callback)
    app.router.add_get('/events', events)
    runner = web.AppRunner(app)
    await runner.setup()
    site = web.TCPSite(runner, '0.0.0.0', PORT)
    await site.start()
    print(f'\n  ▶ Open:  http://{PUBLIC_IP}:{PORT}/\n')
    while True:
        await asyncio.sleep(3600)
 if __name__ == '__main__':
    asyncio.run(main())
@@ -0,0 +1,48 @@
 # Page Bot Adapter — Embed Demo
 > English | [中文](./README.zh.md)
 A single self-contained HTML page that demos the LangBot **Page Bot**
 (`web_page_bot`) embeddable chat widget — the one you drop onto any website with
 a single `<script>` tag.
 Full guide: [docs.langbot.app — Page Bot](https://docs.langbot.app/en/usage/platforms/webpage).
 ## Files
 | File | What it is |
 |---|---|
 | `index.html` | **Browser demo** — open it, point it at a running LangBot instance + a Page Bot you created, and it loads the live embed widget so you can chat with the bot exactly as a site visitor would. Zero deps, no build step. |
 ## How to use
 1. In the LangBot WebUI, create a bot with the **Page Bot** (`页面机器人`)
   adapter and bind it to a working pipeline. Copy its **bot UUID** from the
   generated embed code.
 2. Open `index.html` in a browser. Any of these work:
   - double-click the file, or
   - serve the repository root: `python3 -m http.server 8930`, then open
     `http://localhost:8930/examples/web-page-bot/`.
 3. Fill in:
   - **LangBot base URL** — where your instance is reachable from the browser
     (e.g. `http://localhost:5300`, or your public address).
   - **Page Bot UUID** — from step 1.
   - **Widget title** — optional, sets the `data-title` attribute.
 4. Click **Load widget**. A floating chat bubble appears in the bottom-right
   corner — click it and chat.
 The page also renders the exact `<script>` snippet you'd paste into your own
 site (before `</body>`), and updates it live as you edit the fields.
 ## What it demonstrates
 - The embed contract: `<script data-title="…" src="<base>/api/v1/embed/<uuid>/widget.js"></script>`.
 - `widget.js` is served by LangBot pre-configured for that bot UUID — title,
  bubble icon, language and optional Cloudflare Turnstile protection all come
  from the bot's config, no page changes needed.
 - Messages travel over a WebSocket to the bot's bound pipeline; replies stream
  back into the bubble.
 > The widget loads `widget.js` from your LangBot instance, so the **base URL
 > must be reachable from the browser** you open this page in. If LangBot runs on
 > a server, use its public address instead of `localhost`.
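
The embed contract above can be sketched as a small helper that builds the tag from a base URL and a bot UUID. This is only an illustration: `build_embed_snippet` and `UUID_RE` are hypothetical names, not part of LangBot, and the UUID pattern simply mirrors the check the demo page performs before loading the widget.

```python
import html
import re

# Same shape the demo page validates before loading the widget.
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def build_embed_snippet(base_url: str, bot_uuid: str, title: str = "LangBot") -> str:
    """Return the <script> tag for a Page Bot (hypothetical helper)."""
    base = base_url.strip().rstrip("/")  # the contract expects no trailing slash
    uuid = bot_uuid.strip()
    if not UUID_RE.match(uuid):
        raise ValueError(f"not a valid bot UUID: {bot_uuid!r}")
    src = f"{base}/api/v1/embed/{uuid}/widget.js"
    # Escape attribute values so a title containing quotes cannot break the tag.
    return (
        f'<script data-title="{html.escape(title, quote=True)}" '
        f'src="{html.escape(src, quote=True)}"></script>'
    )
```

Escaping the title is a design choice of this sketch; a tag built by plain string concatenation would be malformed for a title that contains a double quote.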
@@ -0,0 +1,44 @@
 # 页面机器人适配器 —— 嵌入演示
 > [English](./README.md) | 中文
 一个自包含的单文件 HTML 页面，用于演示 LangBot **页面机器人**
 (`web_page_bot`) 的可嵌入聊天组件 —— 也就是你用一行 `<script>` 标签就能放到任意
 网站上的那个组件。
 完整指南：[docs.langbot.app —— 页面机器人](https://docs.langbot.app/zh/usage/platforms/webpage)。
 ## 文件清单
 | 文件 | 是什么 |
 |---|---|
 | `index.html` | **浏览器演示页** —— 打开它，填上一个运行中的 LangBot 实例地址 + 你创建的页面机器人，它就会加载真实的嵌入组件，让你像网站访客一样和机器人对话。零依赖，无需构建。 |
 ## 使用方法
 1. 在 LangBot WebUI 中，用 **页面机器人**（`web_page_bot`）适配器创建一个机器人，
   并绑定一个可用的流水线。从生成的嵌入代码里复制它的 **机器人 UUID**。
 2. 在浏览器中打开 `index.html`，以下任一方式皆可：
   - 直接双击该文件；或
   - 起一个静态服务：`python3 -m http.server 8930`，然后打开
     `http://localhost:8930/examples/web-page-bot/`。
 3. 填写：
   - **LangBot base URL** —— 你的实例在该浏览器中可访问的地址
     （例如 `http://localhost:5300`，或你的公网地址）。
   - **页面机器人 UUID** —— 第 1 步里复制的。
   - **组件标题** —— 可选，对应 `data-title` 属性。
 4. 点击 **Load widget**。页面右下角会出现一个浮动聊天气泡 —— 点开即可对话。
 页面还会实时渲染出你需要粘贴到自己网站（放在 `</body>` 前）的那段 `<script>`
 代码，并随着你编辑输入框同步更新。
 ## 它演示了什么
 - 嵌入契约：`<script data-title="…" src="<base>/api/v1/embed/<uuid>/widget.js"></script>`。
 - `widget.js` 由 LangBot 针对该机器人 UUID 预配置后下发 —— 标题、气泡图标、语言
  以及可选的 Cloudflare Turnstile 防护，全部来自机器人配置，无需改动页面。
 - 消息通过 WebSocket 发往机器人绑定的流水线，回复流式回到气泡中。
 > 组件会从你的 LangBot 实例加载 `widget.js`，因此 **base URL 必须能从你打开本页
 > 的浏览器访问到**。如果 LangBot 部署在服务器上，请用它的公网地址而非
 > `localhost`。
@@ -0,0 +1,205 @@
 <!doctype html>
 <html lang="en">
 <head>
 <meta charset="utf-8" />
 <meta name="viewport" content="width=device-width, initial-scale=1" />
 <title>LangBot Page Bot · Embed Demo</title>
 <style>
  :root {
    --bg: #f7f8fa; --panel: #ffffff; --line: #e8eaed; --ink: #1f2329;
    --mut: #8a909a; --brand: #2563eb; --brand-soft: #eef3ff;
    --ok: #16a34a; --bad: #dc2626; --code: #f3f4f6;
  }
  * { box-sizing: border-box; }
  html, body { height: 100%; }
  body {
    margin: 0; background: var(--bg); color: var(--ink);
    font: 14px/1.6 -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto,
      "PingFang SC", "Microsoft YaHei", sans-serif;
  }
  .top {
    height: 52px; background: var(--panel); border-bottom: 1px solid var(--line);
    display: flex; align-items: center; gap: 10px; padding: 0 18px;
  }
  .logo {
    width: 26px; height: 26px; border-radius: 7px; background: var(--brand);
    display: grid; place-items: center; color: #fff; font-weight: 700; font-size: 14px;
  }
  .top b { font-size: 15px; }
  .top .ver { font-size: 12px; color: var(--mut); }
  .wrap { max-width: 760px; margin: 0 auto; padding: 28px 18px 80px; }
  .hero h1 { margin: 8px 0 6px; font-size: 22px; }
  .hero p { margin: 0 0 4px; color: var(--mut); }
  .card {
    background: var(--panel); border: 1px solid var(--line); border-radius: 12px;
    padding: 20px; margin-top: 20px;
  }
  .card h3 {
    margin: 0 0 14px; font-size: 14px; font-weight: 600; color: #4b5563;
    display: flex; align-items: center; gap: 8px;
  }
  .card h3 .num {
    width: 20px; height: 20px; border-radius: 50%; background: var(--brand-soft);
    color: var(--brand); display: grid; place-items: center; font-size: 12px; font-weight: 700;
  }
  .field { margin-bottom: 14px; }
  .field:last-child { margin-bottom: 0; }
  .field label { display: block; font-size: 12px; color: var(--mut); margin-bottom: 5px; }
  .field input {
    width: 100%; border: 1px solid var(--line); border-radius: 9px;
    padding: 10px 12px; font-size: 14px; outline: none; font-family: inherit;
  }
  .field input:focus { border-color: var(--brand); box-shadow: 0 0 0 3px var(--brand-soft); }
  .hint { font-size: 12px; color: var(--mut); margin-top: 5px; }
  .hint code { background: var(--code); border-radius: 5px; padding: 1px 5px; font-size: 11px; }
  .actions { display: flex; gap: 10px; margin-top: 18px; align-items: center; }
  button {
    border: 0; border-radius: 9px; padding: 10px 18px; font-size: 14px;
    font-weight: 500; cursor: pointer; font-family: inherit;
  }
  .btn-primary { background: var(--brand); color: #fff; }
  .btn-primary:disabled { opacity: .5; cursor: default; }
  .btn-ghost { background: #fff; border: 1px solid var(--line); color: #4b5563; }
  .status { font-size: 13px; color: var(--mut); }
  .status .ok { color: var(--ok); }
  .status .bad { color: var(--bad); }
  pre {
    background: #0f172a; color: #e2e8f0; border-radius: 10px; padding: 14px 16px;
    overflow: auto; font: 12px/1.6 ui-monospace, SFMono-Regular, Menlo, monospace;
    margin: 0;
  }
  .snippet-row { position: relative; }
  .snippet-row .copy {
    position: absolute; top: 10px; right: 10px; background: rgba(255,255,255,.12);
    color: #fff; border: 0; border-radius: 7px; padding: 5px 10px; font-size: 12px; cursor: pointer;
  }
  ul.steps { margin: 0; padding-left: 18px; color: #4b5563; }
  ul.steps li { margin-bottom: 6px; }
 </style>
 </head>
 <body>
 <div class="top">
  <div class="logo">L</div>
  <b>Page Bot · Embed Demo</b>
  <span class="ver">examples/web-page-bot</span>
 </div>
 <div class="wrap">
  <div class="hero">
    <h1>Try the LangBot Page Bot widget</h1>
    <p>Point this page at a running LangBot instance and a <strong>Page Bot</strong> you created,</p>
    <p>then load the live embed widget below to chat with it — exactly as your site visitors would.</p>
  </div>
  <div class="card">
    <h3><span class="num">1</span> Connect your Page Bot</h3>
    <div class="field">
      <label for="base">LangBot base URL</label>
      <input id="base" placeholder="http://localhost:5300" value="http://localhost:5300" />
      <div class="hint">The address where your LangBot instance is reachable from this browser. No trailing slash.</div>
    </div>
    <div class="field">
      <label for="uuid">Page Bot UUID</label>
      <input id="uuid" placeholder="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" />
      <div class="hint">Create a bot with the <code>Page Bot</code> adapter in the WebUI, then copy its UUID from the embed code.</div>
    </div>
    <div class="field">
      <label for="title">Widget title (optional)</label>
      <input id="title" placeholder="LangBot" value="LangBot" />
    </div>
    <div class="actions">
      <button id="load" class="btn-primary">Load widget</button>
      <button id="unload" class="btn-ghost">Remove widget</button>
      <span class="status" id="status">Not loaded.</span>
    </div>
  </div>
  <div class="card">
    <h3><span class="num">2</span> The embed snippet</h3>
    <p style="margin:0 0 12px;color:var(--mut)">This is exactly what you paste into your own site (before <code>&lt;/body&gt;</code>). It updates as you edit the fields above.</p>
    <div class="snippet-row">
      <button class="copy" id="copy">Copy</button>
      <pre id="snippet">&lt;script data-title="LangBot" src="http://localhost:5300/api/v1/embed/&lt;bot-uuid&gt;/widget.js"&gt;&lt;/script&gt;</pre>
    </div>
  </div>
  <div class="card">
    <h3><span class="num">3</span> How it works</h3>
    <ul class="steps">
      <li>The <code>&lt;script&gt;</code> tag pulls <code>widget.js</code> from your LangBot instance, pre-configured for that bot UUID.</li>
      <li>A floating chat bubble appears in the bottom-right corner of the page.</li>
      <li>Messages travel over a WebSocket to the bot's bound pipeline; replies stream back into the bubble.</li>
      <li>Title, bubble icon, language and optional Cloudflare Turnstile protection are all set in the bot's config — no page changes needed.</li>
    </ul>
  </div>
 </div>
 <script>
  var $ = function (s) { return document.querySelector(s); };
  var baseEl = $("#base"), uuidEl = $("#uuid"), titleEl = $("#title"),
      statusEl = $("#status"), snippetEl = $("#snippet");
  var WIDGET_ID = "langbot-embed-demo-script";
  function clean(v) { return (v || "").trim().replace(/\/+$/, ""); }
  function buildSrc() {
    var base = clean(baseEl.value) || "http://localhost:5300";
    var uuid = uuidEl.value.trim() || "<bot-uuid>";
    return base + "/api/v1/embed/" + uuid + "/widget.js";
  }
  function refreshSnippet() {
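    // Escape & and " so a title containing quotes still yields a valid attribute.
    var title = (titleEl.value.trim() || "LangBot").replace(/&/g, "&amp;").replace(/"/g, "&quot;");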
    var src = buildSrc();
    snippetEl.textContent =
      '<script data-title="' + title + '" src="' + src + '"><\/script>';
  }
  function setStatus(html) { statusEl.innerHTML = html; }
  function removeWidget() {
    var old = document.getElementById(WIDGET_ID);
    if (old) old.remove();
    // The widget injects its own DOM (bubble + panel). Clear the common containers it creates.
    document.querySelectorAll('[id^="langbot-"]').forEach(function (n) {
      if (n.id !== WIDGET_ID) n.remove();
    });
  }
  function loadWidget() {
    var uuid = uuidEl.value.trim();
    var uuidRe = /^[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}$/i;
    if (!uuidRe.test(uuid)) {
      setStatus('<span class="bad">Enter a valid bot UUID first.</span>');
      return;
    }
    removeWidget();
    var s = document.createElement("script");
    s.id = WIDGET_ID;
    s.setAttribute("data-title", titleEl.value.trim() || "LangBot");
    s.src = buildSrc();
    s.onload = function () {
      setStatus('<span class="ok">Widget loaded — look bottom-right.</span>');
    };
    s.onerror = function () {
      setStatus('<span class="bad">Failed to load widget.js — check the base URL and that the bot is enabled.</span>');
    };
    document.body.appendChild(s);
    setStatus("Loading…");
  }
  $("#load").onclick = loadWidget;
  $("#unload").onclick = function () {
    removeWidget();
    setStatus("Widget removed.");
  };
  $("#copy").onclick = function () {
    navigator.clipboard.writeText(snippetEl.textContent).then(function () {
      var b = $("#copy"); b.textContent = "Copied"; setTimeout(function () { b.textContent = "Copy"; }, 1200);
    });
  };
  [baseEl, uuidEl, titleEl].forEach(function (el) { el.addEventListener("input", refreshSnippet); });
  refreshSnippet();
 </script>
 </body>
 </html>
@@ -1,6 +1,6 @@
 [project]
 name = "langbot"
-version = "4.10.2"
+version = "4.10.4"
 description = "Production-grade platform for building agentic IM bots"
 readme = "README.md"
 license-files = ["LICENSE"]
@@ -8,7 +8,7 @@ requires-python = ">=3.11,<4.0"
 dependencies = [
    "aiocqhttp>=1.4.4",
    "aiofiles>=24.1.0",
-    "aiohttp>=3.14.0",
+    "aiohttp>=3.14.1",
    "aioshutil>=1.5",
    "aiosqlite>=0.21.0",
    "anthropic>=0.51.0",
@@ -16,7 +16,7 @@ dependencies = [
    "async-lru>=2.0.5",
    "certifi>=2025.4.26",
    "colorlog~=6.6.0",
-    "cryptography>=46.0.7",
+    "cryptography>=48.0.1",
    "dashscope>=1.25.10",
    "dingtalk-stream>=0.24.0",
    "discord-py>=2.5.2",
@@ -61,16 +61,16 @@ dependencies = [
    "beautifulsoup4>=4.12.3",
    "ebooklib>=0.18",
    "html2text>=2024.2.26",
-    "langchain>=0.2.0",
+    "langchain>=1.3.9",
    "langchain-core>=1.3.3",
-    "langsmith>=0.8.0",
+    "langsmith>=0.8.18",
    "python-multipart>=0.0.27",
    "Mako>=1.3.12",
    "langchain-text-splitters>=1.1.2",
    "chromadb>=1.0.0,<2.0.0",
    "qdrant-client (>=1.15.1,<2.0.0)",
    "pyseekdb==1.1.0.post3",
-    "langbot-plugin==0.4.5",
+    "langbot-plugin==0.4.6",
    "asyncpg>=0.30.0",
    "line-bot-sdk>=3.19.0",
    "matrix-nio>=0.25.2",
@@ -26,7 +26,7 @@ and LangBot's own Local Agent) working with the LangBot ecosystem.
 ## Quick start (for an AI agent)
-1. Read this README, `AGENTS.md`, and `qa-agent-docs/` to understand the layout.
+1. Read this README, `AGENTS.md`, and `docs/user-guide.md` to understand the layout.
 2. Read `skills/.env` for shared local defaults. On a new machine, copy
   `skills/.env.example` to `skills/.env.local` (gitignored) and override
   machine-specific values there. Never commit secrets.
@@ -48,6 +48,7 @@ bin/lbs env show     # inspect resolved env defaults (redacted)
 bin/lbs env doctor   # diagnose local environment readiness
 bin/lbs case list --ready
 bin/lbs test plan <case-id>
 bin/lbs suite plan langbot-debug-chat-load-gate
 ```
 ## Maintenance rule
@@ -0,0 +1,171 @@
 # LangBot QA Skills User Guide
 Use this guide as the first operational path after reading `README.md` and
 `AGENTS.md`.
 ## 1. Configure Local Inputs
 Read `skills/.env`, then create `skills/.env.local` for machine-local values.
 Do not commit `.env.local`, browser profiles, reports, tokens, API keys, OAuth
 state, or provider credentials.
 Minimum local fields for live browser QA:
 ```bash
 LANGBOT_REPO=/path/to/LangBot
 LANGBOT_WEB_REPO=/path/to/LangBot/web
 LANGBOT_BACKEND_URL=http://127.0.0.1:5300
 LANGBOT_FRONTEND_URL=http://127.0.0.1:3000
 LANGBOT_DEV_FRONTEND_URL=http://127.0.0.1:3000
 LANGBOT_BROWSER_PROFILE=/path/to/langbot-browser-profile
 LANGBOT_CHROMIUM_EXECUTABLE=/path/to/chromium-or-playwright-chrome
 LANGBOT_E2E_LOGIN_USER=qa-local@example.com
 ```
 `LANGBOT_E2E_LOGIN_USER` is a local QA account. The setup automation uses the
 LangBot recovery key from the active checkout to initialize or refresh that
 local account and write a browser `localStorage` token. It does not need the
 user's GitHub or Space credentials.
 ## 2. Check Readiness
 From `skills/`:
 ```bash
 bin/lbs env show
 bin/lbs env doctor
 bin/lbs validate
 bin/lbs index --check
 ```
 `env doctor` should report reachable backend and frontend URLs before live
 browser cases are run. Missing Space provider credentials are an environment
 problem, not a LangBot product result; classify them as `env_issue` and
 configure the local Space provider before measuring Debug Chat performance.
 ## 3. Start Services
 Start the backend from `LANGBOT_REPO`:
 ```bash
 cd "$LANGBOT_REPO"
 uv run main.py
 ```
 Start the standalone frontend from `LANGBOT_WEB_REPO` and point it at the
 backend:
 ```bash
 cd "$LANGBOT_WEB_REPO"
 VITE_API_BASE_URL="$LANGBOT_BACKEND_URL" pnpm dev --host 0.0.0.0
 ```
 If `VITE_API_BASE_URL` is missing, browser tests can load the Vite page but send
 API requests to the frontend port, which produces false UI failures.
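One plausible way this failure arises is sketched below. It assumes the frontend falls back to the page's own origin when no API base is configured; `resolveApiUrl` is a hypothetical helper written for illustration and is not taken from the LangBot frontend source.

```javascript
// Hypothetical helper: resolve an API path against the configured base, or
// against the page origin when the base is empty (assumed fallback behavior).
function resolveApiUrl(path, apiBaseUrl, pageOrigin) {
  const configured = typeof apiBaseUrl === "string" ? apiBaseUrl.trim() : "";
  const base = configured || pageOrigin;
  return new URL(path, base).toString();
}

// Base configured: the request goes to the backend port.
console.log(resolveApiUrl("/api/v1/pipelines", "http://127.0.0.1:5300", "http://127.0.0.1:3000"));
// prints "http://127.0.0.1:5300/api/v1/pipelines"

// Base missing: the request goes to the Vite dev server instead.
console.log(resolveApiUrl("/api/v1/pipelines", "", "http://127.0.0.1:3000"));
// prints "http://127.0.0.1:3000/api/v1/pipelines"
```

Under that assumption the page itself still loads, which is why the symptom looks like a UI failure rather than a configuration error.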
 ## 4. Prepare User-Path Fixtures
 For local-agent Debug Chat cases and the user-path performance gate:
 ```bash
 node scripts/e2e/ensure-local-agent-pipeline.mjs --write-env
 ```
 The script:
 - refreshes the local QA login and browser token;
 - marks the local wizard as skipped;
 - creates or updates a local QA pipeline;
 - scans Space LLM models, tests candidates, and switches to the first working
  Space model with tested fallback models;
 - writes `LANGBOT_PIPELINE_URL`, `LANGBOT_PIPELINE_NAME`, and local-agent
  pipeline/model variables into `skills/.env.local`;
 - returns `env_issue` when no Space model can be scanned or tested.
 Useful model controls:
 ```bash
 LANGBOT_E2E_MODEL_TEST_LIMIT=8
 LANGBOT_E2E_MODEL_FALLBACK_COUNT=3
 LANGBOT_E2E_SKIP_MODEL_UUIDS=uuid-a,uuid-b
 LANGBOT_E2E_SKIP_MODEL_NAMES=model-a,model-b
 LANGBOT_E2E_SCAN_SPACE_MODELS=true
 ```
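The scan-test-switch step and the controls above can be pictured roughly as follows. This is a minimal sketch, not the code in `ensure-local-agent-pipeline.mjs`: `selectModels`, its option names, and the `testModel` callback are hypothetical, and the mapping from the `LANGBOT_E2E_*` variables to these options is an assumption.

```javascript
// Hypothetical sketch: choose a primary model plus tested fallbacks.
// `testModel` is an assumed async callback that resolves to true when a model
// answers a probe request.
async function selectModels(candidates, testModel, options = {}) {
  const {
    testLimit = 8,     // assumed meaning of LANGBOT_E2E_MODEL_TEST_LIMIT
    fallbackCount = 3, // assumed meaning of LANGBOT_E2E_MODEL_FALLBACK_COUNT
    skipUuids = [],    // assumed meaning of LANGBOT_E2E_SKIP_MODEL_UUIDS
    skipNames = [],    // assumed meaning of LANGBOT_E2E_SKIP_MODEL_NAMES
  } = options;

  const eligible = candidates.filter(
    (model) => !skipUuids.includes(model.uuid) && !skipNames.includes(model.name),
  );

  const working = [];
  let tested = 0;
  for (const model of eligible) {
    // Stop once the test budget is spent or primary + fallbacks are filled.
    if (tested >= testLimit || working.length >= fallbackCount + 1) break;
    tested += 1;
    let ok = false;
    try {
      ok = Boolean(await testModel(model));
    } catch {
      ok = false; // a probe that throws counts as a failed candidate
    }
    if (ok) working.push(model);
  }

  if (working.length === 0) {
    return { status: "env_issue", primary: null, fallbacks: [] };
  }
  return {
    status: "pass",
    primary: working[0].uuid,
    fallbacks: working.slice(1).map((model) => model.uuid),
  };
}
```

Only models that passed a probe end up as fallbacks, which matches the "tested fallback models" wording in the list above.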
 The setup also writes a `max-round` value into the pipeline config for
 compatibility with the current runtime, because this backend still reads that
 field directly during message truncation. Do not treat it as a long-term QA
 contract.
 ## 5. Run Gates
 Fast contract gate, no live service required:
 ```bash
 bin/lbs suite run langbot-performance-contract-gate --run-id langbot-contract-local
 ```
 Live backend gate:
 ```bash
 bin/lbs suite run langbot-live-backend-gate --run-id langbot-backend-local
 ```
 Browser-visible user-path performance gate:
 ```bash
 bin/lbs suite plan langbot-user-path-performance-gate
 bin/lbs suite run langbot-user-path-performance-gate --run-id langbot-user-path-local --include-manual-check
 ```
 Controlled Debug Chat message-path load gate (manual/non-required; run fake-provider cases serially when they share `LANGBOT_FAKE_PROVIDER_URL`):
 ```bash
 bin/lbs suite plan langbot-debug-chat-load-gate
 bin/lbs test run langbot-fake-provider-debug-chat-load --run-id langbot-fake-load-local
 bin/lbs test run langbot-fake-provider-debug-chat-slow-load --run-id langbot-fake-slow-local
 bin/lbs test run langbot-fake-provider-debug-chat-fault-recovery --run-id langbot-fake-fault-local
 bin/lbs test run langbot-space-debug-chat-concurrency-smoke --run-id langbot-space-smoke-local
 ```
 Cross-pipeline Debug Chat isolation is a separate manual regression gate because
 current releases may fail it due to product bug #2286:
 ```bash
 bin/lbs suite plan langbot-debug-chat-isolation-gate
 bin/lbs suite run langbot-debug-chat-isolation-gate --run-id langbot-debug-chat-isolation-local --include-manual-check
 ```
 Start with `langbot-fake-provider-debug-chat-load`. It launches a local
 OpenAI-compatible fake provider, creates the matching provider/model/pipeline,
 then sends concurrent WebSocket Debug Chat messages through the real backend.
 Use `langbot-fake-provider-debug-chat-slow-load` to measure the same path under
 deterministic streaming latency. Use
 `langbot-fake-provider-debug-chat-fault-recovery` to inject bounded provider
 HTTP failures and confirm later Debug Chat requests recover. Use the separate
 `langbot-debug-chat-isolation-gate` to verify that concurrent Debug Chat traffic
 on two pipelines does not leak assistant responses across pipeline boundaries;
 current releases may fail that gate because of #2286, so keep it out of the
 normal load gate until the product fix lands.
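To make "bounded provider HTTP failures" concrete, here is a minimal sketch of a fault schedule. The field names `fail_first_n` and `fail_every_n` come from the fake-provider config in this change, but their exact semantics are an assumption here, and `shouldInjectFault` is a hypothetical function rather than the fake provider's actual logic.

```javascript
// Hypothetical fault schedule. `requestNumber` is 1-based.
// Assumed semantics: fail the first N requests, and/or every Nth request.
function shouldInjectFault(requestNumber, config) {
  const failFirstN = config.fail_first_n || 0;
  const failEveryN = config.fail_every_n || 0;
  if (requestNumber <= failFirstN) return true;
  if (failEveryN > 0 && requestNumber % failEveryN === 0) return true;
  return false;
}

// With fail_first_n = 2 only the first two requests fail, so later requests
// can demonstrate recovery.
const schedule = [1, 2, 3, 4, 5].map((n) =>
  shouldInjectFault(n, { fail_first_n: 2, fail_every_n: 0 }),
);
console.log(schedule); // [ true, true, false, false, false ]
```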
 Use `langbot-space-debug-chat-concurrency-smoke` only as a low-volume live
 provider smoke; it includes Space/model/network latency and should be compared
 against the fake-provider baseline before attributing failures to LangBot.
 `manual_check` means the agent must confirm the declared preconditions for that
 run window. When setup automation is declared, run output may stop early with
 `env_issue`; fix that environment input before treating the product path as
 measured.
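The `env_issue` outcome is also visible in process exit codes. The sketch below mirrors the convention used by the setup scripts in this change (`pass` exits 0, `env_issue` exits 2, anything else exits 1) and a message-based classifier in the style of `looksLikeEnvIssue`. The names `classifyFailure` and `exitCodeFor` are illustrative only.

```javascript
// Error messages that point at the environment rather than the product.
// Pattern modeled on looksLikeEnvIssue in ensure-fake-provider-cross-pipelines.mjs.
const ENV_ISSUE_PATTERN =
  /fetch failed|ECONNREFUSED|ENOTFOUND|LANGBOT_.*not configured|Could not read recovery_key|Backend did not respond/i;

// Classify a thrown error (or plain string) as "env_issue" or "fail".
function classifyFailure(error) {
  const message = String(error?.message || error || "");
  return ENV_ISSUE_PATTERN.test(message) ? "env_issue" : "fail";
}

// Map a result status to the process exit code.
function exitCodeFor(status) {
  if (status === "pass") return 0;
  if (status === "env_issue") return 2;
  return 1;
}

console.log(exitCodeFor(classifyFailure(new Error("LANGBOT_BACKEND_URL is not configured.")))); // 2
console.log(exitCodeFor(classifyFailure(new Error("Failed to list providers.")))); // 1
```

An exit code of 2 therefore means "fix the environment and rerun", not "the product failed".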
 ## 6. Read Results
 Suite reports live under `skills/reports/`. Evidence lives under
 `skills/reports/evidence/<run-id>/`.
 For performance cases, inspect:
 - `metrics.json` for p50/p95/p99, error rate, and total duration;
 - `automation-result.json` for threshold decisions and artifacts;
 - `console.log` and `network.log` for frontend/API failures;
 - backend logs for provider, runner, WebSocket, or persistence failures.
 Do not call a user-path performance result a LangBot overhead regression until
 provider/tool/network time has been separated or ruled out.
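When cross-checking the numbers in `metrics.json` by hand, a small calculation like the one below can help. It is a sketch under stated assumptions: it uses the nearest-rank percentile method, computes latency percentiles over successful samples only, and assumes a sample shape of `{ ok, durationMs }`. The method and field names the QA tooling actually uses are not shown in this change and may differ.

```javascript
// Nearest-rank percentile over an ascending array of numbers.
function percentile(sortedValues, p) {
  if (sortedValues.length === 0) return null;
  const rank = Math.ceil((p / 100) * sortedValues.length);
  const index = Math.min(sortedValues.length - 1, Math.max(0, rank - 1));
  return sortedValues[index];
}

// Summarize samples shaped like { ok: boolean, durationMs: number } (assumed shape).
function summarize(samples) {
  const durations = samples
    .filter((sample) => sample.ok)
    .map((sample) => sample.durationMs)
    .sort((a, b) => a - b);
  const errors = samples.filter((sample) => !sample.ok).length;
  return {
    count: samples.length,
    error_rate: samples.length === 0 ? 0 : errors / samples.length,
    p50_ms: percentile(durations, 50),
    p95_ms: percentile(durations, 95),
    p99_ms: percentile(durations, 99),
  };
}

// Ten successful samples with durations 10, 20, ..., 100 ms.
const demo = Array.from({ length: 10 }, (_, i) => ({ ok: true, durationMs: (i + 1) * 10 }));
console.log(summarize(demo));
```

With only ten samples, p95 and p99 both land on the slowest request, which is a reminder that tail percentiles from small runs say little.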
@@ -48,7 +48,18 @@
    },
    "type": {
      "type": "string",
-      "enum": ["smoke", "regression", "feature", "provider", "exploratory"]
+      "enum": [
        "smoke",
        "regression",
        "feature",
        "provider",
        "exploratory",
        "contract",
        "performance",
        "reliability",
        "chaos",
        "security"
      ]
    },
    "priority": {
      "type": "string",
@@ -102,7 +113,11 @@
          "backend_log",
          "frontend_log",
          "api_diagnostic",
-          "filesystem"
+          "filesystem",
          "metrics",
          "trace",
          "profile",
          "resource_log"
        ]
      },
      "minItems": 1
@@ -188,9 +203,101 @@
      "type": "string",
      "enum": ["person", "group"]
    },
    "automation_debug_chat_response_p95_ms": {
      "type": "string"
    },
    "automation_debug_chat_max_error_rate": {
      "type": "string"
    },
    "automation_debug_chat_load_requests": {
      "type": "string"
    },
    "automation_debug_chat_load_concurrency": {
      "type": "string"
    },
    "automation_debug_chat_load_timeout_ms": {
      "type": "string"
    },
    "automation_debug_chat_load_response_p95_ms": {
      "type": "string"
    },
    "automation_debug_chat_load_first_response_p95_ms": {
      "type": "string"
    },
    "automation_debug_chat_load_max_error_rate": {
      "type": "string"
    },
    "automation_debug_chat_load_min_error_rate": {
      "type": "string"
    },
    "automation_debug_chat_load_min_error_count": {
      "type": "string"
    },
    "automation_debug_chat_load_min_ok_count": {
      "type": "string"
    },
    "automation_debug_chat_load_min_provider_fault_count": {
      "type": "string"
    },
    "automation_debug_chat_load_expected_prefix": {
      "type": "string"
    },
    "automation_debug_chat_load_prompt_template": {
      "type": "string"
    },
    "automation_debug_chat_load_stream": {
      "type": "string",
      "enum": ["0", "1", "false", "true"]
    },
    "automation_debug_chat_load_reset": {
      "type": "string",
      "enum": ["0", "1", "false", "true"]
    },
    "automation_debug_chat_load_fail_on_final_mismatch": {
      "type": "string",
      "enum": ["0", "1", "false", "true"]
    },
    "automation_fake_provider_response_text": {
      "type": "string"
    },
    "automation_fake_provider_first_token_delay_ms": {
      "type": "string"
    },
    "automation_fake_provider_chunk_delay_ms": {
      "type": "string"
    },
    "automation_fake_provider_chunk_count": {
      "type": "string"
    },
    "automation_fake_provider_fail_first_n": {
      "type": "string"
    },
    "automation_fake_provider_fail_every_n": {
      "type": "string"
    },
    "automation_fake_provider_fault_status": {
      "type": "string"
    },
    "automation_fake_provider_fail_after_first_chunk": {
      "type": "string",
      "enum": ["0", "1", "false", "true"]
    },
    "automation_fake_provider_dynamic_response": {
      "type": "string",
      "enum": ["0", "1", "false", "true"]
    },
    "automation_filesystem_checks_json": {
      "type": "string"
    },
    "metrics_thresholds_json": {
      "type": "string"
    },
    "load_profile_json": {
      "type": "string"
    },
    "fault_model_json": {
      "type": "string"
    },
    "automation_pipeline_url_env": {
      "type": "string",
      "pattern": "^[A-Z][A-Z0-9_]*$"
@@ -18,7 +18,17 @@
    },
    "type": {
      "type": "string",
-      "enum": ["smoke", "regression", "release_gate", "exploratory"]
+      "enum": [
        "smoke",
        "regression",
        "release_gate",
        "exploratory",
        "contract",
        "performance",
        "reliability",
        "chaos",
        "security"
      ]
    },
    "priority": {
      "type": "string",
@@ -0,0 +1,205 @@
 #!/usr/bin/env node
 import { spawn } from "node:child_process";
 import { mkdir, readFile, writeFile } from "node:fs/promises";
 import { dirname, resolve } from "node:path";
 import { env } from "node:process";
 import {
  appendLine,
  ensureEvidence,
  evidencePaths,
  loadEnvFiles,
  redact,
  writeResult,
 } from "./lib/langbot-e2e.mjs";
 const caseId = "ensure-fake-provider-cross-pipelines";
 const DEFAULT_PIPELINE_A_NAME = "LangBot QA Fake Provider Debug Chat A";
 const DEFAULT_PIPELINE_B_NAME = "LangBot QA Fake Provider Debug Chat B";
 await loadEnvFiles();
 const paths = evidencePaths(caseId);
 await ensureEvidence(paths);
 const writeEnv = process.argv.includes("--write-env");
 const envLocalPath = resolve("skills/.env.local");
 const pipelineAName = env.LANGBOT_FAKE_PROVIDER_PIPELINE_A_NAME || DEFAULT_PIPELINE_A_NAME;
 const pipelineBName = env.LANGBOT_FAKE_PROVIDER_PIPELINE_B_NAME || DEFAULT_PIPELINE_B_NAME;
 const result = {
  source: "setup_automation",
  case_id: caseId,
  run_id: paths.runId,
  status: "fail",
  reason: "",
  pipeline_a: {
    name: pipelineAName,
    id: "",
    url: "",
  },
  pipeline_b: {
    name: pipelineBName,
    id: "",
    url: "",
  },
  fake_provider: {
    url: "",
    base_url: "",
    pid: null,
  },
  wrote_env: false,
  evidence: {
    console_log: paths.consoleLog,
    automation_result_json: paths.automationResultJson,
    result_json: paths.resultJson,
  },
  evidence_collected: ["api_diagnostic", "filesystem"],
 };
 try {
   console.error(`[langbot-qa] configuring cross-pipeline QA fixtures: pipeline_a="${pipelineAName}", pipeline_b="${pipelineBName}"`);
  console.error("[langbot-qa] run these fake-provider setup/probe commands serially when they share LANGBOT_FAKE_PROVIDER_URL.");
  if (pipelineAName === pipelineBName) {
    throw new Error("LANGBOT_FAKE_PROVIDER_PIPELINE_A_NAME and LANGBOT_FAKE_PROVIDER_PIPELINE_B_NAME must be different.");
  }
  const setupA = await runPipelineSetup(pipelineAName, "A");
  const setupB = await runPipelineSetup(pipelineBName, "B");
  result.pipeline_a = {
    name: setupA.pipeline_name || pipelineAName,
    id: setupA.pipeline_id || "",
    url: setupA.pipeline_url || "",
  };
  result.pipeline_b = {
    name: setupB.pipeline_name || pipelineBName,
    id: setupB.pipeline_id || "",
    url: setupB.pipeline_url || "",
  };
  result.fake_provider = {
    url: setupB.fake_provider?.url || setupA.fake_provider?.url || "",
    base_url: setupB.fake_provider?.base_url || setupA.fake_provider?.base_url || "",
    pid: setupB.fake_provider?.pid ?? setupA.fake_provider?.pid ?? null,
  };
  if (!result.pipeline_a.url || !result.pipeline_b.url || !result.fake_provider.url) {
    throw new Error("Cross-pipeline fake provider setup did not return both pipeline URLs and provider URL.");
  }
  if (writeEnv) {
    await upsertEnvLocal(envLocalPath, {
      LANGBOT_FAKE_PROVIDER_URL: result.fake_provider.url,
      LANGBOT_FAKE_PROVIDER_BASE_URL: result.fake_provider.base_url,
      LANGBOT_FAKE_PROVIDER_PID: result.fake_provider.pid ? String(result.fake_provider.pid) : "",
      LANGBOT_FAKE_PROVIDER_PIPELINE_A_URL: result.pipeline_a.url,
      LANGBOT_FAKE_PROVIDER_PIPELINE_A_NAME: result.pipeline_a.name,
      LANGBOT_FAKE_PROVIDER_PIPELINE_B_URL: result.pipeline_b.url,
      LANGBOT_FAKE_PROVIDER_PIPELINE_B_NAME: result.pipeline_b.name,
    });
    result.wrote_env = true;
  }
  result.status = "pass";
  result.reason = "Fake provider cross-pipeline fixtures are configured.";
 } catch (error) {
  result.status = looksLikeEnvIssue(error) ? "env_issue" : "fail";
  result.reason = safeReason(error.message);
 } finally {
  await writeResult(paths, result);
  console.log(JSON.stringify(result, null, 2));
 }
 process.exit(result.status === "pass" ? 0 : result.status === "env_issue" ? 2 : 1);
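 // Run ensure-fake-provider-pipeline.mjs as a child process with fault
 // injection disabled, and resolve with the JSON result it prints on stdout.
 function runPipelineSetup(pipelineName, label) {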
  return new Promise((resolvePromise, rejectPromise) => {
    const child = spawn(process.execPath, ["scripts/e2e/ensure-fake-provider-pipeline.mjs"], {
      cwd: resolve("."),
      env: {
        ...env,
        LANGBOT_FAKE_PROVIDER_PIPELINE_NAME: pipelineName,
        LANGBOT_FAKE_PROVIDER_FIRST_TOKEN_DELAY_MS: env.LANGBOT_FAKE_PROVIDER_FIRST_TOKEN_DELAY_MS || "25",
        LANGBOT_FAKE_PROVIDER_CHUNK_DELAY_MS: env.LANGBOT_FAKE_PROVIDER_CHUNK_DELAY_MS || "10",
        LANGBOT_FAKE_PROVIDER_CHUNK_COUNT: env.LANGBOT_FAKE_PROVIDER_CHUNK_COUNT || "0",
        LANGBOT_FAKE_PROVIDER_FAIL_FIRST_N: "0",
        LANGBOT_FAKE_PROVIDER_FAIL_EVERY_N: "0",
        LANGBOT_FAKE_PROVIDER_FAULT_STATUS: env.LANGBOT_FAKE_PROVIDER_FAULT_STATUS || "500",
        LANGBOT_FAKE_PROVIDER_FAIL_AFTER_FIRST_CHUNK: "false",
        LANGBOT_FAKE_PROVIDER_DYNAMIC_RESPONSE: "true",
      },
      stdio: ["ignore", "pipe", "pipe"],
    });
    let stdout = "";
    let stderr = "";
    child.stdout.on("data", (chunk) => {
      const text = chunk.toString();
      stdout += text;
      appendLine(paths.consoleLog, `[setup ${label} stdout] ${text.trimEnd()}`).catch(() => {});
    });
    child.stderr.on("data", (chunk) => {
      const text = chunk.toString();
      stderr += text;
      appendLine(paths.consoleLog, `[setup ${label} stderr] ${text.trimEnd()}`).catch(() => {});
    });
    child.on("error", rejectPromise);
    child.on("close", (code) => {
      const parsed = parseJsonOutput(stdout);
      if (code !== 0 || parsed.status !== "pass") {
        rejectPromise(new Error(parsed.reason || stderr || `Fake provider pipeline setup ${label} exited with ${code}.`));
        return;
      }
      resolvePromise(parsed);
    });
  });
 }
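 // Parse the child's stdout as JSON. If extra text surrounds the payload, fall
 // back to the span between the first "{" and the last "}"; return {} on failure.
 function parseJsonOutput(text) {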
  const trimmed = String(text || "").trim();
  if (!trimmed) return {};
  try {
    return JSON.parse(trimmed);
  } catch {
    const start = trimmed.indexOf("{");
    const end = trimmed.lastIndexOf("}");
    if (start >= 0 && end > start) {
      try {
        return JSON.parse(trimmed.slice(start, end + 1));
      } catch {
        return {};
      }
    }
    return {};
  }
 }
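 // Update matching KEY=value lines in the env file in place, append keys that
 // are not present yet, and leave all other lines untouched.
 async function upsertEnvLocal(path, updates) {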
  await mkdir(dirname(path), { recursive: true });
  let text = "";
  try {
    text = await readFile(path, "utf8");
  } catch {
    text = "";
  }
  const lines = text.split(/\r?\n/);
  const seen = new Set();
  const next = lines.map((line) => {
    const trimmed = line.trim();
    const match = trimmed.match(/^([A-Z][A-Z0-9_]*)=/);
    if (!match || updates[match[1]] === undefined) return line;
    seen.add(match[1]);
    return `${match[1]}=${updates[match[1]]}`;
  });
  for (const [key, value] of Object.entries(updates)) {
    if (!seen.has(key)) next.push(`${key}=${value}`);
  }
  await writeFile(path, `${next.join("\n").replace(/\n+$/, "")}\n`, "utf8");
 }
 function looksLikeEnvIssue(error) {
  const message = String(error?.message || error || "");
  return /fetch failed|ECONNREFUSED|ENOTFOUND|LANGBOT_.*not configured|Could not read recovery_key|Backend did not respond/i.test(message);
 }
 function safeReason(value) {
  return redact(String(value || "")).slice(0, 1000);
 }
@@ -0,0 +1,635 @@
 #!/usr/bin/env node
 import { spawn } from "node:child_process";
 import { open, readFile, mkdir, writeFile } from "node:fs/promises";
 import { dirname, resolve } from "node:path";
 import { env } from "node:process";
 import {
  apiJson,
  ensureEvidence,
  evidencePaths,
  loadEnvFiles,
  redact,
  resetAndAuthLocalUser,
  writeResult,
 } from "./lib/langbot-e2e.mjs";
 const RUNNER_ID = "local-agent";
 const DEFAULT_LOCAL_PASSWORD = "LangBotE2ELocalPass!2026";
 const DEFAULT_PIPELINE_NAME = "LangBot QA Fake Provider Debug Chat";
 const DEFAULT_PROVIDER_NAME = "LangBot QA Fake OpenAI Provider";
 const QA_RESOURCE_DESCRIPTION = "Managed by LangBot skills QA automation for controlled fake-provider Debug Chat tests. Safe to delete when local QA fixtures are no longer needed.";
 const DEFAULT_MODEL_NAME = "gpt-4o-mini";
 const DEFAULT_REQUESTER = "openai-chat-completions";
 const caseId = "ensure-fake-provider-pipeline";
 await loadEnvFiles();
 const paths = evidencePaths(caseId);
 await ensureEvidence(paths);
 const writeEnv = process.argv.includes("--write-env");
 const frontendUrl = env.LANGBOT_FRONTEND_URL || "";
 const backendUrl = env.LANGBOT_BACKEND_URL || "";
 const envLocalPath = resolve("skills/.env.local");
 const repoRoot = resolve(env.LANGBOT_REPO || "..");
 const fakeStateDir = resolve(env.LANGBOT_FAKE_PROVIDER_STATE_DIR || resolve(repoRoot, ".qa/fake-provider"));
 const fakeStatePath = resolve(fakeStateDir, "state.json");
 const fakeStdoutPath = resolve(fakeStateDir, "fake-provider.stdout.log");
 const fakeStderrPath = resolve(fakeStateDir, "fake-provider.stderr.log");
 const pipelineName = env.LANGBOT_FAKE_PROVIDER_PIPELINE_NAME || DEFAULT_PIPELINE_NAME;
 const providerName = env.LANGBOT_FAKE_PROVIDER_NAME || DEFAULT_PROVIDER_NAME;
 const requester = env.LANGBOT_FAKE_PROVIDER_REQUESTER || DEFAULT_REQUESTER;
 const modelName = env.LANGBOT_FAKE_PROVIDER_MODEL_NAME || DEFAULT_MODEL_NAME;
 const result = {
  source: "automation",
  case_id: caseId,
  run_id: paths.runId,
  status: "fail",
  reason: "",
  frontend_url: frontendUrl,
  backend_url: backendUrl,
  fake_provider: {
    url: "",
    base_url: "",
    pid: null,
    reused: false,
    config: {},
    state_file: fakeStatePath,
    stdout_log: fakeStdoutPath,
    stderr_log: fakeStderrPath,
  },
  provider: {
    uuid: "",
    name: providerName,
    requester,
    created: false,
    updated: false,
  },
  model: {
    uuid: "",
    name: modelName,
    created: false,
    updated: false,
    test_status: "not_run",
    test_reason: "",
  },
  pipeline_id: "",
  pipeline_name: pipelineName,
  pipeline_url: "",
  created: false,
  updated: false,
  wrote_env: false,
  evidence: {
    console_log: paths.consoleLog,
    network_log: paths.networkLog,
    automation_result_json: paths.automationResultJson,
    result_json: paths.resultJson,
  },
  evidence_collected: ["api_diagnostic", "network", "filesystem"],
 };
 try {
   console.error(`[langbot-qa] configuring QA-owned fake-provider fixtures: provider="${providerName}", pipeline="${pipelineName}"`);
  console.error("[langbot-qa] this setup may create or update local QA provider/model/pipeline resources on the selected backend.");
  if (!backendUrl) {
    result.status = "env_issue";
    throw new Error("LANGBOT_BACKEND_URL is not configured.");
  }
  if (!frontendUrl) {
    result.status = "env_issue";
    throw new Error("LANGBOT_FRONTEND_URL is not configured.");
  }
  const fakeProvider = await ensureFakeProvider();
  const setupConfig = await configureFakeProvider(fakeProvider.url, healthyFakeProviderConfig(), true);
  result.fake_provider = {
    ...result.fake_provider,
    ...fakeProvider,
    config: setupConfig.config || healthyFakeProviderConfig(),
  };
  const user = env.LANGBOT_E2E_LOGIN_USER || "";
  const password = env.LANGBOT_E2E_LOGIN_PASSWORD || DEFAULT_LOCAL_PASSWORD;
  if (!user) {
    result.status = "env_issue";
    throw new Error("LANGBOT_E2E_LOGIN_USER is required so this setup can create/update the fake provider pipeline.");
  }
  const auth = await resetAndAuthLocalUser({ backendUrl, user, password });
  const wizard = await skipWizard({ backendUrl, token: auth.token });
  if (wizard.status !== "pass") {
    result.status = "fail";
    throw new Error(wizard.reason || "Failed to mark the local QA wizard as skipped.");
  }
  const provider = await ensureProvider({
    backendUrl,
    token: auth.token,
    name: providerName,
    requester,
    baseUrl: fakeProvider.base_url,
  });
  result.provider = provider;
  const model = await ensureModel({
    backendUrl,
    token: auth.token,
    providerUuid: provider.uuid,
    name: modelName,
  });
  result.model = model;
  const pipeline = await ensurePipeline({
    backendUrl,
    token: auth.token,
    name: pipelineName,
    modelUuid: model.uuid,
  });
  Object.assign(result, pipeline);
  result.pipeline_url = `${frontendUrl.replace(/\/$/, "")}/home/pipelines?id=${encodeURIComponent(pipeline.pipeline_id)}`;
  const runConfig = await configureFakeProvider(fakeProvider.url, targetFakeProviderConfig(), true);
  result.fake_provider.config = runConfig.config || targetFakeProviderConfig();
  if (writeEnv) {
    await upsertEnvLocal(envLocalPath, {
      LANGBOT_E2E_LOGIN_USER: user,
      LANGBOT_FAKE_PROVIDER_URL: fakeProvider.url,
      LANGBOT_FAKE_PROVIDER_BASE_URL: fakeProvider.base_url,
      LANGBOT_FAKE_PROVIDER_PID: fakeProvider.pid ? String(fakeProvider.pid) : "",
      LANGBOT_FAKE_PROVIDER_PROVIDER_UUID: provider.uuid,
      LANGBOT_FAKE_PROVIDER_MODEL_UUID: model.uuid,
      LANGBOT_FAKE_PROVIDER_PIPELINE_URL: result.pipeline_url,
      LANGBOT_FAKE_PROVIDER_PIPELINE_NAME: pipelineName,
    });
    result.wrote_env = true;
  }
  result.status = "pass";
  result.reason = `Fake provider pipeline is configured with ${requester}/${modelName}.`;
 } catch (error) {
  result.status = result.status === "env_issue" ? "env_issue" : "fail";
  result.reason = result.reason || safeReason(error.message);
 } finally {
  await writeResult(paths, result);
  console.log(JSON.stringify(result, null, 2));
 }
 process.exit(result.status === "pass" ? 0 : result.status === "env_issue" ? 2 : 1);
 async function ensureFakeProvider() {
  const envUrl = normalizeProviderRootUrl(env.LANGBOT_FAKE_PROVIDER_URL || "");
  if (envUrl && await fakeProviderHealthy(envUrl) && await fakeProviderConfigurable(envUrl)) {
    return {
      url: envUrl,
      base_url: `${envUrl}/v1`,
      pid: null,
      reused: true,
    };
  }
  const state = await readState(fakeStatePath);
  const stateUrl = normalizeProviderRootUrl(state.url || "");
  if (stateUrl && await fakeProviderHealthy(stateUrl)) {
    if (await fakeProviderConfigurable(stateUrl)) {
      return {
        url: stateUrl,
        base_url: state.base_url || `${stateUrl}/v1`,
        pid: Number.isInteger(state.pid) ? state.pid : null,
        reused: true,
      };
    }
    if (Number.isInteger(state.pid)) await stopProcess(state.pid);
  }
  await mkdir(fakeStateDir, { recursive: true });
  await writeFile(fakeStatePath, `${JSON.stringify({ status: "starting", started_at: new Date().toISOString() }, null, 2)}\n`, "utf8");
  const stdout = await open(fakeStdoutPath, "a");
  const stderr = await open(fakeStderrPath, "a");
  const scriptPath = resolve("scripts/e2e/fake-openai-provider.mjs");
  const host = env.LANGBOT_FAKE_PROVIDER_HOST || "127.0.0.1";
  const port = env.LANGBOT_FAKE_PROVIDER_PORT || "0";
  const child = spawn(process.execPath, [
    scriptPath,
    `--host=${host}`,
    `--port=${port}`,
    `--state-file=${fakeStatePath}`,
  ], {
    cwd: resolve("."),
    detached: true,
    env: {
      ...env,
      LANGBOT_FAKE_PROVIDER_MODEL_NAME: modelName,
    },
    stdio: ["ignore", stdout.fd, stderr.fd],
  });
  child.unref();
  await stdout.close();
  await stderr.close();
  const started = await waitForFakeProviderState(fakeStatePath, child.pid, 10_000);
  if (!started.url || !await fakeProviderHealthy(started.url) || !await fakeProviderConfigurable(started.url)) {
    throw new Error(`Fake provider did not become healthy. See ${fakeStderrPath}`);
  }
  return {
    url: started.url,
    base_url: started.base_url || `${started.url}/v1`,
    pid: child.pid ?? started.pid ?? null,
    reused: false,
  };
 }
 async function configureFakeProvider(rootUrl, config, resetRequestCount) {
  const response = await fetch(`${normalizeProviderRootUrl(rootUrl)}/__qa/config`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({
      config,
      reset_request_count: resetRequestCount,
    }),
    signal: AbortSignal.timeout(3000),
  });
  const json = await response.json().catch(() => ({}));
  if (!response.ok || json.ok !== true) {
    throw new Error(`Fake provider config failed with HTTP ${response.status}.`);
  }
  return json;
 }
 async function fakeProviderHealthy(rootUrl) {
  try {
    const response = await fetch(`${rootUrl.replace(/\/$/, "")}/healthz`, {
      signal: AbortSignal.timeout(2000),
    });
    if (!response.ok) return false;
    const json = await response.json().catch(() => ({}));
    return json.ok === true;
  } catch {
    return false;
  }
 }
 async function fakeProviderConfigurable(rootUrl) {
  try {
    const response = await fetch(`${rootUrl.replace(/\/$/, "")}/__qa/config`, {
      signal: AbortSignal.timeout(2000),
    });
    if (!response.ok) return false;
    const json = await response.json().catch(() => ({}));
    return json.ok === true && json.config && typeof json.config === "object";
  } catch {
    return false;
  }
 }
 async function stopProcess(pid) {
  try {
    process.kill(pid, "SIGTERM");
  } catch {
    return;
  }
  await sleep(500);
 }
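 // Poll the state file until it reports a URL (and the expected pid, when one
 // is given) or the timeout elapses; on timeout, return the last state read.
 async function waitForFakeProviderState(path, expectedPid, timeoutMs) {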
  const startedAt = Date.now();
  let lastState = {};
  while (Date.now() - startedAt < timeoutMs) {
    const state = await readState(path);
    if (state.url && (!expectedPid || state.pid === expectedPid)) return state;
    lastState = state;
    await sleep(150);
  }
  return lastState;
 }
 async function readState(path) {
  try {
    return JSON.parse(await readFile(path, "utf8"));
  } catch {
    return {};
  }
 }
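 // Strip one trailing slash and a trailing "/v1" so callers can pass either the
 // provider root URL or its OpenAI-compatible base URL.
 function normalizeProviderRootUrl(value) {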
  const trimmed = String(value || "").trim().replace(/\/$/, "");
  return trimmed.endsWith("/v1") ? trimmed.slice(0, -3) : trimmed;
 }
 function healthyFakeProviderConfig() {
  return {
    response_text: "OK",
    first_token_delay_ms: 25,
    chunk_delay_ms: 10,
    chunk_count: 0,
    fault_status: 500,
    fail_first_n: 0,
    fail_every_n: 0,
    fail_after_first_chunk: false,
    dynamic_response: true,
  };
 }
 function targetFakeProviderConfig() {
  return {
    response_text: env.LANGBOT_FAKE_PROVIDER_RESPONSE_TEXT || "OK",
    first_token_delay_ms: nonNegativeInteger(env.LANGBOT_FAKE_PROVIDER_FIRST_TOKEN_DELAY_MS, 25),
    chunk_delay_ms: nonNegativeInteger(env.LANGBOT_FAKE_PROVIDER_CHUNK_DELAY_MS, 10),
    chunk_count: nonNegativeInteger(env.LANGBOT_FAKE_PROVIDER_CHUNK_COUNT, 0),
    fault_status: httpFaultStatus(env.LANGBOT_FAKE_PROVIDER_FAULT_STATUS, 500),
    fail_first_n: nonNegativeInteger(env.LANGBOT_FAKE_PROVIDER_FAIL_FIRST_N, 0),
    fail_every_n: nonNegativeInteger(env.LANGBOT_FAKE_PROVIDER_FAIL_EVERY_N, 0),
    fail_after_first_chunk: envBool(env.LANGBOT_FAKE_PROVIDER_FAIL_AFTER_FIRST_CHUNK, false),
    dynamic_response: envBool(env.LANGBOT_FAKE_PROVIDER_DYNAMIC_RESPONSE, true),
  };
 }
 async function skipWizard({ backendUrl, token }) {
  const response = await apiJson(backendUrl, "/api/v1/system/wizard/completed", {
    method: "POST",
    token,
    body: { status: "skipped" },
  });
  const ok = response.status < 400 && response.json.code === 0;
  return {
    status: ok ? "pass" : "fail",
    http_status: response.status,
    code: response.json.code ?? null,
    reason: ok ? "Wizard marked skipped for local QA." : response.json.msg || "Wizard status update failed.",
  };
 }
 async function ensureProvider({ backendUrl, token, name, requester, baseUrl }) {
  const list = await apiJson(backendUrl, "/api/v1/provider/providers", { token });
  if (isApiFailure(list)) {
    throw new Error(list.json.msg || "Failed to list providers.");
  }
  const providers = list.json.data?.providers || [];
  const existing = providers.find((provider) => (
    provider.name === name
      || (provider.requester === requester && String(provider.base_url || "").replace(/\/$/, "") === baseUrl.replace(/\/$/, ""))
  ));
  const body = {
    name,
    requester,
    base_url: baseUrl,
    api_keys: [env.LANGBOT_FAKE_PROVIDER_API_KEY || "langbot-fake-provider-key"],
  };
  if (existing?.uuid) {
    const update = await apiJson(backendUrl, `/api/v1/provider/providers/${encodeURIComponent(existing.uuid)}`, {
      method: "PUT",
      token,
      body,
    });
    if (isApiFailure(update)) {
      throw new Error(update.json.msg || "Failed to update fake provider.");
    }
    return {
      uuid: existing.uuid,
      name,
      requester,
      created: false,
      updated: true,
    };
  }
  const create = await apiJson(backendUrl, "/api/v1/provider/providers", {
    method: "POST",
    token,
    body,
  });
  const uuid = create.json.data?.uuid || "";
  if (isApiFailure(create) || !uuid) {
    throw new Error(create.json.msg || "Failed to create fake provider.");
  }
  return {
    uuid,
    name,
    requester,
    created: true,
    updated: false,
  };
 }
 async function ensureModel({ backendUrl, token, providerUuid, name }) {
  const list = await apiJson(backendUrl, `/api/v1/provider/models/llm?provider_uuid=${encodeURIComponent(providerUuid)}`, { token });
  if (isApiFailure(list)) {
    throw new Error(list.json.msg || "Failed to list fake provider models.");
  }
  const models = list.json.data?.models || [];
  const existing = models.find((model) => model.name === name);
  const body = {
    name,
    provider_uuid: providerUuid,
    abilities: [],
    context_length: positiveInteger(env.LANGBOT_FAKE_PROVIDER_CONTEXT_LENGTH, 8192),
    extra_args: {},
    prefered_ranking: 0,
  };
  let modelUuid = existing?.uuid || "";
  let created = false;
  let updated = false;
  if (modelUuid) {
    const update = await apiJson(backendUrl, `/api/v1/provider/models/llm/${encodeURIComponent(modelUuid)}`, {
      method: "PUT",
      token,
      body,
    });
    if (isApiFailure(update)) {
      throw new Error(update.json.msg || "Failed to update fake provider model.");
    }
    updated = true;
  } else {
    const create = await apiJson(backendUrl, "/api/v1/provider/models/llm", {
      method: "POST",
      token,
      body,
    });
    modelUuid = create.json.data?.uuid || "";
    if (isApiFailure(create) || !modelUuid) {
      throw new Error(create.json.msg || "Failed to create fake provider model.");
    }
    created = true;
  }
  const test = await apiJson(backendUrl, `/api/v1/provider/models/llm/${encodeURIComponent(modelUuid)}/test`, {
    method: "POST",
    token,
    body: { extra_args: {} },
  });
  if (isApiFailure(test)) {
    throw new Error(safeReason(test.json.msg || test.json.message || "Fake provider model test failed."));
  }
  return {
    uuid: modelUuid,
    name,
    created,
    updated,
    test_status: "pass",
    test_reason: "",
  };
 }
 async function ensurePipeline({ backendUrl, token, name, modelUuid }) {
  const list = await apiJson(backendUrl, "/api/v1/pipelines", { token });
  if (isApiFailure(list)) {
    throw new Error(list.json.msg || "Failed to list pipelines.");
  }
  const pipelines = list.json.data?.pipelines || [];
  let pipeline = pipelines.find((item) => item.name === name) || null;
  let created = false;
  if (!pipeline) {
    const create = await apiJson(backendUrl, "/api/v1/pipelines", {
      method: "POST",
      token,
      body: {
        name,
        description: QA_RESOURCE_DESCRIPTION,
        emoji: "QA",
      },
    });
    const pipelineId = create.json.data?.uuid || "";
    if (isApiFailure(create) || !pipelineId) {
      throw new Error(create.json.msg || "Failed to create fake provider pipeline.");
    }
    created = true;
    pipeline = { uuid: pipelineId };
  }
  const loaded = await apiJson(backendUrl, `/api/v1/pipelines/${encodeURIComponent(pipeline.uuid)}`, { token });
  pipeline = loaded.json.data?.pipeline || null;
  if (isApiFailure(loaded) || !pipeline?.uuid) {
    throw new Error(loaded.json.msg || "Failed to load fake provider pipeline.");
  }
  const config = pipeline.config && typeof pipeline.config === "object" ? pipeline.config : {};
  const ai = config.ai && typeof config.ai === "object" ? config.ai : {};
  const existingLocalAgentConfig = ai["local-agent"] && typeof ai["local-agent"] === "object"
    ? ai["local-agent"]
    : {};
  const localAgentConfig = {
    timeout: 60,
    prompt: [{ role: "system", content: "You are a deterministic QA assistant. Reply exactly as instructed." }],
    "remove-think": false,
    "knowledge-bases": [],
    "box-session-id-template": "{launcher_type}_{launcher_id}",
    "retrieval-top-k": 5,
    "rerank-model": "",
    "rerank-top-k": 5,
    "max-tool-iterations": 20,
    "tool-execution-mode": "parallel",
    "max-tool-result-chars": 20000,
    "context-history-fetch-limit": 20,
    "context-window-tokens": 8192,
    "context-reserve-tokens": 1024,
    "context-keep-recent-tokens": 2048,
    "context-summary-tokens": 1024,
    ...existingLocalAgentConfig,
    // Current backend truncation still reads this field directly.
    "max-round": positiveInteger(existingLocalAgentConfig["max-round"], 10),
    model: {
      primary: modelUuid,
      fallbacks: [],
    },
  };
  const updatedConfig = {
    ...config,
    ai: {
      ...ai,
      runner: {
        ...(ai.runner && typeof ai.runner === "object" ? ai.runner : {}),
        id: RUNNER_ID,
        runner: RUNNER_ID,
        "expire-time": 0,
      },
      "local-agent": localAgentConfig,
    },
  };
  const update = await apiJson(backendUrl, `/api/v1/pipelines/${encodeURIComponent(pipeline.uuid)}`, {
    method: "PUT",
    token,
    body: {
      name,
      description: QA_RESOURCE_DESCRIPTION,
      emoji: "QA",
      config: updatedConfig,
    },
  });
  if (isApiFailure(update)) {
    throw new Error(update.json.msg || "Failed to update fake provider pipeline.");
  }
  return {
    pipeline_id: pipeline.uuid,
    pipeline_name: name,
    created,
    updated: true,
  };
 }
 function isApiFailure(response) {
  return response.status >= 400 || (response.json.code !== undefined && response.json.code !== 0);
 }
 function positiveInteger(value, fallback) {
  const parsed = Number(value);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
 }
 function nonNegativeInteger(value, fallback) {
  const parsed = Number(value);
  return Number.isInteger(parsed) && parsed >= 0 ? parsed : fallback;
 }
 function httpFaultStatus(value, fallback) {
  const parsed = Number(value);
  return Number.isInteger(parsed) && parsed >= 400 && parsed <= 599 ? parsed : fallback;
 }
 function envBool(value, fallback) {
  if (value === undefined || value === "") return fallback;
  if (/^(1|true|yes|on)$/i.test(String(value))) return true;
  if (/^(0|false|no|off)$/i.test(String(value))) return false;
  return fallback;
 }
 function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
 }
 function safeReason(value) {
  return redact(String(value || "")).slice(0, 1000);
 }
 async function upsertEnvLocal(path, updates) {
  await mkdir(dirname(path), { recursive: true });
  let text = "";
  try {
    text = await readFile(path, "utf8");
  } catch {
    text = "";
  }
  const lines = text.split(/\r?\n/);
  const seen = new Set();
  const next = lines.map((line) => {
    const trimmed = line.trim();
    const equals = trimmed.indexOf("=");
    if (equals <= 0 || trimmed.startsWith("#")) return line;
    const key = trimmed.slice(0, equals).trim();
    if (!(key in updates)) return line;
    seen.add(key);
    return `${key}=${updates[key]}`;
  });
  for (const [key, value] of Object.entries(updates)) {
    if (!seen.has(key)) next.push(`${key}=${value}`);
  }
  await writeFile(path, `${next.filter((line, index) => line !== "" || index < next.length - 1).join("\n")}\n`, "utf8");
 }
@@ -10,6 +10,7 @@ import {
  ensureEvidence,
  evidencePaths,
  loadEnvFiles,
  redact,
  resetAndAuthLocalUser,
  safeScreenshot,
  setBrowserToken,
@@ -17,9 +18,12 @@ import {
  writeResult,
 } from "./lib/langbot-e2e.mjs";
-const RUNNER_ID = "plugin:langbot/local-agent/default";
+const RUNNER_ID = "local-agent";
 const SPACE_PROVIDER_UUID = "00000000-0000-0000-0000-000000000000";
 const DEFAULT_PIPELINE_NAME = "Agent QA Local Agent Debug Chat";
 const DEFAULT_LOCAL_PASSWORD = "LangBotE2ELocalPass!2026";
 const DEFAULT_MODEL_TEST_LIMIT = 8;
 const DEFAULT_MODEL_FALLBACK_COUNT = 3;
 const caseId = "ensure-local-agent-pipeline";
 await loadEnvFiles();
@@ -45,11 +49,18 @@ const result = {
  pipeline_url: "",
  runner_id: RUNNER_ID,
  selected_model_id: "",
  selected_model_name: "",
  fallback_model_ids: [],
  model_count: 0,
  space_model_count: 0,
  scanned_space_model_count: 0,
  tested_model_count: 0,
  model_tests: [],
  created: false,
  updated: false,
  wrote_env: false,
  auth: null,
  wizard: null,
  browser_token_check: null,
  page_signal: "",
  evidence: {
@@ -71,6 +82,7 @@ try {
  const user = env.LANGBOT_E2E_LOGIN_USER || "";
  const password = env.LANGBOT_E2E_LOGIN_PASSWORD || DEFAULT_LOCAL_PASSWORD;
  if (!user) {
    result.status = "env_issue";
    throw new Error("LANGBOT_E2E_LOGIN_USER is required so this setup can create/update the pipeline via backend API.");
  }
@@ -81,6 +93,13 @@ try {
    backend_token_check: auth.check,
  };
  const wizard = await skipWizard({ backendUrl, token: auth.token });
  result.wizard = wizard;
  if (wizard.status !== "pass") {
    result.status = "fail";
    throw new Error(wizard.reason || "Failed to mark the local QA wizard as skipped.");
  }
  const prepared = await ensureLocalAgentPipeline({
    backendUrl,
    token: auth.token,
@@ -99,6 +118,10 @@ try {
      LANGBOT_PIPELINE_NAME: result.pipeline_name || pipelineName,
      LANGBOT_LOCAL_AGENT_PIPELINE_URL: result.pipeline_url,
      LANGBOT_LOCAL_AGENT_PIPELINE_NAME: result.pipeline_name || pipelineName,
      ...(result.selected_model_id ? {
        LANGBOT_LOCAL_AGENT_MODEL_UUID: result.selected_model_id,
        LANGBOT_E2E_MODEL_UUID: result.selected_model_id,
      } : {}),
    });
    result.wrote_env = true;
  }
@@ -127,6 +150,21 @@ try {
 process.exit(result.status === "pass" ? 0 : result.status === "env_issue" ? 2 : 1);
 async function skipWizard({ backendUrl, token }) {
  const response = await apiJson(backendUrl, "/api/v1/system/wizard/completed", {
    method: "POST",
    token,
    body: { status: "skipped" },
  });
  const ok = response.status < 400 && response.json.code === 0;
  return {
    status: ok ? "pass" : "fail",
    http_status: response.status,
    code: response.json.code ?? null,
    reason: ok ? "Wizard marked skipped for local QA." : response.json.msg || "Wizard status update failed.",
  };
 }
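`skipWizard` applies a stricter success rule than `isApiFailure`: it requires an explicit `code: 0`, while `isApiFailure` treats a missing `code` as acceptable. A small sketch of the difference, using made-up response objects; the name `isExplicitSuccess` is only an illustrative label for the inline check in `skipWizard`.

```javascript
// Copied from the script so the sketch is self-contained.
function isApiFailure(response) {
  return response.status >= 400 || (response.json.code !== undefined && response.json.code !== 0);
}

// Illustrative name for the inline rule in skipWizard: an explicit `code: 0` is required.
function isExplicitSuccess(response) {
  return response.status < 400 && response.json.code === 0;
}

const samples = [
  { status: 200, json: { code: 0 } },  // normal success
  { status: 200, json: { code: -1 } }, // HTTP OK, but the envelope reports an error
  { status: 200, json: {} },           // no envelope code at all
  { status: 500, json: { code: 0 } },  // transport-level failure
];
for (const sample of samples) {
  console.log(JSON.stringify(sample), {
    failure: isApiFailure(sample),
    explicitSuccess: isExplicitSuccess(sample),
  });
}
```

The third sample is the only one where the two rules disagree: it is not a failure for `isApiFailure`, yet it would not count as a pass for the wizard step.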
 async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runnerId }) {
  const [pipelineList, modelList] = await Promise.all([
    apiJson(backendUrl, "/api/v1/pipelines", { token }),
@@ -149,7 +187,19 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
  }
  const models = modelList.json.data?.models || [];
-  const selectedModel = models.find((model) => model.uuid) || null;
+  const skippedModelIds = new Set(
    String(env.LANGBOT_E2E_SKIP_MODEL_UUIDS || "")
      .split(",")
      .map((item) => item.trim())
      .filter(Boolean),
  );
  const skippedModelNames = new Set(
    String(env.LANGBOT_E2E_SKIP_MODEL_NAMES || "")
      .split(",")
      .map((item) => item.trim())
      .filter(Boolean),
  );
  const spaceModels = models.filter((model) => (
    isSpaceModel(model) && !skippedModelIds.has(model.uuid) && !skippedModelNames.has(model.name)
  ));
  const pipelines = pipelineList.json.data?.pipelines || [];
  let pipeline = pipelines.find((item) => item.name === pipelineName) || null;
  let created = false;
@@ -170,6 +220,7 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
        reason: createdResponse.json.msg || "Failed to create pipeline.",
        create_status: createdResponse.status,
        model_count: models.length,
        space_model_count: spaceModels.length,
      };
    }
    const pipelineId = createdResponse.json.data?.uuid || "";
@@ -183,6 +234,7 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
      status: "fail",
      reason: "Pipeline was not created or resolved.",
      model_count: models.length,
      space_model_count: spaceModels.length,
    };
  }
@@ -194,27 +246,37 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
      get_status: loaded.status,
      pipeline_id: pipeline.uuid,
      model_count: models.length,
      space_model_count: spaceModels.length,
    };
  }
  pipeline = loaded.json.data.pipeline;
  const config = pipeline.config && typeof pipeline.config === "object" ? pipeline.config : {};
  const ai = config.ai && typeof config.ai === "object" ? config.ai : {};
-  const runnerConfig = ai.runner_config && typeof ai.runner_config === "object" ? ai.runner_config : {};
-  const rawExistingLocalAgentConfig = runnerConfig[runnerId] && typeof runnerConfig[runnerId] === "object"
-    ? runnerConfig[runnerId]
+  const rawExistingLocalAgentConfig = ai["local-agent"] && typeof ai["local-agent"] === "object"
+    ? ai["local-agent"]
    : {};
  const existingLocalAgentConfig = rawExistingLocalAgentConfig;
  const existingModel = existingLocalAgentConfig.model && typeof existingLocalAgentConfig.model === "object"
    ? existingLocalAgentConfig.model
    : {};
  const requestedModelId = env.LANGBOT_LOCAL_AGENT_MODEL_UUID || env.LANGBOT_E2E_MODEL_UUID || "";
-  const selectedModelId = requestedModelId || existingModel.primary || selectedModel?.uuid || "";
+  const selected = await selectWorkingSpaceModel({
    backendUrl,
    token,
    models,
    skippedModelIds,
    skippedModelNames,
    requestedModelId,
    existingModelId: existingModel.primary || "",
  });
  const selectedModelId = selected.selected_model_id || "";
  const localAgentConfig = {
    timeout: 300,
    prompt: [{ role: "system", content: "You are a helpful assistant." }],
    "remove-think": false,
    "knowledge-bases": [],
    "box-session-id-template": "{launcher_type}_{launcher_id}",
    "retrieval-top-k": 5,
    "rerank-model": "",
    "rerank-top-k": 5,
@@ -227,9 +289,11 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
    "context-keep-recent-tokens": 20000,
    "context-summary-tokens": 8000,
    ...existingLocalAgentConfig,
    // Current backend truncation still reads this field directly.
    "max-round": positiveInteger(existingLocalAgentConfig["max-round"], 10),
    model: {
      primary: selectedModelId,
-      fallbacks: requestedModelId ? [] : Array.isArray(existingModel.fallbacks) ? existingModel.fallbacks : [],
+      fallbacks: selected.fallback_model_ids || [],
    },
  };
  const updatedConfig = {
@@ -239,12 +303,10 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
      runner: {
        ...(ai.runner && typeof ai.runner === "object" ? ai.runner : {}),
        id: runnerId,
        runner: runnerId,
        "expire-time": 0,
      },
-      runner_config: {
-        ...runnerConfig,
-        [runnerId]: localAgentConfig,
-      },
+      "local-agent": localAgentConfig,
    },
  };
@@ -265,19 +327,31 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
      update_status: updateResponse.status,
      pipeline_id: pipeline.uuid,
      model_count: models.length,
      space_model_count: spaceModels.length,
      scanned_space_model_count: selected.scanned_space_model_count,
      tested_model_count: selected.tested_model_count,
      model_tests: selected.model_tests,
      selected_model_id: selectedModelId,
      selected_model_name: selected.selected_model_name,
      fallback_model_ids: selected.fallback_model_ids,
    };
  }
  return {
    status: selectedModelId ? "pass" : "env_issue",
    reason: selectedModelId
-      ? "Local-agent pipeline is configured for Debug Chat."
+      ? `Local-agent pipeline is configured for Debug Chat with Space model ${selected.selected_model_name || selectedModelId} and ${selected.fallback_model_ids.length} fallback(s).`
-      : "Pipeline was created but no LLM model is configured in this LangBot instance.",
+      : selected.reason || "No working Space LLM model is configured in this LangBot instance.",
    pipeline_id: pipeline.uuid,
-    pipeline_name: pipeline.name,
+    pipeline_name: pipelineName,
    model_count: models.length,
    space_model_count: spaceModels.length,
    scanned_space_model_count: selected.scanned_space_model_count,
    tested_model_count: selected.tested_model_count,
    model_tests: selected.model_tests,
    selected_model_id: selectedModelId,
    selected_model_name: selected.selected_model_name,
    fallback_model_ids: selected.fallback_model_ids,
    created,
    updated: true,
  };
@@ -287,6 +361,229 @@ function isApiFailure(response) {
  return response.status >= 400 || (response.json.code !== undefined && response.json.code !== 0);
 }
 function isSpaceModel(model) {
  const provider = model?.provider && typeof model.provider === "object" ? model.provider : {};
  return model?.provider_uuid === SPACE_PROVIDER_UUID
    || provider.uuid === SPACE_PROVIDER_UUID
    || provider.requester === "space-chat-completions"
    || provider.name === "LangBot Models";
 }
 async function selectWorkingSpaceModel({
  backendUrl,
  token,
  models,
  skippedModelIds,
  skippedModelNames,
  requestedModelId,
  existingModelId,
 }) {
  const modelTests = [];
  const testLimit = positiveInteger(env.LANGBOT_E2E_MODEL_TEST_LIMIT, DEFAULT_MODEL_TEST_LIMIT);
  const fallbackCount = positiveInteger(env.LANGBOT_E2E_MODEL_FALLBACK_COUNT, DEFAULT_MODEL_FALLBACK_COUNT);
  const workingModels = [];
  const spaceModels = rankModels(models.filter((model) => (
    model.uuid
      && isSpaceModel(model)
      && !skippedModelIds.has(model.uuid)
      && !skippedModelNames.has(model.name)
  )));
  const requestedModel = requestedModelId
    ? spaceModels.find((model) => model.uuid === requestedModelId) || null
    : null;
  const existingModel = existingModelId
    ? spaceModels.find((model) => model.uuid === existingModelId) || null
    : null;
  const candidates = uniqueCandidates([
    ...(requestedModel ? [existingCandidate(requestedModel, "requested")] : []),
    ...(existingModel ? [existingCandidate(existingModel, "existing-pipeline")] : []),
    ...spaceModels.map((model) => existingCandidate(model, "configured-space")),
  ]);
  let scanResult = { status: "skipped", models: [], reason: "" };
  if (env.LANGBOT_E2E_SCAN_SPACE_MODELS !== "false") {
    scanResult = await scanSpaceModels({ backendUrl, token });
    if (scanResult.status === "pass") {
      const knownNames = new Set(spaceModels.map((model) => model.name));
      candidates.push(...scanResult.models
        .filter((model) => model.name && !knownNames.has(model.name) && !skippedModelNames.has(model.name))
        .map((model) => scannedCandidate(model)));
    }
  }
  const unique = uniqueCandidates(candidates);
  for (const candidate of unique.slice(0, testLimit)) {
    const test = await ensureAndTestModel({ backendUrl, token, candidate });
    modelTests.push(test);
    if (test.status === "pass" && test.model_uuid) {
      workingModels.push(test);
      if (workingModels.length >= fallbackCount + 1) break;
    }
  }
  if (workingModels.length > 0) {
    const [primary, ...fallbacks] = workingModels;
    return {
      status: "pass",
      reason: "",
      selected_model_id: primary.model_uuid,
      selected_model_name: primary.model_name,
      fallback_model_ids: fallbacks.map((model) => model.model_uuid),
      scanned_space_model_count: scanResult.models.length,
      tested_model_count: modelTests.length,
      model_tests: modelTests,
    };
  }
  const baseReason = unique.length === 0
    ? scanResult.reason || "No Space LLM model candidates are available."
    : `No working Space LLM model found after testing ${modelTests.length} candidate(s).`;
  return {
    status: "env_issue",
    reason: requestedModelId && !requestedModel
      ? `Requested Space LLM model ${requestedModelId} is missing or skipped; ${baseReason}`
      : baseReason,
    selected_model_id: "",
    selected_model_name: "",
    fallback_model_ids: [],
    scanned_space_model_count: scanResult.models.length,
    tested_model_count: modelTests.length,
    model_tests: modelTests,
  };
 }
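The selection loop in `selectWorkingSpaceModel` boils down to: test candidates in order, stop at the test limit, and stop early once one primary plus the requested number of fallbacks have passed. A synchronous sketch of that rule, assuming a hypothetical `isWorking` predicate in place of the real `ensureAndTestModel` call:

```javascript
// Illustrative only: `isWorking` stands in for the real (async) model test.
function pickPrimaryAndFallbacks(candidates, isWorking, { testLimit = 8, fallbackCount = 3 } = {}) {
  const working = [];
  let tested = 0;
  for (const candidate of candidates.slice(0, testLimit)) {
    tested += 1;
    if (!isWorking(candidate)) continue;
    working.push(candidate);
    // Stop as soon as one primary plus the requested fallbacks are found.
    if (working.length >= fallbackCount + 1) break;
  }
  const [primary = null, ...fallbacks] = working;
  return { primary, fallbacks, tested };
}

const picked = pickPrimaryAndFallbacks(
  ["a", "b", "c", "d", "e"],
  (name) => name !== "b",
  { testLimit: 4, fallbackCount: 1 },
);
console.log(picked);
```

Here candidate `d` is never tested because `a` and `c` already satisfy one primary plus one fallback, and `e` lies beyond the test limit regardless.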
 async function scanSpaceModels({ backendUrl, token }) {
  const response = await apiJson(
    backendUrl,
    `/api/v1/provider/providers/${encodeURIComponent(SPACE_PROVIDER_UUID)}/scan-models?type=llm`,
    { token },
  );
  if (isApiFailure(response)) {
    return {
      status: "env_issue",
      models: [],
      reason: safeReason(response.json.msg || response.json.message || "Failed to scan Space LLM models."),
    };
  }
  return {
    status: "pass",
    models: response.json.data?.models || [],
    reason: "",
  };
 }
 async function ensureAndTestModel({ backendUrl, token, candidate }) {
  let modelUuid = candidate.uuid || "";
  let created = false;
  if (!modelUuid) {
    const create = await apiJson(backendUrl, "/api/v1/provider/models/llm", {
      method: "POST",
      token,
      body: {
        name: candidate.name,
        provider_uuid: SPACE_PROVIDER_UUID,
        abilities: candidate.abilities || [],
        context_length: candidate.context_length ?? null,
        extra_args: {},
        prefered_ranking: positiveInteger(candidate.prefered_ranking, 0),
      },
    });
    modelUuid = create.json.data?.uuid || "";
    if (isApiFailure(create) || !modelUuid) {
      return modelTestResult(candidate, {
        status: "fail",
        reason: safeReason(create.json.msg || "Failed to create scanned Space model."),
        http_status: create.status,
      });
    }
    created = true;
  }
  const test = await apiJson(backendUrl, `/api/v1/provider/models/llm/${encodeURIComponent(modelUuid)}/test`, {
    method: "POST",
    token,
    body: { extra_args: {} },
  });
  const passed = !isApiFailure(test);
  if (!passed && created) {
    await apiJson(backendUrl, `/api/v1/provider/models/llm/${encodeURIComponent(modelUuid)}`, {
      method: "DELETE",
      token,
    }).catch(() => {});
  }
  return modelTestResult(candidate, {
    status: passed ? "pass" : "fail",
    reason: passed ? "" : safeReason(test.json.msg || test.json.message || "Space model test failed."),
    http_status: test.status,
    model_uuid: modelUuid,
    created,
  });
 }
 function modelTestResult(candidate, details) {
  return {
    source: candidate.source,
    model_uuid: details.model_uuid || candidate.uuid || "",
    model_name: candidate.name,
    status: details.status,
    reason: details.reason || "",
    http_status: details.http_status ?? null,
    created: Boolean(details.created),
  };
 }
 function existingCandidate(model, source) {
  return {
    source,
    uuid: model.uuid,
    name: model.name,
    abilities: model.abilities || [],
    context_length: model.context_length,
    prefered_ranking: model.prefered_ranking,
  };
 }
 function scannedCandidate(model) {
  return {
    source: "scanned-space",
    uuid: "",
    name: model.name || model.id,
    abilities: model.abilities || [],
    context_length: model.context_length,
    prefered_ranking: model.prefered_ranking,
  };
 }
 function uniqueCandidates(candidates) {
  const seen = new Set();
  const result = [];
  for (const candidate of candidates) {
    const key = candidate.uuid ? `uuid:${candidate.uuid}` : `name:${candidate.name}`;
    if (!candidate.name || seen.has(key)) continue;
    seen.add(key);
    result.push(candidate);
  }
  return result;
 }
 function rankModels(models) {
  return [...models].sort((left, right) => {
    const leftRank = Number.isFinite(Number(left.prefered_ranking)) ? Number(left.prefered_ranking) : 9999;
    const rightRank = Number.isFinite(Number(right.prefered_ranking)) ? Number(right.prefered_ranking) : 9999;
    if (leftRank !== rightRank) return leftRank - rightRank;
    return String(left.name || "").localeCompare(String(right.name || ""));
  });
 }
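How `rankModels` and `uniqueCandidates` combine to order the candidate list may be easier to see on sample data. The sketch below re-implements both rules in compact form under the hypothetical names `rankByPreference` and `dedupe`; the model records are invented.

```javascript
// Sort by numeric rank (undefined or non-numeric ranks sort last), then by name.
function rankByPreference(models) {
  const rankOf = (model) => {
    const rank = Number(model.prefered_ranking);
    return Number.isFinite(rank) ? rank : 9999;
  };
  return [...models].sort((left, right) => (
    rankOf(left) - rankOf(right)
      || String(left.name || "").localeCompare(String(right.name || ""))
  ));
}

// Keep the first occurrence of each uuid (or of each name, when no uuid exists yet).
function dedupe(candidates) {
  const seen = new Set();
  return candidates.filter((candidate) => {
    const key = candidate.uuid ? `uuid:${candidate.uuid}` : `name:${candidate.name}`;
    if (!candidate.name || seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

const sampleModels = [
  { uuid: "u2", name: "beta", prefered_ranking: 1 },
  { uuid: "u1", name: "alpha", prefered_ranking: 1 },
  { uuid: "u3", name: "gamma" },
  { uuid: "u0", name: "zeta", prefered_ranking: 0 },
];
const ranked = rankByPreference(sampleModels);
// A requested model goes first; its later duplicate in the ranked list is dropped.
const ordered = dedupe([sampleModels[2], ...ranked]);
console.log(ordered.map((model) => model.name));
```

Because deduplication keeps the first occurrence, placing the requested and existing-pipeline models ahead of the ranked list is what gives them priority.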
 function positiveInteger(value, fallback) {
  const parsed = Number(value);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
 }
 function safeReason(value) {
  return redact(String(value || "")).slice(0, 1000);
 }
 async function upsertEnvLocal(path, updates) {
  let text = "";
  try {
@@ -0,0 +1,496 @@
 #!/usr/bin/env node
 import { createServer } from "node:http";
 import { mkdir, writeFile } from "node:fs/promises";
 import { dirname, resolve } from "node:path";
 import { env, exit } from "node:process";
 const args = parseArgs(process.argv.slice(2));
 const host = args.host || env.LANGBOT_FAKE_PROVIDER_HOST || "127.0.0.1";
 const port = integer(args.port ?? env.LANGBOT_FAKE_PROVIDER_PORT, 0);
 const stateFile = args["state-file"] || env.LANGBOT_FAKE_PROVIDER_STATE_FILE || "";
 const modelName = env.LANGBOT_FAKE_PROVIDER_MODEL_NAME || "gpt-4o-mini";
 const config = {
  response_text: env.LANGBOT_FAKE_PROVIDER_RESPONSE_TEXT || "OK",
  first_token_delay_ms: integer(env.LANGBOT_FAKE_PROVIDER_FIRST_TOKEN_DELAY_MS, 25),
  chunk_delay_ms: integer(env.LANGBOT_FAKE_PROVIDER_CHUNK_DELAY_MS, 10),
  chunk_count: integer(env.LANGBOT_FAKE_PROVIDER_CHUNK_COUNT, 0),
  fault_status: integer(env.LANGBOT_FAKE_PROVIDER_FAULT_STATUS, 500),
  fail_first_n: integer(env.LANGBOT_FAKE_PROVIDER_FAIL_FIRST_N, 0),
  fail_every_n: integer(env.LANGBOT_FAKE_PROVIDER_FAIL_EVERY_N, 0),
  fail_after_first_chunk: bool(env.LANGBOT_FAKE_PROVIDER_FAIL_AFTER_FIRST_CHUNK, false),
  dynamic_response: bool(env.LANGBOT_FAKE_PROVIDER_DYNAMIC_RESPONSE, true),
  request_log_limit: integer(env.LANGBOT_FAKE_PROVIDER_REQUEST_LOG_LIMIT, 500),
 };
 let requestCount = 0;
 const recentRequests = [];
 const server = createServer(async (request, response) => {
  const startedAt = Date.now();
  const startedPerf = performance.now();
  let requestRecord = null;
  const url = new URL(request.url || "/", `http://${request.headers.host || `${host}:${port}`}`);
  try {
    if (request.method === "GET" && url.pathname === "/healthz") {
      sendJson(response, 200, {
        ok: true,
        model: modelName,
        config,
        request_count: requestCount,
        recent_request_count: recentRequests.length,
      });
      return;
    }
    if (request.method === "GET" && url.pathname === "/__qa/config") {
      sendJson(response, 200, {
        ok: true,
        model: modelName,
        config,
        request_count: requestCount,
        recent_requests: recentRequests,
      });
      return;
    }
    if (request.method === "POST" && url.pathname === "/__qa/config") {
      const body = await readJson(request);
      applyConfig(body.config && typeof body.config === "object" ? body.config : body);
      if (body.reset_request_count !== false) resetRequestState();
      sendJson(response, 200, {
        ok: true,
        model: modelName,
        config,
        request_count: requestCount,
      });
      return;
    }
    if (request.method === "POST" && url.pathname === "/__qa/reset") {
      resetRequestState();
      sendJson(response, 200, {
        ok: true,
        model: modelName,
        config,
        request_count: requestCount,
      });
      return;
    }
    if (request.method === "GET" && ["/models", "/v1/models"].includes(url.pathname)) {
      sendJson(response, 200, {
        object: "list",
        data: [
          {
            id: modelName,
            object: "model",
            created: 1,
            owned_by: "langbot-qa",
            type: "llm",
          },
        ],
      });
      return;
    }
    if (request.method === "POST" && ["/chat/completions", "/v1/chat/completions"].includes(url.pathname)) {
      requestCount += 1;
      const body = await readJson(request);
      const requestId = `chatcmpl-langbot-fake-${requestCount}`;
      const shouldFail = requestCount <= config.fail_first_n
        || (config.fail_every_n > 0 && requestCount % config.fail_every_n === 0);
      const replyText = responseTextForBody(body);
      requestRecord = recordRequest({
        id: requestId,
        request_number: requestCount,
        path: url.pathname,
        stream: Boolean(body.stream),
        model: body.model || "",
        message_count: Array.isArray(body.messages) ? body.messages.length : 0,
        should_fail: shouldFail,
        status: "running",
        http_status: null,
        expected_text: replyText,
        response_text_preview: previewText(replyText),
        started_at: new Date(startedAt).toISOString(),
        started_epoch_ms: startedAt,
        configured_first_token_delay_ms: config.first_token_delay_ms,
        configured_chunk_delay_ms: config.chunk_delay_ms,
        configured_chunk_count: config.chunk_count,
      });
      if (shouldFail) {
        await sleep(config.first_token_delay_ms);
        sendJson(response, config.fault_status, {
          error: {
            message: `LangBot fake provider injected HTTP ${config.fault_status}`,
            type: "fake_provider_fault",
            code: "fake_provider_fault",
          },
        });
        finishRequestRecord(requestRecord, startedPerf, {
          status: "http_fault",
          http_status: config.fault_status,
        });
        return;
      }
      if (body.stream) {
        await streamCompletion(response, {
          requestId,
          model: body.model || modelName,
          content: replyText,
          failAfterFirstChunk: config.fail_after_first_chunk,
          requestRecord,
          startedPerf,
        });
      } else {
        await sleep(config.first_token_delay_ms + config.chunk_delay_ms);
        sendJson(response, 200, completionPayload({
          requestId,
          model: body.model || modelName,
          content: replyText,
        }));
        markRequestTiming(requestRecord, "first_chunk", startedPerf);
        markRequestTiming(requestRecord, "first_content_chunk", startedPerf);
        requestRecord.content_chunk_count = 1;
        finishRequestRecord(requestRecord, startedPerf, {
          status: "ok",
          http_status: 200,
        });
      }
      return;
    }
    sendJson(response, 404, {
      error: {
        message: `No fake provider route for ${request.method} ${url.pathname}`,
        type: "not_found",
      },
    });
  } catch (error) {
    if (requestRecord) {
      finishRequestRecord(requestRecord, startedPerf, {
        status: "fake_provider_error",
        http_status: 500,
        error: error instanceof Error ? error.message : String(error),
      });
    }
    sendJson(response, 500, {
      error: {
        message: error instanceof Error ? error.message : String(error),
        type: "fake_provider_error",
      },
    });
  } finally {
    const durationMs = Date.now() - startedAt;
    if (url.pathname !== "/healthz") {
      console.log(JSON.stringify({
        at: new Date().toISOString(),
        method: request.method,
        path: url.pathname,
        duration_ms: durationMs,
      }));
    }
  }
 });
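The fault-injection rule in the chat completions handler depends only on the request counter and two config values. A small sketch of which request numbers fail for a given setting, assuming a hypothetical standalone function `shouldInjectFault`:

```javascript
// Same rule as the handler: fail while requestNumber <= fail_first_n,
// and fail every request whose number is a multiple of fail_every_n.
function shouldInjectFault(requestNumber, { fail_first_n = 0, fail_every_n = 0 } = {}) {
  return requestNumber <= fail_first_n
    || (fail_every_n > 0 && requestNumber % fail_every_n === 0);
}

const pattern = [1, 2, 3, 4, 5, 6].map((n) => (
  shouldInjectFault(n, { fail_first_n: 1, fail_every_n: 3 })
));
console.log(pattern);
```

With `fail_first_n: 1` and `fail_every_n: 3`, requests 1, 3, and 6 fail. Since the counter is reset by `POST /__qa/reset` (and by default by `POST /__qa/config`), the pattern restarts after each reconfiguration.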
 server.listen(port, host, async () => {
  const address = server.address();
  const selectedPort = typeof address === "object" && address ? address.port : port;
  const url = `http://${host}:${selectedPort}`;
  const state = {
    status: "ready",
    pid: process.pid,
    url,
    base_url: `${url}/v1`,
    model: modelName,
    started_at: new Date().toISOString(),
  };
  if (stateFile) {
    const path = resolve(stateFile);
    await mkdir(dirname(path), { recursive: true });
    await writeFile(path, `${JSON.stringify(state, null, 2)}\n`, "utf8");
  }
  console.log(JSON.stringify(state));
 });
 server.on("error", (error) => {
  console.error(JSON.stringify({
    status: "error",
    reason: error instanceof Error ? error.message : String(error),
  }));
  exit(1);
 });
 process.on("SIGTERM", () => {
  server.close(() => exit(0));
 });
 function parseArgs(argv) {
  const result = {};
  for (const item of argv) {
    const match = item.match(/^--([^=]+)(?:=(.*))?$/);
    if (!match) continue;
    result[match[1]] = match[2] ?? "1";
  }
  return result;
 }
 function integer(value, fallback) {
  const parsed = Number.parseInt(String(value ?? ""), 10);
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : fallback;
 }
 function bool(value, fallback) {
  if (value === undefined || value === "") return fallback;
  if (/^(1|true|yes|on)$/i.test(String(value))) return true;
  if (/^(0|false|no|off)$/i.test(String(value))) return false;
  return fallback;
 }
 function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, Math.max(0, ms)));
 }
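 async function readJson(request) {
  // Buffer raw chunks first so multi-byte UTF-8 characters split across chunks decode correctly.
  const chunks = [];
  for await (const chunk of request) chunks.push(chunk);
  const text = Buffer.concat(chunks).toString("utf8");
  if (!text) return {};
  return JSON.parse(text);
 }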
 function sendJson(response, status, payload) {
  const text = `${JSON.stringify(payload)}\n`;
  response.writeHead(status, {
    "content-type": "application/json",
    "content-length": Buffer.byteLength(text),
  });
  response.end(text);
 }
 function completionPayload({ requestId, model, content }) {
  const completionTokens = tokenEstimate(content);
  return {
    id: requestId,
    object: "chat.completion",
    created: Math.floor(Date.now() / 1000),
    model,
    choices: [
      {
        index: 0,
        message: {
          role: "assistant",
          content,
        },
        finish_reason: "stop",
      },
    ],
    usage: {
      prompt_tokens: 8,
      completion_tokens: completionTokens,
      total_tokens: 8 + completionTokens,
    },
  };
 }
 async function streamCompletion(response, {
  requestId,
  model,
  content,
  failAfterFirstChunk: failMidStream,
  requestRecord,
  startedPerf,
 }) {
  response.writeHead(200, {
    "content-type": "text/event-stream; charset=utf-8",
    "cache-control": "no-cache",
    "connection": "keep-alive",
  });
  await sleep(config.first_token_delay_ms);
  markRequestTiming(requestRecord, "first_chunk", startedPerf);
  writeSse(response, {
    id: requestId,
    object: "chat.completion.chunk",
    created: Math.floor(Date.now() / 1000),
    model,
    choices: [{ index: 0, delta: { role: "assistant" }, finish_reason: null }],
  });
  const chunks = splitContent(content);
  for (let index = 0; index < chunks.length; index += 1) {
    await sleep(config.chunk_delay_ms);
    if (index === 0) markRequestTiming(requestRecord, "first_content_chunk", startedPerf);
    requestRecord.content_chunk_count = (requestRecord.content_chunk_count || 0) + 1;
    writeSse(response, {
      id: requestId,
      object: "chat.completion.chunk",
      created: Math.floor(Date.now() / 1000),
      model,
      choices: [{ index: 0, delta: { content: chunks[index] }, finish_reason: null }],
    });
    if (failMidStream && index === 0) {
      finishRequestRecord(requestRecord, startedPerf, {
        status: "mid_stream_disconnect",
        http_status: 200,
      });
      response.destroy(new Error("LangBot fake provider injected mid-stream disconnect"));
      return;
    }
  }
  await sleep(config.chunk_delay_ms);
  const completionTokens = tokenEstimate(content);
  writeSse(response, {
    id: requestId,
    object: "chat.completion.chunk",
    created: Math.floor(Date.now() / 1000),
    model,
    choices: [{ index: 0, delta: {}, finish_reason: "stop" }],
    usage: {
      prompt_tokens: 8,
      completion_tokens: completionTokens,
      total_tokens: 8 + completionTokens,
    },
  });
  response.write("data: [DONE]\n\n");
  response.end();
  finishRequestRecord(requestRecord, startedPerf, {
    status: "ok",
    http_status: 200,
  });
 }
 function writeSse(response, payload) {
  response.write(`data: ${JSON.stringify(payload)}\n\n`);
 }
 function splitContent(content) {
  const text = String(content);
  const requested = config.chunk_count;
  if (requested <= 1 || text.length <= 1) return [text];
  const chunkSize = Math.max(1, Math.ceil(text.length / requested));
  const chunks = [];
  for (let index = 0; index < text.length; index += chunkSize) {
    chunks.push(text.slice(index, index + chunkSize));
  }
  return chunks;
 }
 function tokenEstimate(content) {
  return Math.max(1, Math.ceil(String(content || "").length / 4));
 }
 function responseTextForBody(body) {
  if (!config.dynamic_response) {
    return config.response_text;
  }
  const messages = Array.isArray(body.messages) ? body.messages : [];
  const lastUser = [...messages].reverse().find((message) => message?.role === "user");
  const text = flattenContent(lastUser?.content || "");
  const quoted = text.match(/["'“”](.{1,80}?)["'“”]/);
  if (quoted?.[1]) return quoted[1].trim();
  const exact = text.match(/(?:reply|回复|输出|return)\s+(?:exactly\s+)?([A-Za-z0-9_.:@-]{1,80})/i);
  if (exact?.[1]) return exact[1].trim().replace(/[。.!?]+$/, "");
  const only = text.match(/只回复\s*([A-Za-z0-9_.:@-]{1,80})/);
  if (only?.[1]) return only[1].trim().replace(/[。.!?]+$/, "");
  return config.response_text;
 }
 function flattenContent(content) {
  if (typeof content === "string") return content;
  if (Array.isArray(content)) {
    return content
      .map((item) => {
        if (typeof item === "string") return item;
        if (item && typeof item === "object") return item.text || "";
        return "";
      })
      .join("\n");
  }
  return "";
 }
 function recordRequest(entry) {
  const item = {
    ...entry,
    at: new Date().toISOString(),
    finished_at: null,
    finished_epoch_ms: null,
    duration_ms: null,
    first_chunk_at: null,
    first_chunk_epoch_ms: null,
    first_chunk_ms: null,
    first_content_chunk_at: null,
    first_content_chunk_epoch_ms: null,
    first_content_chunk_ms: null,
    content_chunk_count: 0,
  };
  recentRequests.push(item);
  while (recentRequests.length > config.request_log_limit) recentRequests.shift();
  return item;
 }
 function markRequestTiming(entry, key, startedPerf) {
  if (!entry || entry[`${key}_at`]) return;
  const now = Date.now();
  entry[`${key}_at`] = new Date(now).toISOString();
  entry[`${key}_epoch_ms`] = now;
  entry[`${key}_ms`] = rounded(performance.now() - startedPerf);
 }
 function finishRequestRecord(entry, startedPerf, updates = {}) {
  if (!entry || entry.finished_at) return;
  const now = Date.now();
  Object.assign(entry, updates);
  entry.finished_at = new Date(now).toISOString();
  entry.finished_epoch_ms = now;
  entry.duration_ms = rounded(performance.now() - startedPerf);
 }
 function rounded(value) {
  return Number(value.toFixed(3));
 }
 function previewText(value) {
  return String(value || "").slice(0, 120);
 }
 function resetRequestState() {
  requestCount = 0;
  recentRequests.length = 0;
 }
 function applyConfig(updates) {
  if (!updates || typeof updates !== "object") return;
  assignString(updates, "response_text");
  assignNonNegativeInteger(updates, "first_token_delay_ms");
  assignNonNegativeInteger(updates, "chunk_delay_ms");
  assignNonNegativeInteger(updates, "chunk_count");
  assignNonNegativeInteger(updates, "fail_first_n");
  assignNonNegativeInteger(updates, "fail_every_n");
  assignNonNegativeInteger(updates, "request_log_limit");
  if (updates.fault_status !== undefined) {
    const parsed = Number.parseInt(String(updates.fault_status), 10);
    if (Number.isInteger(parsed) && parsed >= 400 && parsed <= 599) config.fault_status = parsed;
  }
  assignBoolean(updates, "fail_after_first_chunk");
  assignBoolean(updates, "dynamic_response");
 }
 function assignString(updates, key) {
  if (updates[key] !== undefined) config[key] = String(updates[key]);
 }
 function assignNonNegativeInteger(updates, key) {
  if (updates[key] === undefined) return;
  const parsed = Number.parseInt(String(updates[key]), 10);
  if (Number.isInteger(parsed) && parsed >= 0) config[key] = parsed;
 }
 function assignBoolean(updates, key) {
  if (updates[key] === undefined) return;
  config[key] = bool(updates[key], config[key]);
 }
@@ -72,6 +72,7 @@ export async function writeResult(paths, result) {
 }
 export async function loadEnvFiles(paths = ["skills/.env", "skills/.env.local"]) {
  const processEnvKeys = new Set(Object.keys(env));
  for (const path of paths) {
    let text = "";
    try {
@@ -86,7 +87,7 @@ export async function loadEnvFiles(paths = ["skills/.env", "skills/.env.local"])
      if (equals <= 0) continue;
      const key = trimmed.slice(0, equals).trim();
      const value = trimmed.slice(equals + 1).trim().replace(/^["']|["']$/g, "");
-      if (!(key in env)) env[key] = value;
+      if (!processEnvKeys.has(key)) env[key] = value;
    }
  }
 }
@@ -54,6 +54,7 @@ const debugChatSessionType = env.LANGBOT_E2E_DEBUG_CHAT_SESSION_TYPE || "person"
 const pipelineConfigDiagnosticPath = resolve(paths.evidenceDir, "pipeline-config-diagnostic.json");
 const debugChatResetDiagnosticPath = resolve(paths.evidenceDir, "debug-chat-reset-diagnostic.json");
 const pipelineConfigRestoreDiagnosticPath = resolve(paths.evidenceDir, "pipeline-config-restore-diagnostic.json");
 const metricsPath = resolve(paths.evidenceDir, "metrics.json");
 const startedAt = new Date();
 let browser;
@@ -80,10 +81,11 @@ let result = {
    console_log: paths.consoleLog,
    network_log: paths.networkLog,
    screenshot: paths.screenshot,
    metrics_json: metricsPath,
    automation_result_json: paths.automationResultJson,
    result_json: paths.resultJson,
  },
-  evidence_collected: ["ui", "screenshot", "console", "network"],
+  evidence_collected: ["ui", "screenshot", "console", "network", "metrics"],
 };
 function boolFromEnv(value, defaultValue) {
@@ -103,6 +105,29 @@ function parseJsonEnv(key, fallback) {
  }
 }
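 function positiveNumberEnv(key, fallback) {
  // Number("") is 0, so an unset or blank variable must fall back explicitly.
  const raw = String(env[key] ?? "").trim();
  if (raw === "") return fallback;
  const value = Number(raw);
  return Number.isFinite(value) && value >= 0 ? value : fallback;
 }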
 function percentile(values, percentileValue) {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  // Nearest-rank index, clamped so a percentile of 0 maps to the minimum instead of index -1.
  const rank = Math.ceil((percentileValue / 100) * sorted.length) - 1;
  const index = Math.max(0, Math.min(sorted.length - 1, rank));
  return Number(sorted[index].toFixed(3));
 }
 function stats(values) {
  if (values.length === 0) return { min: 0, p50: 0, p95: 0, p99: 0, max: 0 };
  return {
    min: Number(Math.min(...values).toFixed(3)),
    p50: percentile(values, 50),
    p95: percentile(values, 95),
    p99: percentile(values, 99),
    max: Number(Math.max(...values).toFixed(3)),
  };
 }
 function promptStepsFromEnv() {
  const rawSteps = parseJsonEnv("LANGBOT_E2E_PROMPTS_JSON", null);
  if (rawSteps === null) {
@@ -658,6 +683,7 @@ try {
      } else {
        for (let index = 0; index < promptSteps.length; index += 1) {
          const step = promptSteps[index];
          const promptStartedAt = Date.now();
          const chatResult = await runDebugChatPrompt(page, {
            prompt: step.prompt,
            expectedText: step.expectedText,
@@ -665,11 +691,13 @@ try {
            imagePath: index === 0 ? imagePath : "",
            failureSignals: failureSignals.length > 0 ? failureSignals : undefined,
          });
          const promptDurationMs = Date.now() - promptStartedAt;
          result.chat_results.push({
            index,
            expected_text: step.expectedText,
            status: chatResult.status,
            reason: chatResult.reason,
            response_duration_ms: promptDurationMs,
            min_expected_count: chatResult.min_expected_count,
            final_count: chatResult.final_count,
            before_assistant_expected_count: chatResult.before_assistant_expected_count,
@@ -714,6 +742,56 @@ try {
  const finishedAt = new Date();
  result.finished_at = finishedAt.toISOString();
  result.finished_at_local = localIsoWithOffset(finishedAt);
  result.duration_ms = finishedAt.getTime() - startedAt.getTime();
  const responseDurations = result.chat_results
    .map((item) => item.response_duration_ms)
    .filter((value) => Number.isFinite(value));
  const passedPrompts = result.chat_results.filter((item) => item.status === "pass").length;
  const attemptedPrompts = result.chat_results.length;
  const errorRate = attemptedPrompts === 0 ? 1 : Number(((attemptedPrompts - passedPrompts) / attemptedPrompts).toFixed(4));
  const responseStats = stats(responseDurations);
  const responseP95BudgetMs = positiveNumberEnv(
    "LANGBOT_E2E_DEBUG_CHAT_RESPONSE_P95_MS",
    positiveNumberEnv("LANGBOT_DEBUG_CHAT_RESPONSE_P95_MS", safeResponseTimeoutMs),
  );
  const maxErrorRate = positiveNumberEnv("LANGBOT_E2E_DEBUG_CHAT_MAX_ERROR_RATE", 0);
  const metrics = {
    probe: caseId,
    url: result.url,
    prompt_count: result.prompt_count,
    attempted_prompt_count: attemptedPrompts,
    passed_prompt_count: passedPrompts,
    error_rate: errorRate,
    response_duration_ms: responseStats,
    total_duration_ms: result.duration_ms,
    chat_results: result.chat_results,
  };
  result.metrics_summary = {
    prompt_count: metrics.prompt_count,
    attempted_prompt_count: metrics.attempted_prompt_count,
    passed_prompt_count: metrics.passed_prompt_count,
    error_rate: metrics.error_rate,
    response_p50_ms: metrics.response_duration_ms.p50,
    response_p95_ms: metrics.response_duration_ms.p95,
    total_duration_ms: metrics.total_duration_ms,
  };
  result.thresholds_summary = {
    response_p95_ms: {
      actual: metrics.response_duration_ms.p95,
      max: responseP95BudgetMs,
      pass: attemptedPrompts > 0 && metrics.response_duration_ms.p95 <= responseP95BudgetMs,
    },
    error_rate: {
      actual: metrics.error_rate,
      max: maxErrorRate,
      pass: metrics.error_rate <= maxErrorRate,
    },
  };
  await writeFile(metricsPath, `${JSON.stringify(metrics, null, 2)}\n`, "utf8");
  if (result.status === "pass" && !Object.values(result.thresholds_summary).every((item) => item.pass)) {
    result.status = "fail";
    result.reason = "Debug Chat performance breached response latency or error-rate thresholds.";
  }
  const existingEvidence = {};
  for (const [key, value] of Object.entries(result.evidence)) {
    if (typeof value !== "string") continue;
@@ -130,6 +130,7 @@
        "references/local-agent-runner.md",
        "references/mcp-stdio-testing.md",
        "references/model-provider-testing.md",
        "references/performance-reliability-testing.md",
        "references/pipeline-debug-chat.md",
        "references/plugin-e2e-smoke.md",
        "references/sandbox-skill-authoring.md",
@@ -150,6 +151,16 @@
        "agent-runner-release-preflight",
        "agent-runner-runtime-chaos",
        "dify-agent-debug-chat",
        "langbot-fake-provider-debug-chat-cross-pipeline-isolation",
        "langbot-fake-provider-debug-chat-fault-recovery",
        "langbot-fake-provider-debug-chat-load",
        "langbot-fake-provider-debug-chat-slow-load",
        "langbot-fault-taxonomy-contract",
        "langbot-live-backend-latency",
        "langbot-live-backend-log-health",
        "langbot-live-control-plane-api",
        "langbot-overhead-accounting-contract",
        "langbot-space-debug-chat-concurrency-smoke",
        "langrag-kb-retrieve",
        "langrag-parser-golden-e2e",
        "langrag-sentinel-kb-discover",
@@ -165,6 +176,7 @@
        "mcp-stdio-register",
        "mcp-stdio-tool-call",
        "pipeline-debug-chat",
        "pipeline-debug-chat-performance",
        "plugin-e2e-smoke",
        "provider-deepseek",
        "qa-plugin-smoke-live-install",
@@ -486,6 +498,316 @@
            "backend_log"
          ]
        },
        {
          "id": "langbot-fake-provider-debug-chat-cross-pipeline-isolation",
          "title": "LangBot Debug Chat fake-provider cross-pipeline isolation probe",
          "mode": "probe",
          "area": "reliability",
          "type": "reliability",
          "priority": "p1",
          "risk": "high",
          "ci_eligible": false,
          "tags": [
            "reliability",
            "debug-chat",
            "websocket",
            "fake-provider",
            "isolation",
            "concurrency",
            "metrics"
          ],
          "automation": "skills/langbot-testing/probes/langbot-debug-chat-cross-pipeline-isolation.mjs",
          "setup_automation": [
            "node:scripts/e2e/ensure-fake-provider-cross-pipelines.mjs --write-env"
          ],
          "setup_provides_env": [
            "LANGBOT_FAKE_PROVIDER_URL",
            "LANGBOT_FAKE_PROVIDER_BASE_URL",
            "LANGBOT_FAKE_PROVIDER_PID",
            "LANGBOT_FAKE_PROVIDER_PIPELINE_A_URL",
            "LANGBOT_FAKE_PROVIDER_PIPELINE_A_NAME",
            "LANGBOT_FAKE_PROVIDER_PIPELINE_B_URL",
            "LANGBOT_FAKE_PROVIDER_PIPELINE_B_NAME"
          ],
          "evidence_required": [
            "metrics",
            "network",
            "api_diagnostic",
            "filesystem"
          ]
        },
        {
          "id": "langbot-fake-provider-debug-chat-fault-recovery",
          "title": "LangBot Debug Chat fake-provider fault recovery probe",
          "mode": "probe",
          "area": "reliability",
          "type": "chaos",
          "priority": "p1",
          "risk": "high",
          "ci_eligible": false,
          "tags": [
            "reliability",
            "chaos",
            "debug-chat",
            "websocket",
            "fake-provider",
            "fault-injection",
            "metrics"
          ],
          "automation": "skills/langbot-testing/probes/langbot-debug-chat-concurrency.mjs",
          "setup_automation": [
            "node:scripts/e2e/ensure-fake-provider-pipeline.mjs --write-env"
          ],
          "setup_provides_env": [
            "LANGBOT_FAKE_PROVIDER_URL",
            "LANGBOT_FAKE_PROVIDER_BASE_URL",
            "LANGBOT_FAKE_PROVIDER_PID",
            "LANGBOT_FAKE_PROVIDER_PROVIDER_UUID",
            "LANGBOT_FAKE_PROVIDER_MODEL_UUID",
            "LANGBOT_FAKE_PROVIDER_PIPELINE_URL",
            "LANGBOT_FAKE_PROVIDER_PIPELINE_NAME"
          ],
          "evidence_required": [
            "metrics",
            "network",
            "api_diagnostic",
            "filesystem"
          ]
        },
        {
          "id": "langbot-fake-provider-debug-chat-load",
          "title": "LangBot Debug Chat controlled fake-provider load probe",
          "mode": "probe",
          "area": "performance",
          "type": "performance",
          "priority": "p1",
          "risk": "medium",
          "ci_eligible": false,
          "tags": [
            "performance",
            "debug-chat",
            "websocket",
            "fake-provider",
            "load",
            "metrics"
          ],
          "automation": "skills/langbot-testing/probes/langbot-debug-chat-concurrency.mjs",
          "setup_automation": [
            "node:scripts/e2e/ensure-fake-provider-pipeline.mjs --write-env"
          ],
          "setup_provides_env": [
            "LANGBOT_FAKE_PROVIDER_URL",
            "LANGBOT_FAKE_PROVIDER_BASE_URL",
            "LANGBOT_FAKE_PROVIDER_PID",
            "LANGBOT_FAKE_PROVIDER_PROVIDER_UUID",
            "LANGBOT_FAKE_PROVIDER_MODEL_UUID",
            "LANGBOT_FAKE_PROVIDER_PIPELINE_URL",
            "LANGBOT_FAKE_PROVIDER_PIPELINE_NAME"
          ],
          "evidence_required": [
            "metrics",
            "network",
            "api_diagnostic",
            "filesystem"
          ]
        },
        {
          "id": "langbot-fake-provider-debug-chat-slow-load",
          "title": "LangBot Debug Chat slow fake-provider load probe",
          "mode": "probe",
          "area": "performance",
          "type": "performance",
          "priority": "p1",
          "risk": "medium",
          "ci_eligible": false,
          "tags": [
            "performance",
            "debug-chat",
            "websocket",
            "fake-provider",
            "slow-provider",
            "load",
            "metrics"
          ],
          "automation": "skills/langbot-testing/probes/langbot-debug-chat-concurrency.mjs",
          "setup_automation": [
            "node:scripts/e2e/ensure-fake-provider-pipeline.mjs --write-env"
          ],
          "setup_provides_env": [
            "LANGBOT_FAKE_PROVIDER_URL",
            "LANGBOT_FAKE_PROVIDER_BASE_URL",
            "LANGBOT_FAKE_PROVIDER_PID",
            "LANGBOT_FAKE_PROVIDER_PROVIDER_UUID",
            "LANGBOT_FAKE_PROVIDER_MODEL_UUID",
            "LANGBOT_FAKE_PROVIDER_PIPELINE_URL",
            "LANGBOT_FAKE_PROVIDER_PIPELINE_NAME"
          ],
          "evidence_required": [
            "metrics",
            "network",
            "api_diagnostic",
            "filesystem"
          ]
        },
        {
          "id": "langbot-fault-taxonomy-contract",
          "title": "LangBot fault taxonomy and cleanup contract",
          "mode": "probe",
          "area": "reliability",
          "type": "chaos",
          "priority": "p1",
          "risk": "medium",
          "ci_eligible": true,
          "tags": [
            "reliability",
            "chaos",
            "contract",
            "synthetic"
          ],
          "automation": "skills/langbot-testing/probes/langbot-fault-taxonomy-contract.mjs",
          "setup_automation": [],
          "setup_provides_env": [],
          "evidence_required": [
            "metrics",
            "filesystem"
          ]
        },
        {
          "id": "langbot-live-backend-latency",
          "title": "LangBot live backend basic latency probe",
          "mode": "probe",
          "area": "performance",
          "type": "performance",
          "priority": "p1",
          "risk": "medium",
          "ci_eligible": false,
          "tags": [
            "performance",
            "live-backend",
            "latency",
            "metrics"
          ],
          "automation": "skills/langbot-testing/probes/langbot-live-backend-latency.mjs",
          "setup_automation": [],
          "setup_provides_env": [],
          "evidence_required": [
            "metrics",
            "network",
            "api_diagnostic",
            "filesystem"
          ]
        },
        {
          "id": "langbot-live-backend-log-health",
          "title": "LangBot live backend log health probe",
          "mode": "probe",
          "area": "reliability",
          "type": "reliability",
          "priority": "p1",
          "risk": "medium",
          "ci_eligible": false,
          "tags": [
            "reliability",
            "live-backend",
            "backend-log",
            "metrics"
          ],
          "automation": "skills/langbot-testing/probes/langbot-live-backend-log-health.mjs",
          "setup_automation": [],
          "setup_provides_env": [],
          "evidence_required": [
            "metrics",
            "backend_log",
            "filesystem"
          ]
        },
        {
          "id": "langbot-live-control-plane-api",
          "title": "LangBot live control-plane API probe",
          "mode": "probe",
          "area": "performance",
          "type": "performance",
          "priority": "p1",
          "risk": "medium",
          "ci_eligible": false,
          "tags": [
            "performance",
            "reliability",
            "live-backend",
            "control-plane",
            "metrics"
          ],
          "automation": "skills/langbot-testing/probes/langbot-live-control-plane-api.mjs",
          "setup_automation": [],
          "setup_provides_env": [],
          "evidence_required": [
            "metrics",
            "network",
            "api_diagnostic",
            "filesystem"
          ]
        },
        {
          "id": "langbot-overhead-accounting-contract",
          "title": "LangBot overhead accounting metrics contract",
          "mode": "probe",
          "area": "performance",
          "type": "performance",
          "priority": "p1",
          "risk": "medium",
          "ci_eligible": true,
          "tags": [
            "performance",
            "metrics",
            "contract",
            "synthetic"
          ],
          "automation": "skills/langbot-testing/probes/langbot-overhead-accounting-contract.mjs",
          "setup_automation": [],
          "setup_provides_env": [],
          "evidence_required": [
            "metrics",
            "resource_log",
            "filesystem"
          ]
        },
        {
          "id": "langbot-space-debug-chat-concurrency-smoke",
          "title": "LangBot Debug Chat real Space-provider concurrency smoke",
          "mode": "probe",
          "area": "performance",
          "type": "performance",
          "priority": "p1",
          "risk": "high",
          "ci_eligible": false,
          "tags": [
            "performance",
            "debug-chat",
            "websocket",
            "space",
            "live-provider",
            "smoke",
            "metrics"
          ],
          "automation": "skills/langbot-testing/probes/langbot-debug-chat-concurrency.mjs",
          "setup_automation": [
            "node:scripts/e2e/ensure-local-agent-pipeline.mjs --write-env"
          ],
          "setup_provides_env": [
            "LANGBOT_PIPELINE_URL",
            "LANGBOT_PIPELINE_NAME",
            "LANGBOT_LOCAL_AGENT_PIPELINE_URL",
            "LANGBOT_LOCAL_AGENT_PIPELINE_NAME",
            "LANGBOT_LOCAL_AGENT_MODEL_UUID",
            "LANGBOT_E2E_MODEL_UUID"
          ],
          "evidence_required": [
            "metrics",
            "network",
            "api_diagnostic",
            "filesystem"
          ]
        },
        {
          "id": "langrag-kb-retrieve",
          "title": "LangRAG knowledge base ingests and retrieves a sentinel document",
@@ -911,6 +1233,38 @@
            "backend_log"
          ]
        },
        {
          "id": "pipeline-debug-chat-performance",
          "title": "Pipeline Debug Chat user-path performance probe",
          "mode": "agent-browser",
          "area": "pipeline",
          "type": "performance",
          "priority": "p1",
          "risk": "medium",
          "ci_eligible": false,
          "tags": [
            "performance",
            "pipeline",
            "debug-chat",
            "user-path",
            "metrics"
          ],
          "automation": "scripts/e2e/pipeline-debug-chat.mjs",
          "setup_automation": [
            "node:scripts/e2e/ensure-local-agent-pipeline.mjs --write-env"
          ],
          "setup_provides_env": [
            "LANGBOT_PIPELINE_URL",
            "LANGBOT_PIPELINE_NAME"
          ],
          "evidence_required": [
            "ui",
            "screenshot",
            "console",
            "network",
            "metrics"
          ]
        },
        {
          "id": "plugin-e2e-smoke",
          "title": "Plugin system installs a local plugin and exposes tool/page APIs",
@@ -1059,6 +1413,12 @@
      "suites": [
        "agent-runner-release-gate",
        "core-smoke",
        "langbot-debug-chat-isolation-gate",
        "langbot-debug-chat-load-gate",
        "langbot-live-backend-gate",
        "langbot-performance-contract-gate",
        "langbot-performance-reliability-gate",
        "langbot-user-path-performance-gate",
        "local-agent-gate"
      ],
      "suite_summaries": [
@@ -1121,6 +1481,113 @@
            "local-agent-basic-debug-chat"
          ]
        },
        {
          "id": "langbot-debug-chat-isolation-gate",
          "title": "LangBot Debug Chat isolation gate",
          "description": "Manual/non-required cross-pipeline Debug Chat isolation gate. Current releases may fail this gate because of product bug #2286; use it as regression evidence after the routing fix lands.",
          "type": "reliability",
          "priority": "p1",
          "tags": [
            "reliability",
            "debug-chat",
            "websocket",
            "isolation",
            "concurrency"
          ],
          "cases": [
            "langbot-fake-provider-debug-chat-cross-pipeline-isolation"
          ]
        },
        {
          "id": "langbot-debug-chat-load-gate",
          "title": "LangBot Debug Chat load gate",
          "description": "Manual/non-required message-path load checks for Pipeline Debug Chat: controlled fake-provider baseline, slow-provider and fault-recovery profiles, plus optional real Space-provider smoke. Cross-pipeline isolation is split into langbot-debug-chat-isolation-gate because current releases may fail it due to product bug #2286.",
          "type": "performance",
          "priority": "p1",
          "tags": [
            "performance",
            "debug-chat",
            "websocket",
            "load"
          ],
          "cases": [
            "langbot-fake-provider-debug-chat-load",
            "langbot-fake-provider-debug-chat-slow-load",
            "langbot-fake-provider-debug-chat-fault-recovery",
            "langbot-space-debug-chat-concurrency-smoke"
          ]
        },
        {
          "id": "langbot-live-backend-gate",
          "title": "LangBot live backend reliability gate",
          "description": "Live backend control-plane responsiveness and runtime log health checks for a locally running LangBot instance.",
          "type": "reliability",
          "priority": "p1",
          "tags": [
            "performance",
            "reliability",
            "live-backend",
            "metrics"
          ],
          "cases": [
            "langbot-live-backend-latency",
            "langbot-live-control-plane-api",
            "langbot-live-backend-log-health"
          ]
        },
        {
          "id": "langbot-performance-contract-gate",
          "title": "LangBot performance contract gate",
          "description": "Fast synthetic contract checks for performance metric accounting and non-destructive reliability fault taxonomy.",
          "type": "contract",
          "priority": "p1",
          "tags": [
            "performance",
            "reliability",
            "contract",
            "metrics"
          ],
          "cases": [
            "langbot-overhead-accounting-contract",
            "langbot-fault-taxonomy-contract"
          ]
        },
        {
          "id": "langbot-performance-reliability-gate",
          "title": "LangBot performance and reliability starter gate",
          "description": "Starter gate for LangBot performance accounting, live backend control-plane latency, and non-destructive fault taxonomy checks.",
          "type": "reliability",
          "priority": "p1",
          "tags": [
            "performance",
            "reliability",
            "metrics",
            "chaos"
          ],
          "cases": [
            "langbot-overhead-accounting-contract",
            "langbot-fault-taxonomy-contract",
            "langbot-live-backend-latency",
            "langbot-live-control-plane-api",
            "langbot-live-backend-log-health"
          ]
        },
        {
          "id": "langbot-user-path-performance-gate",
          "title": "LangBot user-path performance gate",
          "description": "Browser-visible performance checks for user-facing LangBot paths such as Pipeline Debug Chat.",
          "type": "performance",
          "priority": "p1",
          "tags": [
            "performance",
            "browser",
            "debug-chat",
            "user-path"
          ],
          "cases": [
            "pipeline-debug-chat-performance"
          ]
        },
        {
          "id": "local-agent-gate",
          "title": "Local Agent runner regression gate",
@@ -1265,6 +1732,7 @@
        "sandbox-native-tools-unavailable",
        "socks-proxy-without-socksio",
        "survey-widget-blocks-debug-chat",
        "telemetry-proxy-noise",
        "tool-name-collision-between-mcp-and-plugin",
        "uv-run-resyncs-local-sdk"
      ],
@@ -1449,6 +1917,14 @@
            "mcp-stdio-tool-call"
          ]
        },
        {
          "id": "telemetry-proxy-noise",
          "title": "Telemetry posting fails through the proxy while the target flow succeeds",
          "category": "env_issue",
          "related_cases": [
            "langbot-space-debug-chat-concurrency-smoke"
          ]
        },
        {
          "id": "tool-name-collision-between-mcp-and-plugin",
          "title": "MCP and plugin expose the same tool name",
@@ -26,6 +26,23 @@ LANGBOT_NO_PROXY=localhost,127.0.0.1,::1
 LANGBOT_PIPELINE_URL=
 LANGBOT_PIPELINE_NAME=
 # Optional fake OpenAI-compatible provider controls for Debug Chat load tests.
 # Leave URL empty to let setup automation start a local provider and write the
 # selected URL to skills/.env.local.
 LANGBOT_FAKE_PROVIDER_URL=
 LANGBOT_FAKE_PROVIDER_HOST=127.0.0.1
 LANGBOT_FAKE_PROVIDER_PORT=
 LANGBOT_FAKE_PROVIDER_MODEL_NAME=gpt-4o-mini
 LANGBOT_FAKE_PROVIDER_RESPONSE_TEXT=OK
 LANGBOT_FAKE_PROVIDER_FIRST_TOKEN_DELAY_MS=25
 LANGBOT_FAKE_PROVIDER_CHUNK_DELAY_MS=10
 LANGBOT_FAKE_PROVIDER_CHUNK_COUNT=0
 LANGBOT_FAKE_PROVIDER_FAIL_FIRST_N=0
 LANGBOT_FAKE_PROVIDER_FAIL_EVERY_N=0
 LANGBOT_FAKE_PROVIDER_FAULT_STATUS=500
 LANGBOT_FAKE_PROVIDER_FAIL_AFTER_FIRST_CHUNK=false
 LANGBOT_FAKE_PROVIDER_DYNAMIC_RESPONSE=true
 # Optional case-specific runner targets. Prefer these for runner-specific cases
 # so the automation cannot silently test the wrong runner.
 LANGBOT_LOCAL_AGENT_PIPELINE_URL=
@@ -53,7 +53,7 @@ Start the new frontend from the web repo:
 ```bash
 cd "$LANGBOT_WEB_REPO"
-npm run dev
+VITE_API_BASE_URL="$LANGBOT_BACKEND_URL" pnpm dev --host 0.0.0.0
 ```
 Healthy startup includes:
@@ -68,6 +68,10 @@ Quick check:
 curl -I --max-time 3 "$LANGBOT_FRONTEND_URL"
 ```
 If `VITE_API_BASE_URL` is missing, Vite still serves the page but frontend API
 calls may go to the frontend port instead of the backend port. That produces
 false browser failures in login, wizard, pipeline, and Debug Chat cases.
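
The misconfiguration described above can be caught before any browser case runs. The following is a minimal sketch of such a preflight check, not part of the repository: the function name `checkFrontendApiTarget` and its return shape are hypothetical, and the URLs in the usage note are only the documented default ports (`3000` for the frontend, `5300` for the backend).

```javascript
// Hypothetical preflight helper (illustration only, not an existing LangBot script).
// It flags the two cases that lead to frontend API calls hitting the frontend port:
// VITE_API_BASE_URL is unset, or it points at the same origin as the frontend.
function checkFrontendApiTarget({ apiBaseUrl, frontendUrl }) {
  if (!apiBaseUrl) {
    return { ok: false, reason: "VITE_API_BASE_URL is not set" };
  }
  let api;
  let frontend;
  try {
    api = new URL(apiBaseUrl);
    frontend = new URL(frontendUrl);
  } catch {
    return { ok: false, reason: "VITE_API_BASE_URL or the frontend URL is not a valid URL" };
  }
  if (api.origin === frontend.origin) {
    return { ok: false, reason: "VITE_API_BASE_URL points at the frontend origin" };
  }
  return { ok: true, reason: "" };
}
```

For example, `checkFrontendApiTarget({ apiBaseUrl: "http://127.0.0.1:5300", frontendUrl: "http://127.0.0.1:3000" })` reports `ok: true`, while an empty `apiBaseUrl` reports `ok: false` with a reason that can be recorded as an environment issue rather than a product failure.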
 ## Completion Signal
 Environment setup is not complete until the required frontend/backend URLs are reachable and the chosen browser-control path can open the WebUI.
@@ -42,6 +42,38 @@ MyPlugin/
 Each component has a `.yaml` (metadata) and `.py` (implementation).
 ## README & i18n convention (enforced on the marketplace)
 A plugin published to LangBot Space serves a localized README on its detail page.
 The resolver (`langbot-space` `PluginService.GetPluginREADME`) works like this:
 - **Root `README.md` MUST be in English.** It is the default and the fallback —
  when no per-language README matches the viewer's locale, the page serves the
  root `README.md`. A non-English root README makes the English/default view show
  the wrong language.
 - **All other languages live under `readme/README_{lang}.md`** — e.g.
  `readme/README_zh_Hans.md`, `readme/README_ja_JP.md`. The 8 supported locales:
  `en_US, zh_Hans, zh_Hant, ja_JP, th_TH, vi_VN, es_ES, ru_RU`.
 - `manifest.yaml` `metadata.label` / `metadata.description` should carry the same
  8-locale i18n set, and `repository` must be a real, reachable URL.
 ```
 MyPlugin/
 ├── manifest.yaml
 ├── README.md                   # English (default + fallback) — REQUIRED, must be English
 └── readme/
    ├── README_zh_Hans.md
    ├── README_zh_Hant.md
    ├── README_ja_JP.md
    ├── README_th_TH.md
    ├── README_vi_VN.md
    ├── README_es_ES.md
    └── README_ru_RU.md
 ```
 `manifest.yaml` (incl. `repository`) is the source of truth — the marketplace
 syncs from it, so edit the package and re-publish rather than patching live data.
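The fallback rule described above can be illustrated with a small sketch. This is not the actual `PluginService.GetPluginREADME` implementation, which lives in `langbot-space`; `resolveReadmePath` and its arguments are hypothetical, and mapping `en_US` straight to the root `README.md` is an assumption based on the layout shown.

```javascript
// The 8 locales listed in the convention above.
const SUPPORTED_LOCALES = [
  "en_US", "zh_Hans", "zh_Hant", "ja_JP", "th_TH", "vi_VN", "es_ES", "ru_RU",
];

// Hypothetical helper: pick the README path to serve for a viewer locale.
// `availableFiles` is a list of package-relative paths present in the plugin.
function resolveReadmePath(locale, availableFiles) {
  const files = new Set(availableFiles);
  if (locale !== "en_US" && SUPPORTED_LOCALES.includes(locale)) {
    const candidate = `readme/README_${locale}.md`;
    if (files.has(candidate)) {
      return candidate;
    }
  }
  // Default and fallback: the English root README.
  return "README.md";
}
```

With only `README.md` and `readme/README_ja_JP.md` in the package, a `ja_JP` viewer would get the Japanese file and every other locale would fall back to the root README, which is why the root file has to be English.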
 ## Critical SDK Pitfalls
 ### 1. MessageChain is a RootModel — iterate directly
@@ -21,6 +21,7 @@ Use this skill when an agent needs to verify LangBot behavior through the WebUI
 - **Sandbox-backed skill authoring**: read `references/sandbox-skill-authoring.md`.
 - **LangRAG knowledge bases**: read `references/langrag-knowledge-base.md`.
 - **MCP stdio tool testing**: read `references/mcp-stdio-testing.md`.
 - **Performance, reliability, or chaos probes**: read `references/performance-reliability-testing.md`.
 - **Drive a live instance over MCP (not raw HTTP)**: use the `langbot-mcp-ops` skill — the instance exposes an MCP server at `http://<host>:5300/mcp` (reuses API keys). Useful for setting up bots/pipelines/models as test fixtures programmatically.
 - **Known failures and fixes**: read `references/troubleshooting.md`.
 - **Reusable test groups**: run `bin/lbs suite list` and `bin/lbs suite plan <suite-id>` before manually assembling a case set.
@@ -36,6 +37,8 @@ Use this skill when an agent needs to verify LangBot behavior through the WebUI
 - Use an authenticated browser profile prepared by `langbot-env-setup`.
 - Do not expose API keys, OAuth secrets, tokens, or localStorage token values in output.
 - A WebUI test is not complete until the visible UI result is checked against backend logs or network behavior.
 - A performance result is not complete without `metrics` evidence and a clear split between LangBot overhead and external provider/tool/network time.
 - A chaos or reliability result is not complete until the fault scope, cleanup, and recovery checks are recorded.
 - For a suite, use `bin/lbs suite start <suite-id>` to create the suite evidence root, per-case directories, and `suite-start.json`/`suite-start.md` handoff files; use `bin/lbs test result <case-id>` to write final per-case `result.json`, then run `bin/lbs suite report <suite-id> --evidence-dir <dir>`.
 - Do not mark a case `pass` until `test result --evidence` covers every value in the case's `evidence_required`.
 - For runner-specific Debug Chat cases, use the case-specific pipeline env declared by `automation_pipeline_url_env` / `automation_pipeline_name_env`; do not silently reuse a generic `LANGBOT_PIPELINE_URL`.
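To make the split between LangBot overhead and external time concrete, here is a minimal sketch. The sample shape (`e2eMs`, `providerMs`, `toolMs`, `networkMs`) and the helper names are assumptions for illustration; the output keys reuse metric names that appear in the probe definitions, but the harness may compute them differently. Percentiles use the nearest-rank method.

```javascript
// Nearest-rank percentile; returns null for an empty list.
function percentile(values, p) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical summary: overhead is end-to-end time minus all external segments.
function summarizeOverhead(samples) {
  const externals = samples.map((s) => s.providerMs + s.toolMs + s.networkMs);
  const overheads = samples.map((s, i) => s.e2eMs - externals[i]);
  return {
    sample_count: samples.length,
    e2e_latency_p95_ms: percentile(samples.map((s) => s.e2eMs), 95),
    external_latency_p95_ms: percentile(externals, 95),
    langbot_overhead_p95_ms: percentile(overheads, 95),
  };
}
```

A report that carries only the end-to-end number cannot tell a slow provider from a slow pipeline; keeping the external segments per sample is what allows the overhead figure to be derived at all.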
@@ -0,0 +1,84 @@
 id: langbot-fake-provider-debug-chat-cross-pipeline-isolation
 title: "LangBot Debug Chat fake-provider cross-pipeline isolation probe"
 mode: probe
 area: reliability
 type: reliability
 priority: p1
 risk: high
 ci_eligible: false
 tags:
  - reliability
  - debug-chat
  - websocket
  - fake-provider
  - isolation
  - concurrency
  - metrics
 skills:
  - langbot-env-setup
  - langbot-testing
 env:
  - LANGBOT_BACKEND_URL
  - LANGBOT_FRONTEND_URL
  - LANGBOT_E2E_LOGIN_USER
 automation: skills/langbot-testing/probes/langbot-debug-chat-cross-pipeline-isolation.mjs
 automation_env:
  - LANGBOT_BACKEND_URL
  - LANGBOT_E2E_LOGIN_USER
  - LANGBOT_FAKE_PROVIDER_URL
  - LANGBOT_FAKE_PROVIDER_PIPELINE_A_URL
  - LANGBOT_FAKE_PROVIDER_PIPELINE_A_NAME
  - LANGBOT_FAKE_PROVIDER_PIPELINE_B_URL
  - LANGBOT_FAKE_PROVIDER_PIPELINE_B_NAME
 automation_debug_chat_load_requests: "6"
 automation_debug_chat_load_concurrency: "4"
 automation_debug_chat_load_timeout_ms: "30000"
 automation_debug_chat_load_response_p95_ms: "5000"
 automation_debug_chat_load_max_error_rate: "0"
 automation_debug_chat_load_prompt_template: '请只回复 "{expected}"，不要解释，不要添加其他字符。'
 automation_debug_chat_load_stream: "true"
 automation_debug_chat_load_reset: "true"
 metrics_thresholds_json: '{"cross_pipeline_leak_count":{"max":0},"response_p95_ms":{"max":5000},"error_rate":{"max":0}}'
 load_profile_json: '{"requests_per_pipeline":6,"pipelines":2,"concurrency":4,"path":"Pipeline Debug Chat WebSocket","provider":"controlled fake OpenAI-compatible provider","metric":"cross-pipeline response isolation and send-to-final-assistant-response"}'
 setup_automation:
  - "node:scripts/e2e/ensure-fake-provider-cross-pipelines.mjs --write-env"
 setup_provides_env:
  - LANGBOT_FAKE_PROVIDER_URL
  - LANGBOT_FAKE_PROVIDER_BASE_URL
  - LANGBOT_FAKE_PROVIDER_PID
  - LANGBOT_FAKE_PROVIDER_PIPELINE_A_URL
  - LANGBOT_FAKE_PROVIDER_PIPELINE_A_NAME
  - LANGBOT_FAKE_PROVIDER_PIPELINE_B_URL
  - LANGBOT_FAKE_PROVIDER_PIPELINE_B_NAME
 steps:
  - "Start or reuse the local fake OpenAI-compatible provider."
  - "Create or update two local-agent pipelines that both point at the controlled fake provider."
  - "Reset both Debug Chat sessions and the fake-provider request log."
  - "Open concurrent WebSocket Debug Chat connections to both pipelines and send unique pipeline-scoped response tokens."
 checks:
  - "automation-result.json status is pass only when every request receives its own expected token and cross_pipeline_leak_count is zero."
  - "metrics_summary includes by_pipeline status counts, fake-provider request count, and LangBot/provider timing estimates."
  - "samples.json contains per-request pipeline labels so any leak can be attributed to the receiving pipeline."
 evidence_required:
  - metrics
  - network
  - api_diagnostic
  - filesystem
 diagnostics:
  - "This probe targets Debug Chat isolation under concurrent traffic from two pipelines."
  - "It is designed to expose regressions where global pipeline state causes one pipeline's assistant response to be delivered to another pipeline's Debug Chat session."
  - "Same-pipeline foreign responses are tolerated because Debug Chat intentionally broadcasts within the same pipeline/session; cross-pipeline tokens are never tolerated."
  - "Known product bug: current releases may fail this probe because Debug Chat replies can read singleton WebSocket proxy pipeline state after another pipeline overwrites it. See https://github.com/langbot-app/LangBot/issues/2286."
 expected_failures:
  - "https://github.com/langbot-app/LangBot/issues/2286"
 success_patterns:
  - "Debug Chat cross-pipeline isolation probe passed"
 failure_patterns:
  - "cross_pipeline_leak"
  - "Timed out after"
  - "WebSocket connection error"
  - "Final assistant response did not include"
 troubleshooting:
  - backend-not-listening
  - debug-chat-history-contaminates-automation
  - local-agent-model-route-unavailable
@@ -0,0 +1,95 @@
 id: langbot-fake-provider-debug-chat-fault-recovery
 title: "LangBot Debug Chat fake-provider fault recovery probe"
 mode: probe
 area: reliability
 type: chaos
 priority: p1
 risk: high
 ci_eligible: false
 tags:
  - reliability
  - chaos
  - debug-chat
  - websocket
  - fake-provider
  - fault-injection
  - metrics
 skills:
  - langbot-env-setup
  - langbot-testing
 env:
  - LANGBOT_BACKEND_URL
  - LANGBOT_FRONTEND_URL
  - LANGBOT_E2E_LOGIN_USER
 automation: skills/langbot-testing/probes/langbot-debug-chat-concurrency.mjs
 automation_env:
  - LANGBOT_BACKEND_URL
  - LANGBOT_E2E_LOGIN_USER
  - LANGBOT_FAKE_PROVIDER_PIPELINE_URL
  - LANGBOT_FAKE_PROVIDER_PIPELINE_NAME
 automation_pipeline_url_env: LANGBOT_FAKE_PROVIDER_PIPELINE_URL
 automation_pipeline_name_env: LANGBOT_FAKE_PROVIDER_PIPELINE_NAME
 automation_debug_chat_load_requests: "6"
 automation_debug_chat_load_concurrency: "1"
 automation_debug_chat_load_timeout_ms: "15000"
 automation_debug_chat_load_response_p95_ms: "5000"
 automation_debug_chat_load_max_error_rate: "0"
 automation_debug_chat_load_min_ok_count: "6"
 automation_debug_chat_load_min_provider_fault_count: "2"
 automation_debug_chat_load_expected_prefix: "FAULTQA"
 automation_debug_chat_load_prompt_template: '请只回复 "{expected}"，不要解释，不要添加其他字符。'
 automation_debug_chat_load_stream: "true"
 automation_debug_chat_load_reset: "true"
 automation_debug_chat_load_fail_on_final_mismatch: "true"
 automation_fake_provider_first_token_delay_ms: "25"
 automation_fake_provider_chunk_delay_ms: "10"
 automation_fake_provider_chunk_count: "0"
 automation_fake_provider_fail_first_n: "2"
 automation_fake_provider_fail_every_n: "0"
 automation_fake_provider_fault_status: "503"
 metrics_thresholds_json: '{"response_p95_ms":{"max":5000},"error_rate":{"max":0},"ok_count_min":{"min":6},"fake_provider_fault_count_min":{"min":2}}'
 fault_model_json: '{"provider_fault":"HTTP 503 for first 2 fake-provider chat completions after reset","expected_behavior":"LangBot retries or otherwise recovers from bounded provider failures so every Debug Chat request receives its expected response without backend crash."}'
 load_profile_json: '{"requests":6,"concurrency":1,"path":"Pipeline Debug Chat WebSocket","provider":"controlled fake OpenAI-compatible provider","classification":"fault-recovery-not-throughput-benchmark"}'
 setup_automation:
  - "node:scripts/e2e/ensure-fake-provider-pipeline.mjs --write-env"
 setup_provides_env:
  - LANGBOT_FAKE_PROVIDER_URL
  - LANGBOT_FAKE_PROVIDER_BASE_URL
  - LANGBOT_FAKE_PROVIDER_PID
  - LANGBOT_FAKE_PROVIDER_PROVIDER_UUID
  - LANGBOT_FAKE_PROVIDER_MODEL_UUID
  - LANGBOT_FAKE_PROVIDER_PIPELINE_URL
  - LANGBOT_FAKE_PROVIDER_PIPELINE_NAME
 steps:
  - "Configure the local fake provider to return HTTP 503 for the first two chat completions after reset."
  - "Create or update the LangBot provider, model, and local-agent pipeline that points at the fake provider."
  - "Reset the target Debug Chat session and fake-provider request counter."
  - "Send a sequential Debug Chat batch and verify later requests recover after the injected provider faults."
 checks:
  - "automation-result.json status is pass when the fake provider records at least two injected faults, every Debug Chat request succeeds, and total user-visible error rate stays at zero."
  - "metrics_summary includes fake_provider_fault_count and status_counts for the same run window."
  - "backend logs show request handling for the same run window without unexpected Traceback or task-leak findings."
 evidence_required:
  - metrics
  - network
  - api_diagnostic
  - filesystem
 diagnostics:
  - "This is a fault-recovery probe, not a throughput benchmark."
  - "Provider faults may be retried inside the provider/requester path; judge this case by fake_provider_fault_count plus user-visible success/error metrics."
  - "The profile uses concurrency 1 because Debug Chat broadcasts assistant responses to every connection in a session, and failed responses do not carry the unique success token needed for concurrent attribution."
 success_patterns:
  - "Debug Chat WebSocket concurrency probe passed"
  - "Streaming completed"
 failure_patterns:
  - "fake_provider_fault"
  - "HTTP 503"
  - "Timed out after"
  - "All models failed during streaming setup"
 expected_failures:
  - "fake_provider_fault"
  - "HTTP 503"
 troubleshooting:
  - backend-not-listening
  - debug-chat-history-contaminates-automation
  - local-agent-model-route-unavailable
@@ -0,0 +1,81 @@
 id: langbot-fake-provider-debug-chat-load
 title: "LangBot Debug Chat controlled fake-provider load probe"
 mode: probe
 area: performance
 type: performance
 priority: p1
 risk: medium
 ci_eligible: false
 tags:
  - performance
  - debug-chat
  - websocket
  - fake-provider
  - load
  - metrics
 skills:
  - langbot-env-setup
  - langbot-testing
 env:
  - LANGBOT_BACKEND_URL
  - LANGBOT_FRONTEND_URL
  - LANGBOT_E2E_LOGIN_USER
 automation: skills/langbot-testing/probes/langbot-debug-chat-concurrency.mjs
 automation_env:
  - LANGBOT_BACKEND_URL
  - LANGBOT_E2E_LOGIN_USER
  - LANGBOT_FAKE_PROVIDER_PIPELINE_URL
  - LANGBOT_FAKE_PROVIDER_PIPELINE_NAME
 automation_pipeline_url_env: LANGBOT_FAKE_PROVIDER_PIPELINE_URL
 automation_pipeline_name_env: LANGBOT_FAKE_PROVIDER_PIPELINE_NAME
 automation_debug_chat_load_requests: "12"
 automation_debug_chat_load_concurrency: "4"
 automation_debug_chat_load_timeout_ms: "30000"
 automation_debug_chat_load_response_p95_ms: "5000"
 automation_debug_chat_load_first_response_p95_ms: "3000"
 automation_debug_chat_load_max_error_rate: "0"
 automation_debug_chat_load_expected_prefix: "FAKEQA"
 automation_debug_chat_load_prompt_template: '请只回复 "{expected}"，不要解释，不要添加其他字符。'
 automation_debug_chat_load_stream: "true"
 automation_debug_chat_load_reset: "true"
 metrics_thresholds_json: '{"response_p95_ms":{"max":5000},"first_response_p95_ms":{"max":3000},"error_rate":{"max":0}}'
 load_profile_json: '{"requests":12,"concurrency":4,"path":"Pipeline Debug Chat WebSocket","provider":"controlled fake OpenAI-compatible provider","metric":"send-to-final-assistant-response"}'
 setup_automation:
  - "node:scripts/e2e/ensure-fake-provider-pipeline.mjs --write-env"
 setup_provides_env:
  - LANGBOT_FAKE_PROVIDER_URL
  - LANGBOT_FAKE_PROVIDER_BASE_URL
  - LANGBOT_FAKE_PROVIDER_PID
  - LANGBOT_FAKE_PROVIDER_PROVIDER_UUID
  - LANGBOT_FAKE_PROVIDER_MODEL_UUID
  - LANGBOT_FAKE_PROVIDER_PIPELINE_URL
  - LANGBOT_FAKE_PROVIDER_PIPELINE_NAME
 steps:
  - "Start or reuse the local fake OpenAI-compatible provider."
  - "Create or update the LangBot provider, model, and local-agent pipeline that points at the fake provider."
  - "Reset the target Debug Chat session."
  - "Open concurrent WebSocket Debug Chat connections and send unique deterministic prompts through the real backend pipeline."
 checks:
  - "automation-result.json status is pass when every request receives its own expected assistant response."
  - "metrics_summary includes request count, concurrency, p50/p95 response latency, first response latency, throughput, and error rate."
  - "thresholds_summary shows response_p95_ms, first_response_p95_ms, and error_rate pass."
 evidence_required:
  - metrics
  - network
  - api_diagnostic
  - filesystem
 diagnostics:
  - "This probe removes external model latency from the measurement; it still exercises the live LangBot backend, provider requester, local-agent runner, pipeline, and Debug Chat WebSocket adapter."
  - "Use this as the repeatable message-path baseline before comparing against Space or another real provider."
 success_patterns:
  - "Debug Chat WebSocket concurrency probe passed"
  - "Streaming completed"
 failure_patterns:
  - "WebSocket connection error"
  - "Timed out after"
  - "Final assistant response did not include"
  - "All models failed during streaming setup"
 troubleshooting:
  - backend-not-listening
  - debug-chat-history-contaminates-automation
  - local-agent-model-route-unavailable
@@ -0,0 +1,88 @@
 id: langbot-fake-provider-debug-chat-slow-load
 title: "LangBot Debug Chat slow fake-provider load probe"
 mode: probe
 area: performance
 type: performance
 priority: p1
 risk: medium
 ci_eligible: false
 tags:
  - performance
  - debug-chat
  - websocket
  - fake-provider
  - slow-provider
  - load
  - metrics
 skills:
  - langbot-env-setup
  - langbot-testing
 env:
  - LANGBOT_BACKEND_URL
  - LANGBOT_FRONTEND_URL
  - LANGBOT_E2E_LOGIN_USER
 automation: skills/langbot-testing/probes/langbot-debug-chat-concurrency.mjs
 automation_env:
  - LANGBOT_BACKEND_URL
  - LANGBOT_E2E_LOGIN_USER
  - LANGBOT_FAKE_PROVIDER_PIPELINE_URL
  - LANGBOT_FAKE_PROVIDER_PIPELINE_NAME
 automation_pipeline_url_env: LANGBOT_FAKE_PROVIDER_PIPELINE_URL
 automation_pipeline_name_env: LANGBOT_FAKE_PROVIDER_PIPELINE_NAME
 automation_debug_chat_load_requests: "8"
 automation_debug_chat_load_concurrency: "4"
 automation_debug_chat_load_timeout_ms: "45000"
 automation_debug_chat_load_response_p95_ms: "10000"
 automation_debug_chat_load_first_response_p95_ms: "7000"
 automation_debug_chat_load_max_error_rate: "0"
 automation_debug_chat_load_expected_prefix: "SLOWQA"
 automation_debug_chat_load_prompt_template: '请只回复 "{expected}"，不要解释，不要添加其他字符。'
 automation_debug_chat_load_stream: "true"
 automation_debug_chat_load_reset: "true"
 automation_fake_provider_first_token_delay_ms: "1000"
 automation_fake_provider_chunk_delay_ms: "250"
 automation_fake_provider_chunk_count: "4"
 automation_fake_provider_fail_first_n: "0"
 automation_fake_provider_fail_every_n: "0"
 automation_fake_provider_fault_status: "500"
 metrics_thresholds_json: '{"response_p95_ms":{"max":10000},"first_response_p95_ms":{"max":7000},"error_rate":{"max":0}}'
 load_profile_json: '{"requests":8,"concurrency":4,"path":"Pipeline Debug Chat WebSocket","provider":"controlled slow fake OpenAI-compatible provider","metric":"send-to-final-assistant-response","provider_profile":{"first_token_delay_ms":1000,"chunk_delay_ms":250,"chunk_count":4}}'
 setup_automation:
  - "node:scripts/e2e/ensure-fake-provider-pipeline.mjs --write-env"
 setup_provides_env:
  - LANGBOT_FAKE_PROVIDER_URL
  - LANGBOT_FAKE_PROVIDER_BASE_URL
  - LANGBOT_FAKE_PROVIDER_PID
  - LANGBOT_FAKE_PROVIDER_PROVIDER_UUID
  - LANGBOT_FAKE_PROVIDER_MODEL_UUID
  - LANGBOT_FAKE_PROVIDER_PIPELINE_URL
  - LANGBOT_FAKE_PROVIDER_PIPELINE_NAME
 steps:
  - "Configure the local fake provider with deterministic slow streaming latency."
  - "Create or update the LangBot provider, model, and local-agent pipeline that points at the fake provider."
  - "Reset the target Debug Chat session."
  - "Open concurrent WebSocket Debug Chat connections and send unique deterministic prompts through the real backend pipeline."
 checks:
  - "automation-result.json status is pass when every request receives its own expected assistant response."
  - "metrics_summary shows zero errors under the slow-provider profile."
  - "thresholds_summary shows response_p95_ms, first_response_p95_ms, and error_rate pass."
 evidence_required:
  - metrics
  - network
  - api_diagnostic
  - filesystem
 diagnostics:
  - "This probe keeps the model deterministic while injecting provider latency, so it catches backend timeout, streaming, and WebSocket backpressure issues without Space variability."
  - "Compare with langbot-fake-provider-debug-chat-load to separate fixed LangBot overhead from provider-latency amplification."
 success_patterns:
  - "Debug Chat WebSocket concurrency probe passed"
  - "Streaming completed"
 failure_patterns:
  - "WebSocket connection error"
  - "Timed out after"
  - "Final assistant response did not include"
  - "All models failed during streaming setup"
 troubleshooting:
  - backend-not-listening
  - debug-chat-history-contaminates-automation
  - local-agent-model-route-unavailable
@@ -0,0 +1,35 @@
 id: langbot-fault-taxonomy-contract
 title: "LangBot fault taxonomy and cleanup contract"
 mode: probe
 area: reliability
 type: chaos
 priority: p1
 risk: medium
 ci_eligible: true
 tags:
  - reliability
  - chaos
  - contract
  - synthetic
 skills:
  - langbot-testing
 automation: skills/langbot-testing/probes/langbot-fault-taxonomy-contract.mjs
 fault_model_json: '{"kind":"taxonomy-contract","destructive":false,"scenarios":["provider-timeout","plugin-runtime-disconnect","mcp-stdio-server-exit","operator-missing-login","transient-marketplace-timeout"]}'
 steps:
  - "Run `rtk bin/lbs test run langbot-fault-taxonomy-contract --dry-run` first; remove `--dry-run` after checking the evidence directory."
  - "Automation validates that representative fault scenarios declare target, injected fault, expected status, recovery check, and cleanup."
  - "Review metrics.json, fault-model.json, and automation-result.json under LBS_EVIDENCE_DIR."
 checks:
  - "automation-result.json status is pass."
  - "Every scenario has an expected status in pass, fail, blocked, env_issue, or flaky."
  - "Every scenario declares a cleanup action and recovery check."
 evidence_required:
  - metrics
  - filesystem
 diagnostics:
  - "This is a non-destructive taxonomy contract probe; it does not inject real runtime faults."
  - "Use it as a gate before adding live chaos cases that kill runtimes, route traffic through a proxy, or disrupt a backend dependency."
 success_patterns:
  - "Fault taxonomy contract declares status"
 failure_patterns:
  - "missing required scenario fields"
@@ -0,0 +1,42 @@
 id: langbot-live-backend-latency
 title: "LangBot live backend basic latency probe"
 mode: probe
 area: performance
 type: performance
 priority: p1
 risk: medium
 ci_eligible: false
 tags:
  - performance
  - live-backend
  - latency
  - metrics
 skills:
  - langbot-testing
 env:
  - LANGBOT_BACKEND_URL
 automation: skills/langbot-testing/probes/langbot-live-backend-latency.mjs
 metrics_thresholds_json: '{"backend_p95_ms":{"max":1000},"error_rate":{"max":0}}'
 load_profile_json: '{"requests":12,"concurrency":2,"endpoints":["/healthz"]}'
 steps:
  - "Confirm the selected LangBot backend is the intended test target."
  - "Run `rtk bin/lbs test run langbot-live-backend-latency --dry-run` first; remove `--dry-run` after checking LANGBOT_BACKEND_URL and evidence directory."
  - "Automation sends a small request batch to LANGBOT_BACKEND_URL/healthz and records latency, status counts, and network errors."
 checks:
  - "automation-result.json status is pass when the backend responds and p95/error-rate thresholds pass."
  - "automation-result.json status is env_issue when the backend is not reachable."
  - "metrics.json and network.log are written under LBS_EVIDENCE_DIR."
 evidence_required:
  - metrics
  - network
  - api_diagnostic
  - filesystem
 diagnostics:
  - "This probe measures only the latency of the backend /healthz endpoint; it does not cover model/provider, browser, Debug Chat, RAG, or plugin runtime latency."
 success_patterns:
  - "Live backend latency probe passed"
 failure_patterns:
  - "Backend did not respond"
  - "breached latency or error-rate thresholds"
 troubleshooting:
  - socks-proxy-without-socksio
@@ -0,0 +1,45 @@
 id: langbot-live-backend-log-health
 title: "LangBot live backend log health probe"
 mode: probe
 area: reliability
 type: reliability
 priority: p1
 risk: medium
 ci_eligible: false
 tags:
  - reliability
  - live-backend
  - backend-log
  - metrics
 skills:
  - langbot-testing
 env:
  - LANGBOT_BACKEND_URL
 automation: skills/langbot-testing/probes/langbot-live-backend-log-health.mjs
 metrics_thresholds_json: '{"fail_count":{"max":0}}'
 load_profile_json: '{"lookback_seconds":300,"log_source":"LANGBOT_BACKEND_LOG or latest LANGBOT_REPO/data/logs/langbot-*.log"}'
 steps:
  - "Confirm the selected LangBot backend log belongs to the intended test target."
  - "Run `rtk bin/lbs test run langbot-live-backend-log-health --dry-run` first; remove `--dry-run` after checking evidence directory and log source."
  - "Automation scans the recent backend log window for fail-severity runtime findings such as Traceback, ImportError, ERROR, unclosed sessions, and unawaited coroutines."
 checks:
  - "automation-result.json status is pass only when fail_count is 0."
  - "metrics_summary includes scanned_line_count, fail_count, warning_count, and finding_count."
  - "findings.json and scanned-backend.log are written under LBS_EVIDENCE_DIR."
 evidence_required:
  - metrics
  - backend_log
  - filesystem
 diagnostics:
  - "Set LANGBOT_BACKEND_LOG to an explicit log path when the latest log file is not the run target."
  - "Set LANGBOT_BACKEND_LOG_SINCE or LANGBOT_BACKEND_LOG_LOOKBACK_SECONDS to control the scan window."
  - "This probe measures runtime log health; it does not prove user-facing Debug Chat, plugin, model, or RAG behavior."
 success_patterns:
  - "Live backend log health passed"
 failure_patterns:
  - "Traceback"
  - "ImportError"
  - "ERROR"
  - "unclosed"
 troubleshooting:
  - socks-proxy-without-socksio
@@ -0,0 +1,44 @@
 id: langbot-live-control-plane-api
 title: "LangBot live control-plane API probe"
 mode: probe
 area: performance
 type: performance
 priority: p1
 risk: medium
 ci_eligible: false
 tags:
  - performance
  - reliability
  - live-backend
  - control-plane
  - metrics
 skills:
  - langbot-testing
 env:
  - LANGBOT_BACKEND_URL
 automation: skills/langbot-testing/probes/langbot-live-control-plane-api.mjs
 metrics_thresholds_json: '{"error_rate":{"max":0},"response_shape_failures":{"max":0},"healthz_p95_ms":{"max":500},"system_info_p95_ms":{"max":1000}}'
 load_profile_json: '{"requests":20,"concurrency":4,"endpoints":["/healthz","/api/v1/system/info"],"auth_required":false}'
 steps:
  - "Confirm the selected LangBot backend is the intended test target."
  - "Run `rtk bin/lbs test run langbot-live-control-plane-api --dry-run` first; remove `--dry-run` after checking LANGBOT_BACKEND_URL and evidence directory."
  - "Automation sends a small request batch to /healthz and /api/v1/system/info, then validates status code, JSON shape, and latency budgets."
 checks:
  - "automation-result.json status is pass when every control-plane request returns HTTP 200, JSON code 0, and required response fields."
  - "metrics_summary includes per-endpoint p50/p95 latency, error rate, status counts, and response_shape_failures."
  - "thresholds_summary shows error_rate, response_shape_failures, healthz_p95_ms, and system_info_p95_ms all pass."
 evidence_required:
  - metrics
  - network
  - api_diagnostic
  - filesystem
 diagnostics:
  - "This probe measures unauthenticated backend control-plane readiness; it does not cover authenticated UI flows, Debug Chat, model calls, plugins, or RAG."
  - "A system_info shape failure usually means the API contract or startup state changed and should be investigated before treating latency as healthy."
 success_patterns:
  - "Live control-plane API probe passed"
 failure_patterns:
  - "Backend did not respond"
  - "breached shape, latency, or error-rate thresholds"
 troubleshooting:
  - socks-proxy-without-socksio
@@ -0,0 +1,37 @@
 id: langbot-overhead-accounting-contract
 title: "LangBot overhead accounting metrics contract"
 mode: probe
 area: performance
 type: performance
 priority: p1
 risk: medium
 ci_eligible: true
 tags:
  - performance
  - metrics
  - contract
  - synthetic
 skills:
  - langbot-testing
 automation: skills/langbot-testing/probes/langbot-overhead-accounting-contract.mjs
 metrics_thresholds_json: '{"sample_count":{"min":50},"langbot_overhead_p95_ms":{"max":25},"accounting_gap_max_ms":{"max":0.001}}'
 load_profile_json: '{"kind":"synthetic-overhead-accounting","samples":80,"external_latency_segments":["provider","external_tool","network"]}'
 steps:
  - "Run `rtk bin/lbs test run langbot-overhead-accounting-contract --dry-run` first; remove `--dry-run` after checking the evidence directory."
  - "Automation generates deterministic message-path latency samples and separates LangBot overhead from provider/tool/network latency."
  - "Review metrics.json, thresholds.json, resource-log.json, and automation-result.json under LBS_EVIDENCE_DIR."
 checks:
  - "automation-result.json status is pass."
  - "metrics_summary includes sample_count, langbot_overhead_p95_ms, e2e_latency_p95_ms, external_latency_p95_ms, and accounting_gap_max_ms."
  - "thresholds_summary shows sample_count, langbot_overhead_p95_ms, and accounting_gap_max_ms all pass."
 evidence_required:
  - metrics
  - resource_log
  - filesystem
 diagnostics:
  - "This is a synthetic contract probe for the QA harness; it does not measure live product performance."
  - "Use it to verify that reports can carry overhead accounting metrics before running live backend or browser performance probes."
 success_patterns:
  - "Overhead accounting contract passed"
 failure_patterns:
  - "breached one or more thresholds"
@@ -0,0 +1,84 @@
 id: langbot-space-debug-chat-concurrency-smoke
 title: "LangBot Debug Chat real Space-provider concurrency smoke"
 mode: probe
 area: performance
 type: performance
 priority: p1
 risk: high
 ci_eligible: false
 tags:
  - performance
  - debug-chat
  - websocket
  - space
  - live-provider
  - smoke
  - metrics
 skills:
  - langbot-env-setup
  - langbot-testing
 env:
  - LANGBOT_BACKEND_URL
  - LANGBOT_FRONTEND_URL
  - LANGBOT_E2E_LOGIN_USER
 automation: skills/langbot-testing/probes/langbot-debug-chat-concurrency.mjs
 automation_env:
  - LANGBOT_BACKEND_URL
  - LANGBOT_E2E_LOGIN_USER
  - LANGBOT_LOCAL_AGENT_PIPELINE_URL
  - LANGBOT_LOCAL_AGENT_PIPELINE_NAME
 automation_pipeline_url_env: LANGBOT_LOCAL_AGENT_PIPELINE_URL
 automation_pipeline_name_env: LANGBOT_LOCAL_AGENT_PIPELINE_NAME
 automation_debug_chat_load_requests: "3"
 automation_debug_chat_load_concurrency: "2"
 automation_debug_chat_load_timeout_ms: "120000"
 automation_debug_chat_load_response_p95_ms: "120000"
 automation_debug_chat_load_max_error_rate: "0"
 automation_debug_chat_load_expected_prefix: "SPACEQA"
 automation_debug_chat_load_prompt_template: '请只回复 "{expected}"，不要解释，不要添加其他字符。'
 automation_debug_chat_load_stream: "true"
 automation_debug_chat_load_reset: "true"
 metrics_thresholds_json: '{"response_p95_ms":{"max":120000},"error_rate":{"max":0}}'
 load_profile_json: '{"requests":3,"concurrency":2,"path":"Pipeline Debug Chat WebSocket","provider":"LangBot Space model route","metric":"send-to-final-assistant-response","classification":"smoke-not-benchmark"}'
 setup_automation:
  - "node:scripts/e2e/ensure-local-agent-pipeline.mjs --write-env"
 setup_provides_env:
  - LANGBOT_PIPELINE_URL
  - LANGBOT_PIPELINE_NAME
  - LANGBOT_LOCAL_AGENT_PIPELINE_URL
  - LANGBOT_LOCAL_AGENT_PIPELINE_NAME
  - LANGBOT_LOCAL_AGENT_MODEL_UUID
  - LANGBOT_E2E_MODEL_UUID
 preconditions:
  - "The selected local LangBot instance is safe for a low-volume real Space model smoke run."
  - "Treat Space/provider/network failures as environment or dependency findings until fake-provider baseline evidence separates LangBot overhead."
 steps:
  - "Prepare a local-agent pipeline with a tested Space model and fallback models."
  - "Reset the target Debug Chat session."
  - "Open a small number of concurrent WebSocket Debug Chat connections and send unique deterministic prompts through the live Space provider path."
 checks:
  - "automation-result.json status is pass when every request receives its own expected assistant response."
  - "metrics_summary includes request count, concurrency, p95 response latency, throughput, and error rate."
  - "The report classifies the result as a live-provider smoke, not a stable LangBot overhead benchmark."
 evidence_required:
  - metrics
  - network
  - api_diagnostic
  - filesystem
 diagnostics:
  - "This probe measures real user-path latency through Space and includes provider latency, model behavior, and network effects."
  - "Compare with langbot-fake-provider-debug-chat-load before attributing slow or failed runs to LangBot itself."
 success_patterns:
  - "Debug Chat WebSocket concurrency probe passed"
  - "Streaming completed"
 failure_patterns:
  - "invalid api key"
  - "WebSocket connection error"
  - "Timed out after"
  - "Final assistant response did not include"
  - "All models failed during streaming setup"
 troubleshooting:
  - local-agent-model-route-unavailable
  - marketplace-network-flaky
  - proxy-env-mismatch
  - telemetry-proxy-noise
@@ -0,0 +1,80 @@
 id: pipeline-debug-chat-performance
 title: "Pipeline Debug Chat user-path performance probe"
 mode: agent-browser
 area: pipeline
 type: performance
 priority: p1
 risk: medium
 ci_eligible: false
 tags:
  - performance
  - pipeline
  - debug-chat
  - user-path
  - metrics
 skills:
  - langbot-env-setup
  - langbot-testing
 env:
  - LANGBOT_FRONTEND_URL
  - LANGBOT_BACKEND_URL
 env_any:
  - LANGBOT_PIPELINE_URL|LANGBOT_PIPELINE_NAME
 automation: scripts/e2e/pipeline-debug-chat.mjs
 automation_env:
  - LANGBOT_FRONTEND_URL
  - LANGBOT_BACKEND_URL
  - LANGBOT_BROWSER_PROFILE
  - LANGBOT_CHROMIUM_EXECUTABLE
  - LANGBOT_E2E_PROMPT
  - LANGBOT_E2E_EXPECTED_TEXT
  - LANGBOT_E2E_RESPONSE_TIMEOUT_MS
 automation_env_any:
  - LANGBOT_PIPELINE_URL|LANGBOT_PIPELINE_NAME
 automation_prompt: "请只回复 OK，用于性能测试。"
 automation_expected_text: "OK"
 automation_response_timeout_ms: "120000"
 automation_reset_debug_chat: "true"
 automation_debug_chat_response_p95_ms: "120000"
 automation_debug_chat_max_error_rate: "0"
 metrics_thresholds_json: '{"response_p95_ms":{"max":120000},"error_rate":{"max":0}}'
 load_profile_json: '{"prompts":1,"browser":true,"path":"Pipeline Debug Chat","metric":"send-to-visible-completion"}'
 setup_automation:
  - "node:scripts/e2e/ensure-local-agent-pipeline.mjs --write-env"
 setup_provides_env:
  - LANGBOT_PIPELINE_URL
  - LANGBOT_PIPELINE_NAME
 preconditions:
  - "LANGBOT_PIPELINE_URL or LANGBOT_PIPELINE_NAME points to the pipeline intended for this Debug Chat performance run."
  - "The target pipeline is safe to reset Debug Chat history for this run."
  - "The target pipeline has a known-good runner/model; provider latency should be interpreted separately from LangBot overhead."
 steps:
  - "Open LANGBOT_FRONTEND_URL with the prepared browser profile."
  - "Open the target pipeline and select Debug Chat."
  - "Reset Debug Chat history through the backend API when configured."
  - "Send the deterministic prompt and wait for the expected assistant response."
 checks:
  - "automation-result.json status is pass when the expected assistant response appears."
  - "metrics_summary includes response_p50_ms, response_p95_ms, error_rate, and total_duration_ms."
  - "thresholds_summary shows response_p95_ms and error_rate pass."
 evidence_required:
  - ui
  - screenshot
  - console
  - network
  - metrics
 diagnostics:
  - "This case measures browser-visible send-to-completion latency; it does not split provider latency from LangBot overhead."
  - "Use backend logs and provider diagnostics to explain slow runs before calling them LangBot regressions."
 success_patterns:
  - "Processing request from person_websocket"
  - "Streaming completed"
 failure_patterns:
  - "Action invoke_llm_stream call timed out"
  - "Task exception was never retrieved"
  - "All models failed during streaming setup"
 troubleshooting:
  - debug-chat-history-contaminates-automation
  - local-agent-model-route-unavailable
  - plugin-runtime-timeout
  - proxy-env-mismatch
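The `metrics_thresholds_json` field above encodes per-metric limits as a JSON string. A rough sketch of how such a value could be checked against a metrics summary is shown below. The `evaluateThresholds` function and the sample metric values are hypothetical, and the sketch assumes that each entry may carry `max` and/or `min` and that a metric missing from the summary counts as a failure.

```javascript
// Hypothetical evaluator for a case's metrics_thresholds_json value.
// Assumptions: each entry may carry "max" and/or "min", and a metric that is
// missing from the summary fails instead of passing silently.
function evaluateThresholds(thresholdsJson, metricsSummary) {
  const thresholds = JSON.parse(thresholdsJson);
  const report = {};
  for (const [name, limit] of Object.entries(thresholds)) {
    const actual = metricsSummary[name];
    const present = Number.isFinite(actual);
    const underMax = limit.max === undefined || actual <= limit.max;
    const overMin = limit.min === undefined || actual >= limit.min;
    report[name] = { actual: present ? actual : null, ...limit, pass: present && underMax && overMin };
  }
  return report;
}

// Threshold string copied from the case above; the metric values are made up.
const thresholdsJson = '{"response_p95_ms":{"max":120000},"error_rate":{"max":0}}';
const report = evaluateThresholds(thresholdsJson, { response_p95_ms: 8421.5, error_rate: 0 });
console.log(report);
console.log(Object.values(report).every((item) => item.pass));
```

Treating a missing metric as a failure keeps a run that produced no samples from looking like a pass.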
@@ -1 +1,3 @@
-dist/
+dist/*
 !dist/
 !dist/qa-plugin-smoke-0.1.0.lbpkg
@@ -0,0 +1,837 @@
 #!/usr/bin/env node
 import crypto from "node:crypto";
 import net from "node:net";
 import tls from "node:tls";
 import { mkdir, writeFile } from "node:fs/promises";
 import { resolve } from "node:path";
 import { env, exit } from "node:process";
 import {
  apiJson,
  appendLine,
  ensureEvidence,
  evidencePaths,
  loadEnvFiles,
  localIsoWithOffset,
  redact,
  resetAndAuthLocalUser,
  writeResult,
 } from "../../../scripts/e2e/lib/langbot-e2e.mjs";
 import {
  buildProviderTimingMetrics,
  summarizeFakeProviderState,
 } from "./lib/fake-provider-timing.mjs";
 const DEFAULT_LOCAL_PASSWORD = "LangBotE2ELocalPass!2026";
 await loadEnvFiles();
 const caseId = env.LBS_CASE_ID || "langbot-debug-chat-concurrency";
 const paths = evidencePaths(caseId);
 await ensureEvidence(paths);
 const startedAt = new Date();
 const metricsPath = resolve(paths.evidenceDir, "metrics.json");
 const samplesPath = resolve(paths.evidenceDir, "samples.json");
 const fakeProviderStatePath = resolve(paths.evidenceDir, "fake-provider-state.json");
 const resetDiagnosticPath = resolve(paths.evidenceDir, "debug-chat-reset-diagnostic.json");
 const backendUrl = env.LANGBOT_BACKEND_URL || "";
 const fakeProviderUrl = env.LANGBOT_FAKE_PROVIDER_URL || "";
 const pipelineUrl = env.LANGBOT_E2E_PIPELINE_URL || env.LANGBOT_PIPELINE_URL || "";
 const pipelineName = env.LANGBOT_E2E_PIPELINE_NAME || env.LANGBOT_PIPELINE_NAME || "";
 const sessionType = env.LANGBOT_DEBUG_CHAT_LOAD_SESSION_TYPE || env.LANGBOT_E2E_DEBUG_CHAT_SESSION_TYPE || "person";
 const totalRequests = positiveInteger(env.LANGBOT_DEBUG_CHAT_LOAD_REQUESTS, defaultRequests(caseId));
 const concurrency = Math.min(totalRequests, positiveInteger(env.LANGBOT_DEBUG_CHAT_LOAD_CONCURRENCY, defaultConcurrency(caseId)));
 const timeoutMs = positiveInteger(env.LANGBOT_DEBUG_CHAT_LOAD_TIMEOUT_MS, defaultTimeout(caseId));
 const expectedPrefix = env.LANGBOT_DEBUG_CHAT_LOAD_EXPECTED_PREFIX || "LBQA";
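 // Default prompt (Chinese): Reply with only "{expected}"; do not explain or add any other characters.
 const promptTemplate = env.LANGBOT_DEBUG_CHAT_LOAD_PROMPT_TEMPLATE
  || "请只回复 \"{expected}\"，不要解释，不要添加其他字符。";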
 const stream = bool(env.LANGBOT_DEBUG_CHAT_LOAD_STREAM, true);
 const resetBeforeRun = bool(env.LANGBOT_DEBUG_CHAT_LOAD_RESET, true);
 const responseP95BudgetMs = positiveNumber(env.LANGBOT_DEBUG_CHAT_LOAD_RESPONSE_P95_MS, defaultP95Budget(caseId));
 const firstResponseP95BudgetMs = positiveNumber(env.LANGBOT_DEBUG_CHAT_LOAD_FIRST_RESPONSE_P95_MS, 0);
 const maxErrorRate = positiveNumber(env.LANGBOT_DEBUG_CHAT_LOAD_MAX_ERROR_RATE, 0);
 const minErrorRate = positiveNumber(env.LANGBOT_DEBUG_CHAT_LOAD_MIN_ERROR_RATE, 0);
 const minErrorCount = nonNegativeInteger(env.LANGBOT_DEBUG_CHAT_LOAD_MIN_ERROR_COUNT, 0);
 const minOkCount = nonNegativeInteger(env.LANGBOT_DEBUG_CHAT_LOAD_MIN_OK_COUNT, 0);
 const minProviderFaultCount = nonNegativeInteger(env.LANGBOT_DEBUG_CHAT_LOAD_MIN_PROVIDER_FAULT_COUNT, 0);
 const failOnFinalMismatch = bool(env.LANGBOT_DEBUG_CHAT_LOAD_FAIL_ON_FINAL_MISMATCH, false);
 const failureSignals = textList(env.LANGBOT_E2E_FAILURE_SIGNALS || env.LANGBOT_DEBUG_CHAT_LOAD_FAILURE_SIGNALS || "");
 const result = {
  source: "automation",
  case_id: caseId,
  run_id: paths.runId,
  status: "fail",
  reason: "",
  started_at: startedAt.toISOString(),
  started_at_local: localIsoWithOffset(startedAt),
  finished_at: "",
  finished_at_local: "",
  duration_ms: 0,
  backend_url: backendUrl,
  pipeline_url: pipelineUrl,
  pipeline_name: pipelineName,
  pipeline_id: "",
  session_type: sessionType,
  load_profile: {
    requests: totalRequests,
    concurrency,
    timeout_ms: timeoutMs,
    stream,
    reset_before_run: resetBeforeRun,
    fail_on_final_mismatch: failOnFinalMismatch,
  },
  evidence: {
    network_log: paths.networkLog,
    metrics_json: metricsPath,
    samples_json: samplesPath,
    fake_provider_state_json: fakeProviderStatePath,
    debug_chat_reset_diagnostic_json: resetDiagnosticPath,
    automation_result_json: paths.automationResultJson,
    result_json: paths.resultJson,
  },
  evidence_collected: ["metrics", "network", "api_diagnostic", "filesystem"],
 };
 try {
  if (!backendUrl) {
    result.status = "env_issue";
    throw new Error("LANGBOT_BACKEND_URL is not configured.");
  }
  if (!["person", "group"].includes(sessionType)) {
    throw new Error(`LANGBOT_DEBUG_CHAT_LOAD_SESSION_TYPE must be person or group, got ${sessionType}.`);
  }
  const backendReady = await backendReachable(backendUrl);
  if (!backendReady) {
    result.status = "env_issue";
    throw new Error(`Backend did not respond at ${backendUrl}.`);
  }
  const user = env.LANGBOT_E2E_LOGIN_USER || "";
  const password = env.LANGBOT_E2E_LOGIN_PASSWORD || DEFAULT_LOCAL_PASSWORD;
  if (!user) {
    result.status = "env_issue";
    throw new Error("LANGBOT_E2E_LOGIN_USER is required so this probe can resolve/reset the Debug Chat session.");
  }
  const auth = await resetAndAuthLocalUser({ backendUrl, user, password });
  const pipeline = await resolvePipeline({ backendUrl, token: auth.token, pipelineUrl, pipelineName });
  result.pipeline_id = pipeline.id;
  result.pipeline_name = pipeline.name || pipelineName;
  if (!result.pipeline_url && env.LANGBOT_FRONTEND_URL) {
    result.pipeline_url = `${env.LANGBOT_FRONTEND_URL.replace(/\/$/, "")}/home/pipelines?id=${encodeURIComponent(pipeline.id)}`;
  }
  if (resetBeforeRun) {
    const reset = await apiJson(backendUrl, `/api/v1/pipelines/${encodeURIComponent(pipeline.id)}/ws/reset/${encodeURIComponent(sessionType)}`, {
      method: "POST",
      token: auth.token,
    });
    const resetDiagnostic = {
      status: isApiFailure(reset) ? "fail" : "ready",
      http_status: reset.status,
      code: reset.json.code ?? null,
      reason: isApiFailure(reset) ? reset.json.msg || "Debug Chat reset failed." : "Debug Chat session reset.",
    };
    await writeFile(resetDiagnosticPath, `${JSON.stringify(resetDiagnostic, null, 2)}\n`, "utf8");
    if (resetDiagnostic.status === "fail") {
      throw new Error(resetDiagnostic.reason);
    }
  }
  const wsUrl = websocketUrl(backendUrl, pipeline.id, sessionType);
  const loadStartedAt = performance.now();
  const samples = await runLoad({
    wsUrl,
    totalRequests,
    concurrency,
    timeoutMs,
    promptTemplate,
    expectedPrefix,
    stream,
    failOnFinalMismatch,
    failureSignals,
  });
  const loadDurationMs = performance.now() - loadStartedAt;
  const fakeProviderState = await readFakeProviderState(fakeProviderUrl);
  if (fakeProviderState) {
    await writeFile(fakeProviderStatePath, `${JSON.stringify(fakeProviderState, null, 2)}\n`, "utf8");
  }
  const metrics = buildMetrics({
    samples,
    totalRequests,
    concurrency,
    timeoutMs,
    loadDurationMs,
    backendUrl,
    pipelineId: pipeline.id,
    sessionType,
    fakeProviderState,
  });
  const thresholds = buildThresholds(metrics);
  const passed = Object.values(thresholds).every((item) => item.pass);
  result.status = passed ? "pass" : "fail";
  result.reason = passed
    ? "Debug Chat WebSocket concurrency probe passed all thresholds."
    : "Debug Chat WebSocket concurrency probe breached latency or error-rate thresholds.";
  result.metrics_summary = {
    requests: metrics.total_requests,
    concurrency: metrics.concurrency,
    ok_count: metrics.ok_count,
    error_count: metrics.error_count,
    timeout_count: metrics.timeout_count,
    error_rate: metrics.error_rate,
    response_p50_ms: metrics.response_duration_ms.p50,
    response_p95_ms: metrics.response_duration_ms.p95,
    first_assistant_event_p95_ms: metrics.first_assistant_event_ms.p95,
    first_assistant_content_p95_ms: metrics.first_assistant_content_ms.p95,
    first_response_p95_ms: metrics.first_response_ms.p95,
    throughput_rps: metrics.throughput_rps,
    status_counts: metrics.status_counts,
    fake_provider_request_count: metrics.fake_provider?.request_count ?? null,
    fake_provider_fault_count: metrics.fake_provider?.fault_count ?? null,
    fake_provider_duration_p95_ms: metrics.provider_timing?.provider_duration_ms.p95 ?? null,
    langbot_overhead_estimate_p95_ms: metrics.provider_timing?.langbot_overhead_estimate_ms.p95 ?? null,
    send_to_provider_start_p95_ms: metrics.provider_timing?.send_to_provider_start_ms.p95 ?? null,
    provider_finish_to_ws_final_p95_ms: metrics.provider_timing?.provider_finish_to_ws_final_ms.p95 ?? null,
    provider_timing_matched_request_count: metrics.provider_timing?.matched_request_count ?? null,
  };
  result.thresholds_summary = thresholds;
  result.artifacts = {
    metrics_json: metricsPath,
    samples_json: samplesPath,
    fake_provider_state_json: fakeProviderState ? fakeProviderStatePath : "",
    network_log: paths.networkLog,
    automation_result_json: paths.automationResultJson,
    result_json: paths.resultJson,
  };
  await writeFile(metricsPath, `${JSON.stringify({ ...metrics, thresholds }, null, 2)}\n`, "utf8");
  await writeFile(samplesPath, `${JSON.stringify(samples, null, 2)}\n`, "utf8");
 } catch (error) {
  if (!["env_issue", "blocked"].includes(result.status)) {
    result.status = looksLikeEnvIssue(error) ? "env_issue" : "fail";
  }
  result.reason = result.reason || safeReason(error.message);
 } finally {
  const finishedAt = new Date();
  result.finished_at = finishedAt.toISOString();
  result.finished_at_local = localIsoWithOffset(finishedAt);
  result.duration_ms = finishedAt.getTime() - startedAt.getTime();
  await mkdir(paths.evidenceDir, { recursive: true });
  await writeResult(paths, result);
  console.log(JSON.stringify(result, null, 2));
 }
 // Exit codes: 0 = pass, 2 = env_issue or blocked, 1 = any other failure.
 exit(result.status === "pass" ? 0 : result.status === "env_issue" || result.status === "blocked" ? 2 : 1);
 function defaultRequests(id) {
  return id.includes("space") ? 3 : 12;
 }
 function defaultConcurrency(id) {
  return id.includes("space") ? 1 : 4;
 }
 function defaultTimeout(id) {
  return id.includes("space") ? 120_000 : 30_000;
 }
 function defaultP95Budget(id) {
  return id.includes("space") ? 120_000 : 5_000;
 }
 function positiveInteger(value, fallback) {
  const parsed = Number.parseInt(String(value || ""), 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
 }
 function nonNegativeInteger(value, fallback) {
  const parsed = Number.parseInt(String(value ?? ""), 10);
  return Number.isInteger(parsed) && parsed >= 0 ? parsed : fallback;
 }
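 // Parses a non-negative number (zero is allowed, e.g. a max error rate of 0).
 // An unset or empty value yields the fallback; Number("") would otherwise parse as 0.
 function positiveNumber(value, fallback) {
  if (value === undefined || value === null || String(value).trim() === "") return fallback;
  const parsed = Number(value);
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : fallback;
 }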
 function bool(value, fallback) {
  if (value === undefined || value === "") return fallback;
  if (/^(1|true|yes|on)$/i.test(String(value))) return true;
  if (/^(0|false|no|off)$/i.test(String(value))) return false;
  return fallback;
 }
 function textList(value) {
  return String(value || "")
    .split(/\r?\n|,/)
    .map((item) => item.trim())
    .filter(Boolean);
 }
 async function backendReachable(baseUrl) {
  try {
    const response = await fetch(`${baseUrl.replace(/\/$/, "")}/healthz`, {
      signal: AbortSignal.timeout(3000),
    });
    return response.status < 500;
  } catch {
    return false;
  }
 }
 async function readFakeProviderState(rootUrl) {
  if (!rootUrl) return null;
  try {
    const response = await fetch(`${normalizeProviderRootUrl(rootUrl)}/__qa/config`, {
      signal: AbortSignal.timeout(3000),
    });
    const json = await response.json().catch(() => ({}));
    return {
      status: response.ok && json.ok === true ? "loaded" : "unavailable",
      url: normalizeProviderRootUrl(rootUrl),
      http_status: response.status,
      model: json.model || "",
      config: json.config || {},
      request_count: Number.isFinite(json.request_count) ? json.request_count : null,
      recent_requests: Array.isArray(json.recent_requests) ? json.recent_requests : [],
    };
  } catch (error) {
    return {
      status: "unavailable",
      url: normalizeProviderRootUrl(rootUrl),
      reason: safeReason(error.message),
      request_count: null,
      recent_requests: [],
    };
  }
 }
 function normalizeProviderRootUrl(value) {
  const trimmed = String(value || "").trim().replace(/\/$/, "");
  return trimmed.endsWith("/v1") ? trimmed.slice(0, -3) : trimmed;
 }
 function pipelineIdFromUrl(url) {
  if (!url) return "";
  try {
    const parsed = new URL(url);
    return parsed.searchParams.get("id") || "";
  } catch {
    return "";
  }
 }
 async function resolvePipeline({ backendUrl, token, pipelineUrl, pipelineName }) {
  const idFromUrl = pipelineIdFromUrl(pipelineUrl);
  if (idFromUrl) {
    const response = await apiJson(backendUrl, `/api/v1/pipelines/${encodeURIComponent(idFromUrl)}`, { token });
    const pipeline = response.json.data?.pipeline;
    if (isApiFailure(response) || !pipeline?.uuid) {
      throw new Error(response.json.msg || `Could not load pipeline ${idFromUrl}.`);
    }
    return { id: pipeline.uuid, name: pipeline.name || "" };
  }
  if (!pipelineName) {
    throw new Error("Set LANGBOT_PIPELINE_URL or LANGBOT_PIPELINE_NAME (or the LANGBOT_E2E_PIPELINE_URL / LANGBOT_E2E_PIPELINE_NAME overrides) before running this probe.");
  }
  const response = await apiJson(backendUrl, "/api/v1/pipelines", { token });
  if (isApiFailure(response)) {
    throw new Error(response.json.msg || "Failed to list pipelines.");
  }
  const pipeline = (response.json.data?.pipelines || []).find((item) => item.name === pipelineName);
  if (!pipeline?.uuid) {
    throw new Error(`Could not find pipeline named ${pipelineName}.`);
  }
  return { id: pipeline.uuid, name: pipeline.name || pipelineName };
 }
 function isApiFailure(response) {
  return response.status >= 400 || (response.json.code !== undefined && response.json.code !== 0);
 }
 function websocketUrl(baseUrl, pipelineId, sessionType) {
  const parsed = new URL(baseUrl);
  parsed.protocol = parsed.protocol === "https:" ? "wss:" : "ws:";
  parsed.pathname = `/api/v1/pipelines/${encodeURIComponent(pipelineId)}/ws/connect`;
  parsed.search = `?session_type=${encodeURIComponent(sessionType)}`;
  return parsed.toString();
 }
 async function runLoad(options) {
  const samples = [];
  let nextIndex = 0;
  const workers = Array.from({ length: options.concurrency }, async () => {
    while (nextIndex < options.totalRequests) {
      const index = nextIndex;
      nextIndex += 1;
      const sample = await runSingleRequest({ ...options, index });
      samples.push(sample);
    }
  });
  await Promise.all(workers);
  return samples.sort((left, right) => left.index - right.index);
 }
 function expectedForIndex(prefix, index) {
  return `${prefix}-${String(index + 1).padStart(4, "0")}`;
 }
 function promptForIndex(template, expected) {
  return template.replaceAll("{expected}", expected);
 }
 function runSingleRequest({
  wsUrl,
  index,
  timeoutMs,
  promptTemplate,
  expectedPrefix,
  stream,
  failOnFinalMismatch,
  failureSignals,
 }) {
  return new Promise((resolve) => {
    const expected = expectedForIndex(expectedPrefix, index);
    const prompt = promptForIndex(promptTemplate, expected);
    const sample = {
      index,
      status: "running",
      ok: false,
      expected_text: expected,
      prompt,
      response_text: "",
      started_at: new Date().toISOString(),
      started_epoch_ms: Date.now(),
      connected_at: null,
      connected_epoch_ms: null,
      sent_at: null,
      sent_epoch_ms: null,
      first_assistant_event_at: null,
      first_assistant_event_epoch_ms: null,
      first_assistant_event_ms: null,
      first_assistant_content_at: null,
      first_assistant_content_epoch_ms: null,
      first_assistant_content_ms: null,
      first_response_at: null,
      first_response_epoch_ms: null,
      connected_ms: null,
      first_response_ms: null,
      response_duration_ms: null,
      finished_at: null,
      finished_epoch_ms: null,
      event_count: 0,
      foreign_response_count: 0,
      last_foreign_response_text: "",
      error: "",
      close_code: null,
      close_reason: "",
    };
    let closed = false;
    let connectedAt = 0;
    let sentAt = 0;
    const startedAt = performance.now();
    let client = null;
    const timer = setTimeout(() => {
      finish("timeout", `Timed out after ${timeoutMs} ms.`);
    }, timeoutMs);
    client = openRawWebSocket(wsUrl, {
      onOpen() {
        connectedAt = performance.now();
        const now = Date.now();
        sample.connected_at = new Date(now).toISOString();
        sample.connected_epoch_ms = now;
        sample.connected_ms = rounded(connectedAt - startedAt);
      },
      onMessage(text) {
        sample.event_count += 1;
        let data;
        try {
          data = JSON.parse(String(text || ""));
        } catch (error) {
          finish("error", `Invalid WebSocket JSON: ${error.message}`);
          return;
        }
        appendLine(paths.networkLog, JSON.stringify({
          request_index: index,
          type: data.type,
          session_type: data.session_type || "",
          role: data.data?.role || "",
          is_final: data.data?.is_final ?? null,
          content_preview: redact(String(data.data?.content || data.message || "").slice(0, 200)),
        })).catch(() => {});
        if (data.type === "connected") {
          sentAt = performance.now();
          const now = Date.now();
          sample.sent_at = new Date(now).toISOString();
          sample.sent_epoch_ms = now;
          client.send(JSON.stringify({
            type: "message",
            message: [{ type: "Plain", text: prompt }],
            stream,
          }));
          return;
        }
        if (data.type === "error") {
          finish("error", data.message || "WebSocket error message.");
          return;
        }
        if (data.type !== "response" || data.data?.role !== "assistant") return;
        const content = String(data.data.content || "");
        markFirstAssistantEvent(sample, sentAt);
        if (content) sample.response_text = content;
        if (content) markFirstAssistantContent(sample, sentAt);
        if (content.includes(expected) && sample.first_response_ms === null && sentAt > 0) {
          const now = Date.now();
          sample.first_response_at = new Date(now).toISOString();
          sample.first_response_epoch_ms = now;
          sample.first_response_ms = rounded(performance.now() - sentAt);
        }
        if (data.data.is_final === true) {
          const ok = sample.response_text.includes(expected);
          if (ok) {
            if (sample.first_response_ms === null && sentAt > 0) {
              sample.first_response_ms = rounded(performance.now() - sentAt);
            }
            finish("pass", "");
          } else if (matchesFailureSignal(sample.response_text, failureSignals)) {
            finish("app_error", `Assistant final response matched a failure signal: ${sample.response_text}`);
          } else if (failOnFinalMismatch && !containsLoadToken(sample.response_text, expectedPrefix)) {
            finish("mismatch", `Final assistant response did not include ${expected}: ${sample.response_text}`);
          } else {
            sample.foreign_response_count += 1;
            sample.last_foreign_response_text = sample.response_text;
          }
        }
      },
      onError(error) {
        finish("connection_error", `WebSocket connection error: ${error.message}`);
      },
      onClose(event) {
        sample.close_code = event.code;
        sample.close_reason = event.reason || "";
        if (!closed) finish("closed", `WebSocket closed before final assistant response: ${event.code}`);
      },
    });
    function finish(status, reason) {
      if (closed) return;
      closed = true;
      clearTimeout(timer);
      sample.status = status;
      sample.ok = status === "pass";
      sample.error = status === "timeout" && sample.foreign_response_count > 0
        ? `${reason || ""} Saw ${sample.foreign_response_count} foreign assistant response(s); last=${sample.last_foreign_response_text}`
        : reason || "";
      if (sentAt > 0) sample.response_duration_ms = rounded(performance.now() - sentAt);
      else sample.response_duration_ms = rounded(performance.now() - startedAt);
      const now = Date.now();
      sample.finished_at = new Date(now).toISOString();
      sample.finished_epoch_ms = now;
      try {
        client?.close();
      } catch {
        // Closing a failed socket should not hide the sample result.
      }
      resolve(sample);
    }
  });
 }
 function markFirstAssistantEvent(sample, sentAt) {
  if (sample.first_assistant_event_ms !== null || sentAt <= 0) return;
  const now = Date.now();
  sample.first_assistant_event_at = new Date(now).toISOString();
  sample.first_assistant_event_epoch_ms = now;
  sample.first_assistant_event_ms = rounded(performance.now() - sentAt);
 }
 function markFirstAssistantContent(sample, sentAt) {
  if (sample.first_assistant_content_ms !== null || sentAt <= 0) return;
  const now = Date.now();
  sample.first_assistant_content_at = new Date(now).toISOString();
  sample.first_assistant_content_epoch_ms = now;
  sample.first_assistant_content_ms = rounded(performance.now() - sentAt);
 }
 function containsLoadToken(text, prefix) {
  const escaped = String(prefix).replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(`${escaped}-\\d{4}`).test(String(text || ""));
 }
 function matchesFailureSignal(text, signals) {
  const lower = String(text || "").toLowerCase();
  return signals.some((signal) => lower.includes(signal.toLowerCase()));
 }
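 // Minimal RFC 6455 client over a raw TCP or TLS socket. It sends masked frames,
 // answers pings, and does not validate the server's Sec-WebSocket-Accept header.
 function openRawWebSocket(wsUrl, handlers) {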
  const parsed = new URL(wsUrl);
  const secure = parsed.protocol === "wss:";
  const port = Number(parsed.port || (secure ? 443 : 80));
  const host = parsed.hostname;
  const path = `${parsed.pathname}${parsed.search}`;
  const key = crypto.randomBytes(16).toString("base64");
  const socket = secure
    ? tls.connect({ host, port, servername: host })
    : net.connect({ host, port });
  let opened = false;
  let closed = false;
  let buffer = Buffer.alloc(0);
  socket.setNoDelay(true);
  socket.on("connect", () => {
    const originProtocol = secure ? "https" : "http";
    const request = [
      `GET ${path} HTTP/1.1`,
      `Host: ${parsed.host}`,
      "Upgrade: websocket",
      "Connection: Upgrade",
      `Sec-WebSocket-Key: ${key}`,
      "Sec-WebSocket-Version: 13",
      `Origin: ${originProtocol}://${parsed.host}`,
      "",
      "",
    ].join("\r\n");
    socket.write(request);
  });
  socket.on("data", (chunk) => {
    buffer = Buffer.concat([buffer, chunk]);
    if (!opened) {
      const headerEnd = buffer.indexOf("\r\n\r\n");
      if (headerEnd === -1) return;
      const headerText = buffer.slice(0, headerEnd).toString("utf8");
      buffer = buffer.slice(headerEnd + 4);
      if (!/^HTTP\/1\.1 101\b/i.test(headerText)) {
        handlers.onError(new Error(`Handshake failed: ${headerText.split("\r\n")[0] || "missing status"}`));
        socket.destroy();
        return;
      }
      opened = true;
      handlers.onOpen();
    }
    processFrames();
  });
  socket.on("error", (error) => {
    if (!closed) handlers.onError(error);
  });
  socket.on("close", () => {
    if (closed) return;
    closed = true;
    handlers.onClose({ code: null, reason: "" });
  });
  function processFrames() {
    while (true) {
      const frame = readFrame(buffer);
      if (!frame) return;
      buffer = buffer.slice(frame.consumed);
      if (frame.opcode === 0x1) {
        handlers.onMessage(frame.payload.toString("utf8"));
      } else if (frame.opcode === 0x8) {
        const code = frame.payload.length >= 2 ? frame.payload.readUInt16BE(0) : null;
        const reason = frame.payload.length > 2 ? frame.payload.slice(2).toString("utf8") : "";
        closed = true;
        handlers.onClose({ code, reason });
        socket.end();
        return;
      } else if (frame.opcode === 0x9) {
        writeFrame(socket, 0xA, frame.payload);
      }
    }
  }
  return {
    send(text) {
      if (closed || !opened) return;
      writeFrame(socket, 0x1, Buffer.from(text, "utf8"));
    },
    close() {
      if (closed) return;
      closed = true;
      if (!socket.destroyed) {
        if (opened) writeFrame(socket, 0x8, Buffer.alloc(0));
        setTimeout(() => socket.end(), 50).unref();
      }
    },
  };
 }
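 // Parses one frame from the head of the buffer, or returns null if it is incomplete.
 // The FIN bit is ignored, so fragmented messages are not reassembled.
 function readFrame(buffer) {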
  if (buffer.length < 2) return null;
  const first = buffer[0];
  const second = buffer[1];
  const opcode = first & 0x0f;
  const masked = Boolean(second & 0x80);
  let length = second & 0x7f;
  let offset = 2;
  if (length === 126) {
    if (buffer.length < offset + 2) return null;
    length = buffer.readUInt16BE(offset);
    offset += 2;
  } else if (length === 127) {
    if (buffer.length < offset + 8) return null;
    const high = buffer.readUInt32BE(offset);
    const low = buffer.readUInt32BE(offset + 4);
    length = high * 2 ** 32 + low;
    offset += 8;
  }
  let mask = null;
  if (masked) {
    if (buffer.length < offset + 4) return null;
    mask = buffer.slice(offset, offset + 4);
    offset += 4;
  }
  if (buffer.length < offset + length) return null;
  let payload = buffer.slice(offset, offset + length);
  if (mask) {
    payload = Buffer.from(payload);
    for (let index = 0; index < payload.length; index += 1) {
      payload[index] ^= mask[index % 4];
    }
  }
  return {
    opcode,
    payload,
    consumed: offset + length,
  };
 }
 function writeFrame(socket, opcode, payload) {
  const body = Buffer.isBuffer(payload) ? payload : Buffer.from(payload || "");
  const mask = crypto.randomBytes(4);
  const headerLength = body.length < 126 ? 2 : body.length <= 0xffff ? 4 : 10;
  const header = Buffer.alloc(headerLength);
  header[0] = 0x80 | opcode;
  if (body.length < 126) {
    header[1] = 0x80 | body.length;
  } else if (body.length <= 0xffff) {
    header[1] = 0x80 | 126;
    header.writeUInt16BE(body.length, 2);
  } else {
    header[1] = 0x80 | 127;
    header.writeUInt32BE(Math.floor(body.length / 2 ** 32), 2);
    header.writeUInt32BE(body.length >>> 0, 6);
  }
  const masked = Buffer.from(body);
  for (let index = 0; index < masked.length; index += 1) {
    masked[index] ^= mask[index % 4];
  }
  socket.write(Buffer.concat([header, mask, masked]));
 }
 function rounded(value) {
  return Number(value.toFixed(3));
 }
 // Nearest-rank percentile: the value at 1-based rank ceil(p / 100 * n) in sorted order.
 function percentile(values, percentileValue) {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((percentileValue / 100) * sorted.length);
  const index = Math.max(0, Math.min(sorted.length - 1, rank - 1));
  return rounded(sorted[index]);
 }
 function stats(values) {
  if (values.length === 0) return { min: 0, p50: 0, p95: 0, p99: 0, max: 0 };
  return {
    min: rounded(Math.min(...values)),
    p50: percentile(values, 50),
    p95: percentile(values, 95),
    p99: percentile(values, 99),
    max: rounded(Math.max(...values)),
  };
 }
 function buildMetrics({ samples, totalRequests, concurrency, timeoutMs, loadDurationMs, backendUrl, pipelineId, sessionType, fakeProviderState }) {
  const okSamples = samples.filter((sample) => sample.ok);
  const statusCounts = {};
  for (const sample of samples) {
    statusCounts[sample.status] = (statusCounts[sample.status] || 0) + 1;
  }
  const errorCount = samples.length - okSamples.length;
  return {
    probe: caseId,
    backend_url: backendUrl,
    pipeline_id: pipelineId,
    session_type: sessionType,
    total_requests: totalRequests,
    completed_requests: samples.length,
    concurrency,
    timeout_ms: timeoutMs,
    ok_count: okSamples.length,
    error_count: errorCount,
    timeout_count: samples.filter((sample) => sample.status === "timeout").length,
    error_rate: samples.length === 0 ? 1 : rounded(errorCount / samples.length),
    load_duration_ms: rounded(loadDurationMs),
    throughput_rps: loadDurationMs <= 0 ? 0 : rounded(okSamples.length / (loadDurationMs / 1000)),
    status_counts: statusCounts,
    connected_ms: stats(samples.map((sample) => sample.connected_ms).filter(Number.isFinite)),
    first_assistant_event_ms: stats(samples.map((sample) => sample.first_assistant_event_ms).filter(Number.isFinite)),
    first_assistant_content_ms: stats(samples.map((sample) => sample.first_assistant_content_ms).filter(Number.isFinite)),
    first_response_ms: stats(okSamples.map((sample) => sample.first_response_ms).filter(Number.isFinite)),
    response_duration_ms: stats(okSamples.map((sample) => sample.response_duration_ms).filter(Number.isFinite)),
    fake_provider: summarizeFakeProviderState(fakeProviderState),
    provider_timing: buildProviderTimingMetrics(samples, fakeProviderState),
    samples,
  };
 }
 function buildThresholds(metrics) {
  const thresholds = {
    error_rate: { actual: metrics.error_rate, max: maxErrorRate, pass: metrics.error_rate <= maxErrorRate },
    response_p95_ms: {
      actual: metrics.response_duration_ms.p95,
      max: responseP95BudgetMs,
      pass: metrics.ok_count > 0 && metrics.response_duration_ms.p95 <= responseP95BudgetMs,
    },
  };
  if (minErrorRate > 0) {
    thresholds.error_rate_min = {
      actual: metrics.error_rate,
      min: minErrorRate,
      pass: metrics.error_rate >= minErrorRate,
    };
  }
  if (minErrorCount > 0) {
    thresholds.error_count_min = {
      actual: metrics.error_count,
      min: minErrorCount,
      pass: metrics.error_count >= minErrorCount,
    };
  }
  if (minOkCount > 0) {
    thresholds.ok_count_min = {
      actual: metrics.ok_count,
      min: minOkCount,
      pass: metrics.ok_count >= minOkCount,
    };
  }
  if (minProviderFaultCount > 0) {
    const actual = metrics.fake_provider?.fault_count ?? 0;
    thresholds.fake_provider_fault_count_min = {
      actual,
      min: minProviderFaultCount,
      pass: actual >= minProviderFaultCount,
    };
  }
  if (firstResponseP95BudgetMs > 0) {
    thresholds.first_response_p95_ms = {
      actual: metrics.first_response_ms.p95,
      max: firstResponseP95BudgetMs,
      pass: metrics.ok_count > 0 && metrics.first_response_ms.p95 <= firstResponseP95BudgetMs,
    };
  }
  return thresholds;
 }
 function looksLikeEnvIssue(error) {
  const message = String(error?.message || error || "");
  return /fetch failed|ECONNREFUSED|ENOTFOUND|LANGBOT_.*not configured|Could not read recovery_key|Backend did not respond/i.test(message);
 }
 function safeReason(value) {
  return redact(String(value || "")).slice(0, 1000);
 }
@@ -0,0 +1,861 @@
 #!/usr/bin/env node
 import crypto from "node:crypto";
 import net from "node:net";
 import tls from "node:tls";
 import { mkdir, writeFile } from "node:fs/promises";
 import { resolve } from "node:path";
 import { env, exit } from "node:process";
 import {
  apiJson,
  appendLine,
  ensureEvidence,
  evidencePaths,
  loadEnvFiles,
  localIsoWithOffset,
  redact,
  resetAndAuthLocalUser,
  writeResult,
 } from "../../../scripts/e2e/lib/langbot-e2e.mjs";
 import {
  buildProviderTimingMetrics,
  summarizeFakeProviderState,
 } from "./lib/fake-provider-timing.mjs";
 const DEFAULT_LOCAL_PASSWORD = "LangBotE2ELocalPass!2026";
 await loadEnvFiles();
 const caseId = env.LBS_CASE_ID || "langbot-debug-chat-cross-pipeline-isolation";
 const paths = evidencePaths(caseId);
 await ensureEvidence(paths);
 const startedAt = new Date();
 const metricsPath = resolve(paths.evidenceDir, "metrics.json");
 const samplesPath = resolve(paths.evidenceDir, "samples.json");
 const fakeProviderStatePath = resolve(paths.evidenceDir, "fake-provider-state.json");
 const resetDiagnosticPath = resolve(paths.evidenceDir, "debug-chat-reset-diagnostic.json");
 const backendUrl = env.LANGBOT_BACKEND_URL || "";
 const fakeProviderUrl = env.LANGBOT_FAKE_PROVIDER_URL || "";
 const sessionType = env.LANGBOT_DEBUG_CHAT_LOAD_SESSION_TYPE || env.LANGBOT_E2E_DEBUG_CHAT_SESSION_TYPE || "person";
 const requestsPerPipeline = positiveInteger(env.LANGBOT_DEBUG_CHAT_LOAD_REQUESTS, 6);
 const concurrency = Math.min(requestsPerPipeline * 2, positiveInteger(env.LANGBOT_DEBUG_CHAT_LOAD_CONCURRENCY, 4));
 const timeoutMs = positiveInteger(env.LANGBOT_DEBUG_CHAT_LOAD_TIMEOUT_MS, 30_000);
 const stream = bool(env.LANGBOT_DEBUG_CHAT_LOAD_STREAM, true);
 const resetBeforeRun = bool(env.LANGBOT_DEBUG_CHAT_LOAD_RESET, true);
 const responseP95BudgetMs = positiveNumber(env.LANGBOT_DEBUG_CHAT_LOAD_RESPONSE_P95_MS, 5_000);
 const maxErrorRate = positiveNumber(env.LANGBOT_DEBUG_CHAT_LOAD_MAX_ERROR_RATE, 0);
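 // Default prompt (Chinese): 'Reply with only "{expected}"; do not explain or add any other characters.'
 const promptTemplate = env.LANGBOT_DEBUG_CHAT_LOAD_PROMPT_TEMPLATE
  || "请只回复 \"{expected}\"，不要解释，不要添加其他字符。";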
 const failureSignals = textList(env.LANGBOT_E2E_FAILURE_SIGNALS || env.LANGBOT_DEBUG_CHAT_LOAD_FAILURE_SIGNALS || "");
 const pipelineTargets = [
  {
    label: "A",
    expectedPrefix: "PIPEA",
    otherPrefix: "PIPEB",
    url: env.LANGBOT_FAKE_PROVIDER_PIPELINE_A_URL || "",
    name: env.LANGBOT_FAKE_PROVIDER_PIPELINE_A_NAME || "",
  },
  {
    label: "B",
    expectedPrefix: "PIPEB",
    otherPrefix: "PIPEA",
    url: env.LANGBOT_FAKE_PROVIDER_PIPELINE_B_URL || "",
    name: env.LANGBOT_FAKE_PROVIDER_PIPELINE_B_NAME || "",
  },
 ];
 const result = {
  source: "automation",
  case_id: caseId,
  run_id: paths.runId,
  status: "fail",
  reason: "",
  started_at: startedAt.toISOString(),
  started_at_local: localIsoWithOffset(startedAt),
  finished_at: "",
  finished_at_local: "",
  duration_ms: 0,
  backend_url: backendUrl,
  session_type: sessionType,
  pipelines: [],
  load_profile: {
    requests_per_pipeline: requestsPerPipeline,
    total_requests: requestsPerPipeline * 2,
    concurrency,
    timeout_ms: timeoutMs,
    stream,
    reset_before_run: resetBeforeRun,
  },
  evidence: {
    network_log: paths.networkLog,
    metrics_json: metricsPath,
    samples_json: samplesPath,
    fake_provider_state_json: fakeProviderStatePath,
    debug_chat_reset_diagnostic_json: resetDiagnosticPath,
    automation_result_json: paths.automationResultJson,
    result_json: paths.resultJson,
  },
  evidence_collected: ["metrics", "network", "api_diagnostic", "filesystem"],
 };
 try {
  if (!backendUrl) {
    result.status = "env_issue";
    throw new Error("LANGBOT_BACKEND_URL is not configured.");
  }
  if (!["person", "group"].includes(sessionType)) {
    throw new Error(`LANGBOT_DEBUG_CHAT_LOAD_SESSION_TYPE must be person or group, got ${sessionType}.`);
  }
  for (const target of pipelineTargets) {
    if (!target.url && !target.name) {
      result.status = "env_issue";
      throw new Error(`Set LANGBOT_FAKE_PROVIDER_PIPELINE_${target.label}_URL or LANGBOT_FAKE_PROVIDER_PIPELINE_${target.label}_NAME.`);
    }
  }
  const backendReady = await backendReachable(backendUrl);
  if (!backendReady) {
    result.status = "env_issue";
    throw new Error(`Backend did not respond at ${backendUrl}.`);
  }
  const user = env.LANGBOT_E2E_LOGIN_USER || "";
  const password = env.LANGBOT_E2E_LOGIN_PASSWORD || DEFAULT_LOCAL_PASSWORD;
  if (!user) {
    result.status = "env_issue";
    throw new Error("LANGBOT_E2E_LOGIN_USER is required so this probe can resolve/reset Debug Chat sessions.");
  }
  const auth = await resetAndAuthLocalUser({ backendUrl, user, password });
  const pipelines = [];
  for (const target of pipelineTargets) {
    const pipeline = await resolvePipeline({
      backendUrl,
      token: auth.token,
      pipelineUrl: target.url,
      pipelineName: target.name,
    });
    pipelines.push({
      ...target,
      id: pipeline.id,
      name: pipeline.name || target.name,
      wsUrl: websocketUrl(backendUrl, pipeline.id, sessionType),
    });
  }
  result.pipelines = pipelines.map((pipeline) => ({
    label: pipeline.label,
    id: pipeline.id,
    name: pipeline.name,
    url: pipeline.url,
  }));
  if (resetBeforeRun) {
    const resetDiagnostics = [];
    for (const pipeline of pipelines) {
      const reset = await apiJson(backendUrl, `/api/v1/pipelines/${encodeURIComponent(pipeline.id)}/ws/reset/${encodeURIComponent(sessionType)}`, {
        method: "POST",
        token: auth.token,
      });
      resetDiagnostics.push({
        pipeline_label: pipeline.label,
        pipeline_id: pipeline.id,
        status: isApiFailure(reset) ? "fail" : "ready",
        http_status: reset.status,
        code: reset.json.code ?? null,
        reason: isApiFailure(reset) ? reset.json.msg || "Debug Chat reset failed." : "Debug Chat session reset.",
      });
    }
    await writeFile(resetDiagnosticPath, `${JSON.stringify(resetDiagnostics, null, 2)}\n`, "utf8");
    const failedReset = resetDiagnostics.find((item) => item.status === "fail");
    if (failedReset) throw new Error(failedReset.reason);
  }
  await resetFakeProvider(fakeProviderUrl);
  const jobs = [];
  for (let index = 0; index < requestsPerPipeline; index += 1) {
    for (const pipeline of pipelines) {
      jobs.push({ ...pipeline, index });
    }
  }
  const loadStartedAt = performance.now();
  const samples = await runLoad({
    jobs,
    concurrency,
    timeoutMs,
    promptTemplate,
    stream,
    failureSignals,
  });
  const loadDurationMs = performance.now() - loadStartedAt;
  const fakeProviderState = await readFakeProviderState(fakeProviderUrl);
  if (fakeProviderState) {
    await writeFile(fakeProviderStatePath, `${JSON.stringify(fakeProviderState, null, 2)}\n`, "utf8");
  }
  const metrics = buildMetrics({
    samples,
    requestsPerPipeline,
    concurrency,
    timeoutMs,
    loadDurationMs,
    backendUrl,
    sessionType,
    fakeProviderState,
  });
  const thresholds = buildThresholds(metrics);
  const passed = Object.values(thresholds).every((item) => item.pass);
  result.status = passed ? "pass" : "fail";
  result.reason = passed
    ? "Debug Chat cross-pipeline isolation probe passed all thresholds."
    : "Debug Chat cross-pipeline isolation probe found leaks, errors, or latency threshold breaches.";
  result.metrics_summary = {
    requests_per_pipeline: metrics.requests_per_pipeline,
    total_requests: metrics.total_requests,
    concurrency: metrics.concurrency,
    ok_count: metrics.ok_count,
    error_count: metrics.error_count,
    cross_pipeline_leak_count: metrics.cross_pipeline_leak_count,
    timeout_count: metrics.timeout_count,
    error_rate: metrics.error_rate,
    response_p95_ms: metrics.response_duration_ms.p95,
    first_response_p95_ms: metrics.first_response_ms.p95,
    throughput_rps: metrics.throughput_rps,
    status_counts: metrics.status_counts,
    by_pipeline: metrics.by_pipeline,
    fake_provider_request_count: metrics.fake_provider?.request_count ?? null,
    fake_provider_duration_p95_ms: metrics.provider_timing?.provider_duration_ms.p95 ?? null,
    langbot_overhead_estimate_p95_ms: metrics.provider_timing?.langbot_overhead_estimate_ms.p95 ?? null,
    send_to_provider_start_p95_ms: metrics.provider_timing?.send_to_provider_start_ms.p95 ?? null,
    provider_finish_to_ws_final_p95_ms: metrics.provider_timing?.provider_finish_to_ws_final_ms.p95 ?? null,
  };
  result.thresholds_summary = thresholds;
  result.artifacts = {
    metrics_json: metricsPath,
    samples_json: samplesPath,
    fake_provider_state_json: fakeProviderState ? fakeProviderStatePath : "",
    network_log: paths.networkLog,
    automation_result_json: paths.automationResultJson,
    result_json: paths.resultJson,
  };
  await writeFile(metricsPath, `${JSON.stringify({ ...metrics, thresholds }, null, 2)}\n`, "utf8");
  await writeFile(samplesPath, `${JSON.stringify(samples, null, 2)}\n`, "utf8");
 } catch (error) {
  if (!["env_issue", "blocked"].includes(result.status)) {
    result.status = looksLikeEnvIssue(error) ? "env_issue" : "fail";
  }
  result.reason = result.reason || safeReason(error?.message || error);
 } finally {
  const finishedAt = new Date();
  result.finished_at = finishedAt.toISOString();
  result.finished_at_local = localIsoWithOffset(finishedAt);
  result.duration_ms = finishedAt.getTime() - startedAt.getTime();
  await mkdir(paths.evidenceDir, { recursive: true });
  await writeResult(paths, result);
  console.log(JSON.stringify(result, null, 2));
 }
 // Exit codes: 0 = pass, 2 = env_issue or blocked, 1 = any other status.
 exit(result.status === "pass" ? 0 : result.status === "env_issue" || result.status === "blocked" ? 2 : 1);
 async function backendReachable(baseUrl) {
  try {
    const response = await fetch(`${baseUrl.replace(/\/$/, "")}/healthz`, {
      signal: AbortSignal.timeout(3000),
    });
    return response.status < 500;
  } catch {
    return false;
  }
 }
 async function resetFakeProvider(rootUrl) {
  if (!rootUrl) return;
  try {
    await fetch(`${normalizeProviderRootUrl(rootUrl)}/__qa/reset`, {
      method: "POST",
      signal: AbortSignal.timeout(3000),
    });
  } catch {
    // Missing fake-provider diagnostics should not hide the isolation result.
  }
 }
 async function readFakeProviderState(rootUrl) {
  if (!rootUrl) return null;
  try {
    const response = await fetch(`${normalizeProviderRootUrl(rootUrl)}/__qa/config`, {
      signal: AbortSignal.timeout(3000),
    });
    const json = await response.json().catch(() => ({}));
    return {
      status: response.ok && json.ok === true ? "loaded" : "unavailable",
      url: normalizeProviderRootUrl(rootUrl),
      http_status: response.status,
      model: json.model || "",
      config: json.config || {},
      request_count: Number.isFinite(json.request_count) ? json.request_count : null,
      recent_requests: Array.isArray(json.recent_requests) ? json.recent_requests : [],
    };
  } catch (error) {
    return {
      status: "unavailable",
      url: normalizeProviderRootUrl(rootUrl),
      reason: safeReason(error.message),
      request_count: null,
      recent_requests: [],
    };
  }
 }
 function normalizeProviderRootUrl(value) {
  const trimmed = String(value || "").trim().replace(/\/$/, "");
  return trimmed.endsWith("/v1") ? trimmed.slice(0, -3) : trimmed;
 }
 function pipelineIdFromUrl(url) {
  if (!url) return "";
  try {
    const parsed = new URL(url);
    return parsed.searchParams.get("id") || "";
  } catch {
    return "";
  }
 }
 async function resolvePipeline({ backendUrl, token, pipelineUrl, pipelineName }) {
  const idFromUrl = pipelineIdFromUrl(pipelineUrl);
  if (idFromUrl) {
    const response = await apiJson(backendUrl, `/api/v1/pipelines/${encodeURIComponent(idFromUrl)}`, { token });
    const pipeline = response.json.data?.pipeline;
    if (isApiFailure(response) || !pipeline?.uuid) {
      throw new Error(response.json.msg || `Could not load pipeline ${idFromUrl}.`);
    }
    return { id: pipeline.uuid, name: pipeline.name || "" };
  }
  if (!pipelineName) {
    throw new Error("Set pipeline URL or name before running this probe.");
  }
  const response = await apiJson(backendUrl, "/api/v1/pipelines", { token });
  if (isApiFailure(response)) {
    throw new Error(response.json.msg || "Failed to list pipelines.");
  }
  const pipeline = (response.json.data?.pipelines || []).find((item) => item.name === pipelineName);
  if (!pipeline?.uuid) {
    throw new Error(`Could not find pipeline named ${pipelineName}.`);
  }
  return { id: pipeline.uuid, name: pipeline.name || pipelineName };
 }
 function isApiFailure(response) {
  return response.status >= 400 || (response.json.code !== undefined && response.json.code !== 0);
 }
 function websocketUrl(baseUrl, pipelineId, sessionTypeValue) {
  const parsed = new URL(baseUrl);
  parsed.protocol = parsed.protocol === "https:" ? "wss:" : "ws:";
  parsed.pathname = `/api/v1/pipelines/${encodeURIComponent(pipelineId)}/ws/connect`;
  parsed.search = `?session_type=${encodeURIComponent(sessionTypeValue)}`;
  return parsed.toString();
 }
 async function runLoad(options) {
  const samples = [];
  const queue = [...options.jobs];
  const workers = Array.from({ length: options.concurrency }, async () => {
    while (queue.length > 0) {
      const job = queue.shift();
      if (!job) continue;
      const sample = await runSingleRequest({ ...options, job });
      samples.push(sample);
    }
  });
  await Promise.all(workers);
  return samples.sort((left, right) => (
    left.pipeline_label.localeCompare(right.pipeline_label) || left.index - right.index
  ));
 }
 function expectedForIndex(prefix, index) {
  return `${prefix}-${String(index + 1).padStart(4, "0")}`;
 }
 function promptForIndex(template, expected) {
  return template.replaceAll("{expected}", expected);
 }
 function runSingleRequest({
  job,
  timeoutMs,
  promptTemplate,
  stream,
  failureSignals,
 }) {
  return new Promise((resolvePromise) => {
    const expected = expectedForIndex(job.expectedPrefix, job.index);
    const prompt = promptForIndex(promptTemplate, expected);
    const sample = {
      index: job.index,
      pipeline_label: job.label,
      pipeline_id: job.id,
      pipeline_name: job.name,
      status: "running",
      ok: false,
      expected_text: expected,
      expected_prefix: job.expectedPrefix,
      other_prefix: job.otherPrefix,
      prompt,
      response_text: "",
      started_at: new Date().toISOString(),
      started_epoch_ms: Date.now(),
      connected_at: null,
      connected_epoch_ms: null,
      sent_at: null,
      sent_epoch_ms: null,
      first_assistant_event_at: null,
      first_assistant_event_epoch_ms: null,
      first_assistant_event_ms: null,
      first_assistant_content_at: null,
      first_assistant_content_epoch_ms: null,
      first_assistant_content_ms: null,
      first_response_at: null,
      first_response_epoch_ms: null,
      connected_ms: null,
      first_response_ms: null,
      response_duration_ms: null,
      finished_at: null,
      finished_epoch_ms: null,
      event_count: 0,
      same_pipeline_foreign_response_count: 0,
      cross_pipeline_leak_count: 0,
      last_foreign_response_text: "",
      error: "",
      close_code: null,
      close_reason: "",
    };
    let closed = false;
    let connectedAt = 0;
    let sentAt = 0;
    const startedPerf = performance.now();
    let client = null;
    const timer = setTimeout(() => {
      finish("timeout", `Timed out after ${timeoutMs} ms.`);
    }, timeoutMs);
    client = openRawWebSocket(job.wsUrl, {
      onOpen() {
        connectedAt = performance.now();
        const now = Date.now();
        sample.connected_at = new Date(now).toISOString();
        sample.connected_epoch_ms = now;
        sample.connected_ms = rounded(connectedAt - startedPerf);
      },
      onMessage(text) {
        sample.event_count += 1;
        let data;
        try {
          data = JSON.parse(String(text || ""));
        } catch (error) {
          finish("error", `Invalid WebSocket JSON: ${error.message}`);
          return;
        }
        appendLine(paths.networkLog, JSON.stringify({
          pipeline_label: job.label,
          request_index: job.index,
          type: data.type,
          session_type: data.session_type || "",
          role: data.data?.role || "",
          is_final: data.data?.is_final ?? null,
          content_preview: redact(String(data.data?.content || data.message || "").slice(0, 200)),
        })).catch(() => {});
        if (data.type === "connected") {
          sentAt = performance.now();
          const now = Date.now();
          sample.sent_at = new Date(now).toISOString();
          sample.sent_epoch_ms = now;
          client.send(JSON.stringify({
            type: "message",
            message: [{ type: "Plain", text: prompt }],
            stream,
          }));
          return;
        }
        if (data.type === "error") {
          finish("error", data.message || "WebSocket error message.");
          return;
        }
        if (data.type !== "response" || data.data?.role !== "assistant") return;
        const content = String(data.data.content || "");
        markFirstAssistantEvent(sample, sentAt);
        if (content) sample.response_text = content;
        if (content) markFirstAssistantContent(sample, sentAt);
        if (containsPipelineToken(content, job.otherPrefix)) {
          sample.cross_pipeline_leak_count += 1;
          finish("cross_pipeline_leak", `Pipeline ${job.label} received response from ${job.otherPrefix}: ${content}`);
          return;
        }
        if (content.includes(expected) && sample.first_response_ms === null && sentAt > 0) {
          const now = Date.now();
          sample.first_response_at = new Date(now).toISOString();
          sample.first_response_epoch_ms = now;
          sample.first_response_ms = rounded(performance.now() - sentAt);
        }
        if (data.data.is_final === true) {
          const ok = sample.response_text.includes(expected);
          if (ok) {
            if (sample.first_response_ms === null && sentAt > 0) {
              const now = Date.now();
              sample.first_response_at = new Date(now).toISOString();
              sample.first_response_epoch_ms = now;
              sample.first_response_ms = rounded(performance.now() - sentAt);
            }
            finish("pass", "");
          } else if (matchesFailureSignal(sample.response_text, failureSignals)) {
            finish("app_error", `Assistant final response matched a failure signal: ${sample.response_text}`);
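          } else if (containsPipelineToken(sample.response_text, job.expectedPrefix)) {
            // The final reply carries this pipeline's prefix but not our sequence number, so it
            // presumably answers another in-flight request. Record it and keep waiting for our
            // own token; the timeout timer ends the sample if it never arrives.
            sample.same_pipeline_foreign_response_count += 1;
            sample.last_foreign_response_text = sample.response_text;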
          } else {
            finish("mismatch", `Final assistant response did not include ${expected}: ${sample.response_text}`);
          }
        }
      },
      onError(error) {
        finish("connection_error", `WebSocket connection error: ${error.message}`);
      },
      onClose(event) {
        sample.close_code = event.code;
        sample.close_reason = event.reason || "";
        if (!closed) finish("closed", `WebSocket closed before final assistant response: ${event.code}`);
      },
    });
    function finish(status, reason) {
      if (closed) return;
      closed = true;
      clearTimeout(timer);
      sample.status = status;
      sample.ok = status === "pass";
      sample.error = status === "timeout" && sample.same_pipeline_foreign_response_count > 0
        ? `${reason || ""} Saw ${sample.same_pipeline_foreign_response_count} same-pipeline foreign assistant response(s); last=${sample.last_foreign_response_text}`
        : reason || "";
      if (sentAt > 0) sample.response_duration_ms = rounded(performance.now() - sentAt);
      else sample.response_duration_ms = rounded(performance.now() - startedPerf);
      const now = Date.now();
      sample.finished_at = new Date(now).toISOString();
      sample.finished_epoch_ms = now;
      try {
        client?.close();
      } catch {
        // Closing a failed socket should not hide the sample result.
      }
      resolvePromise(sample);
    }
  });
 }
 function markFirstAssistantEvent(sample, sentAt) {
  if (sample.first_assistant_event_ms !== null || sentAt <= 0) return;
  const now = Date.now();
  sample.first_assistant_event_at = new Date(now).toISOString();
  sample.first_assistant_event_epoch_ms = now;
  sample.first_assistant_event_ms = rounded(performance.now() - sentAt);
 }
 function markFirstAssistantContent(sample, sentAt) {
  if (sample.first_assistant_content_ms !== null || sentAt <= 0) return;
  const now = Date.now();
  sample.first_assistant_content_at = new Date(now).toISOString();
  sample.first_assistant_content_epoch_ms = now;
  sample.first_assistant_content_ms = rounded(performance.now() - sentAt);
 }
 function containsPipelineToken(text, prefix) {
  const escaped = String(prefix).replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(`${escaped}-\\d{4}`).test(String(text || ""));
 }
 function matchesFailureSignal(text, signals) {
  const lower = String(text || "").toLowerCase();
  return signals.some((signal) => lower.includes(signal.toLowerCase()));
 }
 function openRawWebSocket(wsUrl, handlers) {
  const parsed = new URL(wsUrl);
  const secure = parsed.protocol === "wss:";
  const port = Number(parsed.port || (secure ? 443 : 80));
  const host = parsed.hostname;
  const path = `${parsed.pathname}${parsed.search}`;
  const key = crypto.randomBytes(16).toString("base64");
  const socket = secure
    ? tls.connect({ host, port, servername: host })
    : net.connect({ host, port });
  let opened = false;
  let closed = false;
  let buffer = Buffer.alloc(0);
  socket.setNoDelay(true);
  socket.on("connect", () => {
    const originProtocol = secure ? "https" : "http";
    const request = [
      `GET ${path} HTTP/1.1`,
      `Host: ${parsed.host}`,
      "Upgrade: websocket",
      "Connection: Upgrade",
      `Sec-WebSocket-Key: ${key}`,
      "Sec-WebSocket-Version: 13",
      `Origin: ${originProtocol}://${parsed.host}`,
      "",
      "",
    ].join("\r\n");
    socket.write(request);
  });
  socket.on("data", (chunk) => {
    buffer = Buffer.concat([buffer, chunk]);
    if (!opened) {
      const headerEnd = buffer.indexOf("\r\n\r\n");
      if (headerEnd === -1) return;
      const headerText = buffer.slice(0, headerEnd).toString("utf8");
      buffer = buffer.slice(headerEnd + 4);
      if (!/^HTTP\/1\.1 101\b/i.test(headerText)) {
        handlers.onError(new Error(`Handshake failed: ${headerText.split("\r\n")[0] || "missing status"}`));
        socket.destroy();
        return;
      }
      opened = true;
      handlers.onOpen();
    }
    processFrames();
  });
  socket.on("error", (error) => {
    if (!closed) handlers.onError(error);
  });
  socket.on("close", () => {
    if (closed) return;
    closed = true;
    handlers.onClose({ code: null, reason: "" });
  });
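  // Drains every complete frame currently buffered. Only text (0x1), close (0x8), and ping (0x9)
  // frames are handled. The FIN bit and continuation frames (0x0) are ignored, so this assumes
  // the server sends each message as a single unfragmented frame.
  function processFrames() {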
    while (true) {
      const frame = readFrame(buffer);
      if (!frame) return;
      buffer = buffer.slice(frame.consumed);
      if (frame.opcode === 0x1) {
        handlers.onMessage(frame.payload.toString("utf8"));
      } else if (frame.opcode === 0x8) {
        const code = frame.payload.length >= 2 ? frame.payload.readUInt16BE(0) : null;
        const reason = frame.payload.length > 2 ? frame.payload.slice(2).toString("utf8") : "";
        closed = true;
        handlers.onClose({ code, reason });
        socket.end();
        return;
      } else if (frame.opcode === 0x9) {
        writeFrame(socket, 0xA, frame.payload);
      }
    }
  }
  return {
    send(text) {
      if (closed || !opened) return;
      writeFrame(socket, 0x1, Buffer.from(text, "utf8"));
    },
    close() {
      if (closed) return;
      closed = true;
      if (!socket.destroyed) {
        if (opened) writeFrame(socket, 0x8, Buffer.alloc(0));
        setTimeout(() => socket.end(), 50).unref();
      }
    },
  };
 }
 function readFrame(buffer) {
  if (buffer.length < 2) return null;
  const first = buffer[0];
  const second = buffer[1];
  const opcode = first & 0x0f;
  const masked = Boolean(second & 0x80);
  let length = second & 0x7f;
  let offset = 2;
  if (length === 126) {
    if (buffer.length < offset + 2) return null;
    length = buffer.readUInt16BE(offset);
    offset += 2;
  } else if (length === 127) {
    if (buffer.length < offset + 8) return null;
    const high = buffer.readUInt32BE(offset);
    const low = buffer.readUInt32BE(offset + 4);
    length = high * 2 ** 32 + low;
    offset += 8;
  }
  let mask = null;
  if (masked) {
    if (buffer.length < offset + 4) return null;
    mask = buffer.slice(offset, offset + 4);
    offset += 4;
  }
  if (buffer.length < offset + length) return null;
  let payload = buffer.slice(offset, offset + length);
  if (mask) {
    payload = Buffer.from(payload);
    for (let index = 0; index < payload.length; index += 1) {
      payload[index] ^= mask[index % 4];
    }
  }
  return {
    opcode,
    payload,
    consumed: offset + length,
  };
 }
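 // Writes a single final (FIN=1) frame. Client-to-server frames must be masked (RFC 6455, section 5.3).
 function writeFrame(socket, opcode, payload) {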
  const body = Buffer.isBuffer(payload) ? payload : Buffer.from(payload || "");
  const mask = crypto.randomBytes(4);
  const headerLength = body.length < 126 ? 2 : body.length <= 0xffff ? 4 : 10;
  const header = Buffer.alloc(headerLength);
  header[0] = 0x80 | opcode;
  if (body.length < 126) {
    header[1] = 0x80 | body.length;
  } else if (body.length <= 0xffff) {
    header[1] = 0x80 | 126;
    header.writeUInt16BE(body.length, 2);
  } else {
    header[1] = 0x80 | 127;
    header.writeUInt32BE(Math.floor(body.length / 2 ** 32), 2);
    header.writeUInt32BE(body.length >>> 0, 6);
  }
  const masked = Buffer.from(body);
  for (let index = 0; index < masked.length; index += 1) {
    masked[index] ^= mask[index % 4];
  }
  socket.write(Buffer.concat([header, mask, masked]));
 }
 function buildMetrics({ samples, requestsPerPipeline, concurrency, timeoutMs, loadDurationMs, backendUrl, sessionType, fakeProviderState }) {
  const okSamples = samples.filter((sample) => sample.ok);
  const statusCounts = {};
  const byPipeline = {};
  for (const sample of samples) {
    statusCounts[sample.status] = (statusCounts[sample.status] || 0) + 1;
    if (!byPipeline[sample.pipeline_label]) {
      byPipeline[sample.pipeline_label] = {
        ok_count: 0,
        error_count: 0,
        cross_pipeline_leak_count: 0,
        timeout_count: 0,
      };
    }
    if (sample.ok) byPipeline[sample.pipeline_label].ok_count += 1;
    else byPipeline[sample.pipeline_label].error_count += 1;
    byPipeline[sample.pipeline_label].cross_pipeline_leak_count += sample.cross_pipeline_leak_count || 0;
    if (sample.status === "timeout") byPipeline[sample.pipeline_label].timeout_count += 1;
  }
  const errorCount = samples.length - okSamples.length;
  return {
    probe: caseId,
    backend_url: backendUrl,
    session_type: sessionType,
    requests_per_pipeline: requestsPerPipeline,
    total_requests: requestsPerPipeline * 2,
    completed_requests: samples.length,
    concurrency,
    timeout_ms: timeoutMs,
    ok_count: okSamples.length,
    error_count: errorCount,
    timeout_count: samples.filter((sample) => sample.status === "timeout").length,
    cross_pipeline_leak_count: samples.reduce((count, sample) => count + (sample.cross_pipeline_leak_count || 0), 0),
    error_rate: samples.length === 0 ? 1 : rounded(errorCount / samples.length),
    load_duration_ms: rounded(loadDurationMs),
    throughput_rps: loadDurationMs <= 0 ? 0 : rounded(okSamples.length / (loadDurationMs / 1000)),
    status_counts: statusCounts,
    by_pipeline: byPipeline,
    connected_ms: stats(samples.map((sample) => sample.connected_ms).filter(Number.isFinite)),
    first_assistant_event_ms: stats(samples.map((sample) => sample.first_assistant_event_ms).filter(Number.isFinite)),
    first_assistant_content_ms: stats(samples.map((sample) => sample.first_assistant_content_ms).filter(Number.isFinite)),
    first_response_ms: stats(okSamples.map((sample) => sample.first_response_ms).filter(Number.isFinite)),
    response_duration_ms: stats(okSamples.map((sample) => sample.response_duration_ms).filter(Number.isFinite)),
    fake_provider: summarizeFakeProviderState(fakeProviderState),
    provider_timing: buildProviderTimingMetrics(samples, fakeProviderState),
    samples,
  };
 }
 function buildThresholds(metrics) {
  return {
    cross_pipeline_leak_count: {
      actual: metrics.cross_pipeline_leak_count,
      max: 0,
      pass: metrics.cross_pipeline_leak_count === 0,
    },
    error_rate: {
      actual: metrics.error_rate,
      max: maxErrorRate,
      pass: metrics.error_rate <= maxErrorRate,
    },
    response_p95_ms: {
      actual: metrics.response_duration_ms.p95,
      max: responseP95BudgetMs,
      pass: metrics.ok_count > 0 && metrics.response_duration_ms.p95 <= responseP95BudgetMs,
    },
  };
 }
 function positiveInteger(value, fallback) {
  const parsed = Number.parseInt(String(value || ""), 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
 }
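 // Parses a non-negative number (zero is allowed despite the name).
 function positiveNumber(value, fallback) {
  // Number("") is 0, so an unset or blank variable must fall back explicitly;
  // otherwise a default such as the 5000 ms p95 budget would silently become 0.
  if (value === undefined || value === null || String(value).trim() === "") return fallback;
  const parsed = Number(value);
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : fallback;
 }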
 function bool(value, fallback) {
  if (value === undefined || value === "") return fallback;
  if (/^(1|true|yes|on)$/i.test(String(value))) return true;
  if (/^(0|false|no|off)$/i.test(String(value))) return false;
  return fallback;
 }
 function textList(value) {
  return String(value || "")
    .split(/\r?\n|,/)
    .map((item) => item.trim())
    .filter(Boolean);
 }
 function rounded(value) {
  return Number(value.toFixed(3));
 }
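 // Nearest-rank percentile: the smallest sample with at least percentileValue% of values at or below it.
 function percentile(values, percentileValue) {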
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const index = Math.min(sorted.length - 1, Math.ceil((percentileValue / 100) * sorted.length) - 1);
  return rounded(sorted[index]);
 }
 function stats(values) {
  if (values.length === 0) return { min: 0, p50: 0, p95: 0, p99: 0, max: 0 };
  return {
    min: rounded(Math.min(...values)),
    p50: percentile(values, 50),
    p95: percentile(values, 95),
    p99: percentile(values, 99),
    max: rounded(Math.max(...values)),
  };
 }
 function looksLikeEnvIssue(error) {
  const message = String(error?.message || error || "");
  return /fetch failed|ECONNREFUSED|ENOTFOUND|LANGBOT_.*not configured|Could not read recovery_key|Backend did not respond/i.test(message);
 }
 function safeReason(value) {
  return redact(String(value || "")).slice(0, 1000);
 }
@@ -0,0 +1,159 @@
 #!/usr/bin/env node
 import { mkdir, writeFile } from "node:fs/promises";
 import { join, resolve } from "node:path";
 import { env, exit } from "node:process";
 function pad(value, size = 2) {
  return String(value).padStart(size, "0");
 }
 function localIsoWithOffset(date = new Date()) {
  const offsetMinutes = -date.getTimezoneOffset();
  const sign = offsetMinutes >= 0 ? "+" : "-";
  const absolute = Math.abs(offsetMinutes);
  return [
    `${date.getFullYear()}-${pad(date.getMonth() + 1)}-${pad(date.getDate())}`,
    `T${pad(date.getHours())}:${pad(date.getMinutes())}:${pad(date.getSeconds())}.${pad(date.getMilliseconds(), 3)}`,
    `${sign}${pad(Math.floor(absolute / 60))}:${pad(absolute % 60)}`,
  ].join("");
 }
 function timestampSlug(date = new Date()) {
  return date.toISOString().replace(/\.\d{3}Z$/, "Z").replace(/[^0-9A-Za-z]+/g, "-").replace(/^-|-$/g, "");
 }
 const scenarios = [
  {
    id: "provider-timeout",
    target: "provider",
    injected_fault: "fake provider request exceeds the configured timeout",
    expected_status: "env_issue",
    recovery_check: "provider route is reachable or the case remains outside product pass/fail",
    cleanup: "stop fake provider or reset proxy route",
  },
  {
    id: "plugin-runtime-disconnect",
    target: "plugin-runtime",
    injected_fault: "runtime control channel disconnects during an action",
    expected_status: "fail",
    recovery_check: "runtime reconnects and a deterministic plugin action succeeds",
    cleanup: "restart the local plugin runtime process",
  },
  {
    id: "mcp-stdio-server-exit",
    target: "mcp",
    injected_fault: "stdio server exits mid-call",
    expected_status: "fail",
    recovery_check: "server can be registered again and exposes the expected tool",
    cleanup: "remove temporary MCP server registration",
  },
  {
    id: "operator-missing-login",
    target: "webui",
    injected_fault: "browser profile is not authenticated",
    expected_status: "blocked",
    recovery_check: "authenticated profile can open the same WebUI origin",
    cleanup: "no product cleanup; refresh local login state",
  },
  {
    id: "transient-marketplace-timeout",
    target: "marketplace",
    injected_fault: "marketplace request times out once and then succeeds",
    expected_status: "flaky",
    recovery_check: "rerun passes with the same product revision and no code change",
    cleanup: "clear retry-only evidence and keep the run classified as flaky",
  },
 ];
 function validateScenario(scenario) {
  const missing = ["id", "target", "injected_fault", "expected_status", "recovery_check", "cleanup"]
    .filter((key) => !scenario[key]);
  const allowedStatuses = new Set(["pass", "fail", "blocked", "env_issue", "flaky"]);
  return {
    id: scenario.id,
    pass: missing.length === 0 && allowedStatuses.has(scenario.expected_status),
    missing,
    expected_status: scenario.expected_status,
  };
 }
 async function main() {
  const root = resolve(env.LBS_ROOT || process.cwd());
  const caseId = "langbot-fault-taxonomy-contract";
  const runId = env.LBS_RUN_ID || `${timestampSlug()}-${caseId}`;
  const evidenceDir = resolve(env.LBS_EVIDENCE_DIR || join(root, "reports", "evidence", runId));
  await mkdir(evidenceDir, { recursive: true });
  const startedAt = new Date();
  const validations = scenarios.map(validateScenario);
  const statusCounts = {};
  for (const scenario of scenarios) {
    statusCounts[scenario.expected_status] = (statusCounts[scenario.expected_status] || 0) + 1;
  }
  const metrics = {
    probe: caseId,
    scenario_count: scenarios.length,
    status_counts: statusCounts,
    scenarios,
    validations,
  };
  const thresholds = {
    scenario_count: { actual: scenarios.length, min: 5, pass: scenarios.length >= 5 },
    invalid_scenario_count: {
      actual: validations.filter((item) => !item.pass).length,
      max: 0,
      pass: validations.every((item) => item.pass),
    },
    cleanup_declared_count: {
      actual: scenarios.filter((item) => item.cleanup).length,
      min: scenarios.length,
      pass: scenarios.every((item) => item.cleanup),
    },
  };
  const status = Object.values(thresholds).every((item) => item.pass) ? "pass" : "fail";
  const metricsPath = join(evidenceDir, "metrics.json");
  const faultModelPath = join(evidenceDir, "fault-model.json");
  const automationResultPath = join(evidenceDir, "automation-result.json");
  const resultPath = join(evidenceDir, "result.json");
  await writeFile(metricsPath, `${JSON.stringify(metrics, null, 2)}\n`, "utf8");
  await writeFile(faultModelPath, `${JSON.stringify({ scenarios }, null, 2)}\n`, "utf8");
  const finishedAt = new Date();
  const result = {
    source: "automation",
    case_id: caseId,
    run_id: runId,
    status,
    reason: status === "pass"
      ? "Fault taxonomy contract declares status, recovery, and cleanup for every scenario."
      : "Fault taxonomy contract is missing required scenario fields.",
    started_at: startedAt.toISOString(),
    started_at_local: localIsoWithOffset(startedAt),
    finished_at: finishedAt.toISOString(),
    finished_at_local: localIsoWithOffset(finishedAt),
    duration_ms: finishedAt.getTime() - startedAt.getTime(),
    metrics_summary: {
      scenario_count: metrics.scenario_count,
      status_counts: metrics.status_counts,
      invalid_scenario_count: thresholds.invalid_scenario_count.actual,
    },
    thresholds_summary: thresholds,
    artifacts: {
      metrics_json: metricsPath,
      fault_model_json: faultModelPath,
      automation_result_json: automationResultPath,
      result_json: resultPath,
    },
    evidence_collected: ["metrics", "filesystem"],
  };
  const resultText = `${JSON.stringify(result, null, 2)}\n`;
  await writeFile(automationResultPath, resultText, "utf8");
  await writeFile(resultPath, resultText, "utf8");
  console.log(JSON.stringify(result, null, 2));
  exit(status === "pass" ? 0 : 1);
 }
 await main();
@@ -0,0 +1,212 @@
 #!/usr/bin/env node
 import { mkdir, writeFile } from "node:fs/promises";
 import { join, resolve } from "node:path";
 import { env, exit } from "node:process";
 function pad(value, size = 2) {
  return String(value).padStart(size, "0");
 }
 function localIsoWithOffset(date = new Date()) {
  const offsetMinutes = -date.getTimezoneOffset();
  const sign = offsetMinutes >= 0 ? "+" : "-";
  const absolute = Math.abs(offsetMinutes);
  return [
    `${date.getFullYear()}-${pad(date.getMonth() + 1)}-${pad(date.getDate())}`,
    `T${pad(date.getHours())}:${pad(date.getMinutes())}:${pad(date.getSeconds())}.${pad(date.getMilliseconds(), 3)}`,
    `${sign}${pad(Math.floor(absolute / 60))}:${pad(absolute % 60)}`,
  ].join("");
 }
 function timestampSlug(date = new Date()) {
  return date.toISOString().replace(/\.\d{3}Z$/, "Z").replace(/[^0-9A-Za-z]+/g, "-").replace(/^-|-$/g, "");
 }
 function percentile(values, percentileValue) {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const index = Math.min(sorted.length - 1, Math.ceil((percentileValue / 100) * sorted.length) - 1);
  return Number(sorted[index].toFixed(3));
 }
 function stats(values) {
  if (values.length === 0) return { min: 0, p50: 0, p95: 0, p99: 0, max: 0 };
  return {
    min: Number(Math.min(...values).toFixed(3)),
    p50: percentile(values, 50),
    p95: percentile(values, 95),
    p99: percentile(values, 99),
    max: Number(Math.max(...values).toFixed(3)),
  };
 }
 function parseJsonList(value, fallback) {
  if (!value) return fallback;
  try {
    const parsed = JSON.parse(value);
    return Array.isArray(parsed) && parsed.every((item) => typeof item === "string") ? parsed : fallback;
  } catch {
    return fallback;
  }
 }
 function joinUrl(baseUrl, path) {
  const base = baseUrl.replace(/\/+$/, "");
  const suffix = path.startsWith("/") ? path : `/${path}`;
  return `${base}${suffix}`;
 }
 async function fetchOnce(url, timeoutMs) {
  const controller = new AbortController();
  const timeout = setTimeout(() => controller.abort(), timeoutMs);
  const started = performance.now();
  try {
    const response = await fetch(url, { method: "GET", signal: controller.signal });
    await response.arrayBuffer();
    const latencyMs = performance.now() - started;
    return {
      url,
      // Only 5xx responses and network failures count as errors; 4xx responses are still "ok" here.
      ok: response.status < 500,
      status: response.status,
      latency_ms: Number(latencyMs.toFixed(3)),
      error: "",
    };
  } catch (error) {
    const latencyMs = performance.now() - started;
    return {
      url,
      ok: false,
      status: 0,
      latency_ms: Number(latencyMs.toFixed(3)),
      error: error instanceof Error ? error.message : String(error),
    };
  } finally {
    clearTimeout(timeout);
  }
 }
 async function runBatches(urls, totalRequests, concurrency, timeoutMs) {
  // Clamp the batch size to at least 1: splice(0, 0) (or a NaN/negative count) would never drain the queue.
  const batchSize = concurrency >= 1 ? Math.floor(concurrency) : 1;
  const queue = Array.from({ length: totalRequests }, (_, index) => urls[index % urls.length]);
  const results = [];
  while (queue.length > 0) {
    const batch = queue.splice(0, batchSize);
    results.push(...await Promise.all(batch.map((url) => fetchOnce(url, timeoutMs))));
  }
  return results;
 }
 async function main() {
  const root = resolve(env.LBS_ROOT || process.cwd());
  const caseId = "langbot-live-backend-latency";
  const runId = env.LBS_RUN_ID || `${timestampSlug()}-${caseId}`;
  const evidenceDir = resolve(env.LBS_EVIDENCE_DIR || join(root, "reports", "evidence", runId));
  await mkdir(evidenceDir, { recursive: true });
  const startedAt = new Date();
  const backendUrl = env.LANGBOT_BACKEND_URL || "";
  const endpoints = parseJsonList(env.LANGBOT_PERF_ENDPOINTS_JSON, ["/healthz"]);
  const totalRequests = Number(env.LANGBOT_PERF_REQUESTS || "12");
  const concurrency = Number(env.LANGBOT_PERF_CONCURRENCY || "2");
  const timeoutMs = Number(env.LANGBOT_PERF_TIMEOUT_MS || "5000");
  const p95BudgetMs = Number(env.LANGBOT_PERF_BACKEND_P95_MS || "1000");
  const maxErrorRate = Number(env.LANGBOT_PERF_MAX_ERROR_RATE || "0");
  const metricsPath = join(evidenceDir, "metrics.json");
  const networkLogPath = join(evidenceDir, "network.log");
  const automationResultPath = join(evidenceDir, "automation-result.json");
  const resultPath = join(evidenceDir, "result.json");
  let status = "fail";
  let reason = "";
  let results = [];
  if (!backendUrl) {
    status = "env_issue";
    reason = "LANGBOT_BACKEND_URL is not configured.";
  } else {
    const urls = endpoints.map((path) => joinUrl(backendUrl, path));
    results = await runBatches(urls, totalRequests, concurrency, timeoutMs);
    const okCount = results.filter((item) => item.ok).length;
    const errorCount = results.length - okCount;
    const errorRate = results.length === 0 ? 1 : errorCount / results.length;
    const latencies = results.filter((item) => item.ok).map((item) => item.latency_ms);
    const latencyStats = stats(latencies);
    const allConnectionFailures = results.length > 0 && results.every((item) => item.status === 0);
    if (allConnectionFailures) {
      status = "env_issue";
      reason = `Backend did not respond at ${backendUrl}.`;
    } else if (latencyStats.p95 <= p95BudgetMs && errorRate <= maxErrorRate) {
      status = "pass";
      reason = "Live backend latency probe passed all thresholds.";
    } else {
      status = "fail";
      reason = "Live backend latency probe breached latency or error-rate thresholds.";
    }
  }
  const statusCounts = {};
  for (const item of results) {
    const key = item.status === 0 ? "network_error" : String(item.status);
    statusCounts[key] = (statusCounts[key] || 0) + 1;
  }
  const okResults = results.filter((item) => item.ok);
  const metrics = {
    probe: caseId,
    backend_url: backendUrl,
    endpoints,
    total_requests: totalRequests,
    concurrency,
    timeout_ms: timeoutMs,
    ok_count: okResults.length,
    error_count: results.length - okResults.length,
    error_rate: results.length === 0 ? 1 : Number(((results.length - okResults.length) / results.length).toFixed(4)),
    latency_ms: stats(okResults.map((item) => item.latency_ms)),
    status_counts: statusCounts,
  };
  const thresholds = {
    backend_p95_ms: { actual: metrics.latency_ms.p95, max: p95BudgetMs, pass: metrics.latency_ms.p95 <= p95BudgetMs },
    error_rate: { actual: metrics.error_rate, max: maxErrorRate, pass: metrics.error_rate <= maxErrorRate },
  };
  await writeFile(metricsPath, `${JSON.stringify({ ...metrics, samples: results }, null, 2)}\n`, "utf8");
  await writeFile(networkLogPath, results.map((item) => JSON.stringify(item)).join("\n") + (results.length > 0 ? "\n" : ""), "utf8");
  const finishedAt = new Date();
  const result = {
    source: "automation",
    case_id: caseId,
    run_id: runId,
    status,
    reason,
    started_at: startedAt.toISOString(),
    started_at_local: localIsoWithOffset(startedAt),
    finished_at: finishedAt.toISOString(),
    finished_at_local: localIsoWithOffset(finishedAt),
    duration_ms: finishedAt.getTime() - startedAt.getTime(),
    url: backendUrl,
    metrics_summary: {
      requests: metrics.total_requests,
      concurrency: metrics.concurrency,
      ok_count: metrics.ok_count,
      error_rate: metrics.error_rate,
      latency_p50_ms: metrics.latency_ms.p50,
      latency_p95_ms: metrics.latency_ms.p95,
      status_counts: metrics.status_counts,
    },
    thresholds_summary: thresholds,
    artifacts: {
      metrics_json: metricsPath,
      network_log: networkLogPath,
      automation_result_json: automationResultPath,
      result_json: resultPath,
    },
    evidence_collected: ["metrics", "network", "api_diagnostic", "filesystem"],
  };
  const resultText = `${JSON.stringify(result, null, 2)}\n`;
  await writeFile(automationResultPath, resultText, "utf8");
  await writeFile(resultPath, resultText, "utf8");
  console.log(JSON.stringify(result, null, 2));
  exit(status === "pass" ? 0 : status === "env_issue" ? 2 : 1);
 }
 await main();
@@ -0,0 +1,205 @@
 #!/usr/bin/env node
 import { existsSync, readdirSync, statSync } from "node:fs";
 import { mkdir, readFile, writeFile } from "node:fs/promises";
 import { join, resolve } from "node:path";
 import { env, exit } from "node:process";
 function pad(value, size = 2) {
  return String(value).padStart(size, "0");
 }
 function localIsoWithOffset(date = new Date()) {
  const offsetMinutes = -date.getTimezoneOffset();
  const sign = offsetMinutes >= 0 ? "+" : "-";
  const absolute = Math.abs(offsetMinutes);
  return [
    `${date.getFullYear()}-${pad(date.getMonth() + 1)}-${pad(date.getDate())}`,
    `T${pad(date.getHours())}:${pad(date.getMinutes())}:${pad(date.getSeconds())}.${pad(date.getMilliseconds(), 3)}`,
    `${sign}${pad(Math.floor(absolute / 60))}:${pad(absolute % 60)}`,
  ].join("");
 }
 function timestampSlug(date = new Date()) {
  return date.toISOString().replace(/\.\d{3}Z$/, "Z").replace(/[^0-9A-Za-z]+/g, "-").replace(/^-|-$/g, "");
 }
 function repoRootFromEnv(root) {
  return env.LANGBOT_REPO ? resolve(env.LANGBOT_REPO) : resolve(root, "..");
 }
 function latestBackendLog(root) {
  const explicit = env.LANGBOT_BACKEND_LOG;
  if (explicit) return resolve(explicit);
  const logsDir = join(repoRootFromEnv(root), "data", "logs");
  if (!existsSync(logsDir)) return "";
  // Stat each candidate once so the sort comparator cannot throw if a file disappears mid-scan.
  const candidates = readdirSync(logsDir)
    .filter((name) => /^langbot-.*\.log$/.test(name))
    .map((name) => {
      const path = join(logsDir, name);
      try {
        const info = statSync(path);
        return info.isFile() ? { path, mtimeMs: info.mtimeMs } : null;
      } catch {
        return null;
      }
    })
    .filter(Boolean)
    .sort((left, right) => right.mtimeMs - left.mtimeMs);
  return candidates[0]?.path || "";
 }
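 function parseSince(startedAt) {
  // Ignore an unparseable LANGBOT_BACKEND_LOG_SINCE: an Invalid Date would scan nothing and make toISOString() throw.
  const explicit = env.LANGBOT_BACKEND_LOG_SINCE ? new Date(env.LANGBOT_BACKEND_LOG_SINCE) : null;
  if (explicit && !Number.isNaN(explicit.getTime())) return explicit;
  const parsed = Number(env.LANGBOT_BACKEND_LOG_LOOKBACK_SECONDS || "300");
  const lookbackSeconds = Number.isFinite(parsed) ? parsed : 300;
  return new Date(startedAt.getTime() - lookbackSeconds * 1000);
 }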
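 function parseTimestamp(line, year) {
  // Application log lines carry no year or UTC offset: the year comes from the caller and the offset is assumed to be +08:00.
  const localMatch = line.match(/^\[(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})\.(\d{3})\]/);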
  if (localMatch) {
    const [, month, day, hour, minute, second, millisecond] = localMatch;
    return new Date(`${year}-${month}-${day}T${hour}:${minute}:${second}.${millisecond}+08:00`);
  }
  const accessMatch = line.match(/^\[(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2}) ([+-]\d{4})\]/);
  if (accessMatch) {
    const [, fullYear, month, day, hour, minute, second, offset] = accessMatch;
    const normalizedOffset = `${offset.slice(0, 3)}:${offset.slice(3)}`;
    return new Date(`${fullYear}-${month}-${day}T${hour}:${minute}:${second}${normalizedOffset}`);
  }
  return null;
 }
 function findingForLine(line, number) {
  const rules = [
    { severity: "fail", kind: "python_traceback", pattern: /\bTraceback(?: \(most recent call last\))?/i },
    { severity: "fail", kind: "unretrieved_task_exception", pattern: /Task exception was never retrieved/i },
    { severity: "fail", kind: "unawaited_coroutine", pattern: /RuntimeWarning:\s+coroutine .* was never awaited/i },
    { severity: "fail", kind: "unclosed_client_session", pattern: /Unclosed client session/i },
    { severity: "fail", kind: "unclosed_connector", pattern: /Unclosed connector/i },
    { severity: "fail", kind: "import_error", pattern: /\bImportError\b/i },
    { severity: "fail", kind: "error_log", pattern: /\b(?:ERROR|CRITICAL)\b/ },
    { severity: "warning", kind: "warning_log", pattern: /\bWARNING\b/ },
  ];
  for (const rule of rules) {
    if (rule.pattern.test(line)) {
      return {
        severity: rule.severity,
        kind: rule.kind,
        line: number,
        excerpt: line,
      };
    }
  }
  return null;
 }
 function scanLines(text, since, year) {
  const findings = [];
  const scanned = [];
  let includeContinuation = false;
  const lines = text.split(/\r?\n/);
  for (const [index, line] of lines.entries()) {
    const number = index + 1;
    const timestamp = parseTimestamp(line, year);
    if (timestamp) includeContinuation = timestamp >= since;
    if (!includeContinuation) continue;
    scanned.push({ number, text: line });
    const finding = findingForLine(line, number);
    if (finding) findings.push(finding);
  }
  return { findings, scanned, total_lines: lines.length };
 }
 async function main() {
  const root = resolve(env.LBS_ROOT || process.cwd());
  const caseId = "langbot-live-backend-log-health";
  const runId = env.LBS_RUN_ID || `${timestampSlug()}-${caseId}`;
  const evidenceDir = resolve(env.LBS_EVIDENCE_DIR || join(root, "reports", "evidence", runId));
  await mkdir(evidenceDir, { recursive: true });
  const startedAt = new Date();
  const since = parseSince(startedAt);
  const logPath = latestBackendLog(root);
  const metricsPath = join(evidenceDir, "metrics.json");
  const findingsPath = join(evidenceDir, "findings.json");
  const scannedLogPath = join(evidenceDir, "scanned-backend.log");
  const automationResultPath = join(evidenceDir, "automation-result.json");
  const resultPath = join(evidenceDir, "result.json");
  let status = "fail";
  let reason = "";
  let scan = { findings: [], scanned: [], total_lines: 0 };
  if (!logPath || !existsSync(logPath)) {
    status = "env_issue";
    reason = "No LangBot backend log file was found. Set LANGBOT_BACKEND_LOG or LANGBOT_REPO.";
  } else {
    const text = await readFile(logPath, "utf8");
    scan = scanLines(text, since, startedAt.getFullYear());
    const failCount = scan.findings.filter((item) => item.severity === "fail").length;
    status = failCount === 0 ? "pass" : "fail";
    reason = status === "pass"
      ? "Live backend log health passed; no fail-severity findings in the scanned window."
      : "Live backend log health found fail-severity backend log findings.";
  }
  const warningCount = scan.findings.filter((item) => item.severity === "warning").length;
  const failCount = scan.findings.filter((item) => item.severity === "fail").length;
  const metrics = {
    probe: caseId,
    backend_log: logPath,
    since: since.toISOString(),
    scanned_line_count: scan.scanned.length,
    total_line_count: scan.total_lines,
    fail_count: failCount,
    warning_count: warningCount,
    finding_count: scan.findings.length,
  };
  const thresholds = {
    fail_count: { actual: failCount, max: 0, pass: failCount === 0 },
  };
  await writeFile(metricsPath, `${JSON.stringify(metrics, null, 2)}\n`, "utf8");
  await writeFile(findingsPath, `${JSON.stringify(scan.findings, null, 2)}\n`, "utf8");
  await writeFile(scannedLogPath, scan.scanned.map((item) => `${item.number}: ${item.text}`).join("\n") + (scan.scanned.length > 0 ? "\n" : ""), "utf8");
  const finishedAt = new Date();
  const result = {
    source: "automation",
    case_id: caseId,
    run_id: runId,
    status,
    reason,
    started_at: startedAt.toISOString(),
    started_at_local: localIsoWithOffset(startedAt),
    finished_at: finishedAt.toISOString(),
    finished_at_local: localIsoWithOffset(finishedAt),
    duration_ms: finishedAt.getTime() - startedAt.getTime(),
    url: logPath,
    metrics_summary: {
      scanned_line_count: metrics.scanned_line_count,
      fail_count: metrics.fail_count,
      warning_count: metrics.warning_count,
      finding_count: metrics.finding_count,
    },
    thresholds_summary: thresholds,
    artifacts: {
      metrics_json: metricsPath,
      findings_json: findingsPath,
      scanned_backend_log: scannedLogPath,
      automation_result_json: automationResultPath,
      result_json: resultPath,
    },
    evidence_collected: ["metrics", "backend_log", "filesystem"],
  };
  const resultText = `${JSON.stringify(result, null, 2)}\n`;
  await writeFile(automationResultPath, resultText, "utf8");
  await writeFile(resultPath, resultText, "utf8");
  console.log(JSON.stringify(result, null, 2));
  exit(status === "pass" ? 0 : status === "env_issue" ? 2 : 1);
 }
 await main();
@@ -0,0 +1,311 @@
 #!/usr/bin/env node
 import { mkdir, writeFile } from "node:fs/promises";
 import { join, resolve } from "node:path";
 import { env, exit } from "node:process";
 function pad(value, size = 2) {
  return String(value).padStart(size, "0");
 }
 function localIsoWithOffset(date = new Date()) {
  const offsetMinutes = -date.getTimezoneOffset();
  const sign = offsetMinutes >= 0 ? "+" : "-";
  const absolute = Math.abs(offsetMinutes);
  return [
    `${date.getFullYear()}-${pad(date.getMonth() + 1)}-${pad(date.getDate())}`,
    `T${pad(date.getHours())}:${pad(date.getMinutes())}:${pad(date.getSeconds())}.${pad(date.getMilliseconds(), 3)}`,
    `${sign}${pad(Math.floor(absolute / 60))}:${pad(absolute % 60)}`,
  ].join("");
 }
 function timestampSlug(date = new Date()) {
  return date.toISOString().replace(/\.\d{3}Z$/, "Z").replace(/[^0-9A-Za-z]+/g, "-").replace(/^-|-$/g, "");
 }
 function percentile(values, percentileValue) {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const index = Math.min(sorted.length - 1, Math.ceil((percentileValue / 100) * sorted.length) - 1);
  return Number(sorted[index].toFixed(3));
 }
 function stats(values) {
  if (values.length === 0) return { min: 0, p50: 0, p95: 0, p99: 0, max: 0 };
  return {
    min: Number(Math.min(...values).toFixed(3)),
    p50: percentile(values, 50),
    p95: percentile(values, 95),
    p99: percentile(values, 99),
    max: Number(Math.max(...values).toFixed(3)),
  };
 }
 function joinUrl(baseUrl, path) {
  const base = baseUrl.replace(/\/+$/, "");
  const suffix = path.startsWith("/") ? path : `/${path}`;
  return `${base}${suffix}`;
 }
 function parseJsonObject(value, fallback) {
  if (!value) return fallback;
  try {
    const parsed = JSON.parse(value);
    return parsed && typeof parsed === "object" && !Array.isArray(parsed) ? parsed : fallback;
  } catch {
    return fallback;
  }
 }
 function controlPlaneEndpoints() {
  return [
    {
      id: "healthz",
      path: "/healthz",
      expected_status: 200,
      expected_code: 0,
      p95_budget_ms: Number(env.LANGBOT_PERF_HEALTHZ_P95_MS || "500"),
      required_data_fields: [],
    },
    {
      id: "system_info",
      path: "/api/v1/system/info",
      expected_status: 200,
      expected_code: 0,
      p95_budget_ms: Number(env.LANGBOT_PERF_SYSTEM_INFO_P95_MS || "1000"),
      required_data_fields: ["version", "edition", "enable_marketplace"],
    },
  ];
 }
 async function fetchEndpoint(backendUrl, endpoint, timeoutMs) {
  const url = joinUrl(backendUrl, endpoint.path);
  const controller = new AbortController();
  const timeout = setTimeout(() => controller.abort(), timeoutMs);
  const started = performance.now();
  let bodyText = "";
  let json = null;
  let jsonValid = false;
  let error = "";
  try {
    const response = await fetch(url, {
      method: "GET",
      headers: { "accept": "application/json" },
      signal: controller.signal,
    });
    bodyText = await response.text();
    try {
      json = bodyText ? JSON.parse(bodyText) : null;
      jsonValid = json !== null;
    } catch (parseError) {
      error = parseError instanceof Error ? parseError.message : String(parseError);
    }
    const data = json && typeof json === "object" && json.data && typeof json.data === "object" ? json.data : {};
    const missingFields = endpoint.required_data_fields.filter((field) => !(field in data));
    const statusOk = response.status === endpoint.expected_status;
    const codeOk = !json || typeof json !== "object" ? false : json.code === endpoint.expected_code;
    const shapeOk = jsonValid && missingFields.length === 0;
    const latencyMs = performance.now() - started;
    return {
      endpoint_id: endpoint.id,
      path: endpoint.path,
      url,
      status: response.status,
      ok: statusOk && codeOk && shapeOk,
      status_ok: statusOk,
      code_ok: codeOk,
      json_valid: jsonValid,
      missing_fields: missingFields,
      response_code: json && typeof json === "object" ? json.code : null,
      latency_ms: Number(latencyMs.toFixed(3)),
      error,
    };
  } catch (fetchError) {
    const latencyMs = performance.now() - started;
    return {
      endpoint_id: endpoint.id,
      path: endpoint.path,
      url,
      status: 0,
      ok: false,
      status_ok: false,
      code_ok: false,
      json_valid: false,
      missing_fields: endpoint.required_data_fields,
      response_code: null,
      latency_ms: Number(latencyMs.toFixed(3)),
      error: fetchError instanceof Error ? fetchError.message : String(fetchError),
    };
  } finally {
    clearTimeout(timeout);
  }
 }
 async function runBatches(backendUrl, endpoints, totalRequests, concurrency, timeoutMs) {
  // Clamp the batch size to at least 1: splice(0, 0) (or a NaN/negative count) would never drain the queue.
  const batchSize = concurrency >= 1 ? Math.floor(concurrency) : 1;
  const queue = Array.from({ length: totalRequests }, (_, index) => endpoints[index % endpoints.length]);
  const results = [];
  while (queue.length > 0) {
    const batch = queue.splice(0, batchSize);
    results.push(...await Promise.all(batch.map((endpoint) => fetchEndpoint(backendUrl, endpoint, timeoutMs))));
  }
  return results;
 }
 function endpointMetrics(endpoints, results) {
  return Object.fromEntries(endpoints.map((endpoint) => {
    const samples = results.filter((item) => item.endpoint_id === endpoint.id);
    const okSamples = samples.filter((item) => item.ok);
    return [
      endpoint.id,
      {
        path: endpoint.path,
        requests: samples.length,
        ok_count: okSamples.length,
        error_rate: samples.length === 0 ? 1 : Number(((samples.length - okSamples.length) / samples.length).toFixed(4)),
        latency_ms: stats(okSamples.map((item) => item.latency_ms)),
        p95_budget_ms: endpoint.p95_budget_ms,
      },
    ];
  }));
 }
 async function main() {
  const root = resolve(env.LBS_ROOT || process.cwd());
  const caseId = "langbot-live-control-plane-api";
  const runId = env.LBS_RUN_ID || `${timestampSlug()}-${caseId}`;
  const evidenceDir = resolve(env.LBS_EVIDENCE_DIR || join(root, "reports", "evidence", runId));
  await mkdir(evidenceDir, { recursive: true });
  const startedAt = new Date();
  const backendUrl = env.LANGBOT_BACKEND_URL || "";
  const endpoints = controlPlaneEndpoints();
  const configuredBudgets = parseJsonObject(env.LANGBOT_CONTROL_PLANE_P95_BUDGETS_JSON, {});
  for (const endpoint of endpoints) {
    const budget = configuredBudgets[endpoint.id];
    if (typeof budget === "number" && Number.isFinite(budget)) endpoint.p95_budget_ms = budget;
  }
  const totalRequests = Number(env.LANGBOT_CONTROL_PLANE_REQUESTS || "20");
  const concurrency = Number(env.LANGBOT_CONTROL_PLANE_CONCURRENCY || "4");
  const timeoutMs = Number(env.LANGBOT_CONTROL_PLANE_TIMEOUT_MS || "5000");
  const maxErrorRate = Number(env.LANGBOT_CONTROL_PLANE_MAX_ERROR_RATE || "0");
  const metricsPath = join(evidenceDir, "metrics.json");
  const endpointsPath = join(evidenceDir, "endpoints.json");
  const networkLogPath = join(evidenceDir, "network.log");
  const automationResultPath = join(evidenceDir, "automation-result.json");
  const resultPath = join(evidenceDir, "result.json");
  let status = "fail";
  let reason = "";
  let results = [];
  if (!backendUrl) {
    status = "env_issue";
    reason = "LANGBOT_BACKEND_URL is not configured.";
  } else {
    results = await runBatches(backendUrl, endpoints, totalRequests, concurrency, timeoutMs);
    const allConnectionFailures = results.length > 0 && results.every((item) => item.status === 0);
    if (allConnectionFailures) {
      status = "env_issue";
      reason = `Backend did not respond at ${backendUrl}.`;
    }
  }
  const okResults = results.filter((item) => item.ok);
  const statusCounts = {};
  for (const item of results) {
    const key = item.status === 0 ? "network_error" : String(item.status);
    statusCounts[key] = (statusCounts[key] || 0) + 1;
  }
  const perEndpoint = endpointMetrics(endpoints, results);
  const responseShapeFailures = results.filter((item) => !item.json_valid || item.missing_fields.length > 0 || !item.code_ok).length;
  const errorRate = results.length === 0 ? 1 : Number(((results.length - okResults.length) / results.length).toFixed(4));
  const thresholds = {
    error_rate: { actual: errorRate, max: maxErrorRate, pass: errorRate <= maxErrorRate },
    response_shape_failures: { actual: responseShapeFailures, max: 0, pass: responseShapeFailures === 0 },
  };
  for (const endpoint of endpoints) {
    const actual = perEndpoint[endpoint.id].latency_ms.p95;
    thresholds[`${endpoint.id}_p95_ms`] = {
      actual,
      max: endpoint.p95_budget_ms,
      pass: actual <= endpoint.p95_budget_ms,
    };
  }
  if (status !== "env_issue") {
    const passed = Object.values(thresholds).every((item) => item.pass);
    status = passed ? "pass" : "fail";
    reason = passed
      ? "Live control-plane API probe passed all thresholds."
      : "Live control-plane API probe breached shape, latency, or error-rate thresholds.";
  }
  const metrics = {
    probe: caseId,
    backend_url: backendUrl,
    total_requests: totalRequests,
    concurrency,
    timeout_ms: timeoutMs,
    ok_count: okResults.length,
    error_count: results.length - okResults.length,
    error_rate: errorRate,
    status_counts: statusCounts,
    response_shape_failures: responseShapeFailures,
    endpoints: perEndpoint,
  };
  await writeFile(metricsPath, `${JSON.stringify({ ...metrics, samples: results }, null, 2)}\n`, "utf8");
  await writeFile(endpointsPath, `${JSON.stringify(endpoints, null, 2)}\n`, "utf8");
  await writeFile(networkLogPath, results.map((item) => JSON.stringify(item)).join("\n") + (results.length > 0 ? "\n" : ""), "utf8");
  const finishedAt = new Date();
  const result = {
    source: "automation",
    case_id: caseId,
    run_id: runId,
    status,
    reason,
    started_at: startedAt.toISOString(),
    started_at_local: localIsoWithOffset(startedAt),
    finished_at: finishedAt.toISOString(),
    finished_at_local: localIsoWithOffset(finishedAt),
    duration_ms: finishedAt.getTime() - startedAt.getTime(),
    url: backendUrl,
    metrics_summary: {
      requests: metrics.total_requests,
      concurrency: metrics.concurrency,
      ok_count: metrics.ok_count,
      error_rate: metrics.error_rate,
      response_shape_failures: metrics.response_shape_failures,
      endpoints: Object.fromEntries(Object.entries(metrics.endpoints).map(([id, value]) => [
        id,
        {
          path: value.path,
          ok_count: value.ok_count,
          error_rate: value.error_rate,
          latency_p50_ms: value.latency_ms.p50,
          latency_p95_ms: value.latency_ms.p95,
        },
      ])),
      status_counts: metrics.status_counts,
    },
    thresholds_summary: thresholds,
    artifacts: {
      metrics_json: metricsPath,
      endpoints_json: endpointsPath,
      network_log: networkLogPath,
      automation_result_json: automationResultPath,
      result_json: resultPath,
    },
    evidence_collected: ["metrics", "network", "api_diagnostic", "filesystem"],
  };
  const resultText = `${JSON.stringify(result, null, 2)}\n`;
  await writeFile(automationResultPath, resultText, "utf8");
  await writeFile(resultPath, resultText, "utf8");
  console.log(JSON.stringify(result, null, 2));
  exit(status === "pass" ? 0 : status === "env_issue" ? 2 : 1);
 }
 await main();
@@ -0,0 +1,162 @@
 #!/usr/bin/env node
 import { mkdir, writeFile } from "node:fs/promises";
 import { join, resolve } from "node:path";
 import { env, exit } from "node:process";
 function pad(value, size = 2) {
  return String(value).padStart(size, "0");
 }
 function localIsoWithOffset(date = new Date()) {
  const offsetMinutes = -date.getTimezoneOffset();
  const sign = offsetMinutes >= 0 ? "+" : "-";
  const absolute = Math.abs(offsetMinutes);
  return [
    `${date.getFullYear()}-${pad(date.getMonth() + 1)}-${pad(date.getDate())}`,
    `T${pad(date.getHours())}:${pad(date.getMinutes())}:${pad(date.getSeconds())}.${pad(date.getMilliseconds(), 3)}`,
    `${sign}${pad(Math.floor(absolute / 60))}:${pad(absolute % 60)}`,
  ].join("");
 }
 function timestampSlug(date = new Date()) {
  return date.toISOString().replace(/\.\d{3}Z$/, "Z").replace(/[^0-9A-Za-z]+/g, "-").replace(/^-|-$/g, "");
 }
 function percentile(values, percentileValue) {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const index = Math.min(sorted.length - 1, Math.ceil((percentileValue / 100) * sorted.length) - 1);
  return Number(sorted[index].toFixed(3));
 }
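 function stats(values) {
  // Guard empty input: Math.min()/Math.max() of nothing are ±Infinity, which JSON.stringify writes as null.
  if (values.length === 0) return { min: 0, p50: 0, p95: 0, p99: 0, max: 0 };
  return {
    min: Number(Math.min(...values).toFixed(3)),
    p50: percentile(values, 50),
    p95: percentile(values, 95),
    p99: percentile(values, 99),
    max: Number(Math.max(...values).toFixed(3)),
  };
 }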
 function threshold(actual, limit, operator) {
  const pass = operator === "<=" ? actual <= limit : actual >= limit;
  return { actual, [operator === "<=" ? "max" : "min"]: limit, pass };
 }
 function makeSample(index) {
  const ingress = 1 + (index % 5) * 0.22;
  const pipeline = 2.8 + (index % 7) * 0.31;
  const persistence = 1.1 + (index % 4) * 0.2;
  const pluginIpc = 1.9 + (index % 6) * 0.27;
  const rag = index % 3 === 0 ? 4.4 : 0.8 + (index % 5) * 0.18;
  const streaming = 1.5 + (index % 8) * 0.24;
  const provider = 80 + (index % 13) * 11;
  const externalTool = index % 4 === 0 ? 25 + (index % 9) * 3 : 0;
  const network = 8 + (index % 10) * 1.7;
  const overhead = ingress + pipeline + persistence + pluginIpc + rag + streaming;
  const external = provider + externalTool + network;
  const total = overhead + external;
  return {
    index,
    segments_ms: {
      ingress,
      pipeline,
      persistence,
      plugin_ipc: pluginIpc,
      rag,
      streaming,
      provider,
      external_tool: externalTool,
      network,
    },
    langbot_overhead_ms: Number(overhead.toFixed(3)),
    external_latency_ms: Number(external.toFixed(3)),
    e2e_latency_ms: Number(total.toFixed(3)),
    accounting_gap_ms: Number((total - external - overhead).toFixed(6)),
  };
 }
 async function main() {
  const root = resolve(env.LBS_ROOT || process.cwd());
  const caseId = "langbot-overhead-accounting-contract";
  const runId = env.LBS_RUN_ID || `${timestampSlug()}-${caseId}`;
  const evidenceDir = resolve(env.LBS_EVIDENCE_DIR || join(root, "reports", "evidence", runId));
  await mkdir(evidenceDir, { recursive: true });
  const startedAt = new Date();
  const sampleCount = Number(env.LANGBOT_PERF_CONTRACT_SAMPLES || "80");
  const overheadP95BudgetMs = Number(env.LANGBOT_PERF_OVERHEAD_P95_MS || "25");
  const samples = Array.from({ length: sampleCount }, (_, index) => makeSample(index));
  const overheads = samples.map((sample) => sample.langbot_overhead_ms);
  const e2e = samples.map((sample) => sample.e2e_latency_ms);
  const external = samples.map((sample) => sample.external_latency_ms);
  const gaps = samples.map((sample) => Math.abs(sample.accounting_gap_ms));
  const memory = process.memoryUsage();
  const metrics = {
    probe: caseId,
    sample_count: sampleCount,
    langbot_overhead_ms: stats(overheads),
    e2e_latency_ms: stats(e2e),
    external_latency_ms: stats(external),
    accounting_gap_max_ms: Number(Math.max(...gaps).toFixed(6)),
    samples,
  };
  const thresholds = {
    sample_count: threshold(sampleCount, 50, ">="),
    langbot_overhead_p95_ms: threshold(metrics.langbot_overhead_ms.p95, overheadP95BudgetMs, "<="),
    accounting_gap_max_ms: threshold(metrics.accounting_gap_max_ms, 0.001, "<="),
  };
  const status = Object.values(thresholds).every((item) => item.pass) ? "pass" : "fail";
  const metricsPath = join(evidenceDir, "metrics.json");
  const thresholdsPath = join(evidenceDir, "thresholds.json");
  const resourceLogPath = join(evidenceDir, "resource-log.json");
  const automationResultPath = join(evidenceDir, "automation-result.json");
  const resultPath = join(evidenceDir, "result.json");
  await writeFile(metricsPath, `${JSON.stringify(metrics, null, 2)}\n`, "utf8");
  await writeFile(thresholdsPath, `${JSON.stringify(thresholds, null, 2)}\n`, "utf8");
  await writeFile(resourceLogPath, `${JSON.stringify({ memory, pid: process.pid }, null, 2)}\n`, "utf8");
  const finishedAt = new Date();
  const result = {
    source: "automation",
    case_id: caseId,
    run_id: runId,
    status,
    reason: status === "pass"
      ? "Overhead accounting contract passed all thresholds."
      : "Overhead accounting contract breached one or more thresholds.",
    started_at: startedAt.toISOString(),
    started_at_local: localIsoWithOffset(startedAt),
    finished_at: finishedAt.toISOString(),
    finished_at_local: localIsoWithOffset(finishedAt),
    duration_ms: finishedAt.getTime() - startedAt.getTime(),
    metrics_summary: {
      sample_count: metrics.sample_count,
      langbot_overhead_p95_ms: metrics.langbot_overhead_ms.p95,
      e2e_latency_p95_ms: metrics.e2e_latency_ms.p95,
      external_latency_p95_ms: metrics.external_latency_ms.p95,
      accounting_gap_max_ms: metrics.accounting_gap_max_ms,
    },
    thresholds_summary: thresholds,
    artifacts: {
      metrics_json: metricsPath,
      thresholds_json: thresholdsPath,
      resource_log_json: resourceLogPath,
      automation_result_json: automationResultPath,
      result_json: resultPath,
    },
    evidence_collected: ["metrics", "resource_log", "filesystem"],
  };
  const resultText = `${JSON.stringify(result, null, 2)}\n`;
  await writeFile(automationResultPath, resultText, "utf8");
  await writeFile(resultPath, resultText, "utf8");
  console.log(JSON.stringify(result, null, 2));
  exit(status === "pass" ? 0 : 1);
 }
 await main();
@@ -0,0 +1,134 @@
 export function summarizeFakeProviderState(state) {
  if (!state) return null;
  const recentRequests = Array.isArray(state.recent_requests) ? state.recent_requests : [];
  const chatRequests = recentRequests.filter((request) => String(request?.path || "").includes("/chat/completions"));
  const successfulRequests = chatRequests.filter((request) => request?.status === "ok");
  const faultRequests = chatRequests.filter((request) => (
    request?.should_fail === true
      || request?.status === "http_fault"
      || (Number.isFinite(request?.http_status) && request.http_status >= 400)
  ));
  return {
    status: state.status || "unknown",
    url: state.url || "",
    request_count: Number.isFinite(state.request_count) ? state.request_count : recentRequests.length,
    recent_request_count: recentRequests.length,
    chat_request_count: chatRequests.length,
    fault_count: faultRequests.length,
    streamed_request_count: chatRequests.filter((request) => request?.stream === true).length,
    duration_ms: stats(chatRequests.map((request) => numberOrNull(request?.duration_ms)).filter(Number.isFinite)),
    successful_duration_ms: stats(successfulRequests.map((request) => numberOrNull(request?.duration_ms)).filter(Number.isFinite)),
    first_chunk_ms: stats(successfulRequests.map((request) => numberOrNull(request?.first_chunk_ms)).filter(Number.isFinite)),
    first_content_chunk_ms: stats(successfulRequests.map((request) => numberOrNull(request?.first_content_chunk_ms)).filter(Number.isFinite)),
    content_chunk_count: stats(successfulRequests.map((request) => numberOrNull(request?.content_chunk_count)).filter(Number.isFinite)),
    config: state.config || {},
  };
 }
 export function buildProviderTimingMetrics(samples, state) {
  const recentRequests = Array.isArray(state?.recent_requests) ? state.recent_requests : [];
  const byExpectedText = new Map();
  for (const request of recentRequests) {
    const expected = String(request?.expected_text || "");
    if (!expected) continue;
    if (!byExpectedText.has(expected)) byExpectedText.set(expected, []);
    byExpectedText.get(expected).push(request);
  }
  const segments = [];
  const missingExpectedText = [];
  for (const sample of samples) {
    const expected = String(sample?.expected_text || "");
    if (!expected) continue;
    const request = (byExpectedText.get(expected) || []).shift();
    if (!request) {
      missingExpectedText.push(expected);
      continue;
    }
    const segment = buildTimingSegment(sample, request);
    if (segment) segments.push(segment);
  }
  const values = (key) => segments.map((segment) => numberOrNull(segment[key])).filter(Number.isFinite);
  return {
    matched_request_count: segments.length,
    missing_provider_match_count: missingExpectedText.length,
    missing_expected_text: missingExpectedText.slice(0, 20),
    send_to_provider_start_ms: stats(values("send_to_provider_start_ms")),
    provider_duration_ms: stats(values("provider_duration_ms")),
    provider_finish_to_ws_final_ms: stats(values("provider_finish_to_ws_final_ms")),
    langbot_overhead_estimate_ms: stats(values("langbot_overhead_estimate_ms")),
    e2e_minus_provider_ms: stats(values("e2e_minus_provider_ms")),
    provider_first_content_to_ws_first_content_ms: stats(values("provider_first_content_to_ws_first_content_ms")),
    segments,
  };
 }
 function buildTimingSegment(sample, request) {
  const sentEpochMs = numberOrNull(sample.sent_epoch_ms);
  const finishedEpochMs = numberOrNull(sample.finished_epoch_ms);
  const providerStartedEpochMs = numberOrNull(request.started_epoch_ms);
  const providerFinishedEpochMs = numberOrNull(request.finished_epoch_ms);
  const providerFirstContentEpochMs = numberOrNull(request.first_content_chunk_epoch_ms);
  const wsFirstContentEpochMs = numberOrNull(sample.first_assistant_content_epoch_ms);
  const responseDurationMs = numberOrNull(sample.response_duration_ms);
  const providerDurationMs = numberOrNull(request.duration_ms);
  const sendToProviderStartMs = finiteDelta(providerStartedEpochMs, sentEpochMs);
  const providerFinishToWsFinalMs = finiteDelta(finishedEpochMs, providerFinishedEpochMs);
  const e2eMinusProviderMs = Number.isFinite(responseDurationMs) && Number.isFinite(providerDurationMs)
    ? rounded(responseDurationMs - providerDurationMs)
    : null;
  const overheadEstimateMs = Number.isFinite(sendToProviderStartMs) && Number.isFinite(providerFinishToWsFinalMs)
    ? rounded(sendToProviderStartMs + providerFinishToWsFinalMs)
    : e2eMinusProviderMs;
  return {
    sample_index: sample.index,
    pipeline_label: sample.pipeline_label || "",
    expected_text: sample.expected_text || "",
    provider_request_id: request.id || "",
    provider_request_number: request.request_number ?? null,
    response_duration_ms: responseDurationMs,
    provider_duration_ms: providerDurationMs,
    send_to_provider_start_ms: sendToProviderStartMs,
    provider_finish_to_ws_final_ms: providerFinishToWsFinalMs,
    langbot_overhead_estimate_ms: overheadEstimateMs,
    e2e_minus_provider_ms: e2eMinusProviderMs,
    provider_first_content_to_ws_first_content_ms: finiteDelta(wsFirstContentEpochMs, providerFirstContentEpochMs),
    provider_status: request.status || "",
    provider_http_status: request.http_status ?? null,
  };
 }
 function finiteDelta(left, right) {
  return Number.isFinite(left) && Number.isFinite(right) ? rounded(left - right) : null;
 }
 export function stats(values) {
  if (values.length === 0) return { min: 0, p50: 0, p95: 0, p99: 0, max: 0 };
  return {
    min: rounded(Math.min(...values)),
    p50: percentile(values, 50),
    p95: percentile(values, 95),
    p99: percentile(values, 99),
    max: rounded(Math.max(...values)),
  };
 }
 export function percentile(values, percentileValue) {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const index = Math.min(sorted.length - 1, Math.ceil((percentileValue / 100) * sorted.length) - 1);
  return rounded(sorted[index]);
 }
 export function rounded(value) {
  return Number(value.toFixed(3));
 }
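 function numberOrNull(value) {
  // Number(null) and Number("") are 0, so reject empty values first; otherwise missing timings count as 0 ms samples.
  if (value === null || value === undefined || value === "") return null;
  const number = Number(value);
  return Number.isFinite(number) ? number : null;
 }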
@@ -0,0 +1,285 @@
 # Performance And Reliability Testing
 Use this reference when a QA request asks whether LangBot is fast enough,
 stable under load, or resilient to controlled faults.
 These probes are manual/non-required QA gates unless a case or suite explicitly
 states otherwise. They depend on a live local backend, mutable QA fixtures, and
 operator-selected environment variables, so do not promote them to required CI
 checks until fake-provider isolation, ownership markers, and cleanup are in
 place.
 ## Scope
 Treat `skills/` as the QA control plane:
 - Cases define intent, readiness, thresholds, and required evidence.
 - Probe scripts collect metrics, traces, resource logs, and artifacts.
 - Reports classify each run as `pass`, `fail`, `blocked`, `env_issue`, or
  `flaky`.
 Do not turn `skills/` into a load generator or chaos engine. Call a focused
 tool from a `mode: probe` case when the test needs one, for example k6,
 Locust, pytest-benchmark, Playwright trace collection, Toxiproxy, Docker, or a
 Kubernetes disruption tool.
 ## LangBot Performance Model
 For LangBot, performance is the cost LangBot adds around external systems:
 ```text
 LangBot overhead = end-to-end latency - provider latency - external tool latency - network/fault injection latency
 ```
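As a minimal sketch of that subtraction, assuming each sample already carries its timings in milliseconds. The field names `e2eMs`, `providerMs`, `externalToolMs`, and `networkMs` are hypothetical and are not part of any LangBot API:

```javascript
// Hypothetical sample shape; the field names are illustrative only.
function langbotOverheadMs(sample) {
  const external = sample.providerMs + sample.externalToolMs + sample.networkMs;
  // Round to 3 decimals to keep floating-point noise out of reports.
  return Number((sample.e2eMs - external).toFixed(3));
}

const sample = { e2eMs: 250, providerMs: 200, externalToolMs: 25, networkMs: 10 };
console.log(langbotOverheadMs(sample)); // 15
```

A negative result would suggest that the external timings overlap or were measured on different clocks, so it is worth flagging rather than clamping to zero.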
 Measure user experience and internal composition separately:
 - WebUI load and interaction latency.
 - Debug Chat send-to-first-visible-token and send-to-completion latency.
 - Pipeline, RAG, plugin runtime, MCP, AgentRunner, and persistence segment
  latency.
 - Queue wait time, concurrency, throughput, timeout rate, and p95/p99 latency.
 - Startup, plugin install, knowledge-base ingestion, migration, and recovery
  time.
 Do not report a single message round-trip time as "LangBot performance" unless
 the report also explains external provider/tool/network time.
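The p95/p99 figures above can be computed with a nearest-rank percentile, which is the approach the probe scripts in this change appear to use. The sketch below is a simplified restatement with an extra lower-bound guard, not the exact script code:

```javascript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(values, p) {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  // Clamp the index so p = 0 and p = 100 stay inside the array.
  const index = Math.min(sorted.length - 1, Math.max(0, Math.ceil((p / 100) * sorted.length) - 1));
  return sorted[index];
}

const latenciesMs = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100];
console.log(percentile(latenciesMs, 50)); // 50
console.log(percentile(latenciesMs, 95)); // 100
```

With only ten samples, p95 collapses to the maximum, so high percentiles are only informative when the sample count is large enough.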
 ## Evidence Contract
 Performance and reliability cases should declare the evidence they need:
 - `metrics`: machine-readable latency, throughput, error-rate, or recovery
  metrics, usually `metrics.json`.
 - `resource_log`: CPU, memory, process, connection, queue, or file descriptor
  samples.
 - `trace`: browser, HTTP, database, or runtime trace artifacts.
 - `profile`: CPU, memory, or flamegraph profile artifacts.
 - `backend_log`, `network`, `api_diagnostic`, and `filesystem` as supporting
  evidence when relevant.
 Automation should write `automation-result.json` with these fields when
 available:
 ```json
 {
  "status": "pass",
  "reason": "Probe passed all thresholds.",
  "metrics_summary": {
    "langbot_overhead_p95_ms": 12.4,
    "error_rate": 0
  },
  "thresholds_summary": {
    "langbot_overhead_p95_ms": { "actual": 12.4, "max": 50, "pass": true }
  },
  "artifacts": {
    "metrics_json": "/path/to/metrics.json"
  },
  "evidence_collected": ["metrics", "filesystem"]
 }
 ```
 Synthetic contract probes are useful for checking the QA harness, but they are
 not live product performance results. Label them as contract probes in the case
 title, checks, and report.
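A report consumer could sanity-check the fields shown above before trusting a result. This is a minimal sketch under assumed rules, not a published schema; `checkAutomationResult` is a hypothetical helper name:

```javascript
// Assumed validation rules for automation-result.json; not an official schema.
const ALLOWED_STATUSES = ["pass", "fail", "blocked", "env_issue", "flaky"];

function checkAutomationResult(result) {
  const problems = [];
  if (!ALLOWED_STATUSES.includes(result.status)) problems.push("status");
  if (typeof result.reason !== "string" || result.reason === "") problems.push("reason");
  for (const [name, entry] of Object.entries(result.thresholds_summary ?? {})) {
    // Each threshold entry carries either a "max" or a "min" limit.
    const limit = entry.max ?? entry.min;
    if (!Number.isFinite(entry.actual) || !Number.isFinite(limit) || typeof entry.pass !== "boolean") {
      problems.push(`thresholds_summary.${name}`);
    }
  }
  if (!Array.isArray(result.evidence_collected)) problems.push("evidence_collected");
  return problems; // An empty array means the shape looks plausible.
}
```

Returning a list of problem paths instead of throwing lets a report show every malformed field in one pass.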
 ## Chaos And Reliability Rules
 Chaos tests must be narrow and reversible:
 - Declare the fault model in `fault_model_json`.
 - Record blast radius, target component, injection method, duration, and abort
  conditions.
 - Capture recovery checks and cleanup steps in the case.
 - Classify unavailable dependencies as `env_issue` unless the target behavior
  is LangBot's handling of that dependency failure.
 - Do not run destructive fault injection against a shared or production-like
  instance without explicit operator approval.
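The rules above name what a fault model must record but not the key names. The sketch below checks for completeness using hypothetical keys (`target_component`, `injection_method`, `blast_radius`, `duration_ms`, `abort_conditions`); the real `fault_model_json` layout may differ:

```javascript
// Hypothetical key names; the rules only name the concepts to record.
const REQUIRED_FAULT_FIELDS = [
  "target_component",
  "injection_method",
  "blast_radius",
  "duration_ms",
  "abort_conditions",
];

// Returns the required fields that are absent or empty in a fault model.
function missingFaultFields(faultModel) {
  return REQUIRED_FAULT_FIELDS.filter((key) => {
    const value = faultModel[key];
    return value === undefined || value === null || value === ""
      || (Array.isArray(value) && value.length === 0);
  });
}

// Illustrative values only.
const providerTimeout = {
  target_component: "fake provider endpoint",
  injection_method: "delayed HTTP response",
  blast_radius: "single local LangBot instance",
  duration_ms: 30000,
  abort_conditions: ["backend stops answering /healthz"],
};
console.log(missingFaultFields(providerTimeout)); // []
```

A case could refuse to inject anything while this list is non-empty, which keeps undeclared faults out of a run.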
 Recommended first fault models:
 - Provider timeout or HTTP 429 from a fake provider endpoint.
 - Plugin runtime disconnect/reconnect in a local instance.
 - MCP stdio server exits mid-call.
 - RAG parser fixture fails once and recovers on retry.
 - Backend API endpoint returns 5xx from a controlled local proxy.
 ## Starter Live Probes
 The starter gate separates QA-harness contracts from live product checks:
 - `langbot-overhead-accounting-contract` verifies that reports can carry
  overhead accounting metrics. It uses deterministic synthetic samples and is
  not live product performance.
 - `langbot-fault-taxonomy-contract` verifies that fault scenarios declare
  expected status, recovery, and cleanup before destructive chaos tests are
  added.
 - `langbot-live-backend-latency` checks the unauthenticated `/healthz`
  endpoint for basic backend responsiveness.
 - `langbot-live-control-plane-api` checks `/healthz` and
  `/api/v1/system/info` for HTTP 200, JSON `code: 0`, response shape, and
  per-endpoint p95 latency.
 - `langbot-live-backend-log-health` scans the recent backend log window for
  fail-severity runtime findings. It is the reliability guard that should fail
  the gate when HTTP probes pass but backend logs contain Traceback, ImportError,
  ERROR, unclosed sessions, or unawaited coroutine signals.
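The log-health idea can be sketched as a line scan. The patterns below are assumptions based on the signal names listed above and on common Python log messages; the actual probe's pattern list may differ:

```javascript
// Assumed fail-severity patterns; not the probe's real configuration.
const FAIL_PATTERNS = [
  { id: "traceback", regex: /Traceback \(most recent call last\)/ },
  { id: "import_error", regex: /\bImportError\b/ },
  { id: "error_level", regex: /\bERROR\b/ },
  { id: "unclosed_session", regex: /Unclosed client session/i },
  { id: "unawaited_coroutine", regex: /was never awaited/ },
];

// Returns one finding per (line, pattern) match, with 1-based line numbers.
function scanLogLines(lines) {
  const findings = [];
  lines.forEach((line, index) => {
    for (const { id, regex } of FAIL_PATTERNS) {
      if (regex.test(line)) findings.push({ id, line_number: index + 1 });
    }
  });
  return findings;
}
```

The regexes deliberately omit the `g` flag so that `regex.test` carries no state between lines.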
 Do not treat these starter live probes as Debug Chat or model-provider
 performance. They are control-plane readiness checks; user-facing performance
 needs browser/WebSocket/message-path measurements.
 ## Debug Chat Load And Fake Provider Baseline
 Use `langbot-fake-provider-debug-chat-load` before real-provider load checks.
 The setup automation starts a local OpenAI-compatible fake provider, registers
 it as a normal LangBot provider/model, configures a local-agent pipeline, resets
 Debug Chat, and then drives concurrent WebSocket messages through the live
 backend.
 This is not a mocked backend test. It still exercises:
 - provider/model persistence and runtime reload;
 - LiteLLM OpenAI-compatible requester path;
 - local-agent runner selection and pipeline execution;
 - Debug Chat WebSocket adapter and broadcast behavior;
 - backend concurrency, timeout, and error-rate accounting.
 The fake provider is deterministic and can inject controlled latency or faults
 with `LANGBOT_FAKE_PROVIDER_*` variables, so it is the baseline for LangBot
 message-path overhead. A fake-provider process keeps process-global config,
 request counters, and recent request history; run fake-provider probes serially
 or give each run its own provider instance. Concurrent probes against the same
 fake-provider URL can reset or reconfigure each other's metrics.
 The probe uses a unique expected response token per request because Debug Chat
 broadcasts messages to every connection in the same session; unique tokens
 prevent one connection from counting another connection's response as its own.
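A minimal sketch of the unique-token idea follows. The token format and the `content` message field are hypothetical; any per-request unique string would serve:

```javascript
// Hypothetical token format: fixed-width index so no token is a prefix of another.
function makeExpectedToken(runId, index) {
  return `LBS-${runId}-${String(index).padStart(4, "0")}`;
}

// Keeps only the broadcast messages that contain this connection's own token.
function ownResponses(broadcastMessages, token) {
  return broadcastMessages.filter((message) => String(message.content || "").includes(token));
}

const tokenA = makeExpectedToken("demo", 1);
const tokenB = makeExpectedToken("demo", 2);
const broadcast = [{ content: `reply ${tokenA}` }, { content: `reply ${tokenB}` }];
console.log(ownResponses(broadcast, tokenA).length); // 1
```

Because every connection in the session sees both replies, filtering by token is what lets each connection time only its own response.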
 When the fake provider is used, reports also include provider-side timing in
 `metrics.json`:
 - `fake_provider.duration_ms` and `fake_provider.first_content_chunk_ms`
  measure the controlled provider itself.
 - `provider_timing.send_to_provider_start_ms` estimates WebSocket ingress,
  pipeline dispatch, runner setup, and requester time before the provider
  receives the request.
 - `provider_timing.provider_finish_to_ws_final_ms` estimates the path from
  provider completion back to the final Debug Chat WebSocket response.
 - `provider_timing.langbot_overhead_estimate_ms` is the sum of those two
  LangBot-side segments when wall-clock timestamps can be matched by the
  unique expected response token.
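As a worked example of the two LangBot-side segments, the sketch below is a simplified restatement of the timing logic, using the timestamp field names that appear in the probe code. It assumes the probe and the fake provider share one clock, which holds when both run on the same host:

```javascript
// Returns later - earlier, or null when either timestamp is missing.
function finiteDelta(later, earlier) {
  return Number.isFinite(later) && Number.isFinite(earlier) ? later - earlier : null;
}

// sample: probe-side timestamps; request: fake-provider-side timestamps (epoch ms).
function overheadEstimateMs(sample, request) {
  const beforeProvider = finiteDelta(request.started_epoch_ms, sample.sent_epoch_ms);
  const afterProvider = finiteDelta(sample.finished_epoch_ms, request.finished_epoch_ms);
  return beforeProvider === null || afterProvider === null ? null : beforeProvider + afterProvider;
}

const probeSample = { sent_epoch_ms: 1000, finished_epoch_ms: 1220 };
const providerRequest = { started_epoch_ms: 1012, finished_epoch_ms: 1212 };
console.log(overheadEstimateMs(probeSample, providerRequest)); // 20
```

Here 12 ms pass before the provider sees the request and 8 ms pass after it finishes, so 20 ms of the 220 ms round trip is attributed to LangBot.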
 After the baseline passes, run `langbot-fake-provider-debug-chat-slow-load` to
 keep the same live backend path while injecting deterministic streaming latency.
 Run `langbot-fake-provider-debug-chat-fault-recovery` to inject bounded HTTP
 provider failures and require both observed failures and later successful
 requests. The fault-recovery case is deliberately sequential because failed
 Debug Chat responses do not carry a unique success token that can be attributed
 to one concurrent connection.
 Run `langbot-fake-provider-debug-chat-cross-pipeline-isolation` separately via
 `langbot-debug-chat-isolation-gate`. Current LangBot releases may fail it because
 of product bug [#2286](https://github.com/langbot-app/LangBot/issues/2286), where
 Debug Chat replies can read singleton WebSocket proxy pipeline state after a
 later message overwrites it. Treat that failure as regression evidence for the
 product fix rather than as a fake-provider latency finding.
 Use `langbot-space-debug-chat-concurrency-smoke` after the fake-provider
 baseline. It runs a deliberately small real Space-provider batch and reports
 user-visible latency, not pure LangBot overhead. Space/model/network failures
 are dependency findings until the fake baseline shows the same symptom.
 If a Space smoke passes but the log guard finds Tracebacks from telemetry
 posting, classify that separately as `telemetry-proxy-noise` instead of clearing
 the proxy or treating the Debug Chat path as failed.
 Useful commands:
 ```bash
 rtk bin/lbs test run langbot-fake-provider-debug-chat-load --run-id langbot-fake-load-local
 rtk bin/lbs test run langbot-fake-provider-debug-chat-slow-load --run-id langbot-fake-slow-local
 rtk bin/lbs test run langbot-fake-provider-debug-chat-fault-recovery --run-id langbot-fake-fault-local
 rtk bin/lbs suite run langbot-debug-chat-isolation-gate --run-id langbot-debug-chat-isolation-local --include-manual-check
 rtk bin/lbs test run langbot-space-debug-chat-concurrency-smoke --run-id langbot-space-smoke-local
 rtk bin/lbs suite run langbot-debug-chat-load-gate --run-id langbot-debug-chat-load-local --include-manual-check
 ```
 ## Gate Layers
 Use the smallest gate that answers the quality question:
 - `langbot-performance-contract-gate`: fast synthetic checks for report shape,
  threshold accounting, and fault taxonomy. Good for PR feedback when no live
  service is running.
 - `langbot-live-backend-gate`: live backend `/healthz`,
  `/api/v1/system/info`, and backend log health. Good after starting a local
  LangBot backend.
 - `langbot-user-path-performance-gate`: browser-visible user path performance,
  starting with Pipeline Debug Chat send-to-visible-completion latency. Run it
  only when the browser profile and target pipeline are ready.
 - `langbot-debug-chat-load-gate`: manual WebSocket Debug Chat load checks,
  starting with controlled fake-provider baseline, slow-provider, and
  fault-recovery profiles, plus an optional low-volume real Space-provider
  smoke. Run fake-provider cases serially when they share a provider URL.
 - `langbot-debug-chat-isolation-gate`: manual cross-pipeline Debug Chat
  isolation regression gate. Current releases may fail because of #2286; keep it
  separate from the normal load gate until that product fix lands.
 - `langbot-performance-reliability-gate`: combined starter gate for synthetic
  contracts plus live backend checks.
 Keep environment diagnostics separate from product regressions. For example, a
 SOCKS proxy without Python `socksio` support should be fixed or clearly
 classified by `bin/lbs env doctor`; do not hide the resulting backend
 Traceback in reports.
 ## Debug Chat Performance
 `pipeline-debug-chat-performance` reuses the browser Debug Chat automation and
 adds `metrics.json`, `metrics_summary`, and `thresholds_summary` to
 `automation-result.json`.
 Current metric:
 ```text
 response_duration_ms = prompt send -> expected assistant response visible and stable
 ```
 This is a user-path metric, not pure LangBot overhead. If it regresses, inspect
 provider latency, model route health, plugin/runtime logs, WebSocket behavior,
 and browser console/network evidence before attributing the whole duration to
 LangBot.
 ### User-Path Gate Runbook
 1. Start the backend and frontend. The frontend must be launched with
   `VITE_API_BASE_URL="$LANGBOT_BACKEND_URL"` so browser API calls reach the
   backend.
 2. Run `node scripts/e2e/ensure-local-agent-pipeline.mjs --write-env`. The
   setup refreshes the local QA login, skips the wizard, prepares a Debug Chat
   pipeline, scans Space models, tests candidates, writes tested fallback
   models, and writes the selected pipeline/model env values to
   `skills/.env.local`.
 3. If setup returns `env_issue`, read `model_tests` and provider errors first.
   A missing Space key, failed Space scan, or unavailable model route is not a
   LangBot performance regression.
 4. Run
   `bin/lbs suite run langbot-user-path-performance-gate --include-manual-check`.
 5. Interpret `response_p95_ms` as browser-visible send-to-completion time. It
   includes provider latency; use backend logs and model test evidence to
   separate LangBot overhead from the external model route.
 The setup keeps a `max-round` value in the generated pipeline config only
 because the current backend truncator still reads that field directly. Do not
 use it as a quality requirement for future local-agent behavior.
 ## Running The First Gate
 Start with the reusable suite:
 ```bash
 rtk bin/lbs suite plan langbot-performance-reliability-gate
 rtk bin/lbs suite start langbot-performance-reliability-gate --run-id langbot-perf-rel-local
 ```
 Run synthetic contract probes first. Run live probes only after the selected
 backend/frontend instance is reachable and the run owner accepts any fault
 scope.
@@ -0,0 +1,13 @@
 id: langbot-debug-chat-isolation-gate
 title: "LangBot Debug Chat isolation gate"
 description: "Manual/non-required cross-pipeline Debug Chat isolation gate. Current releases may fail this gate because of product bug #2286; use it as regression evidence after the routing fix lands."
 type: reliability
 priority: p1
 tags:
  - reliability
  - debug-chat
  - websocket
  - isolation
  - concurrency
 cases:
  - langbot-fake-provider-debug-chat-cross-pipeline-isolation
@@ -0,0 +1,15 @@
 id: langbot-debug-chat-load-gate
 title: "LangBot Debug Chat load gate"
 description: "Manual/non-required message-path load checks for Pipeline Debug Chat: controlled fake-provider baseline, slow-provider and fault-recovery profiles, plus optional real Space-provider smoke. Cross-pipeline isolation is split into langbot-debug-chat-isolation-gate because current releases may fail it due to product bug #2286."
 type: performance
 priority: p1
 tags:
  - performance
  - debug-chat
  - websocket
  - load
 cases:
  - langbot-fake-provider-debug-chat-load
  - langbot-fake-provider-debug-chat-slow-load
  - langbot-fake-provider-debug-chat-fault-recovery
  - langbot-space-debug-chat-concurrency-smoke
@@ -0,0 +1,14 @@
 id: langbot-live-backend-gate
 title: "LangBot live backend reliability gate"
 description: "Live backend control-plane responsiveness and runtime log health checks for a locally running LangBot instance."
 type: reliability
 priority: p1
 tags:
  - performance
  - reliability
  - live-backend
  - metrics
 cases:
  - langbot-live-backend-latency
  - langbot-live-control-plane-api
  - langbot-live-backend-log-health
@@ -0,0 +1,13 @@
 id: langbot-performance-contract-gate
 title: "LangBot performance contract gate"
 description: "Fast synthetic contract checks for performance metric accounting and non-destructive reliability fault taxonomy."
 type: contract
 priority: p1
 tags:
  - performance
  - reliability
  - contract
  - metrics
 cases:
  - langbot-overhead-accounting-contract
  - langbot-fault-taxonomy-contract
@@ -0,0 +1,16 @@
 id: langbot-performance-reliability-gate
 title: "LangBot performance and reliability starter gate"
 description: "Starter gate for LangBot performance accounting, live backend control-plane latency, and non-destructive fault taxonomy checks."
 type: reliability
 priority: p1
 tags:
  - performance
  - reliability
  - metrics
  - chaos
 cases:
  - langbot-overhead-accounting-contract
  - langbot-fault-taxonomy-contract
  - langbot-live-backend-latency
  - langbot-live-control-plane-api
  - langbot-live-backend-log-health
@@ -0,0 +1,12 @@
 id: langbot-user-path-performance-gate
 title: "LangBot user-path performance gate"
 description: "Browser-visible performance checks for user-facing LangBot paths such as Pipeline Debug Chat."
 type: performance
 priority: p1
 tags:
  - performance
  - browser
  - debug-chat
  - user-path
 cases:
  - pipeline-debug-chat-performance
@@ -0,0 +1,23 @@
 id: telemetry-proxy-noise
 title: "Telemetry posting fails through the proxy while the target flow succeeds"
 date: 2026-06-25
 category: env_issue
 symptoms:
  - "The target Debug Chat or provider smoke request completes successfully."
  - "The same log window contains a Traceback for telemetry posting."
  - "The traceback references the Space telemetry endpoint."
 patterns:
  - "Failed to post telemetry"
  - "https://space.langbot.app/api/v1/telemetry"
  - "httpx.ConnectError"
 likely_causes:
  - "The backend process inherited proxy settings that are required for model/provider access but unreliable for telemetry posting."
  - "The telemetry endpoint is temporarily unreachable through the local proxy route."
  - "TLS or proxy negotiation failed for the non-critical telemetry request."
 fix_steps:
  - "Keep the proxy configuration needed for model/provider access; do not clear it only to hide telemetry noise."
  - "Check that uppercase and lowercase proxy variables are consistent before rerunning a live Space smoke."
  - "Classify the target flow and log-health result separately: a successful Debug Chat run can still have an environment log-health finding."
 verification: "A rerun shows the target case success patterns and no telemetry Traceback in the scanned log window, or the report explicitly records the telemetry issue as environment noise."
 related_cases:
  - langbot-space-debug-chat-concurrency-smoke
@@ -1,5 +1,7 @@
 import { existsSync } from "node:fs";
 import { spawnSync } from "node:child_process";
 import { Socket } from "node:net";
 import { join } from "node:path";
 import type { CommandContext } from "../types.ts";
 import { parseOptions } from "../cli.ts";
 import { loadEnv } from "../fs.ts";
@@ -88,6 +90,37 @@ function compareProxyPair(env: Record<string, string>, upper: string, lower: str
  return null;
 }
 function envValue(env: Record<string, string>, key: string): string {
  return process.env[key] ?? env[key] ?? "";
 }
 function activeSocksProxy(env: Record<string, string>): { key: string; value: string } | null {
  for (const key of ["ALL_PROXY", "all_proxy", "HTTPS_PROXY", "https_proxy", "HTTP_PROXY", "http_proxy"]) {
    const value = envValue(env, key);
    if (/^socks/i.test(value)) return { key, value };
  }
  return null;
 }
 function checkSocksio(env: Record<string, string>): string | null {
  const proxy = activeSocksProxy(env);
  if (!proxy) return null;
  const repo = env.LANGBOT_REPO;
  const python = repo ? join(repo, ".venv", "bin", "python") : "";
  if (!python || !existsSync(python)) {
    return `SOCKS proxy ${proxy.key} is configured (${redactEnvValue(proxy.key, proxy.value)}), but LangBot venv python was not found; after creating the venv, verify it can import socksio.`;
  }
  const result = spawnSync(python, ["-c", "import socksio"], {
    encoding: "utf8",
    timeout: 5000,
  });
  if (result.status === 0) return null;
  return `SOCKS proxy ${proxy.key} is configured (${redactEnvValue(proxy.key, proxy.value)}), but ${python} cannot import socksio; run \`${python} -m pip install socksio\` or start LangBot without SOCKS proxy env.`;
 }
 export async function commandEnvDoctor(ctx: CommandContext): Promise<number> {
  const env = loadEnv(ctx.root);
  const failures: string[] = [];
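The proxy lookup added in this hunk can be exercised on its own. What follows is a minimal sketch, not the CLI's code: `findSocksProxy`, `lookup`, and the `processEnv` parameter are hypothetical stand-ins (the diff reads `process.env` directly), used here only to show that the process environment takes precedence over the `.env` file and that only `socks*` values count.

```typescript
type Env = Record<string, string | undefined>;

// Proxy variables in the same order as activeSocksProxy in the diff.
const PROXY_KEYS = ["ALL_PROXY", "all_proxy", "HTTPS_PROXY", "https_proxy", "HTTP_PROXY", "http_proxy"];

// Hypothetical stand-in for envValue: the process environment wins over the .env file.
function lookup(processEnv: Env, fileEnv: Env, key: string): string {
  return processEnv[key] ?? fileEnv[key] ?? "";
}

// Returns the first proxy variable whose value starts with "socks" (case-insensitive).
function findSocksProxy(processEnv: Env, fileEnv: Env): { key: string; value: string } | null {
  for (const key of PROXY_KEYS) {
    const value = lookup(processEnv, fileEnv, key);
    if (/^socks/i.test(value)) return { key, value };
  }
  return null;
}

// Only the .env file sets a SOCKS proxy: it is detected.
const fromFile = findSocksProxy({}, { ALL_PROXY: "socks5://127.0.0.1:7890" });

// The process environment overrides the file with an HTTP proxy: nothing to check.
const overridden = findSocksProxy(
  { ALL_PROXY: "http://127.0.0.1:8080" },
  { ALL_PROXY: "socks5://127.0.0.1:7890" },
);

console.log(fromFile?.key, overridden);
```

Under these assumptions the socksio import check would run only in the first case, which matches the early `return null` in `checkSocksio` when no SOCKS proxy is active.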
@@ -117,6 +150,8 @@ export async function commandEnvDoctor(ctx: CommandContext): Promise<number> {
  ]) {
    if (mismatch) failures.push(mismatch);
  }
  const socksioFailure = checkSocksio(env);
  if (socksioFailure) failures.push(socksioFailure);
  for (const [label, result] of await Promise.all([
    checkUrl("LANGBOT_BACKEND_URL", env.LANGBOT_BACKEND_URL).then((result) => ["LANGBOT_BACKEND_URL", result] as const),
@@ -465,6 +465,41 @@ function outputTail(value: string | Buffer | null | undefined): string {
  return String(value ?? "").trim().slice(-4000);
 }
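 // Exit code 0 = pass, 2 = classified non-pass outcome (blocked, env_issue, flaky), 1 = any other status.
 function exitStatusFromResultStatus(status: string): number {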
  if (status === "pass") return 0;
  if (status === "blocked" || status === "env_issue" || status === "flaky") return 2;
  return 1;
 }
 function executionStatusFromExitStatus(status: number): string {
  if (status === 0) return "ok";
  if (status === 2) return "classified";
  return "nonzero";
 }
 function executionFromCaseResultFile(caseItem: Record<string, unknown>): Record<string, unknown> | null {
  const resultPath = join(String(caseItem.evidence_dir), "result.json");
  if (!existsSync(resultPath)) return null;
  try {
    const parsed = JSON.parse(readFileSync(resultPath, "utf8")) as Record<string, unknown>;
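    // Reject result files that belong to another case or run, so a stale result cannot stand in for this execution.
    if (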
      parsed.case_id !== caseItem.id ||
      parsed.run_id !== caseItem.run_id ||
      typeof parsed.status !== "string"
    ) return null;
    const exitStatus = exitStatusFromResultStatus(parsed.status);
    return {
      status: executionStatusFromExitStatus(exitStatus),
      exit_status: exitStatus,
      reason: typeof parsed.reason === "string" ? parsed.reason : "result.json completed",
      result_status: parsed.status,
      result_json: resultPath,
    };
  } catch {
    return null;
  }
 }
 function executionProblemStatus(executions: Array<Record<string, unknown>>): string {
  const statuses = executions.map((item) => String(item.status));
  if (statuses.includes("nonzero")) return "fail";
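The two mapping helpers in this hunk compose into a small lookup table. The sketch below copies them verbatim so it runs on its own; the `rows` table and the `"unknown"` status are illustrative additions, not part of the diff.

```typescript
// Copies of the two mapping helpers from the diff, so the table below runs on its own.
function exitStatusFromResultStatus(status: string): number {
  if (status === "pass") return 0;
  if (status === "blocked" || status === "env_issue" || status === "flaky") return 2;
  return 1;
}

function executionStatusFromExitStatus(status: number): string {
  if (status === 0) return "ok";
  if (status === 2) return "classified";
  return "nonzero";
}

// Result statuses from testResultStatusValues, plus one hypothetical unknown value.
const rows = ["pass", "fail", "blocked", "env_issue", "flaky", "unknown"].map((resultStatus) => {
  const exitStatus = exitStatusFromResultStatus(resultStatus);
  return { resultStatus, exitStatus, execution: executionStatusFromExitStatus(exitStatus) };
});

for (const row of rows) {
  console.log(`${row.resultStatus} -> exit ${row.exitStatus} (${row.execution})`);
}
```

The design point worth noting is that any status outside the known set falls through to exit code 1, so an unexpected or misspelled status in a result file is treated as a failure rather than as a classified outcome.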
@@ -523,12 +558,18 @@ export function commandSuiteRun(ctx: CommandContext): number {
      encoding: "utf8",
      stdio: options.json === true ? "pipe" : "inherit",
    });
-    const status = result.error ? 1 : result.status ?? 1;
+    const fileExecution = result.error ? executionFromCaseResultFile(caseItem) : null;
    const status = typeof fileExecution?.exit_status === "number"
      ? fileExecution.exit_status
      : result.error ? 1 : result.status ?? 1;
    executions.push({
      id: caseItem.id,
-      status: status === 0 ? "ok" : "nonzero",
+      status: fileExecution?.status ?? executionStatusFromExitStatus(status),
      exit_status: status,
-      reason: result.error?.message || "",
+      reason: fileExecution?.reason ?? result.error?.message ?? "",
      result_status: fileExecution?.result_status,
      result_json: fileExecution?.result_json,
      spawn_error: fileExecution && result.error ? result.error.message : undefined,
      stdout: outputTail(result.stdout),
      stderr: outputTail(result.stderr),
    });
@@ -271,7 +271,7 @@ function reportTemplate(mode: string): Record<string, string> {
      target_tested: "Probe target, endpoint, file, command, or service actually checked",
      execution_path: "automation script | shell command | direct API | other",
      probe_result: "What the probe observed",
-      logs_or_artifacts: "Log, filesystem, API, or other artifact paths collected",
+      metrics_or_artifacts: "Metrics, logs, filesystem artifacts, traces, or profiles collected",
      diagnostics: "Extra diagnostics used, if any",
      matched_troubleshooting: "Troubleshooting ids matched, if any",
      assets_to_update: "New case/reference/troubleshooting entries to add",
@@ -320,7 +320,7 @@ function manualEvidenceTemplate(mode: string): ManualEvidenceTemplate {
      target_tested: "TODO: probe target, endpoint, file, command, or service actually checked",
      execution_path: "TODO: automation script | shell command | direct API | other",
      probe_result: "TODO: observed probe result",
-      logs_or_artifacts: "TODO: evidence paths or skipped reason",
+      metrics_or_artifacts: "TODO: metrics, logs, filesystem artifacts, traces, or profiles collected",
      diagnostics: "TODO: additional diagnostics used, if any",
      matched_troubleshooting: "TODO: troubleshooting ids matched, if any",
      assets_to_update: "TODO: case/reference/troubleshooting updates to make",
@@ -1099,6 +1099,41 @@ function executionTail(value: string | Buffer | null | undefined): string {
  return String(value ?? "").trim().slice(-4000);
 }
 function exitStatusFromResultStatus(status: string): number {
  if (status === "pass") return 0;
  if (status === "blocked" || status === "env_issue" || status === "flaky") return 2;
  return 1;
 }
 function executionStatusFromExitStatus(status: number): string {
  if (status === 0) return "ok";
  if (status === 2) return "classified";
  return "nonzero";
 }
 function executionFromAutomationResultFile(
  evidenceDir: string,
  caseId: string,
  runId: string,
 ): { status: string; exit_status: number; reason: string; result_status: string; path: string } | null {
  const resultPath = join(evidenceDir, "automation-result.json");
  if (!existsSync(resultPath)) return null;
  try {
    const parsed = JSON.parse(readFileSync(resultPath, "utf8")) as Record<string, unknown>;
    if (parsed.case_id !== caseId || parsed.run_id !== runId || typeof parsed.status !== "string") return null;
    const exitStatus = exitStatusFromResultStatus(parsed.status);
    return {
      status: executionStatusFromExitStatus(exitStatus),
      exit_status: exitStatus,
      reason: typeof parsed.reason === "string" ? parsed.reason : "automation-result.json completed",
      result_status: parsed.status,
      path: resultPath,
    };
  } catch {
    return null;
  }
 }
 function runSetupAutomation(
  ctx: CommandContext,
  item: StructuredItem,
@@ -1224,6 +1259,30 @@ export function commandTestRun(ctx: CommandContext): number {
  });
  if (result.error) {
    const fileExecution = executionFromAutomationResultFile(
      run.automation.evidence_dir,
      String(run.case.id),
      run.run_id,
    );
    if (fileExecution) {
      if (options.json !== true) {
        console.error(`WARN: automation spawn reported an error, but ${fileExecution.path} completed: ${result.error.message}`);
      }
      if (options.json === true) {
        console.log(JSON.stringify({
          run,
          setup_executions: setupExecutions,
          automation_execution: {
            ...fileExecution,
            spawn_error: result.error.message,
            stdout: executionTail(result.stdout),
            stderr: executionTail(result.stderr),
          },
          exit_status: fileExecution.exit_status,
        }, null, 2));
      }
      return fileExecution.exit_status;
    }
    if (options.json !== true) console.error(`ERROR: failed to run automation: ${result.error.message}`);
    if (options.json === true) {
      console.log(JSON.stringify({
@@ -1247,7 +1306,7 @@ export function commandTestRun(ctx: CommandContext): number {
      run,
      setup_executions: setupExecutions,
      automation_execution: {
-        status: status === 0 ? "ok" : "nonzero",
+        status: executionStatusFromExitStatus(status),
        exit_status: status,
        stdout: executionTail(result.stdout),
        stderr: executionTail(result.stderr),
@@ -1311,6 +1370,7 @@ function renderMarkdownReport(report: TestReport): string {
  const environment = report.environment;
  const logGuard = report.log_guard;
  const troubleshooting = report.troubleshooting;
  const automation = report.automation_result;
  const lines: string[] = [];
  lines.push(`# Test Report: ${reportCase.id}`);
@@ -1323,20 +1383,41 @@ function renderMarkdownReport(report: TestReport): string {
  lines.push(`Type: ${reportCase.type}`);
  lines.push("");
  lines.push("## Result");
  if (automation.status === "loaded" && automation.result) {
    lines.push(`- result: ${automation.result}`);
    if (automation.reason) lines.push(`- reason: ${automation.reason}`);
    if (automation.url) lines.push(`- target_tested: ${automation.url}`);
    if (automation.path) lines.push(`- automation_result: ${automation.path}`);
    if (automation.artifacts) lines.push(`- artifacts: ${JSON.stringify(automation.artifacts)}`);
  } else {
    lines.push(`- result: ${evidence.result}`);
    for (const [key, value] of Object.entries(evidence)) {
      if (key !== "result") lines.push(`- ${key}: ${value}`);
    }
  }
  lines.push("");
  lines.push("## Automation Result");
-  lines.push(`- status: ${report.automation_result.status}`);
-  if (report.automation_result.path) lines.push(`- path: ${report.automation_result.path}`);
-  if (report.automation_result.result) lines.push(`- result: ${report.automation_result.result}`);
-  if (report.automation_result.reason) lines.push(`- reason: ${report.automation_result.reason}`);
-  if (report.automation_result.started_at_local) lines.push(`- started_at_local: ${report.automation_result.started_at_local}`);
-  if (report.automation_result.finished_at_local) lines.push(`- finished_at_local: ${report.automation_result.finished_at_local}`);
-  if (report.automation_result.url) lines.push(`- url: ${report.automation_result.url}`);
-  if (report.automation_result.expected_text) lines.push(`- expected_text: ${report.automation_result.expected_text}`);
+  lines.push(`- status: ${automation.status}`);
+  if (automation.path) lines.push(`- path: ${automation.path}`);
+  if (automation.result) lines.push(`- result: ${automation.result}`);
+  if (automation.reason) lines.push(`- reason: ${automation.reason}`);
+  if (automation.duration_ms !== undefined) lines.push(`- duration_ms: ${automation.duration_ms}`);
+  if (automation.started_at_local) lines.push(`- started_at_local: ${automation.started_at_local}`);
+  if (automation.finished_at_local) lines.push(`- finished_at_local: ${automation.finished_at_local}`);
+  if (automation.url) lines.push(`- url: ${automation.url}`);
  if (automation.expected_text) lines.push(`- expected_text: ${automation.expected_text}`);
  if (automation.metrics_summary) {
    lines.push("- metrics_summary:");
    lines.push(`  ${JSON.stringify(automation.metrics_summary)}`);
  }
  if (automation.thresholds_summary) {
    lines.push("- thresholds_summary:");
    lines.push(`  ${JSON.stringify(automation.thresholds_summary)}`);
  }
  if (automation.artifacts) {
    lines.push("- artifacts:");
    lines.push(`  ${JSON.stringify(automation.artifacts)}`);
  }
  lines.push("");
  lines.push("## Environment");
  for (const [key, value] of Object.entries(environment)) lines.push(`- ${key}=${value}`);
@@ -126,6 +126,9 @@ function validateCaseItem(root: string, item: StructuredItem, skillNames: Set<st
    ...validateEnvKeyScalar(item, "automation_pipeline_url_env"),
    ...validateEnvKeyScalar(item, "automation_pipeline_name_env"),
    ...validateJsonScalar(item, "automation_filesystem_checks_json"),
    ...validateJsonScalar(item, "metrics_thresholds_json"),
    ...validateJsonScalar(item, "load_profile_json"),
    ...validateJsonScalar(item, "fault_model_json"),
    ...listValue(item.fields, "setup_automation").flatMap((entry) => (
      validateSetupAutomationEntry(root, entry, caseIds).map((error) => `${item.path}: ${error}`)
    )),
@@ -183,10 +186,62 @@ function validateCaseItem(root: string, item: StructuredItem, skillNames: Set<st
  if (timeout && (!/^\d+$/.test(timeout) || Number.parseInt(timeout, 10) <= 0)) {
    errors.push(`${item.path}: 'automation_response_timeout_ms' must be a positive integer string`);
  }
  for (const key of [
    "automation_debug_chat_load_requests",
    "automation_debug_chat_load_concurrency",
    "automation_debug_chat_load_timeout_ms",
    "automation_debug_chat_load_response_p95_ms",
    "automation_debug_chat_load_first_response_p95_ms",
  ]) {
    const value = scalar(item.fields, key);
    if (value && (!/^\d+$/.test(value) || Number.parseInt(value, 10) <= 0)) {
      errors.push(`${item.path}: '${key}' must be a positive integer string`);
    }
  }
  for (const key of [
    "automation_debug_chat_load_min_error_count",
    "automation_debug_chat_load_min_ok_count",
    "automation_debug_chat_load_min_provider_fault_count",
    "automation_fake_provider_first_token_delay_ms",
    "automation_fake_provider_chunk_delay_ms",
    "automation_fake_provider_chunk_count",
    "automation_fake_provider_fail_first_n",
    "automation_fake_provider_fail_every_n",
  ]) {
    const value = scalar(item.fields, key);
    if (value && (!/^\d+$/.test(value) || Number.parseInt(value, 10) < 0)) {
      errors.push(`${item.path}: '${key}' must be a non-negative integer string`);
    }
  }
  for (const key of ["automation_debug_chat_load_max_error_rate", "automation_debug_chat_load_min_error_rate"]) {
    const value = scalar(item.fields, key);
    // Accepts "0", "1", "0.25", and "1.0"; rejects ".5", "1.5", "5e-1", and negative values.
    if (value && (!/^(?:0(?:\.\d+)?|1(?:\.0+)?)$/.test(value))) {
      errors.push(`${item.path}: '${key}' must be a number string between 0 and 1`);
    }
  }
  const fakeProviderFaultStatus = scalar(item.fields, "automation_fake_provider_fault_status");
  if (fakeProviderFaultStatus) {
    const parsed = Number.parseInt(fakeProviderFaultStatus, 10);
    if (!/^\d+$/.test(fakeProviderFaultStatus) || parsed < 400 || parsed > 599) {
      errors.push(`${item.path}: 'automation_fake_provider_fault_status' must be an HTTP 4xx or 5xx status string`);
    }
  }
  const streamOutput = scalar(item.fields, "automation_stream_output");
  if (streamOutput && !["0", "1", "false", "true"].includes(streamOutput)) {
    errors.push(`${item.path}: 'automation_stream_output' must be one of 0, 1, false, or true`);
  }
  for (const key of [
    "automation_debug_chat_load_stream",
    "automation_debug_chat_load_reset",
    "automation_debug_chat_load_fail_on_final_mismatch",
    "automation_fake_provider_fail_after_first_chunk",
    "automation_fake_provider_dynamic_response",
  ]) {
    const value = scalar(item.fields, key);
    if (value && !["0", "1", "false", "true"].includes(value)) {
      errors.push(`${item.path}: '${key}' must be one of 0, 1, false, or true`);
    }
  }
  const imageBase64Fixture = scalar(item.fields, "automation_image_base64_fixture");
  if (imageBase64Fixture && !existsSync(join(root, imageBase64Fixture))) {
    errors.push(`${item.path}: automation image fixture does not exist: ${imageBase64Fixture}`);
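The scalar checks added to `validateCaseItem` reduce to a handful of string predicates. The sketch below restates them as standalone functions; the predicate names are hypothetical and exist only for illustration, while the regular expressions and bounds are taken from the hunk above.

```typescript
// Hypothetical standalone predicates mirroring the string checks in validateCaseItem.
const isPositiveInt = (value: string): boolean =>
  /^\d+$/.test(value) && Number.parseInt(value, 10) > 0;

// Digits only, so the parsed value can never be negative.
const isNonNegativeInt = (value: string): boolean => /^\d+$/.test(value);

// A decimal between 0 and 1 inclusive, with a mandatory leading digit.
const isRate = (value: string): boolean => /^(?:0(?:\.\d+)?|1(?:\.0+)?)$/.test(value);

// Boolean-like flags as accepted for the stream/reset style fields.
const isFlag = (value: string): boolean => ["0", "1", "false", "true"].includes(value);

// HTTP 4xx or 5xx status, as required for automation_fake_provider_fault_status.
const isFaultStatus = (value: string): boolean => {
  if (!/^\d+$/.test(value)) return false;
  const parsed = Number.parseInt(value, 10);
  return parsed >= 400 && parsed <= 599;
};

const samples: Array<[string, boolean]> = [
  ["load_requests=20", isPositiveInt("20")],
  ["load_requests=0", isPositiveInt("0")],
  ["fail_first_n=0", isNonNegativeInt("0")],
  ["max_error_rate=0.05", isRate("0.05")],
  ["max_error_rate=1.5", isRate("1.5")],
  ["load_stream=yes", isFlag("yes")],
  ["fault_status=503", isFaultStatus("503")],
  ["fault_status=200", isFaultStatus("200")],
];

for (const [label, ok] of samples) console.log(`${label}: ${ok ? "valid" : "invalid"}`);
```

Because every field is validated as a string, values such as `.5` or `5e-1` are rejected even though JavaScript would parse them as numbers; case authors need to write `0.5`.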
@@ -9,7 +9,18 @@ export const requiredEnvKeys = [
 ];
 export const caseModeValues = ["agent-browser", "probe"];
-export const caseTypeValues = ["smoke", "regression", "feature", "provider", "exploratory"];
+export const caseTypeValues = [
  "smoke",
  "regression",
  "feature",
  "provider",
  "exploratory",
  "contract",
  "performance",
  "reliability",
  "chaos",
  "security",
 ];
 export const casePriorityValues = ["p0", "p1", "p2"];
 export const caseRiskValues = ["low", "medium", "high"];
 export const caseEvidenceValues = [
@@ -21,10 +32,24 @@ export const caseEvidenceValues = [
  "frontend_log",
  "api_diagnostic",
  "filesystem",
  "metrics",
  "trace",
  "profile",
  "resource_log",
 ];
 export const testResultStatusValues = ["pass", "fail", "blocked", "env_issue", "flaky"];
 export const troubleshootingCategoryValues = ["product", "env_issue", "external_dependency", "blocked", "flaky"];
-export const suiteTypeValues = ["smoke", "regression", "release_gate", "exploratory"];
+export const suiteTypeValues = [
  "smoke",
  "regression",
  "release_gate",
  "exploratory",
  "contract",
  "performance",
  "reliability",
  "chaos",
  "security",
 ];
 export const suiteRequiredStrings = ["id", "title", "description", "type", "priority"];
 export const suiteRequiredLists = ["tags", "cases"];
@@ -91,6 +91,7 @@ export type AutomationResultEvidence = {
  path?: string;
  result?: string;
  reason?: string;
  duration_ms?: number;
  started_at?: string;
  started_at_local?: string;
  finished_at?: string;
@@ -98,6 +99,9 @@ export type AutomationResultEvidence = {
  url?: string;
  prompt?: string;
  expected_text?: string;
  metrics_summary?: Record<string, unknown>;
  thresholds_summary?: Record<string, unknown>;
  artifacts?: Record<string, unknown>;
 };
 type MutableScanState = {
@@ -594,6 +598,18 @@ function stringField(data: Record<string, unknown>, key: string): string | undef
  return typeof value === "string" && value.trim() ? value : undefined;
 }
 function numberField(data: Record<string, unknown>, key: string): number | undefined {
  const value = data[key];
  return typeof value === "number" && Number.isFinite(value) ? value : undefined;
 }
 function objectField(data: Record<string, unknown>, key: string): Record<string, unknown> | undefined {
  const value = data[key];
  return value && typeof value === "object" && !Array.isArray(value)
    ? value as Record<string, unknown>
    : undefined;
 }
 function evidenceDirFromOptions(options: Record<string, string | boolean>): string | undefined {
  const explicit = typeof options["evidence-dir"] === "string" ? options["evidence-dir"] : undefined;
  if (explicit) return resolve(explicit);
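The new `numberField` and `objectField` helpers decide which parts of `automation-result.json` reach the report. The sketch below copies both helpers and runs them over a hypothetical payload; the `p95_ms` key and all values are invented for illustration, and only the top-level field names come from `AutomationResultEvidence`.

```typescript
// Copies of the two helpers from the diff, so the example runs on its own.
function numberField(data: Record<string, unknown>, key: string): number | undefined {
  const value = data[key];
  return typeof value === "number" && Number.isFinite(value) ? value : undefined;
}

function objectField(data: Record<string, unknown>, key: string): Record<string, unknown> | undefined {
  const value = data[key];
  return value && typeof value === "object" && !Array.isArray(value)
    ? value as Record<string, unknown>
    : undefined;
}

// Hypothetical automation-result.json content; the values are made up.
const parsed: Record<string, unknown> = JSON.parse(
  '{"duration_ms": 1840, "metrics_summary": {"p95_ms": 912}, ' +
  '"thresholds_summary": ["not", "an", "object"], "artifacts": null}',
);

const durationMs = numberField(parsed, "duration_ms");       // finite number: kept
const metrics = objectField(parsed, "metrics_summary");      // plain object: kept
const thresholds = objectField(parsed, "thresholds_summary"); // array: dropped
const artifacts = objectField(parsed, "artifacts");          // null: dropped
const missing = numberField(parsed, "started_at");           // absent key: dropped

console.log({ durationMs, metrics, thresholds, artifacts, missing });
```

Malformed fields become `undefined` instead of raising, so under this reading a partially invalid result file still loads and the report simply omits the affected lines.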
@@ -628,6 +644,7 @@ export function readAutomationResultEvidence(options: Record<string, string | bo
      path: resultPath,
      result: stringField(result, "status"),
      reason: stringField(result, "reason"),
      duration_ms: numberField(result, "duration_ms"),
      started_at: stringField(result, "started_at"),
      started_at_local: stringField(result, "started_at_local"),
      finished_at: stringField(result, "finished_at"),
@@ -635,6 +652,9 @@ export function readAutomationResultEvidence(options: Record<string, string | bo
      url: stringField(result, "url"),
      prompt: redactSecrets(stringField(result, "prompt") ?? ""),
      expected_text: stringField(result, "expected_text"),
      metrics_summary: objectField(result, "metrics_summary"),
      thresholds_summary: objectField(result, "thresholds_summary"),
      artifacts: objectField(result, "artifacts"),
    };
  } catch (error) {
    return { status: "invalid", path: resultPath, reason: String(error) };
@@ -114,6 +114,32 @@ export function automationEnvDefaults(item: StructuredItem, env: EnvSource = pro
    ["automation_expected_runner_id", "LANGBOT_E2E_EXPECTED_RUNNER_ID"],
    ["automation_reset_debug_chat", "LANGBOT_E2E_RESET_DEBUG_CHAT"],
    ["automation_debug_chat_session_type", "LANGBOT_E2E_DEBUG_CHAT_SESSION_TYPE"],
    ["automation_debug_chat_response_p95_ms", "LANGBOT_E2E_DEBUG_CHAT_RESPONSE_P95_MS"],
    ["automation_debug_chat_max_error_rate", "LANGBOT_E2E_DEBUG_CHAT_MAX_ERROR_RATE"],
    ["automation_debug_chat_load_requests", "LANGBOT_DEBUG_CHAT_LOAD_REQUESTS"],
    ["automation_debug_chat_load_concurrency", "LANGBOT_DEBUG_CHAT_LOAD_CONCURRENCY"],
    ["automation_debug_chat_load_timeout_ms", "LANGBOT_DEBUG_CHAT_LOAD_TIMEOUT_MS"],
    ["automation_debug_chat_load_response_p95_ms", "LANGBOT_DEBUG_CHAT_LOAD_RESPONSE_P95_MS"],
    ["automation_debug_chat_load_first_response_p95_ms", "LANGBOT_DEBUG_CHAT_LOAD_FIRST_RESPONSE_P95_MS"],
    ["automation_debug_chat_load_max_error_rate", "LANGBOT_DEBUG_CHAT_LOAD_MAX_ERROR_RATE"],
    ["automation_debug_chat_load_min_error_rate", "LANGBOT_DEBUG_CHAT_LOAD_MIN_ERROR_RATE"],
    ["automation_debug_chat_load_min_error_count", "LANGBOT_DEBUG_CHAT_LOAD_MIN_ERROR_COUNT"],
    ["automation_debug_chat_load_min_ok_count", "LANGBOT_DEBUG_CHAT_LOAD_MIN_OK_COUNT"],
    ["automation_debug_chat_load_min_provider_fault_count", "LANGBOT_DEBUG_CHAT_LOAD_MIN_PROVIDER_FAULT_COUNT"],
    ["automation_debug_chat_load_expected_prefix", "LANGBOT_DEBUG_CHAT_LOAD_EXPECTED_PREFIX"],
    ["automation_debug_chat_load_prompt_template", "LANGBOT_DEBUG_CHAT_LOAD_PROMPT_TEMPLATE"],
    ["automation_debug_chat_load_stream", "LANGBOT_DEBUG_CHAT_LOAD_STREAM"],
    ["automation_debug_chat_load_reset", "LANGBOT_DEBUG_CHAT_LOAD_RESET"],
    ["automation_debug_chat_load_fail_on_final_mismatch", "LANGBOT_DEBUG_CHAT_LOAD_FAIL_ON_FINAL_MISMATCH"],
    ["automation_fake_provider_response_text", "LANGBOT_FAKE_PROVIDER_RESPONSE_TEXT"],
    ["automation_fake_provider_first_token_delay_ms", "LANGBOT_FAKE_PROVIDER_FIRST_TOKEN_DELAY_MS"],
    ["automation_fake_provider_chunk_delay_ms", "LANGBOT_FAKE_PROVIDER_CHUNK_DELAY_MS"],
    ["automation_fake_provider_chunk_count", "LANGBOT_FAKE_PROVIDER_CHUNK_COUNT"],
    ["automation_fake_provider_fail_first_n", "LANGBOT_FAKE_PROVIDER_FAIL_FIRST_N"],
    ["automation_fake_provider_fail_every_n", "LANGBOT_FAKE_PROVIDER_FAIL_EVERY_N"],
    ["automation_fake_provider_fault_status", "LANGBOT_FAKE_PROVIDER_FAULT_STATUS"],
    ["automation_fake_provider_fail_after_first_chunk", "LANGBOT_FAKE_PROVIDER_FAIL_AFTER_FIRST_CHUNK"],
    ["automation_fake_provider_dynamic_response", "LANGBOT_FAKE_PROVIDER_DYNAMIC_RESPONSE"],
    ["automation_filesystem_checks_json", "LANGBOT_E2E_FILESYSTEM_CHECKS_JSON"],
    ["automation_plugin_package", "LANGBOT_E2E_PLUGIN_PACKAGE"],
    ["automation_expected_plugin_id", "LANGBOT_E2E_EXPECTED_PLUGIN_ID"],
@@ -1,6 +1,6 @@
 import assert from "node:assert/strict";
 import { test } from "node:test";
-import { appendFileSync, existsSync, mkdtempSync, mkdirSync, readFileSync, rmSync, writeFileSync } from "node:fs";
+import { appendFileSync, chmodSync, existsSync, mkdtempSync, mkdirSync, readFileSync, rmSync, writeFileSync } from "node:fs";
 import { spawnSync } from "node:child_process";
 import { tmpdir } from "node:os";
 import { join } from "node:path";
@@ -676,6 +676,82 @@ test("suite run JSON captures failed case output", () => {
  }
 });
 test("suite run preserves classified env_issue automation results", () => {
  const tmp = mkdtempSync(join(tmpdir(), "lbs-suite-run-env-issue-"));
  try {
    const skillDir = join(tmp, "skills", "langbot-testing");
    const casesDir = join(skillDir, "cases");
    const suitesDir = join(skillDir, "suites");
    const scriptsDir = join(tmp, "scripts");
    mkdirSync(casesDir, { recursive: true });
    mkdirSync(suitesDir, { recursive: true });
    mkdirSync(scriptsDir, { recursive: true });
    writeFileSync(join(skillDir, "SKILL.md"), "---\nname: langbot-testing\ndescription: Testing.\n---\n\n# Testing\n");
    writeFileSync(join(tmp, "skills", ".env"), "");
    writeFileSync(
      join(casesDir, "env-case.yaml"),
      [
        "id: env-case",
        "title: Env Case",
        "mode: probe",
        "area: qa",
        "type: smoke",
        "priority: p2",
        "risk: low",
        "ci_eligible: true",
        "automation: scripts/env-issue.mjs",
        "evidence_required:",
        "  - filesystem",
      ].join("\n"),
    );
    writeFileSync(
      join(suitesDir, "mini.yaml"),
      [
        "id: mini",
        "title: Mini",
        "description: Mini suite.",
        "type: smoke",
        "priority: p2",
        "tags:",
        "  - qa",
        "cases:",
        "  - env-case",
      ].join("\n"),
    );
    writeFileSync(
      join(scriptsDir, "env-issue.mjs"),
      [
        "import { mkdirSync, writeFileSync } from 'node:fs';",
        "import { join } from 'node:path';",
        "mkdirSync(process.env.LBS_EVIDENCE_DIR, { recursive: true });",
        "const result = {",
        "  case_id: process.env.LBS_CASE_ID,",
        "  run_id: process.env.LBS_RUN_ID,",
        "  status: 'env_issue',",
        "  reason: 'backend not reachable',",
        "  evidence_collected: ['filesystem']",
        "};",
        "writeFileSync(join(process.env.LBS_EVIDENCE_DIR, 'result.json'), JSON.stringify(result));",
        "writeFileSync(join(process.env.LBS_EVIDENCE_DIR, 'automation-result.json'), JSON.stringify({ ...result, source: 'automation' }));",
        "process.exit(2);",
      ].join("\n"),
    );
    const result = capture(() => commandSuiteRun({
      root: tmp,
      args: ["suite", "run", "mini", "--run-id", "mini-run", "--evidence-dir", join(tmp, "evidence"), "--json"],
    }));
    assert.equal(result.code, 2);
    const payload = JSON.parse(result.output);
    assert.equal(payload.executions[0].status, "classified");
    assert.equal(payload.report.status, "env_issue");
    assert.equal(payload.report.execution_status, "ok");
  } finally {
    rmSync(tmp, { recursive: true, force: true });
  }
 });
 test("suite run failure cannot be masked by stale pass result", () => {
  const tmp = mkdtempSync(join(tmpdir(), "lbs-suite-run-stale-pass-"));
  try {
@@ -1369,6 +1445,56 @@ test("env doctor does not require proxy variables", async () => {
  }
 });
 test("env doctor reports missing socksio for active SOCKS proxy", async () => {
  const tmp = mkdtempSync(join(tmpdir(), "lbs-env-doctor-socksio-"));
  const originalAllProxy = process.env.ALL_PROXY;
  const originalAllProxyLower = process.env.all_proxy;
  try {
    delete process.env.ALL_PROXY;
    delete process.env.all_proxy;
    const skillsDir = join(tmp, "skills");
    const repoDir = join(tmp, "LangBot");
    const webDir = join(repoDir, "web");
    const venvBin = join(repoDir, ".venv", "bin");
    const browserProfile = join(tmp, "browser-profile");
    const chromium = join(tmp, "chromium");
    mkdirSync(skillsDir, { recursive: true });
    mkdirSync(webDir, { recursive: true });
    mkdirSync(venvBin, { recursive: true });
    mkdirSync(browserProfile, { recursive: true });
    writeFileSync(chromium, "");
    const python = join(venvBin, "python");
    writeFileSync(python, "#!/bin/sh\nexit 1\n");
    chmodSync(python, 0o755);
    writeFileSync(
      join(skillsDir, ".env"),
      [
        "LANGBOT_BACKEND_URL=http://127.0.0.1:59996",
        "LANGBOT_FRONTEND_URL=http://127.0.0.1:59996",
        "LANGBOT_DEV_FRONTEND_URL=http://127.0.0.1:59996",
        `LANGBOT_REPO=${repoDir}`,
        `LANGBOT_WEB_REPO=${webDir}`,
        `LANGBOT_BROWSER_PROFILE=${browserProfile}`,
        `LANGBOT_CHROMIUM_EXECUTABLE=${chromium}`,
        "ALL_PROXY=socks5://127.0.0.1:7890",
      ].join("\n"),
    );
    const result = await captureAsync(() => commandEnvDoctor({ root: tmp, args: ["env", "doctor"] }));
    assert.equal(result.code, 1);
    assert.match(result.output, /FAIL: SOCKS proxy ALL_PROXY is configured/);
    assert.match(result.output, /cannot import socksio/);
    assert.match(result.output, /-m pip install socksio/);
  } finally {
    if (originalAllProxy === undefined) delete process.env.ALL_PROXY;
    else process.env.ALL_PROXY = originalAllProxy;
    if (originalAllProxyLower === undefined) delete process.env.all_proxy;
    else process.env.all_proxy = originalAllProxyLower;
    rmSync(tmp, { recursive: true, force: true });
  }
 });
 test("env show redacts secret-like values by default", () => {
  const tmp = mkdtempSync(join(tmpdir(), "lbs-env-show-redact-"));
  try {
@@ -2521,6 +2647,38 @@ test("test report renders a reusable evidence template", () => {
  assert.match(result.output, /no log files provided/);
 });
 test("test report promotes loaded automation evidence into result section", () => {
  const tmp = mkdtempSync(join(tmpdir(), "lbs-report-automation-"));
  try {
    writeFileSync(
      join(tmp, "automation-result.json"),
      JSON.stringify({
        status: "pass",
        reason: "latency thresholds passed",
        url: "http://127.0.0.1:5300",
        artifacts: { metrics_json: join(tmp, "metrics.json") },
      }),
    );
    const result = capture(() => commandTestReport(ctx([
      "test",
      "report",
      "langbot-live-backend-latency",
      "--evidence-dir",
      tmp,
      "--no-auto-log",
    ])));
    assert.equal(result.code, 0);
    assert.match(result.output, /## Result\n- result: pass\n- reason: latency thresholds passed/);
    assert.match(result.output, /- target_tested: http:\/\/127\.0\.0\.1:5300/);
    assert.doesNotMatch(result.output, /target_tested: TODO/);
    assert.match(result.output, /## Automation Result/);
  } finally {
    rmSync(tmp, { recursive: true, force: true });
  }
 });
 test("validate rejects dangling case references and missing automation scripts", () => {
  const tmp = mkdtempSync(join(tmpdir(), "lbs-validate-strict-"));
  try {
@@ -1,3 +1,5 @@
 """LangBot - Production-grade platform for building agentic IM bots"""
-__version__ = '4.10.2'
+from importlib.metadata import version
 __version__ = version('langbot')
@@ -1,6 +1,9 @@
 from __future__ import annotations
 from langbot.pkg.utils import constants
 from .. import group
 from .box_visibility import should_hide_box_runtime_status
@group.group_class('box', '/api/v1/box')
@@ -9,6 +12,7 @@ class BoxRouterGroup(group.RouterGroup):
        @self.route('/status', methods=['GET'], auth_type=group.AuthType.USER_TOKEN)
        async def _() -> str:
            status = await self.ap.box_service.get_status()
            status['hidden'] = should_hide_box_runtime_status(constants.edition, status.get('enabled'))
            return self.success(data=status)
        @self.route('/sessions', methods=['GET'], auth_type=group.AuthType.USER_TOKEN)
@@ -0,0 +1,5 @@
 from __future__ import annotations
 def should_hide_box_runtime_status(edition: str, box_enabled: bool | None) -> bool:
    return edition == 'cloud' and box_enabled is False
@@ -1,3 +1,5 @@
 import base64
 import quart
 from .. import group
@@ -30,6 +32,50 @@ class SurveyRouterGroup(group.RouterGroup):
                return self.fail(2, 'Failed to submit response')
            return self.fail(3, 'Survey not available')
        @self.route('/feedback', methods=['POST'], auth_type=group.AuthType.USER_TOKEN)
        async def _feedback(user_email: str) -> str:
            """Submit on-demand user feedback from the sidebar."""
            json_data = await quart.request.get_json(silent=True) or {}
            content = str(json_data.get('content', '')).strip()
            attachments = json_data.get('attachments', [])
            if not content:
                return self.fail(1, 'content required')
            if len(content) > 5000:
                return self.fail(2, 'content too long')
            if not isinstance(attachments, list):
                return self.fail(3, 'attachments must be an array')
            if len(attachments) > 3:
                return self.fail(4, 'too many attachments')
            normalized_attachments = []
            for item in attachments:
                if not isinstance(item, dict):
                    continue
                data_url = str(item.get('data_url', ''))
                mime_type = str(item.get('mime_type', ''))[:128]
                name = str(item.get('name', ''))[:255]
                if not data_url.startswith('data:image/'):
                    continue
                try:
                    payload = data_url.split(',', 1)[1]
                    if len(base64.b64decode(payload, validate=True)) > 1024 * 1024:
                        return self.fail(5, 'attachment too large')
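                except Exception:
                    # A malformed data URL or invalid base64 payload is reported with the same error as an oversized attachment.
                    return self.fail(5, 'attachment too large')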
                normalized_attachments.append({'name': name, 'mime_type': mime_type, 'data_url': data_url})
            if self.ap.survey:
                ok = await self.ap.survey.submit_feedback(
                    content=content,
                    attachments=normalized_attachments,
                    user_email=user_email,
                )
                if ok:
                    return self.success()
                return self.fail(6, 'Failed to submit feedback')
            return self.fail(7, 'Survey not available')
        @self.route('/dismiss', methods=['POST'], auth_type=group.AuthType.USER_TOKEN)
        async def _dismiss() -> str:
            """Dismiss survey."""
@@ -195,6 +195,13 @@ class UserRouterGroup(group.RouterGroup):
        @self.route('/set-password', methods=['POST'], auth_type=group.AuthType.USER_TOKEN)
        async def _(user_email: str) -> str:
            """Set password for Space account (first time) or change password"""
            # Check if modifying login info is allowed
            allow_modify_login_info = self.ap.instance_config.data.get('system', {}).get(
                'allow_modify_login_info', True
            )
            if not allow_modify_login_info:
                return self.http_status(403, -1, 'Modifying login info is disabled')
            json_data = await quart.request.json
            new_password = json_data.get('new_password')
            current_password = json_data.get('current_password')
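The guard added to `/set-password` can be sketched in isolation as follows. This is a minimal illustration, assuming the instance config is a plain nested dict; `login_info_modifiable` and `guard_status` are hypothetical helper names that do not appear in the diff.

```python
from typing import Any


def login_info_modifiable(instance_config: dict[str, Any]) -> bool:
    """Return True unless system.allow_modify_login_info is explicitly falsy."""
    # Same lookup shape as the guard in the diff: missing keys default to True.
    return bool(instance_config.get('system', {}).get('allow_modify_login_info', True))


def guard_status(instance_config: dict[str, Any]) -> int:
    # Hypothetical helper: 403 mirrors the http_status(403, ...) call in the diff.
    return 200 if login_info_modifiable(instance_config) else 403
```

Because the lookup defaults to `True`, deployments that never set the key keep the previous behaviour; only an explicit `false` locks the endpoint.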
@@ -141,15 +141,25 @@ class MCPService:
        runtime_mcp_session: RuntimeMCPSession | None = None
+        ctx = taskmgr.TaskContext.new()
         if server_name != '_':
             runtime_mcp_session = self.ap.tool_mgr.mcp_tool_loader.get_session(server_name)
             if runtime_mcp_session is None:
                 raise ValueError(f'Server not found: {server_name}')
-            if runtime_mcp_session.status == MCPSessionStatus.ERROR:
-                coroutine = runtime_mcp_session.start()
-            else:
-                coroutine = runtime_mcp_session.refresh()
+            persisted_session = runtime_mcp_session
+
+            async def _refresh_and_report() -> None:
+                if persisted_session.status == MCPSessionStatus.ERROR:
+                    await persisted_session.start()
+                else:
+                    await persisted_session.refresh()
+                # Surface the discovered tools so the config page can render them
+                # even for an already-hosted server.
+                ctx.metadata['runtime_info'] = persisted_session.get_runtime_info_dict()
+            coroutine = _refresh_and_report()
        else:
            runtime_mcp_session = await self.ap.tool_mgr.mcp_tool_loader.load_mcp_server(server_config=server_data)
@@ -160,6 +170,12 @@ class MCPService:
            async def _run_and_cleanup() -> None:
                try:
                    await test_session.start()
                    # Capture the runtime info (status + discovered tools) BEFORE
                    # shutting the transient session down. The create/edit config
                    # page has no persisted server to reload from, so without this
                    # a successful test could only show "no tools found". The
                    # frontend reads ctx.metadata.runtime_info to render the tools.
                    ctx.metadata['runtime_info'] = test_session.get_runtime_info_dict()
                finally:
                    try:
                        await test_session.shutdown()
@@ -171,7 +187,6 @@ class MCPService:
            coroutine = _run_and_cleanup()
-        ctx = taskmgr.TaskContext.new()
        wrapper = self.ap.task_mgr.create_user_task(
            coroutine,
            kind='mcp-operation',
@@ -20,6 +20,15 @@ class UserService:
    def __init__(self, ap: app.Application) -> None:
        self.ap = ap
        self._create_user_lock = asyncio.Lock()
        self._password_hash_lock = asyncio.Semaphore(1)
    async def _hash_password(self, password: str) -> str:
        async with self._password_hash_lock:
            return await asyncio.to_thread(argon2.PasswordHasher().hash, password)
    async def _verify_password(self, hashed_password: str, password: str) -> None:
        async with self._password_hash_lock:
            await asyncio.to_thread(argon2.PasswordHasher().verify, hashed_password, password)
    async def is_initialized(self) -> bool:
        result = await self.ap.persistence_mgr.execute_async(sqlalchemy.select(user.User).limit(1))
@@ -28,9 +37,7 @@ class UserService:
        return result_list is not None and len(result_list) > 0
    async def create_user(self, user_email: str, password: str) -> None:
-        ph = argon2.PasswordHasher()
-        hashed_password = ph.hash(password)
+        hashed_password = await self._hash_password(password)
        await self.ap.persistence_mgr.execute_async(
            sqlalchemy.insert(user.User).values(user=user_email, password=hashed_password, account_type='local')
@@ -69,9 +76,7 @@ class UserService:
        if not user_obj.password:
            raise ValueError('请使用 Space 账户登录')
-        ph = argon2.PasswordHasher()
-        ph.verify(user_obj.password, password)
+        await self._verify_password(user_obj.password, password)
        return await self.generate_jwt_token(user_email)
@@ -93,17 +98,13 @@ class UserService:
        return jwt.decode(token, jwt_secret, algorithms=['HS256'])['user']
    async def reset_password(self, user_email: str, new_password: str) -> None:
-        ph = argon2.PasswordHasher()
-        hashed_password = ph.hash(new_password)
+        hashed_password = await self._hash_password(new_password)
        await self.ap.persistence_mgr.execute_async(
            sqlalchemy.update(user.User).where(user.User.user == user_email).values(password=hashed_password)
        )
    async def change_password(self, user_email: str, current_password: str, new_password: str) -> None:
-        ph = argon2.PasswordHasher()
        user_obj = await self.get_user_by_email(user_email)
        if user_obj is None:
            raise ValueError('User not found')
@@ -111,9 +112,9 @@ class UserService:
        if not user_obj.password:
            raise ValueError('No local password set, please set a password first')
-        ph.verify(user_obj.password, current_password)
+        await self._verify_password(user_obj.password, current_password)
-        hashed_password = ph.hash(new_password)
+        hashed_password = await self._hash_password(new_password)
        await self.ap.persistence_mgr.execute_async(
            sqlalchemy.update(user.User).where(user.User.user == user_email).values(password=hashed_password)
@@ -232,7 +233,6 @@ class UserService:
    async def set_password(self, user_email: str, new_password: str, current_password: str | None = None) -> None:
        """Set or change password for a user"""
-        ph = argon2.PasswordHasher()
        user_obj = await self.get_user_by_email(user_email)
        if user_obj is None:
@@ -243,9 +243,9 @@ class UserService:
        if has_password:
            if not current_password:
                raise ValueError('Current password is required')
-            ph.verify(user_obj.password, current_password)
+            await self._verify_password(user_obj.password, current_password)
-        hashed_password = ph.hash(new_password)
+        hashed_password = await self._hash_password(new_password)
        await self.ap.persistence_mgr.execute_async(
            sqlalchemy.update(user.User).where(user.User.user == user_email).values(password=hashed_password)
        )
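The pattern introduced in `UserService` (serialise password hashing behind a one-slot semaphore and run it in a worker thread so the event loop stays responsive) can be sketched as below. This is only an illustration under stated assumptions: it swaps in the standard library's PBKDF2 for `argon2` so it runs without third-party packages, and `PasswordHashingSketch` is a hypothetical class, not part of the change.

```python
import asyncio
import hashlib
import hmac
import os


class PasswordHashingSketch:
    """Hypothetical stand-in for the _hash_password / _verify_password helpers."""

    def __init__(self) -> None:
        # One hash at a time, mirroring asyncio.Semaphore(1) in the diff.
        self._lock = asyncio.Semaphore(1)

    @staticmethod
    def _derive(password: str, salt: bytes) -> bytes:
        # Assumption: PBKDF2 replaces argon2.PasswordHasher() for this sketch only.
        return hashlib.pbkdf2_hmac('sha256', password.encode(), salt, 100_000)

    async def hash_password(self, password: str) -> tuple[bytes, bytes]:
        salt = os.urandom(16)
        async with self._lock:
            # CPU-bound work runs in a thread so other coroutines keep running.
            digest = await asyncio.to_thread(self._derive, password, salt)
        return salt, digest

    async def verify_password(self, salt: bytes, digest: bytes, password: str) -> bool:
        async with self._lock:
            candidate = await asyncio.to_thread(self._derive, password, salt)
        # Constant-time comparison of the two digests.
        return hmac.compare_digest(candidate, digest)
```

The semaphore bounds concurrent hashing to a single thread at a time, which trades login throughput for a predictable CPU and memory ceiling.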
@@ -82,7 +82,6 @@ class BoxService:
        return self._enabled
    async def initialize(self):
-        self._ensure_default_workspace()
        if not self._enabled:
            # Disabled by config: do NOT connect to a remote runtime, do NOT
            # fork a stdio subprocess. Every consumer of box_service should
@@ -99,6 +98,7 @@ class BoxService:
                await self._runtime_connector.initialize()
            else:
                await self.client.initialize()
+            self._ensure_default_workspace()
            self._available = True
            self._connector_error = ''
            self.ap.logger.info(
@@ -1152,6 +1152,9 @@ class BoxService:
        if self.default_workspace is None:
            return
        if not self.shares_filesystem_with_box:
            return
        if os.path.isdir(self.default_workspace):
            return
@@ -1176,7 +1179,7 @@ class BoxService:
            return
        host_path = os.path.realpath(spec.host_path)
-        if not os.path.isdir(host_path):
+        if self.shares_filesystem_with_box and not os.path.isdir(host_path):
            raise BoxValidationError('host_path must point to an existing directory on the host')
        if not self.allowed_mount_roots:
@@ -84,13 +84,7 @@ class CommandManager:
        privilege = 1
-        admins = self.ap.instance_config.data['admins']
-        launcher_session_id = f'{query.launcher_type.value}_{query.launcher_id}'
-        sender_session_id = f'person_{query.sender_id}'
-        # Backward compatible: match launcher_session_id (group admin: group_xxx, private-chat admin: person_xxx)
-        # New behavior: match sender_session_id (personal admin: person_xxx, effective in any group chat)
-        if launcher_session_id in admins or sender_session_id in admins:
+        if f'{query.launcher_type.value}_{query.launcher_id}' in self.ap.instance_config.data['admins']:
            privilege = 2
        ctx = command_context.ExecuteContext(
@@ -9,7 +9,7 @@ class MCPServer(Base):
    uuid = sqlalchemy.Column(sqlalchemy.String(255), primary_key=True, unique=True)
    name = sqlalchemy.Column(sqlalchemy.String(255), nullable=False)
    enable = sqlalchemy.Column(sqlalchemy.Boolean, nullable=False, default=False)
-    mode = sqlalchemy.Column(sqlalchemy.String(255), nullable=False)  # stdio, sse, http
+    mode = sqlalchemy.Column(sqlalchemy.String(255), nullable=False)  # stdio, remote (legacy: sse, http)
    extra_args = sqlalchemy.Column(sqlalchemy.JSON, nullable=False, default={})
    # Markdown documentation captured from LangBot Space at install time so the
    # detail page can show docs even when the server is offline / has no tools.
@@ -0,0 +1,47 @@
 """normalize mcp_servers transport mode to local/remote
 The MCP transport selection for servers LangBot connects to was simplified
 from three persisted modes (``stdio`` / ``sse`` / ``http``) down to two:
 ``stdio`` (local, Box-sandboxed) and ``remote`` (the runtime auto-detects
 Streamable HTTP vs. legacy SSE from the URL). This migration rewrites any
 existing ``sse`` / ``http`` rows to ``remote`` so the stored value matches the
 new two-option UI. The connection args (url / headers / timeout /
 ssereadtimeout) live in ``extra_args`` and are left untouched — the
 auto-detecting remote transport consumes them regardless.
 Revision ID: 0006_normalize_mcp_remote_mode
 Revises: 0005_add_llm_context_length
 Create Date: 2026-06-21
 """
 import sqlalchemy as sa
 from alembic import op
 revision = '0006_normalize_mcp_remote_mode'
 down_revision = '0005_add_llm_context_length'
 branch_labels = None
 depends_on = None
 def upgrade() -> None:
    # Idempotent data migration: collapse legacy remote transports into the
    # unified ``remote`` mode. Guard against the table being absent (truly empty
    # DB migrated before create_all()).
    conn = op.get_bind()
    inspector = sa.inspect(conn)
    if 'mcp_servers' not in inspector.get_table_names():
        return
    conn.execute(sa.text("UPDATE mcp_servers SET mode = 'remote' WHERE mode IN ('sse', 'http')"))
 def downgrade() -> None:
    # The legacy distinction between ``sse`` and ``http`` cannot be recovered
    # from ``remote`` alone (the transport is auto-detected at runtime, not
    # stored). Map everything that is not ``stdio`` back to ``http`` as a
    # best-effort reversal — both legacy modes still route correctly in the
    # backend lifecycle dispatch.
    conn = op.get_bind()
    inspector = sa.inspect(conn)
    if 'mcp_servers' not in inspector.get_table_names():
        return
    conn.execute(sa.text("UPDATE mcp_servers SET mode = 'http' WHERE mode = 'remote'"))
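The effect of the two `UPDATE` statements in this migration can be checked against a throwaway database. The sketch below is not the Alembic migration itself: it assumes an in-memory SQLite table reduced to two columns, and `demo` and `modes` are hypothetical helper names.

```python
import sqlite3

# The two statements are copied from upgrade() and downgrade() in the diff.
UPGRADE_SQL = "UPDATE mcp_servers SET mode = 'remote' WHERE mode IN ('sse', 'http')"
DOWNGRADE_SQL = "UPDATE mcp_servers SET mode = 'http' WHERE mode = 'remote'"


def modes(conn: sqlite3.Connection) -> list[str]:
    """Return the stored modes ordered by uuid."""
    return [row[0] for row in conn.execute('SELECT mode FROM mcp_servers ORDER BY uuid')]


def demo() -> tuple[list[str], list[str], list[str]]:
    conn = sqlite3.connect(':memory:')
    # Assumption: a cut-down table with only the columns the migration touches.
    conn.execute('CREATE TABLE mcp_servers (uuid TEXT PRIMARY KEY, mode TEXT NOT NULL)')
    conn.executemany(
        'INSERT INTO mcp_servers VALUES (?, ?)',
        [('a', 'stdio'), ('b', 'sse'), ('c', 'http')],
    )
    conn.execute(UPGRADE_SQL)
    after_upgrade = modes(conn)
    conn.execute(UPGRADE_SQL)  # running it again changes nothing (idempotent)
    after_second_upgrade = modes(conn)
    conn.execute(DOWNGRADE_SQL)
    after_downgrade = modes(conn)
    conn.close()
    return after_upgrade, after_second_upgrade, after_downgrade
```

Note that the downgrade maps a former `sse` row to `http`, which is the best-effort reversal described in the migration's comments.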
@@ -23,13 +23,7 @@ class CommandHandler(handler.MessageHandler):
        privilege = 1
-        admins = self.ap.instance_config.data['admins']
-        launcher_session_id = f'{query.launcher_type.value}_{query.launcher_id}'
-        sender_session_id = f'person_{query.sender_id}'
-        # Backward compatible: match launcher_session_id (group admin: group_xxx, private-chat admin: person_xxx)
-        # New behavior: match sender_session_id (personal admin: person_xxx, effective in any group chat)
-        if launcher_session_id in admins or sender_session_id in admins:
+        if f'{query.launcher_type.value}_{query.launcher_id}' in self.ap.instance_config.data['admins']:
            privilege = 2
        spt = command_text.split(' ')
@@ -0,0 +1,509 @@
 """HTTP Bot adapter — standalone server-to-server platform adapter.
 Lets any external backend drive a LangBot pipeline over plain HTTP:
 * **Inbound**  — the backend POSTs a signed message to the unified webhook
  route ``POST /bots/<bot_uuid>``; this adapter verifies the signature, builds
  a platform event carrying the caller-defined ``session_id`` as the launcher
  id, and fires it into the normal pipeline (so message aggregation, N->1,
  works for free).
 * **Outbound** — every ``reply_message`` / ``reply_message_chunk`` the pipeline
  emits is delivered as a signed POST to the configured ``callback_url``. A
  single turn may emit many replies (1->M); each is one callback, ordered per
  session via a small worker queue.
 Design notes:
 * The callback URL is taken **only** from adapter config (never from the
  inbound message) to keep the SSRF surface closed.
 * Replies for one ``session_id`` are delivered in ``sequence`` order; the
  caller knows a turn is complete when ``is_final: true`` arrives.
 * No new HTTP route is registered — the existing unified webhook dispatcher
  (``pkg/api/http/controller/groups/webhooks.py``) calls
  ``handle_unified_webhook`` on this adapter.
 See docs/platforms/http-bot.md for the full integration guide.
 """
 from __future__ import annotations
 import asyncio
 import json
 import time
 import typing
 import uuid
 from datetime import datetime
 import aiohttp
 import pydantic
 import quart
 import langbot_plugin.api.definition.abstract.platform.adapter as abstract_platform_adapter
 import langbot_plugin.api.entities.builtin.platform.message as platform_message
 import langbot_plugin.api.entities.builtin.platform.events as platform_events
 import langbot_plugin.api.entities.builtin.platform.entities as platform_entities
 import langbot_plugin.api.definition.abstract.platform.event_logger as abstract_platform_logger
 from . import http_bot_signing as signing
 from ...utils import httpclient
 # Error envelope codes (HTTP status -> body code), documented in the design doc.
 _ERR = {
    'bad_request': (400, 40001),
    'bad_signature': (401, 40101),
    'duplicate': (409, 40901),
    'too_large': (413, 41301),
    'internal': (500, 50001),
 }
 # Max accepted inbound body size (bytes).
 _MAX_BODY = 1 * 1024 * 1024
 # Idempotency dedup window (seconds) and cap.
 _IDEMPOTENCY_TTL = 600
 _IDEMPOTENCY_MAX = 4096
 class _SessionOutbound:
    """Per-session outbound state: ordered delivery queue + sequence counter."""
    def __init__(self) -> None:
        self.queue: asyncio.Queue = asyncio.Queue(maxsize=1000)
        self.worker: asyncio.Task | None = None
        self.sequence: int = 0
        self.last_was_final: bool = True  # so the first reply of a turn starts at seq 1
 class _SyncCollector:
    """Collects reply parts for a /sync request and resolves when the turn ends."""
    def __init__(self) -> None:
        self.parts: list = []
        self.done: asyncio.Event = asyncio.Event()
 class HttpBotAdapter(abstract_platform_adapter.AbstractMessagePlatformAdapter):
    """Standalone HTTP adapter (inbound webhook + outbound callbacks)."""
    bot_uuid: str = pydantic.Field(default='', exclude=True)
    listeners: dict[
        typing.Type[platform_events.Event],
        typing.Callable[[platform_events.Event, abstract_platform_adapter.AbstractMessagePlatformAdapter], None],
    ] = pydantic.Field(default_factory=dict, exclude=True)
    # session_id -> outbound state
    outbound_states: dict[str, _SessionOutbound] = pydantic.Field(default_factory=dict, exclude=True)
    # idempotency key -> accepted-at epoch
    idempotency_cache: dict[str, float] = pydantic.Field(default_factory=dict, exclude=True)
    # session_id -> sync collector (set while a /sync request is awaiting a turn)
    sync_waiters: dict[str, '_SyncCollector'] = pydantic.Field(default_factory=dict, exclude=True)
    model_config = pydantic.ConfigDict(arbitrary_types_allowed=True)
    def __init__(self, config: dict, logger: abstract_platform_logger.AbstractEventLogger, **kwargs):
        super().__init__(config=config, logger=logger, **kwargs)
        self.bot_account_id = 'http_bot'
        self.outbound_states = {}
        self.idempotency_cache = {}
        self.sync_waiters = {}
    # -- framework hooks ------------------------------------------------------
    def set_bot_uuid(self, bot_uuid: str) -> None:
        """Called by the bot manager so the adapter knows its own bot uuid."""
        object.__setattr__(self, 'bot_uuid', bot_uuid)
    def get_launcher_id(self, event: platform_events.MessageEvent) -> str:
        """Map an inbound event to a LangBot launcher id.
        We return the caller-defined ``session_id`` (stashed on the sender /
        group id at inbound time) so that each external session maps 1:1 to an
        isolated LangBot session.
        """
        if isinstance(event, platform_events.GroupMessage):
            return str(event.sender.group.id)
        return str(event.sender.id)
    def register_listener(
        self,
        event_type: typing.Type[platform_events.Event],
        func: typing.Callable[
            [platform_events.Event, abstract_platform_adapter.AbstractMessagePlatformAdapter], typing.Awaitable[None]
        ],
    ):
        self.listeners[event_type] = func
    def unregister_listener(
        self,
        event_type: typing.Type[platform_events.Event],
        func: typing.Callable[
            [platform_events.Event, abstract_platform_adapter.AbstractMessagePlatformAdapter], typing.Awaitable[None]
        ],
    ):
        self.listeners.pop(event_type, None)
    async def is_muted(self, group_id: int) -> bool:
        return False
    async def is_stream_output_supported(self) -> bool:
        return True
    async def run_async(self):
        # Purely webhook-driven; nothing to poll. Stay alive.
        while True:
            await asyncio.sleep(3600)
    async def kill(self):
        # Cancel any outbound workers.
        for state in self.outbound_states.values():
            if state.worker and not state.worker.done():
                state.worker.cancel()
        return True
    # -- inbound --------------------------------------------------------------
    def _err(self, kind: str, detail: str = ''):
        status, code = _ERR[kind]
        return quart.jsonify({'code': code, 'msg': detail or kind, 'data': None}), status
    def _prune_idempotency(self) -> None:
        now = time.time()
        if len(self.idempotency_cache) > _IDEMPOTENCY_MAX:
            self.idempotency_cache.clear()
            return
        expired = [k for k, ts in self.idempotency_cache.items() if now - ts > _IDEMPOTENCY_TTL]
        for k in expired:
            self.idempotency_cache.pop(k, None)
    async def handle_unified_webhook(self, bot_uuid: str, path: str, request):
        """Handle an inbound POST from the unified webhook dispatcher.
        Sub-path routing:
            (no path)  -> push a message
            "reset"    -> reset a session's conversation (body: {session_id, session_type?})
            "sync"     -> push a message and wait for the final reply (collapses 1->M)
        """
        object.__setattr__(self, 'bot_uuid', bot_uuid)
        if path == 'reset':
            return await self._handle_reset(request)
        if path == 'sync':
            return await self._handle_inbound(request, sync=True)
        if path in ('', None):
            return await self._handle_inbound(request, sync=False)
        return self._err('bad_request', f'unknown sub-path: {path}')
    async def _read_and_verify(self, request) -> tuple[dict | None, typing.Any]:
        """Read body, enforce size + signature. Returns (data, error_response)."""
        body = await request.get_data()
        if body and len(body) > _MAX_BODY:
            return None, self._err('too_large', 'message too large')
        if self.config.get('signature_required', True):
            ok, reason = signing.verify(
                secret=self.config.get('inbound_secret', ''),
                body=body,
                timestamp=request.headers.get(signing.HEADER_TIMESTAMP),
                signature=request.headers.get(signing.HEADER_SIGNATURE),
            )
            if not ok:
                await self.logger.warning(f'http_bot inbound signature rejected: {reason}')
                return None, self._err('bad_signature', f'invalid signature: {reason}')
        try:
            data = json.loads(body)
        except (json.JSONDecodeError, ValueError):
            return None, self._err('bad_request', 'body is not valid JSON')
        if not isinstance(data, dict):
            return None, self._err('bad_request', 'body must be a JSON object')
        return data, None
    def _build_event(self, data: dict) -> tuple[platform_events.MessageEvent, str, str, str]:
        """Build a platform event from inbound data.
        Returns (event, session_id, session_type, message_id).
        """
        session_id = str(data['session_id'])
        session_type = data.get('session_type') or self.config.get('default_session_type', 'person')
        sender_meta = data.get('sender') or {}
        sender_name = str(sender_meta.get('name', 'User'))
        message_id = 'in_' + uuid.uuid4().hex
        chain = platform_message.MessageChain.model_validate(data['message'])
        # Carry the inbound message id + timestamp as the Source component.
        chain.insert(0, platform_message.Source(id=message_id, time=datetime.now()))
        if session_type == 'group':
            group = platform_entities.Group(
                id=session_id,
                name=str(sender_meta.get('group_name', session_id)),
                permission=platform_entities.Permission.Member,
            )
            sender = platform_entities.GroupMember(
                id=str(sender_meta.get('id', session_id)),
                member_name=sender_name,
                group=group,
                permission=platform_entities.Permission.Member,
            )
            event = platform_events.GroupMessage(sender=sender, message_chain=chain, time=datetime.now().timestamp())
        else:
            sender = platform_entities.Friend(id=session_id, nickname=sender_name, remark=sender_name)
            event = platform_events.FriendMessage(sender=sender, message_chain=chain, time=datetime.now().timestamp())
        return event, session_id, session_type, message_id
    async def _handle_inbound(self, request, sync: bool):
        data, err = await self._read_and_verify(request)
        if err is not None:
            return err
        if 'session_id' not in data or 'message' not in data:
            return self._err('bad_request', 'session_id and message are required')
        # Idempotency.
        idem = request.headers.get(signing.HEADER_IDEMPOTENCY)
        if idem:
            self._prune_idempotency()
            if idem in self.idempotency_cache:
                return self._err('duplicate', 'idempotency key already accepted')
            self.idempotency_cache[idem] = time.time()
        try:
            event, session_id, session_type, message_id = self._build_event(data)
        except Exception as e:  # noqa: BLE001
            return self._err('bad_request', f'failed to parse message: {e}')
        listener = self.listeners.get(type(event))
        if listener is None:
            return self._err('internal', 'no listener registered for event type')
        if sync:
            return await self._run_sync(event, listener, session_id, message_id)
        # Fire-and-collect: kick the pipeline, return 202 immediately.
        asyncio.create_task(listener(event, self))
        return quart.jsonify(
            {
                'code': 0,
                'msg': 'accepted',
                'data': {
                    'session_id': session_id,
                    'accepted_message_id': message_id,
                    'aggregating': True,
                },
            }
        ), 202
    async def _handle_reset(self, request):
        data, err = await self._read_and_verify(request)
        if err is not None:
            return err
        if 'session_id' not in data:
            return self._err('bad_request', 'session_id is required')
        session_id = str(data['session_id'])
        session_type = data.get('session_type') or self.config.get('default_session_type', 'person')
        launcher_type = 'group' if session_type == 'group' else 'person'
        removed = await self._reset_session(launcher_type, session_id)
        return quart.jsonify({'code': 0, 'msg': 'reset', 'data': {'session_id': session_id, 'removed': removed}}), 200
    async def _reset_session(self, launcher_type: str, launcher_id: str) -> bool:
        """Drop the matching session so the next message starts a fresh conversation."""
        sess_mgr = self.ap.sess_mgr
        before = len(sess_mgr.session_list)
        sess_mgr.session_list = [
            s
            for s in sess_mgr.session_list
            if not (
                str(s.launcher_type.value if hasattr(s.launcher_type, 'value') else s.launcher_type) == launcher_type
                and str(s.launcher_id) == launcher_id
            )
        ]
        return len(sess_mgr.session_list) < before
    # -- outbound -------------------------------------------------------------
    @staticmethod
    def _extract_session_id(message_source: platform_events.MessageEvent) -> str:
        if isinstance(message_source, platform_events.GroupMessage):
            return str(message_source.sender.group.id)
        return str(message_source.sender.id)
    @staticmethod
    def _extract_reply_to(message_source: platform_events.MessageEvent) -> str:
        for comp in message_source.message_chain:
            if isinstance(comp, platform_message.Source):
                return str(comp.id)
        return ''
    def _next_sequence(self, session_id: str, is_final: bool) -> int:
        state = self.outbound_states.setdefault(session_id, _SessionOutbound())
        if state.last_was_final:
            state.sequence = 1
        else:
            state.sequence += 1
        state.last_was_final = is_final
        return state.sequence
    async def _enqueue_callback(self, session_id: str, payload: dict) -> None:
        state = self.outbound_states.setdefault(session_id, _SessionOutbound())
        if state.worker is None or state.worker.done():
            state.worker = asyncio.create_task(self._outbound_worker(session_id, state))
        try:
            state.queue.put_nowait(payload)
        except asyncio.QueueFull:
            # Drop oldest to bound memory, then enqueue (best-effort: the dropped reply is lost).
            try:
                state.queue.get_nowait()
            except asyncio.QueueEmpty:
                pass
            await self.logger.warning(f'http_bot outbound queue full for session {session_id}; dropped oldest')
            state.queue.put_nowait(payload)
    async def _outbound_worker(self, session_id: str, state: _SessionOutbound) -> None:
        while True:
            payload = await state.queue.get()
            try:
                await self._deliver_callback(payload)
            except Exception as e:  # noqa: BLE001
                await self.logger.error(f'http_bot callback delivery failed for {session_id}: {e}')
            finally:
                state.queue.task_done()
    async def _deliver_callback(self, payload: dict) -> None:
        callback_url = self.config.get('callback_url', '')
        if not callback_url:
            await self.logger.warning('http_bot has no callback_url configured; dropping reply')
            return
        body = json.dumps(payload, ensure_ascii=False).encode()
        secret = self.config.get('outbound_secret') or self.config.get('inbound_secret', '')
        ts, sig = signing.sign(secret, body)
        headers = {
            'Content-Type': 'application/json',
            signing.HEADER_TIMESTAMP: ts,
            signing.HEADER_SIGNATURE: sig,
        }
        timeout = aiohttp.ClientTimeout(total=int(self.config.get('callback_timeout', 15)))
        max_retries = int(self.config.get('callback_max_retries', 3))
        session = httpclient.get_session()
        attempt = 0
        while True:
            attempt += 1
            try:
                async with session.post(callback_url, data=body, headers=headers, timeout=timeout) as resp:
                    if resp.status < 400:
                        return
                    if resp.status < 500 or attempt > max_retries:
                        await self.logger.warning(f'http_bot callback {callback_url} -> {resp.status}, giving up')
                        return
            except (aiohttp.ClientError, asyncio.TimeoutError) as e:
                if attempt > max_retries:
                    await self.logger.warning(f'http_bot callback {callback_url} failed after {attempt} tries: {e}')
                    return
            await asyncio.sleep(min(2 ** (attempt - 1), 30))
    async def _emit_reply(
        self,
        message_source: platform_events.MessageEvent,
        message: platform_message.MessageChain,
        is_final: bool,
        stream: bool,
    ) -> dict:
        session_id = self._extract_session_id(message_source)
        reply_to = self._extract_reply_to(message_source)
        sequence = self._next_sequence(session_id, is_final)
        parts = [c.model_dump() if hasattr(c, 'model_dump') else c.__dict__ for c in message]
        payload = {
            'session_id': session_id,
            'reply_to': reply_to,
            'sequence': sequence,
            'is_final': is_final,
            'stream': stream,
            'message': parts,
            'timestamp': datetime.now().isoformat(),
        }
        # If a /sync request is awaiting this session, collect instead of POSTing.
        collector = self.sync_waiters.get(session_id)
        if collector is not None:
            collector.parts.extend(parts)
            if is_final:
                collector.done.set()
            return payload
        await self._enqueue_callback(session_id, payload)
        return payload
    async def send_message(self, target_type: str, target_id: str, message: platform_message.MessageChain) -> dict:
        """Proactively push a message to a session (target_id == session_id)."""
        sequence = self._next_sequence(str(target_id), is_final=True)
        payload = {
            'session_id': str(target_id),
            'reply_to': '',
            'sequence': sequence,
            'is_final': True,
            'stream': False,
            'message': [c.model_dump() if hasattr(c, 'model_dump') else c.__dict__ for c in message],
            'timestamp': datetime.now().isoformat(),
        }
        await self._enqueue_callback(str(target_id), payload)
        return payload
    async def reply_message(
        self,
        message_source: platform_events.MessageEvent,
        message: platform_message.MessageChain,
        quote_origin: bool = False,
    ) -> dict:
        return await self._emit_reply(message_source, message, is_final=True, stream=False)
    async def reply_message_chunk(
        self,
        message_source: platform_events.MessageEvent,
        bot_message,
        message: platform_message.MessageChain,
        quote_origin: bool = False,
        is_final: bool = False,
    ) -> dict:
        message_is_final = is_final and getattr(bot_message, 'tool_calls', None) is None
        return await self._emit_reply(message_source, message, is_final=message_is_final, stream=True)
    # -- sync convenience mode ------------------------------------------------
    async def _run_sync(self, event, listener, session_id: str, message_id: str):
        """Push a message and wait for the final reply, collapsing 1->M parts.
        Lossy by design (drops streaming/ordering nuance); documented as such.
        Concurrency-safe: routing is via the per-session ``sync_waiters``
        registry that ``_emit_reply`` consults, not by patching methods.
        """
        if session_id in self.sync_waiters:
            return self._err('duplicate', 'a sync request is already in flight for this session')
        collector = _SyncCollector()
        self.sync_waiters[session_id] = collector
        try:
            asyncio.create_task(listener(event, self))
            timeout = int(self.config.get('callback_timeout', 15)) * 4
            try:
                await asyncio.wait_for(collector.done.wait(), timeout=timeout)
            except asyncio.TimeoutError:
                await self.logger.warning(f'http_bot sync wait timed out for session {session_id}')
        finally:
            self.sync_waiters.pop(session_id, None)
        return quart.jsonify(
            {
                'code': 0,
                'msg': 'ok',
                'data': {
                    'session_id': session_id,
                    'reply_to': message_id,
                    'message': collector.parts,
                },
            }
        ), 200
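The sync mode above amounts to a per-session waiter registry plus an `asyncio.Event` that the reply path sets on the final part. The following is a minimal standalone sketch of that pattern, not the adapter's code: `SyncCollector`, `waiters`, `emit_reply`, `run_sync` and `fake_pipeline` are hypothetical names, and parts are plain strings rather than callback payloads.

```python
import asyncio


class SyncCollector:
    """Collects reply parts for one session until the final part arrives."""

    def __init__(self):
        self.parts = []
        self.done = asyncio.Event()


# Hypothetical registry: session_id -> collector for the in-flight sync request.
waiters = {}


async def emit_reply(session_id, text, is_final):
    """Route a reply part to the waiting collector, if there is one."""
    collector = waiters.get(session_id)
    if collector is None:
        return False  # no sync request in flight; a real adapter would call back instead
    collector.parts.append(text)
    if is_final:
        collector.done.set()
    return True


async def run_sync(session_id, producer, timeout):
    """Start the producer and wait for the final part, or give up after timeout."""
    if session_id in waiters:
        return None  # a sync request is already in flight for this session
    collector = SyncCollector()
    waiters[session_id] = collector
    task = asyncio.create_task(producer(session_id))
    try:
        try:
            await asyncio.wait_for(collector.done.wait(), timeout=timeout)
        except asyncio.TimeoutError:
            pass  # return whatever parts arrived before the deadline
    finally:
        waiters.pop(session_id, None)
        if not task.done():
            task.cancel()
    return collector.parts


async def fake_pipeline(session_id):
    await emit_reply(session_id, 'part 1', is_final=False)
    await asyncio.sleep(0)
    await emit_reply(session_id, 'part 2', is_final=True)


async def silent_pipeline(session_id):
    await asyncio.sleep(10)  # never replies within the test timeout


async def main():
    ok = await run_sync('s1', fake_pipeline, timeout=1.0)
    timed_out = await run_sync('s2', silent_pipeline, timeout=0.01)
    return ok, timed_out


ok_parts, timed_out_parts = asyncio.run(main())
```

Because the registry entry is removed in `finally`, a timed-out request does not block the next sync request for the same session.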
@@ -0,0 +1,9 @@
 <svg width="800px" height="800px" viewBox="0 0 64 64" fill="none" xmlns="http://www.w3.org/2000/svg">
  <rect x="2" y="2" width="60" height="60" rx="14" fill="#2563EB"/>
  <g stroke="#FFFFFF" stroke-width="3.6" stroke-linecap="round" stroke-linejoin="round" fill="none">
    <!-- </> code icon -->
    <path d="M24 22 L14 32 L24 42"/>
    <path d="M40 22 L50 32 L40 42"/>
    <path d="M36 18 L28 46"/>
  </g>
 </svg>
@@ -0,0 +1,153 @@
 apiVersion: v1
 kind: MessagePlatformAdapter
 metadata:
  name: http_bot
  label:
    en_US: HTTP Bot
    zh_Hans: HTTP 通用接入
    zh_Hant: HTTP 通用接入
    ja_JP: HTTP ボット
  description:
    en_US: Integrate any backend over plain HTTP. Push messages in via a signed webhook, receive replies on a callback URL. Server-to-server, no long-lived connection. Preserves message aggregation (N->1) and multi-part replies (1->M).
    zh_Hans: 通过 HTTP 接入任意后端系统。以签名 Webhook 推入消息，在回调地址接收回复。面向服务间集成，无需长连接。完整保留消息聚合（多条合一）与多段回复（一条问、多条回）能力。
    zh_Hant: 透過 HTTP 接入任意後端系統。以簽名 Webhook 推入訊息，在回調地址接收回覆。面向服務間整合，無需長連線。完整保留訊息聚合（多條合一）與多段回覆（一條問、多條回）能力。
    ja_JP: 任意のバックエンドを HTTP で接続。署名付き Webhook でメッセージを送信し、コールバック URL で返信を受信します。サーバー間連携、長時間接続不要。メッセージ集約（N→1）とマルチパート返信（1→M）に対応。
  icon: http_bot.svg
 spec:
  categories:
    - popular
    - global
  help_links:
    zh: https://docs.langbot.app/zh/platforms/http-bot
    en: https://docs.langbot.app/en/platforms/http-bot
    ja: https://docs.langbot.app/ja/platforms/http-bot
  config:
    - name: webhook_url
      label:
        en_US: Inbound Webhook URL
        zh_Hans: 入站 Webhook 地址
        zh_Hant: 入站 Webhook 地址
        ja_JP: 受信 Webhook URL
      description:
        en_US: Copy this URL. Your backend POSTs messages here (signed with the inbound secret).
        zh_Hans: 复制此地址。你的后端将消息以签名方式 POST 到这里。
        zh_Hant: 複製此地址。你的後端將訊息以簽名方式 POST 到這裡。
        ja_JP: この URL をコピーしてください。バックエンドは署名付きでここにメッセージを POST します。
      type: webhook-url
      required: false
      default: ""
    - name: inbound_secret
      label:
        en_US: Inbound Signing Secret
        zh_Hans: 入站签名密钥
        zh_Hant: 入站簽名密鑰
        ja_JP: 受信署名シークレット
      description:
        en_US: HMAC-SHA256 secret your backend uses to sign inbound requests. LangBot verifies every inbound POST with it.
        zh_Hans: 你的后端用于对入站请求做 HMAC-SHA256 签名的密钥；LangBot 据此校验每个入站 POST。
        zh_Hant: 你的後端用於對入站請求做 HMAC-SHA256 簽名的密鑰；LangBot 據此校驗每個入站 POST。
        ja_JP: バックエンドが受信リクエストの署名に使う HMAC-SHA256 シークレット。LangBot は受信 POST ごとに検証します。
      type: string
      required: true
      default: ""
    - name: callback_url
      label:
        en_US: Outbound Callback URL
        zh_Hans: 出站回调地址
        zh_Hant: 出站回調地址
        ja_JP: 送信コールバック URL
      description:
        en_US: Where LangBot POSTs replies. One turn may trigger multiple callbacks (1->M). For security the callback URL is taken ONLY from this config and cannot be overridden per-message.
        zh_Hans: LangBot 将回复 POST 到此地址。一轮对话可能触发多次回调（一问多答）。出于安全考虑，回调地址只取自此配置，不允许逐条消息覆盖。
        zh_Hant: LangBot 將回覆 POST 到此地址。一輪對話可能觸發多次回調（一問多答）。出於安全考慮，回調地址只取自此配置，不允許逐條訊息覆蓋。
        ja_JP: LangBot が返信を POST する先。1 ターンで複数回のコールバック（1→M）が発生し得ます。セキュリティ上、コールバック URL はこの設定からのみ取得し、メッセージ単位で上書きできません。
      type: string
      required: true
      default: ""
    - name: outbound_secret
      label:
        en_US: Outbound Signing Secret
        zh_Hans: 出站签名密钥
        zh_Hant: 出站簽名密鑰
        ja_JP: 送信署名シークレット
      description:
        en_US: HMAC-SHA256 secret LangBot uses to sign outbound callbacks so your receiver can verify them. Falls back to the inbound secret when empty.
        zh_Hans: LangBot 用于对出站回调签名的密钥，供你的接收端校验。留空时回退使用入站密钥。
        zh_Hant: LangBot 用於對出站回調簽名的密鑰，供你的接收端校驗。留空時回退使用入站密鑰。
        ja_JP: LangBot が送信コールバックの署名に使う HMAC-SHA256 シークレット。受信側で検証できます。空の場合は受信シークレットを使用します。
      type: string
      required: false
      default: ""
    - name: default_session_type
      label:
        en_US: Default Session Type
        zh_Hans: 默认会话类型
        zh_Hant: 預設會話類型
        ja_JP: デフォルトセッションタイプ
      description:
        en_US: Session type used when an inbound message omits session_type.
        zh_Hans: 入站消息未携带 session_type 时使用的会话类型。
        zh_Hant: 入站訊息未攜帶 session_type 時使用的會話類型。
        ja_JP: 受信メッセージに session_type がない場合に使用するセッションタイプ。
      type: select
      options:
        - name: person
          label:
            en_US: Person (1-on-1)
            zh_Hans: 个人（一对一）
            zh_Hant: 個人（一對一）
            ja_JP: 個人（1 対 1）
        - name: group
          label:
            en_US: Group
            zh_Hans: 群组
            zh_Hant: 群組
            ja_JP: グループ
      required: false
      default: person
    - name: signature_required
      label:
        en_US: Require Inbound Signature
        zh_Hans: 强制入站签名校验
        zh_Hant: 強制入站簽名校驗
        ja_JP: 受信署名を必須にする
      description:
        en_US: When enabled (recommended), every inbound POST must carry a valid signature. Disable ONLY for local development behind a trusted network.
        zh_Hans: 开启（推荐）后，每个入站 POST 都必须带有效签名。仅在受信任内网的本地开发时关闭。
        zh_Hant: 開啟（推薦）後，每個入站 POST 都必須帶有效簽名。僅在受信任內網的本地開發時關閉。
        ja_JP: 有効（推奨）にすると、すべての受信 POST に有効な署名が必要です。信頼できるネットワーク内のローカル開発時のみ無効化してください。
      type: boolean
      required: false
      default: true
    - name: callback_timeout
      label:
        en_US: Callback Timeout (seconds)
        zh_Hans: 回调超时（秒）
        zh_Hant: 回調逾時（秒）
        ja_JP: コールバックタイムアウト（秒）
      description:
        en_US: Per-callback HTTP timeout.
        zh_Hans: 单次回调的 HTTP 超时时间。
        zh_Hant: 單次回調的 HTTP 逾時時間。
        ja_JP: コールバックごとの HTTP タイムアウト。
      type: integer
      required: false
      default: 15
    - name: callback_max_retries
      label:
        en_US: Callback Max Retries
        zh_Hans: 回调最大重试次数
        zh_Hant: 回調最大重試次數
        ja_JP: コールバック最大リトライ回数
      description:
        en_US: Retries on timeout or 5xx, with exponential backoff.
        zh_Hans: 超时或 5xx 时按指数退避重试的次数。
        zh_Hant: 逾時或 5xx 時按指數退避重試的次數。
        ja_JP: タイムアウトまたは 5xx 時に指数バックオフでリトライする回数。
      type: integer
      required: false
      default: 3
 execution:
  python:
    path: ./http_bot.py
    attr: HttpBotAdapter
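The `callback_max_retries` description ("retries on timeout or 5xx, with exponential backoff") can be pictured with a small sketch. This is an illustration under stated assumptions, not the adapter's schedule: the 1-second base delay, the doubling factor, and the reading that retries are counted on top of the first attempt are all assumptions, and `should_retry`, `backoff_delays` and `simulate_delivery` are hypothetical helpers.

```python
from typing import List, Optional, Tuple


def should_retry(status_code: Optional[int]) -> bool:
    """None stands for a timeout; otherwise only 5xx responses are retried."""
    return status_code is None or 500 <= status_code < 600


def backoff_delays(max_retries: int = 3, base_delay: float = 1.0) -> List[float]:
    """Delay before each retry, doubling every time (assumed base and factor)."""
    return [base_delay * (2 ** attempt) for attempt in range(max_retries)]


def simulate_delivery(outcomes: List[Optional[int]], max_retries: int = 3) -> Tuple[bool, int]:
    """Walk observed outcomes and return (delivered, attempts_made).

    Assumes one initial attempt plus up to max_retries retries.
    """
    attempts = 0
    for outcome in outcomes[: max_retries + 1]:
        attempts += 1
        if not should_retry(outcome):
            # A non-retryable outcome ends delivery: success on 2xx, failure otherwise.
            return 200 <= outcome < 300, attempts
    return False, attempts
```

With the default of 3 retries, a receiver that keeps returning 5xx would see at most four POSTs for one reply under these assumptions.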
@@ -0,0 +1,95 @@
 """HMAC signing utilities for the HTTP Bot adapter.
 A dependency-free, symmetric HMAC-SHA256 scheme used in *both* directions:
    signing_string = "{timestamp}." + raw_body_bytes
    signature      = "sha256=" + hex(HMAC_SHA256(secret, signing_string))
 Inbound requests are signed by the caller and verified here; outbound
 callbacks are signed here and verified by the caller. The scheme is trivial to
 reproduce in any language (see docs/platforms/http-bot.md for JS/curl).
 """
 from __future__ import annotations
 import hashlib
 import hmac
 import time
 # Header names (kept here so adapter + clients agree on a single source).
 HEADER_TIMESTAMP = 'X-LB-Timestamp'
 HEADER_SIGNATURE = 'X-LB-Signature'
 HEADER_IDEMPOTENCY = 'X-LB-Idempotency-Key'
 # Maximum allowed clock skew between signer and verifier (seconds).
 DEFAULT_REPLAY_WINDOW = 300
 def compute_signature(secret: str, body: bytes, timestamp: str | int) -> str:
    """Compute the ``sha256=<hex>`` signature for *body* at *timestamp*.
    Args:
        secret: Shared HMAC secret.
        body: Raw request body bytes (exactly as sent on the wire).
        timestamp: Unix timestamp (seconds) as str or int.
    Returns:
        The signature string, e.g. ``sha256=ab12...``.
    """
    signing_string = f'{timestamp}.'.encode() + body
    digest = hmac.new(secret.encode(), signing_string, hashlib.sha256).hexdigest()
    return f'sha256={digest}'
 def sign(secret: str, body: bytes, timestamp: int | None = None) -> tuple[str, str]:
    """Produce ``(timestamp, signature)`` for an outbound request.
    Args:
        secret: Shared HMAC secret.
        body: Raw request body bytes.
        timestamp: Optional fixed timestamp; defaults to ``int(time.time())``.
    Returns:
        ``(timestamp_str, signature_str)``.
    """
    ts = str(timestamp if timestamp is not None else int(time.time()))
    return ts, compute_signature(secret, body, ts)
 def verify(
    secret: str,
    body: bytes,
    timestamp: str | None,
    signature: str | None,
    replay_window: int = DEFAULT_REPLAY_WINDOW,
 ) -> tuple[bool, str]:
    """Verify an inbound signature.
    Args:
        secret: Shared HMAC secret.
        body: Raw request body bytes.
        timestamp: Value of the timestamp header.
        signature: Value of the signature header.
        replay_window: Max allowed skew in seconds.
    Returns:
        ``(ok, reason)``. ``reason`` is empty when ``ok`` is True, otherwise a
        short machine-friendly cause (``missing_headers`` / ``bad_timestamp`` /
        ``expired`` / ``signature_mismatch``).
    """
    if not timestamp or not signature:
        return False, 'missing_headers'
    try:
        ts_int = int(float(timestamp))
    except (ValueError, TypeError):
        return False, 'bad_timestamp'
    if abs(int(time.time()) - ts_int) > replay_window:
        return False, 'expired'
    expected = compute_signature(secret, body, timestamp)
    if not hmac.compare_digest(expected, signature):
        return False, 'signature_mismatch'
    return True, ''
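From the caller's side, the scheme in the module docstring can be reproduced with the standard library alone. The sketch below signs a raw body and checks it the way a receiver might; `build_signed_headers` and `check_signature` are hypothetical helper names, the secret is a placeholder, and the JSON body shape (beyond `session_id`) is an assumption for illustration.

```python
import hashlib
import hmac
import json
import time
from typing import Dict, Optional


def build_signed_headers(secret: str, body: bytes, timestamp: Optional[int] = None) -> Dict[str, str]:
    """Return the timestamp and signature headers for a raw request body."""
    ts = str(timestamp if timestamp is not None else int(time.time()))
    # signing_string = "{timestamp}." + raw_body_bytes
    signing_string = f'{ts}.'.encode() + body
    digest = hmac.new(secret.encode(), signing_string, hashlib.sha256).hexdigest()
    return {'X-LB-Timestamp': ts, 'X-LB-Signature': f'sha256={digest}'}


def check_signature(secret: str, body: bytes, headers: Dict[str, str], window: int = 300) -> bool:
    """Receiver-side check: timestamp within the replay window and matching HMAC."""
    ts = headers.get('X-LB-Timestamp', '')
    sig = headers.get('X-LB-Signature', '')
    if not ts.isdigit() or abs(int(time.time()) - int(ts)) > window:
        return False
    expected = build_signed_headers(secret, body, int(ts))['X-LB-Signature']
    return hmac.compare_digest(expected, sig)


# Sign the exact bytes that go on the wire; re-serializing the JSON later
# could change the bytes and invalidate the signature.
body = json.dumps({'session_id': 'demo', 'message': 'hello'}).encode()
headers = build_signed_headers('example-secret', body)
```

The same two functions cover both directions, since the scheme is symmetric: a callback receiver would call `check_signature` with the outbound secret (or the inbound one when no outbound secret is configured).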
@@ -248,6 +248,15 @@ class PluginRuntimeConnector(ManagedRuntimeConnector):
        mode = mcp_data.get('mode') or 'stdio'
        extra_args = mcp_data.get('extra_args') or {}
        # The MCP transport selection was simplified to two modes: 'stdio'
        # (local, Box-sandboxed) and 'remote' (the runtime auto-detects
        # Streamable HTTP vs. legacy SSE from the URL). Marketplace records may
        # still carry the older 'http'/'sse' modes — normalize them to 'remote'
        # so the installed server shows up correctly in the two-option UI. The
        # connection args (url/headers/timeout/ssereadtimeout) are preserved and
        # consumed by the auto-detecting remote transport regardless.
        if mode in ('http', 'sse'):
            mode = 'remote'
        # Marketplace records carry the rendered README markdown; persist it so
        # the detail page Docs tab works offline and without a marketplace round-trip.
        readme = mcp_data.get('readme') or ''
@@ -167,6 +167,36 @@ class RuntimeMCPSession:
        await self.session.initialize()
    async def _init_remote_server(self):
        """Connect to a remote MCP server, auto-detecting the transport.
        The user only supplies a URL ("remote" mode); they should not have to
        know whether the server speaks the modern Streamable HTTP transport or
        the legacy HTTP+SSE transport. Following the MCP backwards-compatibility
        guidance, we try Streamable HTTP first and fall back to SSE when it
        fails (e.g. the endpoint returns 4xx to the initialize POST).
        """
        try:
            await self._init_streamable_http_server()
            return
        except Exception as e:
            self.ap.logger.info(
                f'MCP server {self.server_name}: Streamable HTTP transport failed '
                f'({self._describe_exception(e)}), falling back to SSE'
            )
        # The Streamable HTTP attempt may have partially entered the transport /
        # session into the exit stack before failing. Tear it down and start
        # from a clean stack before trying SSE so we do not leak connections.
        try:
            await self.exit_stack.aclose()
        except Exception as cleanup_err:
            self.ap.logger.debug(f'MCP server {self.server_name}: error cleaning up before SSE fallback: {cleanup_err}')
        self.exit_stack = AsyncExitStack()
        self.session = None
        await self._init_sse_server()
    _MAX_RETRIES = 3
    _RETRY_DELAYS = [2, 4, 8]
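The try-then-fall-back flow in `_init_remote_server` depends on closing the partially filled exit stack before the second attempt. Here is a minimal standalone sketch of that pattern using only `contextlib.AsyncExitStack`; `DemoSession` and its fake transports are hypothetical stand-ins and do not use the MCP client library.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager


class DemoSession:
    def __init__(self):
        self.exit_stack = AsyncExitStack()
        self.transport = None
        self.events = []  # records open/close order for inspection

    @asynccontextmanager
    async def _open(self, name):
        self.events.append(f'open:{name}')
        try:
            yield name
        finally:
            self.events.append(f'close:{name}')

    async def _init_primary(self):
        # The transport is entered into the stack, then initialization fails.
        await self.exit_stack.enter_async_context(self._open('streamable_http'))
        raise ConnectionError('initialize POST rejected')

    async def _init_fallback(self):
        self.transport = await self.exit_stack.enter_async_context(self._open('sse'))

    async def connect(self):
        try:
            await self._init_primary()
            return
        except Exception:
            pass  # fall through to the legacy transport
        # Release whatever the failed attempt registered, then start clean.
        await self.exit_stack.aclose()
        self.exit_stack = AsyncExitStack()
        await self._init_fallback()


async def main():
    session = DemoSession()
    await session.connect()
    await session.exit_stack.aclose()
    return session


session = asyncio.run(main())
```

The recorded event order shows the first transport being closed before the fallback opens, which is the property the cleanup step is there to guarantee.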
@@ -175,6 +205,8 @@ class RuntimeMCPSession:
        try:
            if self.server_config['mode'] == 'stdio':
                await self._init_stdio_python_server()
            elif self.server_config['mode'] == 'remote':
                await self._init_remote_server()
            elif self.server_config['mode'] == 'sse':
                await self._init_sse_server()
            elif self.server_config['mode'] == 'http':
@@ -159,6 +159,21 @@ class SurveyManager:
        """Clear the pending survey (after user responds or dismisses)."""
        self._pending_survey = None
    async def _build_base_metadata(self, user_email: str | None = None) -> dict:
        metadata = {
            'version': constants.semantic_version,
            'instance_id': constants.instance_id,
        }
        if user_email:
            metadata['login_account'] = user_email
            try:
                user_obj = await self.ap.user_service.get_user_by_email(user_email)
                metadata['account_type'] = getattr(user_obj, 'account_type', '') or 'local'
                metadata['space_account_uuid'] = getattr(user_obj, 'space_account_uuid', '') or ''
            except Exception:
                pass
        return metadata
    async def submit_response(self, survey_id: str, answers: dict, completed: bool = True) -> bool:
        """Submit a survey response to Space."""
        if not self._is_space_configured():
@@ -169,9 +184,7 @@ class SurveyManager:
                'survey_id': survey_id,
                'instance_id': constants.instance_id,
                'answers': answers,
-                'metadata': {
-                    'version': constants.semantic_version,
-                },
+                'metadata': await self._build_base_metadata(),
                'completed': completed,
            }
            async with httpx.AsyncClient(timeout=httpx.Timeout(10)) as client:
@@ -183,6 +196,33 @@ class SurveyManager:
            self.ap.logger.warning(f'Failed to submit survey response: {e}')
        return False
    async def submit_feedback(
        self,
        content: str,
        attachments: list[dict],
        user_email: str | None = None,
    ) -> bool:
        """Submit an on-demand user feedback item to Space."""
        if not self._is_space_configured():
            return False
        try:
            url = f'{self._space_url}/api/v1/survey/feedback'
            metadata = await self._build_base_metadata(user_email)
            payload = {
                'instance_id': constants.instance_id,
                'content': content,
                'attachments': attachments,
                'metadata': metadata,
            }
            async with httpx.AsyncClient(timeout=httpx.Timeout(30)) as client:
                resp = await client.post(url, json=payload)
                if resp.status_code == 200:
                    return True
                self.ap.logger.warning(f'Failed to submit feedback: {resp.status_code} {resp.text[:200]}')
        except Exception as e:
            self.ap.logger.warning(f'Failed to submit feedback: {e}')
        return False
    async def dismiss_survey(self, survey_id: str) -> bool:
        """Dismiss a survey."""
        if not self._is_space_configured():
@@ -144,6 +144,8 @@ box:
            - './data/box'
            - '/tmp'
        workspace_quota_mb: null  # Optional disk quota override (>= 0). null = profile default.
    docker:
        cpu_limit_enabled: true  # When false, Docker sandbox containers are started without --cpus. Memory and PID limits still apply.
    e2b:
        api_key: ''  # Can also be set via E2B_API_KEY env var.
        api_url: ''  # Custom API URL for self-hosted deployments.
@@ -1,21 +0,0 @@
 <!DOCTYPE html>
 <html lang="zh">
 <head>
  <meta charset="UTF-8">
  <title>LangBot Embed Widget Test</title>
  <style>
    body { font-family: sans-serif; padding: 40px; background: #f5f5f5; }
    h1 { margin-bottom: 10px; }
    p { color: #666; }
    code { background: #e0e0e0; padding: 2px 6px; border-radius: 3px; }
  </style>
 </head>
 <body>
  <h1>LangBot Embed Widget Test Page</h1>
  <p>If the widget loaded correctly, you should see a blue chat bubble in the bottom-right corner.</p>
  <p>Replace the <code>BOT_UUID</code> below with your actual bot UUID.</p>
  <!-- Replace BOT_UUID with your real bot UUID -->
 <script data-title="LangBot" src="http://localhost:5300/api/v1/embed/a0ab80e7-742a-445f-bd0e-7d9758f1cfa7/widget.js"></script>
 </body>
 </html>
@@ -17,7 +17,21 @@ from langbot.pkg.persistence.alembic_runner import (
    run_alembic_upgrade,
    run_alembic_stamp,
    get_alembic_current,
    _ALEMBIC_DIR,
 )
 from alembic.config import Config
 from alembic.script import ScriptDirectory
 def _get_script_head() -> str:
    """Resolve the current Alembic head revision from the script directory.
    Avoids hardcoding a revision number in assertions so adding a new
    migration doesn't require editing the migration tests.
    """
    cfg = Config()
    cfg.set_main_option('script_location', _ALEMBIC_DIR)
    return ScriptDirectory.from_config(cfg).get_current_head()
 pytestmark = pytest.mark.integration
@@ -103,8 +117,10 @@ class TestSQLiteMigrationUpgrade:
        # Verify revision
        rev = await get_alembic_current(sqlite_engine)
        assert rev is not None, 'Expected a revision after upgrade'
-        # Head should be the latest migration
-        assert rev.startswith('0005'), f'Expected head to be 0005_*, got {rev}'
+        # Head should be the latest migration. Resolve the actual head from the
+        # Alembic script directory instead of hardcoding a revision number, so
+        # adding a new migration doesn't require editing this assertion.
+        assert rev == _get_script_head(), f'Expected head {_get_script_head()}, got {rev}'
    @pytest.mark.asyncio
    async def test_upgrade_idempotent(self, sqlite_engine):
Author	SHA1	Message	Date
Hyu	ddb77fc43c	fix(api): guard /set-password with allow_modify_login_info (#2288 ) The /change-password and /bind-space endpoints already refuse when system.allow_modify_login_info is false, but /set-password did not, leaving a path to alter login credentials on locked-down deployments (e.g. public demo instances). Apply the same guard. Co-authored-by: dadachann <185672915+dadachann@users.noreply.github.com>	2026-06-26 16:35:50 +08:00
huanghuoguoguo	5b2826fa49	Add performance and reliability QA gates (#2283 ) * Add performance and reliability QA gates * test(skills): prepare user path performance gate * test(skills): add debug chat load gate * test(skills): extend fake provider load profiles * test(skills): add debug chat timing and isolation probes * test(skills): clarify manual QA perf gates	2026-06-25 21:02:44 +08:00
Hyu	20636ac432	Merge pull request #2284 from langbot-app/fix/api-password-thread-offload fix(api): offload password hashing from event loop	2026-06-25 20:31:44 +08:00
Hyu	af42602547	Merge pull request #2285 from langbot-app/fix/monitoring-null-payloads fix(monitoring): tolerate null API payloads	2026-06-25 20:26:25 +08:00
dadachann	53b20e2b13	fix(monitoring): tolerate null API payloads Normalize monitoring API responses before rendering so empty or error payloads with data:null cannot crash the dashboard. Also guard chart, token, and box session arrays before reading length/map.	2026-06-25 08:22:01 -04:00
dadachann	1242dc2d21	fix(api): offload password hashing from event loop	2026-06-25 06:29:16 -04:00
RockChinQ	04628d93cb	docs: add architecture guide for agents	2026-06-25 04:17:19 -04:00
RockChinQ	9c22a1521c	fix(box): defer separated workspace ownership to runtime	2026-06-25 00:09:40 -04:00
RockChinQ	c8d5039580	feat(box): expose Docker CPU limit toggle	2026-06-24 23:31:53 -04:00
dadachann	85d8d9304e	fix(web): keep feedback dialog interactive	2026-06-24 10:10:19 -04:00
Hyu	76471af179	feat(web): add sidebar feedback popover Co-authored-by: dadachann <185672915+dadachann@users.noreply.github.com>	2026-06-24 16:43:50 +08:00
RockChinQ	59b2a7cd51	fix(monitoring): hide disabled box status on cloud	2026-06-23 06:40:05 -04:00
RockChinQ	a43978ff24	chore(release): bump version to 4.10.4	2026-06-22 21:15:53 -04:00
RockChinQ	e3417dd20b	fix(release): derive package version from metadata	2026-06-22 21:10:33 -04:00
RockChinQ	2982e7c553	chore(release): bump version to 4.10.3	2026-06-22 11:12:15 -04:00
RockChinQ	e1e14e9269	chore(deps): bump langbot-plugin to 0.4.6	2026-06-22 11:08:08 -04:00
Junyan Chin	1c128a1524	docs(skills/langbot-plugin-dev): document marketplace README i18n convention (root README.md must be English; other langs in readme/)	2026-06-22 02:39:15 -04:00
RockChinQ	8ad1203fd5	docs(examples): add web-page-bot embed demo, drop stray test-embed.html Move the Page Bot (web_page_bot) embed test page out of the repo root into examples/web-page-bot/ as a proper, LangBot-styled demo: a self-contained index.html that loads the live widget.js against a running instance, plus bilingual READMEs mirroring examples/http-bot/.	2026-06-22 02:17:26 -04:00
Junyan Chin	144bec371c	feat(platform): standalone HTTP Bot adapter (server-to-server) (#2274 ) * docs(platform): add HTTP Bot adapter design (RFC) Standalone server-to-server HTTP adapter for driving a pipeline from external systems (LangBot Space ticketing et al). Inbound via the existing unified webhook route; outbound via signed callback POSTs. Preserves pipeline-native N->1 aggregation and 1->M multi-reply without a long-lived WebSocket. No core changes required (router/aggregator/pipeline untouched). * feat(platform): add standalone HTTP Bot adapter A first-class, vendor-neutral message-platform adapter (http_bot) for server-to-server integrations (LangBot Space ticketing et al). Drives a pipeline over plain HTTP with no long-lived connection: - Inbound: signed POST to the existing unified webhook route /bots/<uuid>, carrying a caller-defined session_id mapped to the LangBot launcher id via get_launcher_id -> per-session isolation. Preserves pipeline-native N->1 aggregation for free. - Outbound: each reply_message / reply_message_chunk becomes one signed callback POST to the config-only callback_url, delivered in per-session sequence order with retry/backoff -> 1->M multi-reply. - Sub-paths: /reset (drop a session) and /sync (block for the collapsed reply). - Auth: symmetric HMAC-SHA256 both directions (timestamp + replay window), no JWT/Turnstile, no socket. Decisions: callback URL is config-only (SSRF closed); reset + sync shipped; Python + TS reference clients shipped (signing verified byte-identical 3-way). No core changes: the unified webhook router, aggregator, query pool and pipeline are untouched. Adapter is auto-discovered from platform/sources/. Adds: src/langbot/pkg/platform/sources/http_bot.{py,yaml,svg} src/langbot/pkg/platform/sources/http_bot_signing.py docs/platforms/http-bot.md, docs/http-bot-openapi.json examples/http-bot/{client.py,client.ts,README.md} Updates docs/HTTP_BOT_ADAPTER_DESIGN.md (status: implemented). 
* docs(examples): add interactive HTTP Bot playground (browser debug console) A single-file aiohttp web app (examples/http-bot/playground.py) that lets you chat with a RUNNING http_bot bot from the browser and watch the protocol live: signed inbound POST -> 202 ack -> 1->M signed callbacks streamed back via SSE, with a debug panel showing the signature, HTTP status, and per-callback sequence/verification. Light LangBot-styled UI. On startup it reads the API key + http_bot bot from data/langbot.db and points the bot's callback_url + secrets back at itself via the LangBot API (live reload, no restart). README updated with a playground section. * docs(examples): add Chinese README for http-bot reference clients * style(platform): use </> code icon for http_bot adapter logo * docs(examples): point http-bot guide links to docs.langbot.app * style(platform): make http_bot icon a transparent monochrome </> so WebUI tints it like other adapters * Revert to colorful </> badge for http_bot icon (WebUI renders it as-is)	2026-06-22 13:38:00 +08:00
RockChinQ	74a18191dd	docs(readme): default docker compose command starts the sandbox The plain `docker compose up -d` leaves the Box sandbox runtime off (it's gated behind the box/all profile), so sandbox tools, skill add/edit and stdio MCP don't work out of the box. Use `docker compose --profile all up -d` across all 9 README translations so the default quick-start brings up the sandbox-capable stack.	2026-06-21 13:18:44 -04:00
RockChinQ	a15c98eb06	fix(web): point plugin help links to working docs URL The in-product plugin/add-extension help links went through link.langbot.app/{lang}/docs/plugins, which now 404s (it resolved to the removed /usage/plugin/plugin-intro path). Point them directly at the current docs page docs.langbot.app/{lang}/plugin/plugin-intro (verified 200 for zh/en/ja).	2026-06-21 12:59:13 -04:00
RockChinQ	cbe17cde6c	fix(web): provider card overflow on mobile via grid/flex min-width floor The previous truncate/shrink-0 pass only touched leaf nodes, but the min-content floor was set by two ancestors: the flex-1 left group lacked min-w-0, and CardHeader is a CSS grid whose implicit single column defaults to min-content. Constrain both (min-w-0 on the header grid + explicit grid-cols-[minmax(0,1fr)], min-w-0 on the inner flex groups) so the provider name / base_url+key subtitle actually truncate instead of forcing the card — and the whole settings modal — wider than the viewport.	2026-06-21 12:54:24 -04:00
RockChinQ	876e8bf804	fix(web): mobile overflow in settings panels - PanelToolbar: allow wrapping and tighten padding on small screens so the primary action (e.g. "创建 API 密钥") no longer runs off the dialog edge. - ProviderCard header: let the provider name truncate and pin the model-count badge and right-side action group with shrink-0 so credits / + controls stay inside the card on narrow viewports.	2026-06-21 12:48:18 -04:00
RockChinQ	b3848c9d05	feat(web): make tooltips tap-toggleable on touch devices Radix tooltips open on hover/focus only and stay closed on touch input, so on mobile every hover tooltip was unreachable. Detect coarse/no-hover pointers via matchMedia and drive the tooltip's open state ourselves so a tap on the trigger toggles it. Desktop hover/focus behaviour is unchanged (we only intercept the tap when the device has no hover capability). Fixes all tooltips app-wide from the shared primitive.	2026-06-21 12:46:18 -04:00
RockChinQ	85743cc75f	fix(tests): make Postgres migration head test revision-agnostic The PostgreSQL migration test had the same hardcoded 0005 head assertion as the SQLite one; resolve the actual head from the Alembic ScriptDirectory so 0006 (and future migrations) don't break it.	2026-06-21 12:10:20 -04:00
RockChinQ	c689b10c0d	fix(mcp): ruff format remote-mode files; make migration head test revision-agnostic CI follow-up to the local/remote MCP work: - Apply ruff format to provider/tools/loaders/mcp.py and the 0006 normalize-remote-mode migration (Lint job failed on formatting). - test_migrations.py hardcoded the head revision as 0005_*, which broke once 0006 landed. Resolve the actual head from the Alembic ScriptDirectory so future migrations don't require editing the test.	2026-06-21 12:04:37 -04:00
RockChinQ	812b1fff4c	fix(web): stop spurious page refresh on account menu open; plugin log auto-refresh as switch Two unrelated frontend fixes: - LanguageSelector mounts each time the sidebar account dropdown opens and unconditionally called i18n.changeLanguage() on mount, emitting a languageChanged event even when the language was unchanged. That handed every useTranslation() consumer a fresh `t` reference, re-running effects keyed on `t` (e.g. the plugins page system-status fetch) and surfacing as a page "refresh". Guard the call so it only fires on an actual change. - Plugin logs auto-refresh control changed from a toggle Button to a Switch + Label; the on/off button i18n keys are replaced by a single static logsAutoRefresh label across all 8 locales.	2026-06-21 11:58:01 -04:00
RockChinQ	9daf22d661	feat(plugin-market): align recommendation carousel with Space (pause + countdown ring) Port the Space marketplace recommendation carousel UX into the in-app add-extension page: a 10s auto-advance driven by a smooth countdown ring that doubles as a pause/resume toggle, and manual prev/next now reset the countdown. Adds market.recommendation.{pause,resume} across 8 locales.	2026-06-21 11:48:39 -04:00
RockChinQ	42a2c70b14	style(plugin-market): widen marketplace cards via auto-fill min width Replace fixed grid-cols breakpoints (which forced up to 4 narrow cards on wide screens) with auto-fill columns and a 24rem minimum card width on both the main market grid and the featured recommendation rows. The featured rows already measure real column count via ResizeObserver, so pagination adapts automatically.	2026-06-21 11:21:52 -04:00
RockChinQ	64ed6d994b	feat(mcp): simplify external MCP server config to local/remote modes Replace the three-way transport choice (stdio / sse / httpstream) for connecting LangBot to external MCP servers with two modes: local (stdio) and remote. Remote servers only require a URL; the runtime auto-detects the transport (tries Streamable HTTP, falls back to SSE). - provider/tools/loaders/mcp.py: add _init_remote_server() with Streamable-HTTP-then-SSE probing; dispatch 'remote' lifecycle, keep legacy sse/http branches for back-compat - plugin/connector.py: normalize legacy http/sse marketplace modes to 'remote' on Space install, preserving connection params - entity/persistence/mcp.py: document mode as stdio, remote (legacy: sse, http) - alembic 0006: idempotent data migration mapping existing sse/http rows to remote (downgrade maps back to http) - api/http/service/mcp.py: stash runtime_info (status + tool list) into test task metadata before tearing down the temp session - web: collapse mode dropdown to local/remote, remote renders URL+timeout only, edit auto-maps legacy sse/http to remote; show tools after test in create mode from task metadata; remove dead plugins/mcp-server/ tree - i18n: local/remote labels + mode/url hints across 8 locales	2026-06-21 11:20:32 -04:00
RockChinQ	2ff854f79a	build(Dockerfile): install Node.js LTS so sandbox can run npx-based stdio MCP servers The final runtime image (used by langbot/plugin_runtime/box) shipped uv and docker-cli but no node, so any npx-launched stdio MCP server inside the box sandbox exited with return_code=127 (command not found). Install Node.js 22 LTS via NodeSource; node/npx land in /usr/bin, which is on the nsjail read-only mount whitelist (_READONLY_SYSTEM_MOUNTS) and is bound into the sandbox chroot automatically.	2026-06-21 08:15:02 -04:00
RockChinQ	52c096ea4c	chore(deps): patch Dependabot vulns (Python + JS) Python (pyproject.toml + uv.lock): - aiohttp 3.14.0 -> 3.14.1 (8 alerts: medium+low) - cryptography -> 49.0.0 (high, floor 48.0.1) - langchain -> 1.3.10 (medium, floor 1.3.9) - langsmith -> 0.8.18 (high) - starlette 1.2.1 -> 1.3.1 (high+low, transitive) - pydantic-settings 2.12.0 -> 2.14.2 (medium, transitive) - torch 2.10.0 -> 2.12.1 (low, transitive; py>=3.14 only) JS (web/, dual lockfile npm+pnpm in sync): - vite ^8.0.5 -> ^8.0.16 (high+medium) - js-yaml -> 4.2.0 (medium, override >=4.2.0 <5) - form-data -> 4.0.6 (high, override) Unfixable (no upstream patch, left + reported): - chromadb critical <=1.5.9 (1.5.9 is latest) - PyPDF2 medium (deprecated; needs pypdf migration) Verified: uv sync + import check, pnpm frozen-lockfile, vite build.	2026-06-21 07:43:54 -04:00
Junyan Chin	eda80030b5	Improve README_CN.md with clearer Star instructions Update instructions for starring and watching the repository.	2026-06-21 19:34:34 +08:00
RockChinQ	dfbd176e42	docs(readme): move Star & Watch CTA after Key Capabilities, host gif on langbot.app	2026-06-21 07:33:00 -04:00
RockChinQ	6ddd24ae68	docs(readme): restore Star & Watch CTA with star.gif across all locales	2026-06-21 07:25:46 -04:00