Files
LangBot/ARCHITECTURE.md
T
2026-06-25 04:17:19 -04:00

15 KiB

Architecture

This document is a map of LangBot's moving parts. It is intentionally more stable than a feature guide and more concrete than the README: when you need to change behavior, start here, then follow the file references into the code.

For agent-specific working rules, see AGENTS.md. For plugin-runtime and Box-runtime implementation details, also read the sibling SDK repo: langbot-plugin-sdk.

What LangBot Is

LangBot is an open-source platform for building production IM bots backed by LLMs, agents, RAG, plugins, MCP tools, and a web management panel.

At runtime, one LangBot process owns:

  • a Quart/Hypercorn HTTP service and the built web UI on :5300;
  • messaging-platform adapters such as Discord, Telegram, Slack, WeChat, QQ, WeCom, Lark, DingTalk, KOOK, LINE, Satori, Matrix, and HTTP/WebSocket bots;
  • a pipeline engine that turns inbound platform messages into LLM/tool/plugin work and replies;
  • persistence, storage, vector database, telemetry, monitoring, and configuration managers;
  • bridges to the Plugin Runtime and Box Runtime provided by langbot-plugin-sdk;
  • an MCP server at /mcp exposing a curated agent-facing subset of the service layer.

Repository Boundary

LangBot is not a single-repo system.

  • LangBot/ is the main product: backend, web UI, platform adapters, pipeline engine, HTTP API, MCP server, RAG, persistence, skills integration, and the bridge code that talks to runtimes.
  • langbot-plugin-sdk/ is published as langbot-plugin and pinned in LangBot/pyproject.toml. It contains plugin developer APIs, shared entities, lbp, the Plugin Runtime (lbp rt), and the Box Runtime (lbp box).
  • Plugins import SDK APIs from langbot_plugin.*; the LangBot main process imports the same package for shared entities and runtime protocols.

This split matters. If a change modifies SDK entities, component APIs, action protocols, lbp rt, or lbp box, verify the sibling SDK repo and install the local SDK into LangBot's virtualenv when testing cross-repo behavior.

Startup Path

The process entrypoint is small and layered:

  1. main.py delegates to langbot.__main__.main().
  2. src/langbot/__main__.py parses --standalone-runtime, --standalone-box, and --debug, checks dependencies, generates missing config/data files, and calls pkg.core.boot.main().
  3. pkg/core/boot.py executes startup stages in order: LoadConfigStage, GenKeysStage, SetupLoggerStage, BuildAppStage, ShowNotesStage.
  4. BuildAppStage constructs the Application object by wiring managers, services, runtime connectors, and controllers.
  5. Application.run() starts the platform manager, query controller, HTTP controller, telemetry/cleanup loops, and plugin initialization.

The central runtime object is pkg/core/app.py::Application. It is a service locator for long-lived managers. That is not elegant, but it is the current architectural center; most subsystems receive ap: Application and collaborate through it.

Top-Level Layout

LangBot/
├── main.py                         # Entrypoint shim
├── pyproject.toml                  # Python package, deps, pinned langbot-plugin
├── src/langbot/
│   ├── __main__.py                 # CLI entrypoint and boot handoff
│   ├── pkg/
│   │   ├── core/                   # Application, boot stages, task manager
│   │   ├── api/                    # HTTP API + MCP server mount
│   │   ├── platform/               # IM adapters and runtime bot manager
│   │   ├── pipeline/               # Message routing and pipeline stages
│   │   ├── provider/               # LLM runners, model manager, tools
│   │   ├── plugin/                 # LangBot-side Plugin Runtime connector/handler
│   │   ├── box/                    # LangBot-side Box service/connector
│   │   ├── skill/                  # Skill metadata/activation integration
│   │   ├── rag/ , vector/          # Knowledge-base and vector DB integration
│   │   ├── persistence/            # SQLAlchemy/SQLModel, Alembic, legacy migrations
│   │   ├── storage/                # Local/S3 file storage abstraction
│   │   └── config/, entity/, utils/, telemetry/, survey/
│   ├── libs/                       # Vendored third-party platform SDKs
│   └── templates/                  # Default config and component metadata
├── web/                            # Vite + React Router + shadcn/ui + Tailwind SPA
├── docker/                         # Deployment manifests
├── skills/                         # In-repo agent skills, single source of truth
└── tests/                          # Unit/integration/e2e/manual tests

The Runtime Graph

The most useful mental model is this graph:

Platform adapter
  → RuntimeBot
  → MessageAggregator
  → QueryPool
  → Controller
  → RuntimePipeline
  → PipelineStage chain
  → RequestRunner / ToolManager / PluginRuntimeConnector / BoxService
  → response via adapter

The HTTP and MCP surfaces are parallel entrypoints into the same service layer:

HTTP client / Web UI
  → Quart route group
  → api/http/service/*
  → Application managers / persistence / runtime connectors

MCP client
  → /mcp mount
  → api/mcp/server.py tools
  → the same service layer directly

Message Flow

Inbound platform messages enter through adapter-specific SDK callbacks. The common path is:

  1. A platform adapter under pkg/platform/sources/ converts platform-specific events into SDK message/event entities.
  2. RuntimeBot in pkg/platform/botmgr.py applies pipeline routing rules and either discards the message, pushes it to webhooks, or sends it to the message aggregator.
  3. MessageAggregator batches/normalizes messages before adding a Query to QueryPool.
  4. Controller in pkg/pipeline/controller.py selects queries subject to global pipeline concurrency and per-session concurrency.
  5. RuntimePipeline in pkg/pipeline/pipelinemgr.py runs configured pipeline stages using a responsibility-chain style executor that supports generator stages.
  6. The chat stage emits plugin events, calls a configured RequestRunner, handles streaming/non-streaming responses, records telemetry, and appends conversation history.
  7. Output stages send text, cards, chunks, files, or error notices back through the original platform adapter.

Pipeline components are registered by decorators and package import side effects. When adding a new stage, loader, runner, or adapter, check the corresponding preregistration mechanism instead of inventing a second registry.

Platform Layer

Platform code lives under pkg/platform/.

  • botmgr.py owns runtime bots, routing rules, event logging, webhook pushing, and adapter lifecycle.
  • sources/ contains adapter implementations. Each adapter subclasses langbot_plugin.api.definition.abstract.platform.adapter.AbstractMessagePlatformAdapter from the SDK.
  • Platform entities such as MessageChain, Image, At, Voice, and events come from langbot-plugin-sdk, not from this repo.

The platform layer should translate between external platform APIs and LangBot's shared message/event model. It should not contain LLM-provider logic or pipeline business logic.

Pipeline Layer

Pipeline code lives under pkg/pipeline/.

Important pieces:

  • pool.py::QueryPool stores pending queries and cached in-flight queries for plugin backward-compatible calls.
  • controller.py::Controller schedules query processing and enforces concurrency.
  • pipelinemgr.py::RuntimePipeline materializes database pipeline config into a runtime stage chain.
  • process/handlers/chat.py::ChatMessageHandler is the main LLM conversation handler.
  • Stage families include response rules, banned sessions, content filters, preprocessors, rate limits, message truncation, long text handling, response-back, command handling, and wrappers.

Pipelines are configuration-driven. Prefer adding a stage or extending an existing stage family over hard-coding behavior in platform adapters.

Provider, RAG, and Tools

Provider code lives under pkg/provider/.

  • modelmgr/ manages configured model providers and requesters.
  • runners/ implements request runners such as the local agent runner and external workflow integrations.
  • tools/toolmgr.py aggregates tools from native tools, plugin tools, external MCP servers, and skill-authoring tools.
  • tools/loaders/mcp.py is the MCP client side: external MCP servers that LangBot connects to for agent tools.
  • RAG lives across pkg/rag/, pkg/vector/, model services, and plugin KnowledgeEngine actions.

Do not confuse LangBot's MCP client side with LangBot's own MCP server at /mcp; they are different surfaces.

Plugin System

The plugin system crosses the repo boundary.

In this repo:

  • pkg/plugin/connector.py connects LangBot to the Plugin Runtime over stdio or WebSocket.
  • pkg/plugin/handler.py exposes LangBot actions to the runtime and calls runtime actions for plugin operations.
  • pkg/provider/tools/loaders/plugin.py exposes plugin Tool components to LLM runners.
  • Pipeline handlers emit SDK events such as normal-message events and prompt-processing events.

In langbot-plugin-sdk:

  • src/langbot_plugin/api/ defines BasePlugin, component base classes, message/event entities, contexts, proxies, and manifests.
  • src/langbot_plugin/runtime/ implements lbp rt, plugin discovery, dependency installation, process launching, and control/debug connections.
  • src/langbot_plugin/entities/io/ defines the action protocol shared by LangBot, runtime, and plugin processes.

The Plugin Runtime supports stdio and WebSocket control transports. Direct local LangBot runs usually spawn the runtime over stdio. Containerized/standalone deployments connect over WebSocket using plugin.runtime_ws_url and --standalone-runtime.

Box Runtime and Skills

Box is the sandbox subsystem used by native agent tools, stdio MCP servers, skill authoring, and managed processes.

In this repo:

  • pkg/box/service.py is the application-facing facade for exec, sessions, managed processes, skill CRUD, status, reconnects, quotas, mounts, and sandbox profiles.
  • pkg/box/connector.py connects to the Box Runtime over stdio, Windows subprocess+WebSocket, or remote WebSocket.
  • pkg/provider/tools/loaders/native.py, mcp_stdio.py, and skill loaders depend on Box availability.
  • pkg/skill/manager.py loads skills from the Box runtime, falling back to local data/skills when needed.

In langbot-plugin-sdk:

  • src/langbot_plugin/box/server.py implements lbp box and the WebSocket endpoints on :5410.
  • src/langbot_plugin/box/runtime.py owns sandbox sessions and managed processes.
  • backend.py, nsjail_backend.py, and e2b_backend.py implement sandbox backends.
  • skill_store.py manages skill packages from the Box side.

Important config keys live under box: in src/langbot/templates/config.yaml: box.enabled, box.backend, box.runtime.endpoint, and box.local.*. Start LangBot with --standalone-box when connecting to an externally launched Box runtime.

HTTP API, Web UI, and MCP Server

pkg/api/http/controller/main.py builds a Quart app, registers route groups, serves the built SPA, and wraps the ASGI app with the MCP dispatcher.

  • HTTP route groups live under pkg/api/http/controller/groups/.
  • Service-layer logic lives under pkg/api/http/service/.
  • The built web UI is served from the frontend build path with SPA fallback.
  • The MCP server lives under pkg/api/mcp/ and is mounted at /mcp.

The MCP server intentionally exposes a curated subset of the API. Tools call service classes directly rather than making HTTP requests back into LangBot.

Maintenance rule: when adding, removing, or changing an HTTP endpoint that should be agent-accessible, update the matching MCP tool and the relevant in-repo skill under skills/ in the same pass.

Persistence and Configuration

Persistence is centered on pkg/persistence/mgr.py.

  • SQLite is the default database; PostgreSQL is supported.
  • Models live under pkg/entity/persistence/.
  • Fresh schemas are created from metadata, then legacy migrations run up to the frozen 3.x baseline, then Alembic migrations run to head.
  • New schema changes should use Alembic under pkg/persistence/alembic/versions/; do not extend the frozen legacy migration chain.

Configuration starts from src/langbot/templates/config.yaml and is generated into data/config.yaml on first run. Most long-lived managers read from ap.instance_config.data.

Frontend

The frontend lives in web/ and is a Vite SPA using React Router 7, shadcn/ui, Tailwind CSS, and pnpm. It is not Next.js, despite some historical filenames.

In development, pnpm dev serves the UI on :3000 and reads VITE_API_BASE_URL to call the backend on :5300. In production, the built frontend is packaged into the Python distribution and served by the backend.

Keep frontend API behavior aligned with pkg/api/http/service/ and route groups. User-facing strings must go through the existing i18n setup.

Agent-Facing Surfaces

LangBot is deliberately agent-friendly. The agent-facing surfaces are part of the architecture, not extra docs.

  • skills/ is the single source of truth for in-repo skills.
  • pkg/api/mcp/server.py exposes the LangBot MCP server at /mcp.
  • api.global_api_key authenticates API/MCP access without a browser login.
  • AGENTS.md and ARCHITECTURE.md tell coding agents how the repo works.

When one of these changes, update the others if the behavior or contract changed. API, MCP tools, and skills are one system; drift is a bug.

Where to Change Things

  • New HTTP API: add/adjust a service in pkg/api/http/service/, a route group in pkg/api/http/controller/groups/, tests, and MCP/skills if agent-accessible.
  • New platform adapter: add a pkg/platform/sources/* adapter, component metadata/templates as needed, i18n, docs, and tests/smoke coverage.
  • New pipeline behavior: add or extend a pipeline stage family under pkg/pipeline/; avoid putting pipeline rules in adapters.
  • New LLM provider/requester: work under pkg/provider/modelmgr/ and related service/UI surfaces.
  • New LLM tool source: extend pkg/provider/tools/loaders/ and ToolManager intentionally.
  • New plugin component/API/protocol: change langbot-plugin-sdk first or in lockstep, then update LangBot bridge code.
  • New Box capability: change both pkg/box/ and langbot-plugin-sdk/src/langbot_plugin/box/, plus config and tests.
  • New database schema: add an Alembic migration, not a legacy dbmXXX migration.

Design Biases

  • Keep platform translation, pipeline orchestration, provider execution, and runtime protocols separate.
  • Reuse existing registries and service layers instead of adding parallel paths.
  • Prefer small, explicit agent surfaces over exposing every internal API.
  • Treat cross-repo contracts with the SDK as public interfaces.
  • Test behavior at the narrowest useful layer first, then add integration/e2e coverage for runtime or platform changes.