Compare commits

..

28 Commits

Author SHA1 Message Date
RockChinQ
882c9ae8f5 fix(box): trust Box-reported skill paths when filesystem is not shared
In separated deployments (Docker Compose, k8s sidecar, --standalone-box,
remote runtime.endpoint) the Box runtime owns its own filesystem, so the
skill package_root it reports via list_skills is not resolvable on the
LangBot side. LangBot's reload_skills and build_skill_extra_mounts
validated those paths with os.path.isdir() against its own filesystem,
which silently dropped every skill in such deployments — breaking the
sandbox skill feature for the nsjail/SaaS backend.

Add BoxService.shares_filesystem_with_box, derived from the connector
transport (stdio = shared, WebSocket = separated), with an explicit
override seam for tests/embedders. Gate both isdir() guards on it: keep
local validation in shared-fs stdio mode, trust Box-reported paths
otherwise. The Box runtime only reports skills found on its own
filesystem, so those paths are valid there by construction.

Adds topology-derivation tests (real connector, no mocks) and
skill-retention tests for both shared and separated filesystems.
2026-06-07 12:46:52 -04:00
RockChinQ
47fe9bde03 docs(docker): move k8s deployment docs to wiki, drop README_K8S.md
The Kubernetes deployment guide now lives only in the wiki
(docs.langbot.app -> Installation -> Kubernetes). Remove the in-repo
docker/README_K8S.md, repoint the README language variants and the
docker-compose / kubernetes.yaml header comments to the wiki, and keep
kubernetes.yaml self-describing via inline comments.
2026-06-07 11:36:39 -04:00
RockChinQ
5c3a619e2d docs(docker): add Box sandbox runtime to k8s manifest and deploy guide
The k8s manifest was missing the Box runtime that backs the sandbox
tools, the activate skill tool, skill add/edit and stdio MCP. Add a
langbot-box Deployment/Service (port 5410), wire langbot to it via
BOX__RUNTIME__ENDPOINT (explicit Service name since the in-container
default langbot_box uses an underscore, invalid for k8s DNS), and share
the Box workspace root as a node hostPath pinned via podAffinity so the
node Docker daemon resolves bind-mount paths consistently. Document the
component, the shared-FS constraint, security implications and readiness
checks in README_K8S.md (zh + en).
2026-06-07 11:18:27 -04:00
RockChinQ
e223edeb45 docs(agents): add --standalone-box flag and box config keys 2026-06-07 08:57:43 -04:00
RockChinQ
d2c3146334 docs(agents): refresh AGENTS.md for current architecture and runtime/box debugging 2026-06-07 08:43:30 -04:00
Haoxuan Xing
7d9c8e3065 Merge pull request #2231 from langbot-app/TyperBody-patch-1
Update key capabilities in README.md
2026-06-07 13:08:19 +08:00
Haoxuan Xing
f12ed81e1e Update key capabilities in README.md
Added links to Deerflow and Weknora in the capabilities section.
2026-06-07 13:05:46 +08:00
Haoxuan Xing
6d4d19b6d7 Merge pull request #2230 from langbot-app/feat/addweknoradeerflow
Add DeerFlow LangGraph API as a Provider Runner
2026-06-07 12:22:55 +08:00
Typer_Body
07b90f12a2 ruff3 2026-06-07 02:38:05 +08:00
Typer_Body
fd896c6974 ruff2 2026-06-07 02:35:10 +08:00
Typer_Body
1fbfa868fb ruff 2026-06-07 02:31:42 +08:00
Typer_Body
ad05819c2e readme 2026-06-07 02:26:25 +08:00
Typer_Body
0c6f71738c deerflow 2026-06-07 02:17:40 +08:00
Typer_Body
af451e7006 weknora2 2026-06-07 01:14:02 +08:00
Typer_Body
59f20bcc73 weknora 2026-06-07 01:08:39 +08:00
RockChinQ
7eca3cdfca feat(web): show sub-entity name in document title on detail pages
Detail pages (plugin / MCP / pipeline / knowledge base / skill) only showed
the type in the tab title. Drive the /home document title from HomeLayout,
which has the selected entity name via context: '<entity> · <type> · LangBot'
when a sub-entity is open, '<type> · LangBot' otherwise. The top-level hook
now skips /home and only handles login/register/reset-password/wizard.
Type label falls back to a route-derived i18n key on direct page loads.
2026-06-06 12:12:08 -04:00
RockChinQ
c40354f838 feat(web): dynamic document title per route
The browser tab title was hard-coded to 'LangBot' in index.html and never
changed. Add a useDocumentTitle hook that maps the active route to an
existing i18n key and sets document.title to '<page> · LangBot', driven by
a new top-level RootLayout route element. Re-runs on navigation and on
language change so the title stays localized. Falls back to the bare app
name for unmapped routes.
2026-06-06 12:07:41 -04:00
RockChinQ
21a5b4658a fix(plugin-market): keep fixed card width regardless of result count
The result grid used auto-fit tracks, so a single search result stretched
to fill the whole row. Switch to fixed responsive column counts (1/2/3/4
across breakpoints), matching langbot-space, so cards keep a consistent
max width no matter how many results are shown.
2026-06-06 11:40:02 -04:00
RockChinQ
073acaa053 feat(plugin-market): move extension count into search box placeholder
Mirror the langbot-space marketplace change: drop the '共 xxx 个扩展'
stats line below the tag filter, surface the count in the search
placeholder ('搜索 xxx 个扩展、能力或场景...') when no query is active,
and show the total at the bottom via allLoadedCount when searching.
Adds searchPlaceholderCount + allLoadedCount to all 8 locales.
2026-06-06 11:33:46 -04:00
RockChinQ
38759b229d feat(plugin-market): show per-format extension counts in type filter
Mirror the LangBot Space marketplace: the advanced-filter type options
(plugin / MCP / skill) now display their live extension count, e.g.
"插件 (74)". Counts are fetched on mount via three lightweight
searchMarketplaceExtensions calls (page_size=1) reading total per type.
The all-formats option intentionally shows no count.
2026-06-06 08:11:59 -04:00
RockChinQ
efe32e34ae fix(deps): patch Dependabot vulnerability alerts (Python + web)
Python (pyproject.toml + uv.lock):
- aiohttp 3.13.5->3.14.0, langchain-core 1.3.2->1.4.1, langsmith 0.7.36->0.8.9,
  lxml 6.0.2->6.1.1, Mako 1.3.11->1.3.12, PyJWT 2.11.0->2.13.0,
  python-multipart 0.0.26->0.0.32, urllib3 2.6.3->2.7.0, Pygments 2.19.2->2.20.0,
  idna 3.11->3.18, pip 26.0->26.1.2, python-dotenv 1.2.1->1.2.2,
  requests 2.32.5->2.34.2, starlette 0.52.1->1.2.1, uv 0.11.7->0.11.19

web (package.json + both lockfiles):
- axios ->1.17.0, postcss ->8.5.15, react-router(-dom) ->7.17.0 (direct)
- overrides for transitive: flatted >=3.4.2, follow-redirects >=1.16.0,
  minimatch (3.1.3 / 9.0.7), picomatch (2.3.2 / 4.0.4)
- regenerated both package-lock.json and pnpm-lock.yaml in sync

Verified: uv sync + core imports OK; pnpm --frozen-lockfile + tsc + vite build pass.

Not fixable (no upstream patch yet, tracked separately):
- chromadb (critical, <=1.5.9 is latest) — awaiting upstream release
- PyPDF2 (medium, deprecated; needs migration to pypdf, code change)
2026-06-06 06:06:59 -04:00
Junyan Chin
46db4de11a Update QQ Group link in README_CN.md 2026-06-06 17:20:19 +08:00
RockChinQ
170a6756f4 fix(add-extension): load real icon in install confirm dialog from URL params
When the install confirm dialog is opened via URL query params (e.g. from a
marketplace deep link), installInfo carried no icon, so the icon fell back to
the /resources/icon endpoint which 404s for extensions whose icon is an
external URL (simpleicons / iconify), showing a Package placeholder.

Fetch the icon from the marketplace detail API (mcp/skill/plugin) after opening
the dialog and inject it into installInfo, and reset the icon-failed state when
the resolved URL changes so the <img> retries instead of sticking on the
placeholder.
2026-06-06 04:45:46 -04:00
RockChinQ
7330732f62 fix(ci): bump migration head assertion to 0004, apply prettier
- Update test_migrations / test_migrations_postgres head assertion from
  0003 to 0004 after adding the mcp readme migration.
- Reformat MCPForm.tsx / MCPReadme.tsx to satisfy prettier/prettier.
2026-06-06 03:56:14 -04:00
RockChinQ
b08e5ca09a feat(mcp): add Docs/Tools tablist on detail page, tidy sidebar label
Wrap the MCP detail right panel in a compact left-aligned Docs/Tools
tablist (Docs first). Move the tool count into the Tools tab label and
drop the redundant panel title/subtitle; connecting/failed states still
render the status component. Shorten the sidebar 'Installed Extensions'
entry to 'Installed' across all 8 locales, and add tabTools/tabDocs/
noReadme strings.
2026-06-06 03:52:17 -04:00
RockChinQ
dff80a0c0a fix(marketplace): use external icon URL when icon field is absolute
Many MCP / skill records store their icon as an absolute external URL
(simpleicons.org / iconify.design) rather than an uploaded file, so the
/resources/icon endpoint 404s and the card icon breaks. Add
resolveMarketplaceIconURL() which prefers an absolute http(s) icon field
and otherwise falls back to the resources endpoint.
2026-06-06 03:52:09 -04:00
RockChinQ
f54ae4b91c feat(mcp): persist and display marketplace README
Capture the README markdown from LangBot Space when installing an MCP
server and store it on the mcp_servers record (new readme column +
alembic migration 0004). The detail page can then render docs offline,
independent of the server's runtime/connection state.
2026-06-06 03:52:00 -04:00
RockChinQ
e5b3cced1f feat(market): show 24 plugins per page 2026-06-05 11:33:02 -04:00
165 changed files with 9684 additions and 4963 deletions

145
AGENTS.md
View File

@@ -1,81 +1,134 @@
# AGENTS.md
This file is for guiding code agents (like Claude Code, GitHub Copilot, OpenAI Codex, etc.) to work in LangBot project.
This file guides code agents (Claude Code, GitHub Copilot, OpenAI Codex, etc.) working in the LangBot project. `CLAUDE.md` is a symlink to this file.
## Project Overview
LangBot is a open-source LLM native instant messaging bot development platform, aiming to provide an out-of-the-box IM robot development experience, with Agent, RAG, MCP and other LLM application functions, supporting global instant messaging platforms, and providing rich API interfaces, supporting custom development.
LangBot is an open-source, LLM-native instant-messaging bot development platform. It aims to provide an out-of-the-box IM bot development experience with Agent, RAG, MCP and other LLM application capabilities, supporting mainstream global IM platforms and exposing rich APIs for custom development.
LangBot has a comprehensive frontend, all operations can be performed through the frontend. The project splited into these major parts:
LangBot has a comprehensive web frontend almost every operation can be performed through it.
- `./src/langbot`: The main python package of the project, below are the main modules in this package:
- `./pkg`: The core python package of the project backend.
- `./pkg/platform`: The platform module of the project, containing the logic of message platform adapters, bot managers, message session managers, etc.
- `./pkg/provider`: The provider module of the project, containing the logic of LLM providers, tool providers, etc.
- `./pkg/pipeline`: The pipeline module of the project, containing the logic of pipelines, stages, query pool, etc.
- `./pkg/api`: The api module of the project, containing the http api controllers and services.
- `./pkg/plugin`: LangBot bridge for connecting with plugin system.
- `./libs`: Some SDKs we previously developed for the project, such as `qq_official_api`, `wecom_api`, etc.
- `./templates`: Templates of config files, components, etc.
- `./web`: Frontend codebase, built with Next.js + **shadcn** + **Tailwind CSS**.
- `./docker`: docker-compose deployment files.
- **Python**: `>=3.11,<4.0`, dependencies managed by `uv`. Package version is in `pyproject.toml`.
- **Frontend**: `web/` is a **Vite + React Router 7 + shadcn/ui + Tailwind CSS** SPA, managed by `pnpm`. (Note: this is NOT Next.js — the `dev` script is `vite`.)
- **Backend framework**: Quart (the async flavour of Flask). The HTTP API and the pre-built web UI are both served by the backend on `http://127.0.0.1:5300`.
## Backend Development
## Repository Layout
We use `uv` to manage dependencies.
```
LangBot/
├── main.py # Entrypoint shim -> langbot.__main__.main()
├── pyproject.toml # Python project + deps (uv), pins langbot-plugin==<x.y.z>
├── src/langbot/
│ ├── __main__.py # Real entrypoint, CLI args (--standalone-runtime, --standalone-box, --debug)
│ ├── pkg/ # Core backend package
│ │ ├── api/ # HTTP API controllers + services (Quart)
│ │ ├── core/ # App bootstrap, stages, task manager
│ │ ├── platform/ # IM platform adapters, bot managers, session managers
│ │ ├── provider/ # LLM providers, requesters, tool providers
│ │ ├── pipeline/ # Pipelines, stages, query pool
│ │ ├── plugin/ # Bridge connecting LangBot to the plugin runtime (see below)
│ │ ├── box/ # Code-sandbox subsystem (Docker / nsjail / E2B backends)
│ │ ├── skill/ # Skill subsystem
│ │ ├── rag/ , vector/ # RAG + vector store
│ │ ├── command/ # Built-in commands
│ │ ├── persistence/ # ORM models + Alembic migrations (SQLite & PostgreSQL)
│ │ ├── storage/ # Object/file storage abstractions
│ │ ├── config/, entity/, discover/, utils/, telemetry/, survey/
│ ├── libs/ # Vendored SDKs (qq_official_api, wecom_api, etc.)
│ └── templates/ # Config/component templates (e.g. templates/config.yaml)
├── web/ # Frontend SPA (Vite + React Router 7 + shadcn + Tailwind)
└── docker/ # docker-compose deployment files
```
## Development Environment Setup
Full guide lives in the wiki: **["开发配置" / Dev Config](https://docs.langbot.app/zh/develop/dev-config)**. Summary:
### Backend
```bash
pip install uv
uv sync --dev
uv sync --dev # uv creates a .venv/ for you; point your editor's interpreter at it
uv run main.py # serves API + web UI on http://127.0.0.1:5300
```
Start the backend and run the project in development mode.
On first run the config file is generated at `data/config.yaml`. DB is SQLite by default (zero setup); PostgreSQL is supported. Migrations run automatically on startup.
```bash
uv run main.py
```
### Frontend
Then you can access the project at `http://127.0.0.1:5300`.
## Frontend Development
We use `pnpm` to manage dependencies.
Requires Node.js + [pnpm](https://pnpm.io/installation).
```bash
cd web
cp .env.example .env
cp .env.example .env # Windows: copy .env.example .env
pnpm install
pnpm dev
pnpm dev # http://127.0.0.1:3000 (npm install / npm run dev also work)
```
Then you can access the project at `http://127.0.0.1:3000`.
`pnpm dev` reads `VITE_API_BASE_URL` from `web/.env` so the dev frontend can reach the backend on port `5300`. In production the frontend is pre-built into static files served by the backend on the same origin.
## Plugin System Architecture
### Code formatting
LangBot is composed of various internal components such as Large Language Model tools, commands, messaging platform adapters, LLM requesters, and more. To meet extensibility and flexibility requirements, we have implemented a production-grade plugin system.
The repo runs lint + format checks in CI. Install the pre-commit hooks so the same checks run locally before each commit:
Each plugin runs in an independent process, managed uniformly by the Plugin Runtime. It has two operating modes: `stdio` and `websocket`. When LangBot is started directly by users (not running in a container), it uses `stdio` mode, which is common for personal users or lightweight environments. When LangBot runs in a container, it uses `websocket` mode, designed specifically for production environments.
```bash
uv run pre-commit install
```
Plugin Runtime automatically starts each installed plugin and interacts through stdio. In plugin development scenarios, developers can use the lbp command-line tool to start plugins and connect to the running Runtime via WebSocket for debugging.
## Plugin System
> Plugin SDK, CLI, Runtime, and entities definitions shared between LangBot and plugins are contained in the [`langbot-plugin-sdk`](https://github.com/langbot-app/langbot-plugin-sdk) repository.
LangBot's plugin system (Plugin SDK, CLI `lbp`, Plugin Runtime, and the shared entity/API definitions) lives in a **separate repository**: [`langbot-plugin-sdk`](https://github.com/langbot-app/langbot-plugin-sdk). LangBot depends on it via the pinned `langbot-plugin` package in `pyproject.toml`.
## Some Development Tips and Standards
### Architecture (what to know inside this repo)
- LangBot is a global project, any comments in code should be in English, and user experience should be considered in all aspects.
- Thus you should consider the i18n support in all aspects.
- LangBot is widely adopted in both toC and toB scenarios, so you should consider the compatibility and security in all aspects.
- If you were asked to make a commit, please follow the commit message format:
- format: <type>(<scope>): <subject>
- type: must be a specific type, such as feat (new feature), fix (bug fix), docs (documentation), style (code style), refactor (refactoring), perf (performance optimization), etc.
- scope: the scope of the commit, such as the package name, the file name, the function name, the class name, the module name, etc.
- subject: the subject of the commit, such as the description of the commit, the reason for the commit, the impact of the commit, etc.
- LangBot uses [Alembic](https://alembic.sqlalchemy.org/) to manage database migrations, supporting both SQLite and PostgreSQL. Migration files are located in `src/langbot/pkg/persistence/alembic/versions/`. If you changed the definition of database entities (ORM models), generate a new migration script by running `uv run python -m langbot.pkg.persistence.alembic_runner autogenerate "description of your change"` in the project root (requires `data/config.yaml` to exist). Review and edit the generated script before committing. Migrations are executed automatically on LangBot startup. For data migrations (e.g. modifying JSON field content), you need to manually add the migration code in the generated script.
- Plugins run as independent processes managed by the **Plugin Runtime**. The Runtime supports two control transports: `stdio` and `websocket`.
- When LangBot is started directly by a user (not in a container), it spawns and connects to the Runtime over **stdio** (lightweight/personal use).
- When LangBot runs in a container, it connects to a standalone Runtime over **WebSocket** (production).
- The bridge code lives in `src/langbot/pkg/plugin/` (`connector.py`, `handler.py`).
- Relevant config (`data/config.yaml`): `plugin.runtime_ws_url` (e.g. `ws://langbot_plugin_runtime:5400/control/ws`). Start LangBot with `--standalone-runtime` to make it connect to an externally-launched Runtime over WebSocket instead of spawning one over stdio.
### Debugging the Plugin Runtime / CLI / SDK
This is documented in detail in the **SDK repo's `AGENTS.md`** and in the wiki page **["调试插件运行时、CLI、SDK" / Plugin Runtime](https://docs.langbot.app/zh/develop/plugin-runtime)**. The short version:
- Clone `LangBot` and `langbot-plugin-sdk` as siblings under one parent dir so the editor resolves shared entities.
- Start a standalone Runtime from the SDK repo: `uv run --no-sync lbp rt` (control port `5400`, debug port `5401`).
- To make LangBot use a locally-modified SDK: from the SDK dir, with LangBot's `.venv` active, run `uv pip install .`, then launch LangBot with `uv run --no-sync main.py --standalone-runtime` (keep `--no-sync` so your local SDK isn't overwritten).
### Debugging the Box (sandbox) runtime
The Box subsystem (`src/langbot/pkg/box/`) is the code sandbox. It picks the first available backend among **Docker / nsjail / E2B**. The standalone Box runtime is launched via the SDK CLI: `lbp box`. Backend selection details, the `lbp box` flags, and the SDK-side architecture are documented in the SDK repo's `AGENTS.md`.
Relevant config (`data/config.yaml`, `box:` section): `box.enabled` (master switch — disabling it also disables the native sandbox tools, skill add/edit, and stdio-mode MCP servers), `box.backend` (`'local'` = Docker/nsjail auto-pick, or `'docker'` / `'nsjail'` / `'e2b'`; also settable via `BOX__BACKEND`), and `box.runtime.endpoint` (external Box runtime base URL, e.g. `ws://127.0.0.1:5410`; empty = local auto-managed runtime). Like the plugin runtime, LangBot can connect to an externally-launched Box runtime by setting that endpoint and starting with `--standalone-box`.
> A common false "No supported sandbox backend (Docker / nsjail / E2B) is available" comes from Docker being installed and running but the current user not being in the `docker` group → `docker info` gets `permission denied` on the socket. Fix: `sudo usermod -aG docker <user>` and restart the backend in a shell that has the new group.
## Development Standards
- LangBot is a global project: **all code comments and docstrings must be in English**, and every user-facing string must support **i18n** (`en_US` + `zh_Hans` at minimum, plus `ja_JP` where the repo already has it).
- LangBot is adopted in both toC and toB scenarios — always consider compatibility and security.
- **Commit message format**: `<type>(<scope>): <subject>`
- `type`: one of `feat`, `fix`, `docs`, `style`, `refactor`, `perf`, `test`, `chore`, etc.
- `scope`: the affected package/module/file/class.
- `subject`: concise description of the change.
### Database migrations (Alembic)
LangBot uses [Alembic](https://alembic.sqlalchemy.org/) for migrations, supporting both SQLite and PostgreSQL from a single set of scripts. Migration files live in `src/langbot/pkg/persistence/alembic/versions/`.
If you change ORM model definitions, generate a migration:
```bash
# Run from the project root (requires data/config.yaml to exist)
uv run python -m langbot.pkg.persistence.alembic_runner autogenerate "description of your change"
```
Review and edit the generated script before committing. Migrations execute automatically on startup. `autogenerate` detects schema changes (add/drop columns, tables, type changes) but **data migrations** (e.g. mutating JSON field contents) must be hand-written into the generated script. `env.py` sets `render_as_batch=True`, so SQLite's ALTER TABLE limits are handled automatically — no need to branch per database. More in the wiki ["开发配置"](https://docs.langbot.app/zh/develop/dev-config#数据库迁移).
## Some Principles
- Keep it simple, stupid.
- Entities should not be multiplied unnecessarily
- Entities should not be multiplied unnecessarily.
- 八荣八耻
以瞎猜接口为耻,以认真查询为荣。
@@ -85,4 +138,4 @@ Plugin Runtime automatically starts each installed plugin and interacts through
以跳过验证为耻,以主动测试为荣。
以破坏架构为耻,以遵循规范为荣。
以假装理解为耻,以诚实无知为荣。
以盲目修改为耻,以谨慎重构为荣。
以盲目修改为耻,以谨慎重构为荣。

View File

@@ -38,7 +38,7 @@ LangBot is an **open-source, production-grade platform** for building AI-powered
### Key Capabilities
- **AI Conversations & Agents** — Multi-turn dialogues, tool calling, multi-modal support, streaming output. Built-in RAG (knowledge base) with deep integration to [Dify](https://dify.ai), [Coze](https://coze.com), [n8n](https://n8n.io), [Langflow](https://langflow.org).
- **AI Conversations & Agents** — Multi-turn dialogues, tool calling, multi-modal support, streaming output. Built-in RAG (knowledge base) with deep integration to [Dify](https://dify.ai), [Coze](https://coze.com), [n8n](https://n8n.io), [Langflow](https://langflow.org), [Deerflow](https://deerflow.tech), [Weknora](https://weknora.weixin.qq.com).
- **Universal IM Platform Support** — One codebase for Discord, Telegram, Slack, LINE, QQ, WeChat, WeCom, Lark, DingTalk, KOOK.
- **Production-Ready** — Access control, rate limiting, sensitive word filtering, comprehensive monitoring, and exception handling. Trusted by enterprises.
- **Plugin Ecosystem** — Hundreds of plugins, event-driven architecture, component extensions, and [MCP protocol](https://modelcontextprotocol.io/) support.
@@ -78,7 +78,7 @@ docker compose up -d
[![Deploy on Zeabur](https://zeabur.com/button.svg)](https://zeabur.com/en-US/templates/ZKTBDH)
[![Deploy on Railway](https://railway.com/button.svg)](https://railway.app/template/yRrAyL?referralCode=vogKPF)
**More options:** [Docker](https://link.langbot.app/en/docs/docker) · [Manual](https://link.langbot.app/en/docs/manual-deploy) · [BTPanel](https://link.langbot.app/en/docs/bt-panel) · [Kubernetes](./docker/README_K8S.md)
**More options:** [Docker](https://link.langbot.app/en/docs/docker) · [Manual](https://link.langbot.app/en/docs/manual-deploy) · [BTPanel](https://link.langbot.app/en/docs/bt-panel) · [Kubernetes](https://docs.langbot.app/en/deploy/langbot/kubernetes)
---

View File

@@ -13,7 +13,7 @@
[English](README.md) / 简体中文 / [繁體中文](README_TW.md) / [日本語](README_JP.md) / [Español](README_ES.md) / [Français](README_FR.md) / [한국어](README_KO.md) / [Русский](README_RU.md) / [Tiếng Việt](README_VI.md)
[![Discord](https://img.shields.io/discord/1335141740050649118?logo=discord&labelColor=%20%235462eb&logoColor=%20%23f5f5f5&color=%20%235462eb)](https://discord.gg/wdNEHETs87)
[![QQ Group](https://img.shields.io/badge/%E7%A4%BE%E5%8C%BAQQ%E7%BE%A4-1030838208-blue)](https://qm.qq.com/q/DxZZcNxM1W)
[![QQ Group](https://img.shields.io/badge/%E7%A4%BE%E5%8C%BAQQ%E7%BE%A4-1030838208-blue)](https://qm.qq.com/q/IrlV8QFacU)
[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/langbot-app/LangBot)
[![GitHub release (latest by date)](https://img.shields.io/github/v/release/langbot-app/LangBot)](https://github.com/langbot-app/LangBot/releases/latest)
<img src="https://img.shields.io/badge/python-3.10 ~ 3.13 -blue.svg" alt="python">
@@ -38,7 +38,7 @@ LangBot 是一个**开源的生产级平台**,用于构建 AI 驱动的即时
### 核心能力
- **AI 对话与 Agent** — 多轮对话、工具调用、多模态、流式输出。自带 RAG知识库深度集成 [Dify](https://dify.ai)、[Coze](https://coze.com)、[n8n](https://n8n.io)、[Langflow](https://langflow.org) 等 LLMOps 平台。
- **AI 对话与 Agent** — 多轮对话、工具调用、多模态、流式输出。自带 RAG知识库深度集成 [Dify](https://dify.ai)、[Coze](https://coze.com)、[n8n](https://n8n.io)、[Langflow](https://langflow.org)、[Deerflow](https://deerflow.tech)、[Weknora](https://weknora.weixin.qq.com)等 LLMOps 平台。
- **全平台支持** — 一套代码,覆盖 QQ、微信、企业微信、飞书、钉钉、Discord、Telegram、Slack、LINE、KOOK 等平台。
- **生产就绪** — 访问控制、限速、敏感词过滤、全面监控与异常处理,已被多家企业采用。
- **插件生态** — 数百个插件,跨进程的事件驱动架构,组件扩展,适配 [MCP 协议](https://modelcontextprotocol.io/)。
@@ -78,7 +78,7 @@ docker compose up -d
[![Deploy on Zeabur](https://zeabur.com/button.svg)](https://zeabur.com/zh-CN/templates/ZKTBDH)
[![Deploy on Railway](https://railway.com/button.svg)](https://railway.app/template/yRrAyL?referralCode=vogKPF)
**更多方式:** [Docker](https://link.langbot.app/zh/docs/docker) · [手动部署](https://link.langbot.app/zh/docs/manual-deploy) · [宝塔面板](https://link.langbot.app/zh/docs/bt-panel) · [Kubernetes](./docker/README_K8S.md)
**更多方式:** [Docker](https://link.langbot.app/zh/docs/docker) · [手动部署](https://link.langbot.app/zh/docs/manual-deploy) · [宝塔面板](https://link.langbot.app/zh/docs/bt-panel) · [Kubernetes](https://docs.langbot.app/zh/deploy/langbot/kubernetes)
---

View File

@@ -37,7 +37,7 @@ LangBot es una **plataforma de código abierto y grado de producción** para con
### Capacidades Clave
- **Conversaciones e Agentes IA** — Diálogos de múltiples turnos, llamadas a herramientas, soporte multimodal, salida en streaming. RAG (base de conocimientos) incorporado con integración profunda con [Dify](https://dify.ai), [Coze](https://coze.com), [n8n](https://n8n.io), [Langflow](https://langflow.org).
- **Conversaciones e Agentes IA** — Diálogos de múltiples turnos, llamadas a herramientas, soporte multimodal, salida en streaming. RAG (base de conocimientos) incorporado con integración profunda con [Dify](https://dify.ai), [Coze](https://coze.com), [n8n](https://n8n.io), [Langflow](https://langflow.org), [Deerflow](https://deerflow.tech)、[Weknora](https://weknora.weixin.qq.com).
- **Soporte Universal de Plataformas de MI** — Un solo código base para Discord, Telegram, Slack, LINE, QQ, WeChat, WeCom, Lark, DingTalk, KOOK.
- **Listo para Producción** — Control de acceso, limitación de velocidad, filtrado de palabras sensibles, monitoreo completo y manejo de excepciones. De confianza para empresas.
- **Ecosistema de Plugins** — Cientos de plugins, arquitectura basada en eventos, extensiones de componentes y soporte del [protocolo MCP](https://modelcontextprotocol.io/).
@@ -77,7 +77,7 @@ docker compose up -d
[![Deploy on Zeabur](https://zeabur.com/button.svg)](https://zeabur.com/en-US/templates/ZKTBDH)
[![Deploy on Railway](https://railway.com/button.svg)](https://railway.app/template/yRrAyL?referralCode=vogKPF)
**Más opciones:** [Docker](https://link.langbot.app/en/docs/docker) · [Manual](https://link.langbot.app/en/docs/manual-deploy) · [BTPanel](https://link.langbot.app/en/docs/bt-panel) · [Kubernetes](./docker/README_K8S.md)
**Más opciones:** [Docker](https://link.langbot.app/en/docs/docker) · [Manual](https://link.langbot.app/en/docs/manual-deploy) · [BTPanel](https://link.langbot.app/en/docs/bt-panel) · [Kubernetes](https://docs.langbot.app/en/deploy/langbot/kubernetes)
---

View File

@@ -37,7 +37,7 @@ LangBot est une **plateforme open-source de niveau production** pour créer des
### Capacités Clés
- **Conversations IA & Agents** — Dialogues multi-tours, appels d'outils, support multimodal, sortie en streaming. RAG (base de connaissances) intégré avec intégration profonde de [Dify](https://dify.ai), [Coze](https://coze.com), [n8n](https://n8n.io), [Langflow](https://langflow.org).
- **Conversations IA & Agents** — Dialogues multi-tours, appels d'outils, support multimodal, sortie en streaming. RAG (base de connaissances) intégré avec intégration profonde de [Dify](https://dify.ai), [Coze](https://coze.com), [n8n](https://n8n.io), [Langflow](https://langflow.org), [Deerflow](https://deerflow.tech), [Weknora](https://weknora.weixin.qq.com).
- **Support Universel des Plateformes de MI** — Un seul code pour Discord, Telegram, Slack, LINE, QQ, WeChat, WeCom, Lark, DingTalk, KOOK.
- **Prêt pour la Production** — Contrôle d'accès, limitation de débit, filtrage de mots sensibles, surveillance complète et gestion des exceptions. Approuvé par les entreprises.
- **Écosystème de Plugins** — Des centaines de plugins, architecture événementielle, extensions de composants, et support du [protocole MCP](https://modelcontextprotocol.io/).
@@ -77,7 +77,7 @@ docker compose up -d
[![Deploy on Zeabur](https://zeabur.com/button.svg)](https://zeabur.com/en-US/templates/ZKTBDH)
[![Deploy on Railway](https://railway.com/button.svg)](https://railway.app/template/yRrAyL?referralCode=vogKPF)
**Plus d'options :** [Docker](https://link.langbot.app/en/docs/docker) · [Manuel](https://link.langbot.app/en/docs/manual-deploy) · [BTPanel](https://link.langbot.app/en/docs/bt-panel) · [Kubernetes](./docker/README_K8S.md)
**Plus d'options :** [Docker](https://link.langbot.app/en/docs/docker) · [Manuel](https://link.langbot.app/en/docs/manual-deploy) · [BTPanel](https://link.langbot.app/en/docs/bt-panel) · [Kubernetes](https://docs.langbot.app/en/deploy/langbot/kubernetes)
---

View File

@@ -37,7 +37,7 @@ LangBot は、AI搭載のインスタントメッセージングボットを構
### 主な機能
- **AI対話とエージェント** — マルチターン対話、ツール呼び出し、マルチモーダル対応、ストリーミング出力。RAGナレッジベースを内蔵し、[Dify](https://dify.ai)、[Coze](https://coze.com)、[n8n](https://n8n.io)、[Langflow](https://langflow.org) と深く統合。
- **AI対話とエージェント** — マルチターン対話、ツール呼び出し、マルチモーダル対応、ストリーミング出力。RAGナレッジベースを内蔵し、[Dify](https://dify.ai)、[Coze](https://coze.com)、[n8n](https://n8n.io)、[Langflow](https://langflow.org)、[Deerflow](https://deerflow.tech)、[Weknora](https://weknora.weixin.qq.com) と深く統合。
- **ユニバーサルIMプラットフォーム対応** — 単一のコードベースで Discord、Telegram、Slack、LINE、QQ、WeChat、WeCom、Lark、DingTalk、KOOK に対応。
- **本番環境対応** — アクセス制御、レート制限、センシティブワードフィルタリング、包括的な監視、例外処理を搭載。エンタープライズの信頼に応える品質。
- **プラグインエコシステム** — 数百のプラグイン、イベント駆動アーキテクチャ、コンポーネント拡張、[MCPプロトコル](https://modelcontextprotocol.io/)対応。
@@ -77,7 +77,7 @@ docker compose up -d
[![Deploy on Zeabur](https://zeabur.com/button.svg)](https://zeabur.com/en-US/templates/ZKTBDH)
[![Deploy on Railway](https://railway.com/button.svg)](https://railway.app/template/yRrAyL?referralCode=vogKPF)
**その他:** [Docker](https://link.langbot.app/en/docs/docker) · [手動デプロイ](https://link.langbot.app/en/docs/manual-deploy) · [BTPanel](https://link.langbot.app/en/docs/bt-panel) · [Kubernetes](./docker/README_K8S.md)
**その他:** [Docker](https://link.langbot.app/en/docs/docker) · [手動デプロイ](https://link.langbot.app/en/docs/manual-deploy) · [BTPanel](https://link.langbot.app/en/docs/bt-panel) · [Kubernetes](https://docs.langbot.app/en/deploy/langbot/kubernetes)
---

View File

@@ -37,7 +37,7 @@ LangBot은 AI 기반 인스턴트 메시징 봇을 구축하기 위한 **오픈
### 핵심 기능
- **AI 대화 및 에이전트** — 멀티턴 대화, 도구 호출, 멀티모달 지원, 스트리밍 출력. 내장 RAG(지식 베이스)와 [Dify](https://dify.ai), [Coze](https://coze.com), [n8n](https://n8n.io), [Langflow](https://langflow.org) 심층 통합.
- **AI 대화 및 에이전트** — 멀티턴 대화, 도구 호출, 멀티모달 지원, 스트리밍 출력. 내장 RAG(지식 베이스)와 [Dify](https://dify.ai), [Coze](https://coze.com), [n8n](https://n8n.io), [Langflow](https://langflow.org), [Deerflow](https://deerflow.tech), [Weknora](https://weknora.weixin.qq.com) 심층 통합.
- **유니버설 IM 플랫폼 지원** — 단일 코드베이스로 Discord, Telegram, Slack, LINE, QQ, WeChat, WeCom, Lark, DingTalk, KOOK 지원.
- **프로덕션 레디** — 접근 제어, 속도 제한, 민감어 필터링, 종합 모니터링 및 예외 처리. 기업 환경에서 검증됨.
- **플러그인 생태계** — 수백 개의 플러그인, 이벤트 기반 아키텍처, 컴포넌트 확장, [MCP 프로토콜](https://modelcontextprotocol.io/) 지원.
@@ -77,7 +77,7 @@ docker compose up -d
[![Deploy on Zeabur](https://zeabur.com/button.svg)](https://zeabur.com/en-US/templates/ZKTBDH)
[![Deploy on Railway](https://railway.com/button.svg)](https://railway.app/template/yRrAyL?referralCode=vogKPF)
**더 많은 옵션:** [Docker](https://link.langbot.app/en/docs/docker) · [수동 배포](https://link.langbot.app/en/docs/manual-deploy) · [BTPanel](https://link.langbot.app/en/docs/bt-panel) · [Kubernetes](./docker/README_K8S.md)
**더 많은 옵션:** [Docker](https://link.langbot.app/en/docs/docker) · [수동 배포](https://link.langbot.app/en/docs/manual-deploy) · [BTPanel](https://link.langbot.app/en/docs/bt-panel) · [Kubernetes](https://docs.langbot.app/en/deploy/langbot/kubernetes)
---

View File

@@ -37,7 +37,7 @@ LangBot — это **платформа с открытым исходным к
### Ключевые возможности
- **ИИ-диалоги и агенты** — Многораундовые диалоги, вызов инструментов, мультимодальная поддержка, потоковый вывод. Встроенная реализация RAG (база знаний) с глубокой интеграцией в [Dify](https://dify.ai), [Coze](https://coze.com), [n8n](https://n8n.io), [Langflow](https://langflow.org).
- **ИИ-диалоги и агенты** — Многораундовые диалоги, вызов инструментов, мультимодальная поддержка, потоковый вывод. Встроенная реализация RAG (база знаний) с глубокой интеграцией в [Dify](https://dify.ai), [Coze](https://coze.com), [n8n](https://n8n.io), [Langflow](https://langflow.org), [Deerflow](https://deerflow.tech), [Weknora](https://weknora.weixin.qq.com).
- **Универсальная поддержка IM-платформ** — Единая кодовая база для Discord, Telegram, Slack, LINE, QQ, WeChat, WeCom, Lark, DingTalk, KOOK.
- **Готовность к продакшену** — Контроль доступа, ограничение скорости, фильтрация чувствительных слов, комплексный мониторинг и обработка исключений. Проверено в корпоративной среде.
- **Экосистема плагинов** — Сотни плагинов, событийно-ориентированная архитектура, расширения компонентов и поддержка [протокола MCP](https://modelcontextprotocol.io/).
@@ -77,7 +77,7 @@ docker compose up -d
[![Deploy on Zeabur](https://zeabur.com/button.svg)](https://zeabur.com/en-US/templates/ZKTBDH)
[![Deploy on Railway](https://railway.com/button.svg)](https://railway.app/template/yRrAyL?referralCode=vogKPF)
**Другие варианты:** [Docker](https://link.langbot.app/en/docs/docker) · [Ручная установка](https://link.langbot.app/en/docs/manual-deploy) · [BTPanel](https://link.langbot.app/en/docs/bt-panel) · [Kubernetes](./docker/README_K8S.md)
**Другие варианты:** [Docker](https://link.langbot.app/en/docs/docker) · [Ручная установка](https://link.langbot.app/en/docs/manual-deploy) · [BTPanel](https://link.langbot.app/en/docs/bt-panel) · [Kubernetes](https://docs.langbot.app/en/deploy/langbot/kubernetes)
---

View File

@@ -39,7 +39,7 @@ LangBot 是一個**開源的生產級平台**,用於建構 AI 驅動的即時
### 核心能力
- **AI 對話與 Agent** — 多輪對話、工具調用、多模態、流式輸出。自帶 RAG知識庫深度整合 [Dify](https://dify.ai)、[Coze](https://coze.com)、[n8n](https://n8n.io)、[Langflow](https://langflow.org) 等 LLMOps 平台。
- **AI 對話與 Agent** — 多輪對話、工具調用、多模態、流式輸出。自帶 RAG知識庫深度整合 [Dify](https://dify.ai)、[Coze](https://coze.com)、[n8n](https://n8n.io)、[Langflow](https://langflow.org)、 [Deerflow](https://deerflow.tech)、[Weknora](https://weknora.weixin.qq.com)等 LLMOps 平台。
- **全平台支援** — 一套程式碼,覆蓋 QQ、微信、企業微信、飛書、釘釘、Discord、Telegram、Slack、LINE、KOOK 等平台。
- **生產就緒** — 存取控制、限速、敏感詞過濾、全面監控與異常處理,已被多家企業採用。
- **外掛生態** — 數百個外掛,事件驅動架構,組件擴展,適配 [MCP 協議](https://modelcontextprotocol.io/)。
@@ -79,7 +79,7 @@ docker compose up -d
[![Deploy on Zeabur](https://zeabur.com/button.svg)](https://zeabur.com/zh-CN/templates/ZKTBDH)
[![Deploy on Railway](https://railway.com/button.svg)](https://railway.app/template/yRrAyL?referralCode=vogKPF)
**更多方式:** [Docker](https://link.langbot.app/zh/docs/docker) · [手動部署](https://link.langbot.app/zh/docs/manual-deploy) · [寶塔面板](https://link.langbot.app/zh/docs/bt-panel) · [Kubernetes](./docker/README_K8S.md)
**更多方式:** [Docker](https://link.langbot.app/zh/docs/docker) · [手動部署](https://link.langbot.app/zh/docs/manual-deploy) · [寶塔面板](https://link.langbot.app/zh/docs/bt-panel) · [Kubernetes](https://docs.langbot.app/zh/deploy/langbot/kubernetes)
---

View File

@@ -37,7 +37,7 @@ LangBot là một **nền tảng mã nguồn mở, cấp sản xuất** để x
### Khả năng chính
- **Hội thoại AI & Agent** — Đối thoại nhiều lượt, gọi công cụ, hỗ trợ đa phương thức, đầu ra streaming. RAG (cơ sở kiến thức) tích hợp sẵn với tích hợp sâu vào [Dify](https://dify.ai), [Coze](https://coze.com), [n8n](https://n8n.io), [Langflow](https://langflow.org).
- **Hội thoại AI & Agent** — Đối thoại nhiều lượt, gọi công cụ, hỗ trợ đa phương thức, đầu ra streaming. RAG (cơ sở kiến thức) tích hợp sẵn với tích hợp sâu vào [Dify](https://dify.ai), [Coze](https://coze.com), [n8n](https://n8n.io), [Langflow](https://langflow.org), [Deerflow](https://deerflow.tech), [Weknora](https://weknora.weixin.qq.com).
- **Hỗ trợ đa nền tảng IM** — Một mã nguồn cho Discord, Telegram, Slack, LINE, QQ, WeChat, WeCom, Lark, DingTalk, KOOK.
- **Sẵn sàng cho sản xuất** — Kiểm soát truy cập, giới hạn tốc độ, lọc từ nhạy cảm, giám sát toàn diện và xử lý ngoại lệ. Được doanh nghiệp tin dùng.
- **Hệ sinh thái Plugin** — Hàng trăm plugin, kiến trúc hướng sự kiện, mở rộng thành phần, và hỗ trợ [giao thức MCP](https://modelcontextprotocol.io/).
@@ -77,7 +77,7 @@ docker compose up -d
[![Deploy on Zeabur](https://zeabur.com/button.svg)](https://zeabur.com/en-US/templates/ZKTBDH)
[![Deploy on Railway](https://railway.com/button.svg)](https://railway.app/template/yRrAyL?referralCode=vogKPF)
**Thêm tùy chọn:** [Docker](https://link.langbot.app/en/docs/docker) · [Thủ công](https://link.langbot.app/en/docs/manual-deploy) · [BTPanel](https://link.langbot.app/en/docs/bt-panel) · [Kubernetes](./docker/README_K8S.md)
**Thêm tùy chọn:** [Docker](https://link.langbot.app/en/docs/docker) · [Thủ công](https://link.langbot.app/en/docs/manual-deploy) · [BTPanel](https://link.langbot.app/en/docs/bt-panel) · [Kubernetes](https://docs.langbot.app/en/deploy/langbot/kubernetes)
---

View File

@@ -1,629 +0,0 @@
# LangBot Kubernetes 部署指南 / Kubernetes Deployment Guide
[简体中文](#简体中文) | [English](#english)
---
## 简体中文
### 概述
本指南提供了在 Kubernetes 集群中部署 LangBot 的完整步骤。Kubernetes 部署配置基于 `docker-compose.yaml`,适用于生产环境的容器化部署。
### 前置要求
- Kubernetes 集群(版本 1.19+
- `kubectl` 命令行工具已配置并可访问集群
- 集群中有可用的存储类StorageClass用于持久化存储可选但推荐
- 至少 2 vCPU 和 4GB RAM 的可用资源
### 架构说明
Kubernetes 部署包含以下组件:
1. **langbot**: 主应用服务
- 提供 Web UI端口 5300
- 处理平台 webhook端口 2280-2290
- 数据持久化卷
2. **langbot-plugin-runtime**: 插件运行时服务
- WebSocket 通信(端口 5400
- 插件数据持久化卷
3. **持久化存储**:
- `langbot-data`: LangBot 主数据
- `langbot-plugins`: 插件文件
- `langbot-plugin-runtime-data`: 插件运行时数据
### 快速开始
#### 1. 下载部署文件
```bash
# 克隆仓库
git clone https://github.com/langbot-app/LangBot
cd LangBot/docker
# 或直接下载 kubernetes.yaml
wget https://raw.githubusercontent.com/langbot-app/LangBot/main/docker/kubernetes.yaml
```
#### 2. 部署到 Kubernetes
```bash
# 应用所有配置
kubectl apply -f kubernetes.yaml
# 检查部署状态
kubectl get all -n langbot
# 查看 Pod 日志
kubectl logs -n langbot -l app=langbot -f
```
#### 3. 访问 LangBot
默认情况下LangBot 服务使用 ClusterIP 类型,只能在集群内部访问。您可以选择以下方式之一来访问:
**选项 A: 端口转发(推荐用于测试)**
```bash
kubectl port-forward -n langbot svc/langbot 5300:5300
```
然后访问 http://localhost:5300
**选项 B: NodePort适用于开发环境**
编辑 `kubernetes.yaml`,取消注释 NodePort Service 部分,然后:
```bash
kubectl apply -f kubernetes.yaml
# 获取节点 IP
kubectl get nodes -o wide
# 访问 http://<NODE_IP>:30300
```
**选项 C: LoadBalancer适用于云环境**
编辑 `kubernetes.yaml`,取消注释 LoadBalancer Service 部分,然后:
```bash
kubectl apply -f kubernetes.yaml
# 获取外部 IP
kubectl get svc -n langbot langbot-loadbalancer
# 访问 http://<EXTERNAL_IP>
```
**选项 D: Ingress推荐用于生产环境**
确保集群中已安装 Ingress Controller如 nginx-ingress然后
1. 编辑 `kubernetes.yaml` 中的 Ingress 配置
2. 修改域名为您的实际域名
3. 应用配置:
```bash
kubectl apply -f kubernetes.yaml
# 访问 http://langbot.yourdomain.com
```
### 配置说明
#### 环境变量
`ConfigMap` 中配置环境变量:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: langbot-config
namespace: langbot
data:
TZ: "Asia/Shanghai" # 修改为您的时区
```
#### 存储配置
默认使用动态存储分配。如果您有特定的 StorageClass请在 PVC 中指定:
```yaml
spec:
storageClassName: your-storage-class-name
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
```
#### 资源限制
根据您的需求调整资源限制:
```yaml
resources:
requests:
memory: "1Gi"
cpu: "500m"
limits:
memory: "4Gi"
cpu: "2000m"
```
### 常用操作
#### 查看日志
```bash
# 查看 LangBot 主服务日志
kubectl logs -n langbot -l app=langbot -f
# 查看插件运行时日志
kubectl logs -n langbot -l app=langbot-plugin-runtime -f
```
#### 重启服务
```bash
# 重启 LangBot
kubectl rollout restart deployment/langbot -n langbot
# 重启插件运行时
kubectl rollout restart deployment/langbot-plugin-runtime -n langbot
```
#### 更新镜像
```bash
# 更新到最新版本
kubectl set image deployment/langbot -n langbot langbot=rockchin/langbot:latest
kubectl set image deployment/langbot-plugin-runtime -n langbot langbot-plugin-runtime=rockchin/langbot:latest
# 检查更新状态
kubectl rollout status deployment/langbot -n langbot
```
#### 扩容(不推荐)
注意:由于 LangBot 使用 ReadWriteOnce 的持久化存储,不支持多副本扩容。如需高可用,请考虑使用 ReadWriteMany 存储或其他架构方案。
#### 备份数据
```bash
# 备份 PVC 数据
kubectl exec -n langbot -it <langbot-pod-name> -- tar czf /tmp/backup.tar.gz /app/data
kubectl cp langbot/<langbot-pod-name>:/tmp/backup.tar.gz ./backup.tar.gz
```
### 卸载
```bash
# 删除所有资源(保留 PVC
kubectl delete deployment,service,configmap -n langbot --all
# 删除 PVC会删除数据
kubectl delete pvc -n langbot --all
# 删除命名空间
kubectl delete namespace langbot
```
### 故障排查
#### Pod 无法启动
```bash
# 查看 Pod 状态
kubectl get pods -n langbot
# 查看详细信息
kubectl describe pod -n langbot <pod-name>
# 查看事件
kubectl get events -n langbot --sort-by='.lastTimestamp'
```
#### 存储问题
```bash
# 检查 PVC 状态
kubectl get pvc -n langbot
# 检查 PV
kubectl get pv
```
#### 网络访问问题
```bash
# 检查 Service
kubectl get svc -n langbot
# 检查端口转发
kubectl port-forward -n langbot svc/langbot 5300:5300
```
### 生产环境建议
1. **使用特定版本标签**:避免使用 `latest` 标签,使用具体版本号如 `rockchin/langbot:v1.0.0`
2. **配置资源限制**:根据实际负载调整 CPU 和内存限制
3. **使用 Ingress + TLS**:配置 HTTPS 访问和证书管理
4. **配置监控和告警**:集成 Prometheus、Grafana 等监控工具
5. **定期备份**:配置自动备份策略保护数据
6. **使用专用 StorageClass**:为生产环境配置高性能存储
7. **配置亲和性规则**:确保 Pod 调度到合适的节点
### 高级配置
#### 使用 Secrets 管理敏感信息
如果需要配置 API 密钥等敏感信息:
```yaml
apiVersion: v1
kind: Secret
metadata:
name: langbot-secrets
namespace: langbot
type: Opaque
data:
api_key: <base64-encoded-value>
```
然后在 Deployment 中引用:
```yaml
env:
- name: API_KEY
valueFrom:
secretKeyRef:
name: langbot-secrets
key: api_key
```
#### 配置水平自动扩缩容HPA
注意:需要确保使用 ReadWriteMany 存储类型
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: langbot-hpa
namespace: langbot
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: langbot
minReplicas: 1
maxReplicas: 3
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
```
### 参考资源
- [LangBot 官方文档](https://docs.langbot.app)
- [Docker 部署文档](https://link.langbot.app/zh/docs/docker)
- [Kubernetes 官方文档](https://kubernetes.io/docs/)
---
## English
### Overview
This guide provides complete steps for deploying LangBot in a Kubernetes cluster. The Kubernetes deployment configuration is based on `docker-compose.yaml` and is suitable for production containerized deployments.
### Prerequisites
- Kubernetes cluster (version 1.19+)
- `kubectl` command-line tool configured with cluster access
- Available StorageClass in the cluster for persistent storage (optional but recommended)
- At least 2 vCPU and 4GB RAM of available resources
### Architecture
The Kubernetes deployment includes the following components:
1. **langbot**: Main application service
- Provides Web UI (port 5300)
- Handles platform webhooks (ports 2280-2290)
- Data persistence volume
2. **langbot-plugin-runtime**: Plugin runtime service
- WebSocket communication (port 5400)
- Plugin data persistence volume
3. **Persistent Storage**:
- `langbot-data`: LangBot main data
- `langbot-plugins`: Plugin files
- `langbot-plugin-runtime-data`: Plugin runtime data
### Quick Start
#### 1. Download Deployment Files
```bash
# Clone repository
git clone https://github.com/langbot-app/LangBot
cd LangBot/docker
# Or download kubernetes.yaml directly
wget https://raw.githubusercontent.com/langbot-app/LangBot/main/docker/kubernetes.yaml
```
#### 2. Deploy to Kubernetes
```bash
# Apply all configurations
kubectl apply -f kubernetes.yaml
# Check deployment status
kubectl get all -n langbot
# View Pod logs
kubectl logs -n langbot -l app=langbot -f
```
#### 3. Access LangBot
By default, LangBot service uses ClusterIP type, accessible only within the cluster. Choose one of the following methods to access:
**Option A: Port Forwarding (Recommended for testing)**
```bash
kubectl port-forward -n langbot svc/langbot 5300:5300
```
Then visit http://localhost:5300
**Option B: NodePort (Suitable for development)**
Edit `kubernetes.yaml`, uncomment the NodePort Service section, then:
```bash
kubectl apply -f kubernetes.yaml
# Get node IP
kubectl get nodes -o wide
# Visit http://<NODE_IP>:30300
```
**Option C: LoadBalancer (Suitable for cloud environments)**
Edit `kubernetes.yaml`, uncomment the LoadBalancer Service section, then:
```bash
kubectl apply -f kubernetes.yaml
# Get external IP
kubectl get svc -n langbot langbot-loadbalancer
# Visit http://<EXTERNAL_IP>
```
**Option D: Ingress (Recommended for production)**
Ensure an Ingress Controller (e.g., nginx-ingress) is installed in the cluster, then:
1. Edit the Ingress configuration in `kubernetes.yaml`
2. Change the domain to your actual domain
3. Apply configuration:
```bash
kubectl apply -f kubernetes.yaml
# Visit http://langbot.yourdomain.com
```
### Configuration
#### Environment Variables
Configure environment variables in ConfigMap:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: langbot-config
namespace: langbot
data:
TZ: "Asia/Shanghai" # Change to your timezone
```
#### Storage Configuration
Uses dynamic storage provisioning by default. If you have a specific StorageClass, specify it in PVC:
```yaml
spec:
storageClassName: your-storage-class-name
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
```
#### Resource Limits
Adjust resource limits based on your needs:
```yaml
resources:
requests:
memory: "1Gi"
cpu: "500m"
limits:
memory: "4Gi"
cpu: "2000m"
```
### Common Operations
#### View Logs
```bash
# View LangBot main service logs
kubectl logs -n langbot -l app=langbot -f
# View plugin runtime logs
kubectl logs -n langbot -l app=langbot-plugin-runtime -f
```
#### Restart Services
```bash
# Restart LangBot
kubectl rollout restart deployment/langbot -n langbot
# Restart plugin runtime
kubectl rollout restart deployment/langbot-plugin-runtime -n langbot
```
#### Update Images
```bash
# Update to latest version
kubectl set image deployment/langbot -n langbot langbot=rockchin/langbot:latest
kubectl set image deployment/langbot-plugin-runtime -n langbot langbot-plugin-runtime=rockchin/langbot:latest
# Check update status
kubectl rollout status deployment/langbot -n langbot
```
#### Scaling (Not Recommended)
Note: Due to LangBot using ReadWriteOnce persistent storage, multi-replica scaling is not supported. For high availability, consider using ReadWriteMany storage or alternative architectures.
#### Backup Data
```bash
# Backup PVC data
kubectl exec -n langbot -it <langbot-pod-name> -- tar czf /tmp/backup.tar.gz /app/data
kubectl cp langbot/<langbot-pod-name>:/tmp/backup.tar.gz ./backup.tar.gz
```
### Uninstall
```bash
# Delete all resources (keep PVCs)
kubectl delete deployment,service,configmap -n langbot --all
# Delete PVCs (will delete data)
kubectl delete pvc -n langbot --all
# Delete namespace
kubectl delete namespace langbot
```
### Troubleshooting
#### Pods Not Starting
```bash
# Check Pod status
kubectl get pods -n langbot
# View detailed information
kubectl describe pod -n langbot <pod-name>
# View events
kubectl get events -n langbot --sort-by='.lastTimestamp'
```
#### Storage Issues
```bash
# Check PVC status
kubectl get pvc -n langbot
# Check PV
kubectl get pv
```
#### Network Access Issues
```bash
# Check Service
kubectl get svc -n langbot
# Test port forwarding
kubectl port-forward -n langbot svc/langbot 5300:5300
```
### Production Recommendations
1. **Use specific version tags**: Avoid using `latest` tag, use specific version like `rockchin/langbot:v1.0.0`
2. **Configure resource limits**: Adjust CPU and memory limits based on actual load
3. **Use Ingress + TLS**: Configure HTTPS access and certificate management
4. **Configure monitoring and alerts**: Integrate monitoring tools like Prometheus, Grafana
5. **Regular backups**: Configure automated backup strategy to protect data
6. **Use dedicated StorageClass**: Configure high-performance storage for production
7. **Configure affinity rules**: Ensure Pods are scheduled to appropriate nodes
### Advanced Configuration
#### Using Secrets for Sensitive Information
If you need to configure sensitive information like API keys:
```yaml
apiVersion: v1
kind: Secret
metadata:
name: langbot-secrets
namespace: langbot
type: Opaque
data:
api_key: <base64-encoded-value>
```
Then reference in Deployment:
```yaml
env:
- name: API_KEY
valueFrom:
secretKeyRef:
name: langbot-secrets
key: api_key
```
#### Configure Horizontal Pod Autoscaling (HPA)
Note: Requires ReadWriteMany storage type
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: langbot-hpa
namespace: langbot
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: langbot
minReplicas: 1
maxReplicas: 3
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
```
### References
- [LangBot Official Documentation](https://docs.langbot.app)
- [Docker Deployment Guide](https://link.langbot.app/zh/docs/docker)
- [Kubernetes Official Documentation](https://kubernetes.io/docs/)

View File

@@ -1,5 +1,5 @@
# Docker Compose configuration for LangBot
# For Kubernetes deployment, see kubernetes.yaml and README_K8S.md
# For Kubernetes deployment, see kubernetes.yaml and the deployment guide at https://docs.langbot.app
version: "3"
services:

View File

@@ -1,6 +1,8 @@
# Kubernetes Deployment for LangBot
# This file provides Kubernetes deployment manifests for LangBot based on docker-compose.yaml
#
#
# Full deployment guide (zh/en/ja): https://docs.langbot.app -> Installation -> Kubernetes
#
# Usage:
# kubectl apply -f kubernetes.yaml
#
@@ -8,13 +10,15 @@
# - A Kubernetes cluster (1.19+)
# - kubectl configured to communicate with your cluster
# - (Optional) A StorageClass for dynamic volume provisioning
# - For the Box sandbox runtime: a node with a reachable Docker daemon
# (the box mounts the node's /var/run/docker.sock). See the deployment guide.
#
# Components:
# - Namespace: langbot
# - PersistentVolumeClaims for data persistence
# - Deployments for langbot and langbot_plugin_runtime
# - Deployments for langbot, langbot-plugin-runtime, and langbot-box (sandbox)
# - Services for network access
# - ConfigMap for timezone configuration
# - ConfigMap for timezone + runtime endpoints
---
# Namespace
@@ -83,6 +87,11 @@ metadata:
data:
TZ: "Asia/Shanghai"
PLUGIN__RUNTIME_WS_URL: "ws://langbot-plugin-runtime:5400/control/ws"
# Box sandbox runtime endpoint. LangBot connects to the Box runtime over
# WebSocket. The hostname MUST match the langbot-box Service name. Note the
# in-container default ("langbot_box") uses an underscore, which is an
# invalid Kubernetes DNS name — so the endpoint is always set explicitly here.
BOX__RUNTIME__ENDPOINT: "ws://langbot-box:5410"
---
# Deployment for LangBot Plugin Runtime
@@ -169,6 +178,136 @@ spec:
protocol: TCP
name: runtime
---
# Deployment for LangBot Box (sandbox) runtime
#
# The Box runtime backs LangBot's sandbox tools (exec / read / write / edit /
# glob / grep), the `activate` skill tool, skill add/edit, and stdio-mode MCP
# servers. It is OPTIONAL: if you do not deploy it, set `BOX__ENABLED=false` on
# the langbot Deployment (or `box.enabled: false` in config.yaml) so the
# dashboard renders cleanly with sandbox features disabled.
#
# IMPORTANT — how the sandbox actually runs:
# The bundled image ships only the Docker CLI (no dockerd, no nsjail). The Box
# runtime therefore creates sandbox containers by talking to a Docker daemon
# over the mounted socket (`/var/run/docker.sock`). Because that daemon
# resolves bind-mount paths on the NODE filesystem, the Box workspace root
# must be the SAME absolute path inside the box container, inside every
# sandbox container it spawns, AND on the node. That is why this manifest uses
# a hostPath at a fixed absolute path (/app/data/box) and pins langbot + box
# to the same node via podAffinity. A normal PVC will NOT work for the box
# workspace, because the node's dockerd cannot see paths that exist only
# inside the pod's mount namespace.
#
# Security note: mounting the host Docker socket grants the Box runtime (and any
# code executed in the sandbox) effective root on the node. Only deploy Box on
# nodes you trust for this workload, ideally a dedicated node pool. For a
# stronger isolation boundary, switch box.backend to 'e2b' (set E2B_API_KEY) and
# drop the docker.sock mount + hostPath entirely.
apiVersion: apps/v1
kind: Deployment
metadata:
name: langbot-box
namespace: langbot
labels:
app: langbot-box
spec:
replicas: 1
selector:
matchLabels:
app: langbot-box
template:
metadata:
labels:
app: langbot-box
spec:
# Pin to the same node as langbot so they share the hostPath box root.
affinity:
podAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchLabels:
app: langbot
topologyKey: kubernetes.io/hostname
containers:
- name: langbot-box
image: rockchin/langbot:latest
imagePullPolicy: Always
# Launched through the same CLI entry point as the plugin runtime.
# No flag => WebSocket control transport (default), listening on 5410.
command: ["uv", "run", "--no-sync", "-m", "langbot_plugin.cli.__init__", "box"]
ports:
- containerPort: 5410
name: box-rpc
protocol: TCP
env:
- name: TZ
valueFrom:
configMapKeyRef:
name: langbot-config
key: TZ
# The Box runtime does NOT read box.local.* / BOX__* from its own env;
# it receives its configuration from LangBot via the INIT RPC action.
# Do not add BOX__* here — they would be silently ignored.
volumeMounts:
# Box workspace root — identical path on node, box, and sandbox
# containers (see the IMPORTANT note above).
- name: box-root
mountPath: /app/data/box
# Host Docker socket — the sandbox backend uses it to create containers.
- name: docker-sock
mountPath: /var/run/docker.sock
resources:
requests:
memory: "256Mi"
cpu: "100m"
limits:
memory: "1Gi"
cpu: "1000m"
livenessProbe:
tcpSocket:
port: 5410
initialDelaySeconds: 20
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
tcpSocket:
port: 5410
initialDelaySeconds: 10
periodSeconds: 5
timeoutSeconds: 3
failureThreshold: 3
volumes:
- name: box-root
hostPath:
path: /app/data/box
type: DirectoryOrCreate
- name: docker-sock
hostPath:
path: /var/run/docker.sock
type: Socket
restartPolicy: Always
---
# Service for LangBot Box runtime
apiVersion: v1
kind: Service
metadata:
name: langbot-box
namespace: langbot
labels:
app: langbot-box
spec:
type: ClusterIP
selector:
app: langbot-box
ports:
- port: 5410
targetPort: 5410
protocol: TCP
name: box-rpc
---
# Deployment for LangBot
apiVersion: apps/v1
@@ -213,11 +352,36 @@ spec:
configMapKeyRef:
name: langbot-config
key: PLUGIN__RUNTIME_WS_URL
# Box (sandbox) runtime endpoint. Connects LangBot to the langbot-box
# Service over WebSocket. Remove this (and the langbot-box Deployment)
# and set BOX__ENABLED=false if you do not want the sandbox.
- name: BOX__RUNTIME__ENDPOINT
valueFrom:
configMapKeyRef:
name: langbot-config
key: BOX__RUNTIME__ENDPOINT
# box.local.* config — forwarded to the Box runtime via INIT RPC. The
# host_root MUST match the box-root hostPath mountPath below AND the box
# Deployment's box-root mountPath, so that skill package paths resolve
# identically on both sides and on the node's Docker daemon.
- name: BOX__LOCAL__HOST_ROOT
value: "/app/data/box"
- name: BOX__LOCAL__DEFAULT_WORKSPACE
value: "default"
- name: BOX__LOCAL__SKILLS_ROOT
value: "skills"
- name: BOX__LOCAL__ALLOWED_MOUNT_ROOTS
value: "/app/data/box"
volumeMounts:
- name: data
mountPath: /app/data
- name: plugins
mountPath: /app/plugins
# Same node-level box root as the langbot-box Deployment. Mounted over
# the data PVC's /app/data/box subpath so both LangBot and the Box
# runtime (and the node's dockerd) agree on one absolute path.
- name: box-root
mountPath: /app/data/box
resources:
requests:
memory: "1Gi"
@@ -250,6 +414,13 @@ spec:
- name: plugins
persistentVolumeClaim:
claimName: langbot-plugins
# Node-level box workspace root, shared with the langbot-box Deployment.
# hostPath (not PVC) because the node's Docker daemon must see the same
# absolute path when bind-mounting workspaces into sandbox containers.
- name: box-root
hostPath:
path: /app/data/box
type: DirectoryOrCreate
restartPolicy: Always
---

View File

@@ -8,7 +8,7 @@ requires-python = ">=3.11,<4.0"
dependencies = [
"aiocqhttp>=1.4.4",
"aiofiles>=24.1.0",
"aiohttp>=3.13.4",
"aiohttp>=3.14.0",
"aioshutil>=1.5",
"aiosqlite>=0.21.0",
"anthropic>=0.51.0",
@@ -31,27 +31,27 @@ dependencies = [
"psutil>=7.0.0",
"pycryptodome>=3.22.0",
"pydantic>2.0",
"pyjwt>=2.10.1",
"pyjwt>=2.12.0",
"python-telegram-bot>=22.0",
"pyyaml>=6.0.2",
"qq-botpy-rc>=1.2.1.6",
"qrcode>=7.4",
"quart>=0.20.0",
"quart-cors>=0.8.0",
"requests>=2.32.3",
"requests>=2.33.0",
"slack-sdk>=3.35.0",
"alembic>=1.15.0",
"sqlalchemy[asyncio]>=2.0.40",
"sqlmodel>=0.0.24",
"telegramify-markdown>=0.5.1",
"tiktoken>=0.9.0",
"urllib3>=2.4.0",
"urllib3>=2.7.0",
"websockets>=15.0.1",
"python-socks>=2.7.1", # dingtalk missing dependency
"pip>=25.1.1",
"pip>=26.1",
"ruff>=0.11.9",
"pre-commit>=4.2.0",
"uv>=0.11.6",
"uv>=0.11.15",
"mypy>=1.16.0",
"PyPDF2>=3.0.1",
"python-docx>=1.1.0",
@@ -62,10 +62,10 @@ dependencies = [
"ebooklib>=0.18",
"html2text>=2024.2.26",
"langchain>=0.2.0",
"langchain-core>=1.2.28",
"langsmith>=0.7.31",
"python-multipart>=0.0.26",
"Mako>=1.3.11",
"langchain-core>=1.3.3",
"langsmith>=0.8.0",
"python-multipart>=0.0.27",
"Mako>=1.3.12",
"langchain-text-splitters>=1.1.2",
"chromadb>=1.0.0,<2.0.0",
"qdrant-client (>=1.15.1,<2.0.0)",
@@ -79,7 +79,6 @@ dependencies = [
"pymilvus>=2.6.4",
"pgvector>=0.4.1",
"botocore>=1.42.39",
"litellm>=1.0.0",
]
keywords = [
"bot",

View File

@@ -0,0 +1,5 @@
from .client import AsyncDeerFlowClient
from .errors import DeerFlowAPIError
from . import stream_utils
__all__ = ['AsyncDeerFlowClient', 'DeerFlowAPIError', 'stream_utils']

View File

@@ -0,0 +1,204 @@
"""DeerFlow LangGraph HTTP API 客户端
参考 astrbot 的 deerflow_api_client 实现,使用 httpx 适配 LangBot 风格。
"""
from __future__ import annotations
import codecs
import json
import typing
from collections.abc import AsyncGenerator
import httpx
from .errors import DeerFlowAPIError
SSE_MAX_BUFFER_CHARS = 1_048_576
def _normalize_sse_newlines(text: str) -> str:
"""规范化 CRLF/CR 为 LF确保 SSE 块分割稳定"""
return text.replace('\r\n', '\n').replace('\r', '\n')
def _parse_sse_data_lines(data_lines: list[str]) -> typing.Any:
raw_data = '\n'.join(data_lines)
try:
return json.loads(raw_data)
except json.JSONDecodeError:
# 某些 LangGraph 兼容服务端会在单个 SSE 事件中用多个 data 行
# 发送多段 JSON 片段(例如 tuple payload
parsed_lines: list[typing.Any] = []
can_parse_all = True
for line in data_lines:
line = line.strip()
if not line:
continue
try:
parsed_lines.append(json.loads(line))
except json.JSONDecodeError:
can_parse_all = False
break
if can_parse_all and parsed_lines:
return parsed_lines[0] if len(parsed_lines) == 1 else parsed_lines
return raw_data
def _parse_sse_block(block: str) -> dict[str, typing.Any] | None:
if not block.strip():
return None
event_name = 'message'
data_lines: list[str] = []
for line in block.splitlines():
if line.startswith('event:'):
event_name = line[6:].strip()
elif line.startswith('data:'):
data_lines.append(line[5:].lstrip())
if not data_lines:
return None
return {'event': event_name, 'data': _parse_sse_data_lines(data_lines)}
class AsyncDeerFlowClient:
"""DeerFlow LangGraph HTTP API 客户端"""
api_base: str
headers: dict[str, str]
def __init__(
self,
api_base: str = 'http://127.0.0.1:2026',
api_key: str = '',
auth_header: str = '',
) -> None:
self.api_base = api_base.rstrip('/')
self.headers: dict[str, str] = {}
if auth_header:
self.headers['Authorization'] = auth_header
elif api_key:
self.headers['Authorization'] = f'Bearer {api_key}'
async def create_thread(self, timeout: float = 20) -> dict[str, typing.Any]:
"""创建一个新的 LangGraph thread
Returns:
包含 thread_id 等信息的字典
"""
url = f'{self.api_base}/api/langgraph/threads'
payload = {'metadata': {}}
async with httpx.AsyncClient(
trust_env=True,
timeout=timeout,
) as client:
response = await client.post(
url,
headers=self.headers,
json=payload,
)
if response.status_code not in (200, 201):
raise DeerFlowAPIError(
operation='create thread',
status=response.status_code,
body=response.text,
url=url,
)
return response.json()
async def delete_thread(self, thread_id: str, timeout: float = 20) -> None:
"""删除指定 thread"""
url = f'{self.api_base}/api/threads/{thread_id}'
async with httpx.AsyncClient(
trust_env=True,
timeout=timeout,
) as client:
response = await client.delete(url, headers=self.headers)
if response.status_code not in (200, 202, 204, 404):
raise DeerFlowAPIError(
operation='delete thread',
status=response.status_code,
body=response.text,
url=url,
thread_id=thread_id,
)
async def stream_run(
self,
thread_id: str,
payload: dict[str, typing.Any],
timeout: float = 120,
) -> AsyncGenerator[dict[str, typing.Any], None]:
"""运行一次 LangGraph stream 请求,逐事件 yield
Yields:
事件字典 {'event': event_name, 'data': parsed_data}
"""
url = f'{self.api_base}/api/langgraph/threads/{thread_id}/runs/stream'
# 流式请求使用单独的 read timeout 控制
stream_timeout = httpx.Timeout(
connect=min(timeout, 30),
read=timeout,
write=timeout,
pool=timeout,
)
async with httpx.AsyncClient(
trust_env=True,
timeout=stream_timeout,
) as client:
async with client.stream(
'POST',
url,
headers={
**self.headers,
'Accept': 'text/event-stream',
'Content-Type': 'application/json',
},
json=payload,
) as resp:
if resp.status_code != 200:
body = await resp.aread()
raise DeerFlowAPIError(
operation='runs/stream request',
status=resp.status_code,
body=body.decode('utf-8', errors='replace'),
url=url,
thread_id=thread_id,
)
decoder = codecs.getincrementaldecoder('utf-8')('replace')
buffer = ''
async for chunk in resp.aiter_bytes(8192):
buffer += _normalize_sse_newlines(decoder.decode(chunk))
while '\n\n' in buffer:
block, buffer = buffer.split('\n\n', 1)
parsed = _parse_sse_block(block)
if parsed is not None:
yield parsed
if len(buffer) > SSE_MAX_BUFFER_CHARS:
# 缓冲区过大,强制 flush
parsed = _parse_sse_block(buffer)
if parsed is not None:
yield parsed
buffer = ''
# flush 剩余内容
buffer += _normalize_sse_newlines(decoder.decode(b'', final=True))
while '\n\n' in buffer:
block, buffer = buffer.split('\n\n', 1)
parsed = _parse_sse_block(block)
if parsed is not None:
yield parsed
if buffer.strip():
parsed = _parse_sse_block(buffer)
if parsed is not None:
yield parsed

View File

@@ -0,0 +1,30 @@
from __future__ import annotations
class DeerFlowAPIError(Exception):
"""DeerFlow API 请求失败"""
def __init__(
self,
*,
operation: str = '',
status: int = 0,
body: str = '',
url: str = '',
thread_id: str | None = None,
message: str = '',
) -> None:
self.operation = operation
self.status = status
self.body = body
self.url = url
self.thread_id = thread_id
if message:
super().__init__(message)
return
msg = f'DeerFlow {operation} failed: status={status}, url={url}, body={body}'
if thread_id is not None:
msg = f'DeerFlow {operation} failed: thread_id={thread_id}, status={status}, url={url}, body={body}'
super().__init__(msg)

View File

@@ -0,0 +1,212 @@
"""DeerFlow LangGraph 流式响应解析工具
参考 astrbot 实现的 deerflow_stream_utils。
"""
from __future__ import annotations
import typing
from collections.abc import Iterable
def extract_text(content: typing.Any) -> str:
"""从消息 content 中提取纯文本"""
if isinstance(content, str):
return content
if isinstance(content, dict):
if isinstance(content.get('text'), str):
return content['text']
if 'content' in content:
return extract_text(content.get('content'))
if 'kwargs' in content and isinstance(content['kwargs'], dict):
return extract_text(content['kwargs'].get('content'))
if isinstance(content, list):
parts: list[str] = []
for item in content:
if isinstance(item, str):
parts.append(item)
elif isinstance(item, dict):
item_type = item.get('type')
if item_type == 'text' and isinstance(item.get('text'), str):
parts.append(item['text'])
elif 'content' in item:
parts.append(extract_text(item['content']))
return '\n'.join([p for p in parts if p]).strip()
return str(content) if content is not None else ''
def extract_messages_from_values_data(data: typing.Any) -> list[typing.Any]:
"""从 values 事件中提取 messages 列表"""
candidates: list[typing.Any] = []
if isinstance(data, dict):
candidates.append(data)
if isinstance(data.get('values'), dict):
candidates.append(data['values'])
elif isinstance(data, list):
candidates.extend([x for x in data if isinstance(x, dict)])
for item in candidates:
messages = item.get('messages')
if isinstance(messages, list):
return messages
return []
def is_ai_message(message: dict[str, typing.Any]) -> bool:
"""判断是否为 AI/assistant 消息"""
role = str(message.get('role', '')).lower()
if role in {'assistant', 'ai'}:
return True
msg_type = str(message.get('type', '')).lower()
if msg_type in {'ai', 'assistant', 'aimessage', 'aimessagechunk'}:
return True
if 'ai' in msg_type and all(token not in msg_type for token in ('human', 'tool', 'system')):
return True
return False
def extract_latest_ai_text(messages: Iterable[typing.Any]) -> str:
"""获取最近一条 AI 消息的文本内容"""
if isinstance(messages, (list, tuple)):
iterable = reversed(messages)
else:
iterable = reversed(list(messages))
for msg in iterable:
if not isinstance(msg, dict):
continue
if is_ai_message(msg):
text = extract_text(msg.get('content'))
if text:
return text
return ''
def extract_latest_ai_message(messages: Iterable[typing.Any]) -> dict[str, typing.Any] | None:
"""获取最近一条 AI 消息对象"""
if isinstance(messages, (list, tuple)):
iterable = reversed(messages)
else:
iterable = reversed(list(messages))
for msg in iterable:
if not isinstance(msg, dict):
continue
if is_ai_message(msg):
return msg
return None
def is_clarification_tool_message(message: dict[str, typing.Any]) -> bool:
"""判断是否为澄清问题工具消息"""
msg_type = str(message.get('type', '')).lower()
tool_name = str(message.get('name', '')).lower()
return msg_type == 'tool' and tool_name == 'ask_clarification'
def extract_latest_clarification_text(messages: Iterable[typing.Any]) -> str:
"""提取最近的澄清问题文本"""
if isinstance(messages, (list, tuple)):
iterable = reversed(messages)
else:
iterable = reversed(list(messages))
for msg in iterable:
if not isinstance(msg, dict):
continue
if is_clarification_tool_message(msg):
text = extract_text(msg.get('content'))
if text:
return text
return ''
def get_message_id(message: typing.Any) -> str:
"""提取消息 ID"""
if not isinstance(message, dict):
return ''
msg_id = message.get('id')
return msg_id if isinstance(msg_id, str) else ''
def extract_event_message_obj(data: typing.Any) -> dict[str, typing.Any] | None:
"""从事件 data 中提取消息对象"""
msg_obj = data
if isinstance(data, (list, tuple)) and data:
msg_obj = data[0]
if isinstance(msg_obj, dict) and isinstance(msg_obj.get('data'), dict):
msg_obj = msg_obj['data']
return msg_obj if isinstance(msg_obj, dict) else None
def extract_ai_delta_from_event_data(data: typing.Any) -> str:
"""从 messages-tuple 事件中提取 AI delta 文本"""
msg_obj = extract_event_message_obj(data)
if not msg_obj:
return ''
if is_ai_message(msg_obj):
return extract_text(msg_obj.get('content'))
return ''
def extract_clarification_from_event_data(data: typing.Any) -> str:
"""从事件中提取澄清问题"""
msg_obj = extract_event_message_obj(data)
if not msg_obj:
return ''
if is_clarification_tool_message(msg_obj):
return extract_text(msg_obj.get('content'))
return ''
def _iter_custom_event_items(data: typing.Any) -> list[dict[str, typing.Any]]:
items: list[dict[str, typing.Any]] = []
if isinstance(data, dict):
return [data]
if isinstance(data, list):
for item in data:
if isinstance(item, dict):
items.append(item)
elif isinstance(item, (list, tuple)):
for nested in item:
if isinstance(nested, dict):
items.append(nested)
return items
def extract_task_failures_from_custom_event(data: typing.Any) -> list[str]:
"""从 custom 事件中提取子任务失败信息"""
failures: list[str] = []
for item in _iter_custom_event_items(data):
event_type = str(item.get('type', '')).lower()
if event_type not in {'task_failed', 'task_timed_out'}:
continue
task_id = str(item.get('task_id', '')).strip()
error_text = extract_text(item.get('error')).strip()
if task_id and error_text:
failures.append(f'{task_id}: {error_text}')
elif error_text:
failures.append(error_text)
elif task_id:
failures.append(f'{task_id}: unknown error')
else:
failures.append('unknown task failure')
return failures
def build_task_failure_summary(failures: list[str]) -> str:
"""构建任务失败摘要"""
if not failures:
return ''
deduped: list[str] = []
seen: set[str] = set()
for failure in failures:
if failure not in seen:
seen.add(failure)
deduped.append(failure)
if len(deduped) == 1:
return f'DeerFlow subtask failed: {deduped[0]}'
joined = '\n'.join([f'- {item}' for item in deduped[:5]])
return f'DeerFlow subtasks failed:\n{joined}'

View File

@@ -0,0 +1,4 @@
from .client import AsyncWeKnoraClient
from .errors import WeKnoraAPIError
__all__ = ['AsyncWeKnoraClient', 'WeKnoraAPIError']

View File

@@ -0,0 +1,180 @@
from __future__ import annotations
import httpx
import typing
import json
from .errors import WeKnoraAPIError
class AsyncWeKnoraClient:
"""WeKnora API 客户端"""
api_key: str
base_url: str
def __init__(
self,
api_key: str,
base_url: str = 'http://localhost:80/api/v1',
) -> None:
self.api_key = api_key
self.base_url = base_url
async def create_session(
self,
title: str = '',
description: str = '',
timeout: float = 30.0,
) -> str:
"""创建会话,返回 session_id"""
async with httpx.AsyncClient(
base_url=self.base_url,
trust_env=True,
timeout=timeout,
) as client:
payload: dict[str, typing.Any] = {}
if title:
payload['title'] = title
if description:
payload['description'] = description
response = await client.post(
'/sessions',
headers={
'X-API-Key': self.api_key,
'Content-Type': 'application/json',
},
json=payload,
)
if response.status_code not in (200, 201):
raise WeKnoraAPIError(f'{response.status_code} {response.text}')
data = response.json()
return data['data']['id']
async def agent_chat(
self,
session_id: str,
query: str,
user: str,
agent_id: str = '',
knowledge_base_ids: list[str] | None = None,
web_search_enabled: bool = False,
timeout: float = 120.0,
) -> typing.AsyncGenerator[dict[str, typing.Any], None]:
"""
Agent 智能对话SSE 流式)
响应事件类型:
- agent_query: Agent 开始处理
- thinking: 思考过程
- tool_call: 工具调用
- tool_result: 工具结果
- references: 知识库引用
- answer: 回答内容
- reflection: 反思
- session_title: 会话标题
- error: 错误
"""
if knowledge_base_ids is None:
knowledge_base_ids = []
async with httpx.AsyncClient(
base_url=self.base_url,
trust_env=True,
timeout=timeout,
) as client:
payload: dict[str, typing.Any] = {
'query': query,
'agent_enabled': True,
'channel': 'im',
}
if agent_id:
payload['agent_id'] = agent_id
if knowledge_base_ids:
payload['knowledge_base_ids'] = knowledge_base_ids
if web_search_enabled:
payload['web_search_enabled'] = True
async with client.stream(
'POST',
f'/agent-chat/{session_id}',
headers={
'X-API-Key': self.api_key,
'Content-Type': 'application/json',
},
json=payload,
) as r:
async for chunk in r.aiter_lines():
if r.status_code != 200:
raise WeKnoraAPIError(f'{r.status_code} {chunk}')
if chunk.strip() == '':
continue
if chunk.startswith('data:'):
try:
data = json.loads(chunk[5:].strip())
except json.JSONDecodeError:
continue
yield data
# 收到 error 事件后主动结束流,避免上层未 raise 时持续等待
if data.get('response_type') == 'error':
return
async def knowledge_chat(
self,
session_id: str,
query: str,
user: str,
agent_id: str = 'builtin-quick-answer',
knowledge_base_ids: list[str] | None = None,
timeout: float = 120.0,
) -> typing.AsyncGenerator[dict[str, typing.Any], None]:
"""
知识库 RAG 问答SSE 流式)
响应事件类型:
- references: 知识库引用
- answer: 回答内容
"""
if knowledge_base_ids is None:
knowledge_base_ids = []
async with httpx.AsyncClient(
base_url=self.base_url,
trust_env=True,
timeout=timeout,
) as client:
payload: dict[str, typing.Any] = {
'query': query,
'channel': 'im',
}
if agent_id:
payload['agent_id'] = agent_id
if knowledge_base_ids:
payload['knowledge_base_ids'] = knowledge_base_ids
async with client.stream(
'POST',
f'/knowledge-chat/{session_id}',
headers={
'X-API-Key': self.api_key,
'Content-Type': 'application/json',
},
json=payload,
) as r:
async for chunk in r.aiter_lines():
if r.status_code != 200:
raise WeKnoraAPIError(f'{r.status_code} {chunk}')
if chunk.strip() == '':
continue
if chunk.startswith('data:'):
try:
data = json.loads(chunk[5:].strip())
except json.JSONDecodeError:
continue
yield data
# 收到 error 事件后主动结束流,避免上层未 raise 时持续等待
if data.get('response_type') == 'error':
return

View File

@@ -0,0 +1,6 @@
class WeKnoraAPIError(Exception):
"""WeKnora API 请求失败"""
def __init__(self, message: str = ''):
self.message = message
super().__init__(self.message)

View File

@@ -46,30 +46,6 @@ class MonitoringRouterGroup(group.RouterGroup):
return self.success(data=metrics)
@self.route('/token-statistics', methods=['GET'], auth_type=group.AuthType.USER_TOKEN)
async def get_token_statistics() -> str:
"""Get detailed token usage statistics (summary, per-model, timeseries)."""
bot_ids = quart.request.args.getlist('botId')
pipeline_ids = quart.request.args.getlist('pipelineId')
start_time_str = quart.request.args.get('startTime')
end_time_str = quart.request.args.get('endTime')
bucket = quart.request.args.get('bucket', 'hour')
if bucket not in ('hour', 'day'):
bucket = 'hour'
start_time = parse_iso_datetime(start_time_str)
end_time = parse_iso_datetime(end_time_str)
stats = await self.ap.monitoring_service.get_token_statistics(
bot_ids=bot_ids if bot_ids else None,
pipeline_ids=pipeline_ids if pipeline_ids else None,
start_time=start_time,
end_time=end_time,
bucket=bucket,
)
return self.success(data=stats)
@self.route('/messages', methods=['GET'], auth_type=group.AuthType.USER_TOKEN)
async def get_messages() -> str:
"""Get message logs"""

View File

@@ -472,179 +472,6 @@ class MonitoringService:
'active_sessions': active_sessions,
}
async def get_token_statistics(
self,
bot_ids: list[str] | None = None,
pipeline_ids: list[str] | None = None,
start_time: datetime.datetime | None = None,
end_time: datetime.datetime | None = None,
bucket: str = 'hour',
) -> dict:
"""Get detailed token usage statistics for production observability.
Returns:
- summary: aggregate token counters and call/latency stats over the window
- by_model: per-model token + call breakdown (sorted by total tokens desc)
- timeseries: token usage bucketed by `bucket` ('hour' or 'day')
Only successful LLM calls are counted toward token totals; error calls are
reported separately so a spike in failures is visible without polluting
token accounting.
"""
LLMCall = persistence_monitoring.MonitoringLLMCall
conditions = []
if bot_ids:
conditions.append(LLMCall.bot_id.in_(bot_ids))
if pipeline_ids:
conditions.append(LLMCall.pipeline_id.in_(pipeline_ids))
if start_time:
conditions.append(LLMCall.timestamp >= start_time)
if end_time:
conditions.append(LLMCall.timestamp <= end_time)
def _apply(query):
if conditions:
query = query.where(sqlalchemy.and_(*conditions))
return query
# ---- Summary aggregates ----
summary_query = _apply(
sqlalchemy.select(
sqlalchemy.func.count(LLMCall.id),
sqlalchemy.func.coalesce(sqlalchemy.func.sum(LLMCall.input_tokens), 0),
sqlalchemy.func.coalesce(sqlalchemy.func.sum(LLMCall.output_tokens), 0),
sqlalchemy.func.coalesce(sqlalchemy.func.sum(LLMCall.total_tokens), 0),
sqlalchemy.func.coalesce(sqlalchemy.func.sum(LLMCall.duration), 0),
sqlalchemy.func.coalesce(sqlalchemy.func.sum(LLMCall.cost), 0.0),
sqlalchemy.func.sum(sqlalchemy.case((LLMCall.status == 'success', 1), else_=0)),
sqlalchemy.func.sum(sqlalchemy.case((LLMCall.status == 'error', 1), else_=0)),
# Count of successful calls that nonetheless recorded zero tokens —
# a data-quality signal that usage reporting may be broken upstream.
sqlalchemy.func.sum(
sqlalchemy.case(
(sqlalchemy.and_(LLMCall.status == 'success', LLMCall.total_tokens == 0), 1),
else_=0,
)
),
)
)
summary_result = await self.ap.persistence_mgr.execute_async(summary_query)
row = summary_result.first()
(
total_calls,
total_input_tokens,
total_output_tokens,
total_tokens,
total_duration,
total_cost,
success_calls,
error_calls,
zero_token_success_calls,
) = row if row else (0, 0, 0, 0, 0, 0.0, 0, 0, 0)
total_calls = total_calls or 0
success_calls = success_calls or 0
error_calls = error_calls or 0
zero_token_success_calls = zero_token_success_calls or 0
summary = {
'total_calls': total_calls,
'success_calls': success_calls,
'error_calls': error_calls,
'total_input_tokens': int(total_input_tokens or 0),
'total_output_tokens': int(total_output_tokens or 0),
'total_tokens': int(total_tokens or 0),
'total_cost': round(float(total_cost or 0.0), 6),
'avg_tokens_per_call': int((total_tokens or 0) / total_calls) if total_calls > 0 else 0,
'avg_duration_ms': int((total_duration or 0) / total_calls) if total_calls > 0 else 0,
'avg_tokens_per_second': round((total_output_tokens or 0) / (total_duration / 1000), 2)
if total_duration and total_duration > 0
else 0,
'zero_token_success_calls': zero_token_success_calls,
}
# ---- Per-model breakdown ----
by_model_query = _apply(
sqlalchemy.select(
LLMCall.model_name,
sqlalchemy.func.count(LLMCall.id),
sqlalchemy.func.coalesce(sqlalchemy.func.sum(LLMCall.input_tokens), 0),
sqlalchemy.func.coalesce(sqlalchemy.func.sum(LLMCall.output_tokens), 0),
sqlalchemy.func.coalesce(sqlalchemy.func.sum(LLMCall.total_tokens), 0),
sqlalchemy.func.coalesce(sqlalchemy.func.sum(LLMCall.duration), 0),
sqlalchemy.func.coalesce(sqlalchemy.func.sum(LLMCall.cost), 0.0),
sqlalchemy.func.sum(sqlalchemy.case((LLMCall.status == 'error', 1), else_=0)),
).group_by(LLMCall.model_name)
)
by_model_result = await self.ap.persistence_mgr.execute_async(by_model_query)
by_model = []
for mrow in by_model_result.all():
(
model_name,
m_calls,
m_in,
m_out,
m_total,
m_duration,
m_cost,
m_errors,
) = mrow
m_calls = m_calls or 0
by_model.append(
{
'model_name': model_name,
'calls': m_calls,
'error_calls': m_errors or 0,
'input_tokens': int(m_in or 0),
'output_tokens': int(m_out or 0),
'total_tokens': int(m_total or 0),
'cost': round(float(m_cost or 0.0), 6),
'avg_tokens_per_call': int((m_total or 0) / m_calls) if m_calls > 0 else 0,
'avg_duration_ms': int((m_duration or 0) / m_calls) if m_calls > 0 else 0,
}
)
by_model.sort(key=lambda x: x['total_tokens'], reverse=True)
# ---- Time-bucketed series ----
# Use a DB-agnostic bucketing approach: fetch (timestamp, tokens) rows and
# aggregate in Python. The window is bounded by the time filter, so this is
# cheap for typical dashboard ranges (hours/days).
series_query = _apply(
sqlalchemy.select(
LLMCall.timestamp,
LLMCall.input_tokens,
LLMCall.output_tokens,
LLMCall.total_tokens,
).order_by(LLMCall.timestamp.asc())
)
series_result = await self.ap.persistence_mgr.execute_async(series_query)
bucket_fmt = '%Y-%m-%d %H:00' if bucket == 'hour' else '%Y-%m-%d'
buckets: dict[str, dict] = {}
for srow in series_result.all():
ts, s_in, s_out, s_total = srow
if ts is None:
continue
key = ts.strftime(bucket_fmt)
b = buckets.setdefault(
key,
{'bucket': key, 'input_tokens': 0, 'output_tokens': 0, 'total_tokens': 0, 'calls': 0},
)
b['input_tokens'] += int(s_in or 0)
b['output_tokens'] += int(s_out or 0)
b['total_tokens'] += int(s_total or 0)
b['calls'] += 1
timeseries = [buckets[k] for k in sorted(buckets.keys())]
return {
'summary': summary,
'by_model': by_model,
'timeseries': timeseries,
'bucket': bucket,
}
async def get_messages(
self,
bot_ids: list[str] | None = None,

View File

@@ -120,13 +120,19 @@ class BoxRuntimeConnector(ManagedRuntimeConnector):
self._relay_port = parsed.port or _DEFAULT_PORT
self._filtered_box_config = _filter_config_for_runtime(_get_box_config(ap))
def _uses_websocket(self) -> bool:
def uses_websocket(self) -> bool:
"""Whether the connector should use WebSocket to reach the Box runtime.
True when:
- Running inside Docker (Box runtime is a separate container)
- The ``--standalone-box`` CLI flag was passed
- An explicit ``runtime.endpoint`` was configured
When this is True the Box runtime lives in a separate process with its
own filesystem view (container, pod sidecar, or remote host), so paths
it reports (e.g. skill ``package_root``) are NOT resolvable on the
LangBot side. When False, Box runs as a stdio child process that shares
LangBot's filesystem.
"""
return bool(
self.configured_runtime_endpoint
@@ -134,6 +140,10 @@ class BoxRuntimeConnector(ManagedRuntimeConnector):
or platform.use_websocket_to_connect_box_runtime()
)
# Backwards-compatible private alias.
def _uses_websocket(self) -> bool:
return self.uses_websocket()
async def initialize(self) -> None:
if self._uses_websocket():
if platform.get_platform() == 'win32' and not self.configured_runtime_endpoint:

View File

@@ -67,6 +67,10 @@ class BoxService:
self._available = False
self._connector_error: str = ''
self._reconnecting = False
# Optional explicit override for shares_filesystem_with_box. None means
# "derive from the connector transport". Set by tests / embedders that
# know the real LangBot<->Box filesystem topology.
self._shares_filesystem_with_box_override: bool | None = None
@property
def enabled(self) -> bool:
@@ -148,6 +152,32 @@ class BoxService:
def available(self) -> bool:
return self._available
@property
def shares_filesystem_with_box(self) -> bool:
"""Whether LangBot and the Box runtime share a filesystem view.
This is True only when Box runs as a local stdio child process of
LangBot (same container/host). In that case paths the Box runtime
reports — notably skill ``package_root`` — resolve identically on the
LangBot side, so LangBot may validate them against its own filesystem.
It is False for every separated deployment (Docker Compose, k8s
sidecar, ``--standalone-box``, or an explicit ``runtime.endpoint``),
where the Box runtime owns its own filesystem and LangBot must trust
the paths it reports rather than checking them locally.
When Box is wired up with an injected client (tests, custom embeds)
there is no connector to introspect; we conservatively report False so
LangBot never wrongly drops Box-reported skills. An explicit override
can be set via ``_shares_filesystem_with_box`` (used by tests and any
embedder that knows the real topology).
"""
if self._shares_filesystem_with_box_override is not None:
return self._shares_filesystem_with_box_override
if self._runtime_connector is None:
return False
return not self._runtime_connector.uses_websocket()
async def execute_spec_payload(
self,
spec_payload: dict,
@@ -220,14 +250,24 @@ class BoxService:
all skill packages mounted, regardless of which skill is currently
activated.
Skills whose ``package_root`` is missing or no longer a directory on
the LangBot-visible filesystem are skipped with a warning instead of
being passed through to the backend. Without this guard the three
backends behave inconsistently on a stale mount: nsjail refuses to
start the sandbox (failing every exec in the session), Docker
silently auto-creates a root-owned empty directory on the host, and
E2B silently skips the upload — none of which surfaces an
actionable error to the agent or operator.
Path validation is filesystem-topology dependent. When LangBot and the
Box runtime share a filesystem (local stdio mode), a skill whose
``package_root`` is missing or no longer a directory is skipped with a
warning instead of being passed through to the backend. Without that
guard the three backends behave inconsistently on a stale mount: nsjail
refuses to start the sandbox (failing every exec in the session),
Docker silently auto-creates a root-owned empty directory on the host,
and E2B silently skips the upload — none of which surfaces an
actionable error.
When Box runs as a separate process (Docker Compose, k8s sidecar,
``--standalone-box``, or a remote ``runtime.endpoint``), the
``package_root`` reported by ``list_skills`` is the Box runtime's own
filesystem path and is NOT resolvable on the LangBot side. Validating
it locally would wrongly drop every skill, so LangBot trusts the path
and lets the Box runtime resolve it. The Box runtime only ever reports
skills it discovered on its own filesystem, so the path is valid there
by construction.
"""
skill_mgr = getattr(self.ap, 'skill_mgr', None)
if skill_mgr is None:
@@ -235,13 +275,15 @@ class BoxService:
from ..provider.tools.loaders import skill as skill_loader
validate_locally = self.shares_filesystem_with_box
visible_skills = skill_loader.get_visible_skills(self.ap, query)
mounts: list[dict] = []
for skill_name, skill_data in visible_skills.items():
package_root = str(skill_data.get('package_root', '') or '').strip()
if not package_root:
continue
if not os.path.isdir(package_root):
if validate_locally and not os.path.isdir(package_root):
self.ap.logger.warning(
f'Skill "{skill_name}" package_root missing on filesystem '
f'({package_root}); skipping mount to prevent sandbox failures. '

View File

@@ -42,7 +42,6 @@ required_deps = {
'telegramify_markdown': 'telegramify-markdown',
'slack_sdk': 'slack_sdk',
'asyncpg': 'asyncpg',
'litellm': 'litellm',
}

View File

@@ -0,0 +1,27 @@
from __future__ import annotations
from .. import migration
@migration.migration_class('weknora-api-config', 42)
class WeKnoraAPICfgMigration(migration.Migration):
"""WeKnora API 配置迁移"""
async def need_migrate(self) -> bool:
"""判断当前环境是否需要运行此迁移"""
return 'weknora-api' not in self.ap.provider_cfg.data
async def run(self):
"""执行迁移"""
self.ap.provider_cfg.data['weknora-api'] = {
'base-url': 'http://localhost:8080/api/v1',
'app-type': 'agent',
'api-key': '',
'agent-id': 'builtin-smart-reasoning',
'knowledge-base-ids': [],
'web-search-enabled': False,
'timeout': 120,
'base-prompt': '请回答用户的问题。',
}
await self.ap.provider_cfg.dump_config()

View File

@@ -0,0 +1,30 @@
from __future__ import annotations
from .. import migration
@migration.migration_class('deerflow-api-config', 43)
class DeerFlowAPICfgMigration(migration.Migration):
"""DeerFlow API 配置迁移"""
async def need_migrate(self) -> bool:
"""判断当前环境是否需要运行此迁移"""
return 'deerflow-api' not in self.ap.provider_cfg.data
async def run(self):
"""执行迁移"""
self.ap.provider_cfg.data['deerflow-api'] = {
'api-base': 'http://127.0.0.1:2026',
'api-key': '',
'auth-header': '',
'assistant-id': 'lead_agent',
'model-name': '',
'thinking-enabled': False,
'plan-mode': False,
'subagent-enabled': False,
'max-concurrent-subagents': 3,
'timeout': 300,
'recursion-limit': 1000,
}
await self.ap.provider_cfg.dump_config()

View File

@@ -11,6 +11,10 @@ class MCPServer(Base):
enable = sqlalchemy.Column(sqlalchemy.Boolean, nullable=False, default=False)
mode = sqlalchemy.Column(sqlalchemy.String(255), nullable=False) # stdio, sse, http
extra_args = sqlalchemy.Column(sqlalchemy.JSON, nullable=False, default={})
# Markdown documentation captured from LangBot Space at install time so the
# detail page can show docs even when the server is offline / has no tools.
# Empty string for manually-created servers that have no marketplace README.
readme = sqlalchemy.Column(sqlalchemy.Text, nullable=False, server_default='', default='')
created_at = sqlalchemy.Column(sqlalchemy.DateTime, nullable=False, server_default=sqlalchemy.func.now())
updated_at = sqlalchemy.Column(
sqlalchemy.DateTime,

View File

@@ -31,7 +31,6 @@ class LLMModel(Base):
name = sqlalchemy.Column(sqlalchemy.String(255), nullable=False)
provider_uuid = sqlalchemy.Column(sqlalchemy.String(255), nullable=False)
abilities = sqlalchemy.Column(sqlalchemy.JSON, nullable=False, default=[])
context_length = sqlalchemy.Column(sqlalchemy.Integer, nullable=True)
extra_args = sqlalchemy.Column(sqlalchemy.JSON, nullable=False, default={})
prefered_ranking = sqlalchemy.Column(sqlalchemy.Integer, nullable=False, default=0)
created_at = sqlalchemy.Column(sqlalchemy.DateTime, nullable=False, server_default=sqlalchemy.func.now())

View File

@@ -1,30 +0,0 @@
"""add llm model context length
Revision ID: 0004_add_llm_model_context_length
Revises: 0003_add_rerank_models
Create Date: 2026-06-07
"""
import sqlalchemy as sa
from alembic import op
revision = '0004_add_llm_model_context_length'
down_revision = '0003_add_rerank_models'
branch_labels = None
depends_on = None
def upgrade() -> None:
conn = op.get_bind()
inspector = sa.inspect(conn)
columns = {column['name'] for column in inspector.get_columns('llm_models')}
if 'context_length' not in columns:
op.add_column('llm_models', sa.Column('context_length', sa.Integer(), nullable=True))
def downgrade() -> None:
conn = op.get_bind()
inspector = sa.inspect(conn)
columns = {column['name'] for column in inspector.get_columns('llm_models')}
if 'context_length' in columns:
op.drop_column('llm_models', 'context_length')

View File

@@ -0,0 +1,34 @@
"""add readme column to mcp_servers
Revision ID: 0004_add_mcp_readme
Revises: 0003_add_rerank_models
Create Date: 2026-06-06
"""
import sqlalchemy as sa
from alembic import op
revision = '0004_add_mcp_readme'
down_revision = '0003_add_rerank_models'
branch_labels = None
depends_on = None
def upgrade() -> None:
# Add ``readme`` to mcp_servers if the table exists and the column is missing
# (the table may have been created by create_all() with the column already
# present on fresh installs, so guard against duplicate-add).
conn = op.get_bind()
inspector = sa.inspect(conn)
if 'mcp_servers' not in inspector.get_table_names():
return
columns = {col['name'] for col in inspector.get_columns('mcp_servers')}
if 'readme' not in columns:
op.add_column(
'mcp_servers',
sa.Column('readme', sa.Text(), nullable=False, server_default=''),
)
def downgrade() -> None:
op.drop_column('mcp_servers', 'readme')

View File

@@ -1,42 +0,0 @@
import sqlalchemy
from .. import migration
@migration.migration_class(26)
class DBMigrateLLMModelContextLength(migration.DBMigration):
"""Add context_length column to LLM models"""
async def upgrade(self):
columns = await self._get_columns('llm_models')
if 'context_length' not in columns:
await self.ap.persistence_mgr.execute_async(
sqlalchemy.text('ALTER TABLE llm_models ADD COLUMN context_length INTEGER')
)
async def downgrade(self):
columns = await self._get_columns('llm_models')
if 'context_length' not in columns:
return
if self.ap.persistence_mgr.db.name == 'postgresql':
await self.ap.persistence_mgr.execute_async(
sqlalchemy.text('ALTER TABLE llm_models DROP COLUMN IF EXISTS context_length')
)
else:
await self.ap.persistence_mgr.execute_async(
sqlalchemy.text('ALTER TABLE llm_models DROP COLUMN context_length')
)
async def _get_columns(self, table_name: str) -> set[str]:
if self.ap.persistence_mgr.db.name == 'postgresql':
result = await self.ap.persistence_mgr.execute_async(
sqlalchemy.text("""
SELECT column_name FROM information_schema.columns
WHERE table_name = :table_name
"""),
{'table_name': table_name},
)
return {row[0] for row in result.fetchall()}
result = await self.ap.persistence_mgr.execute_async(sqlalchemy.text(f'PRAGMA table_info({table_name})'))
return {row[1] for row in result.fetchall()}

View File

@@ -109,7 +109,7 @@ class PreProcessor(stage.PipelineStage):
if llm_model:
query.use_llm_model_uuid = llm_model.model_entity.uuid
if 'func_call' in (llm_model.model_entity.abilities or []):
if llm_model.model_entity.abilities.__contains__('func_call'):
# Get bound plugins and MCP servers for filtering tools
bound_plugins = query.variables.get('_pipeline_bound_plugins', None)
bound_mcp_servers = query.variables.get('_pipeline_bound_mcp_servers', None)
@@ -159,7 +159,11 @@ class PreProcessor(stage.PipelineStage):
# Check if this model supports vision, if not, remove all images
# TODO this checking should be performed in runner, and in this stage, the image should be reserved
if selected_runner == 'local-agent' and llm_model and 'vision' not in (llm_model.model_entity.abilities or []):
if (
selected_runner == 'local-agent'
and llm_model
and not llm_model.model_entity.abilities.__contains__('vision')
):
for msg in query.messages:
if isinstance(msg.content, list):
for me in msg.content:
@@ -177,7 +181,7 @@ class PreProcessor(stage.PipelineStage):
plain_text += me.text
elif isinstance(me, platform_message.Image):
if selected_runner != 'local-agent' or (
llm_model and 'vision' in (llm_model.model_entity.abilities or [])
llm_model and llm_model.model_entity.abilities.__contains__('vision')
):
if me.base64 is not None:
content_list.append(provider_message.ContentElement.from_image_base64(me.base64))
@@ -198,7 +202,7 @@ class PreProcessor(stage.PipelineStage):
content_list.append(provider_message.ContentElement.from_text(msg.text))
elif isinstance(msg, platform_message.Image):
if selected_runner != 'local-agent' or (
llm_model and 'vision' in (llm_model.model_entity.abilities or [])
llm_model and llm_model.model_entity.abilities.__contains__('vision')
):
if msg.base64 is not None:
content_list.append(provider_message.ContentElement.from_image_base64(msg.base64))

View File

@@ -248,6 +248,9 @@ class PluginRuntimeConnector(ManagedRuntimeConnector):
mode = mcp_data.get('mode') or 'stdio'
extra_args = mcp_data.get('extra_args') or {}
# Marketplace records carry the rendered README markdown; persist it so
# the detail page Docs tab works offline and without a marketplace round-trip.
readme = mcp_data.get('readme') or ''
# Use __ instead of / to avoid URL routing issues with slashes
name = f'{mcp_data.get("author", "")}__{mcp_data.get("name", "")}'
@@ -267,6 +270,7 @@ class PluginRuntimeConnector(ManagedRuntimeConnector):
'enable': True,
'mode': mode,
'extra_args': extra_args,
'readme': readme,
}
await self.ap.persistence_mgr.execute_async(sqlalchemy.insert(persistence_mcp.MCPServer).values(server_data))

View File

@@ -37,41 +37,11 @@ class ModelManager:
self.requester_components = []
self.requester_dict = {}
@staticmethod
def _get_litellm_provider_from_manifest(component: engine.Component | None) -> str | None:
if component is None:
return None
spec = getattr(component, 'spec', None) or {}
litellm_provider = None
if isinstance(spec, dict):
litellm_provider = spec.get('litellm_provider')
else:
getter = getattr(spec, 'get', None)
if callable(getter):
try:
litellm_provider = getter('litellm_provider')
except Exception:
litellm_provider = None
if isinstance(litellm_provider, str) and litellm_provider:
return litellm_provider
return None
async def initialize(self):
self.requester_components = self.ap.discover.get_components_by_kind('LLMAPIRequester')
requester_dict: dict[str, type[requester.ProviderAPIRequester]] = {}
for component in self.requester_components:
# Skip components that use litellm_provider (they will use litellmchat.py instead)
litellm_provider = self._get_litellm_provider_from_manifest(component)
if litellm_provider:
self.ap.logger.debug(
f'Skipping Python class loading for {component.metadata.name} '
f'(uses litellm_provider={litellm_provider})'
)
continue
requester_dict[component.metadata.name] = component.get_python_component_class()
self.requester_dict = requester_dict
@@ -266,7 +236,6 @@ class ModelManager:
name=model_info.get('name', ''),
provider_uuid='',
abilities=model_info.get('abilities', []),
context_length=model_info.get('context_length'),
extra_args=model_info.get('extra_args', {}),
),
provider=runtime_provider,
@@ -325,37 +294,13 @@ class ModelManager:
else:
provider_entity = provider_info
# Get requester manifest to check for litellm_provider
requester_manifest = self.get_available_requester_manifest_by_name(provider_entity.requester)
litellm_provider = self._get_litellm_provider_from_manifest(requester_manifest)
# Build config from base_url
config = {'base_url': provider_entity.base_url}
# Check if requester manifest specifies litellm_provider
if litellm_provider:
from .requesters import litellmchat
# Use unified LiteLLMRequester with provider prefix
# Map litellm_provider (YAML spec) to custom_llm_provider (config)
config['custom_llm_provider'] = litellm_provider
requester_inst = litellmchat.LiteLLMRequester(
ap=self.ap,
config=config,
)
self.ap.logger.debug(
f'Using LiteLLMRequester for {provider_entity.requester} '
f'with custom_llm_provider={config["custom_llm_provider"]}'
)
else:
# Use original requester class (for backward compatibility)
if provider_entity.requester not in self.requester_dict:
raise provider_errors.RequesterNotFoundError(provider_entity.requester)
requester_inst = self.requester_dict[provider_entity.requester](
ap=self.ap,
config=config,
)
if provider_entity.requester not in self.requester_dict:
raise provider_errors.RequesterNotFoundError(provider_entity.requester)
requester_inst = self.requester_dict[provider_entity.requester](
ap=self.ap,
config={'base_url': provider_entity.base_url},
)
await requester_inst.initialize()
token_mgr = token.TokenManager(name=provider_entity.uuid, tokens=provider_entity.api_keys or [])
@@ -461,7 +406,6 @@ class ModelManager:
name=model_info.get('name', ''),
provider_uuid=model_info.get('provider_uuid', ''),
abilities=model_info.get('abilities', []),
context_length=model_info.get('context_length'),
extra_args=model_info.get('extra_args', {}),
)

View File

@@ -67,8 +67,8 @@ class RuntimeProvider:
if isinstance(result, tuple):
msg, usage_info = result
if usage_info:
input_tokens = usage_info.get('prompt_tokens', 0)
output_tokens = usage_info.get('completion_tokens', 0)
input_tokens = usage_info.get('input_tokens', 0)
output_tokens = usage_info.get('output_tokens', 0)
return msg
else:
return result
@@ -128,6 +128,7 @@ class RuntimeProvider:
start_time = time.time()
status = 'success'
error_message = None
# Note: Stream doesn't easily provide token counts, set to 0
input_tokens = 0
output_tokens = 0
@@ -142,15 +143,6 @@ class RuntimeProvider:
remove_think=remove_think,
):
yield chunk
# Extract usage from stream if available (stored by LiteLLM requester)
if query:
if query.variables is None:
query.variables = {}
if '_stream_usage' in query.variables:
usage_info = query.variables['_stream_usage']
input_tokens = usage_info.get('prompt_tokens', 0)
output_tokens = usage_info.get('completion_tokens', 0)
del query.variables['_stream_usage']
except Exception as e:
status = 'error'
error_message = str(e)

View File

@@ -0,0 +1,17 @@
from __future__ import annotations
import typing
import openai
from . import chatcmpl
class AI302ChatCompletions(chatcmpl.OpenAIChatCompletions):
"""302.AI ChatCompletion API 请求器"""
client: openai.AsyncClient
default_config: dict[str, typing.Any] = {
'base_url': 'https://api.302.ai/v1',
'timeout': 120,
}

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: 302.AI
icon: 302ai.png
spec:
litellm_provider: openai
config:
- name: base_url
label:

View File

@@ -0,0 +1,370 @@
from __future__ import annotations
import typing
import json
import platform
import socket
import anthropic
import httpx
from .. import errors, requester
from ....utils import image
import langbot_plugin.api.entities.builtin.resource.tool as resource_tool
import langbot_plugin.api.entities.builtin.pipeline.query as pipeline_query
import langbot_plugin.api.entities.builtin.provider.message as provider_message
class AnthropicMessages(requester.ProviderAPIRequester):
"""Anthropic Messages API 请求器"""
client: anthropic.AsyncAnthropic
default_config: dict[str, typing.Any] = {
'base_url': 'https://api.anthropic.com',
'timeout': 120,
}
async def initialize(self):
# 兼容 Windows 缺失 TCP_KEEPINTVL 和 TCP_KEEPCNT 的问题
if platform.system() == 'Windows':
if not hasattr(socket, 'TCP_KEEPINTVL'):
socket.TCP_KEEPINTVL = 0
if not hasattr(socket, 'TCP_KEEPCNT'):
socket.TCP_KEEPCNT = 0
httpx_client = anthropic._base_client.AsyncHttpxClientWrapper(
base_url=self.requester_cfg['base_url'],
# cast to a valid type because mypy doesn't understand our type narrowing
timeout=typing.cast(httpx.Timeout, self.requester_cfg['timeout']),
limits=anthropic._constants.DEFAULT_CONNECTION_LIMITS,
follow_redirects=True,
trust_env=True,
)
self.client = anthropic.AsyncAnthropic(
api_key='',
http_client=httpx_client,
base_url=self.requester_cfg['base_url'],
)
async def invoke_llm(
self,
query: pipeline_query.Query,
model: requester.RuntimeLLMModel,
messages: typing.List[provider_message.Message],
funcs: typing.List[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
remove_think: bool = False,
) -> provider_message.Message:
self.client.api_key = model.provider.token_mgr.get_token()
args = extra_args.copy()
args['model'] = model.model_entity.name
# 处理消息
# system
system_role_message = None
for i, m in enumerate(messages):
if m.role == 'system':
system_role_message = m
break
if system_role_message:
messages.pop(i)
if isinstance(system_role_message, provider_message.Message) and isinstance(system_role_message.content, str):
args['system'] = system_role_message.content
req_messages = []
for m in messages:
if m.role == 'tool':
tool_call_id = m.tool_call_id
req_messages.append(
{
'role': 'user',
'content': [
{
'type': 'tool_result',
'tool_use_id': tool_call_id,
'is_error': False,
'content': [{'type': 'text', 'text': m.content}],
}
],
}
)
continue
msg_dict = m.dict(exclude_none=True)
if isinstance(m.content, str) and m.content.strip() != '':
msg_dict['content'] = [{'type': 'text', 'text': m.content}]
elif isinstance(m.content, list):
for i, ce in enumerate(m.content):
if ce.type == 'image_base64':
image_b64, image_format = await image.extract_b64_and_format(ce.image_base64)
alter_image_ele = {
'type': 'image',
'source': {
'type': 'base64',
'media_type': f'image/{image_format}',
'data': image_b64,
},
}
msg_dict['content'][i] = alter_image_ele
if m.tool_calls:
for tool_call in m.tool_calls:
msg_dict['content'].append(
{
'type': 'tool_use',
'id': tool_call.id,
'name': tool_call.function.name,
'input': json.loads(tool_call.function.arguments),
}
)
del msg_dict['tool_calls']
req_messages.append(msg_dict)
args['messages'] = req_messages
if 'thinking' in args:
args['thinking'] = {'type': 'enabled', 'budget_tokens': 10000}
if funcs:
tools = await self.ap.tool_mgr.generate_tools_for_anthropic(funcs)
if tools:
args['tools'] = tools
try:
resp = await self.client.messages.create(**args)
args = {
'content': '',
'role': resp.role,
}
assert type(resp) is anthropic.types.message.Message
for block in resp.content:
if not remove_think and block.type == 'thinking':
args['content'] = '<think>\n' + block.thinking + '\n</think>\n' + args['content']
elif block.type == 'text':
args['content'] += block.text
elif block.type == 'tool_use':
assert type(block) is anthropic.types.tool_use_block.ToolUseBlock
tool_call = provider_message.ToolCall(
id=block.id,
type='function',
function=provider_message.FunctionCall(name=block.name, arguments=json.dumps(block.input)),
)
if 'tool_calls' not in args:
args['tool_calls'] = []
args['tool_calls'].append(tool_call)
return provider_message.Message(**args)
except anthropic.AuthenticationError as e:
raise errors.RequesterError(f'api-key 无效: {e.message}')
except anthropic.BadRequestError as e:
raise errors.RequesterError(str(e.message))
except anthropic.NotFoundError as e:
if 'model: ' in str(e):
raise errors.RequesterError(f'模型无效: {e.message}')
else:
raise errors.RequesterError(f'请求地址无效: {e.message}')
async def invoke_llm_stream(
self,
query: pipeline_query.Query,
model: requester.RuntimeLLMModel,
messages: typing.List[provider_message.Message],
funcs: typing.List[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
remove_think: bool = False,
) -> provider_message.Message:
self.client.api_key = model.provider.token_mgr.get_token()
args = extra_args.copy()
args['model'] = model.model_entity.name
args['stream'] = True
# 处理消息
# system
system_role_message = None
for i, m in enumerate(messages):
if m.role == 'system':
system_role_message = m
break
if system_role_message:
messages.pop(i)
if isinstance(system_role_message, provider_message.Message) and isinstance(system_role_message.content, str):
args['system'] = system_role_message.content
req_messages = []
for m in messages:
if m.role == 'tool':
tool_call_id = m.tool_call_id
req_messages.append(
{
'role': 'user',
'content': [
{
'type': 'tool_result',
'tool_use_id': tool_call_id,
'is_error': False, # 暂时直接写false
'content': [
{'type': 'text', 'text': m.content}
], # 这里要是list包裹应该是多个返回的情况type类型好像也可以填其他的暂时只写text
}
],
}
)
continue
msg_dict = m.dict(exclude_none=True)
if isinstance(m.content, str) and m.content.strip() != '':
msg_dict['content'] = [{'type': 'text', 'text': m.content}]
elif isinstance(m.content, list):
for i, ce in enumerate(m.content):
if ce.type == 'image_base64':
image_b64, image_format = await image.extract_b64_and_format(ce.image_base64)
alter_image_ele = {
'type': 'image',
'source': {
'type': 'base64',
'media_type': f'image/{image_format}',
'data': image_b64,
},
}
msg_dict['content'][i] = alter_image_ele
if isinstance(msg_dict['content'], str) and msg_dict['content'] == '':
msg_dict['content'] = [] # 这里不知道为什么会莫名有个空导致content为字符
if m.tool_calls:
for tool_call in m.tool_calls:
msg_dict['content'].append(
{
'type': 'tool_use',
'id': tool_call.id,
'name': tool_call.function.name,
'input': json.loads(tool_call.function.arguments),
}
)
del msg_dict['tool_calls']
req_messages.append(msg_dict)
if 'thinking' in args:
args['thinking'] = {'type': 'enabled', 'budget_tokens': 10000}
args['messages'] = req_messages
if funcs:
tools = await self.ap.tool_mgr.generate_tools_for_anthropic(funcs)
if tools:
args['tools'] = tools
try:
role = 'assistant' # 默认角色
# chunk_idx = 0
think_started = False
think_ended = False
finish_reason = False
tool_name = ''
tool_id = ''
async for chunk in await self.client.messages.create(**args):
content = ''
tool_call = {'id': None, 'function': {'name': None, 'arguments': None}, 'type': 'function'}
if isinstance(
chunk, anthropic.types.raw_content_block_start_event.RawContentBlockStartEvent
): # 记录开始
if chunk.content_block.type == 'tool_use':
if chunk.content_block.name is not None:
tool_name = chunk.content_block.name
if chunk.content_block.id is not None:
tool_id = chunk.content_block.id
tool_call['function']['name'] = tool_name
tool_call['function']['arguments'] = ''
tool_call['id'] = tool_id
if not remove_think:
if chunk.content_block.type == 'thinking' and not remove_think:
think_started = True
elif chunk.content_block.type == 'text' and chunk.index != 0 and not remove_think:
think_ended = True
continue
elif isinstance(chunk, anthropic.types.raw_content_block_delta_event.RawContentBlockDeltaEvent):
if chunk.delta.type == 'thinking_delta':
if think_started:
think_started = False
content = '<think>\n' + chunk.delta.thinking
elif remove_think:
continue
else:
content = chunk.delta.thinking
elif chunk.delta.type == 'text_delta':
if think_ended:
think_ended = False
content = '\n</think>\n' + chunk.delta.text
else:
content = chunk.delta.text
elif chunk.delta.type == 'input_json_delta':
tool_call['function']['arguments'] = chunk.delta.partial_json
tool_call['function']['name'] = tool_name
tool_call['id'] = tool_id
elif isinstance(chunk, anthropic.types.raw_content_block_stop_event.RawContentBlockStopEvent):
continue # 记录raw_content_block结束的
elif isinstance(chunk, anthropic.types.raw_message_delta_event.RawMessageDeltaEvent):
if chunk.delta.stop_reason == 'end_turn':
finish_reason = True
elif isinstance(chunk, anthropic.types.raw_message_stop_event.RawMessageStopEvent):
continue # 这个好像是完全结束
else:
# print(chunk)
self.ap.logger.debug(f'anthropic chunk: {chunk}')
continue
args = {
'content': content,
'role': role,
'is_final': finish_reason,
'tool_calls': None if tool_call['id'] is None else [tool_call],
}
# if chunk_idx == 0:
# chunk_idx += 1
# continue
# assert type(chunk) is anthropic.types.message.Chunk
yield provider_message.MessageChunk(**args)
# return llm_entities.Message(**args)
except anthropic.AuthenticationError as e:
raise errors.RequesterError(f'api-key 无效: {e.message}')
except anthropic.BadRequestError as e:
raise errors.RequesterError(str(e.message))
except anthropic.NotFoundError as e:
if 'model: ' in str(e):
raise errors.RequesterError(f'模型无效: {e.message}')
else:
raise errors.RequesterError(f'请求地址无效: {e.message}')

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: Anthropic
icon: anthropic.svg
spec:
litellm_provider: anthropic
config:
- name: base_url
label:
@@ -25,8 +24,6 @@ spec:
default: 120
support_type:
- llm
- text-embedding
- rerank
provider_category: manufacturer
execution:
python:

View File

@@ -1,5 +0,0 @@
<svg width="60" height="50" viewBox="0 0 60 50" xmlns="http://www.w3.org/2000/svg">
<rect width="60" height="50" rx="8" fill="#2932E1"/>
<text x="30" y="28" font-family="Arial, sans-serif" font-size="10" font-weight="bold" fill="white" text-anchor="middle">Baidu</text>
<text x="30" y="40" font-family="Arial, sans-serif" font-size="8" fill="white" text-anchor="middle">ERNIE</text>
</svg>

Before

Width:  |  Height:  |  Size: 396 B

View File

@@ -1,30 +0,0 @@
apiVersion: v1
kind: LLMAPIRequester
metadata:
name: baidu-chat-completions
label:
en_US: Baidu ERNIE
zh_Hans: 百度文心一言
icon: baidu.svg
spec:
litellm_provider: openai
config:
- name: base_url
label:
en_US: Base URL
zh_Hans: 基础 URL
type: string
required: true
default: https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop
- name: timeout
label:
en_US: Timeout
zh_Hans: 超时时间
type: integer
required: true
default: 120
support_type:
- llm
- text-embedding
- rerank
provider_category: manufacturer

View File

@@ -0,0 +1,242 @@
from __future__ import annotations
import typing
import dashscope
import openai
from . import modelscopechatcmpl
from .. import requester
import langbot_plugin.api.entities.builtin.resource.tool as resource_tool
import langbot_plugin.api.entities.builtin.pipeline.query as pipeline_query
import langbot_plugin.api.entities.builtin.provider.message as provider_message
class BailianChatCompletions(modelscopechatcmpl.ModelScopeChatCompletions):
"""阿里云百炼大模型平台 ChatCompletion API 请求器"""
client: openai.AsyncClient
default_config: dict[str, typing.Any] = {
'base_url': 'https://dashscope.aliyuncs.com/compatible-mode/v1',
'timeout': 120,
}
async def _closure_stream(
self,
query: pipeline_query.Query,
req_messages: list[dict],
use_model: requester.RuntimeLLMModel,
use_funcs: list[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
remove_think: bool = False,
) -> provider_message.Message | typing.AsyncGenerator[provider_message.MessageChunk, None]:
self.client.api_key = use_model.provider.token_mgr.get_token()
args = {}
args['model'] = use_model.model_entity.name
if use_funcs:
tools = await self.ap.tool_mgr.generate_tools_for_openai(use_funcs)
if tools:
args['tools'] = tools
# 设置此次请求中的messages
messages = req_messages.copy()
is_use_dashscope_call = False # 是否使用阿里原生库调用
is_enable_multi_model = True # 是否支持多轮对话
use_time_num = 0 # 模型已调用次数,防止存在多文件时重复调用
use_time_ids = [] # 已调用的ID列表
message_id = 0 # 记录消息序号
for msg in messages:
# print(msg)
if 'content' in msg and isinstance(msg['content'], list):
for me in msg['content']:
if me['type'] == 'image_base64':
me['image_url'] = {'url': me['image_base64']}
me['type'] = 'image_url'
del me['image_base64']
elif me['type'] == 'file_url' and '.' in me.get('file_name', ''):
# 1. 视频文件推理
# https://bailian.console.aliyun.com/?tab=doc#/doc/?type=model&url=2845871
file_type = me.get('file_name').lower().split('.')[-1]
if file_type in ['mp4', 'avi', 'mkv', 'mov', 'flv', 'wmv']:
me['type'] = 'video_url'
me['video_url'] = {'url': me['file_url']}
del me['file_url']
del me['file_name']
use_time_num += 1
use_time_ids.append(message_id)
is_enable_multi_model = False
# 2. 语音文件识别, 无法通过openai的audio字段传递暂时不支持
# https://bailian.console.aliyun.com/?tab=doc#/doc/?type=model&url=2979031
elif file_type in [
'aac',
'amr',
'aiff',
'flac',
'm4a',
'mp3',
'mpeg',
'ogg',
'opus',
'wav',
'webm',
'wma',
]:
me['audio'] = me['file_url']
me['type'] = 'audio'
del me['file_url']
del me['type']
del me['file_name']
is_use_dashscope_call = True
use_time_num += 1
use_time_ids.append(message_id)
is_enable_multi_model = False
message_id += 1
# 使用列表推导式,保留不在 use_time_ids[:-1] 中的元素,仅保留最后一个多媒体消息
if not is_enable_multi_model and use_time_num > 1:
messages = [msg for idx, msg in enumerate(messages) if idx not in use_time_ids[:-1]]
if not is_enable_multi_model:
messages = [msg for msg in messages if 'resp_message_id' not in msg]
args['messages'] = messages
args['stream'] = True
# 流式处理状态
# tool_calls_map: dict[str, provider_message.ToolCall] = {}
chunk_idx = 0
thinking_started = False
thinking_ended = False
role = 'assistant' # 默认角色
if is_use_dashscope_call:
response = dashscope.MultiModalConversation.call(
# 若没有配置环境变量请用百炼API Key将下行替换为api_key = "sk-xxx"
api_key=use_model.provider.token_mgr.get_token(),
model=use_model.model_entity.name,
messages=messages,
result_format='message',
asr_options={
# "language": "zh", # 可选,若已知音频的语种,可通过该参数指定待识别语种,以提升识别准确率
'enable_lid': True,
'enable_itn': False,
},
stream=True,
)
content_length_list = []
previous_length = 0 # 记录上一次的内容长度
for res in response:
chunk = res['output']
# 解析 chunk 数据
if hasattr(chunk, 'choices') and chunk.choices:
choice = chunk.choices[0]
delta_content = choice['message'].content[0]['text']
finish_reason = choice['finish_reason']
content_length_list.append(len(delta_content))
else:
delta_content = ''
finish_reason = None
# 跳过空的第一个 chunk只有 role 没有内容)
if chunk_idx == 0 and not delta_content:
chunk_idx += 1
continue
# 检查 content_length_list 是否有足够的数据
if len(content_length_list) >= 2:
now_content = delta_content[previous_length : content_length_list[-1]]
previous_length = content_length_list[-1] # 更新上一次的长度
else:
now_content = delta_content # 第一次循环时直接使用 delta_content
previous_length = len(delta_content) # 更新上一次的长度
# 构建 MessageChunk - 只包含增量内容
chunk_data = {
'role': role,
'content': now_content if now_content else None,
'is_final': bool(finish_reason) and finish_reason != 'null',
}
# 移除 None 值
chunk_data = {k: v for k, v in chunk_data.items() if v is not None}
yield provider_message.MessageChunk(**chunk_data)
chunk_idx += 1
else:
async for chunk in self._req_stream(args, extra_body=extra_args):
# 解析 chunk 数据
if hasattr(chunk, 'choices') and chunk.choices:
choice = chunk.choices[0]
delta = choice.delta.model_dump() if hasattr(choice, 'delta') else {}
finish_reason = getattr(choice, 'finish_reason', None)
else:
delta = {}
finish_reason = None
# 从第一个 chunk 获取 role后续使用这个 role
if 'role' in delta and delta['role']:
role = delta['role']
# 获取增量内容
delta_content = delta.get('content', '')
reasoning_content = delta.get('reasoning_content', '')
# 处理 reasoning_content
if reasoning_content:
# accumulated_reasoning += reasoning_content
# 如果设置了 remove_think跳过 reasoning_content
if remove_think:
chunk_idx += 1
continue
# 第一次出现 reasoning_content添加 <think> 开始标签
if not thinking_started:
thinking_started = True
delta_content = '<think>\n' + reasoning_content
else:
# 继续输出 reasoning_content
delta_content = reasoning_content
elif thinking_started and not thinking_ended and delta_content:
# reasoning_content 结束normal content 开始,添加 </think> 结束标签
thinking_ended = True
delta_content = '\n</think>\n' + delta_content
# 处理工具调用增量
if delta.get('tool_calls'):
for tool_call in delta['tool_calls']:
if tool_call['id'] != '':
tool_id = tool_call['id']
if tool_call['function']['name'] is not None:
tool_name = tool_call['function']['name']
if tool_call['type'] is None:
tool_call['type'] = 'function'
tool_call['id'] = tool_id
tool_call['function']['name'] = tool_name
tool_call['function']['arguments'] = (
'' if tool_call['function']['arguments'] is None else tool_call['function']['arguments']
)
# 跳过空的第一个 chunk只有 role 没有内容)
if chunk_idx == 0 and not delta_content and not reasoning_content and not delta.get('tool_calls'):
chunk_idx += 1
continue
# 构建 MessageChunk - 只包含增量内容
chunk_data = {
'role': role,
'content': delta_content if delta_content else None,
'tool_calls': delta.get('tool_calls'),
'is_final': bool(finish_reason),
}
# 移除 None 值
chunk_data = {k: v for k, v in chunk_data.items() if v is not None}
yield provider_message.MessageChunk(**chunk_data)
chunk_idx += 1
# return

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: 阿里云百炼
icon: bailian.png
spec:
litellm_provider: openai
config:
- name: base_url
label:
@@ -25,7 +24,6 @@ spec:
default: 120
support_type:
- llm
- text-embedding
- rerank
provider_category: maas
execution:

View File

@@ -0,0 +1,702 @@
from __future__ import annotations
import asyncio
import typing
import openai
import openai.types.chat.chat_completion as chat_completion_module
import httpx
from .. import errors, requester
import langbot_plugin.api.entities.builtin.resource.tool as resource_tool
import langbot_plugin.api.entities.builtin.pipeline.query as pipeline_query
import langbot_plugin.api.entities.builtin.provider.message as provider_message
class OpenAIChatCompletions(requester.ProviderAPIRequester):
"""OpenAI ChatCompletion API 请求器"""
client: openai.AsyncClient
default_config: dict[str, typing.Any] = {
'base_url': 'https://api.openai.com/v1',
'timeout': 120,
}
async def initialize(self):
self.client = openai.AsyncClient(
api_key=self.init_api_key,
base_url=self.requester_cfg['base_url'].replace(' ', ''),
timeout=self.requester_cfg['timeout'],
http_client=httpx.AsyncClient(trust_env=True, timeout=self.requester_cfg['timeout']),
)
def _mask_api_key(self, api_key: str | None) -> str:
if not api_key:
return ''
if len(api_key) <= 8:
return '****'
return f'{api_key[:4]}...{api_key[-4:]}'
def _infer_model_type(self, model_id: str) -> str:
normalized_model_id = (model_id or '').lower()
embedding_keywords = (
'embedding',
'embed',
'bge-',
'e5-',
'm3e',
'gte-',
'multilingual-e5',
'text-embedding',
)
return 'embedding' if any(keyword in normalized_model_id for keyword in embedding_keywords) else 'llm'
def _infer_model_abilities(self, item: dict[str, typing.Any], model_id: str) -> list[str]:
normalized_model_id = (model_id or '').lower()
abilities: set[str] = set()
def _flatten(value: typing.Any) -> list[str]:
if value is None:
return []
if isinstance(value, str):
return [value.lower()]
if isinstance(value, dict):
flattened: list[str] = []
for nested_value in value.values():
flattened.extend(_flatten(nested_value))
return flattened
if isinstance(value, (list, tuple, set)):
flattened: list[str] = []
for nested_value in value:
flattened.extend(_flatten(nested_value))
return flattened
return [str(value).lower()]
capability_tokens = _flatten(item.get('capabilities'))
capability_tokens.extend(_flatten(item.get('modalities')))
capability_tokens.extend(_flatten(item.get('input_modalities')))
capability_tokens.extend(_flatten(item.get('output_modalities')))
capability_tokens.extend(_flatten(item.get('supported_generation_methods')))
capability_tokens.extend(_flatten(item.get('supported_parameters')))
capability_tokens.extend(_flatten(item.get('architecture')))
combined_tokens = capability_tokens + [normalized_model_id]
vision_keywords = (
'vision',
'image',
'file',
'video',
'multimodal',
'vl',
'ocr',
'omni',
)
function_call_keywords = (
'function',
'tool',
'tools',
'tool_choice',
'tool_call',
'tool-use',
'tool_use',
)
if any(any(keyword in token for keyword in vision_keywords) for token in combined_tokens):
abilities.add('vision')
if any(any(keyword in token for keyword in function_call_keywords) for token in combined_tokens):
abilities.add('func_call')
return sorted(abilities)
def _normalize_modalities(self, value: typing.Any) -> list[str]:
normalized: list[str] = []
def _collect(item: typing.Any):
if item is None:
return
if isinstance(item, str):
for part in item.replace('->', ',').replace('+', ',').split(','):
token = part.strip().lower()
if token and token not in normalized:
normalized.append(token)
return
if isinstance(item, dict):
for nested in item.values():
_collect(nested)
return
if isinstance(item, (list, tuple, set)):
for nested in item:
_collect(nested)
return
_collect(value)
return normalized
def _extract_scan_metadata(self, item: dict[str, typing.Any], model_id: str) -> dict[str, typing.Any]:
display_name = item.get('name')
if not isinstance(display_name, str) or not display_name.strip() or display_name == model_id:
display_name = ''
description = item.get('description')
if not isinstance(description, str) or not description.strip():
description = ''
context_length = item.get('context_length')
if context_length is None and isinstance(item.get('top_provider'), dict):
context_length = item['top_provider'].get('context_length')
if not isinstance(context_length, int):
try:
context_length = int(context_length) if context_length is not None else None
except (TypeError, ValueError):
context_length = None
input_modalities = self._normalize_modalities(item.get('input_modalities'))
output_modalities = self._normalize_modalities(item.get('output_modalities'))
if isinstance(item.get('architecture'), dict):
if not input_modalities:
input_modalities = self._normalize_modalities(item['architecture'].get('input_modalities'))
if not output_modalities:
output_modalities = self._normalize_modalities(item['architecture'].get('output_modalities'))
owned_by = item.get('owned_by')
if not isinstance(owned_by, str) or not owned_by.strip():
owned_by = ''
return {
'display_name': display_name or None,
'description': description or None,
'context_length': context_length,
'owned_by': owned_by or None,
'input_modalities': input_modalities,
'output_modalities': output_modalities,
}
async def scan_models(self, api_key: str | None = None) -> dict[str, typing.Any]:
headers = {}
if api_key:
headers['Authorization'] = f'Bearer {api_key}'
models_url = f'{self.requester_cfg["base_url"].rstrip("/")}/models'
async with httpx.AsyncClient(trust_env=True, timeout=self.requester_cfg['timeout']) as client:
response = await client.get(models_url, headers=headers)
response.raise_for_status()
payload = response.json()
models = []
for item in payload.get('data', []):
model_id = item.get('id')
if not model_id:
continue
models.append(
{
'id': model_id,
'name': model_id,
'type': self._infer_model_type(model_id),
'abilities': self._infer_model_abilities(item, model_id),
**self._extract_scan_metadata(item, model_id),
}
)
models.sort(key=lambda item: (item['type'] != 'llm', item['name'].lower()))
return {
'models': models,
'debug': {
'request': {
'method': 'GET',
'url': models_url,
'headers': {
'Authorization': f'Bearer {self._mask_api_key(api_key)}' if api_key else '',
},
},
'response': payload,
},
}
async def _req(
self,
args: dict,
extra_body: dict = {},
) -> chat_completion_module.ChatCompletion:
return await self.client.chat.completions.create(**args, extra_body=extra_body)
async def _req_stream(
self,
args: dict,
extra_body: dict = {},
):
async for chunk in await self.client.chat.completions.create(**args, extra_body=extra_body):
yield chunk
async def _make_msg(
self,
chat_completion: chat_completion_module.ChatCompletion,
remove_think: bool = False,
) -> provider_message.Message:
if not isinstance(chat_completion, chat_completion_module.ChatCompletion):
raise TypeError(f'Expected ChatCompletion, got {type(chat_completion).__name__}: {chat_completion[:16]}')
chatcmpl_message = chat_completion.choices[0].message.model_dump()
# 确保 role 字段存在且不为 None
if 'role' not in chatcmpl_message or chatcmpl_message['role'] is None:
chatcmpl_message['role'] = 'assistant'
# 处理思维链
content = chatcmpl_message.get('content', '')
reasoning_content = chatcmpl_message.get('reasoning_content', None)
processed_content, _ = await self._process_thinking_content(
content=content, reasoning_content=reasoning_content, remove_think=remove_think
)
chatcmpl_message['content'] = processed_content
# 移除 reasoning_content 字段,避免传递给 Message
if 'reasoning_content' in chatcmpl_message:
del chatcmpl_message['reasoning_content']
message = provider_message.Message(**chatcmpl_message)
return message
async def _process_thinking_content(
self,
content: str,
reasoning_content: str = None,
remove_think: bool = False,
) -> tuple[str, str]:
"""处理思维链内容
Args:
content: 原始内容
reasoning_content: reasoning_content 字段内容
remove_think: 是否移除思维链
Returns:
(处理后的内容, 提取的思维链内容)
"""
thinking_content = ''
# 1. 从 reasoning_content 提取思维链
if reasoning_content:
thinking_content = reasoning_content
# 2. 从 content 中提取 <think> 标签内容
if content and '<think>' in content and '</think>' in content:
import re
think_pattern = r'<think>(.*?)</think>'
think_matches = re.findall(think_pattern, content, re.DOTALL)
if think_matches:
# 如果已有 reasoning_content则追加
if thinking_content:
thinking_content += '\n' + '\n'.join(think_matches)
else:
thinking_content = '\n'.join(think_matches)
# 移除 content 中的 <think> 标签
content = re.sub(think_pattern, '', content, flags=re.DOTALL).strip()
# 3. 根据 remove_think 参数决定是否保留思维链
if remove_think:
return content, ''
else:
# 如果有思维链内容,将其以 <think> 格式添加到 content 开头
if thinking_content:
content = f'<think>\n{thinking_content}\n</think>\n{content}'.strip()
return content, thinking_content
async def _closure_stream(
self,
query: pipeline_query.Query,
req_messages: list[dict],
use_model: requester.RuntimeLLMModel,
use_funcs: list[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
remove_think: bool = False,
) -> provider_message.MessageChunk:
self.client.api_key = use_model.provider.token_mgr.get_token()
args = {}
args['model'] = use_model.model_entity.name
if use_funcs:
tools = await self.ap.tool_mgr.generate_tools_for_openai(use_funcs)
if tools:
args['tools'] = tools
# 设置此次请求中的messages
messages = req_messages.copy()
# 检查vision
for msg in messages:
if 'content' in msg and isinstance(msg['content'], list):
for me in msg['content']:
if me['type'] == 'image_base64':
me['image_url'] = {'url': me['image_base64']}
me['type'] = 'image_url'
del me['image_base64']
args['messages'] = messages
args['stream'] = True
# 流式处理状态
# tool_calls_map: dict[str, provider_message.ToolCall] = {}
chunk_idx = 0
thinking_started = False
thinking_ended = False
role = 'assistant' # 默认角色
tool_id = ''
tool_name = ''
# accumulated_reasoning = '' # 仅用于判断何时结束思维链
async for chunk in self._req_stream(args, extra_body=extra_args):
# 解析 chunk 数据
if hasattr(chunk, 'choices') and chunk.choices:
choice = chunk.choices[0]
delta = choice.delta.model_dump() if hasattr(choice, 'delta') else {}
finish_reason = getattr(choice, 'finish_reason', None)
else:
delta = {}
finish_reason = None
# 从第一个 chunk 获取 role后续使用这个 role
if 'role' in delta and delta['role']:
role = delta['role']
# 获取增量内容
delta_content = delta.get('content', '')
reasoning_content = delta.get('reasoning_content', '')
# 处理 reasoning_content
if reasoning_content:
# accumulated_reasoning += reasoning_content
# 如果设置了 remove_think跳过 reasoning_content
if remove_think:
chunk_idx += 1
continue
# 第一次出现 reasoning_content添加 <think> 开始标签
if not thinking_started:
thinking_started = True
delta_content = '<think>\n' + reasoning_content
else:
# 继续输出 reasoning_content
delta_content = reasoning_content
elif thinking_started and not thinking_ended and delta_content:
# reasoning_content 结束normal content 开始,添加 </think> 结束标签
thinking_ended = True
delta_content = '\n</think>\n' + delta_content
# 处理 content 中已有的 <think> 标签(如果需要移除)
# if delta_content and remove_think and '<think>' in delta_content:
# import re
#
# # 移除 <think> 标签及其内容
# delta_content = re.sub(r'<think>.*?</think>', '', delta_content, flags=re.DOTALL)
# 处理工具调用增量
# delta_tool_calls = None
if delta.get('tool_calls'):
for tool_call in delta['tool_calls']:
if tool_call['id'] and tool_call['function']['name']:
tool_id = tool_call['id']
tool_name = tool_call['function']['name']
else:
tool_call['id'] = tool_id
tool_call['function']['name'] = tool_name
if tool_call['type'] is None:
tool_call['type'] = 'function'
# 跳过空的第一个 chunk只有 role 没有内容)
if chunk_idx == 0 and not delta_content and not reasoning_content and not delta.get('tool_calls'):
chunk_idx += 1
continue
# 构建 MessageChunk - 只包含增量内容
chunk_data = {
'role': role,
'content': delta_content if delta_content else None,
'tool_calls': delta.get('tool_calls'),
'is_final': bool(finish_reason),
}
# 移除 None 值
chunk_data = {k: v for k, v in chunk_data.items() if v is not None}
yield provider_message.MessageChunk(**chunk_data)
chunk_idx += 1
async def _closure(
self,
query: pipeline_query.Query,
req_messages: list[dict],
use_model: requester.RuntimeLLMModel,
use_funcs: list[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
remove_think: bool = False,
) -> tuple[provider_message.Message, dict]:
self.client.api_key = use_model.provider.token_mgr.get_token()
args = {}
args['model'] = use_model.model_entity.name
if use_funcs:
tools = await self.ap.tool_mgr.generate_tools_for_openai(use_funcs)
if tools:
args['tools'] = tools
# 设置此次请求中的messages
messages = req_messages.copy()
# 检查vision
for msg in messages:
if 'content' in msg and isinstance(msg['content'], list):
for me in msg['content']:
if me['type'] == 'image_base64':
me['image_url'] = {'url': me['image_base64']}
me['type'] = 'image_url'
del me['image_base64']
args['messages'] = messages
# 发送请求
resp = await self._req(args, extra_body=extra_args)
# 处理请求结果
message = await self._make_msg(resp, remove_think)
# Extract token usage from response
usage_info = {}
if hasattr(resp, 'usage') and resp.usage:
usage_info['input_tokens'] = resp.usage.prompt_tokens or 0
usage_info['output_tokens'] = resp.usage.completion_tokens or 0
usage_info['total_tokens'] = resp.usage.total_tokens or 0
return message, usage_info
async def invoke_llm(
self,
query: pipeline_query.Query,
model: requester.RuntimeLLMModel,
messages: typing.List[provider_message.Message],
funcs: typing.List[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
remove_think: bool = False,
) -> tuple[provider_message.Message, dict]:
"""Invoke LLM and return message with usage info"""
req_messages = [] # req_messages 仅用于类内,外部同步由 query.messages 进行
for m in messages:
msg_dict = m.dict(exclude_none=True)
content = msg_dict.get('content')
if isinstance(content, list):
# 检查 content 列表中是否每个部分都是文本
if all(isinstance(part, dict) and part.get('type') == 'text' for part in content):
# 将所有文本部分合并为一个字符串
msg_dict['content'] = '\n'.join(part['text'] for part in content)
req_messages.append(msg_dict)
try:
msg, usage_info = await self._closure(
query=query,
req_messages=req_messages,
use_model=model,
use_funcs=funcs,
extra_args=extra_args,
remove_think=remove_think,
)
return msg, usage_info
except asyncio.TimeoutError:
raise errors.RequesterError('请求超时')
except openai.BadRequestError as e:
error_message = str(e.message) if hasattr(e, 'message') else str(e)
if 'context_length_exceeded' in str(e):
raise errors.RequesterError(f'上文过长,请重置会话: {error_message}')
else:
raise errors.RequesterError(f'请求参数错误: {error_message}')
except openai.AuthenticationError as e:
error_message = str(e.message) if hasattr(e, 'message') else str(e)
raise errors.RequesterError(f'无效的 api-key: {error_message}')
except openai.NotFoundError as e:
error_message = str(e.message) if hasattr(e, 'message') else str(e)
raise errors.RequesterError(f'请求路径错误: {error_message}')
except openai.RateLimitError as e:
error_message = str(e.message) if hasattr(e, 'message') else str(e)
raise errors.RequesterError(f'请求过于频繁或余额不足: {error_message}')
except openai.APIConnectionError as e:
error_message = f'连接错误: {str(e)}'
raise errors.RequesterError(error_message)
except openai.APIError as e:
error_message = str(e.message) if hasattr(e, 'message') else str(e)
raise errors.RequesterError(f'请求错误: {error_message}')
async def invoke_embedding(
self,
model: requester.RuntimeEmbeddingModel,
input_text: list[str],
extra_args: dict[str, typing.Any] = {},
) -> tuple[list[list[float]], dict]:
"""调用 Embedding API, returns (embeddings, usage_info)"""
self.client.api_key = model.provider.token_mgr.get_token()
args = {
'model': model.model_entity.name,
'input': input_text,
}
if model.model_entity.extra_args:
args.update(model.model_entity.extra_args)
args.update(extra_args)
try:
resp = await self.client.embeddings.create(**args)
# Extract usage info
usage_info = {}
if hasattr(resp, 'usage') and resp.usage:
usage_info['prompt_tokens'] = resp.usage.prompt_tokens or 0
usage_info['total_tokens'] = resp.usage.total_tokens or 0
return [d.embedding for d in resp.data], usage_info
except asyncio.TimeoutError:
raise errors.RequesterError('请求超时')
except openai.BadRequestError as e:
raise errors.RequesterError(f'请求参数错误: {e.message}')
async def invoke_llm_stream(
self,
query: pipeline_query.Query,
model: requester.RuntimeLLMModel,
messages: typing.List[provider_message.Message],
funcs: typing.List[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
remove_think: bool = False,
) -> provider_message.MessageChunk:
req_messages = [] # req_messages 仅用于类内,外部同步由 query.messages 进行
for m in messages:
msg_dict = m.dict(exclude_none=True)
content = msg_dict.get('content')
if isinstance(content, list):
# 检查 content 列表中是否每个部分都是文本
if all(isinstance(part, dict) and part.get('type') == 'text' for part in content):
# 将所有文本部分合并为一个字符串
msg_dict['content'] = '\n'.join(part['text'] for part in content)
req_messages.append(msg_dict)
try:
async for item in self._closure_stream(
query=query,
req_messages=req_messages,
use_model=model,
use_funcs=funcs,
extra_args=extra_args,
remove_think=remove_think,
):
yield item
except asyncio.TimeoutError:
raise errors.RequesterError('请求超时')
except openai.BadRequestError as e:
if 'context_length_exceeded' in e.message:
raise errors.RequesterError(f'上文过长,请重置会话: {e.message}')
else:
raise errors.RequesterError(f'请求参数错误: {e.message}')
except openai.AuthenticationError as e:
raise errors.RequesterError(f'无效的 api-key: {e.message}')
except openai.NotFoundError as e:
raise errors.RequesterError(f'请求路径错误: {e.message}')
except openai.RateLimitError as e:
raise errors.RequesterError(f'请求过于频繁或余额不足: {e.message}')
except openai.APIError as e:
raise errors.RequesterError(f'请求错误: {e.message}')
async def invoke_rerank(
self,
model: requester.RuntimeRerankModel,
query: str,
documents: typing.List[str],
extra_args: dict[str, typing.Any] = {},
) -> typing.List[dict]:
"""Standard /rerank endpoint (Jina/Cohere/SiliconFlow/Voyage/DashScope compatible)
Supports extra_args from model.extra_args:
- rerank_url: full URL override (e.g. "https://dashscope.aliyuncs.com/compatible-api/v1/reranks")
- rerank_path: path override appended to base_url (e.g. "reranks" instead of default "rerank")
- Any other fields are merged into the request payload.
"""
api_key = model.provider.token_mgr.get_token()
base_url = self.requester_cfg.get('base_url', '').rstrip('/')
timeout = self.requester_cfg.get('timeout', 120)
merged_args = {}
if model.model_entity.extra_args:
merged_args.update(model.model_entity.extra_args)
if extra_args:
merged_args.update(extra_args)
rerank_url = merged_args.pop('rerank_url', None)
rerank_path = merged_args.pop('rerank_path', 'rerank')
if not rerank_url:
rerank_url = f'{base_url}/{rerank_path}'
headers = {
'Content-Type': 'application/json',
'Authorization': f'Bearer {api_key}',
}
payload = {
'model': model.model_entity.name,
'query': query,
'documents': documents[:64],
'top_n': min(len(documents), 64),
}
if merged_args:
payload.update(merged_args)
try:
async with httpx.AsyncClient(trust_env=True, timeout=timeout) as client:
resp = await client.post(rerank_url, headers=headers, json=payload)
resp.raise_for_status()
data = resp.json()
results = self._parse_rerank_response(data)
if results:
scores = [r.get('relevance_score', 0.0) for r in results]
min_score = min(scores)
max_score = max(scores)
if max_score - min_score > 1e-6:
for r in results:
r['relevance_score'] = (r['relevance_score'] - min_score) / (max_score - min_score)
return results
except httpx.HTTPStatusError as e:
raise errors.RequesterError(f'Rerank request failed: {e.response.status_code} - {e.response.text}')
except httpx.TimeoutException:
raise errors.RequesterError('Rerank request timed out')
except Exception as e:
raise errors.RequesterError(f'Rerank request error: {str(e)}')
@staticmethod
def _parse_rerank_response(data: dict) -> typing.List[dict]:
"""Parse rerank response from various providers.
Handles:
- Jina/Cohere/SiliconFlow: {"results": [{"index", "relevance_score"}]}
- Voyage AI: {"data": [{"index", "relevance_score"}]}
- DashScope: {"output": {"results": [{"index", "relevance_score"}]}}
"""
if 'results' in data:
return data['results']
if 'data' in data:
return data['data']
if 'output' in data and isinstance(data['output'], dict):
return data['output'].get('results', [])
return []

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: OpenAI
icon: openai.svg
spec:
litellm_provider: openai
config:
- name: base_url
label:

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: Cohere
icon: cohere.svg
spec:
litellm_provider: cohere
config:
- name: base_url
label:

View File

@@ -0,0 +1,17 @@
from __future__ import annotations
import typing
import openai
from . import chatcmpl
class CompShareChatCompletions(chatcmpl.OpenAIChatCompletions):
"""CompShare ChatCompletion API 请求器"""
client: openai.AsyncClient
default_config: dict[str, typing.Any] = {
'base_url': 'https://api.modelverse.cn/v1',
'timeout': 120,
}

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: 优云智算
icon: compshare.png
spec:
litellm_provider: openai
config:
- name: base_url
label:
@@ -25,8 +24,6 @@ spec:
default: 120
support_type:
- llm
- text-embedding
- rerank
provider_category: maas
execution:
python:

View File

@@ -0,0 +1,67 @@
from __future__ import annotations
import typing
from . import chatcmpl
from .. import errors, requester
import langbot_plugin.api.entities.builtin.resource.tool as resource_tool
import langbot_plugin.api.entities.builtin.pipeline.query as pipeline_query
import langbot_plugin.api.entities.builtin.provider.message as provider_message
class DeepseekChatCompletions(chatcmpl.OpenAIChatCompletions):
"""Deepseek ChatCompletion API 请求器"""
default_config: dict[str, typing.Any] = {
'base_url': 'https://api.deepseek.com',
'timeout': 120,
}
async def _closure(
self,
query: pipeline_query.Query,
req_messages: list[dict],
use_model: requester.RuntimeLLMModel,
use_funcs: list[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
remove_think: bool = False,
) -> tuple[provider_message.Message, dict]:
self.client.api_key = use_model.provider.token_mgr.get_token()
args = {}
args['model'] = use_model.model_entity.name
if use_funcs:
tools = await self.ap.tool_mgr.generate_tools_for_openai(use_funcs)
if tools:
args['tools'] = tools
# 设置此次请求中的messages
messages = req_messages
# deepseek 不支持多模态把content都转换成纯文字
for m in messages:
if 'content' in m and isinstance(m['content'], list):
m['content'] = ' '.join([c['text'] for c in m['content'] if 'text' in c])
args['messages'] = messages
# 发送请求
resp = await self._req(args, extra_body=extra_args)
# print(resp)
if resp is None:
raise errors.RequesterError('接口返回为空,请确定模型提供商服务是否正常')
# 处理请求结果
message = await self._make_msg(resp, remove_think)
# Extract token usage from response
usage_info = {}
if hasattr(resp, 'usage') and resp.usage:
usage_info['input_tokens'] = resp.usage.prompt_tokens or 0
usage_info['output_tokens'] = resp.usage.completion_tokens or 0
usage_info['total_tokens'] = resp.usage.total_tokens or 0
return message, usage_info

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: DeepSeek
icon: deepseek.svg
spec:
litellm_provider: deepseek
config:
- name: base_url
label:
@@ -25,8 +24,6 @@ spec:
default: 120
support_type:
- llm
- text-embedding
- rerank
provider_category: manufacturer
execution:
python:

View File

@@ -1,4 +0,0 @@
<svg width="60" height="50" viewBox="0 0 60 50" xmlns="http://www.w3.org/2000/svg">
<rect width="60" height="50" rx="8" fill="#3B82F6"/>
<text x="30" y="32" font-family="Arial, sans-serif" font-size="12" font-weight="bold" fill="white" text-anchor="middle">豆包</text>
</svg>

Before

Width:  |  Height:  |  Size: 282 B

View File

@@ -1,30 +0,0 @@
apiVersion: v1
kind: LLMAPIRequester
metadata:
name: doubao-chat-completions
label:
en_US: ByteDance Doubao
zh_Hans: 字节豆包
icon: doubao.svg
spec:
litellm_provider: openai
config:
- name: base_url
label:
en_US: Base URL
zh_Hans: 基础 URL
type: string
required: true
default: https://ark.cn-beijing.volces.com/api/v3
- name: timeout
label:
en_US: Timeout
zh_Hans: 超时时间
type: integer
required: true
default: 120
support_type:
- llm
- text-embedding
- rerank
provider_category: manufacturer

View File

@@ -0,0 +1,205 @@
from __future__ import annotations
import typing
import httpx
from . import chatcmpl
import uuid
from .. import requester
import langbot_plugin.api.entities.builtin.provider.message as provider_message
import langbot_plugin.api.entities.builtin.pipeline.query as pipeline_query
import langbot_plugin.api.entities.builtin.resource.tool as resource_tool
class GeminiChatCompletions(chatcmpl.OpenAIChatCompletions):
"""Google Gemini API 请求器"""
default_config: dict[str, typing.Any] = {
'base_url': 'https://generativelanguage.googleapis.com/v1beta/openai',
'timeout': 120,
}
async def scan_models(self, api_key: str | None = None) -> dict[str, typing.Any]:
models_url = 'https://generativelanguage.googleapis.com/v1beta/models'
params = {'key': api_key} if api_key else {}
all_models: list[dict[str, typing.Any]] = []
next_page_token = ''
last_payload: dict[str, typing.Any] = {}
async with httpx.AsyncClient(trust_env=True, timeout=self.requester_cfg['timeout']) as client:
while True:
request_params = dict(params)
if next_page_token:
request_params['pageToken'] = next_page_token
response = await client.get(models_url, params=request_params)
response.raise_for_status()
payload = response.json()
last_payload = payload
for item in payload.get('models', []):
model_name = item.get('name', '')
model_id = model_name.replace('models/', '', 1)
if not model_id:
continue
supported_methods = item.get('supportedGenerationMethods', []) or []
if 'embedContent' in supported_methods and 'generateContent' not in supported_methods:
model_type = 'embedding'
else:
model_type = 'llm'
all_models.append(
{
'id': model_id,
'name': model_id,
'type': model_type,
'abilities': self._infer_model_abilities(item, model_id),
'display_name': item.get('displayName') or None,
'description': item.get('description') or None,
'context_length': item.get('inputTokenLimit'),
'input_modalities': self._normalize_modalities(item.get('inputModalities')),
'output_modalities': self._normalize_modalities(item.get('outputModalities')),
}
)
next_page_token = payload.get('nextPageToken', '')
if not next_page_token:
break
all_models.sort(key=lambda item: (item['type'] != 'llm', item['name'].lower()))
return {
'models': all_models,
'debug': {
'request': {
'method': 'GET',
'url': models_url,
'query': {'key': self._mask_api_key(api_key)} if api_key else {},
},
'response': last_payload,
},
}
async def _closure_stream(
self,
query: pipeline_query.Query,
req_messages: list[dict],
use_model: requester.RuntimeLLMModel,
use_funcs: list[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
remove_think: bool = False,
) -> provider_message.MessageChunk:
self.client.api_key = use_model.provider.token_mgr.get_token()
args = {}
args['model'] = use_model.model_entity.name
if use_funcs:
tools = await self.ap.tool_mgr.generate_tools_for_openai(use_funcs)
if tools:
args['tools'] = tools
# 设置此次请求中的messages
messages = req_messages.copy()
# 检查vision
for msg in messages:
if 'content' in msg and isinstance(msg['content'], list):
for me in msg['content']:
if me['type'] == 'image_base64':
me['image_url'] = {'url': me['image_base64']}
me['type'] = 'image_url'
del me['image_base64']
args['messages'] = messages
args['stream'] = True
# 流式处理状态
# tool_calls_map: dict[str, provider_message.ToolCall] = {}
chunk_idx = 0
thinking_started = False
thinking_ended = False
role = 'assistant' # 默认角色
tool_id = ''
tool_name = ''
# accumulated_reasoning = '' # 仅用于判断何时结束思维链
async for chunk in self._req_stream(args, extra_body=extra_args):
# 解析 chunk 数据
if hasattr(chunk, 'choices') and chunk.choices:
choice = chunk.choices[0]
delta = choice.delta.model_dump() if hasattr(choice, 'delta') else {}
finish_reason = getattr(choice, 'finish_reason', None)
else:
delta = {}
finish_reason = None
# 从第一个 chunk 获取 role后续使用这个 role
if 'role' in delta and delta['role']:
role = delta['role']
# 获取增量内容
delta_content = delta.get('content', '')
reasoning_content = delta.get('reasoning_content', '')
# 处理 reasoning_content
if reasoning_content:
# accumulated_reasoning += reasoning_content
# 如果设置了 remove_think跳过 reasoning_content
if remove_think:
chunk_idx += 1
continue
# 第一次出现 reasoning_content添加 <think> 开始标签
if not thinking_started:
thinking_started = True
delta_content = '<think>\n' + reasoning_content
else:
# 继续输出 reasoning_content
delta_content = reasoning_content
elif thinking_started and not thinking_ended and delta_content:
# reasoning_content 结束normal content 开始,添加 </think> 结束标签
thinking_ended = True
delta_content = '\n</think>\n' + delta_content
# 处理 content 中已有的 <think> 标签(如果需要移除)
# if delta_content and remove_think and '<think>' in delta_content:
# import re
#
# # 移除 <think> 标签及其内容
# delta_content = re.sub(r'<think>.*?</think>', '', delta_content, flags=re.DOTALL)
# 处理工具调用增量
# delta_tool_calls = None
if delta.get('tool_calls'):
for tool_call in delta['tool_calls']:
if tool_call['id'] == '' and tool_id == '':
tool_id = str(uuid.uuid4())
if tool_call['function']['name']:
tool_name = tool_call['function']['name']
tool_call['id'] = tool_id
tool_call['function']['name'] = tool_name
if tool_call['type'] is None:
tool_call['type'] = 'function'
# 跳过空的第一个 chunk只有 role 没有内容)
if chunk_idx == 0 and not delta_content and not reasoning_content and not delta.get('tool_calls'):
chunk_idx += 1
continue
# 构建 MessageChunk - 只包含增量内容
chunk_data = {
'role': role,
'content': delta_content if delta_content else None,
'tool_calls': delta.get('tool_calls'),
'is_final': bool(finish_reason),
}
# 移除 None 值
chunk_data = {k: v for k, v in chunk_data.items() if v is not None}
yield provider_message.MessageChunk(**chunk_data)
chunk_idx += 1

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: Google Gemini
icon: gemini.svg
spec:
litellm_provider: gemini
config:
- name: base_url
label:
@@ -25,8 +24,6 @@ spec:
default: 120
support_type:
- llm
- text-embedding
- rerank
provider_category: manufacturer
execution:
python:

View File

@@ -0,0 +1,15 @@
from __future__ import annotations
import typing
from . import ppiochatcmpl
class GiteeAIChatCompletions(ppiochatcmpl.PPIOChatCompletions):
"""Gitee AI ChatCompletions API 请求器"""
default_config: dict[str, typing.Any] = {
'base_url': 'https://ai.gitee.com/v1',
'timeout': 120,
}

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: Gitee AI
icon: giteeai.svg
spec:
litellm_provider: openai
config:
- name: base_url
label:

View File

@@ -1,4 +0,0 @@
<svg width="60" height="50" viewBox="0 0 60 50" xmlns="http://www.w3.org/2000/svg">
<rect width="60" height="50" rx="8" fill="#F97316"/>
<text x="30" y="32" font-family="Arial, sans-serif" font-size="14" font-weight="bold" fill="white" text-anchor="middle">Groq</text>
</svg>

Before

Width:  |  Height:  |  Size: 280 B

View File

@@ -1,30 +0,0 @@
apiVersion: v1
kind: LLMAPIRequester
metadata:
name: groq-chat-completions
label:
en_US: Groq
zh_Hans: Groq
icon: groq.svg
spec:
litellm_provider: groq
config:
- name: base_url
label:
en_US: Base URL
zh_Hans: 基础 URL
type: string
required: true
default: https://api.groq.com/openai/v1
- name: timeout
label:
en_US: Timeout
zh_Hans: 超时时间
type: integer
required: true
default: 120
support_type:
- llm
- text-embedding
- rerank
provider_category: manufacturer

View File

@@ -1,5 +0,0 @@
<svg width="60" height="50" viewBox="0 0 60 50" xmlns="http://www.w3.org/2000/svg">
<rect width="60" height="50" rx="8" fill="#0066FF"/>
<text x="30" y="28" font-family="Arial, sans-serif" font-size="10" font-weight="bold" fill="white" text-anchor="middle">iFlytek</text>
<text x="30" y="40" font-family="Arial, sans-serif" font-size="8" fill="white" text-anchor="middle">Spark</text>
</svg>

Before

Width:  |  Height:  |  Size: 398 B

View File

@@ -1,30 +0,0 @@
apiVersion: v1
kind: LLMAPIRequester
metadata:
name: iflytek-chat-completions
label:
en_US: iFlytek Spark
zh_Hans: 讯飞星火
icon: iflytek.svg
spec:
litellm_provider: openai
config:
- name: base_url
label:
en_US: Base URL
zh_Hans: 基础 URL
type: string
required: true
default: https://spark-api-open.xf-yun.com/v1
- name: timeout
label:
en_US: Timeout
zh_Hans: 超时时间
type: integer
required: true
default: 120
support_type:
- llm
- text-embedding
- rerank
provider_category: manufacturer

View File

@@ -0,0 +1,208 @@
from __future__ import annotations
import openai
import typing
from . import chatcmpl
from .. import requester
import openai.types.chat.chat_completion as chat_completion
import re
import langbot_plugin.api.entities.builtin.provider.message as provider_message
import langbot_plugin.api.entities.builtin.pipeline.query as pipeline_query
import langbot_plugin.api.entities.builtin.resource.tool as resource_tool
class JieKouAIChatCompletions(chatcmpl.OpenAIChatCompletions):
"""接口 AI ChatCompletion API 请求器"""
client: openai.AsyncClient
default_config: dict[str, typing.Any] = {
'base_url': 'https://api.jiekou.ai/openai',
'timeout': 120,
}
is_think: bool = False
async def _make_msg(
self,
chat_completion: chat_completion.ChatCompletion,
remove_think: bool,
) -> provider_message.Message:
chatcmpl_message = chat_completion.choices[0].message.model_dump()
# print(chatcmpl_message.keys(), chatcmpl_message.values())
# 确保 role 字段存在且不为 None
if 'role' not in chatcmpl_message or chatcmpl_message['role'] is None:
chatcmpl_message['role'] = 'assistant'
reasoning_content = chatcmpl_message['reasoning_content'] if 'reasoning_content' in chatcmpl_message else None
# deepseek的reasoner模型
chatcmpl_message['content'] = await self._process_thinking_content(
chatcmpl_message['content'], reasoning_content, remove_think
)
# 移除 reasoning_content 字段,避免传递给 Message
if 'reasoning_content' in chatcmpl_message:
del chatcmpl_message['reasoning_content']
message = provider_message.Message(**chatcmpl_message)
return message
async def _process_thinking_content(
self,
content: str,
reasoning_content: str = None,
remove_think: bool = False,
) -> tuple[str, str]:
"""处理思维链内容
Args:
content: 原始内容
reasoning_content: reasoning_content 字段内容
remove_think: 是否移除思维链
Returns:
处理后的内容
"""
if remove_think:
content = re.sub(r'<think>.*?</think>', '', content, flags=re.DOTALL)
else:
if reasoning_content is not None:
content = '<think>\n' + reasoning_content + '\n</think>\n' + content
return content
async def _make_msg_chunk(
self,
delta: dict[str, typing.Any],
idx: int,
) -> provider_message.MessageChunk:
# 处理流式chunk和完整响应的差异
# print(chat_completion.choices[0])
# 确保 role 字段存在且不为 None
if 'role' not in delta or delta['role'] is None:
delta['role'] = 'assistant'
reasoning_content = delta['reasoning_content'] if 'reasoning_content' in delta else None
delta['content'] = '' if delta['content'] is None else delta['content']
# print(reasoning_content)
# deepseek的reasoner模型
if reasoning_content is not None:
delta['content'] += reasoning_content
message = provider_message.MessageChunk(**delta)
return message
async def _closure_stream(
self,
query: pipeline_query.Query,
req_messages: list[dict],
use_model: requester.RuntimeLLMModel,
use_funcs: list[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
remove_think: bool = False,
) -> provider_message.Message | typing.AsyncGenerator[provider_message.MessageChunk, None]:
self.client.api_key = use_model.provider.token_mgr.get_token()
args = {}
args['model'] = use_model.model_entity.name
if use_funcs:
tools = await self.ap.tool_mgr.generate_tools_for_openai(use_funcs)
if tools:
args['tools'] = tools
# 设置此次请求中的messages
messages = req_messages.copy()
# 检查vision
for msg in messages:
if 'content' in msg and isinstance(msg['content'], list):
for me in msg['content']:
if me['type'] == 'image_base64':
me['image_url'] = {'url': me['image_base64']}
me['type'] = 'image_url'
del me['image_base64']
args['messages'] = messages
args['stream'] = True
# tool_calls_map: dict[str, provider_message.ToolCall] = {}
chunk_idx = 0
thinking_started = False
thinking_ended = False
role = 'assistant' # 默认角色
async for chunk in self._req_stream(args, extra_body=extra_args):
# 解析 chunk 数据
if hasattr(chunk, 'choices') and chunk.choices:
choice = chunk.choices[0]
delta = choice.delta.model_dump() if hasattr(choice, 'delta') else {}
finish_reason = getattr(choice, 'finish_reason', None)
else:
delta = {}
finish_reason = None
# 从第一个 chunk 获取 role后续使用这个 role
if 'role' in delta and delta['role']:
role = delta['role']
# 获取增量内容
delta_content = delta.get('content', '')
# reasoning_content = delta.get('reasoning_content', '')
if remove_think:
if delta['content'] is not None:
if '<think>' in delta['content'] and not thinking_started and not thinking_ended:
thinking_started = True
continue
elif delta['content'] == r'</think>' and not thinking_ended:
thinking_ended = True
continue
elif thinking_ended and delta['content'] == '\n\n' and thinking_started:
thinking_started = False
continue
elif thinking_started and not thinking_ended:
continue
# delta_tool_calls = None
if delta.get('tool_calls'):
for tool_call in delta['tool_calls']:
if tool_call['id'] and tool_call['function']['name']:
tool_id = tool_call['id']
tool_name = tool_call['function']['name']
if tool_call['id'] is None:
tool_call['id'] = tool_id
if tool_call['function']['name'] is None:
tool_call['function']['name'] = tool_name
if tool_call['function']['arguments'] is None:
tool_call['function']['arguments'] = ''
if tool_call['type'] is None:
tool_call['type'] = 'function'
# 跳过空的第一个 chunk只有 role 没有内容)
if chunk_idx == 0 and not delta_content and not delta.get('tool_calls'):
chunk_idx += 1
continue
# 构建 MessageChunk - 只包含增量内容
chunk_data = {
'role': role,
'content': delta_content if delta_content else None,
'tool_calls': delta.get('tool_calls'),
'is_final': bool(finish_reason),
}
# 移除 None 值
chunk_data = {k: v for k, v in chunk_data.items() if v is not None}
yield provider_message.MessageChunk(**chunk_data)
chunk_idx += 1

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: 接口 AI
icon: jiekouai.png
spec:
litellm_provider: openai
config:
- name: base_url
label:

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: Jina
icon: jina.svg
spec:
litellm_provider: openai
config:
- name: base_url
label:

View File

@@ -1,644 +0,0 @@
"""LiteLLM unified requester for chat, embedding, and rerank."""
from __future__ import annotations
import typing
import litellm
from litellm import acompletion, aembedding, arerank
from .. import errors, requester
import langbot_plugin.api.entities.builtin.resource.tool as resource_tool
import langbot_plugin.api.entities.builtin.pipeline.query as pipeline_query
import langbot_plugin.api.entities.builtin.provider.message as provider_message
class LiteLLMRequester(requester.ProviderAPIRequester):
"""LiteLLM unified API requester supporting chat, embedding, and rerank."""
_EMBEDDING_MODEL_HINTS = ('embedding', 'embed', 'bge-', 'e5-', 'm3e', 'gte-', 'text-embedding')
_RERANK_MODEL_HINTS = ('rerank', 're-rank', 're_rank')
default_config: dict[str, typing.Any] = {
'base_url': '',
'timeout': 120,
'custom_llm_provider': '',
'drop_params': False,
'num_retries': 0,
'api_version': '',
}
async def initialize(self):
"""Initialize LiteLLM client settings."""
# LiteLLM doesn't require explicit client initialization
# Configuration is passed per-request via litellm params
pass
def _build_litellm_model_name(self, model_name: str, custom_llm_provider: str | None = None) -> str:
"""Build LiteLLM model name with provider prefix if needed."""
provider = custom_llm_provider or self.requester_cfg.get('custom_llm_provider', '')
if provider:
# LiteLLM format: provider/model_name
if model_name.startswith(f'{provider}/'):
return model_name
return f'{provider}/{model_name}'
# If no custom provider, assume model_name already includes prefix or is OpenAI-compatible
return model_name
def _get_custom_llm_provider(self) -> str | None:
return self.requester_cfg.get('custom_llm_provider') or None
def _safe_litellm_bool_helper(self, helper_name: str, model_name: str) -> bool:
"""Call a LiteLLM boolean capability helper without letting metadata gaps fail requests."""
helper = getattr(litellm, helper_name, None)
if not callable(helper):
return False
provider = self._get_custom_llm_provider()
candidates: list[tuple[str, str | None]] = [(model_name, provider)]
litellm_model_name = self._build_litellm_model_name(model_name)
if litellm_model_name != model_name:
candidates.append((litellm_model_name, None))
for metadata_provider in self._metadata_provider_candidates(model_name):
candidates.append((f'{metadata_provider}/{model_name}', None))
tried_candidates: set[tuple[str, str | None]] = set()
for candidate_model, candidate_provider in candidates:
candidate_key = (candidate_model, candidate_provider)
if candidate_key in tried_candidates:
continue
tried_candidates.add(candidate_key)
try:
if bool(helper(model=candidate_model, custom_llm_provider=candidate_provider)):
return True
except Exception:
continue
return False
def _context_length_from_scan_payload(self, model_payload: dict[str, typing.Any] | None) -> int | None:
if not model_payload:
return None
for field_name in ('context_length', 'context_window', 'max_context_length'):
value = model_payload.get(field_name)
if isinstance(value, bool):
continue
if isinstance(value, int) and value > 0:
return value
if isinstance(value, str) and value.isdigit():
parsed_value = int(value)
if parsed_value > 0:
return parsed_value
return None
def _metadata_provider_candidates(self, model_name: str) -> list[str]:
normalized_model_name = (model_name or '').lower()
candidates = []
if normalized_model_name.startswith(('moonshot-', 'kimi-')):
candidates.append('moonshot')
if normalized_model_name.startswith('deepseek-'):
candidates.append('deepseek')
base_url = self.requester_cfg.get('base_url', '').lower()
if 'moonshot' in base_url:
candidates.append('moonshot')
if 'deepseek' in base_url:
candidates.append('deepseek')
deduped_candidates = []
for candidate in candidates:
if candidate not in deduped_candidates:
deduped_candidates.append(candidate)
return deduped_candidates
def _known_context_length_fallback(self, model_name: str) -> int | None:
normalized_model_name = (model_name or '').lower()
if normalized_model_name.startswith('deepseek-v4-'):
return 1_000_000
if normalized_model_name.startswith(('kimi-k2.5', 'kimi-k2.6')):
return 256 * 1024
if normalized_model_name.startswith('moonshot-v1-8k'):
return 8 * 1024
if normalized_model_name.startswith('moonshot-v1-32k'):
return 32 * 1024
if normalized_model_name.startswith('moonshot-v1-128k') or normalized_model_name == 'moonshot-v1-auto':
return 128 * 1024
return None
def _safe_context_length(self, model_name: str) -> int | None:
helper = getattr(litellm, 'get_max_tokens', None)
if not callable(helper):
return self._known_context_length_fallback(model_name)
candidates = [model_name]
litellm_model_name = self._build_litellm_model_name(model_name)
if litellm_model_name != model_name:
candidates.append(litellm_model_name)
for provider in self._metadata_provider_candidates(model_name):
candidates.append(f'{provider}/{model_name}')
tried_candidates = []
for candidate in candidates:
if candidate in tried_candidates:
continue
tried_candidates.append(candidate)
try:
max_tokens = helper(candidate)
except Exception:
continue
if isinstance(max_tokens, int) and max_tokens > 0:
return max_tokens
return self._known_context_length_fallback(model_name)
def _supports_function_calling(self, model_name: str) -> bool:
return self._safe_litellm_bool_helper('supports_function_calling', model_name)
def _supports_vision(self, model_name: str) -> bool:
return self._safe_litellm_bool_helper('supports_vision', model_name)
def _infer_model_type(self, model_id: str) -> str:
normalized_id = (model_id or '').lower()
if any(kw in normalized_id for kw in self._RERANK_MODEL_HINTS):
return 'rerank'
if any(kw in normalized_id for kw in self._EMBEDDING_MODEL_HINTS):
return 'embedding'
return 'llm'
def _enrich_scanned_model(
self,
model_id: str,
model_payload: dict[str, typing.Any] | None = None,
) -> dict[str, typing.Any]:
model_type = self._infer_model_type(model_id)
scanned_model: dict[str, typing.Any] = {
'id': model_id,
'name': model_id,
'type': model_type,
}
if model_type == 'llm':
abilities = []
if self._supports_function_calling(model_id):
abilities.append('func_call')
supports_provider_reported_vision = bool(
model_payload
and (model_payload.get('supports_image_in') is True or model_payload.get('supports_vision') is True)
)
if supports_provider_reported_vision or self._supports_vision(model_id):
abilities.append('vision')
scanned_model['abilities'] = abilities
context_length = self._context_length_from_scan_payload(model_payload)
if context_length is None:
context_length = self._safe_context_length(model_id)
if context_length is not None:
scanned_model['context_length'] = context_length
return scanned_model
def _convert_messages(self, messages: typing.List[provider_message.Message]) -> list[dict]:
"""Convert LangBot messages to LiteLLM/OpenAI format."""
req_messages = []
for m in messages:
msg_dict = m.dict(exclude_none=True)
content = msg_dict.get('content')
if isinstance(content, list):
for part in content:
if isinstance(part, dict) and part.get('type') == 'image_base64':
part['image_url'] = {'url': part['image_base64']}
part['type'] = 'image_url'
del part['image_base64']
req_messages.append(msg_dict)
return req_messages
def _process_thinking_content(self, content: str, reasoning_content: str | None, remove_think: bool) -> str:
"""Process thinking/reasoning content.
Args:
content: The main content from response
reasoning_content: Separate reasoning content from model
remove_think: If True, remove thinking markers; if False, preserve them
Returns:
Processed content string
"""
# Extract and handle thinking tags
if content and 'CRETIRE_REASONING_BEGINk' in content and 'CRETIRE_REASONING_ENDk' in content:
import re
think_pattern = r'CRETIRE_REASONING_BEGINk(.*?)CRETIRE_REASONING_ENDk'
if remove_think:
# Remove thinking tags and their content from output
content = re.sub(think_pattern, '', content, flags=re.DOTALL).strip()
# else: preserve thinking content as-is
# Handle separate reasoning_content field
# Currently we don't include reasoning_content in user-facing output regardless of remove_think
# because it's typically internal model reasoning, not user-visible thinking
return content or ''
@staticmethod
def _normalize_usage(usage: typing.Any) -> dict:
"""Normalize a LiteLLM/OpenAI usage object into a plain token dict.
Handles several real-world shapes returned by different upstreams:
- object with ``prompt_tokens`` / ``completion_tokens`` / ``total_tokens`` attrs
- dict with the same keys
- missing ``total_tokens`` (derived from prompt + completion)
- ``None`` / partially-populated usage (defaults to 0)
"""
if usage is None:
return {'prompt_tokens': 0, 'completion_tokens': 0, 'total_tokens': 0}
def _get(key: str) -> typing.Any:
if isinstance(usage, dict):
return usage.get(key)
return getattr(usage, key, None)
prompt_tokens = _get('prompt_tokens') or 0
completion_tokens = _get('completion_tokens') or 0
total_tokens = _get('total_tokens') or 0
# Some providers omit total_tokens in streaming usage; derive it.
if not total_tokens:
total_tokens = prompt_tokens + completion_tokens
return {
'prompt_tokens': int(prompt_tokens),
'completion_tokens': int(completion_tokens),
'total_tokens': int(total_tokens),
}
def _extract_usage(self, response) -> dict:
"""Extract usage info from a non-streaming LiteLLM response."""
return self._normalize_usage(getattr(response, 'usage', None))
@staticmethod
def _as_dict(value: typing.Any) -> dict:
if value is None:
return {}
if isinstance(value, dict):
return value
if hasattr(value, 'model_dump'):
return value.model_dump()
return {}
def _normalize_stream_tool_calls(
self,
raw_tool_calls: typing.Any,
tool_call_state: dict[int, dict[str, str]],
) -> list[dict] | None:
"""Fill OpenAI-style streaming tool-call deltas so MessageChunk can validate them."""
if not raw_tool_calls:
return None
normalized = []
for fallback_index, raw_tool_call in enumerate(raw_tool_calls):
tool_call = self._as_dict(raw_tool_call)
index = tool_call.get('index')
if not isinstance(index, int):
index = fallback_index
state = tool_call_state.setdefault(index, {'id': '', 'type': 'function', 'name': ''})
if tool_call.get('id'):
state['id'] = tool_call['id']
if tool_call.get('type'):
state['type'] = tool_call['type']
function = self._as_dict(tool_call.get('function'))
if function.get('name'):
state['name'] = function['name']
arguments = function.get('arguments')
if arguments is None:
arguments = ''
elif not isinstance(arguments, str):
arguments = str(arguments)
if not state['id'] or not state['name']:
continue
normalized.append(
{
'id': state['id'],
'type': state['type'] or 'function',
'function': {
'name': state['name'],
'arguments': arguments,
},
}
)
return normalized or None
def _build_common_args(self, args: dict, include_retry_params: bool = True) -> dict:
"""Apply common requester config to args dict."""
if self.requester_cfg.get('base_url'):
args['api_base'] = self.requester_cfg['base_url']
if self.requester_cfg.get('timeout'):
args['timeout'] = self.requester_cfg['timeout']
if include_retry_params:
if self.requester_cfg.get('drop_params'):
args['drop_params'] = self.requester_cfg['drop_params']
if self.requester_cfg.get('num_retries'):
args['num_retries'] = self.requester_cfg['num_retries']
if self.requester_cfg.get('api_version'):
args['api_version'] = self.requester_cfg['api_version']
return args
def _handle_litellm_error(self, e: Exception) -> None:
"""Convert LiteLLM exceptions to RequesterError. Never returns, always raises."""
# Check more specific exceptions first (they inherit from base exceptions)
if isinstance(e, litellm.ContextWindowExceededError):
raise errors.RequesterError(f'上下文长度超限: {str(e)}')
if isinstance(e, litellm.BadRequestError):
raise errors.RequesterError(f'请求参数错误: {str(e)}')
if isinstance(e, litellm.AuthenticationError):
raise errors.RequesterError(f'API key 无效: {str(e)}')
if isinstance(e, litellm.NotFoundError):
raise errors.RequesterError(f'模型或路径无效: {str(e)}')
if isinstance(e, litellm.RateLimitError):
raise errors.RequesterError(f'请求过于频繁或余额不足: {str(e)}')
if isinstance(e, litellm.Timeout):
raise errors.RequesterError(f'请求超时: {str(e)}')
if isinstance(e, litellm.APIConnectionError):
raise errors.RequesterError(f'连接错误: {str(e)}')
if isinstance(e, litellm.APIError):
raise errors.RequesterError(f'API 错误: {str(e)}')
raise errors.RequesterError(f'未知错误: {str(e)}')
async def _build_completion_args(
self,
model: requester.RuntimeLLMModel,
messages: typing.List[provider_message.Message],
funcs: typing.List[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
stream: bool = False,
) -> dict:
"""Build common completion arguments for invoke_llm and invoke_llm_stream."""
req_messages = self._convert_messages(messages)
model_name = self._build_litellm_model_name(model.model_entity.name)
api_key = model.provider.token_mgr.get_token()
args = {
'model': model_name,
'messages': req_messages,
'api_key': api_key,
}
if stream:
args['stream'] = True
args['stream_options'] = {'include_usage': True}
self._build_common_args(args)
# Apply model-level extra_args first, then call-level extra_args
if model.model_entity.extra_args:
args.update(model.model_entity.extra_args)
args.update(extra_args)
if funcs:
tools = await self.ap.tool_mgr.generate_tools_for_openai(funcs)
if tools:
args['tools'] = tools
args.setdefault('tool_choice', 'auto')
return args
async def invoke_llm(
self,
query: pipeline_query.Query,
model: requester.RuntimeLLMModel,
messages: typing.List[provider_message.Message],
funcs: typing.List[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
remove_think: bool = False,
) -> tuple[provider_message.Message, dict]:
"""Invoke LLM and return message with usage info."""
args = await self._build_completion_args(model, messages, funcs, extra_args, stream=False)
try:
response = await acompletion(**args)
message_data = response.choices[0].message.model_dump()
if 'role' not in message_data or message_data['role'] is None:
message_data['role'] = 'assistant'
content = message_data.get('content', '')
reasoning_content = message_data.get('reasoning_content', None)
message_data['content'] = self._process_thinking_content(content, reasoning_content, remove_think)
if 'reasoning_content' in message_data:
del message_data['reasoning_content']
message = provider_message.Message(**message_data)
usage_info = self._extract_usage(response)
return message, usage_info
except Exception as e:
self._handle_litellm_error(e)
async def invoke_llm_stream(
self,
query: pipeline_query.Query,
model: requester.RuntimeLLMModel,
messages: typing.List[provider_message.Message],
funcs: typing.List[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
remove_think: bool = False,
) -> provider_message.MessageChunk:
"""Invoke LLM streaming and yield chunks."""
args = await self._build_completion_args(model, messages, funcs, extra_args, stream=True)
chunk_idx = 0
role = 'assistant'
tool_call_state: dict[int, dict[str, str]] = {}
try:
response = await acompletion(**args)
async for chunk in response:
# Capture usage whenever a chunk carries it.
#
# Important: many OpenAI-compatible gateways (e.g. new-api) and
# providers send the final usage payload in a chunk that STILL
# contains a (empty-delta) choice, not an empty `choices` list.
# The previous implementation only captured usage when `choices`
# was empty, so streamed calls always recorded 0 tokens.
# We therefore capture usage independently of `choices`, and then
# fall through to also process any content this chunk may carry.
if getattr(chunk, 'usage', None):
usage_info = self._normalize_usage(chunk.usage)
if query is not None:
if query.variables is None:
query.variables = {}
query.variables['_stream_usage'] = usage_info
if not hasattr(chunk, 'choices') or not chunk.choices:
continue
choice = chunk.choices[0]
delta = choice.delta.model_dump() if hasattr(choice, 'delta') else {}
finish_reason = getattr(choice, 'finish_reason', None)
if 'role' in delta and delta['role']:
role = delta['role']
delta_content = delta.get('content', '')
reasoning_content = delta.get('reasoning_content', '')
# Handle reasoning_content based on remove_think flag
if reasoning_content:
if remove_think:
# Skip reasoning content when remove_think is True
chunk_idx += 1
continue
else:
# Use reasoning_content as the displayed content
delta_content = reasoning_content
tool_calls = self._normalize_stream_tool_calls(delta.get('tool_calls'), tool_call_state)
if chunk_idx == 0 and not delta_content and not tool_calls:
chunk_idx += 1
continue
chunk_data = {
'role': role,
'content': delta_content if delta_content else None,
'tool_calls': tool_calls,
'is_final': bool(finish_reason),
}
chunk_data = {k: v for k, v in chunk_data.items() if v is not None}
yield provider_message.MessageChunk(**chunk_data)
chunk_idx += 1
except Exception as e:
self._handle_litellm_error(e)
async def invoke_embedding(
self,
model: requester.RuntimeEmbeddingModel,
input_text: list[str],
extra_args: dict[str, typing.Any] = {},
) -> tuple[list[list[float]], dict]:
"""Invoke embedding and return vectors with usage info."""
model_name = self._build_litellm_model_name(model.model_entity.name)
api_key = model.provider.token_mgr.get_token()
args = {
'model': model_name,
'input': input_text,
'api_key': api_key,
}
self._build_common_args(args, include_retry_params=False)
if model.model_entity.extra_args:
args.update(model.model_entity.extra_args)
args.update(extra_args)
try:
response = await aembedding(**args)
embeddings = [d.embedding for d in response.data]
usage_info = self._extract_usage(response)
return embeddings, usage_info
except Exception as e:
self._handle_litellm_error(e)
async def invoke_rerank(
self,
model: requester.RuntimeRerankModel,
query: str,
documents: typing.List[str],
extra_args: dict[str, typing.Any] = {},
) -> typing.List[dict]:
"""Invoke rerank and return relevance scores."""
model_name = self._build_litellm_model_name(model.model_entity.name)
api_key = model.provider.token_mgr.get_token()
args = {
'model': model_name,
'query': query,
'documents': documents,
'api_key': api_key,
'top_n': min(len(documents), 64),
}
self._build_common_args(args, include_retry_params=False)
if model.model_entity.extra_args:
args.update(model.model_entity.extra_args)
args.update(extra_args)
try:
response = await arerank(**args)
results = []
for r in response.results:
results.append(
{
'index': r.get('index', 0),
'relevance_score': r.get('relevance_score', 0.0),
}
)
if results:
scores = [r['relevance_score'] for r in results]
min_score = min(scores)
max_score = max(scores)
if max_score - min_score > 1e-6:
for r in results:
r['relevance_score'] = (r['relevance_score'] - min_score) / (max_score - min_score)
return results
except Exception as e:
self._handle_litellm_error(e)
async def scan_models(self, api_key: str | None = None) -> dict[str, typing.Any]:
"""Scan models supported by the provider."""
import httpx
base_url = self.requester_cfg.get('base_url', '').rstrip('/')
timeout = self.requester_cfg.get('timeout', 120)
if not base_url:
raise errors.RequesterError('Base URL required for model scanning')
headers = {}
if api_key:
headers['Authorization'] = f'Bearer {api_key}'
models_url = f'{base_url}/models'
try:
async with httpx.AsyncClient(trust_env=True, timeout=timeout) as client:
response = await client.get(models_url, headers=headers)
response.raise_for_status()
payload = response.json()
models = []
for item in payload.get('data', []):
model_id = item.get('id')
if not model_id:
continue
models.append(self._enrich_scanned_model(model_id, item))
models.sort(key=lambda x: (x['type'] != 'llm', x['name'].lower()))
return {'models': models}
except httpx.HTTPStatusError as e:
raise errors.RequesterError(f'Model scan failed: {e.response.status_code}')
except httpx.TimeoutException:
raise errors.RequesterError('Model scan timeout')
except Exception as e:
raise errors.RequesterError(f'Model scan error: {str(e)}')

View File

@@ -1,64 +0,0 @@
apiVersion: v1
kind: LLMAPIRequester
metadata:
name: litellm-chat
label:
en_US: LiteLLM (Unified)
zh_Hans: LiteLLM (统一请求器)
icon: litellm.svg
spec:
config:
- name: base_url
label:
en_US: Base URL
zh_Hans: 基础 URL
type: string
required: false
default: ''
- name: timeout
label:
en_US: Timeout
zh_Hans: 超时时间
type: integer
required: true
default: 120
- name: custom_llm_provider
label:
en_US: Custom Provider
zh_Hans: 自定义 Provider
type: string
required: false
default: ''
description:
en_US: Force provider type (e.g., anthropic, openai, gemini)
zh_Hans: 强制指定 provider 类型(如 anthropic, openai, gemini
- name: drop_params
label:
en_US: Drop Unsupported Params
zh_Hans: 丢弃不支持参数
type: boolean
required: false
default: false
- name: num_retries
label:
en_US: Number of Retries
zh_Hans: 重试次数
type: integer
required: false
default: 0
- name: api_version
label:
en_US: API Version
zh_Hans: API 版本
type: string
required: false
default: ''
support_type:
- llm
- text-embedding
- rerank
provider_category: unified
execution:
python:
path: ./litellmchat.py
attr: LiteLLMRequester

View File

@@ -0,0 +1,17 @@
from __future__ import annotations
import typing
import openai
from . import chatcmpl
class LmStudioChatCompletions(chatcmpl.OpenAIChatCompletions):
"""LMStudio ChatCompletion API 请求器"""
client: openai.AsyncClient
default_config: dict[str, typing.Any] = {
'base_url': 'http://127.0.0.1:1234/v1',
'timeout': 120,
}

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: LM Studio
icon: lmstudio.webp
spec:
litellm_provider: openai
config:
- name: base_url
label:

View File

@@ -1,4 +0,0 @@
<svg width="60" height="50" viewBox="0 0 60 50" xmlns="http://www.w3.org/2000/svg">
<rect width="60" height="50" rx="8" fill="#FF6700"/>
<text x="30" y="32" font-family="Arial, sans-serif" font-size="18" font-weight="bold" fill="white" text-anchor="middle">MiMo</text>
</svg>

Before

Width:  |  Height:  |  Size: 280 B

View File

@@ -1,30 +0,0 @@
apiVersion: v1
kind: LLMAPIRequester
metadata:
name: mimo-chat-completions
label:
en_US: Xiaomi MiMo
zh_Hans: 小米 MiMo
icon: mimo.svg
spec:
litellm_provider: openai
config:
- name: base_url
label:
en_US: Base URL
zh_Hans: 基础 URL
type: string
required: true
default: https://api.xiaomimimo.com/v1
- name: timeout
label:
en_US: Timeout
zh_Hans: 超时时间
type: integer
required: true
default: 120
support_type:
- llm
- text-embedding
- rerank
provider_category: manufacturer

View File

@@ -1,4 +0,0 @@
<svg width="60" height="50" viewBox="0 0 60 50" xmlns="http://www.w3.org/2000/svg">
<rect width="60" height="50" rx="8" fill="#4F46E5"/>
<text x="30" y="32" font-family="Arial, sans-serif" font-size="12" font-weight="bold" fill="white" text-anchor="middle">MiniMax</text>
</svg>

Before

Width:  |  Height:  |  Size: 283 B

View File

@@ -1,30 +0,0 @@
apiVersion: v1
kind: LLMAPIRequester
metadata:
name: minimax-chat-completions
label:
en_US: MiniMax
zh_Hans: MiniMax
icon: minimax.svg
spec:
litellm_provider: openai
config:
- name: base_url
label:
en_US: Base URL
zh_Hans: 基础 URL
type: string
required: true
default: https://api.minimax.chat/v1
- name: timeout
label:
en_US: Timeout
zh_Hans: 超时时间
type: integer
required: true
default: 120
support_type:
- llm
- text-embedding
- rerank
provider_category: manufacturer

View File

@@ -1,5 +0,0 @@
<svg width="60" height="50" viewBox="0 0 60 50" xmlns="http://www.w3.org/2000/svg">
<rect width="60" height="50" rx="8" fill="#FF6B35"/>
<text x="30" y="28" font-family="Arial, sans-serif" font-size="10" font-weight="bold" fill="white" text-anchor="middle">Mistral</text>
<text x="30" y="40" font-family="Arial, sans-serif" font-size="8" fill="white" text-anchor="middle">AI</text>
</svg>

Before

Width:  |  Height:  |  Size: 395 B

View File

@@ -1,30 +0,0 @@
apiVersion: v1
kind: LLMAPIRequester
metadata:
name: mistral-chat-completions
label:
en_US: Mistral AI
zh_Hans: Mistral AI
icon: mistral.svg
spec:
litellm_provider: mistral
config:
- name: base_url
label:
en_US: Base URL
zh_Hans: 基础 URL
type: string
required: true
default: https://api.mistral.ai/v1
- name: timeout
label:
en_US: Timeout
zh_Hans: 超时时间
type: integer
required: true
default: 120
support_type:
- llm
- text-embedding
- rerank
provider_category: manufacturer

View File

@@ -0,0 +1,561 @@
from __future__ import annotations
import asyncio
import typing
import openai
import openai.types.chat.chat_completion as chat_completion
import httpx
from .. import entities, errors, requester
import langbot_plugin.api.entities.builtin.resource.tool as resource_tool
import langbot_plugin.api.entities.builtin.pipeline.query as pipeline_query
import langbot_plugin.api.entities.builtin.provider.message as provider_message
class ModelScopeChatCompletions(requester.ProviderAPIRequester):
"""ModelScope ChatCompletion API 请求器"""
client: openai.AsyncClient
default_config: dict[str, typing.Any] = {
'base_url': 'https://api-inference.modelscope.cn/v1',
'timeout': 120,
}
async def initialize(self):
self.client = openai.AsyncClient(
api_key=self.init_api_key,
base_url=self.requester_cfg['base_url'],
timeout=self.requester_cfg['timeout'],
http_client=httpx.AsyncClient(trust_env=True, timeout=self.requester_cfg['timeout']),
)
def _mask_api_key(self, api_key: str | None) -> str:
if not api_key:
return ''
if len(api_key) <= 8:
return '****'
return f'{api_key[:4]}...{api_key[-4:]}'
def _infer_model_type(self, model_id: str) -> str:
normalized_model_id = (model_id or '').lower()
embedding_keywords = (
'embedding',
'embed',
'bge-',
'e5-',
'm3e',
'gte-',
'multilingual-e5',
'text-embedding',
)
return 'embedding' if any(keyword in normalized_model_id for keyword in embedding_keywords) else 'llm'
def _infer_model_abilities(self, item: dict[str, typing.Any], model_id: str) -> list[str]:
normalized_model_id = (model_id or '').lower()
abilities: set[str] = set()
def _flatten(value: typing.Any) -> list[str]:
if value is None:
return []
if isinstance(value, str):
return [value.lower()]
if isinstance(value, dict):
flattened: list[str] = []
for nested_value in value.values():
flattened.extend(_flatten(nested_value))
return flattened
if isinstance(value, (list, tuple, set)):
flattened: list[str] = []
for nested_value in value:
flattened.extend(_flatten(nested_value))
return flattened
return [str(value).lower()]
capability_tokens = _flatten(item.get('capabilities'))
capability_tokens.extend(_flatten(item.get('modalities')))
capability_tokens.extend(_flatten(item.get('input_modalities')))
capability_tokens.extend(_flatten(item.get('output_modalities')))
capability_tokens.extend(_flatten(item.get('supported_generation_methods')))
capability_tokens.extend(_flatten(item.get('supported_parameters')))
capability_tokens.extend(_flatten(item.get('architecture')))
combined_tokens = capability_tokens + [normalized_model_id]
vision_keywords = ('vision', 'image', 'file', 'video', 'multimodal', 'vl', 'ocr', 'omni')
function_call_keywords = ('function', 'tool', 'tools', 'tool_choice', 'tool_call', 'tool-use', 'tool_use')
if any(any(keyword in token for keyword in vision_keywords) for token in combined_tokens):
abilities.add('vision')
if any(any(keyword in token for keyword in function_call_keywords) for token in combined_tokens):
abilities.add('func_call')
return sorted(abilities)
def _normalize_modalities(self, value: typing.Any) -> list[str]:
normalized: list[str] = []
def _collect(item: typing.Any):
if item is None:
return
if isinstance(item, str):
for part in item.replace('->', ',').replace('+', ',').split(','):
token = part.strip().lower()
if token and token not in normalized:
normalized.append(token)
return
if isinstance(item, dict):
for nested in item.values():
_collect(nested)
return
if isinstance(item, (list, tuple, set)):
for nested in item:
_collect(nested)
return
_collect(value)
return normalized
def _extract_scan_metadata(self, item: dict[str, typing.Any], model_id: str) -> dict[str, typing.Any]:
display_name = item.get('name')
if not isinstance(display_name, str) or not display_name.strip() or display_name == model_id:
display_name = ''
description = item.get('description')
if not isinstance(description, str) or not description.strip():
description = ''
context_length = item.get('context_length')
if context_length is None and isinstance(item.get('top_provider'), dict):
context_length = item['top_provider'].get('context_length')
if not isinstance(context_length, int):
try:
context_length = int(context_length) if context_length is not None else None
except (TypeError, ValueError):
context_length = None
input_modalities = self._normalize_modalities(item.get('input_modalities'))
output_modalities = self._normalize_modalities(item.get('output_modalities'))
if isinstance(item.get('architecture'), dict):
if not input_modalities:
input_modalities = self._normalize_modalities(item['architecture'].get('input_modalities'))
if not output_modalities:
output_modalities = self._normalize_modalities(item['architecture'].get('output_modalities'))
owned_by = item.get('owned_by')
if not isinstance(owned_by, str) or not owned_by.strip():
owned_by = ''
return {
'display_name': display_name or None,
'description': description or None,
'context_length': context_length,
'owned_by': owned_by or None,
'input_modalities': input_modalities,
'output_modalities': output_modalities,
}
async def scan_models(self, api_key: str | None = None) -> dict[str, typing.Any]:
headers = {}
if api_key:
headers['Authorization'] = f'Bearer {api_key}'
models_url = f'{self.requester_cfg["base_url"].rstrip("/")}/models'
async with httpx.AsyncClient(trust_env=True, timeout=self.requester_cfg['timeout']) as client:
response = await client.get(models_url, headers=headers)
response.raise_for_status()
payload = response.json()
models = []
for item in payload.get('data', []):
model_id = item.get('id')
if not model_id:
continue
models.append(
{
'id': model_id,
'name': model_id,
'type': self._infer_model_type(model_id),
'abilities': self._infer_model_abilities(item, model_id),
**self._extract_scan_metadata(item, model_id),
}
)
models.sort(key=lambda item: (item['type'] != 'llm', item['name'].lower()))
return {
'models': models,
'debug': {
'request': {
'method': 'GET',
'url': models_url,
'headers': {
'Authorization': f'Bearer {self._mask_api_key(api_key)}' if api_key else '',
},
},
'response': payload,
},
}
async def _req(
self,
query: pipeline_query.Query,
args: dict,
extra_body: dict = {},
remove_think: bool = False,
) -> list[dict[str, typing.Any]]:
args['stream'] = True
chunk = None
pending_content = ''
tool_calls = []
resp_gen: openai.AsyncStream = await self.client.chat.completions.create(**args, extra_body=extra_body)
chunk_idx = 0
thinking_started = False
thinking_ended = False
tool_id = ''
tool_name = ''
message_delta = {}
async for chunk in resp_gen:
if not chunk or not chunk.id or not chunk.choices or not chunk.choices[0] or not chunk.choices[0].delta:
continue
delta = chunk.choices[0].delta.model_dump() if hasattr(chunk.choices[0], 'delta') else {}
reasoning_content = delta.get('reasoning_content')
# 处理 reasoning_content
if reasoning_content:
# accumulated_reasoning += reasoning_content
# 如果设置了 remove_think跳过 reasoning_content
if remove_think:
chunk_idx += 1
continue
# 第一次出现 reasoning_content添加 <think> 开始标签
if not thinking_started:
thinking_started = True
pending_content += '<think>\n' + reasoning_content
else:
# 继续输出 reasoning_content
pending_content += reasoning_content
elif thinking_started and not thinking_ended and delta.get('content'):
# reasoning_content 结束normal content 开始,添加 </think> 结束标签
thinking_ended = True
pending_content += '\n</think>\n' + delta.get('content')
if delta.get('content') is not None:
pending_content += delta.get('content')
if delta.get('tool_calls') is not None:
for tool_call in delta.get('tool_calls'):
if tool_call['id'] != '':
tool_id = tool_call['id']
if tool_call['function']['name'] is not None:
tool_name = tool_call['function']['name']
if tool_call['function']['arguments'] is None:
continue
tool_call['id'] = tool_id
tool_call['name'] = tool_name
for tc in tool_calls:
if tc['index'] == tool_call['index']:
tc['function']['arguments'] += tool_call['function']['arguments']
break
else:
tool_calls.append(tool_call)
if chunk.choices[0].finish_reason is not None:
break
message_delta['content'] = pending_content
message_delta['role'] = 'assistant'
message_delta['tool_calls'] = tool_calls if tool_calls else None
return [message_delta]
async def _make_msg(
self,
chat_completion: list[dict[str, typing.Any]],
) -> provider_message.Message:
chatcmpl_message = chat_completion[0]
# 确保 role 字段存在且不为 None
if 'role' not in chatcmpl_message or chatcmpl_message['role'] is None:
chatcmpl_message['role'] = 'assistant'
message = provider_message.Message(**chatcmpl_message)
return message
async def _closure(
self,
query: pipeline_query.Query,
req_messages: list[dict],
use_model: requester.RuntimeLLMModel,
use_funcs: list[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
remove_think: bool = False,
) -> tuple[provider_message.Message, dict]:
self.client.api_key = use_model.provider.token_mgr.get_token()
args = {}
args['model'] = use_model.model_entity.name
if use_funcs:
tools = await self.ap.tool_mgr.generate_tools_for_openai(use_funcs)
if tools:
args['tools'] = tools
# 设置此次请求中的messages
messages = req_messages.copy()
# 检查vision
for msg in messages:
if 'content' in msg and isinstance(msg['content'], list):
for me in msg['content']:
if me['type'] == 'image_base64':
me['image_url'] = {'url': me['image_base64']}
me['type'] = 'image_url'
del me['image_base64']
args['messages'] = messages
# 发送请求
resp = await self._req(query, args, extra_body=extra_args, remove_think=remove_think)
# 处理请求结果
message = await self._make_msg(resp)
# ModelScope uses streaming, usage info not available
usage_info = {}
return message, usage_info
async def _req_stream(
self,
args: dict,
extra_body: dict = {},
) -> chat_completion.ChatCompletion:
async for chunk in await self.client.chat.completions.create(**args, extra_body=extra_body):
yield chunk
async def _closure_stream(
self,
query: pipeline_query.Query,
req_messages: list[dict],
use_model: requester.RuntimeLLMModel,
use_funcs: list[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
remove_think: bool = False,
) -> provider_message.Message | typing.AsyncGenerator[provider_message.MessageChunk, None]:
self.client.api_key = use_model.provider.token_mgr.get_token()
args = {}
args['model'] = use_model.model_entity.name
if use_funcs:
tools = await self.ap.tool_mgr.generate_tools_for_openai(use_funcs)
if tools:
args['tools'] = tools
# 设置此次请求中的messages
messages = req_messages.copy()
# 检查vision
for msg in messages:
if 'content' in msg and isinstance(msg['content'], list):
for me in msg['content']:
if me['type'] == 'image_base64':
me['image_url'] = {'url': me['image_base64']}
me['type'] = 'image_url'
del me['image_base64']
args['messages'] = messages
args['stream'] = True
# 流式处理状态
# tool_calls_map: dict[str, provider_message.ToolCall] = {}
chunk_idx = 0
thinking_started = False
thinking_ended = False
role = 'assistant' # 默认角色
# accumulated_reasoning = '' # 仅用于判断何时结束思维链
async for chunk in self._req_stream(args, extra_body=extra_args):
# 解析 chunk 数据
if hasattr(chunk, 'choices') and chunk.choices:
choice = chunk.choices[0]
delta = choice.delta.model_dump() if hasattr(choice, 'delta') else {}
finish_reason = getattr(choice, 'finish_reason', None)
else:
delta = {}
finish_reason = None
# 从第一个 chunk 获取 role后续使用这个 role
if 'role' in delta and delta['role']:
role = delta['role']
# 获取增量内容
delta_content = delta.get('content', '')
reasoning_content = delta.get('reasoning_content', '')
# 处理 reasoning_content
if reasoning_content:
# accumulated_reasoning += reasoning_content
# 如果设置了 remove_think跳过 reasoning_content
if remove_think:
chunk_idx += 1
continue
# 第一次出现 reasoning_content添加 <think> 开始标签
if not thinking_started:
thinking_started = True
delta_content = '<think>\n' + reasoning_content
else:
# 继续输出 reasoning_content
delta_content = reasoning_content
elif thinking_started and not thinking_ended and delta_content:
# reasoning_content 结束normal content 开始,添加 </think> 结束标签
thinking_ended = True
delta_content = '\n</think>\n' + delta_content
# 处理 content 中已有的 <think> 标签(如果需要移除)
# if delta_content and remove_think and '<think>' in delta_content:
# import re
#
# # 移除 <think> 标签及其内容
# delta_content = re.sub(r'<think>.*?</think>', '', delta_content, flags=re.DOTALL)
# 处理工具调用增量
if delta.get('tool_calls'):
for tool_call in delta['tool_calls']:
if tool_call['id'] != '':
tool_id = tool_call['id']
if tool_call['function']['name'] is not None:
tool_name = tool_call['function']['name']
if tool_call['type'] is None:
tool_call['type'] = 'function'
tool_call['id'] = tool_id
tool_call['function']['name'] = tool_name
tool_call['function']['arguments'] = (
'' if tool_call['function']['arguments'] is None else tool_call['function']['arguments']
)
# 跳过空的第一个 chunk只有 role 没有内容)
if chunk_idx == 0 and not delta_content and not reasoning_content and not delta.get('tool_calls'):
chunk_idx += 1
continue
# 构建 MessageChunk - 只包含增量内容
chunk_data = {
'role': role,
'content': delta_content if delta_content else None,
'tool_calls': delta.get('tool_calls'),
'is_final': bool(finish_reason),
}
# 移除 None 值
chunk_data = {k: v for k, v in chunk_data.items() if v is not None}
yield provider_message.MessageChunk(**chunk_data)
chunk_idx += 1
# return
async def invoke_llm(
self,
query: pipeline_query.Query,
model: entities.LLMModelInfo,
messages: typing.List[provider_message.Message],
funcs: typing.List[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
remove_think: bool = False,
) -> provider_message.Message:
req_messages = [] # req_messages 仅用于类内,外部同步由 query.messages 进行
for m in messages:
msg_dict = m.dict(exclude_none=True)
content = msg_dict.get('content')
if isinstance(content, list):
# 检查 content 列表中是否每个部分都是文本
if all(isinstance(part, dict) and part.get('type') == 'text' for part in content):
# 将所有文本部分合并为一个字符串
msg_dict['content'] = '\n'.join(part['text'] for part in content)
req_messages.append(msg_dict)
try:
return await self._closure(
query=query,
req_messages=req_messages,
use_model=model,
use_funcs=funcs,
extra_args=extra_args,
remove_think=remove_think,
)
except asyncio.TimeoutError:
raise errors.RequesterError('请求超时')
except openai.BadRequestError as e:
if 'context_length_exceeded' in e.message:
raise errors.RequesterError(f'上文过长,请重置会话: {e.message}')
else:
raise errors.RequesterError(f'请求参数错误: {e.message}')
except openai.AuthenticationError as e:
raise errors.RequesterError(f'无效的 api-key: {e.message}')
except openai.NotFoundError as e:
raise errors.RequesterError(f'请求路径错误: {e.message}')
except openai.RateLimitError as e:
raise errors.RequesterError(f'请求过于频繁或余额不足: {e.message}')
except openai.APIError as e:
raise errors.RequesterError(f'请求错误: {e.message}')
async def invoke_llm_stream(
self,
query: pipeline_query.Query,
model: requester.RuntimeLLMModel,
messages: typing.List[provider_message.Message],
funcs: typing.List[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
remove_think: bool = False,
) -> provider_message.MessageChunk:
req_messages = [] # req_messages 仅用于类内,外部同步由 query.messages 进行
for m in messages:
msg_dict = m.dict(exclude_none=True)
content = msg_dict.get('content')
if isinstance(content, list):
# 检查 content 列表中是否每个部分都是文本
if all(isinstance(part, dict) and part.get('type') == 'text' for part in content):
# 将所有文本部分合并为一个字符串
msg_dict['content'] = '\n'.join(part['text'] for part in content)
req_messages.append(msg_dict)
try:
async for item in self._closure_stream(
query=query,
req_messages=req_messages,
use_model=model,
use_funcs=funcs,
extra_args=extra_args,
remove_think=remove_think,
):
yield item
except asyncio.TimeoutError:
raise errors.RequesterError('请求超时')
except openai.BadRequestError as e:
if 'context_length_exceeded' in e.message:
raise errors.RequesterError(f'上文过长,请重置会话: {e.message}')
else:
raise errors.RequesterError(f'请求参数错误: {e.message}')
except openai.AuthenticationError as e:
raise errors.RequesterError(f'无效的 api-key: {e.message}')
except openai.NotFoundError as e:
raise errors.RequesterError(f'请求路径错误: {e.message}')
except openai.RateLimitError as e:
raise errors.RequesterError(f'请求过于频繁或余额不足: {e.message}')
except openai.APIError as e:
raise errors.RequesterError(f'请求错误: {e.message}')

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: 魔搭社区
icon: modelscope.svg
spec:
litellm_provider: openai
config:
- name: base_url
label:
@@ -32,8 +31,6 @@ spec:
default: 120
support_type:
- llm
- text-embedding
- rerank
provider_category: maas
execution:
python:

View File

@@ -0,0 +1,67 @@
from __future__ import annotations
import typing
from . import chatcmpl
from .. import requester
import langbot_plugin.api.entities.builtin.resource.tool as resource_tool
import langbot_plugin.api.entities.builtin.pipeline.query as pipeline_query
import langbot_plugin.api.entities.builtin.provider.message as provider_message
class MoonshotChatCompletions(chatcmpl.OpenAIChatCompletions):
"""Moonshot ChatCompletion API 请求器"""
default_config: dict[str, typing.Any] = {
'base_url': 'https://api.moonshot.cn/v1',
'timeout': 120,
}
async def _closure(
self,
query: pipeline_query.Query,
req_messages: list[dict],
use_model: requester.RuntimeLLMModel,
use_funcs: list[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
remove_think: bool = False,
) -> tuple[provider_message.Message, dict]:
self.client.api_key = use_model.provider.token_mgr.get_token()
args = {}
args['model'] = use_model.model_entity.name
if use_funcs:
tools = await self.ap.tool_mgr.generate_tools_for_openai(use_funcs)
if tools:
args['tools'] = tools
# 设置此次请求中的messages
messages = req_messages
# deepseek 不支持多模态把content都转换成纯文字
for m in messages:
if 'content' in m and isinstance(m['content'], list):
m['content'] = ' '.join([c['text'] for c in m['content']])
# 删除空的,不知道干嘛的,直接删了。
# messages = [m for m in messages if m["content"].strip() != "" and ('tool_calls' not in m or not m['tool_calls'])]
args['messages'] = messages
# 发送请求
resp = await self._req(args, extra_body=extra_args)
# 处理请求结果
message = await self._make_msg(resp, remove_think)
# Extract token usage from response
usage_info = {}
if hasattr(resp, 'usage') and resp.usage:
usage_info['input_tokens'] = resp.usage.prompt_tokens or 0
usage_info['output_tokens'] = resp.usage.completion_tokens or 0
usage_info['total_tokens'] = resp.usage.total_tokens or 0
return message, usage_info

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: 月之暗面
icon: moonshot.png
spec:
litellm_provider: openai
config:
- name: base_url
label:
@@ -25,8 +24,6 @@ spec:
default: 120
support_type:
- llm
- text-embedding
- rerank
provider_category: manufacturer
execution:
python:

View File

@@ -0,0 +1,17 @@
from __future__ import annotations
import typing
import openai
from . import chatcmpl
class NewAPIChatCompletions(chatcmpl.OpenAIChatCompletions):
"""New API ChatCompletion API 请求器"""
client: openai.AsyncClient
default_config: dict[str, typing.Any] = {
'base_url': 'http://localhost:3000/v1',
'timeout': 120,
}

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: New API
icon: newapi.png
spec:
litellm_provider: openai
config:
- name: base_url
label:

View File

@@ -0,0 +1,314 @@
from __future__ import annotations
import asyncio
import os
import typing
from typing import Union, Mapping, Any, AsyncIterator
import uuid
import json
import ollama
import httpx
from .. import errors, requester
import langbot_plugin.api.entities.builtin.resource.tool as resource_tool
import langbot_plugin.api.entities.builtin.pipeline.query as pipeline_query
import langbot_plugin.api.entities.builtin.provider.message as provider_message
REQUESTER_NAME: str = 'ollama-chat'
class OllamaChatCompletions(requester.ProviderAPIRequester):
"""Ollama平台 ChatCompletion API请求器"""
client: ollama.AsyncClient
default_config: dict[str, typing.Any] = {
'base_url': 'http://127.0.0.1:11434',
'timeout': 120,
}
async def initialize(self):
os.environ['OLLAMA_HOST'] = self.requester_cfg['base_url']
self.client = ollama.AsyncClient(timeout=self.requester_cfg['timeout'])
def _infer_model_type(self, model_id: str) -> str:
normalized_model_id = (model_id or '').lower()
embedding_keywords = ('embedding', 'embed', 'bge-', 'e5-', 'm3e', 'gte-', 'text-embedding')
return 'embedding' if any(keyword in normalized_model_id for keyword in embedding_keywords) else 'llm'
def _infer_model_abilities(self, item: dict[str, typing.Any], model_id: str) -> list[str]:
normalized_model_id = (model_id or '').lower()
abilities: set[str] = set()
details = item.get('details', {}) or {}
families = details.get('families', []) or []
tokens = [normalized_model_id, str(details.get('family', '')).lower()]
tokens.extend(str(family).lower() for family in families)
if any(keyword in token for token in tokens for keyword in ('vision', 'vl', 'omni', 'llava', 'ocr')):
abilities.add('vision')
if any(keyword in token for token in tokens for keyword in ('tool', 'function')):
abilities.add('func_call')
return sorted(abilities)
async def scan_models(self, api_key: str | None = None) -> dict[str, typing.Any]:
del api_key
models_url = f'{self.requester_cfg["base_url"].rstrip("/")}/api/tags'
async with httpx.AsyncClient(trust_env=True, timeout=self.requester_cfg['timeout']) as client:
response = await client.get(models_url)
response.raise_for_status()
payload = response.json()
models: list[dict[str, typing.Any]] = []
for item in payload.get('models', []):
model_id = item.get('model') or item.get('name')
if not model_id:
continue
models.append(
{
'id': model_id,
'name': item.get('name', model_id),
'type': self._infer_model_type(model_id),
'abilities': self._infer_model_abilities(item, model_id),
}
)
models.sort(key=lambda item: (item['type'] != 'llm', item['name'].lower()))
return {
'models': models,
'debug': {
'request': {
'method': 'GET',
'url': models_url,
},
'response': payload,
},
}
async def _req(
self,
args: dict,
) -> Union[Mapping[str, Any], AsyncIterator[Mapping[str, Any]]]:
return await self.client.chat(**args)
async def _closure(
self,
query: pipeline_query.Query,
req_messages: list[dict],
use_model: requester.RuntimeLLMModel,
use_funcs: list[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
remove_think: bool = False,
) -> provider_message.Message:
args = extra_args.copy()
args['model'] = use_model.model_entity.name
messages: list[dict] = req_messages.copy()
for msg in messages:
if 'content' in msg and isinstance(msg['content'], list):
text_content: list = []
image_urls: list = []
for me in msg['content']:
if me['type'] == 'text':
text_content.append(me['text'])
elif me['type'] == 'image_base64':
image_urls.append(me['image_base64'])
msg['content'] = '\n'.join(text_content)
msg['images'] = [url.split(',')[1] for url in image_urls]
if 'tool_calls' in msg: # LangBot 内部以 str 存储 tool_calls 的参数,这里需要转换为 dict
for tool_call in msg['tool_calls']:
tool_call['function']['arguments'] = json.loads(tool_call['function']['arguments'])
args['messages'] = messages
args['tools'] = []
if use_funcs:
tools = await self.ap.tool_mgr.generate_tools_for_openai(use_funcs)
if tools:
args['tools'] = tools
resp = await self._req(args)
message: provider_message.Message = await self._make_msg(resp)
return message
async def _make_msg(self, chat_completions: ollama.ChatResponse) -> provider_message.Message:
message: ollama.Message = chat_completions.message
if message is None:
raise ValueError("chat_completions must contain a 'message' field")
ret_msg: provider_message.Message = None
if message.content is not None:
ret_msg = provider_message.Message(role='assistant', content=message.content)
if message.tool_calls is not None and len(message.tool_calls) > 0:
tool_calls: list[provider_message.ToolCall] = []
for tool_call in message.tool_calls:
tool_calls.append(
provider_message.ToolCall(
id=uuid.uuid4().hex,
type='function',
function=provider_message.FunctionCall(
name=tool_call.function.name,
arguments=json.dumps(tool_call.function.arguments),
),
)
)
ret_msg.tool_calls = tool_calls
return ret_msg
async def _prepare_messages(
self,
messages: typing.List[provider_message.Message],
) -> list[dict]:
"""Prepare messages for Ollama API request."""
req_messages: list = []
for m in messages:
msg_dict: dict = m.dict(exclude_none=True)
content: Any = msg_dict.get('content')
if isinstance(content, list):
if all(isinstance(part, dict) and part.get('type') == 'text' for part in content):
msg_dict['content'] = '\n'.join(part['text'] for part in content)
req_messages.append(msg_dict)
return req_messages
async def invoke_llm(
self,
query: pipeline_query.Query,
model: requester.RuntimeLLMModel,
messages: typing.List[provider_message.Message],
funcs: typing.List[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
remove_think: bool = False,
) -> provider_message.Message:
req_messages = await self._prepare_messages(messages)
try:
return await self._closure(
query=query,
req_messages=req_messages,
use_model=model,
use_funcs=funcs,
extra_args=extra_args,
remove_think=remove_think,
)
except asyncio.TimeoutError:
raise errors.RequesterError('请求超时')
async def invoke_llm_stream(
self,
query: pipeline_query.Query,
model: requester.RuntimeLLMModel,
messages: typing.List[provider_message.Message],
funcs: typing.List[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
remove_think: bool = False,
) -> provider_message.MessageChunk:
req_messages = await self._prepare_messages(messages)
try:
args = extra_args.copy()
args['model'] = model.model_entity.name
# Process messages for Ollama format
msgs: list[dict] = req_messages.copy()
for msg in msgs:
if 'content' in msg and isinstance(msg['content'], list):
text_content: list = []
image_urls: list = []
for me in msg['content']:
if me['type'] == 'text':
text_content.append(me['text'])
elif me['type'] == 'image_base64':
image_urls.append(me['image_base64'])
msg['content'] = '\n'.join(text_content)
msg['images'] = [url.split(',')[1] for url in image_urls]
if 'tool_calls' in msg:
for tool_call in msg['tool_calls']:
tool_call['function']['arguments'] = json.loads(tool_call['function']['arguments'])
args['messages'] = msgs
args['tools'] = []
if funcs:
tools = await self.ap.tool_mgr.generate_tools_for_openai(funcs)
if tools:
args['tools'] = tools
args['stream'] = True
chunk_idx = 0
thinking_started = False
thinking_ended = False
role = 'assistant'
async for chunk in await self.client.chat(**args):
message: ollama.Message = chunk.message
done = chunk.done
delta_content = message.content or ''
reasoning_content = getattr(message, 'thinking', '') or ''
# Handle reasoning/thinking content
if reasoning_content:
if remove_think:
chunk_idx += 1
continue
if not thinking_started:
thinking_started = True
delta_content = '<think>\n' + reasoning_content
else:
delta_content = reasoning_content
elif thinking_started and not thinking_ended and delta_content:
thinking_ended = True
delta_content = '\n</think>\n' + delta_content
# Handle tool calls
tool_calls_data = None
if message.tool_calls:
tool_calls_data = []
for tc in message.tool_calls:
tool_calls_data.append(
{
'id': uuid.uuid4().hex,
'type': 'function',
'function': {
'name': tc.function.name,
'arguments': json.dumps(tc.function.arguments),
},
}
)
# Skip empty first chunk
if chunk_idx == 0 and not delta_content and not reasoning_content and not tool_calls_data:
chunk_idx += 1
continue
chunk_data = {
'role': role,
'content': delta_content if delta_content else None,
'tool_calls': tool_calls_data,
'is_final': bool(done),
}
chunk_data = {k: v for k, v in chunk_data.items() if v is not None}
yield provider_message.MessageChunk(**chunk_data)
chunk_idx += 1
except asyncio.TimeoutError:
raise errors.RequesterError('请求超时')
async def invoke_embedding(
self,
model: requester.RuntimeEmbeddingModel,
input_text: list[str],
extra_args: dict[str, typing.Any] = {},
) -> list[list[float]]:
return (
await self.client.embed(
model=model.model_entity.name,
input=input_text,
**extra_args,
)
).embeddings

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: Ollama
icon: ollama.svg
spec:
litellm_provider: ollama
config:
- name: base_url
label:

View File

@@ -0,0 +1,25 @@
from __future__ import annotations
import typing
import openai
from . import modelscopechatcmpl
class OpenRouterChatCompletions(modelscopechatcmpl.ModelScopeChatCompletions):
"""OpenRouter ChatCompletion API 请求器"""
client: openai.AsyncClient
default_config: dict[str, typing.Any] = {
'base_url': 'https://openrouter.ai/api/v1',
'timeout': 120,
}
async def scan_models(self, api_key: str | None = None) -> dict[str, typing.Any]:
original_base_url = self.requester_cfg.get('base_url', '')
self.requester_cfg['base_url'] = 'https://openrouter.ai/api/v1'
try:
return await super().scan_models(api_key)
finally:
self.requester_cfg['base_url'] = original_base_url

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: OpenRouter
icon: openrouter.svg
spec:
litellm_provider: openai
config:
- name: base_url
label:

View File

@@ -0,0 +1,208 @@
from __future__ import annotations
import openai
import typing
from . import chatcmpl
from .. import requester
import openai.types.chat.chat_completion as chat_completion
import re
import langbot_plugin.api.entities.builtin.provider.message as provider_message
import langbot_plugin.api.entities.builtin.pipeline.query as pipeline_query
import langbot_plugin.api.entities.builtin.resource.tool as resource_tool
class PPIOChatCompletions(chatcmpl.OpenAIChatCompletions):
"""欧派云 ChatCompletion API 请求器"""
client: openai.AsyncClient
default_config: dict[str, typing.Any] = {
'base_url': 'https://api.ppinfra.com/v3/openai',
'timeout': 120,
}
is_think: bool = False
async def _make_msg(
self,
chat_completion: chat_completion.ChatCompletion,
remove_think: bool,
) -> provider_message.Message:
chatcmpl_message = chat_completion.choices[0].message.model_dump()
# print(chatcmpl_message.keys(), chatcmpl_message.values())
# 确保 role 字段存在且不为 None
if 'role' not in chatcmpl_message or chatcmpl_message['role'] is None:
chatcmpl_message['role'] = 'assistant'
reasoning_content = chatcmpl_message['reasoning_content'] if 'reasoning_content' in chatcmpl_message else None
# deepseek的reasoner模型
chatcmpl_message['content'] = await self._process_thinking_content(
chatcmpl_message['content'], reasoning_content, remove_think
)
# 移除 reasoning_content 字段,避免传递给 Message
if 'reasoning_content' in chatcmpl_message:
del chatcmpl_message['reasoning_content']
message = provider_message.Message(**chatcmpl_message)
return message
async def _process_thinking_content(
self,
content: str,
reasoning_content: str = None,
remove_think: bool = False,
) -> tuple[str, str]:
"""处理思维链内容
Args:
content: 原始内容
reasoning_content: reasoning_content 字段内容
remove_think: 是否移除思维链
Returns:
处理后的内容
"""
if remove_think:
content = re.sub(r'<think>.*?</think>', '', content, flags=re.DOTALL)
else:
if reasoning_content is not None:
content = '<think>\n' + reasoning_content + '\n</think>\n' + content
return content
async def _make_msg_chunk(
self,
delta: dict[str, typing.Any],
idx: int,
) -> provider_message.MessageChunk:
# 处理流式chunk和完整响应的差异
# print(chat_completion.choices[0])
# 确保 role 字段存在且不为 None
if 'role' not in delta or delta['role'] is None:
delta['role'] = 'assistant'
reasoning_content = delta['reasoning_content'] if 'reasoning_content' in delta else None
delta['content'] = '' if delta['content'] is None else delta['content']
# print(reasoning_content)
# deepseek的reasoner模型
if reasoning_content is not None:
delta['content'] += reasoning_content
message = provider_message.MessageChunk(**delta)
return message
async def _closure_stream(
self,
query: pipeline_query.Query,
req_messages: list[dict],
use_model: requester.RuntimeLLMModel,
use_funcs: list[resource_tool.LLMTool] = None,
extra_args: dict[str, typing.Any] = {},
remove_think: bool = False,
) -> provider_message.Message | typing.AsyncGenerator[provider_message.MessageChunk, None]:
self.client.api_key = use_model.provider.token_mgr.get_token()
args = {}
args['model'] = use_model.model_entity.name
if use_funcs:
tools = await self.ap.tool_mgr.generate_tools_for_openai(use_funcs)
if tools:
args['tools'] = tools
# 设置此次请求中的messages
messages = req_messages.copy()
# 检查vision
for msg in messages:
if 'content' in msg and isinstance(msg['content'], list):
for me in msg['content']:
if me['type'] == 'image_base64':
me['image_url'] = {'url': me['image_base64']}
me['type'] = 'image_url'
del me['image_base64']
args['messages'] = messages
args['stream'] = True
# tool_calls_map: dict[str, provider_message.ToolCall] = {}
chunk_idx = 0
thinking_started = False
thinking_ended = False
role = 'assistant' # 默认角色
async for chunk in self._req_stream(args, extra_body=extra_args):
# 解析 chunk 数据
if hasattr(chunk, 'choices') and chunk.choices:
choice = chunk.choices[0]
delta = choice.delta.model_dump() if hasattr(choice, 'delta') else {}
finish_reason = getattr(choice, 'finish_reason', None)
else:
delta = {}
finish_reason = None
# 从第一个 chunk 获取 role后续使用这个 role
if 'role' in delta and delta['role']:
role = delta['role']
# 获取增量内容
delta_content = delta.get('content', '')
# reasoning_content = delta.get('reasoning_content', '')
if remove_think:
if delta['content'] is not None:
if '<think>' in delta['content'] and not thinking_started and not thinking_ended:
thinking_started = True
continue
elif delta['content'] == r'</think>' and not thinking_ended:
thinking_ended = True
continue
elif thinking_ended and delta['content'] == '\n\n' and thinking_started:
thinking_started = False
continue
elif thinking_started and not thinking_ended:
continue
# delta_tool_calls = None
if delta.get('tool_calls'):
for tool_call in delta['tool_calls']:
if tool_call['id'] and tool_call['function']['name']:
tool_id = tool_call['id']
tool_name = tool_call['function']['name']
if tool_call['id'] is None:
tool_call['id'] = tool_id
if tool_call['function']['name'] is None:
tool_call['function']['name'] = tool_name
if tool_call['function']['arguments'] is None:
tool_call['function']['arguments'] = ''
if tool_call['type'] is None:
tool_call['type'] = 'function'
# 跳过空的第一个 chunk只有 role 没有内容)
if chunk_idx == 0 and not delta_content and not delta.get('tool_calls'):
chunk_idx += 1
continue
# 构建 MessageChunk - 只包含增量内容
chunk_data = {
'role': role,
'content': delta_content if delta_content else None,
'tool_calls': delta.get('tool_calls'),
'is_final': bool(finish_reason),
}
# 移除 None 值
chunk_data = {k: v for k, v in chunk_data.items() if v is not None}
yield provider_message.MessageChunk(**chunk_data)
chunk_idx += 1

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: 派欧云
icon: ppio.svg
spec:
litellm_provider: openai
config:
- name: base_url
label:

View File

@@ -0,0 +1,17 @@
from __future__ import annotations
import openai
import typing
from . import chatcmpl
class QHAIGCChatCompletions(chatcmpl.OpenAIChatCompletions):
"""启航 AI ChatCompletion API 请求器"""
client: openai.AsyncClient
default_config: dict[str, typing.Any] = {
'base_url': 'https://api.qhaigc.com/v1',
'timeout': 120,
}

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: 启航 AI
icon: qhaigc.png
spec:
litellm_provider: openai
config:
- name: base_url
label:

View File

@@ -2,16 +2,19 @@ from __future__ import annotations
import typing
from . import litellmchat
import openai
from . import chatcmpl
class QiniuChatCompletions(litellmchat.LiteLLMRequester):
class QiniuChatCompletions(chatcmpl.OpenAIChatCompletions):
"""七牛云 ChatCompletion API 请求器"""
client: openai.AsyncClient
default_config: dict[str, typing.Any] = {
'base_url': 'https://api.qnaigc.com/v1',
'timeout': 120,
'custom_llm_provider': 'openai',
}
async def scan_models(self, api_key: str | None = None) -> dict[str, typing.Any]:

View File

@@ -0,0 +1,32 @@
from __future__ import annotations
import openai
import typing
from . import chatcmpl
import openai.types.chat.chat_completion as chat_completion
class ShengSuanYunChatCompletions(chatcmpl.OpenAIChatCompletions):
"""胜算云(ModelSpot.AI) ChatCompletion API 请求器"""
client: openai.AsyncClient
default_config: dict[str, typing.Any] = {
'base_url': 'https://router.shengsuanyun.com/api/v1',
'timeout': 120,
}
async def _req(
self,
args: dict,
extra_body: dict = {},
) -> chat_completion.ChatCompletion:
return await self.client.chat.completions.create(
**args,
extra_body=extra_body,
extra_headers={
'HTTP-Referer': 'https://langbot.app',
'X-Title': 'LangBot',
},
)

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: 胜算云
icon: shengsuanyun.svg
spec:
litellm_provider: openai
config:
- name: base_url
label:

View File

@@ -0,0 +1,17 @@
from __future__ import annotations
import typing
import openai
from . import chatcmpl
class SiliconFlowChatCompletions(chatcmpl.OpenAIChatCompletions):
"""SiliconFlow ChatCompletion API 请求器"""
client: openai.AsyncClient
default_config: dict[str, typing.Any] = {
'base_url': 'https://api.siliconflow.cn/v1',
'timeout': 120,
}

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: 硅基流动
icon: siliconflow.svg
spec:
litellm_provider: openai
config:
- name: base_url
label:

View File

@@ -0,0 +1,17 @@
from __future__ import annotations
import typing
import openai
from . import chatcmpl
class LangBotSpaceChatCompletions(chatcmpl.OpenAIChatCompletions):
"""LangBot Space ChatCompletion API 请求器"""
client: openai.AsyncClient
default_config: dict[str, typing.Any] = {
'base_url': 'https://api.langbot.cloud/v1',
'timeout': 120,
}

View File

@@ -7,7 +7,6 @@ metadata:
zh_Hans: Space
icon: space.webp
spec:
litellm_provider: openai
config:
- name: base_url
label:

View File

@@ -1,5 +0,0 @@
<svg width="60" height="50" viewBox="0 0 60 50" xmlns="http://www.w3.org/2000/svg">
<rect width="60" height="50" rx="8" fill="#0052D9"/>
<text x="30" y="28" font-family="Arial, sans-serif" font-size="10" font-weight="bold" fill="white" text-anchor="middle">Tencent</text>
<text x="30" y="40" font-family="Arial, sans-serif" font-size="8" fill="white" text-anchor="middle">Hunyuan</text>
</svg>

Before

Width:  |  Height:  |  Size: 400 B

View File

@@ -1,30 +0,0 @@
apiVersion: v1
kind: LLMAPIRequester
metadata:
name: tencent-chat-completions
label:
en_US: Tencent Hunyuan
zh_Hans: 腾讯混元
icon: tencent.svg
spec:
litellm_provider: openai
config:
- name: base_url
label:
en_US: Base URL
zh_Hans: 基础 URL
type: string
required: true
default: https://hunyuan.tencentcloudapi.com/v1
- name: timeout
label:
en_US: Timeout
zh_Hans: 超时时间
type: integer
required: true
default: 120
support_type:
- llm
- text-embedding
- rerank
provider_category: manufacturer

View File

@@ -1,5 +0,0 @@
<svg width="60" height="50" viewBox="0 0 60 50" xmlns="http://www.w3.org/2000/svg">
<rect width="60" height="50" rx="8" fill="#8B5CF6"/>
<text x="30" y="28" font-family="Arial, sans-serif" font-size="10" font-weight="bold" fill="white" text-anchor="middle">Together</text>
<text x="30" y="40" font-family="Arial, sans-serif" font-size="8" fill="white" text-anchor="middle">AI</text>
</svg>

Before

Width:  |  Height:  |  Size: 396 B

View File

@@ -1,30 +0,0 @@
apiVersion: v1
kind: LLMAPIRequester
metadata:
name: together-chat-completions
label:
en_US: Together AI
zh_Hans: Together AI
icon: together.svg
spec:
litellm_provider: together_ai
config:
- name: base_url
label:
en_US: Base URL
zh_Hans: 基础 URL
type: string
required: true
default: https://api.together.xyz/v1
- name: timeout
label:
en_US: Timeout
zh_Hans: 超时时间
type: integer
required: true
default: 120
support_type:
- llm
- text-embedding
- rerank
provider_category: manufacturer

Some files were not shown because too many files have changed in this diff Show More