mirror of
https://github.com/langbot-app/LangBot.git
synced 2026-06-14 09:46:03 +00:00
* refactor(provider): use LiteLLM as unified LLM requester backend
- Replace 23+ individual requester implementations with unified litellmchat.py
- Add litellm_provider field to 27 YAML manifests for provider routing
- Delete redundant requester subclasses
- Add unit tests for LiteLLMRequester (29 tests)
- Fix num_retries parameter name (was max_retries)
- Fix exception handling order for subclass exceptions
LiteLLM provides a unified API for 100+ providers, eliminating the need
for provider-specific requesters.
* fix: ruff format provider.py
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* refactor(provider): simplify LiteLLM requester usage handling
- Remove unused Anthropic-specific tool schema generation
- Share completion argument construction between normal and streaming calls
- Use LiteLLM/OpenAI native usage fields for monitoring
- Collect stream token usage from LiteLLM stream_options
- Update LiteLLM requester tests for unified usage fields
* restore: restore deleted provider requester files
Restore individual provider requester implementations that were
removed in de61b5d3. These files coexist with the unified
litellmchat.py backend.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* feat: update requesters and improve provider selection UI
- Added `litellm_provider` field to various requesters' YAML configurations.
- Removed obsolete Python requester files for OpenRouter, PPIO, QHAIGC, ShengSuanYun, SiliconFlow, Space, TokenPony, VolcArk, and Xai.
- Introduced new requesters for Tencent and Together AI with corresponding YAML configurations and SVG icons.
- Enhanced the ProviderForm component to include a searchable dropdown for selecting providers, improving user experience.
- Updated localization files to include search provider text for both English and Chinese.
* fix(provider): align litellm rebase with master
* fix(provider): capture streaming token usage; add token observability
The LiteLLM streaming requester only captured usage when a chunk had an
empty `choices` list. Many OpenAI-compatible gateways (e.g. new-api) and
providers send the final usage payload in a chunk that still carries an
empty-delta choice, so streamed calls always recorded 0 tokens in the
monitoring logs/dashboard (non-streaming worked).
- Capture stream usage whenever a chunk carries it, regardless of choices
- Add robust _normalize_usage (dict/obj shapes, derive missing total_tokens)
- Register litellm in bootutils/deps.py (was in pyproject only)
- Add MonitoringService.get_token_statistics + /monitoring/token-statistics
endpoint: summary, per-model breakdown, token timeseries, and a
zero-token-success data-quality signal
- Add TokenMonitoring dashboard tab (summary tiles, stacked token chart,
per-model table) + i18n (en/zh)
- Regression tests for stream usage capture and usage normalization
Verified end-to-end against a real OpenAI-compatible endpoint with
gpt-5.5 and claude-opus-4-8: tokens now recorded non-zero for both
streaming and non-streaming paths.
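
A minimal sketch of the fix described above, assuming chunks expose an optional `usage` attribute. The names `normalize_usage` and `capture_stream_usage` are hypothetical stand-ins, not the actual `_normalize_usage` implementation in litellmchat.py:

```python
from types import SimpleNamespace
from typing import Any


def normalize_usage(usage: Any) -> dict[str, int]:
    """Coerce a dict- or object-shaped usage payload into plain integers."""

    def read(key: str) -> int:
        if isinstance(usage, dict):
            value = usage.get(key)
        else:
            value = getattr(usage, key, None)
        return int(value or 0)

    prompt = read('prompt_tokens')
    completion = read('completion_tokens')
    # Derive the total when the gateway omits it.
    total = read('total_tokens') or prompt + completion
    return {'prompt_tokens': prompt, 'completion_tokens': completion, 'total_tokens': total}


def capture_stream_usage(chunks: list[Any]) -> dict[str, int]:
    """Keep the last usage payload seen, whether or not the chunk has choices."""
    captured = normalize_usage(None)
    for chunk in chunks:
        usage = getattr(chunk, 'usage', None)
        if usage is not None:
            captured = normalize_usage(usage)
    return captured


# Illustrative stream: the final chunk carries usage AND an empty-delta choice.
chunks = [
    SimpleNamespace(choices=[SimpleNamespace(delta={'content': 'hi'})], usage=None),
    SimpleNamespace(choices=[SimpleNamespace(delta={})], usage={'prompt_tokens': 12, 'completion_tokens': 30}),
]
```

The point of the sketch is that the check keys on the presence of usage rather than on `choices` being empty, so the second chunk is counted.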
* refactor(provider): simplify litellm capabilities
* style: simplify wrapped expressions
* feat(models): persist context metadata
* fix(provider): handle dict embeddings and openai-compatible rerank in LiteLLMRequester
- invoke_embedding: support both object- and dict-shaped response.data
entries (OpenAI-compatible gateways like new-api return dicts)
- invoke_rerank: litellm.arerank rejects the 'openai' provider, so for
openai-compatible (or unspecified) providers call the standard
Jina/Cohere-style POST /v1/rerank endpoint directly over HTTP
- accept both 'relevance_score' and 'score' fields in rerank results
- add unit tests for the openai-compatible HTTP rerank path
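
As a rough illustration of accepting both score fields, assuming each result is a dict with an `index` key as in Jina/Cohere-style responses. The helper name `extract_rerank_scores` is hypothetical and the real parsing in LiteLLMRequester may differ:

```python
def extract_rerank_scores(results: list[dict]) -> list[tuple[int, float]]:
    """Return (document index, score) pairs from a rerank response's result list."""
    scores: list[tuple[int, float]] = []
    for item in results:
        # Prefer 'relevance_score'; fall back to 'score' for gateways that use it.
        score = item.get('relevance_score')
        if score is None:
            score = item.get('score', 0.0)
        scores.append((int(item['index']), float(score)))
    return scores
```

The explicit `is None` check matters because a legitimate relevance score of 0.0 is falsy and would otherwise be replaced by the fallback.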
* feat(provider): enforce requester support_type when adding models
- frontend: AddModelPopover only shows model-type tabs (llm/embedding/
rerank) that the provider's requester declares in its manifest
support_type; ModelsDialog fetches requester manifests and maps
requester -> support_type, passed down through ProviderCard
- backend: add _validate_provider_supports guard in create_llm_model /
create_embedding_model / create_rerank_model so a model cannot be
attached to a provider whose requester does not support that type,
even if the frontend restriction is bypassed (manifests without
support_type are allowed for backward compatibility)
- manifests: correct support_type for providers that do not offer all
three model types:
- llm only: anthropic, deepseek, groq, moonshot, openrouter, xai
- llm + text-embedding: openai, gemini, mistral
- add rerank to new-api (verified working via /v1/rerank)
- set llm + text-embedding + rerank for aggregator/unknown gateways
* feat(provider): add searchable alias to requester manifests
- add a free-text 'alias' field to every requester manifest spec,
containing the vendor's English/Chinese names, pinyin, common
nicknames and flagship model-series names (e.g. moonshot -> kimi,
月之暗面; zhipu -> glm, 智谱清言)
- frontend: ProviderForm requester search now also matches against
alias (substring/contains), so searching 'kimi' surfaces Moonshot,
'硅基' surfaces SiliconFlow, etc.
- also fix support_type: openrouter (relay) supports embedding+rerank;
LangBot Space gains rerank (coming soon)
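
The alias matching described above lives in the frontend ProviderForm component. The following is only a Python sketch of the same substring idea, with illustrative manifest data and a hypothetical `search_requesters` helper:

```python
# Illustrative data only; the real manifests are YAML files with more fields.
REQUESTERS = {
    'moonshot': {'label': 'Moonshot', 'alias': 'moonshot kimi 月之暗面'},
    'siliconflow': {'label': 'SiliconFlow', 'alias': 'siliconflow 硅基流动'},
}


def search_requesters(query: str, requesters: dict[str, dict[str, str]]) -> list[str]:
    """Return requester keys whose label or alias contains the query (case-insensitive)."""
    needle = query.strip().lower()
    return [
        key
        for key, spec in requesters.items()
        if needle in spec['label'].lower() or needle in spec.get('alias', '').lower()
    ]
```

Because the match is a plain substring test, an empty query matches every requester, which suits a dropdown that shows the full list before the user types.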
* fix(provider): make support_type guard defensive against incomplete model_mgr
- _validate_provider_supports now uses getattr to gracefully skip when
model_mgr / provider_dict / manifest lookup is unavailable, instead of
raising AttributeError (fixes unit tests that mock ap.model_mgr as a
bare SimpleNamespace)
- add TestValidateProviderSupports covering: allow supported type,
reject unsupported type, allow when support_type missing, allow when
provider unknown, degrade safely when model_mgr is incomplete
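
A sketch of the defensive guard under stated assumptions: the attribute layout (`model_mgr.provider_dict`, a `support_type` list on each provider) and the function name are hypothetical simplifications, not the actual `_validate_provider_supports` code:

```python
from types import SimpleNamespace
from typing import Any


def validate_provider_supports(ap: Any, provider_uuid: str, model_type: str) -> None:
    """Raise ValueError only when a provider is known NOT to support model_type."""
    # Each lookup degrades to "allow" instead of raising AttributeError.
    model_mgr = getattr(ap, 'model_mgr', None)
    provider_dict = getattr(model_mgr, 'provider_dict', None)
    if not provider_dict:
        return
    provider = provider_dict.get(provider_uuid)
    if provider is None:
        return
    support_type = getattr(provider, 'support_type', None)
    if not support_type:
        # Manifests without support_type stay allowed for backward compatibility.
        return
    if model_type not in support_type:
        raise ValueError(f'provider {provider_uuid} does not support model type {model_type}')


# Illustrative application objects, including a bare mock like the unit tests use.
bare_ap = SimpleNamespace(model_mgr=SimpleNamespace())
full_ap = SimpleNamespace(
    model_mgr=SimpleNamespace(provider_dict={'p1': SimpleNamespace(support_type=['llm'])})
)
```

With these objects, `validate_provider_supports(full_ap, 'p1', 'rerank')` raises, while every call against `bare_ap` returns silently.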
* fix(persistence): guard 0004 migration against missing llm_models table
The 0004_add_llm_model_context_length migration called
inspector.get_columns('llm_models') unconditionally, raising
NoSuchTableError when the table does not exist (e.g. migrating a
fresh/empty DB, as exercised by the integration tests where
create_all() registers no tables because the ORM models are not
imported). Every other migration guards with a table-existence check
first; add the same guard here for both upgrade and downgrade.
Also restore the test head assertion to 0004 (it had been lowered to
0003 to mask this failure).
* Merge branch 'master' into feat/litellm
Resolve conflicts:
- uv.lock: regenerated via 'uv lock' to reconcile litellm/fastuuid
(ours) with openai bump (master).
- Alembic migrations: master added 0004_add_mcp_readme while this
branch added 0004_add_llm_model_context_length, both as children of
0003 (would create multiple heads). Re-chain the litellm migration as
0005_add_llm_model_context_length with down_revision=0004_add_mcp_readme
for a single linear head. Update test head assertion accordingly.
* fix(persistence): shorten migration revision id to fit varchar(32)
PostgreSQL stores alembic_version.version_num as varchar(32).
'0005_add_llm_model_context_length' (33 chars) overflowed it, raising
StringDataRightTruncationError in the PG migration tests. Rename the
revision (and file) to '0005_add_llm_context_length' (27 chars) and
update the head assertions in both SQLite and PostgreSQL migration
tests.
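
The length limit can be checked before a revision is committed. A small sketch, where `fits_version_num` is a hypothetical helper and 32 is the column width stated above:

```python
ALEMBIC_VERSION_NUM_MAX = 32  # width of alembic_version.version_num on PostgreSQL


def fits_version_num(revision_id: str) -> bool:
    """Return True when the revision id fits in the version_num column."""
    return len(revision_id) <= ALEMBIC_VERSION_NUM_MAX


old_id = '0005_add_llm_model_context_length'
new_id = '0005_add_llm_context_length'
print(len(old_id), fits_version_num(old_id))  # 33 False
print(len(new_id), fits_version_num(new_id))  # 27 True
```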
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: fdc310 <2213070223@qq.com>
Co-authored-by: RockChinQ <rockchinq@gmail.com>
108 lines
4.1 KiB
Python
from __future__ import annotations

import typing
from typing import TYPE_CHECKING

import langbot_plugin.api.entities.builtin.resource.tool as resource_tool
from langbot_plugin.api.entities.events import pipeline_query

if TYPE_CHECKING:
    # Imported for type annotations only; initialize() imports the loaders at runtime.
    from ...core import app
    from langbot.pkg.provider.tools.loaders import (
        mcp as mcp_loader,
        native as native_loader,
        plugin as plugin_loader,
        skill_authoring as skill_authoring_loader,
    )


class ToolManager:
    """LLM tool manager"""

    ap: app.Application

    native_tool_loader: native_loader.NativeToolLoader
    plugin_tool_loader: plugin_loader.PluginToolLoader
    mcp_tool_loader: mcp_loader.MCPLoader
    skill_tool_loader: skill_authoring_loader.SkillToolLoader

    def __init__(self, ap: app.Application):
        self.ap = ap

    async def initialize(self):
        from langbot.pkg.utils import importutil
        from langbot.pkg.provider.tools import loaders
        from langbot.pkg.provider.tools.loaders import (
            mcp as mcp_loader,
            native as native_loader,
            plugin as plugin_loader,
            skill_authoring as skill_authoring_loader,
        )

        importutil.import_modules_in_pkg(loaders)

        self.native_tool_loader = native_loader.NativeToolLoader(self.ap)
        await self.native_tool_loader.initialize()

        self.plugin_tool_loader = plugin_loader.PluginToolLoader(self.ap)
        await self.plugin_tool_loader.initialize()
        self.mcp_tool_loader = mcp_loader.MCPLoader(self.ap)
        await self.mcp_tool_loader.initialize()
        self.skill_tool_loader = skill_authoring_loader.SkillToolLoader(self.ap)
        await self.skill_tool_loader.initialize()

    async def get_all_tools(
        self,
        bound_plugins: list[str] | None = None,
        bound_mcp_servers: list[str] | None = None,
        include_skill_authoring: bool = False,
    ) -> list[resource_tool.LLMTool]:
        all_functions: list[resource_tool.LLMTool] = []

        all_functions.extend(await self.native_tool_loader.get_tools())
        if include_skill_authoring:
            all_functions.extend(await self.skill_tool_loader.get_tools())
        all_functions.extend(await self.plugin_tool_loader.get_tools(bound_plugins))
        all_functions.extend(await self.mcp_tool_loader.get_tools(bound_mcp_servers))

        return all_functions

    async def generate_tools_for_openai(self, use_funcs: list[resource_tool.LLMTool]) -> list:
        tools = []

        for function in use_funcs:
            function_schema = {
                'type': 'function',
                'function': {
                    'name': function.name,
                    'description': function.description,
                    'parameters': function.parameters,
                },
            }
            tools.append(function_schema)

        return tools

    async def execute_func_call(self, name: str, parameters: dict, query: pipeline_query.Query) -> typing.Any:
        from langbot.pkg.telemetry import features as telemetry_features

        # Loaders are checked in a fixed order; the first one that owns the tool handles the call.
        if await self.native_tool_loader.has_tool(name):
            telemetry_features.increment(query, 'tool_calls', 'native')
            return await self.native_tool_loader.invoke_tool(name, parameters, query)
        if await self.plugin_tool_loader.has_tool(name):
            telemetry_features.increment(query, 'tool_calls', 'plugin')
            return await self.plugin_tool_loader.invoke_tool(name, parameters, query)
        if await self.mcp_tool_loader.has_tool(name):
            telemetry_features.increment(query, 'tool_calls', 'mcp')
            return await self.mcp_tool_loader.invoke_tool(name, parameters, query)
        if await self.skill_tool_loader.has_tool(name):
            telemetry_features.increment(query, 'tool_calls', 'skill')
            return await self.skill_tool_loader.invoke_tool(name, parameters, query)
        # The message reads "Tool not found: <name>".
        raise ValueError(f'未找到工具: {name}')

    async def shutdown(self):
        await self.native_tool_loader.shutdown()
        await self.plugin_tool_loader.shutdown()
        await self.mcp_tool_loader.shutdown()
        await self.skill_tool_loader.shutdown()
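
The first-match dispatch used by `execute_func_call` can be exercised in isolation. The sketch below uses a hypothetical `StubLoader` in place of the real loaders and omits the `query` argument and telemetry; it only illustrates that a tool name registered by two loaders resolves to the earlier one:

```python
import asyncio


class StubLoader:
    """Hypothetical stand-in for a tool loader, exposing only has_tool and invoke_tool."""

    def __init__(self, kind: str, tool_names: set[str]):
        self.kind = kind
        self.tool_names = tool_names

    async def has_tool(self, name: str) -> bool:
        return name in self.tool_names

    async def invoke_tool(self, name: str, parameters: dict) -> str:
        return f'{self.kind}:{name}'


async def dispatch(loaders: list[StubLoader], name: str, parameters: dict) -> str:
    """Invoke the tool on the first loader that reports owning it."""
    for loader in loaders:
        if await loader.has_tool(name):
            return await loader.invoke_tool(name, parameters)
    raise ValueError(f'Tool not found: {name}')


# 'search' is registered twice; the native loader comes first and wins.
loaders = [StubLoader('native', {'search'}), StubLoader('plugin', {'search', 'weather'})]
print(asyncio.run(dispatch(loaders, 'search', {})))  # native:search
```

A loop over an ordered list behaves the same as the chained `if` statements in the class, and makes the precedence between loaders explicit.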