Commit Graph

193 Commits

Author SHA1 Message Date
Junyan Qin
e8dc6fde53 feat: autoclean monitoring events 2026-03-27 11:57:24 +08:00
Junyan Chin
4a97895dea Feat/shadcn sidebar and page views (#2084)
* feat(web): migrate sidebar to shadcn and convert entity editors to page views

* feat(web): enhance sidebar with sections, collapsible persistence, sub-item sorting/limiting, and UI polish

- Reorganize sidebar into Home and Extensions sections with collapsible groups
- Split plugins page into plugins, market, and mcp as separate routes
- Add sidebar sub-items sorted by updatedAt with max 5 visible and expand/collapse toggle
- Persist collapsible section state and sidebar open state in localStorage
- Fix page refresh stripping query params by splitting handleChildClick/selectChild
- Swap plugin detail layout (config left, readme right)
- Add fixed headers with internal scroll for all detail and list pages
- Remove entity form borders and sidebar rail
- Improve dark mode sidebar/content contrast
- Rename monitoring to Dashboard, move to first position
- Update breadcrumb to show Home or Extensions based on current route
- Add i18n translations for more/less toggle in all 4 locales

* fix(web): fix scroll behavior - constrain layout to viewport, fix fixed headers and independent scroll areas

- Change SidebarProvider wrapper from min-h-svh to h-svh overflow-hidden to constrain layout to viewport height (root cause of all scroll issues)
- Fix create mode pages (bot, pipeline, knowledge): extract title bar out of scroll container so only form content scrolls
- Fix plugin detail: add overflow-x-hidden on both config and readme panels to prevent horizontal overflow
- Add min-h-0 to all TabsContent in edit mode for cross-browser flex shrink safety
- Change nested <main> to <div> in layout to avoid invalid nested <main> tags (SidebarInset already renders as <main>)

* style(web): polish UI - dashboard i18n, sidebar create text, cursor-pointer tabs, remove cancel buttons

* feat(web): add plugin context menu to sidebar sub-items

- Add hover-reveal dropdown menu (Ellipsis icon) on plugin sidebar items
- Menu items: Update (marketplace only), View Source (marketplace/github), Delete
- Add confirmation dialog with async task polling for delete/update operations
- Extend SidebarEntityItem with installSource and installInfo fields
- Fix PipelineFormComponent optional onCancel invocation

* fix(web): prevent plugin sidebar text from overlapping menu button

Add right padding on plugin sub-items and explicit truncate on text
span so long plugin names never overlap the hover menu button.

* feat(web): show update indicator on sidebar plugin menu

- Fetch marketplace plugin versions in SidebarDataContext.refreshPlugins
- Compare with installed version using isNewerVersion to set hasUpdate
- Show red dot on menu trigger when update available (always visible)
- Show 'New' badge on Update menu item when update available
- Marketplace fetch failure is silently caught to avoid blocking sidebar

* refactor(web): remove entity list pages, back buttons, and make sidebar toggle collapse

- Remove card grid list views from bots, pipelines, knowledge pages
- Show empty state placeholder when no entity is selected
- Preserve KB migration dialog at page level
- Remove back (ArrowLeft) buttons from all detail pages (bots, pipelines, knowledge, plugins)
- Sidebar parent click for bots/pipelines/knowledge now toggles collapse instead of navigating
- Breadcrumb second level is now non-clickable (always BreadcrumbPage)
- Add selectFromSidebar i18n keys in all 4 locales

* feat(web): enhance bot session monitor with refresh functionality and improve log card UI

* refactor(web): optimize pipeline detail page with vertical config nav and debug chat polish

- Convert pipeline config tab's horizontal sub-tabs to vertical left-side navigation with icons
- Replace hardcoded colors in PipelineFormComponent and DebugDialog with theme-aware Tailwind classes
- Replace custom SVG icons with lucide-react (User, Users, ImageIcon, Send, Reply, etc.)
- Replace hardcoded Chinese strings with i18n keys (allMembers, file, voice, uploadImage, uploading)
- Modernize chat bubbles to use bg-primary/10 and bg-muted instead of hardcoded blue/gray
- Translate all Chinese comments to English in both components
- Delete unused pipelineFormStyle.module.css
- Remove max-w-2xl constraint from config tab container

* fix(web): improve dark mode contrast and relocate WebSocket status indicator

Bump dark mode --muted, --accent, --secondary from oklch(0.18) to oklch(0.24)
to fix invisible TabsList, message bubbles, and selected items against the
oklch(0.17) background. Move WebSocket connection dot from pipeline title
into the Debug Chat tab trigger so it is always visible. Replace hardcoded
Quote border colors with theme-aware border-muted-foreground/50.

* fix(web): increase dark mode contrast for muted/accent/secondary to oklch(0.27)

Previous value of oklch(0.24) was still not distinguishable enough against
the oklch(0.17) background. Bump to oklch(0.27) for a 0.10 lightness gap,
matching the contrast ratio of the default shadcn zinc dark theme.

* style(web): replace hardcoded colors with theme tokens in monitoring dashboard

Convert all monitoring page components from hardcoded gray/white colors
to theme-aware CSS variable tokens (bg-card, text-foreground,
text-muted-foreground, bg-muted, bg-background, bg-accent, border).
Semantic colors (red/green/blue/purple for status badges and error
styling) are intentionally preserved.

* feat(web): show debug indicator for debugging plugins in sidebar

Add orange Bug icon next to plugin name in sidebar sub-items when the
plugin is connected via WebSocket debug mode. Hide context menu for
debug plugins since delete/update operations are not supported.

* feat(web): show install source and debug badge on plugin detail page

Display a badge next to the plugin title indicating the install source
(GitHub blue, Local green, Marketplace purple) or debugging status
(orange with Bug icon), matching the existing plugin card convention.

* fix(web): resolve eslint errors for CI - remove unused imports and variables

* fix(web): remove stale setSubtitle call and fix prettier formatting

* Refactor code formatting and improve readability

- Updated HomeSidebar.tsx to enhance clarity in conditional assignment.
- Adjusted CSS formatting in github-markdown.css for better alignment.
- Cleaned up tsconfig.json by consolidating array formatting for consistency.

* fix(ci): use local prettier instead of mirrors-prettier to avoid version mismatch (3.1.0 vs 3.8.1)
2026-03-27 01:51:13 +08:00
xiaolou
3c0495fc51 fix: 修复钉钉文件消息解析失效问题(优化 downloadCode 提取逻辑) (#2080)
* fix: resolve dingtalk file parsing issue by extracting downloadCode from content

* style: fix ruff format trailing whitespace

---------

Co-authored-by: RockChinQ <rockchinq@gmail.com>
2026-03-27 00:17:26 +08:00
Junyan Qin
f4db53b759 chore: bump version to 4.9.4 in pyproject.toml and __init__.py 2026-03-26 00:16:21 +08:00
fdc310
01852b81d4 Feat/openclaw weixin adapter (#2074)
* feat: add wexin openclaw adapter

* feat: The new feature will store the token and other configurations after login.

* fix: wexin qc to base64 and in log image print

* feat: add image to base64

* feat: add update file and image and voice
2026-03-25 23:34:35 +08:00
Junyan Chin
e1e5e7aedf fix: get_llm_models handler returns UUID strings instead of full model dicts (#2081)
The plugin SDK declares get_llm_models() -> list[str] (UUID strings),
but the host handler returned the full model dict list from
llm_model_service.get_llm_models(). This caused TypeError when
invoke_llm passed a dict to get_model_by_uuid (which is decorated
with @async_lru and requires hashable arguments).

Extract only the 'uuid' field to match the SDK contract.
2026-03-25 21:06:49 +08:00
zpf2000
6fa653f232 feat: 支持可配置的混合检索融合权重 (#2071)
* feat: 支持可配置的混合检索融合权重

* style: 修复 ruff format 检查
2026-03-24 09:50:08 +08:00
Junyan Qin
c9fc64360f feat(plugin): add unrestricted knowledge base query handlers
Add handlers for LIST_KNOWLEDGE_BASES and RETRIEVE_KNOWLEDGE actions
that allow plugins to list and retrieve from any knowledge base without
pipeline scope restrictions, complementing the existing pipeline-scoped handlers.
2026-03-23 21:06:23 +08:00
Guanchao Wang
88a04fdbe8 Merge pull request #2055 from langbot-app/copilot/fix-sender-name-parameter 2026-03-23 14:14:36 +08:00
WangCham
bbe019f0c6 fix: wrong agentid 2026-03-23 14:02:10 +08:00
RockChinQ
865f6ee81b style: format telegram.py for ruff 2026-03-21 22:10:23 +08:00
fdc310
bd5ec59b7c fix:The fix is in place — content = '' is now reset at the start of each loop iteration , which prevents stale text from being duplicated across tool call and end-turn chunks. (#2060) 2026-03-21 22:08:35 +08:00
fdc310
9c0cc1003d Fixed the issue where the at bot did not remove the at symbol, result… (#2062)
* Fixed the issue where the at bot did not remove the at symbol, resulting in some commands not being activated in group chats. Also, adjusted the logic in the on_message section.

* fix:reply_message  del bot_name
2026-03-21 22:07:31 +08:00
Bijin
ea07d8ad00 fix(telegram): add document message support (docx/pdf/etc) (#2069)
The Telegram adapter only handles TEXT, COMMAND, PHOTO, and VOICE
messages. Document files (docx, pdf, etc.) sent by users are silently
dropped because:

1. MessageHandler filters lack filters.Document.ALL
2. target2yiri() has no message.document branch
3. yiri2target() has no platform_message.File branch
4. send_message() has no 'document' component handler

Changes:
- Add filters.Document.ALL to the MessageHandler filter set
- Add message.document parsing in target2yiri() → platform_message.File
- Add platform_message.File handling in yiri2target() → document component
- Add 'document' type handling in send_message() via bot.send_document()

This allows Telegram document messages to flow through the existing
PreProcessor and Dify file upload pipeline, consistent with how other
adapters (Lark, KOOK, Discord, WeCom) already handle files.

Closes #2065
2026-03-21 22:06:54 +08:00
youhuanghe
254a13bba3 fix: 4355f0fa78 ruff lint 2026-03-16 06:39:29 +00:00
youhuanghe
4355f0fa78 feat(rag): expose vector listing API with backend filter support 2026-03-16 06:26:05 +00:00
Junyan Qin
031737f05d chore: remove all preset sensitive words 2026-03-16 13:42:19 +08:00
Nody the lobster
9e366fc536 fix: allow env overrides to create missing config keys (#2064)
Previously, environment variable overrides (e.g. SYSTEM__INSTANCE_ID)
were silently skipped if the target key didn't already exist in
data/config.yaml. This caused SaaS pods running older LangBot images
(whose config template lacked system.instance_id) to ignore the
SYSTEM__INSTANCE_ID env var, falling back to a random UUID that
didn't match the pod UUID — breaking idle timeout tracking.

Now env overrides create missing keys (as strings) and missing
intermediate dicts, so they work regardless of template version.

Co-authored-by: rocksclawbot <rocksclawbot@users.noreply.github.com>
2026-03-15 23:03:40 +08:00
Junyan Qin
1a1eadb282 chore: bump version 4.9.3 2026-03-14 20:20:48 +08:00
RockChinQ
351350ea03 fix: instance_id priority: config.yaml > file > generate new
- If system.instance_id set in config (via env var), use it
- If not set but file exists, read from file (don't generate new)
- If neither, generate new and save to file
2026-03-13 11:33:32 -04:00
RockChinQ
bc3d6ba92f feat: support instance_id in system config
Add instance_id field to system section in config.yaml.
Can be set via SYSTEM__INSTANCE_ID env var (auto-mapped).
Falls back to data/labels/instance_id.json if not set.
2026-03-13 11:31:51 -04:00
RockChinQ
345e4baf2a Revert "feat: support pre-setting instance_id via LANGBOT__INSTANCE_ID env var"
This reverts commit 6c64dc057f.
2026-03-13 11:30:36 -04:00
RockChinQ
6c64dc057f feat: support pre-setting instance_id via LANGBOT__INSTANCE_ID env var
In SaaS (cloud edition), the instance_id can now be injected via
environment variable to match the pod UUID. This enables zero-lookup
telemetry routing in Space - no need to reverse-lookup instance_id
to find the pod.
2026-03-13 11:26:16 -04:00
youhuanghe
eec0a9c9d9 feat(plugin): expose KB UUIDs in query variables and pass session context to retrieve API
Extract knowledge base UUID list into query.variables['_knowledge_base_uuids']
in PreProcessor so plugins can modify it during PromptPreProcessing. Runner now
reads from variables instead of pipeline_config. Also pass session_name,
bot_uuid, and sender_id to kb.retrieve() in the RETRIEVE_KNOWLEDGE_BASE handler
so knowledge engines receive proper session context.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 14:23:19 +00:00
Junyan Qin
4b0fad233e chore: bump version 4.9.2 2026-03-13 12:15:21 +08:00
Junyan Qin
52eb991a70 feat: add extra webhook prefix config 2026-03-13 12:06:22 +08:00
Junyan Qin
10c716be0c fix: bad model field ref 2026-03-13 11:47:31 +08:00
youhuanghe
6e77351eda refactor: up rag ingest timeout 2026-03-13 02:37:32 +00:00
Junyan Qin
20f5ebd9b8 chore: bump version 4.9.1 2026-03-12 23:24:33 +08:00
fdc310
d451b059fd feat: Implement WebSocket long connection client for WeChat Work AI Bot (#2054)
* feat: Implement WebSocket long connection client for WeChat Work AI Bot

- Added WecomBotWsClient to handle WebSocket connections for receiving messages and sending replies.
- Introduced a new migration (dbm022) to add 'enable-webhook' field to existing wecombot adapter configs, ensuring backward compatibility.
- Updated WecomBotAdapter to support both WebSocket and webhook modes based on the new configuration.
- Enhanced YAML configuration for WecomBot to include 'enable-webhook' and 'Secret' fields, adjusting requirements accordingly.
- Incremented database version to 22 to reflect schema changes.

* fix:db enable-webhook is false

* fix:add logic

* fix:Removed an unnecessary configuration check

* fix: migration

* fix: update migration

* fix:migration
2026-03-12 22:31:14 +08:00
marun
93c52fcd4c Enhance Lark Bot Ability to Reply to Quoted Messages (#2043)
* fix(database): Update database version requirement to 20

- Increase required_database_version from 19 to 20
- Add documentation on database schema version check

* feat(lark): Added support for message references and topic message grouping

- Implemented the function to extract reference message IDs from messages, supporting parent message identification

- Added a method to construct event messages from SDK message items

- Implemented the function to asynchronously obtain reference messages and convert them into message chains

- Integrated reference message injection logic into the message processing flow

- Added a mechanism to filter source components while retaining reference content

- Implemented a method to obtain the starter ID with topic awareness

- Provided session isolation support for topic range in group thread messages

- Supported stable maintenance of conversation context in group thread discussions

- Handled cases where topic messages cannot reliably detect reference targets

* feat(lark): Implement a duplicate prevention mechanism for Feishu topic message references

- Add class-level cache to store processed topic IDs and timestamps

- Implement a timed cleanup mechanism to remove expired topic records

- Add cache size limit to prevent memory from growing indefinitely

- Return the parent message ID and mark it as processed when the first reply is made to a topic

- Return None in subsequent replies to the same topic to avoid duplicate references

- Implement automatic cache trimming to ensure stable performance
2026-03-12 21:48:30 +08:00
huanghuoguoguo
f1608682e6 Feat/agentic rag and parser invoke api (#2052)
* feat: add pipeline api

* feat: add list parser

* ruff lint

* fix: add filter but agentic rag not to use

* feat: add bot uuid for memory..
2026-03-12 21:47:27 +08:00
youhuanghe
077e631c13 fix(rag): normalize vector search to distance semantics 2026-03-12 12:33:09 +00:00
Junyan Chin
79311ccde3 feat: model fallback chain (#2017) (#2018) 2026-03-12 03:33:05 +08:00
copilot-swe-agent[bot]
def798bf1f fix: WeCom sender_name shows user ID instead of actual username
- Add get_user_info() to WecomClient to fetch user name via /user/get API
- Update WecomEventConverter.target2yiri to accept bot param and fetch real user name
- Update register_listener call to pass self.bot for user name lookup
- URL-encode userid parameter for safety

Co-authored-by: RockChinQ <45992437+RockChinQ@users.noreply.github.com>
2026-03-11 17:52:43 +00:00
Guanchao Wang
89064a9d5b feat: add support for username (#2047)
* feat: add support for username

* fix: lint

* fix: migerations

* fix: change to version 21

* fix: remove duplicate dbm021 migration and rename dbm022

* feat: add user_id and user_name display with copy functionality in BotSessionMonitor

---------

Co-authored-by: wangcham <wangcham@gmail.com>
Co-authored-by: Junyan Qin <rockchinq@gmail.com>
2026-03-12 01:27:22 +08:00
Copilot
2655425fbe fix: deduplicate final chunk yield in Dify chatflow streaming (#2049)
* Initial plan

* fix: prevent duplicate messages when Dify chatflow sends both workflow_finished and message_end events

Co-authored-by: RockChinQ <45992437+RockChinQ@users.noreply.github.com>

* style: apply ruff formatting to difysvapi.py

Co-authored-by: RockChinQ <45992437+RockChinQ@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: RockChinQ <45992437+RockChinQ@users.noreply.github.com>
2026-03-11 14:45:55 +08:00
youhuanghe
bd15b630b0 fix: chroma ruff lint 2026-03-11 04:07:21 +00:00
youhuanghe
fe5ce68436 feat(vector): add full-text and hybrid search support for Chroma backend
- Implement full-text search via Chroma's $contains filter
  - Implement hybrid search with RRF (Reciprocal Rank Fusion) combining
    vector and full-text results, with min-max normalized distances
  - Fix add_embeddings to use col.upsert instead of col.add for idempotency
  - Bump chromadb dependency to >=1.0.0,<2.0.0
  - Re-lock uv.lock with official PyPI source
2026-03-11 03:59:14 +00:00
Typer_Body
0541b05966 refactor: optimized error handling (#2020)
* Update output.yaml

* Update default-pipeline-config.json

* Update chat.py

* Add files via upload

* Update chat.py

* Update default-pipeline-config.json

* Update output.yaml

* Update constants.py

* feat: update logic

* fix: update required database version to 21

---------

Co-authored-by: Junyan Qin <rockchinq@gmail.com>
2026-03-10 22:01:23 +08:00
youhuanghe
13cb0aa9be bugfix: rollback filter, add to retrive settings 2026-03-10 12:49:24 +00:00
youhuanghe
a048369b38 feat: Pass session context (session_name) to knowledge engine retrieval filters.
Allow KnowledgeEngine plugins to filter retrieval results by session,enabling per-session memory isolation in plugin-based knowledge bases
2026-03-10 12:27:50 +00:00
Junyan Qin
a4e66f6459 feat: update version to 4.9.0 in pyproject.toml, __init__.py, and uv.lock 2026-03-09 20:10:01 +08:00
huanghuoguoguo
2a74a8d6ae Feat/dbm20 rag (#2037)
* feat(rag): add knowledge base migration from v4.9.0 to plugin architecture

Rewrite dbm020 to backup old knowledge_bases data and preserve
external_knowledge_bases table. Add migration API endpoints and
frontend dialog so users can opt-in to auto-install LangRAG plugin
and restore their knowledge bases with original UUIDs preserved.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(rag): query marketplace for actual plugin version instead of 'latest'

The marketplace API does not support 'latest' as a version string.
Fetch the plugin info first to get latest_version, then use that
concrete version for installation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(rag): add data-only migration option and fix dialog width

Add option to migrate knowledge base data without auto-installing
the LangRAG plugin (for offline/intranet environments). Also
narrow the migration dialog to match other confirmation dialogs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: to red and no more

* fix lint

* fix ruff lint

* feat: add external migration

* fix: show

* feat: add external plugin auto download

* feat: update migration messages for knowledge base in multiple languages

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Junyan Qin <rockchinq@gmail.com>
2026-03-09 20:05:38 +08:00
WangCham
11c05ea8db style(format): fix ruff formatting issues 2026-03-09 16:04:38 +08:00
WangCham
2b8bd1cc71 fix: invoke_llm failed when use plugin 2026-03-09 16:01:45 +08:00
doujianghub
9148e02679 fix: centralized pipeline config type coercion to prevent string-type crashes (#2031)
* fix: coerce pipeline config types at load time using metadata definitions

Pipeline configs stored in SQLAlchemy JSON columns can have values turned
into strings after UI edits (e.g. "120" instead of 120), causing runtime
arithmetic/logic errors. Add centralized type coercion in load_pipeline()
that leverages existing metadata YAML type definitions (integer, number,
float, boolean) to convert values before they reach downstream stages.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address review - defensive getattr + add unit tests for config_coercion

- Use getattr with defaults for pipeline_config_meta_* attributes to
  avoid AttributeError when MockApplication lacks these fields
- Add 18 unit tests for config_coercion module covering all code paths

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add dynamic form stage tracking and snapshot management

* fix: standardize string formatting in config coercion and improve logging messages

---------

Co-authored-by: KPC <kpc@kpc.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Junyan Qin <rockchinq@gmail.com>
2026-03-09 14:30:07 +08:00
fdc310
fd15284d91 fix(platform): websocket send_message not delivering to webchat frontend (#2039)
- Include websocket_proxy_bot in get_bot_by_uuid lookup so plugins can
  find it by uuid
- Rewrite send_message to broadcast directly via ws_connection_manager
  using the correct pipeline_uuid instead of misusing target_id
- Save messages to session history with unique IDs so they persist
  across page reloads and don't overwrite each other

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:22:03 +08:00
huanghuoguoguo
cadcf10047 Feat/rag plugin (#1995)
* [issue:1933] RAG engine plugin architecture (#1967)

* refactor: migrate RAG knowledge services to a plugin-oriented host service architecture.

* feat(rag): phase 2 core refactor with RPC Action handlers

* feat: 为 RAG 插件添加知识库创建和删除事件通知,并优化了 RAG 动作的参数传递和枚举使用。

* feat: 统一知识库管理为RAG引擎,支持动态配置并移除旧的外部知识库组件。

* refactor(rag): remove plugin_adapter, inline logic into RuntimeKnowledgeBase

BREAKING CHANGE: RAGPluginAdapter has been removed. All plugin
communication is now handled directly by RuntimeKnowledgeBase.

Architecture change:
- Before: RuntimeKnowledgeBase → RAGPluginAdapter → plugin_connector
- After:  RuntimeKnowledgeBase → plugin_connector (direct)

Changes to kbmgr.py (RuntimeKnowledgeBase):
- Remove RAGPluginAdapter import and usage
- Inline plugin communication methods:
  - _on_kb_create(): Notify plugin when KB is created
  - _on_kb_delete(): Notify plugin when KB is deleted
  - _ingest_document(): Call plugin for document ingestion
  - _retrieve(): Call plugin for retrieval
  - _delete_document(): Call plugin to delete document
- Simplify dispose(): Only notify plugin, no built-in VDB assumption

Changes to base.py (KnowledgeBaseInterface):
- Remove get_type() abstract method (outdated internal/external concept)
- Add get_rag_engine_plugin_id() abstract method

Changes to localagent.py:
- Remove get_type() call
- Simplify top_k retrieval from KB entity

Deleted files:
- pkg/rag/knowledge/plugin_adapter.py

Benefits:
- Reduced abstraction layer, simpler code
- Plugin communication logic centralized in RuntimeKnowledgeBase
- Easier to understand and maintain

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor(api): remove ExternalKnowledgeBase infrastructure

BREAKING CHANGE: ExternalKnowledgeBase has been completely removed.
All knowledge bases are now unified under the single KnowledgeBase model,
differentiated by their rag_engine_plugin_id.

Deleted files:
- pkg/api/http/controller/groups/knowledge/external.py
  (ExternalKBController with /external-bases routes)
- pkg/api/http/service/external_kb.py
  (ExternalKnowledgeBaseService)
- pkg/rag/knowledge/external.py
  (ExternalKnowledgeBase implementation)

Modified files:
- pkg/entity/persistence/rag.py:
  Remove ExternalKnowledgeBase SQLAlchemy table definition
- pkg/core/app.py:
  Remove external_kb_service attribute from LangBotApplication
- pkg/core/stages/build_app.py:
  Remove external_kb_service initialization

Migration notes:
- Existing external knowledge base data should be migrated manually
- API consumers should use /api/v1/knowledge/bases for all KB operations
- Use /api/v1/knowledge/engines to discover available RAG engines

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor(plugin): remove list_knowledge_retrievers from connector

Remove deprecated list_knowledge_retrievers functionality from the
plugin communication layer. This aligns with the SDK change that
removed the LIST_KNOWLEDGE_RETRIEVERS action.

Changes:
- connector.py: Remove list_knowledge_retrievers() method
- handler.py: Remove list_knowledge_retrievers() handler

The functionality is replaced by the new /api/v1/knowledge/engines
endpoint which lists available RAGEngine components with their
capabilities and configuration schemas.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor(service): update knowledge service with capability-based checks

Replace type-based checks with capability-based checks for file
operations, aligning with the unified knowledge base architecture.

Changes to knowledge.py:
- store_file(): Replace get_type() check with doc_ingestion capability check
- delete_file(): Replace get_type() check with doc_ingestion capability check
- list_rag_engines(): Remove list_knowledge_retrievers call, simplify to
  only list RAGEngine components (KnowledgeRetriever type removed)

Changes to pipelines.py:
- Minor cleanup related to knowledge base references

The capability-based approach allows RAG engines to declare their
supported features (doc_ingestion, chunking_config, rerank, hybrid_search)
and the system responds accordingly, rather than hardcoding behavior
based on internal/external type distinction.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(web): unify knowledge base UI, remove external KB components

BREAKING CHANGE: The internal/external knowledge base distinction
has been removed from the frontend. All knowledge bases are now
displayed in a unified list, differentiated by their RAG engine.

Changes to page.tsx:
- Remove Tab component (内置/外置 tabs)
- Remove selectedKbType state
- Unified knowledge base list display
- Single "Create Knowledge Base" button for all types

Changes to KBDetailDialog.tsx:
- Remove kbType prop
- Simplify dialog logic for unified KB handling
- Documents menu item conditionally shown based on doc_ingestion capability

Changes to KBForm.tsx:
- Remove retriever type handling code
- Simplify form for unified KB creation
- Dynamic form rendering based on RAG engine's creation_schema

Changes to KBCardVO.ts:
- Remove 'type' field from KBCardVO interface

Changes to BackendClient.ts:
- Remove all external KB related methods:
  - getExternalKnowledgeBases()
  - getExternalKnowledgeBase()
  - createExternalKnowledgeBase()
  - updateExternalKnowledgeBase()
  - deleteExternalKnowledgeBase()
  - retrieveFromExternalKnowledgeBase()

Changes to api/index.ts:
- Remove ExternalKnowledgeBase interface definition

UI/UX improvements:
- Users no longer need to understand internal vs external distinction
- RAG engine selection is now the primary differentiator
- Documents panel visibility is capability-driven (doc_ingestion)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor(plugin): code review improvements for RAG handlers

- Unify embed_model field naming to embedding_model_uuid only
- Add structured error responses with error_type for RAG actions
- Fix file_size and mime_type detection in _store_file_task
- Improve error handling with detailed error context (error_type, original_error)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor(rag): refactor KB dynamic form and vector manager

- Frontend: Refactor Knowledge Base form using DynamicForm components.
- Frontend: Remove obsolete jsonSchemaConverter utility.
- Backend: Update VectorManager and PluginHandler to support new RAG architecture.
- Chore: Update dependencies in pyproject.toml.

* fix: code review fixes for RAG refactor

- Remove DEBUG stderr outputs in handler.py
- Move repeated `import json` to file top
- Add warning log for unimplemented delete_by_filter

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor(rag): consolidate valid_fields into entity constants

Define MUTABLE_FIELDS, CREATE_FIELDS, ALL_DB_FIELDS as class
constants in KnowledgeBase entity to eliminate duplication.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor: 将知识库获取和RAG引擎信息丰富逻辑移至知识库管理器。

* refactor(rag): introduce RAGRuntimeService and clean up plugin handler

- Create RAGRuntimeService to encapsulate RAG capability implementation (Embedding, VectorOps).
- Refactor PluginHandler to delegate RAG actions to RAGRuntimeService.
- Move KnowledgeService enrichment and creation logic to RAGManager.
- Register RAGRuntimeService in Application and BuildAppStage.
- Clean up legacy code in KnowledgeService.

* refactor(rag): standardize logger and fix type hints

- Use self.ap.logger consistently in kbmgr.py and runtime.py, removing module-level loggers.
- Fix type hints for retrieve_knowledge in handler.py and connector.py to match implementation returning dict.

* refactor: 将引擎徽章的样式从 Tailwind CSS 类迁移到 CSS 模块。

* fix(web): resolve React rendering errors in plugins page

- Fix missing key prop in PluginComponentList by using ternary instead of Fragment
- Fix RAGEngine.name type to I18nObject and use extractI18nObject() for rendering
- Preserves multi-language support

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(rag): update runtime service and web components

* refactor: 优化知识库设置结构并增强前端距离显示健壮性。

* fix: 处理前端距离显示中的空值。

* fix(rag): document retrieve ui and kbmgr top_k validation

* 更新 uv.lock 中的 PyPI 镜像源为官方地址。

* fix: address code review issues for RAG engine plugin architecture

P0 fixes:
- Fix ALL_DB_FIELDS missing collection_id and emoji fields
- Move rag_engine_plugin_id to CREATE_FIELDS (immutable after creation)
- Fix creation_settings mutable default value (dict -> None)
- Rename vector delete method to delete_by_file_id for correct semantics
- Fix delete_by_filter to raise NotImplementedError instead of silent no-op
- Add database migration script (dbm019) for new columns and table cleanup

P1 fixes:
- Clean up design-hesitation comments in connector.py
- Add _parse_plugin_id() with format validation for all RAG methods
- Make _retrieve() raise exceptions instead of silently returning empty results
- Extract _make_rag_error_response() helper for clean error formatting
- Remove unused imports from handler.py

P2 fixes:
- Fix runtime.py indentation inconsistencies
- Simplify get_file_stream to use storage abstraction uniformly
- Reduce redundant DB queries in knowledge service (extract _check_doc_capability)
- Fix engines.py URL encoding: use <path:plugin_id> instead of __ replacement
- Add read-only mode for engine settings in KBForm edit mode
- Simplify page.tsx handleKBCardClick to pass only kbId string

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix: address code review findings for RAG plugin architecture

- Frontend: add retrieval_settings param to retrieveKnowledgeBase API call
- Backend: return {uuid} from PUT knowledge base to match frontend expectation
- Backend: validate query is non-empty in retrieve endpoint (400 on empty)
- Backend: rename vector_delete ids→file_ids for semantic clarity, keep
  backward compat by accepting both 'file_ids' and 'ids' in RPC handler
- Backend: ensure rag_engine.name fallback is always I18nObject-compatible
  dict, preventing frontend extractI18nObject from receiving plain strings
- Migration: fix misleading docstring about external_kb data migration

Co-authored-by: Cursor <cursoragent@cursor.com>

* Update langbot-plugin version to 0.2.6

* chore: update required database version from 18 to 19

* refactor: remove unused polymorphic component framework

* chore: fix lint and format issues for python and frontend

* fix(plugin): remove legacy `ids` fallback in rag_vector_delete handler

SDK now sends `file_ids` directly, the `ids` backward-compat fallback
is no longer needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(rag): deep review fixes for critical bugs, security and quality

Critical:
- Fix StorageMgr.load() -> storage_provider.load() (C1, AttributeError)
- Update required_database_version 18 -> 19 (C2, migration never runs)

Security:
- Add path traversal validation in get_file_stream (C11)
- Add vectors/ids/metadata length validation in rag_vector_upsert (C12)

Logic fixes:
- Legacy KBs: set capabilities to [] instead of ['doc_ingestion'] (C4)
- Fix store_file return type int -> str (C5)
- Fix retrieve_knowledge return [] -> {'results': []} when disabled (C6)
- Re-raise exception in _on_kb_create instead of silently swallowing (C7)
- Log warning when KB not found in memory during delete (C8)

API fixes:
- Catch ValueError as 400 in create_knowledge_base endpoint (C15)
- Validate plugin_id format in engines endpoints (C16)

Quality:
- Remove dead if/else in migration with identical branches (C17)
- Fix variable shadowing: rag_context -> rag_context_text (C18)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: remove unused os import to fix ruff lint

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(plugin): remove PolymorphicComponent sync from LangBot side

Remove sync_polymorphic_component_instances() from connector and handler,
and the post-connection sync call in initialize(). This dead code synced
an always-empty list of polymorphic instances that were never created.

Companion change to langbot-plugin-sdk PolymorphicComponent removal.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(rag): fix vector_delete count bug and remove vestigial instance_id parameter

1. vector_delete: assign return value from delete_by_filter to count
   instead of silently returning 0 for filter-based deletion.

2. Remove instance_id parameter from the entire retrieve_knowledge
   call chain (kbmgr → connector → handler → runtime). This parameter
   was a remnant of the PolymorphicComponent mechanism and is no longer
   used — RAGEngine operates as a stateless singleton.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(web): 支持 creation_schema 字段级别的 editable 属性控制编辑模式可修改性

- IDynamicFormItemSchema 添加 editable 可选属性
- DynamicFormItemConfig 透传 editable 属性
- DynamicFormComponent 接收 isEditing prop,按字段 editable 值控制禁用
- KBForm 解析 editable 并传递 isEditing 给动态表单组件
- editable 未指定时默认可编辑,editable: false 时编辑模式下禁用该字段

* feat(storage): 添加 size() 抽象方法及 LocalStorage/S3 实现

支持获取存储对象大小,S3 使用 head_object 避免下载整个文件

* fix(migration): 删除 external_knowledge_bases 表前记录日志警告

- 迁移时如果表中存在数据,先 warning 日志记录避免无感数据丢失
- 添加 chunk 清理注释说明:仅对旧版非插件架构 KB 有效

* fix(web): 修复检索结果长文本撑大容器导致查询按钮不可见

KBDetailDialog 的 main 容器添加 min-w-0 overflow-x-hidden,
限制 flex-1 子容器宽度,防止 Dify RAG 长文本撑出 Dialog 边界

* fix(rag): address code review issues for plugin architecture PR

- Fix SQL injection in migration helpers by using bind parameters
- Move numpy import to module level in vector/mgr.py
- Improve path traversal validation using posixpath.normpath
- Add call_rag_retrieve to connector, eliminating duplicate plugin_id
  parsing in kbmgr.py _retrieve
- Normalize typing style to modern dict/list/None syntax

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style(web): fix prettier formatting errors

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(rag): update embedding handling in RuntimeConnectionHandler

- Renamed RAG_EMBED_DOCUMENTS and RAG_EMBED_QUERY actions to INVOKE_EMBEDDING for clarity.
- Removed embed_documents and embed_query methods from RuntimeEmbeddingModel and RAGRuntimeService.
- Integrated embedding model retrieval directly in the invoke_embedding method, improving error handling for missing models.
- Updated the embedding invocation logic to streamline the process and enhance error reporting.

* refactor(web): replace KnowledgeRetriever with RAGEngine across frontend and tests

KnowledgeRetriever component type has been removed in favor of the new
RAGEngine architecture. Update all remaining references in i18n locales,
plugin component icon mappings, marketplace filter, and unit tests.

Addresses reviewer notes from RockChinQ on PR #1967.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(rag): address critical bugs found in deep review

- Fix path traversal bypass in runtime.py (check all path components for '..')
- Use normalized path for file loading instead of raw user input
- Change knowledge_bases from list to dict for O(1) lookup and race safety
- Add rollback on KB creation failure (clean up DB + runtime on plugin error)
- Add null check after KB update in knowledge service
- Fix file extension parsing to use os.path.splitext instead of split('.')
  (handles multi-dot filenames like 'report.v2.pdf' correctly)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(rag): address remaining review issues across frontend and backend

Frontend:
- Fix KB delete: use async/await with error handling instead of fire-and-forget
- Fix capabilities null check: add optional chaining to prevent crash
- Add toast.error on KB info load failure instead of silent console.error
- Replace hard-coded Chinese validation message with i18n key
- Replace hard-coded English error messages in DynamicFormItemComponent with i18n
- Optimize document polling: stop when all documents reach terminal state
- Add i18n keys (fieldRequired, loadKnowledgeBaseFailed,
  deleteKnowledgeBaseFailed, getKnowledgeBaseListError) to all 4 locales

Backend:
- Fix KB delete atomicity: delete from DB first, then notify plugin
- Add RAG engine plugin existence validation before creating KB

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style(rag): fix ruff formatting in kbmgr.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Junyan Qin <rockchinq@gmail.com>

* chore: bump langbot-plugin to 0.3.0 (#1992)

* chore: correct sdk version to 0.3.0a1

* feat: normalize rag related actions' names

* refactor(rag): align IngestionContext fields with SDK changes

Remove redundant `chunking_strategy` field and rename `custom_settings`
to `creation_settings` to match the updated SDK entity definitions
(langbot-plugin-sdk#36).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: fix ruff formatting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(rag): enforce immutability of embedding_model_uuid and non-editable creation_settings fields

Remove embedding_model_uuid from MUTABLE_FIELDS to prevent post-creation
modification via API. Add backend validation for creation_settings to
preserve fields marked editable:false in the plugin's creation schema.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style(rag): fix ruff formatting in knowledge service

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(rag): split settings into immutable creation_settings and mutable retrieval_settings

- Remove standalone embedding_model_uuid and top_k columns from KB entity
- Add retrieval_settings column; update MUTABLE_FIELDS/CREATE_FIELDS accordingly
- Merge migration logic into dbm019 (add retrieval_settings, migrate top_k
  and embedding_model_uuid into JSON settings, drop old columns on PostgreSQL)
- Remove _filter_creation_settings and per-field editable concept
- Frontend: creation_settings fields are all disabled when editing,
  retrieval_settings fields are always editable via a second DynamicFormComponent
- Remove editable from IDynamicFormItemSchema, DynamicFormItemConfig
- Clean up KBCardVO, KnowledgeBase API type, and localagent runner

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* bugfix: if ingest_document failed,not raise exep

* fix: ruff lint

* refactor(rag): remove unused _get_kb_entity method from RAGRuntimeService

* feat(vector): implement metadata filters for vector_search and vector_delete (#1997)

Add functional metadata filter support across all 5 VDB backends using
Chroma-style where syntax as the canonical format. Previously the filters
parameter existed throughout the stack but was entirely ignored.

- Add filter_utils.py with normalize_filter() and strip_unsupported_fields()
- Implement filter in search() and add delete_by_filter() for all backends:
  Chroma/SeekDB (native passthrough), Qdrant (translated to models.Filter),
  Milvus (translated to expr string), pgvector (translated to SQLAlchemy conditions)
- Milvus/pgvector limited to {text, file_id, chunk_uuid}; other fields logged and ignored
- Replace delete_by_filter() NotImplementedError with backend delegation in mgr.py
- Populate retrieval_context['filters'] from settings in kbmgr._retrieve()
- Pass search_type/query_text/documents through handler and runtime service

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style(vector): fix ruff formatting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(vector): remove numpy dependency and fix SeekDB search modes

- Remove numpy array conversion for query vectors; all VDB backends
  accept list[float] directly
- Remove redundant get_or_create_collection call from upsert; backends
  handle collection creation internally in add_embeddings
- Fix SeekDB to raise ValueError when vector dimension is unknown
  instead of defaulting to 384
- Use hybrid_search() for full-text and hybrid search modes in SeekDB,
  since pyseekdb's query() always requires embeddings

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(vector): escape single quotes in SeekDB documents and metadata

Document text containing apostrophes (e.g. "don't", "it's") causes
SQL syntax errors in OceanBase because single quotes were not in the
escape table. Add single-quote escaping and apply the escape table to
the documents parameter in add_embeddings(), not just metadata.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(vector): use standard SQL escaping for single quotes in SeekDB

Change single quote escaping from MySQL-style \' to standard SQL ''
(doubled quote). The backslash escape is not recognized by OceanBase
in NO_BACKSLASH_ESCAPES mode, causing SQL syntax errors when metadata
text contains apostrophes (e.g. O'Shea in academic citations).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(rag): persist retrieval_settings on knowledge base creation

retrieval_settings was not being passed from the service layer to
RAGManager.create_knowledge_base(), causing retrieval schema fields
(e.g. query_rewrite) to be lost on initial KB creation. They only
took effect after a subsequent edit/update.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(web): add show_if conditional rendering for dynamic forms

Support conditional field visibility in plugin-defined forms via
show_if rules (eq, neq, in operators). Fields can depend on values
from the same form or cross-reference between creation and retrieval
settings via externalDependentValues.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(rag): replace base64 with chunked file transfer for get_rag_file_stream

Use send_file() instead of base64 encoding for returning file content
in the GET_RAG_FILE_STREAM handler, avoiding memory issues with large files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(parser): add parser plugin integration and capability-aware upload UI (#2000)

* feat(parser): add parser plugin integration and capability-aware upload UI

Backend: add parser plugin API endpoints (list/invoke), connector and
handler support for parser actions, and KB manager passthrough.

Frontend: thread ragEngineCapabilities prop to FileUploadZone and use
doc_parsing capability to conditionally show the RAG engine option in
the parser selector. When no parser is available, show a warning
prompting users to install a parser plugin.

Update i18n: rename builtInParser to "Provided by RAG engine" and add
noParserAvailable warning message in all 4 locales.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(parser): replace base64 with chunked file transfer and remove stale cache

- Remove @alru_cache from list_parsers() and list_rag_engines()
- Replace inline base64 file content with send_file/read_local_file
  chunked transfer pattern in parse_document and invoke_parser flows
- Remove unused base64 import from kbmgr.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* feat(web): add Parser component kind to plugin market UI and i18n

Add Parser to kindIconMap, market filter toggle, and all 4 locale files
so parser plugins are properly displayed and filterable in the plugin
market, matching the existing RAGEngine treatment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style(web): fix prettier formatting from merge

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: rename RAGEngine to KnowledgeEngine across frontend and backend

* fix(web): fix I18nObject import path in FileUploadZone and KBDoc

* chore: format files involved in RAGEngine to KnowledgeEngine refactor

* refactor: change rag engine to knowledge engine

* fix: update langbot-plugin version to 0.3.0rc1

* chore: disable migration 20 for now

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Junyan Qin <rockchinq@gmail.com>
2026-03-06 21:54:38 +08:00
fdc310
3e8f47fd97 feat: judge and send runner category (local or cloud) for telemetry
* feat(chat): add runner_url to payload for telemetry tracking

* feat(telemetry): add runner_url to sanitized fields in telemetry payload

* feat(telemetry): replace runner_url with runner_category in telemetry payload and add runner utility functions

* fix:ruff
2026-03-06 00:44:09 +08:00