* refactor: pipeline routing rules - add routed_by_rule bypass and diagnostic logging
- Add routing rules editor (RoutingRulesEditor component)
- Add routed_by_rule bypass logic in response rules
- Add diagnostic logging for pipeline routing
- Database migration for bot pipeline routing rules
- Extract RoutingRulesEditor component from BotForm
- Revert log levels to debug
* feat: add message_has_element routing rule type
Support routing by message element type (Image, Voice, File, Forward,
Face, At, AtAll, Quote) with eq/neq operators.
* test: add unit tests for pipeline routing rules
20 tests covering _match_operator (eq/neq/contains/not_contains/
starts_with/regex/invalid) and resolve_pipeline_uuid (launcher_type/
launcher_id/message_content/message_has_element/first-match-wins/
skip-invalid/default-operator).
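The matching behaviour those tests describe can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the rule fields (`type`, `operator`, `value`, `pipeline_uuid`) and the shape of the message dictionary are assumptions. Only the operator names, the rule types, and the first-match-wins, skip-invalid and default-operator behaviours come from the commit message.

```python
import re

# Rule types named in the commit message.
RULE_TYPES = {"launcher_type", "launcher_id", "message_content", "message_has_element"}


def match_operator(operator, actual, expected):
    """Return True when `actual` satisfies `operator` against `expected`."""
    if operator == "eq":
        return actual == expected
    if operator == "neq":
        return actual != expected
    if operator == "contains":
        return expected in actual
    if operator == "not_contains":
        return expected not in actual
    if operator == "starts_with":
        return actual.startswith(expected)
    if operator == "regex":
        try:
            return re.search(expected, actual) is not None
        except re.error:
            return False  # a malformed pattern never matches
    return False  # unknown operators never match


def resolve_pipeline_uuid(rules, message, default_uuid):
    """Return the pipeline of the first matching rule, else `default_uuid`."""
    for rule in rules:
        rule_type = rule.get("type")
        target = rule.get("pipeline_uuid")
        if rule_type not in RULE_TYPES or not target:
            continue  # skip invalid rules instead of failing
        operator = rule.get("operator", "eq")  # default operator (assumed to be eq)
        expected = str(rule.get("value", ""))
        if rule_type == "message_has_element":
            present = expected in message.get("message_has_element", [])
            if operator == "eq":
                matched = present
            elif operator == "neq":
                matched = not present
            else:
                matched = False
        else:
            actual = str(message.get(rule_type, ""))
            matched = match_operator(operator, actual, expected)
        if matched:
            return target  # first match wins
    return default_uuid
```

Rules are evaluated in list order, so a broad rule placed early shadows any more specific rule that follows it.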
* fix(web): add missing 'message_has_element' to routing rule type validation
The Zod schema and TypeScript type for PipelineRoutingRule.type were
missing the 'message_has_element' variant, causing silent form validation
failure when saving routing rules with this type.
* feat: add pipeline discard functionality and localization support
* feat(web): improve drag-and-drop with DragOverlay, add discard monitoring and pipeline icons
- Add DragOverlay for smooth cursor-following drag in routing rules editor
- Remove transition to eliminate redundant swap animation on drop
- Record discarded messages in monitoring system via _record_discarded_message
- Display pipeline name (Workflow icon) and runner name (Play icon) on session monitor messages
- Show discard badge on discarded messages in session monitor
- Add i18n translations for discarded/userMessage/botMessage
* fix: ensure discarded messages appear in session monitor and improve icons
- Create/update monitoring session for discarded messages so they show in
the bot session monitor (was only inserting message rows, not sessions)
- Use human-readable 'Discarded' as pipeline_name instead of '__discard__'
- Change runner icon from Play to Bot for better AI Agent semantics
* fix: merge discarded messages into same session and remove session-level pipeline name
- Use LauncherTypes enum for session_id in discarded messages to match
the format used by monitoring_helper (fixes duplicate sessions)
- Don't overwrite session pipeline info on discard — a session can have
messages from multiple pipelines
- Remove pipeline_name from session list and chat header since it's
now shown per-message and a session is no longer single-pipeline
* fix(web): only show save button on config tab in bot detail page
* fix(web): scroll to bottom after messages render in session monitor
---------
Co-authored-by: RockChinQ <rockchinq@gmail.com>
* fix: coerce pipeline config types at load time using metadata definitions
Pipeline configs stored in SQLAlchemy JSON columns can have values turned
into strings after UI edits (e.g. "120" instead of 120), causing runtime
arithmetic/logic errors. Add centralized type coercion in load_pipeline()
that leverages existing metadata YAML type definitions (integer, number,
float, boolean) to convert values before they reach downstream stages.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
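A rough sketch of this kind of load-time coercion is shown below. It is an illustration under assumptions, not the code in load_pipeline(): the function names, the flat `type_map` argument and the accepted boolean spellings are all hypothetical. Only the type names (integer, number, float, boolean) come from the commit message.

```python
# Hypothetical spellings accepted for booleans stored as strings.
_TRUE = {"true", "1", "yes", "on"}
_FALSE = {"false", "0", "no", "off"}


def coerce_value(value, expected_type):
    """Convert a stringified value back to the type declared in metadata."""
    if not isinstance(value, str):
        return value  # already typed, leave it alone
    text = value.strip()
    try:
        if expected_type == "integer":
            return int(text)
        if expected_type in ("number", "float"):
            return float(text)
        if expected_type == "boolean":
            lowered = text.lower()
            if lowered in _TRUE:
                return True
            if lowered in _FALSE:
                return False
    except ValueError:
        pass  # not convertible: fall through and keep the original
    return value


def coerce_config(config, type_map):
    """Coerce every key of `config` whose declared type is in `type_map`."""
    return {key: coerce_value(val, type_map.get(key)) for key, val in config.items()}
```

For example, `coerce_config({"timeout": "120"}, {"timeout": "integer"})` yields `{"timeout": 120}`, while a value that cannot be converted is passed through unchanged rather than raising.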
* fix: address review - defensive getattr + add unit tests for config_coercion
- Use getattr with defaults for pipeline_config_meta_* attributes to
avoid AttributeError when MockApplication lacks these fields
- Add 18 unit tests for config_coercion module covering all code paths
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add dynamic form stage tracking and snapshot management
* fix: standardize string formatting in config coercion and improve logging messages
---------
Co-authored-by: KPC <kpc@kpc.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Junyan Qin <rockchinq@gmail.com>
* [issue:1933] RAG engine plugin architecture (#1967)
* refactor: migrate RAG knowledge services to a plugin-oriented host service architecture.
* feat(rag): phase 2 core refactor with RPC Action handlers
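* feat: add knowledge base creation and deletion event notifications for RAG plugins, and improve parameter passing and enum usage in RAG actions.
* feat: unify knowledge base management under the RAG engine, support dynamic configuration, and remove the legacy external knowledge base components.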
* refactor(rag): remove plugin_adapter, inline logic into RuntimeKnowledgeBase
BREAKING CHANGE: RAGPluginAdapter has been removed. All plugin
communication is now handled directly by RuntimeKnowledgeBase.
Architecture change:
- Before: RuntimeKnowledgeBase → RAGPluginAdapter → plugin_connector
- After: RuntimeKnowledgeBase → plugin_connector (direct)
Changes to kbmgr.py (RuntimeKnowledgeBase):
- Remove RAGPluginAdapter import and usage
- Inline plugin communication methods:
- _on_kb_create(): Notify plugin when KB is created
- _on_kb_delete(): Notify plugin when KB is deleted
- _ingest_document(): Call plugin for document ingestion
- _retrieve(): Call plugin for retrieval
- _delete_document(): Call plugin to delete document
- Simplify dispose(): Only notify plugin, no built-in VDB assumption
Changes to base.py (KnowledgeBaseInterface):
- Remove get_type() abstract method (outdated internal/external concept)
- Add get_rag_engine_plugin_id() abstract method
Changes to localagent.py:
- Remove get_type() call
- Simplify top_k retrieval from KB entity
Deleted files:
- pkg/rag/knowledge/plugin_adapter.py
Benefits:
- Reduced abstraction layer, simpler code
- Plugin communication logic centralized in RuntimeKnowledgeBase
- Easier to understand and maintain
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* refactor(api): remove ExternalKnowledgeBase infrastructure
BREAKING CHANGE: ExternalKnowledgeBase has been completely removed.
All knowledge bases are now unified under the single KnowledgeBase model,
differentiated by their rag_engine_plugin_id.
Deleted files:
- pkg/api/http/controller/groups/knowledge/external.py
(ExternalKBController with /external-bases routes)
- pkg/api/http/service/external_kb.py
(ExternalKnowledgeBaseService)
- pkg/rag/knowledge/external.py
(ExternalKnowledgeBase implementation)
Modified files:
- pkg/entity/persistence/rag.py:
Remove ExternalKnowledgeBase SQLAlchemy table definition
- pkg/core/app.py:
Remove external_kb_service attribute from LangBotApplication
- pkg/core/stages/build_app.py:
Remove external_kb_service initialization
Migration notes:
- Existing external knowledge base data should be migrated manually
- API consumers should use /api/v1/knowledge/bases for all KB operations
- Use /api/v1/knowledge/engines to discover available RAG engines
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* refactor(plugin): remove list_knowledge_retrievers from connector
Remove deprecated list_knowledge_retrievers functionality from the
plugin communication layer. This aligns with the SDK change that
removed the LIST_KNOWLEDGE_RETRIEVERS action.
Changes:
- connector.py: Remove list_knowledge_retrievers() method
- handler.py: Remove list_knowledge_retrievers() handler
The functionality is replaced by the new /api/v1/knowledge/engines
endpoint which lists available RAGEngine components with their
capabilities and configuration schemas.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* refactor(service): update knowledge service with capability-based checks
Replace type-based checks with capability-based checks for file
operations, aligning with the unified knowledge base architecture.
Changes to knowledge.py:
- store_file(): Replace get_type() check with doc_ingestion capability check
- delete_file(): Replace get_type() check with doc_ingestion capability check
- list_rag_engines(): Remove list_knowledge_retrievers call, simplify to
only list RAGEngine components (KnowledgeRetriever type removed)
Changes to pipelines.py:
- Minor cleanup related to knowledge base references
The capability-based approach allows RAG engines to declare their
supported features (doc_ingestion, chunking_config, rerank, hybrid_search)
and the system responds accordingly, rather than hardcoding behavior
based on internal/external type distinction.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* feat(web): unify knowledge base UI, remove external KB components
BREAKING CHANGE: The internal/external knowledge base distinction
has been removed from the frontend. All knowledge bases are now
displayed in a unified list, differentiated by their RAG engine.
Changes to page.tsx:
- Remove Tab component (built-in/external tabs)
- Remove selectedKbType state
- Unified knowledge base list display
- Single "Create Knowledge Base" button for all types
Changes to KBDetailDialog.tsx:
- Remove kbType prop
- Simplify dialog logic for unified KB handling
- Documents menu item conditionally shown based on doc_ingestion capability
Changes to KBForm.tsx:
- Remove retriever type handling code
- Simplify form for unified KB creation
- Dynamic form rendering based on RAG engine's creation_schema
Changes to KBCardVO.ts:
- Remove 'type' field from KBCardVO interface
Changes to BackendClient.ts:
- Remove all external KB related methods:
- getExternalKnowledgeBases()
- getExternalKnowledgeBase()
- createExternalKnowledgeBase()
- updateExternalKnowledgeBase()
- deleteExternalKnowledgeBase()
- retrieveFromExternalKnowledgeBase()
Changes to api/index.ts:
- Remove ExternalKnowledgeBase interface definition
UI/UX improvements:
- Users no longer need to understand internal vs external distinction
- RAG engine selection is now the primary differentiator
- Documents panel visibility is capability-driven (doc_ingestion)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* refactor(plugin): code review improvements for RAG handlers
- Unify embed_model field naming to embedding_model_uuid only
- Add structured error responses with error_type for RAG actions
- Fix file_size and mime_type detection in _store_file_task
- Improve error handling with detailed error context (error_type, original_error)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* refactor(rag): refactor KB dynamic form and vector manager
- Frontend: Refactor Knowledge Base form using DynamicForm components.
- Frontend: Remove obsolete jsonSchemaConverter utility.
- Backend: Update VectorManager and PluginHandler to support new RAG architecture.
- Chore: Update dependencies in pyproject.toml.
* fix: code review fixes for RAG refactor
- Remove DEBUG stderr outputs in handler.py
- Move repeated `import json` to file top
- Add warning log for unimplemented delete_by_filter
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* refactor(rag): consolidate valid_fields into entity constants
Define MUTABLE_FIELDS, CREATE_FIELDS, ALL_DB_FIELDS as class
constants in KnowledgeBase entity to eliminate duplication.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* refactor: move knowledge base retrieval and RAG engine info enrichment logic into the knowledge base manager.
* refactor(rag): introduce RAGRuntimeService and clean up plugin handler
- Create RAGRuntimeService to encapsulate RAG capability implementation (Embedding, VectorOps).
- Refactor PluginHandler to delegate RAG actions to RAGRuntimeService.
- Move KnowledgeService enrichment and creation logic to RAGManager.
- Register RAGRuntimeService in Application and BuildAppStage.
- Clean up legacy code in KnowledgeService.
* refactor(rag): standardize logger and fix type hints
- Use self.ap.logger consistently in kbmgr.py and runtime.py, removing module-level loggers.
- Fix type hints for retrieve_knowledge in handler.py and connector.py to match implementation returning dict.
* refactor: migrate engine badge styles from Tailwind CSS classes to CSS modules.
* fix(web): resolve React rendering errors in plugins page
- Fix missing key prop in PluginComponentList by using ternary instead of Fragment
- Fix RAGEngine.name type to I18nObject and use extractI18nObject() for rendering
- Preserves multi-language support
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix(rag): update runtime service and web components
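* refactor: improve the knowledge base settings structure and make the frontend distance display more robust.
* fix: handle null values in the frontend distance display.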
* fix(rag): document retrieve ui and kbmgr top_k validation
* Update the PyPI mirror source in uv.lock to the official address.
* fix: address code review issues for RAG engine plugin architecture
P0 fixes:
- Fix ALL_DB_FIELDS missing collection_id and emoji fields
- Move rag_engine_plugin_id to CREATE_FIELDS (immutable after creation)
- Fix creation_settings mutable default value (dict -> None)
- Rename vector delete method to delete_by_file_id for correct semantics
- Fix delete_by_filter to raise NotImplementedError instead of silent no-op
- Add database migration script (dbm019) for new columns and table cleanup
P1 fixes:
- Clean up design-hesitation comments in connector.py
- Add _parse_plugin_id() with format validation for all RAG methods
- Make _retrieve() raise exceptions instead of silently returning empty results
- Extract _make_rag_error_response() helper for clean error formatting
- Remove unused imports from handler.py
P2 fixes:
- Fix runtime.py indentation inconsistencies
- Simplify get_file_stream to use storage abstraction uniformly
- Reduce redundant DB queries in knowledge service (extract _check_doc_capability)
- Fix engines.py URL encoding: use <path:plugin_id> instead of __ replacement
- Add read-only mode for engine settings in KBForm edit mode
- Simplify page.tsx handleKBCardClick to pass only kbId string
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix: address code review findings for RAG plugin architecture
- Frontend: add retrieval_settings param to retrieveKnowledgeBase API call
- Backend: return {uuid} from PUT knowledge base to match frontend expectation
- Backend: validate query is non-empty in retrieve endpoint (400 on empty)
- Backend: rename vector_delete ids→file_ids for semantic clarity, keep
backward compat by accepting both 'file_ids' and 'ids' in RPC handler
- Backend: ensure rag_engine.name fallback is always I18nObject-compatible
dict, preventing frontend extractI18nObject from receiving plain strings
- Migration: fix misleading docstring about external_kb data migration
Co-authored-by: Cursor <cursoragent@cursor.com>
* Update langbot-plugin version to 0.2.6
* chore: update required database version from 18 to 19
* refactor: remove unused polymorphic component framework
* chore: fix lint and format issues for python and frontend
* fix(plugin): remove legacy `ids` fallback in rag_vector_delete handler
SDK now sends `file_ids` directly, the `ids` backward-compat fallback
is no longer needed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(rag): deep review fixes for critical bugs, security and quality
Critical:
- Fix StorageMgr.load() -> storage_provider.load() (C1, AttributeError)
- Update required_database_version 18 -> 19 (C2, migration never runs)
Security:
- Add path traversal validation in get_file_stream (C11)
- Add vectors/ids/metadata length validation in rag_vector_upsert (C12)
Logic fixes:
- Legacy KBs: set capabilities to [] instead of ['doc_ingestion'] (C4)
- Fix store_file return type int -> str (C5)
- Fix retrieve_knowledge return [] -> {'results': []} when disabled (C6)
- Re-raise exception in _on_kb_create instead of silently swallowing (C7)
- Log warning when KB not found in memory during delete (C8)
API fixes:
- Catch ValueError as 400 in create_knowledge_base endpoint (C15)
- Validate plugin_id format in engines endpoints (C16)
Quality:
- Remove dead if/else in migration with identical branches (C17)
- Fix variable shadowing: rag_context -> rag_context_text (C18)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: remove unused os import to fix ruff lint
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor(plugin): remove PolymorphicComponent sync from LangBot side
Remove sync_polymorphic_component_instances() from connector and handler,
and the post-connection sync call in initialize(). This dead code synced
an always-empty list of polymorphic instances that were never created.
Companion change to langbot-plugin-sdk PolymorphicComponent removal.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(rag): fix vector_delete count bug and remove vestigial instance_id parameter
1. vector_delete: assign return value from delete_by_filter to count
instead of silently returning 0 for filter-based deletion.
2. Remove instance_id parameter from the entire retrieve_knowledge
call chain (kbmgr → connector → handler → runtime). This parameter
was a remnant of the PolymorphicComponent mechanism and is no longer
used — RAGEngine operates as a stateless singleton.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
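* feat(web): support a field-level editable attribute in creation_schema to control whether fields can be modified in edit mode
- Add an optional editable attribute to IDynamicFormItemSchema
- Pass the editable attribute through DynamicFormItemConfig
- DynamicFormComponent accepts an isEditing prop and disables fields according to their editable value
- KBForm parses editable and passes isEditing to the dynamic form component
- Fields are editable by default when editable is unspecified; editable: false disables the field in edit mode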
* feat(storage): add size() abstract method with LocalStorage/S3 implementations
Support getting the size of a stored object; S3 uses head_object to avoid downloading the whole file
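The shape of such an abstract size() method can be sketched as below. The class names and the synchronous signature are assumptions made for illustration; the project's storage providers may well be asynchronous and named differently.

```python
import abc
import os


class StorageProvider(abc.ABC):
    """Hypothetical base class for storage backends."""

    @abc.abstractmethod
    def size(self, key: str) -> int:
        """Return the size in bytes of the object stored under `key`."""


class LocalStorageProvider(StorageProvider):
    """Local-disk backend: the size comes from file metadata, not a read."""

    def __init__(self, root: str) -> None:
        self.root = root

    def size(self, key: str) -> int:
        return os.path.getsize(os.path.join(self.root, key))
```

An S3 backend would implement the same method with a HEAD request; in boto3 that is `head_object`, whose response carries the object size in `ContentLength`, so no object body is transferred.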
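* fix(migration): log a warning before dropping the external_knowledge_bases table
- During migration, if the table contains data, log a warning first to avoid silent data loss
- Add a comment on chunk cleanup: it only applies to legacy non-plugin-architecture KBs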
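* fix(web): fix long retrieval result text stretching the container and hiding the query button
Add min-w-0 overflow-x-hidden to the main container of KBDetailDialog to
constrain the width of the flex-1 child container and keep long Dify RAG text from overflowing the Dialog boundary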
* fix(rag): address code review issues for plugin architecture PR
- Fix SQL injection in migration helpers by using bind parameters
- Move numpy import to module level in vector/mgr.py
- Improve path traversal validation using posixpath.normpath
- Add call_rag_retrieve to connector, eliminating duplicate plugin_id
parsing in kbmgr.py _retrieve
- Normalize typing style to modern dict/list/None syntax
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* style(web): fix prettier formatting errors
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor(rag): update embedding handling in RuntimeConnectionHandler
- Renamed RAG_EMBED_DOCUMENTS and RAG_EMBED_QUERY actions to INVOKE_EMBEDDING for clarity.
- Removed embed_documents and embed_query methods from RuntimeEmbeddingModel and RAGRuntimeService.
- Integrated embedding model retrieval directly in the invoke_embedding method, improving error handling for missing models.
- Updated the embedding invocation logic to streamline the process and enhance error reporting.
* refactor(web): replace KnowledgeRetriever with RAGEngine across frontend and tests
KnowledgeRetriever component type has been removed in favor of the new
RAGEngine architecture. Update all remaining references in i18n locales,
plugin component icon mappings, marketplace filter, and unit tests.
Addresses reviewer notes from RockChinQ on PR #1967.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(rag): address critical bugs found in deep review
- Fix path traversal bypass in runtime.py (check all path components for '..')
- Use normalized path for file loading instead of raw user input
- Change knowledge_bases from list to dict for O(1) lookup and race safety
- Add rollback on KB creation failure (clean up DB + runtime on plugin error)
- Add null check after KB update in knowledge service
- Fix file extension parsing to use os.path.splitext instead of split('.')
(handles multi-dot filenames like 'report.v2.pdf' correctly)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
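The path and extension fixes in this commit can be illustrated with a small sketch. The function names are hypothetical and the exact checks in runtime.py may differ; the sketch only shows the general technique of rejecting any `..` component, loading from the normalized path, and using os.path.splitext for extensions.

```python
import os
import posixpath


def normalize_storage_path(raw_path: str) -> str:
    """Return a normalized relative path, or raise ValueError if unsafe."""
    unified = raw_path.replace("\\", "/")
    if unified.startswith("/"):
        raise ValueError("absolute paths are not allowed")
    # Check every component, so 'a/../../b' is rejected before normalization.
    if ".." in unified.split("/"):
        raise ValueError("path traversal is not allowed")
    normalized = posixpath.normpath(unified)
    if normalized in ("", "."):
        raise ValueError("empty path")
    return normalized  # callers load this value, never the raw input


def file_extension(filename: str) -> str:
    """Return the lowercase extension without the dot ('' if none)."""
    return os.path.splitext(filename)[1].lstrip(".").lower()
```

With a multi-dot name such as `report.v2.pdf`, `file_extension` returns `pdf`, whereas indexing the result of `split('.')` at position 1 would give `v2`.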
* fix(rag): address remaining review issues across frontend and backend
Frontend:
- Fix KB delete: use async/await with error handling instead of fire-and-forget
- Fix capabilities null check: add optional chaining to prevent crash
- Add toast.error on KB info load failure instead of silent console.error
- Replace hard-coded Chinese validation message with i18n key
- Replace hard-coded English error messages in DynamicFormItemComponent with i18n
- Optimize document polling: stop when all documents reach terminal state
- Add i18n keys (fieldRequired, loadKnowledgeBaseFailed,
deleteKnowledgeBaseFailed, getKnowledgeBaseListError) to all 4 locales
Backend:
- Fix KB delete atomicity: delete from DB first, then notify plugin
- Add RAG engine plugin existence validation before creating KB
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* style(rag): fix ruff formatting in kbmgr.py
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Junyan Qin <rockchinq@gmail.com>
* chore: bump langbot-plugin to 0.3.0 (#1992)
* chore: correct sdk version to 0.3.0a1
* feat: normalize rag related actions' names
* refactor(rag): align IngestionContext fields with SDK changes
Remove redundant `chunking_strategy` field and rename `custom_settings`
to `creation_settings` to match the updated SDK entity definitions
(langbot-plugin-sdk#36).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* style: fix ruff formatting
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(rag): enforce immutability of embedding_model_uuid and non-editable creation_settings fields
Remove embedding_model_uuid from MUTABLE_FIELDS to prevent post-creation
modification via API. Add backend validation for creation_settings to
preserve fields marked editable:false in the plugin's creation schema.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* style(rag): fix ruff formatting in knowledge service
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor(rag): split settings into immutable creation_settings and mutable retrieval_settings
- Remove standalone embedding_model_uuid and top_k columns from KB entity
- Add retrieval_settings column; update MUTABLE_FIELDS/CREATE_FIELDS accordingly
- Merge migration logic into dbm019 (add retrieval_settings, migrate top_k
and embedding_model_uuid into JSON settings, drop old columns on PostgreSQL)
- Remove _filter_creation_settings and per-field editable concept
- Frontend: creation_settings fields are all disabled when editing,
retrieval_settings fields are always editable via a second DynamicFormComponent
- Remove editable from IDynamicFormItemSchema, DynamicFormItemConfig
- Clean up KBCardVO, KnowledgeBase API type, and localagent runner
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* bugfix: ingest_document failure did not raise an exception
* fix: ruff lint
* refactor(rag): remove unused _get_kb_entity method from RAGRuntimeService
* feat(vector): implement metadata filters for vector_search and vector_delete (#1997)
Add functional metadata filter support across all 5 VDB backends using
Chroma-style where syntax as the canonical format. Previously the filters
parameter existed throughout the stack but was entirely ignored.
- Add filter_utils.py with normalize_filter() and strip_unsupported_fields()
- Implement filter in search() and add delete_by_filter() for all backends:
Chroma/SeekDB (native passthrough), Qdrant (translated to models.Filter),
Milvus (translated to expr string), pgvector (translated to SQLAlchemy conditions)
- Milvus/pgvector limited to {text, file_id, chunk_uuid}; other fields logged and ignored
- Replace delete_by_filter() NotImplementedError with backend delegation in mgr.py
- Populate retrieval_context['filters'] from settings in kbmgr._retrieve()
- Pass search_type/query_text/documents through handler and runtime service
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
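To make the translation step concrete, here is a hedged sketch of turning a Chroma-style where filter into a Milvus-like boolean expression string. It is not the contents of filter_utils.py: the function bodies, the `to_expr` name and the exact output format are assumptions, and only normalize_filter(), the Chroma-style syntax and the supported field set come from the commit message.

```python
import json

# Fields the commit says Milvus/pgvector can filter on.
SUPPORTED_FIELDS = {"text", "file_id", "chunk_uuid"}
_COMPARISONS = {"$eq": "==", "$ne": "!=", "$gt": ">", "$gte": ">=", "$lt": "<", "$lte": "<="}


def normalize_filter(where):
    """Expand shorthand {"field": value} into {"field": {"$eq": value}}."""
    normalized = {}
    for key, cond in where.items():
        if key in ("$and", "$or"):
            normalized[key] = [normalize_filter(sub) for sub in cond]
        elif isinstance(cond, dict):
            normalized[key] = cond
        else:
            normalized[key] = {"$eq": cond}
    return normalized


def to_expr(where):
    """Render a where filter as an expression string (illustrative format)."""
    parts = []
    for key, cond in normalize_filter(where).items():
        if key in ("$and", "$or"):
            rendered = [expr for expr in (to_expr(sub) for sub in cond) if expr]
            if rendered:
                parts.append("(" + f" {key[1:]} ".join(rendered) + ")")
            continue
        if key not in SUPPORTED_FIELDS:
            continue  # unsupported fields are ignored (the real code also logs them)
        for op, operand in cond.items():
            if op in _COMPARISONS:
                parts.append(f"{key} {_COMPARISONS[op]} {json.dumps(operand)}")
            elif op == "$in":
                parts.append(f"{key} in {json.dumps(operand)}")
            elif op == "$nin":
                parts.append(f"{key} not in {json.dumps(operand)}")
    return " and ".join(parts)
```

Backends with native support (Chroma, SeekDB) would receive the normalized dictionary as-is; only backends without it need a rendering step like `to_expr`.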
* style(vector): fix ruff formatting
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(vector): remove numpy dependency and fix SeekDB search modes
- Remove numpy array conversion for query vectors; all VDB backends
accept list[float] directly
- Remove redundant get_or_create_collection call from upsert; backends
handle collection creation internally in add_embeddings
- Fix SeekDB to raise ValueError when vector dimension is unknown
instead of defaulting to 384
- Use hybrid_search() for full-text and hybrid search modes in SeekDB,
since pyseekdb's query() always requires embeddings
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(vector): escape single quotes in SeekDB documents and metadata
Document text containing apostrophes (e.g. "don't", "it's") causes
SQL syntax errors in OceanBase because single quotes were not in the
escape table. Add single-quote escaping and apply the escape table to
the documents parameter in add_embeddings(), not just metadata.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(vector): use standard SQL escaping for single quotes in SeekDB
Change single quote escaping from MySQL-style \' to standard SQL ''
(doubled quote). The backslash escape is not recognized by OceanBase
in NO_BACKSLASH_ESCAPES mode, causing SQL syntax errors when metadata
text contains apostrophes (e.g. O'Shea in academic citations).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
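The difference between the two escaping styles is easy to show. The helper name below is hypothetical, and SQLite from the standard library stands in for OceanBase purely to demonstrate that the doubled quote is understood by a standard SQL parser.

```python
import sqlite3


def escape_sql_literal(text: str) -> str:
    """Escape single quotes the standard SQL way: ' becomes ''."""
    return text.replace("'", "''")


def roundtrip(text: str) -> str:
    """Embed `text` in a SQL string literal and read it back."""
    conn = sqlite3.connect(":memory:")
    try:
        query = "SELECT '" + escape_sql_literal(text) + "'"
        return conn.execute(query).fetchone()[0]
    finally:
        conn.close()
```

`roundtrip("O'Shea")` returns the original string unchanged. Where the driver allows it, bind parameters remain the safer choice over building literals by hand.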
* fix(rag): persist retrieval_settings on knowledge base creation
retrieval_settings was not being passed from the service layer to
RAGManager.create_knowledge_base(), causing retrieval schema fields
(e.g. query_rewrite) to be lost on initial KB creation. They only
took effect after a subsequent edit/update.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat(web): add show_if conditional rendering for dynamic forms
Support conditional field visibility in plugin-defined forms via
show_if rules (eq, neq, in operators). Fields can depend on values
from the same form or cross-reference between creation and retrieval
settings via externalDependentValues.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(rag): replace base64 with chunked file transfer for get_rag_file_stream
Use send_file() instead of base64 encoding for returning file content
in the GET_RAG_FILE_STREAM handler, avoiding memory issues with large files.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
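The memory argument behind this change can be sketched with a generic chunked reader. This does not reproduce send_file(), whose signature is not shown in this log; `iter_chunks` and the chunk size are hypothetical and only illustrate reading a file in bounded pieces instead of encoding it whole.

```python
import io

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size in bytes


def iter_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield successive blocks from a binary stream until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


def total_bytes(stream, chunk_size=CHUNK_SIZE):
    """Consume a stream chunk by chunk, holding one chunk in memory at a time."""
    return sum(len(chunk) for chunk in iter_chunks(stream, chunk_size))
```

Peak memory is bounded by the chunk size, whereas base64 needs the whole payload in memory and produces output roughly one third larger than the input.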
* feat(parser): add parser plugin integration and capability-aware upload UI (#2000)
* feat(parser): add parser plugin integration and capability-aware upload UI
Backend: add parser plugin API endpoints (list/invoke), connector and
handler support for parser actions, and KB manager passthrough.
Frontend: thread ragEngineCapabilities prop to FileUploadZone and use
doc_parsing capability to conditionally show the RAG engine option in
the parser selector. When no parser is available, show a warning
prompting users to install a parser plugin.
Update i18n: rename builtInParser to "Provided by RAG engine" and add
noParserAvailable warning message in all 4 locales.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(parser): replace base64 with chunked file transfer and remove stale cache
- Remove @alru_cache from list_parsers() and list_rag_engines()
- Replace inline base64 file content with send_file/read_local_file
chunked transfer pattern in parse_document and invoke_parser flows
- Remove unused base64 import from kbmgr.py
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat(web): add Parser component kind to plugin market UI and i18n
Add Parser to kindIconMap, market filter toggle, and all 4 locale files
so parser plugins are properly displayed and filterable in the plugin
market, matching the existing RAGEngine treatment.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* style(web): fix prettier formatting from merge
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor: rename RAGEngine to KnowledgeEngine across frontend and backend
* fix(web): fix I18nObject import path in FileUploadZone and KBDoc
* chore: format files involved in RAGEngine to KnowledgeEngine refactor
* refactor: change rag engine to knowledge engine
* fix: update langbot-plugin version to 0.3.0rc1
* chore: disable migration 20 for now
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Junyan Qin <rockchinq@gmail.com>
* feat: add GitHub Actions workflow for linting with Ruff
* refactor: rename lint job and add formatting step to Ruff workflow
* chore: run ruff format
* chore: rename Ruff lint job to 'Lint' and add frontend linting workflow
* Initial plan
* Add package structure and resource path utilities
- Created langbot/ package with __init__.py and __main__.py entry point
- Added paths utility to find frontend and resource files from package installation
- Updated config loading to use resource paths
- Updated frontend serving to use resource paths
- Added MANIFEST.in for package data inclusion
- Updated pyproject.toml with build system and entry points
Co-authored-by: RockChinQ <45992437+RockChinQ@users.noreply.github.com>
* Add PyPI publishing workflow and update license
- Created GitHub Actions workflow to build frontend and publish to PyPI
- Added license field to pyproject.toml to fix deprecation warning
- Updated .gitignore to exclude build artifacts
- Tested package building successfully
Co-authored-by: RockChinQ <45992437+RockChinQ@users.noreply.github.com>
* Add PyPI installation documentation
- Created PYPI_INSTALLATION.md with detailed installation and usage instructions
- Updated README.md to feature uvx/pip installation as recommended method
- Updated README_EN.md with same changes for English documentation
Co-authored-by: RockChinQ <45992437+RockChinQ@users.noreply.github.com>
* Address code review feedback
- Made package-data configuration more specific to langbot package only
- Improved path detection with caching to avoid repeated file I/O
- Removed sys.path searching which was incorrect for package data
- Removed interactive input() call for non-interactive environment compatibility
- Simplified error messages for version check
Co-authored-by: RockChinQ <45992437+RockChinQ@users.noreply.github.com>
* Fix code review issues
- Use specific exception types instead of bare except
- Fix misleading comments about directory levels
- Remove redundant existence check before makedirs with exist_ok=True
- Use context manager for file opening to ensure proper cleanup
Co-authored-by: RockChinQ <45992437+RockChinQ@users.noreply.github.com>
* Simplify package configuration and document behavioral differences
- Removed redundant package-data configuration, relying on MANIFEST.in
- Added documentation about behavioral differences between package and source installation
- Clarified that include-package-data=true uses MANIFEST.in for data files
Co-authored-by: RockChinQ <45992437+RockChinQ@users.noreply.github.com>
* chore: update pyproject.toml
* chore: try pack templates in langbot/
* chore: update
* chore: update
* chore: update
* chore: update
* chore: update
* chore: adjust dir structure
* chore: fix imports
* fix: read default-pipeline-config.json
* fix: read default-pipeline-config.json
* fix: tests
* ci: publish pypi
* chore: bump version 4.6.0-beta.1 for testing
* chore: add templates/**
* fix: send adapters and requesters icons
* chore: bump version 4.6.0b2 for testing
* chore: add platform field for docker-compose.yaml
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: RockChinQ <45992437+RockChinQ@users.noreply.github.com>
Co-authored-by: Junyan Qin <rockchinq@gmail.com>
* feat: add comprehensive unit tests for pipeline stages
* fix: deps install in ci
* ci: use venv
* ci: run run_tests.sh
* fix: resolve circular import issues in pipeline tests
Update all test files to use lazy imports via importlib.import_module()
to avoid circular dependency errors. Fix mock_conversation fixture to
properly mock list.copy() method.
Changes:
- Use lazy import pattern in all test files
- Fix conftest.py fixture for conversation messages
- Add integration test file for full import tests
- Update documentation with known issues and workarounds
Tests now successfully avoid circular import errors while maintaining
full test coverage of pipeline stages.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
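The lazy import pattern mentioned here can be sketched as follows. The helper name is hypothetical and the real tests may call importlib.import_module() directly inside fixtures; the point is only that the import happens at call time rather than at module load time, which sidesteps import cycles during test collection.

```python
import importlib
from functools import lru_cache


@lru_cache(maxsize=None)
def lazy_import(module_name: str):
    """Import a module on first use and return the same object afterwards."""
    return importlib.import_module(module_name)


def dump_json(payload) -> str:
    """Example consumer: the dependency is resolved only when this is called."""
    json_module = lazy_import("json")
    return json_module.dumps(payload)
```

A test would call `lazy_import(...)` with the dotted path of the stage module inside the test body or a fixture, after any module that must be registered first has been imported.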
* docs: add comprehensive testing summary
Document implementation details, challenges, solutions, and future
improvements for the pipeline unit test suite.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* refactor: rewrite unit tests to test actual pipeline stage code
Rewrote unit tests to properly test real stage implementations instead of
mock logic:
- Test actual BanSessionCheckStage with 7 test cases (100% coverage)
- Test actual RateLimit stage with 3 test cases (70% coverage)
- Test actual PipelineManager with 5 test cases
- Use lazy imports via import_module to avoid circular dependencies
- Import pipelinemgr first to ensure proper stage registration
- Use Query.model_construct() to bypass Pydantic validation in tests
- Remove obsolete pure unit tests that didn't test real code
- All 20 tests passing with 48% overall pipeline coverage
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* test: add unit tests for GroupRespondRuleCheckStage
Added comprehensive unit tests for resprule stage:
- Test person message skips rule check
- Test group message with no matching rules (INTERRUPT)
- Test group message with matching rule (CONTINUE)
- Test AtBotRule removes At component correctly
- Test AtBotRule when no At component present
Coverage: 100% on resprule.py and atbot.py
All 25 tests passing with 51% overall pipeline coverage
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* refactor: restructure tests to tests/unit_tests/pipeline
Reorganized test directory structure to support multiple test categories:
- Move tests/pipeline → tests/unit_tests/pipeline
- Rename .github/workflows/pipeline-tests.yml → run-tests.yml
- Update run_tests.sh to run all unit tests (not just pipeline)
- Update workflow to trigger on all pkg/** and tests/** changes
- Coverage now tracks entire pkg/ module instead of just pipeline
This structure allows for easy addition of more unit tests for other
modules in the future.
All 25 tests passing with 21% overall pkg coverage.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* ci: upload codecov report
* ci: codecov file
* ci: coverage.xml
---------
Co-authored-by: Claude <noreply@anthropic.com>