fix(litellmchat): preserve provider_specific_fields for Gemini thought_signature (#2265)

Update _normalize_stream_tool_calls to preserve provider_specific_fields (including thought_signature) from streaming tool call chunks. Also preserve provider_specific_fields from delta in invoke_llm_stream.

This ensures Gemini's thought_signature is round-tripped correctly:

1. LiteLLM extracts thought_signature from the Gemini response
2. It is preserved in Message/ToolCall entities (via SDK changes)
3. _convert_messages includes it in the next request

Also add unit tests for provider_specific_fields round-tripping.

Fixes: langbot-app/LangBot#1899
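The per-index state handling described above can be sketched in isolation. This is a simplified illustration rather than the requester's actual code: `carry_provider_fields` is a hypothetical name, the chunks are plain dicts, and only the fields relevant to `provider_specific_fields` are modelled.

```python
from typing import Any


def carry_provider_fields(
    state: dict[int, dict[str, Any]],
    raw_tool_calls: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Hypothetical, simplified stand-in for _normalize_stream_tool_calls."""
    normalized: list[dict[str, Any]] = []
    for fallback_index, tool_call in enumerate(raw_tool_calls):
        index = tool_call.get('index')
        if not isinstance(index, int):
            index = fallback_index

        # State survives across chunks, keyed by the tool call's stream index.
        entry = state.setdefault(
            index, {'id': '', 'name': '', 'provider_specific_fields': None}
        )
        if tool_call.get('id'):
            entry['id'] = tool_call['id']
        function = tool_call.get('function') or {}
        if function.get('name'):
            entry['name'] = function['name']

        # Tool-level fields overwrite the stored value when the key is present.
        if 'provider_specific_fields' in tool_call:
            entry['provider_specific_fields'] = tool_call['provider_specific_fields']
        # Function-level fields are merged on top and win on key conflicts.
        func_psf = function.get('provider_specific_fields')
        if func_psf:
            entry['provider_specific_fields'] = {
                **(entry['provider_specific_fields'] or {}),
                **func_psf,
            }

        if not entry['id'] or not entry['name']:
            continue

        out: dict[str, Any] = {
            'id': entry['id'],
            'type': 'function',
            'function': {
                'name': entry['name'],
                'arguments': function.get('arguments') or '',
            },
        }
        if entry['provider_specific_fields']:
            out['provider_specific_fields'] = entry['provider_specific_fields']
        normalized.append(out)
    return normalized


state: dict[int, dict[str, Any]] = {}
first = carry_provider_fields(state, [{
    'index': 0,
    'id': 'call_abc',
    'function': {'name': 'get_weather', 'arguments': ''},
    'provider_specific_fields': {'thought_signature': 'dGVzdF9zaWduYXR1cmU='},
}])
# The second chunk carries only argument text; the signature comes from state.
second = carry_provider_fields(state, [{
    'index': 0,
    'function': {'arguments': '{"city": "Tokyo"}'},
}])
```

Because the signature is stored per index, later chunks that carry only argument fragments still emit it, which is the behaviour the new unit tests assert.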
Add plugin rerank invocation action (#2242)
2026-06-19 20:14:20 +00:00 · 2026-06-19 23:26:12 +08:00 · 2026-06-19 23:25:54 +08:00 · 2026-06-19 23:13:56 +08:00 · 2026-06-19 18:39:58 +08:00 · 2026-06-19 06:20:17 -04:00
14 changed files with 569 additions and 56 deletions
@@ -70,7 +70,7 @@ dependencies = [
    "chromadb>=1.0.0,<2.0.0",
    "qdrant-client (>=1.15.1,<2.0.0)",
    "pyseekdb==1.1.0.post3",
-    "langbot-plugin==0.4.4",
+    "langbot-plugin==0.4.5",
    "asyncpg>=0.30.0",
    "line-bot-sdk>=3.19.0",
    "matrix-nio>=0.25.2",
@@ -1302,11 +1302,19 @@ class BoxService:
    def get_recent_errors(self) -> list[dict]:
        return list(self._recent_errors)

-    def get_system_guidance(self) -> str:
+    def get_system_guidance(self, query_id=None) -> str:
        """Return LLM system-prompt guidance for the exec tool.

        All execution-specific prompt text is kept here so that callers
        (e.g. LocalAgentRunner) stay free of box domain knowledge.
+
+        ``query_id`` is the current turn's pipeline query id. When provided,
+        the guidance ALWAYS advertises the per-query outbox path so the agent
+        knows how to deliver generated files back to the user — even on turns
+        where the user sent no inbound attachment (e.g. "generate a QR code"),
+        which is exactly when the inbound-attachment note never fires. Outbound
+        collection in the wrapper runs on every turn regardless of inbound
+        files, so without this the file would be produced and silently dropped.
        """
        guidance = (
            'When the exec tool is available, use it for exact calculations, statistics, structured data parsing, '
@@ -1321,6 +1329,13 @@ class BoxService:
                'modify local files in the working directory, use exec with /workspace paths directly; do not ask the '
                'user for directory parameters unless they explicitly need a different directory.'
            )
+        if query_id is not None:
+            outbox_dir = f'{self.OUTBOX_MOUNT_DIR}/{query_id}'
+            guidance += (
+                f' If you produce any file (image, audio, document, etc.) that should be sent back to the user, '
+                f'write it into {outbox_dir}/ (create the directory if needed). Every file placed there will be '
+                'delivered to the user automatically; do not paste file contents or base64 into your reply.'
+            )
        return guidance

    async def get_status(self) -> dict:
@@ -514,6 +514,35 @@ class RuntimeConnectionHandler(handler.Handler):
            except Exception as e:
                return _make_rag_error_response(e, 'EmbeddingError', embedding_model_uuid=embedding_model_uuid)

+        @self.action(PluginToRuntimeAction.INVOKE_RERANK)
+        async def invoke_rerank(data: dict[str, Any]) -> handler.ActionResponse:
+            rerank_model_uuid = data['rerank_model_uuid']
+            query = data['query']
+            documents = data['documents']
+            top_k = data.get('top_k')
+            extra_args = data.get('extra_args', {})
+
+            try:
+                rerank_model = await self.ap.model_mgr.get_rerank_model_by_uuid(rerank_model_uuid)
+            except ValueError:
+                return handler.ActionResponse.error(
+                    message=f'Rerank model with rerank_model_uuid {rerank_model_uuid} not found',
+                )
+
+            try:
+                scores = await rerank_model.provider.invoke_rerank(
+                    model=rerank_model,
+                    query=query,
+                    documents=documents[:64],
+                    extra_args=extra_args,
+                )
+                scored = sorted(scores, key=lambda x: x.get('relevance_score', 0), reverse=True)
+                if top_k is not None:
+                    scored = scored[: int(top_k)]
+                return handler.ActionResponse.success(data={'results': scored})
+            except Exception as e:
+                return _make_rag_error_response(e, 'RerankError', rerank_model_uuid=rerank_model_uuid)
+
        @self.action(PluginToRuntimeAction.VECTOR_UPSERT)
        async def vector_upsert(data: dict[str, Any]) -> handler.ActionResponse:
            collection_id = data['collection_id']
@@ -363,9 +363,13 @@ class LiteLLMRequester(requester.ProviderAPIRequester):
    def _normalize_stream_tool_calls(
        self,
        raw_tool_calls: typing.Any,
-        tool_call_state: dict[int, dict[str, str]],
+        tool_call_state: dict[int, dict[str, typing.Any]],
    ) -> list[dict] | None:
-        """Fill OpenAI-style streaming tool-call deltas so MessageChunk can validate them."""
+        """Fill OpenAI-style streaming tool-call deltas so MessageChunk can validate them.
+
+        Also preserves provider_specific_fields (e.g., Gemini thought_signature) for
+        round-tripping to the next request.
+        """
        if not raw_tool_calls:
            return None

@@ -376,27 +380,59 @@ class LiteLLMRequester(requester.ProviderAPIRequester):
            if not isinstance(index, int):
                index = fallback_index

-            state = tool_call_state.setdefault(index, {'id': '', 'type': 'function', 'name': ''})
+            state = tool_call_state.setdefault(
+                index,
+                {
+                    'id': '',
+                    'type': 'function',
+                    'name': '',
+                    'provider_specific_fields': None,
+                },
+            )
            if tool_call.get('id'):
                state['id'] = tool_call['id']
            if tool_call.get('type'):
                state['type'] = tool_call['type']

+            # Preserve provider_specific_fields from the raw tool call
+            if 'provider_specific_fields' in tool_call:
+                state['provider_specific_fields'] = tool_call['provider_specific_fields']
+
            function = self._as_dict(tool_call.get('function'))
            if function.get('name'):
                state['name'] = function['name']

+            # Also check function-level provider_specific_fields
+            if 'provider_specific_fields' in function:
+                # Merge function-level into tool-level, function-level takes precedence
+                func_psf = function['provider_specific_fields']
+                if state['provider_specific_fields']:
+                    merged = {**state['provider_specific_fields'], **func_psf}
+                    state['provider_specific_fields'] = merged
+                else:
+                    state['provider_specific_fields'] = func_psf
+
            arguments = function.get('arguments')
            if arguments is None:
                arguments = ''
            elif not isinstance(arguments, str):
                arguments = str(arguments)

+            # Some OpenAI-compatible providers (notably Ollama's
+            # /v1/chat/completions) stream a tool-call delta with an `index` and
+            # a `function` payload but never emit an OpenAI-style `id`. Without
+            # an id the call used to be dropped here, so the whole tool call
+            # silently vanished: a tool-only turn then yielded no content and no
+            # tool call, the stream "completed" with 0 chars, and the chat
+            # appeared stuck. Synthesize a stable per-index id so named-but-idless
+            # tool calls survive. Providers that do send ids keep theirs.
+            if not state['id'] and state['name']:
+                state['id'] = f'call_{index}'
+
            if not state['id'] or not state['name']:
                continue

-            normalized.append(
-                {
+            tool_call_dict: dict[str, typing.Any] = {
                'id': state['id'],
                'type': state['type'] or 'function',
                'function': {
@@ -404,7 +440,12 @@ class LiteLLMRequester(requester.ProviderAPIRequester):
                    'arguments': arguments,
                },
            }
-            )
+
+            # Include provider_specific_fields if present
+            if state['provider_specific_fields']:
+                tool_call_dict['provider_specific_fields'] = state['provider_specific_fields']
+
+            normalized.append(tool_call_dict)

        return normalized or None

@@ -528,7 +569,7 @@ class LiteLLMRequester(requester.ProviderAPIRequester):

        chunk_idx = 0
        role = 'assistant'
-        tool_call_state: dict[int, dict[str, str]] = {}
+        tool_call_state: dict[int, dict[str, typing.Any]] = {}

        try:
            response = await acompletion(**args)
@@ -578,13 +619,17 @@ class LiteLLMRequester(requester.ProviderAPIRequester):
                    chunk_idx += 1
                    continue

-                chunk_data = {
+                chunk_data: dict[str, typing.Any] = {
                    'role': role,
                    'content': delta_content if delta_content else None,
                    'tool_calls': tool_calls,
                    'is_final': bool(finish_reason),
                }

+                # Preserve provider_specific_fields from delta (e.g., Gemini thought_signatures)
+                if delta.get('provider_specific_fields'):
+                    chunk_data['provider_specific_fields'] = delta['provider_specific_fields']
+
                chunk_data = {k: v for k, v in chunk_data.items() if v is not None}
                yield provider_message.MessageChunk(**chunk_data)
                chunk_idx += 1
@@ -3,8 +3,8 @@ kind: LLMAPIRequester
 metadata:
  name: moonshot-chat-completions
  label:
-    en_US: Moonshot
-    zh_Hans: 月之暗面
+    en_US: Moonshot / Kimi (Global · api.moonshot.ai)
+    zh_Hans: 月之暗面 / Kimi（国际站 · api.moonshot.ai）
  icon: moonshot.png
 spec:
  litellm_provider: openai
@@ -0,0 +1,33 @@
+apiVersion: v1
+kind: LLMAPIRequester
+metadata:
+  name: moonshot-cn-chat-completions
+  label:
+    en_US: Moonshot / Kimi (China · api.moonshot.cn)
+    zh_Hans: 月之暗面 / Kimi（国内站 · api.moonshot.cn）
+  icon: moonshot.png
+spec:
+  litellm_provider: openai
+  config:
+  - name: base_url
+    label:
+      en_US: Base URL
+      zh_Hans: 基础 URL
+    type: string
+    required: true
+    default: https://api.moonshot.cn/v1
+  - name: timeout
+    label:
+      en_US: Timeout
+      zh_Hans: 超时时间
+    type: integer
+    required: true
+    default: 120
+  alias: "moonshot Moonshot 月之暗面 月暗 kimi Kimi 月之 暗面 moonshot-v1 k2 cn 国内 国内站"
+  support_type:
+  - llm
+  provider_category: manufacturer
+execution:
+  python:
+    path: ./moonshotchatcmpl.py
+    attr: MoonshotChatCompletions
@@ -177,7 +177,7 @@ class LocalAgentRunner(runner.RequestRunner):
            req_messages.append(
                provider_message.Message(
                    role='system',
-                    content=self.ap.box_service.get_system_guidance(),
+                    content=self.ap.box_service.get_system_guidance(query.query_id),
                )
            )

@@ -546,6 +546,41 @@ async def test_box_service_rejects_host_mount_outside_allowed_roots(tmp_path):
        )


+class TestGetSystemGuidance:
+    """``get_system_guidance`` must ALWAYS advertise the per-query outbox path
+    when given a ``query_id`` — even with no inbound attachment — so files the
+    agent generates (QR codes, charts, rendered docs) are actually delivered.
+
+    The wrapper collects the outbox on every turn regardless of inbound files;
+    before this, the agent was only told the outbox path inside the
+    inbound-attachment note, so pure-generation turns produced files that were
+    silently dropped.
+    """
+
+    def _service(self, logger=None):
+        logger = logger or Mock()
+        runtime = BoxRuntime(logger=logger, backends=[FakeBackend(logger)], session_ttl_sec=300)
+        return BoxService(make_app(logger), client=_InProcessBoxRuntimeClient(logger, runtime))
+
+    def test_guidance_includes_outbox_when_query_id_given(self):
+        service = self._service()
+        guidance = service.get_system_guidance(42)
+        assert f'{service.OUTBOX_MOUNT_DIR}/42' in guidance
+        assert 'delivered to the user automatically' in guidance
+
+    def test_guidance_omits_outbox_without_query_id(self):
+        service = self._service()
+        guidance = service.get_system_guidance()
+        assert service.OUTBOX_MOUNT_DIR not in guidance
+        # core exec guidance is still present
+        assert 'exec tool' in guidance
+
+    def test_guidance_outbox_independent_of_inbound_attachments(self):
+        # A bare query_id (the pure-generation case) still gets the outbox note.
+        service = self._service()
+        assert f'{service.OUTBOX_MOUNT_DIR}/0' in service.get_system_guidance(0)
+
+
@pytest.mark.asyncio
 async def test_box_runtime_rejects_host_mount_conflict_in_same_session(tmp_path):
    logger = Mock()
@@ -27,6 +27,66 @@ def compiled_params(statement):
    return statement.compile().params


+class TestRagRerankAction:
+    """Tests for RAG rerank action handler."""
+
+    @pytest.fixture
+    def app(self):
+        mock_app = Mock()
+        mock_app.model_mgr = Mock()
+        mock_app.logger = Mock()
+        return mock_app
+
+    @pytest.mark.asyncio
+    async def test_invokes_rerank_model_and_sorts_scores(self, app):
+        """Rerank action uses the selected model and returns top scores."""
+        provider = Mock()
+        provider.invoke_rerank = AsyncMock(
+            return_value=[
+                {'index': 0, 'relevance_score': 0.2},
+                {'index': 1, 'relevance_score': 0.9},
+            ]
+        )
+        rerank_model = SimpleNamespace(provider=provider)
+        app.model_mgr.get_rerank_model_by_uuid = AsyncMock(return_value=rerank_model)
+        runtime_handler = make_handler(app)
+
+        response = await runtime_handler.actions[PluginToRuntimeAction.INVOKE_RERANK.value]({
+            'rerank_model_uuid': 'rerank-1',
+            'query': 'hello',
+            'documents': ['a', 'b'],
+            'top_k': 1,
+            'extra_args': {'return_documents': False},
+        })
+
+        assert response.code == 0
+        assert response.data['results'] == [{'index': 1, 'relevance_score': 0.9}]
+        app.model_mgr.get_rerank_model_by_uuid.assert_awaited_once_with('rerank-1')
+        provider.invoke_rerank.assert_awaited_once_with(
+            model=rerank_model,
+            query='hello',
+            documents=['a', 'b'],
+            extra_args={'return_documents': False},
+        )
+
+    @pytest.mark.asyncio
+    async def test_returns_error_when_rerank_model_missing(self, app):
+        """Missing rerank model returns an action error."""
+        app.model_mgr.get_rerank_model_by_uuid = AsyncMock(
+            side_effect=ValueError('not found')
+        )
+        runtime_handler = make_handler(app)
+
+        response = await runtime_handler.actions[PluginToRuntimeAction.INVOKE_RERANK.value]({
+            'rerank_model_uuid': 'missing',
+            'query': 'hello',
+            'documents': ['a'],
+        })
+
+        assert response.code != 0
+        assert 'Rerank model with rerank_model_uuid missing not found' in response.message
+
+
 class TestInitializePluginSettings:
    """Tests for initialize_plugin_settings action handler."""

@@ -352,6 +352,117 @@ class TestInvokeLLMStreamUsage:
        assert tool_chunks[1].tool_calls[0].function.arguments == '{"text":'
        assert tool_chunks[2].tool_calls[0].function.arguments == '"plugin-tool-ok"}'

+    @pytest.mark.asyncio
+    async def test_stream_tool_call_without_id_is_not_dropped(self):
+        """Regression for #2261.
+
+        Ollama's OpenAI-compatible streaming endpoint emits a tool-call delta
+        carrying an ``index`` and a ``function`` payload but never an
+        OpenAI-style ``id``. The requester used to drop any id-less tool call,
+        so a tool-only turn yielded nothing, the stream "completed" with 0
+        chars, and the chat got stuck. A stable per-index id must be
+        synthesized so the tool call survives.
+        """
+        import langbot_plugin.api.entities.builtin.pipeline.query as pipeline_query
+        import langbot_plugin.api.entities.builtin.provider.message as provider_message
+
+        mock_ap = Mock()
+        mock_ap.tool_mgr = Mock()
+        mock_ap.tool_mgr.generate_tools_for_openai = AsyncMock(
+            return_value=[{'type': 'function', 'function': {'name': 'zotero_search_items'}}]
+        )
+        requester = litellmchat.LiteLLMRequester(ap=mock_ap, config={'custom_llm_provider': 'openai'})
+        model = MockRuntimeModel('gpt-oss:20b', 'ollama')
+
+        # Ollama delivers the whole tool call in a single delta, with no id.
+        chunks = [
+            self._make_chunk(
+                tool_calls=[
+                    {
+                        'index': 0,
+                        'function': {'name': 'zotero_search_items', 'arguments': '{"query":"hello"}'},
+                    }
+                ]
+            ),
+            self._make_chunk(finish_reason='tool_calls'),
+        ]
+
+        async def _aiter(*args, **kwargs):
+            for c in chunks:
+                yield c
+
+        query = Mock(spec=pipeline_query.Query)
+        query.variables = {}
+        messages = [provider_message.Message(role='user', content='hello?')]
+        funcs = [Mock()]
+
+        with patch.object(litellmchat, 'acompletion', new=AsyncMock(side_effect=lambda **kw: _aiter())):
+            collected = [
+                chunk
+                async for chunk in requester.invoke_llm_stream(
+                    query=query,
+                    model=model,
+                    messages=messages,
+                    funcs=funcs,
+                )
+            ]
+
+        tool_chunks = [chunk for chunk in collected if chunk.tool_calls]
+        assert len(tool_chunks) == 1, 'id-less Ollama tool call must not be dropped'
+        tc = tool_chunks[0].tool_calls[0]
+        assert tc.id == 'call_0'
+        assert tc.function.name == 'zotero_search_items'
+        assert tc.function.arguments == '{"query":"hello"}'
+
+    @pytest.mark.asyncio
+    async def test_stream_multiple_tool_calls_without_id_get_distinct_ids(self):
+        """Two parallel id-less tool calls must keep distinct synthesized ids."""
+        import langbot_plugin.api.entities.builtin.pipeline.query as pipeline_query
+        import langbot_plugin.api.entities.builtin.provider.message as provider_message
+
+        mock_ap = Mock()
+        mock_ap.tool_mgr = Mock()
+        mock_ap.tool_mgr.generate_tools_for_openai = AsyncMock(
+            return_value=[{'type': 'function', 'function': {'name': 'zotero_search_items'}}]
+        )
+        requester = litellmchat.LiteLLMRequester(ap=mock_ap, config={'custom_llm_provider': 'openai'})
+        model = MockRuntimeModel('gpt-oss:20b', 'ollama')
+
+        chunks = [
+            self._make_chunk(
+                tool_calls=[
+                    {'index': 0, 'function': {'name': 'zotero_search_items', 'arguments': '{"q":"a"}'}},
+                    {'index': 1, 'function': {'name': 'zotero_get_notes', 'arguments': '{"q":"b"}'}},
+                ]
+            ),
+            self._make_chunk(finish_reason='tool_calls'),
+        ]
+
+        async def _aiter(*args, **kwargs):
+            for c in chunks:
+                yield c
+
+        query = Mock(spec=pipeline_query.Query)
+        query.variables = {}
+        messages = [provider_message.Message(role='user', content='hello?')]
+        funcs = [Mock()]
+
+        with patch.object(litellmchat, 'acompletion', new=AsyncMock(side_effect=lambda **kw: _aiter())):
+            collected = [
+                chunk
+                async for chunk in requester.invoke_llm_stream(
+                    query=query,
+                    model=model,
+                    messages=messages,
+                    funcs=funcs,
+                )
+            ]
+
+        tool_chunks = [chunk for chunk in collected if chunk.tool_calls]
+        assert len(tool_chunks) == 1
+        ids = {tc.id for tc in tool_chunks[0].tool_calls}
+        assert ids == {'call_0', 'call_1'}
+

 class TestProcessThinkingContent:
    """Test _process_thinking_content method"""
@@ -0,0 +1,172 @@
+"""Unit tests for provider_specific_fields round-trip in LiteLLMRequester.
+
+This tests the fix for GitHub issue #1899: Gemini requires thought_signature
+to be preserved across tool call rounds for function calls to work correctly.
+"""
+
+import langbot_plugin.api.entities.builtin.provider.message as provider_message
+
+from langbot.pkg.provider.modelmgr.requesters.litellmchat import LiteLLMRequester
+
+
+def _make_requester() -> LiteLLMRequester:
+    # _convert_messages and _normalize_stream_tool_calls do not touch instance config.
+    return LiteLLMRequester.__new__(LiteLLMRequester)
+
+
+def test_convert_messages_preserves_tool_call_provider_specific_fields():
+    """Tool calls should retain provider_specific_fields through _convert_messages."""
+    req = _make_requester()
+    msg = provider_message.Message(
+        role='assistant',
+        content=None,
+        tool_calls=[
+            provider_message.ToolCall(
+                id='call_123',
+                type='function',
+                function=provider_message.FunctionCall(
+                    name='search',
+                    arguments='{"query": "test"}',
+                ),
+                provider_specific_fields={
+                    'thought_signature': 'c2tpcF90aG91Z2h0X3NpZ25hdHVyZQ==',
+                },
+            ),
+        ],
+    )
+    out = req._convert_messages([msg])
+    assert len(out) == 1
+    assert out[0]['tool_calls'] is not None
+    assert len(out[0]['tool_calls']) == 1
+
+    tc = out[0]['tool_calls'][0]
+    assert tc['id'] == 'call_123'
+    assert tc['function']['name'] == 'search'
+    assert 'provider_specific_fields' in tc
+    assert tc['provider_specific_fields']['thought_signature'] == 'c2tpcF90aG91Z2h0X3NpZ25hdHVyZQ=='
+
+
+def test_convert_messages_preserves_message_provider_specific_fields():
+    """Messages should retain provider_specific_fields through _convert_messages."""
+    req = _make_requester()
+    msg = provider_message.Message(
+        role='assistant',
+        content='Hello',
+        provider_specific_fields={
+            'thought_signatures': ['sig1', 'sig2'],
+        },
+    )
+    out = req._convert_messages([msg])
+    assert len(out) == 1
+    assert 'provider_specific_fields' in out[0]
+    assert out[0]['provider_specific_fields']['thought_signatures'] == ['sig1', 'sig2']
+
+
+def test_normalize_stream_tool_calls_preserves_provider_specific_fields():
+    """Streaming tool calls should retain provider_specific_fields."""
+    req = _make_requester()
+    tool_call_state: dict[int, dict] = {}
+
+    # Simulate first chunk with id and type
+    raw_tool_calls_1 = [
+        {
+            'index': 0,
+            'id': 'call_abc',
+            'type': 'function',
+            'function': {
+                'name': 'get_weather',
+                'arguments': '',
+            },
+            'provider_specific_fields': {
+                'thought_signature': 'dGVzdF9zaWduYXR1cmU=',
+            },
+        },
+    ]
+    result_1 = req._normalize_stream_tool_calls(raw_tool_calls_1, tool_call_state)
+    assert result_1 is not None
+    assert len(result_1) == 1
+    assert result_1[0]['provider_specific_fields']['thought_signature'] == 'dGVzdF9zaWduYXR1cmU='
+
+    # Simulate second chunk without provider_specific_fields (should be retained from state)
+    raw_tool_calls_2 = [
+        {
+            'index': 0,
+            'function': {
+                'arguments': '{"city": "Tokyo"}',
+            },
+        },
+    ]
+    result_2 = req._normalize_stream_tool_calls(raw_tool_calls_2, tool_call_state)
+    assert result_2 is not None
+    assert len(result_2) == 1
+    # Should retain the provider_specific_fields from the first chunk
+    assert result_2[0]['provider_specific_fields']['thought_signature'] == 'dGVzdF9zaWduYXR1cmU='
+    assert result_2[0]['function']['arguments'] == '{"city": "Tokyo"}'
+
+
+def test_normalize_stream_tool_calls_merges_function_level_psf():
+    """Function-level provider_specific_fields should be merged into tool-level."""
+    req = _make_requester()
+    tool_call_state: dict[int, dict] = {}
+
+    raw_tool_calls = [
+        {
+            'index': 0,
+            'id': 'call_xyz',
+            'type': 'function',
+            'function': {
+                'name': 'search',
+                'arguments': '{}',
+                'provider_specific_fields': {
+                    'thought_signature': 'ZnVuY19sZXZlbF9zaWc=',
+                },
+            },
+        },
+    ]
+    result = req._normalize_stream_tool_calls(raw_tool_calls, tool_call_state)
+    assert result is not None
+    assert result[0]['provider_specific_fields']['thought_signature'] == 'ZnVuY19sZXZlbF9zaWc='
+
+
+def test_tool_call_roundtrip_through_message_entity():
+    """Full round-trip: LiteLLM response dict -> Message entity -> _convert_messages."""
+    # Simulate what LiteLLM returns for a Gemini tool call response
+    message_data = {
+        'role': 'assistant',
+        'content': None,
+        'tool_calls': [
+            {
+                'id': 'call_gemini_123',
+                'type': 'function',
+                'function': {
+                    'name': 'web_search',
+                    'arguments': '{"query": "test"}',
+                },
+                'provider_specific_fields': {
+                    'thought_signature': 'Z2VtaW5pX3NpZ25hdHVyZQ==',
+                },
+            },
+        ],
+        'provider_specific_fields': {
+            'thought_signatures': ['Z2VtaW5pX3NpZ25hdHVyZQ=='],
+        },
+    }
+
+    # Parse into Message entity (this is what invoke_llm does)
+    msg = provider_message.Message(**message_data)
+
+    # Verify the entity has the fields
+    assert msg.tool_calls is not None
+    assert len(msg.tool_calls) == 1
+    assert msg.tool_calls[0].provider_specific_fields is not None
+    assert msg.tool_calls[0].provider_specific_fields['thought_signature'] == 'Z2VtaW5pX3NpZ25hdHVyZQ=='
+    assert msg.provider_specific_fields is not None
+    assert msg.provider_specific_fields['thought_signatures'] == ['Z2VtaW5pX3NpZ25hdHVyZQ==']
+
+    # Convert back to dict for LiteLLM (this is what _convert_messages does)
+    req = _make_requester()
+    out = req._convert_messages([msg])
+
+    # Verify the fields are preserved in the output
+    assert out[0]['tool_calls'][0]['provider_specific_fields']['thought_signature'] == 'Z2VtaW5pX3NpZ25hdHVyZQ=='
+    assert out[0]['provider_specific_fields']['thought_signatures'] == ['Z2VtaW5pX3NpZ25hdHVyZQ==']
@@ -2082,7 +2082,7 @@ requires-dist = [
    { name = "ebooklib", specifier = ">=0.18" },
    { name = "gewechat-client", specifier = ">=0.1.5" },
    { name = "html2text", specifier = ">=2024.2.26" },
-    { name = "langbot-plugin", specifier = "==0.4.4" },
+    { name = "langbot-plugin", specifier = "==0.4.5" },
    { name = "langchain", specifier = ">=0.2.0" },
    { name = "langchain-core", specifier = ">=1.3.3" },
    { name = "langchain-text-splitters", specifier = ">=1.1.2" },
@@ -2146,7 +2146,7 @@ dev = [

 [[package]]
 name = "langbot-plugin"
-version = "0.4.4"
+version = "0.4.5"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "aiofiles" },
@@ -2167,9 +2167,9 @@ dependencies = [
    { name = "watchdog" },
    { name = "websockets" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/68/1a/636c057f6e07a0c87dc7b9c1a373d73df82787b7706ba3ba1a95f633ce7c/langbot_plugin-0.4.4.tar.gz", hash = "sha256:8fdad2d22fe8360d2911557fac17f258f57e85f1a36bd50cd488cb44f61225a4", size = 312741, upload-time = "2026-06-13T11:59:36.772Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/f3/db/db33ec42b3242ea7de0c93b0523a8d32a3df76b13de177fd31671db0ba3f/langbot_plugin-0.4.5.tar.gz", hash = "sha256:3cafa5694f31e9e4b4a3d870c1bc23ee7ac6e8d271a0140c5198993471f220ec", size = 326504, upload-time = "2026-06-19T14:53:51.687Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/f9/c6/3c313e4ec431fca68326f348bd2c7a61777d43c940bb46ae6c8ebfb66973/langbot_plugin-0.4.4-py3-none-any.whl", hash = "sha256:c91f082ca431539f34790e497e2f056f4e7030e46e0d2bf01a6114b055dd2feb", size = 214164, upload-time = "2026-06-13T11:59:38.053Z" },
+    { url = "https://files.pythonhosted.org/packages/81/92/8a08f8793de479fffa12a1906a25b6ff5b67a018520fa72d981569e1a6e4/langbot_plugin-0.4.5-py3-none-any.whl", hash = "sha256:12ab9aff0fb2459f75a11ba6999d2b5dfc753dcc7d265b078777b24e04b23c83", size = 215602, upload-time = "2026-06-19T14:53:50.021Z" },
 ]

 [[package]]
@@ -74,6 +74,15 @@
  }
 }

+/* Hide scrollbar while keeping scroll behaviour (horizontal tag/filter rows). */
+.scrollbar-hide {
+  -ms-overflow-style: none; /* IE / Edge */
+  scrollbar-width: none; /* Firefox */
+}
+.scrollbar-hide::-webkit-scrollbar {
+  display: none; /* Chrome / Safari / WebKit */
+}
+
@custom-variant dark (&:is(.dark *));

@theme inline {
@@ -787,13 +787,14 @@ function MarketPageContent({
          </div>
        </div>

-        {/* Quick filtering with real tags */}
-        <div className="mx-auto flex w-full max-w-4xl items-center gap-2 overflow-x-auto pb-1 sm:flex-wrap sm:justify-center sm:overflow-visible">
+        {/* Quick filtering with real tags: always a single horizontally scrolling row, so tags do not wrap and misalign as their number grows */}
+        <div className="relative mx-auto w-full max-w-4xl">
+          <div className="scrollbar-hide flex items-center gap-1.5 overflow-x-auto pb-1 pr-6">
            <Button
              type="button"
              variant={selectedTags.length === 0 ? 'secondary' : 'ghost'}
              size="sm"
-            className="h-8 shrink-0"
+              className="h-7 shrink-0 px-2.5 text-xs"
              onClick={() => handleTagsChange([])}
            >
              {t('market.allExtensions')}
@@ -806,7 +807,7 @@ function MarketPageContent({
                  type="button"
                  variant={selected ? 'secondary' : 'ghost'}
                  size="sm"
-                className="h-8 shrink-0"
+                  className="h-7 shrink-0 px-2.5 text-xs"
                  onClick={() => {
                    const newTags = selected
                      ? selectedTags.filter((t) => t !== tag.tag)
@@ -815,11 +816,14 @@ function MarketPageContent({
                  }}
                >
                  {tagNames[tag.tag] || tag.tag}
-                {selected && <X className="h-3.5 w-3.5" />}
+                  {selected && <X className="h-3 w-3" />}
                </Button>
              );
            })}
          </div>
+          {/* Fade on the right edge, hinting that more tags can be reached by scrolling horizontally */}
+          <div className="pointer-events-none absolute right-0 top-0 bottom-1 w-8 bg-gradient-to-l from-background to-transparent" />
+        </div>
      </div>

      {/* Scrollable extension list section */}
Author	SHA1	Message	Date
huanghuoguoguo	acfac42107	fix(litellmchat): preserve provider_specific_fields for Gemini thought_signature (#2265 ) Update _normalize_stream_tool_calls to preserve provider_specific_fields (including thought_signature) from streaming tool call chunks. Also preserve provider_specific_fields from delta in invoke_llm_stream. This ensures Gemini's thought_signature is round-tripped correctly: 1. LiteLLM extracts thought_signature from Gemini response 2. It's preserved in Message/ToolCall entities (via SDK changes) 3. _convert_messages includes it in the next request Also add unit tests for provider_specific_fields round-tripping. Fixes: langbot-app/LangBot#1899	2026-06-19 23:26:12 +08:00
huanghuoguoguo	492827ea75	Add plugin rerank invocation action (#2242)	2026-06-19 23:25:54 +08:00
huanghuoguoguo	4538fca901	chore(deps): bump langbot-plugin to 0.4.5 (#2266) Bumps the pinned langbot-plugin SDK from 0.4.4 to 0.4.5, which adds `provider_specific_fields` to the Message/ToolCall entities. This is the SDK dependency required by the Gemini thought_signature fix (#1899, #2265). The lock update is scoped to langbot-plugin only. pylibseekdb is deliberately held at 1.1.0: a free re-resolve drifts it to 1.3.0 (pyseekdb==1.1.0.post3 has no upper bound on it), which is out of scope here and should be handled in a separate dependency-upgrade PR.	2026-06-19 23:13:56 +08:00
Junyan Chin	b02c9517f6	feat(modelmgr): split Moonshot/Kimi into Global and China presets (#2264) Adding a Kimi/Moonshot provider failed model scanning out of the box for CN-region API keys: the single preset defaulted its base URL to the global endpoint `https://api.moonshot.ai/v1`, but CN-issued keys are only valid against `https://api.moonshot.cn/v1`, so scanning returned `401 Invalid Authentication`. Flipping the default would just move the breakage to international keys, since the base_url field is plain free-text and either region is equally common. Instead, offer two clearly labelled presets, mirroring how the Lark adapter exposes feishu.cn vs larksuite.com: `moonshot-chat-completions` -> "Moonshot / Kimi (Global · api.moonshot.ai)" and `moonshot-cn-chat-completions` -> "Moonshot / Kimi (China · api.moonshot.cn)". The existing component name is kept unchanged so provider rows already in the DB keep resolving; only its display label is clarified. Both presets keep base_url as a free-text field, so users behind a proxy / one-api gateway can still enter a custom endpoint. Both carry the same `kimi` search aliases so either shows up when searching. Fixes #2232	2026-06-19 18:39:58 +08:00
RockChinQ	511b5a7bf4	style(web): shrink market tag filter row (height + font) Make the quick-filter tag pills more compact: h-8 -> h-7, default text -> text-xs with px-2.5, gap-2 -> gap-1.5, and the selected-X icon h-3.5 -> h-3. Keeps the single-row horizontal-scroll layout.	2026-06-19 06:20:17 -04:00
RockChinQ	65fbf4db59	style(web): keep market tag filter on a single horizontal-scroll row With many category tags the quick-filter row used `sm:flex-wrap` on desktop, so once tags overflowed the available width they wrapped onto a second, center-aligned line — leaving an orphan tag floating under the row (looked broken and only gets worse as more tags are added). Make the row a single, never-wrapping line that scrolls horizontally at every breakpoint, left-aligned, with the scrollbar hidden and a subtle right-edge fade to signal there's more to scroll. Adds a reusable `.scrollbar-hide` utility to global.css.	2026-06-19 06:15:31 -04:00
Junyan Chin	3d5b70cc5d	fix(modelmgr): keep id-less streamed tool calls (Ollama) (#2262) Ollama's OpenAI-compatible streaming endpoint emits a tool-call delta carrying an `index` and a `function` payload but never an OpenAI-style `id`. `_normalize_stream_tool_calls` dropped any tool call without an `id`, so a tool-only turn yielded neither content nor a tool call: the stream "completed" with 0 chars, the tool never ran, and the chat appeared stuck. Models on standard OpenAI APIs (e.g. SiliconFlow) were unaffected because they always send a `call_...` id. Synthesize a stable per-index id (`call_<index>`) when the provider omits one but a function name is present. Providers that do send ids keep theirs, and parallel id-less calls keep distinct ids. Adds regression tests for the single and multi id-less tool-call cases. Fixes #2261	2026-06-19 18:07:25 +08:00
RockChinQ	83623f6afe	fix(box): always advertise outbox path in exec guidance Outbound attachment collection (pipeline wrapper) runs on every turn regardless of inbound files, but the agent was only told the per-query outbox path inside the inbound-attachment note in LocalAgentRunner. So on pure-generation turns (e.g. "generate a QR code"/chart/mermaid where the user sent no file), the agent never learned the outbox path or the query_id, wrote the generated file nowhere deliverable, and it was silently dropped. Move the outbox instruction into BoxService.get_system_guidance(query_id), which is injected as a system message on every turn the exec tool is available. The inbound note keeps its own (now redundant but harmless) outbox line. Add unit tests asserting the outbox path is present with a query_id and absent without one.	2026-06-19 04:09:45 -04:00
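
The tool-call normalization described in commits 3d5b70cc5d and acfac42107 can be illustrated with a minimal sketch. This is not the project's actual `_normalize_stream_tool_calls`: the function name `normalize_stream_tool_calls`, its signature, and the plain-dict chunk shape (OpenAI-style `index` / `id` / `function` keys) are assumptions made for illustration. It only shows the two behaviours the commit messages state: synthesizing `call_<index>` when a provider omits the id but sends a function name, and carrying `provider_specific_fields` (such as `thought_signature`) through unchanged.

```python
from typing import Any


def normalize_stream_tool_calls(raw_calls: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Hypothetical stand-in for the normalization step; not the real implementation."""
    normalized: list[dict[str, Any]] = []
    for position, call in enumerate(raw_calls):
        function = call.get("function") or {}
        name = function.get("name")
        call_id = call.get("id")

        # Fall back to list position if the chunk carries no usable index (assumption).
        index = call.get("index")
        if index is None:
            index = position

        if not call_id:
            if not name:
                # Neither an id nor a function name: nothing to dispatch, so skip it.
                continue
            # Provider (e.g. Ollama) omitted the id: synthesize a stable per-index one.
            call_id = f"call_{index}"

        entry: dict[str, Any] = {
            "id": call_id,
            "type": "function",
            "function": {"name": name, "arguments": function.get("arguments", "")},
        }

        # Keep provider-specific extras (e.g. Gemini's thought_signature) so they
        # can be sent back on the next request.
        extras = call.get("provider_specific_fields")
        if extras:
            entry["provider_specific_fields"] = dict(extras)

        normalized.append(entry)
    return normalized


chunks = [
    {"index": 0, "function": {"name": "get_weather", "arguments": "{}"}},
    {"index": 1, "function": {"name": "get_time", "arguments": "{}"}},
    {
        "index": 2,
        "id": "call_abc",
        "function": {"name": "search", "arguments": "{}"},
        "provider_specific_fields": {"thought_signature": "sig"},
    },
]
for tool_call in normalize_stream_tool_calls(chunks):
    print(tool_call["id"], tool_call.get("provider_specific_fields"))
```

In this sketch the two id-less calls receive distinct ids derived from their indices, while the call that already has an id keeps it along with its provider-specific fields.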