Feat/test build (#2174)

* fix(ci): update unit-test workflow paths to match current source layout

Replace stale pkg/** filter with src/langbot/** and add uv.lock.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(tests): update README to reflect current test layout

- Fix stale paths: tests/pipeline → tests/unit_tests/pipeline
- Update CI Python versions: 3.11, 3.12, 3.13
- Add test directory structure for box, config, platform, plugin, provider, storage
- Document pytest markers and uv commands
- Mention planned E2E tests

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(test): add shared test factories package

Create tests/factories/ with reusable test factories:
- FakeApp: mock application with all dependencies
- Message chains: text_chain, mention_chain, image_chain
- Query factories: text_query, group_text_query, command_query, etc.

No test changes - maintains backward compatibility.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(test): add fake provider factory

Add tests/factories/provider.py with:
- FakeProvider: deterministic fake LLM provider
- Error simulation: timeout, auth, rate-limit, malformed
- Request capture for assertions
- fake_model: mock model with attached provider

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(test): add fake platform factory

Add tests/factories/platform.py with:
- FakePlatform: simulated platform adapter
- Inbound message construction: friend/group/image
- Mention-bot flag simulation
- Outbound message capture for assertions
- Streaming output support simulation
- Send failure simulation

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(test): add comprehensive message/query factories

Extend tests/factories/message.py with:
- file_query: file attachment query
- unsupported_query: unknown message segment
- voice_query: audio/voice query
- at_all_query: group @All mention
- query_with_session: query with session object
- query_with_config: query with custom pipeline config

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(test): add fake message flow smoke test

Create tests/smoke/test_fake_message_flow.py:
- TestFakeMessageFlow: factory verification tests
- TestMessageFlowIntegration: minimal flow smoke test
- Tests FakeApp, FakeProvider, FakePlatform, query factories
- Verifies LANGBOT_FAKE_PONG marker response
- Captures outbound messages for assertions

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(test): add developer test-quick command

Add scripts/test-quick.sh and Makefile with:
- test-quick: runs ruff check + unit tests + smoke tests
- No real provider keys or platform accounts required
- Suitable for local branch self-test

Update tests/README.md:
- Document test-quick command
- Document test factories package
- Add smoke tests and factories directory structure

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(test): make test-quick reliable as developer gate

Fixes for D-001 acceptance issues:
1. test-quick.sh: use set -euo pipefail, uv run ruff, no tail pipe
2. Remove unused imports in factories (app.py, platform.py, provider.py)
3. Fix unused variable in smoke test
4. Add noqa: E402 to test_n8nsvapi.py lazy imports
5. Update smoke test docs: "minimal fake flow" not full pipeline

Now test-quick is a reliable gate: lint failures exit 1, test failures propagate.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(unit): add preproc and taskmgr unit tests

U-001: Pipeline Preprocessor tests
- Normal text message processing
- Empty message handling
- Image segment with/without vision model
- Model selection and fallback
- Variable extraction

U-004: Core Task Manager tests (pattern-based)
- Task creation and tracking patterns
- Task cancellation patterns
- Scope-based cancellation
- Task type filtering
- Pruning completed tasks
- Wait all tasks

Taskmgr tests use pattern-based approach to avoid circular import in source code (taskmgr → app → http_controller → migration → taskmgr).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(unit): add config loader unit tests

U-005: Config Loader tests
- Valid YAML config loading
- Valid JSON config loading
- Invalid YAML/JSON error behavior
- Missing config file creation from template
- Template completion for missing keys
- ConfigManager load/dump operations
- Exists check for both YAML and JSON

All tests use tmp_path fixture, no real project config.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(unit): add chat and command handler pattern tests

U-002: Chat Handler tests (pattern-based)
- Normal message event emission pattern
- prevent_default handling
- User message alteration pattern
- Runner selection pattern
- Streaming/non-streaming response patterns
- Exception handling modes (show-error, show-hint, hide)
- Message history update pattern
- Telemetry payload pattern

U-003: Command Handler tests (pattern-based)
- Command parsing and text extraction
- Event creation pattern
- Privilege/admin check pattern
- Command result handling (text, error, image)
- prevent_default handling
- String truncation helper

Uses pattern-based testing to avoid circular import issues in source code. Direct imports of handler modules trigger circular import chain.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style: fix unused imports after ruff auto-fix

Remove unused imports in test files:
- test_config_loader.py: remove unused os
- test_taskmgr.py: remove unused Mock
- test_preproc.py: remove unused unsupported_query, image_chain

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(unit): improve taskmgr tests to test real classes

U-004 improved: Tests now import and test actual classes:
- TaskContext: new(), trace(), to_dict(), placeholder()
- TaskWrapper: task creation, context, exception/result capture, cancel, to_dict
- AsyncTaskManager: create_task, create_user_task, cancel_task, cancel_by_scope
- Task pruning behavior

Uses pre-mocking technique:
- Mock langbot.pkg.core.app before import (breaks circular chain)
- Mock langbot.pkg.core.entities with proper Enum

All 24 tests now test real class behavior, not patterns. taskmgr.py coverage should improve significantly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(test): consolidate FakeApp and add sys.modules isolation utility

- Extract tests/utils/import_isolation.py with isolated_sys_modules context manager
- Extend tests/factories/app.py FakeApp with handler-specific attributes
- Refactor test_chat_handler.py to use centralized FakeApp and cached imports
- Refactor test_command_handler.py with mock_execute_factory fixture
- Refactor test_smoke.py to move import-time sys.modules manipulation into fixture
- Add SQLite migration integration tests (G-002)
- Add HTTP API smoke integration tests (G-005)
- Update CI workflow to call pytest for SQLite migrations (G-004)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(test): add developer quality gate consolidation (G-007)

- Add scripts/test-integration-fast.sh for fast integration tests
- Add scripts/test-coverage.sh with 12% baseline threshold
- Update Makefile with test-integration-fast, test-coverage, test-all-local
- Update CI workflow with integration and coverage jobs
- Add smoke marker to pytest.ini
- Update tests/README.md with quality gate layers documentation
- Add tests/integration/pipeline/ for pipeline stage-chain tests

Quality gate layers:
- Quick: ruff + unit + smoke (~2 min)
- Fast Integration: SQLite/API/Pipeline (~3 min)
- Coverage: 12% threshold gate (~8 min)
- Full Local: all three combined

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(test): add PostgreSQL migration slow integration tests (G-003)

- Add tests/integration/persistence/test_migrations_postgres.py
- All tests marked with @pytest.mark.slow
- Tests skip when TEST_POSTGRES_URL is not set (no local PostgreSQL)
- Database isolation via clean_tables and clean_alembic_version fixtures
- Update CI workflow to use pytest instead of inline Python script
- Remove TODO(G-003) comment
- Update tests/README.md with PostgreSQL test documentation

Covered scenarios:
- Baseline stamp sets revision
- Upgrade from baseline to head
- Upgrade idempotent
- Get current on unstamped DB returns None

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(test): Phase 1.5 coverage expansion - COV-001 to COV-013

Coverage baseline raised from 13.65% to 26% (+12.35%)
Gate raised from 12% to 18%

Tasks completed:
- COV-001: Command system unit tests (100% coverage)
- COV-002: API service unit tests batch 1 (user/apikey/model/provider)
- COV-003: Provider model manager unit tests
- COV-004: Pipeline remaining stage tests (aggregator/cntfilter/longtext/msgtrun)
- COV-005: Storage and utils coverage pass
- COV-006: Gate ratchet 12%→15%
- COV-007: Gate ratchet 15%→18%
- COV-008: API service batch 2 (bot/pipeline/webhook/space/maintenance/mcp)
- COV-009: Blocked - API controller circular import issue documented
- COV-010: Plugin runtime unit tests (+0.08%)
- COV-011: RAG and vector unit tests (+0.68%)
- COV-012: Core boot and migration unit tests
- COV-013: Provider requester logic unit tests (+0.62%)

Key additions:
- tests/utils/import_isolation.py: sys.modules isolation for circular imports
- Provider requester mock tests: proved HTTP-dependent code can be tested locally
- Vector filter utilities: 100% coverage on pure functions
- API services: fake persistence pattern for unit testing

Blocked issue COV-009 documented in langbot-test-plan/1.5/issues/

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(phase1): add unit tests for telemetry, plugin, rag, persistence

Add initial unit tests for Phase 1 of test coverage improvement:
- telemetry: test initialization, payload sanitization, early returns (14.3% → 62.9%)
- plugin: test _parse_plugin_id static method
- rag: test _to_i18n_name static method
- persistence: test serialize_model with datetime handling

Overall core coverage: 41.9% → 42.2%

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(phase2): add unit tests for core, persistence, plugin, utils

- Add test_handler_helpers.py for plugin handler helpers (7 tests)
- Add test_mgr_methods.py for persistence manager (5 tests)
- Add test_app_config_validation.py for core app config (12 tests)
- Add test_knowledge_service.py for API knowledge service (22 tests)
- Add test_kbmgr.py for RAG knowledge base manager (39 tests)
- Add test_survey_manager.py for survey manager (22 tests)
- Add test_connector_methods.py for plugin connector (24 tests)
- Add test_funcschema.py for utils function schema (9 tests)
- Add test_platform.py for utils platform detection (7 tests)
- Add test_extract_deps.py for plugin deps extraction (7 tests)
- Add test_database_decorator.py for persistence decorator (7 tests)
- Add test_load_config.py for core config loading (19 tests)
- Add COVERAGE_EXCLUSIONS.md documenting external adapter exclusions
- Fix test_chat_session_limit.py path for portability

Coverage: core 28% → 30%, persistence 24% → 24.4%, plugin 27% → 28%
Total: 1082 tests passed, core module coverage 45.5%

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(integration): add API controller integration tests

- Add test_pipelines.py (10 tests) covering pipelines CRUD operations
  - GET/POST/PUT/DELETE on /api/v1/pipelines
  - Extensions endpoint
  - Metadata endpoint
  - Coverage: pipelines controller 27% → 80%
- Add test_providers.py (10 tests) covering provider/model management
  - Provider CRUD with model counts
  - LLM model CRUD
  - Coverage: providers controller 23% → 81%, models 29% → 45%

Tests use Quart TestClient with mocked services for real HTTP behavior without external dependencies.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(integration): add knowledge, bots, and model endpoints tests

- Add test_knowledge.py (10 tests) covering knowledge base management
  - CRUD operations on /api/v1/knowledge/bases
  - Files management endpoints
  - Retrieve endpoint with validation
  - Coverage: knowledge/base.py 26% → 91%
- Add test_bots.py (9 tests) covering bot management
  - CRUD operations on /api/v1/platform/bots
  - Logs endpoint
  - Send message endpoint with validation
  - Coverage: platform/bots.py 24% → 87%
- Extend test_providers.py (+4 tests) for embedding/rerank models
  - Embedding models CRUD
  - Rerank models CRUD
  - Coverage: provider/models.py 29% → 60%

Total integration tests: 53 (smoke 12 + pipelines 10 + providers 14 + knowledge 10 + bots 9)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(integration): add embed and monitoring endpoint tests

Add integration tests for embed widget and monitoring API endpoints:
- test_embed.py: 15 tests for widget.js, logo, turnstile, messages, reset, feedback
- test_monitoring.py: 15 tests for overview, messages, llm-calls, sessions, errors, export

Coverage improvements:
- embed.py: 17% → 56%
- monitoring.py: 17% → 93%

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(e2e): add minimal startup E2E tests

Add E2E tests for LangBot startup flow:
- tests/e2e/utils/config_factory.py: minimal config generation
- tests/e2e/utils/process_manager.py: LangBot subprocess management
- tests/e2e/conftest.py: E2E fixtures (session-scoped process)
- tests/e2e/test_startup.py: 12 tests for startup verification

Tests verify:
- boot.py + stages execution
- database initialization (SQLite)
- API availability
- migrations applied

Uses embedded databases (SQLite, Chroma) - no external dependencies.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(quality): fix fake tests and add missing coverage

P0 fixes:
- telemetry: rewrite fake tests with real behavior verification (25 tests)
- config: delete copied-source tests, use proper imports (2 deleted)
- persistence: fix try-except pass to verify specific errors

P1 fixes:
- pipeline: add real FixedWindowAlgo tests instead of mocks (12 tests)
- provider: add SessionManager and ToolManager tests (25 tests)
- storage: add S3StorageProvider tests with moto mock (16 tests)
- plugin: add handler action tests for setting inheritance (15 tests)
- rag: add file storage and ZIP processing tests (21 tests)
- vector: add VDB filter conversion tests (30 tests)

P2 fixes:
- pipeline/msgtrun: strengthen assertions for exact message count
- api: add response structure validation in integration tests

New test files:
- provider/test_session_manager.py
- provider/test_tool_manager.py
- storage/test_s3storage.py
- plugin/test_handler_actions.py
- rag/test_file_storage.py
- vector/test_vdb_filter_conversion.py

Source code bugs documented:
- provider: TokenManager.next_token() ZeroDivisionError
- telemetry: send_tasks class variable shared state
- command: empty command IndexError, unused parameters
- utils: funcschema KeyError
- entity: vector.py independent declarative_base

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(test): update coverage stats and test structure

- Update coverage from 22% to 30%
- Add new test files to structure:
  - provider: session_manager, tool_manager
  - storage: s3storage
  - plugin: handler_actions
  - rag: file_storage
  - vector: vdb_filter_conversion
  - telemetry: rewritten tests
- Update module coverage percentages

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test: add 105 new unit tests for untested core functionality

Add comprehensive tests for B-class issues (core functionality untested):

Pipeline:
- test_pool.py: QueryPool ID generation, caching, async context (12 tests)
- test_ratelimit.py: Fixed timing-sensitive test tolerance
- test_pipelinemgr.py: Use real Pydantic StageProcessResult instead of Mock

Utils:
- test_version.py: Version comparison functions (20 tests)
- test_logcache.py: Log page management and retrieval (18 tests)
- test_httpclient.py: HTTP session pool management (10 tests)
- test_proxy.py: Proxy configuration from env and config (10 tests)
- test_image.py: URL parsing and base64 extraction (12 tests)
- test_pkgmgr.py: Pip command generation (8 tests)

Discover:
- test_engine.py: I18nString, Metadata, Component manifest (15 tests)

Test count: 1193 → 1298 (+105 tests)

Note: Some B-class issues cannot be tested due to circular import bugs filed as GitHub issues #2175 (pipeline) and #2176 (persistence).

* test: tighten phase 1 coverage contracts

* test: align ci integration isolation

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-07-26 14:26:06 +00:00 · 2026-05-16 12:05:54 +08:00
parent 4a4c0921a4
commit 17bbc8bf10
130 changed files with 32711 additions and 889 deletions
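
Note on the import-isolation technique: the commit message describes an `isolated_sys_modules` context manager in tests/utils/import_isolation.py, but its source is not part of this excerpt. The sketch below is only a guess at the general idea, assuming the helper installs fake modules into `sys.modules` for the duration of a `with` block and restores the cache afterwards. The names `isolated_sys_modules_sketch` and `hypothetical_heavy_dep` are invented for illustration and are not LangBot APIs.

```python
import sys
from contextlib import contextmanager
from unittest.mock import MagicMock


@contextmanager
def isolated_sys_modules_sketch(mocks):
    """Hypothetical stand-in for isolated_sys_modules (real source not shown)."""
    snapshot = dict(sys.modules)  # shallow copy of the module cache
    sys.modules.update(mocks)     # install the fakes before anything imports them
    try:
        yield
    finally:
        # Forget modules first imported inside the block; they may hold mock references.
        for name in set(sys.modules) - set(snapshot):
            del sys.modules[name]
        # Put back every entry that was replaced or removed.
        sys.modules.update(snapshot)


fake_dep = MagicMock(name='fake_dep')
with isolated_sys_modules_sketch({'hypothetical_heavy_dep': fake_dep}):
    # The import statement finds the fake in sys.modules, so no real import runs.
    import hypothetical_heavy_dep

    seen_inside = hypothetical_heavy_dep is fake_dep

# After the block the fake entry is gone from the module cache.
seen_after = 'hypothetical_heavy_dep' in sys.modules
```

Dropping modules that were first imported inside the block matters for the circular-import case: a module loaded while the fakes were installed keeps references to them, so leaving it cached could leak mocks into later tests.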
@@ -0,0 +1,210 @@
+"""Tests for vector filter utilities."""
+
+from __future__ import annotations
+
+import pytest
+
+from langbot.pkg.vector.filter_utils import (
+    SUPPORTED_OPS,
+    normalize_filter,
+    strip_unsupported_fields,
+)
+
+
+class TestNormalizeFilter:
+    """Tests for normalize_filter function."""
+
+    def test_normalize_filter_empty_dict(self):
+        """Empty dict returns empty list."""
+        result = normalize_filter({})
+        assert result == []
+
+    def test_normalize_filter_none(self):
+        """None returns empty list."""
+        result = normalize_filter(None)
+        assert result == []
+
+    def test_normalize_filter_implicit_eq(self):
+        """Bare value becomes implicit $eq."""
+        result = normalize_filter({'file_id': 'abc123'})
+
+        assert len(result) == 1
+        assert result[0] == ('file_id', '$eq', 'abc123')
+
+    def test_normalize_filter_explicit_eq(self):
+        """Explicit $eq operator."""
+        result = normalize_filter({'file_id': {'$eq': 'abc123'}})
+
+        assert len(result) == 1
+        assert result[0] == ('file_id', '$eq', 'abc123')
+
+    def test_normalize_filter_comparison_operators(self):
+        """Test a comparison operator ($gte); the others are covered by the all-supported-ops test."""
+        result = normalize_filter({'created_at': {'$gte': 1700000000}})
+
+        assert len(result) == 1
+        assert result[0] == ('created_at', '$gte', 1700000000)
+
+    def test_normalize_filter_ne_operator(self):
+        """Test $ne operator."""
+        result = normalize_filter({'status': {'$ne': 'deleted'}})
+
+        assert len(result) == 1
+        assert result[0] == ('status', '$ne', 'deleted')
+
+    def test_normalize_filter_in_operator(self):
+        """Test $in operator with list value."""
+        result = normalize_filter({'file_type': {'$in': ['pdf', 'docx', 'txt']}})
+
+        assert len(result) == 1
+        assert result[0] == ('file_type', '$in', ['pdf', 'docx', 'txt'])
+
+    def test_normalize_filter_nin_operator(self):
+        """Test $nin operator."""
+        result = normalize_filter({'status': {'$nin': ['deleted', 'archived']}})
+
+        assert len(result) == 1
+        assert result[0] == ('status', '$nin', ['deleted', 'archived'])
+
+    def test_normalize_filter_multiple_conditions(self):
+        """Multiple top-level keys are AND-ed (returned as multiple triples)."""
+        result = normalize_filter({
+            'file_id': 'abc',
+            'status': {'$ne': 'deleted'},
+            'created_at': {'$gte': 1700000000}
+        })
+
+        assert len(result) == 3
+        # Check membership only; these assertions do not depend on triple order
+        field_ops = [(field, op) for field, op, _ in result]
+        assert ('file_id', '$eq') in field_ops
+        assert ('status', '$ne') in field_ops
+        assert ('created_at', '$gte') in field_ops
+
+    def test_normalize_filter_unsupported_operator_raises(self):
+        """Unsupported operator raises ValueError."""
+        with pytest.raises(ValueError, match='Unsupported filter operator'):
+            normalize_filter({'field': {'$regex': 'pattern'}})
+
+    def test_normalize_filter_all_supported_ops(self):
+        """Test all supported operators are recognized."""
+        for op in SUPPORTED_OPS:
+            if op in ('$in', '$nin'):
+                filter_dict = {'field': {op: ['value1', 'value2']}}
+            else:
+                filter_dict = {'field': {op: 'value'}}
+
+            result = normalize_filter(filter_dict)
+            assert len(result) == 1
+            assert result[0][1] == op
+
+
+class TestStripUnsupportedFields:
+    """Tests for strip_unsupported_fields function."""
+
+    def test_strip_keeps_supported_fields(self):
+        """Fields in supported_fields are kept."""
+        triples = [
+            ('file_id', '$eq', 'abc'),
+            ('chunk_uuid', '$ne', 'def'),
+        ]
+
+        result = strip_unsupported_fields(triples, {'file_id', 'chunk_uuid'})
+
+        assert len(result) == 2
+        assert result == triples
+
+    def test_strip_removes_unsupported_fields(self):
+        """Fields not in supported_fields are removed."""
+        triples = [
+            ('file_id', '$eq', 'abc'),
+            ('unknown_field', '$ne', 'def'),
+        ]
+
+        result = strip_unsupported_fields(triples, {'file_id'})
+
+        assert len(result) == 1
+        assert result[0] == ('file_id', '$eq', 'abc')
+
+    def test_strip_empty_triples(self):
+        """Empty triples list returns empty list."""
+        result = strip_unsupported_fields([], {'file_id'})
+        assert result == []
+
+    def test_strip_all_unsupported(self):
+        """All fields unsupported returns empty list."""
+        triples = [
+            ('unknown1', '$eq', 'a'),
+            ('unknown2', '$eq', 'b'),
+        ]
+
+        result = strip_unsupported_fields(triples, {'file_id'})
+
+        assert result == []
+
+    def test_strip_with_field_aliases(self):
+        """Field aliases are resolved before checking support."""
+        triples = [
+            ('uuid', '$eq', 'abc'),  # alias for chunk_uuid
+            ('file_id', '$eq', 'def'),
+        ]
+
+        result = strip_unsupported_fields(
+            triples,
+            {'file_id', 'chunk_uuid'},
+            field_aliases={'uuid': 'chunk_uuid'}
+        )
+
+        assert len(result) == 2
+        # 'uuid' should be resolved to 'chunk_uuid'
+        assert result[0] == ('chunk_uuid', '$eq', 'abc')
+        assert result[1] == ('file_id', '$eq', 'def')
+
+    def test_strip_alias_not_in_supported(self):
+        """Alias resolved but still not in supported_fields is dropped."""
+        triples = [
+            ('uuid', '$eq', 'abc'),  # alias for chunk_uuid, but not supported
+        ]
+
+        result = strip_unsupported_fields(
+            triples,
+            {'file_id'},  # chunk_uuid not supported
+            field_aliases={'uuid': 'chunk_uuid'}
+        )
+
+        assert result == []
+
+    def test_strip_preserves_operator_and_value(self):
+        """Strip only affects field name, not operator or value."""
+        triples = [
+            ('file_id', '$in', ['a', 'b', 'c']),
+        ]
+
+        result = strip_unsupported_fields(triples, {'file_id'})
+
+        assert result[0] == ('file_id', '$in', ['a', 'b', 'c'])
+
+    def test_strip_none_aliases(self):
+        """None field_aliases is treated as empty dict."""
+        triples = [
+            ('file_id', '$eq', 'abc'),
+        ]
+
+        result = strip_unsupported_fields(triples, {'file_id'}, field_aliases=None)
+
+        assert len(result) == 1
+        assert result[0] == ('file_id', '$eq', 'abc')
+
+
+class TestSupportedOpsConstant:
+    """Tests for SUPPORTED_OPS constant."""
+
+    def test_supported_ops_contains_expected(self):
+        """SUPPORTED_OPS contains all expected operators."""
+        expected = {'$eq', '$ne', '$gt', '$gte', '$lt', '$lte', '$in', '$nin'}
+        assert SUPPORTED_OPS == expected
+
+    def test_supported_ops_is_frozenset(self):
+        """SUPPORTED_OPS is set-like: the check accepts any collections.abc.Set, including frozenset."""
+        from collections.abc import Set
+        assert isinstance(SUPPORTED_OPS, Set)
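
Reviewer note: the tests in this file pin down the observable contract of `normalize_filter` and `strip_unsupported_fields`, whose source is not included in this excerpt. As a reading aid, here is a minimal reference sketch that is consistent with those assertions. The `_sketch` names are hypothetical, and the real `langbot.pkg.vector.filter_utils` implementation may differ, for example in the exact error message text beyond the "Unsupported filter operator" prefix the test matches.

```python
from typing import Any, Dict, List, Optional, Set, Tuple

# Assumed to mirror the operator set asserted in TestSupportedOpsConstant.
SUPPORTED_OPS_SKETCH = frozenset({'$eq', '$ne', '$gt', '$gte', '$lt', '$lte', '$in', '$nin'})


def normalize_filter_sketch(filter_dict: Optional[Dict[str, Any]]) -> List[Tuple[str, str, Any]]:
    """Flatten a Mongo-style filter dict into (field, op, value) triples."""
    triples: List[Tuple[str, str, Any]] = []
    for field, condition in (filter_dict or {}).items():
        if isinstance(condition, dict):
            # Explicit operator form: {'field': {'$op': value}}
            for op, value in condition.items():
                if op not in SUPPORTED_OPS_SKETCH:
                    raise ValueError(f'Unsupported filter operator: {op}')
                triples.append((field, op, value))
        else:
            # Bare value form: {'field': value} means implicit equality
            triples.append((field, '$eq', condition))
    return triples


def strip_unsupported_fields_sketch(
    triples: List[Tuple[str, str, Any]],
    supported_fields: Set[str],
    field_aliases: Optional[Dict[str, str]] = None,
) -> List[Tuple[str, str, Any]]:
    """Resolve aliases, then keep only triples whose field is supported."""
    aliases = field_aliases or {}
    kept = []
    for field, op, value in triples:
        resolved = aliases.get(field, field)
        if resolved in supported_fields:
            kept.append((resolved, op, value))
    return kept
```

Resolving the alias before the membership check is what makes both alias tests pass: `uuid` survives when `chunk_uuid` is supported and is dropped when it is not.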
@@ -0,0 +1,338 @@
+"""Tests for VectorDBManager provider selection logic.
+
+Tests the initialization logic that selects the appropriate VDB backend
+based on configuration, without actually creating real VDB instances.
+"""
+
+from __future__ import annotations
+
+from unittest.mock import MagicMock
+
+from tests.utils.import_isolation import isolated_sys_modules
+
+
+class TestVectorDBManagerInitialization:
+    """Tests for VectorDBManager.initialize provider selection."""
+
+    def _create_mock_app(self, vdb_config: dict | None):
+        """Create mock app with vdb configuration."""
+        mock_app = MagicMock()
+        mock_app.instance_config = MagicMock()
+        mock_app.instance_config.data = MagicMock()
+        mock_app.instance_config.data.get = MagicMock(return_value=vdb_config)
+        mock_app.logger = MagicMock()
+        mock_app.logger.info = MagicMock()
+        mock_app.logger.warning = MagicMock()
+        return mock_app
+
+    def _make_vector_import_mocks(self):
+        """Create mocks for VDB backends to prevent real imports."""
+        mocks = {}
+
+        # Mock core.app to break circular import
+        mocks['langbot.pkg.core.app'] = MagicMock()
+
+        # Mock all VDB backend implementations
+        for backend in ['chroma', 'qdrant', 'seekdb', 'milvus', 'pgvector_db']:
+            mocks[f'langbot.pkg.vector.vdbs.{backend}'] = MagicMock()
+
+        return mocks
+
+    def test_initialize_no_config_defaults_to_chroma(self):
+        """No vdb config defaults to Chroma."""
+        mock_app = self._create_mock_app(None)
+
+        mocks = self._make_vector_import_mocks()
+        # Create mock Chroma class
+        mock_chroma_class = MagicMock()
+        mocks['langbot.pkg.vector.vdbs.chroma'].ChromaVectorDatabase = mock_chroma_class
+
+        with isolated_sys_modules(mocks):
+            # Import after mocking
+            from langbot.pkg.vector.mgr import VectorDBManager
+
+            mgr = VectorDBManager(mock_app)
+
+            # Run initialize synchronously for test
+            import asyncio
+            asyncio.get_event_loop().run_until_complete(mgr.initialize())
+
+            # Chroma should be instantiated
+            mock_chroma_class.assert_called_once_with(mock_app)
+            mock_app.logger.warning.assert_called()
+
+    def test_initialize_chroma_backend(self):
+        """Explicit chroma config uses Chroma backend."""
+        vdb_config = {'use': 'chroma'}
+        mock_app = self._create_mock_app(vdb_config)
+
+        mocks = self._make_vector_import_mocks()
+        mock_chroma_class = MagicMock()
+        mocks['langbot.pkg.vector.vdbs.chroma'].ChromaVectorDatabase = mock_chroma_class
+
+        with isolated_sys_modules(mocks):
+            from langbot.pkg.vector.mgr import VectorDBManager
+
+            mgr = VectorDBManager(mock_app)
+
+            import asyncio
+            asyncio.get_event_loop().run_until_complete(mgr.initialize())
+
+            mock_chroma_class.assert_called_once_with(mock_app)
+            mock_app.logger.info.assert_called()
+
+    def test_initialize_qdrant_backend(self):
+        """Qdrant config uses Qdrant backend."""
+        vdb_config = {'use': 'qdrant'}
+        mock_app = self._create_mock_app(vdb_config)
+
+        mocks = self._make_vector_import_mocks()
+        mock_qdrant_class = MagicMock()
+        mocks['langbot.pkg.vector.vdbs.qdrant'].QdrantVectorDatabase = mock_qdrant_class
+
+        with isolated_sys_modules(mocks):
+            from langbot.pkg.vector.mgr import VectorDBManager
+
+            mgr = VectorDBManager(mock_app)
+
+            import asyncio
+            asyncio.get_event_loop().run_until_complete(mgr.initialize())
+
+            mock_qdrant_class.assert_called_once_with(mock_app)
+
+    def test_initialize_seekdb_backend(self):
+        """SeekDB config uses SeekDB backend."""
+        vdb_config = {'use': 'seekdb'}
+        mock_app = self._create_mock_app(vdb_config)
+
+        mocks = self._make_vector_import_mocks()
+        mock_seekdb_class = MagicMock()
+        mocks['langbot.pkg.vector.vdbs.seekdb'].SeekDBVectorDatabase = mock_seekdb_class
+
+        with isolated_sys_modules(mocks):
+            from langbot.pkg.vector.mgr import VectorDBManager
+
+            mgr = VectorDBManager(mock_app)
+
+            import asyncio
+            asyncio.get_event_loop().run_until_complete(mgr.initialize())
+
+            mock_seekdb_class.assert_called_once_with(mock_app)
+
+    def test_initialize_milvus_backend_with_uri(self):
+        """Milvus config with custom URI."""
+        vdb_config = {
+            'use': 'milvus',
+            'milvus': {
+                'uri': 'http://localhost:19530',
+                'token': 'root:Milvus',
+                'db_name': 'langbot_db'
+            }
+        }
+        mock_app = self._create_mock_app(vdb_config)
+
+        mocks = self._make_vector_import_mocks()
+        mock_milvus_class = MagicMock()
+        mocks['langbot.pkg.vector.vdbs.milvus'].MilvusVectorDatabase = mock_milvus_class
+
+        with isolated_sys_modules(mocks):
+            from langbot.pkg.vector.mgr import VectorDBManager
+
+            mgr = VectorDBManager(mock_app)
+
+            import asyncio
+            asyncio.get_event_loop().run_until_complete(mgr.initialize())
+
+            mock_milvus_class.assert_called_once_with(
+                mock_app,
+                uri='http://localhost:19530',
+                token='root:Milvus',
+                db_name='langbot_db'
+            )
+
+    def test_initialize_milvus_backend_defaults(self):
+        """Milvus defaults when config not fully specified."""
+        vdb_config = {'use': 'milvus'}
+        mock_app = self._create_mock_app(vdb_config)
+
+        mocks = self._make_vector_import_mocks()
+        mock_milvus_class = MagicMock()
+        mocks['langbot.pkg.vector.vdbs.milvus'].MilvusVectorDatabase = mock_milvus_class
+
+        with isolated_sys_modules(mocks):
+            from langbot.pkg.vector.mgr import VectorDBManager
+
+            mgr = VectorDBManager(mock_app)
+
+            import asyncio
+            asyncio.get_event_loop().run_until_complete(mgr.initialize())
+
+            # Should use default values
+            mock_milvus_class.assert_called_once_with(
+                mock_app,
+                uri='./data/milvus.db',
+                token=None,
+                db_name='default'
+            )
+
+    def test_initialize_pgvector_with_connection_string(self):
+        """pgvector with connection string."""
+        vdb_config = {
+            'use': 'pgvector',
+            'pgvector': {
+                'connection_string': 'postgresql://user:pass@host:5432/langbot'
+            }
+        }
+        mock_app = self._create_mock_app(vdb_config)
+
+        mocks = self._make_vector_import_mocks()
+        mock_pgvector_class = MagicMock()
+        mocks['langbot.pkg.vector.vdbs.pgvector_db'].PgVectorDatabase = mock_pgvector_class
+
+        with isolated_sys_modules(mocks):
+            from langbot.pkg.vector.mgr import VectorDBManager
+
+            mgr = VectorDBManager(mock_app)
+
+            import asyncio
+            asyncio.get_event_loop().run_until_complete(mgr.initialize())
+
+            mock_pgvector_class.assert_called_once_with(
+                mock_app,
+                connection_string='postgresql://user:pass@host:5432/langbot'
+            )
+
+    def test_initialize_pgvector_with_individual_params(self):
+        """pgvector with individual connection parameters."""
+        vdb_config = {
+            'use': 'pgvector',
+            'pgvector': {
+                'host': 'db.example.com',
+                'port': 5433,
+                'database': 'vectordb',
+                'user': 'admin',
+                'password': 'secret'
+            }
+        }
+        mock_app = self._create_mock_app(vdb_config)
+
+        mocks = self._make_vector_import_mocks()
+        mock_pgvector_class = MagicMock()
+        mocks['langbot.pkg.vector.vdbs.pgvector_db'].PgVectorDatabase = mock_pgvector_class
+
+        with isolated_sys_modules(mocks):
+            from langbot.pkg.vector.mgr import VectorDBManager
+
+            mgr = VectorDBManager(mock_app)
+
+            import asyncio
+            asyncio.get_event_loop().run_until_complete(mgr.initialize())
+
+            mock_pgvector_class.assert_called_once_with(
+                mock_app,
+                host='db.example.com',
+                port=5433,
+                database='vectordb',
+                user='admin',
+                password='secret'
+            )
+
+    def test_initialize_pgvector_defaults(self):
+        """pgvector falls back to default connection parameters when none are configured."""
+        vdb_config = {'use': 'pgvector'}
+        mock_app = self._create_mock_app(vdb_config)
+
+        mocks = self._make_vector_import_mocks()
+        mock_pgvector_class = MagicMock()
+        mocks['langbot.pkg.vector.vdbs.pgvector_db'].PgVectorDatabase = mock_pgvector_class
+
+        with isolated_sys_modules(mocks):
+            from langbot.pkg.vector.mgr import VectorDBManager
+
+            mgr = VectorDBManager(mock_app)
+
+            import asyncio
+            asyncio.get_event_loop().run_until_complete(mgr.initialize())
+
+            mock_pgvector_class.assert_called_once_with(
+                mock_app,
+                host='localhost',
+                port=5432,
+                database='langbot',
+                user='postgres',
+                password='postgres'
+            )
+
+    def test_initialize_unknown_backend_defaults_to_chroma(self):
+        """Unknown vdb type falls back to Chroma and logs a warning."""
+        vdb_config = {'use': 'unknown_backend'}
+        mock_app = self._create_mock_app(vdb_config)
+
+        mocks = self._make_vector_import_mocks()
+        mock_chroma_class = MagicMock()
+        mocks['langbot.pkg.vector.vdbs.chroma'].ChromaVectorDatabase = mock_chroma_class
+
+        with isolated_sys_modules(mocks):
+            from langbot.pkg.vector.mgr import VectorDBManager
+
+            mgr = VectorDBManager(mock_app)
+
+            import asyncio
+            asyncio.get_event_loop().run_until_complete(mgr.initialize())
+
+            mock_chroma_class.assert_called_once_with(mock_app)
+            mock_app.logger.warning.assert_called()
+            # Should warn about no valid backend
+            warning_msg = mock_app.logger.warning.call_args[0][0]
+            assert 'No valid' in warning_msg or 'defaulting' in warning_msg
+
+
+class TestVectorDBManagerProxies:
+    """Tests for VectorDBManager proxy methods."""
+
+    def test_get_supported_search_types_no_vector_db(self):
+        """get_supported_search_types returns ['vector'] when vector_db is None."""
+        mock_app = MagicMock()
+        mock_app.instance_config = MagicMock()
+        mock_app.instance_config.data = MagicMock()
+        mock_app.instance_config.data.get = MagicMock(return_value=None)
+        mock_app.logger = MagicMock()
+
+        mocks = {'langbot.pkg.core.app': MagicMock()}
+        for backend in ['chroma', 'qdrant', 'seekdb', 'milvus', 'pgvector_db']:
+            mocks[f'langbot.pkg.vector.vdbs.{backend}'] = MagicMock()
+
+        with isolated_sys_modules(mocks):
+            from langbot.pkg.vector.mgr import VectorDBManager
+
+            mgr = VectorDBManager(mock_app)
+            mgr.vector_db = None  # Explicitly None
+
+            result = mgr.get_supported_search_types()
+            assert result == ['vector']
+
+    def test_get_supported_search_types_with_vector_db(self):
+        """get_supported_search_types delegates to vector_db."""
+        mock_app = MagicMock()
+
+        # Create mock vector_db with supported_search_types
+        mock_vector_db = MagicMock()
+        mock_vector_db.supported_search_types = MagicMock(
+            return_value=[
+                MagicMock(value='vector'),
+                MagicMock(value='full_text'),
+            ]
+        )
+
+        mocks = {'langbot.pkg.core.app': MagicMock()}
+        for backend in ['chroma', 'qdrant', 'seekdb', 'milvus', 'pgvector_db']:
+            mocks[f'langbot.pkg.vector.vdbs.{backend}'] = MagicMock()
+
+        with isolated_sys_modules(mocks):
+            from langbot.pkg.vector.mgr import VectorDBManager
+
+            mgr = VectorDBManager(mock_app)
+            mgr.vector_db = mock_vector_db
+
+            result = mgr.get_supported_search_types()
+            assert result == ['vector', 'full_text']
@@ -0,0 +1,173 @@
+"""Tests for VectorDatabase base class and SearchType enum."""
+
+from __future__ import annotations
+
+from unittest.mock import AsyncMock
+import pytest
+
+from langbot.pkg.vector.vdb import SearchType, VectorDatabase
+
+
+class TestSearchType:
+    """Tests for SearchType enum."""
+
+    def test_search_type_values(self):
+        """SearchType members have the expected string values."""
+        assert SearchType.VECTOR.value == 'vector'
+        assert SearchType.FULL_TEXT.value == 'full_text'
+        assert SearchType.HYBRID.value == 'hybrid'
+
+    def test_search_type_is_string_enum(self):
+        """SearchType is a string enum."""
+        assert isinstance(SearchType.VECTOR, str)
+        assert SearchType.VECTOR == 'vector'
+
+    def test_search_type_from_string(self):
+        """Can create SearchType from string."""
+        assert SearchType('vector') == SearchType.VECTOR
+        assert SearchType('full_text') == SearchType.FULL_TEXT
+        assert SearchType('hybrid') == SearchType.HYBRID
+
+
+class TestVectorDatabaseAbstractMethods:
+    """Tests for VectorDatabase abstract methods."""
+
+    def test_vector_database_is_abstract(self):
+        """VectorDatabase is abstract and cannot be instantiated directly."""
+        with pytest.raises(TypeError):
+            VectorDatabase()
+
+    def test_abstract_methods_required(self):
+        """Subclass must implement all abstract methods."""
+        class IncompleteVectorDB(VectorDatabase):
+            pass
+
+        with pytest.raises(TypeError):
+            IncompleteVectorDB()
+
+    def test_supported_search_types_default(self):
+        """Default supported_search_types returns [VECTOR]."""
+        class MinimalVectorDB(VectorDatabase):
+            async def add_embeddings(self, collection, ids, embeddings_list, metadatas, documents=None):
+                pass
+
+            async def search(self, collection, query_embedding, k=5, search_type='vector', query_text='', filter=None, vector_weight=None):
+                pass
+
+            async def delete_by_file_id(self, collection, file_id):
+                pass
+
+            async def delete_by_filter(self, collection, filter):
+                pass
+
+            async def get_or_create_collection(self, collection):
+                pass
+
+            async def delete_collection(self, collection):
+                pass
+
+        db = MinimalVectorDB()
+        assert db.supported_search_types() == [SearchType.VECTOR]
+
+    def test_list_by_filter_default_implementation(self):
+        """list_by_filter default implementation returns ([], -1)."""
+        class MinimalVectorDB(VectorDatabase):
+            async def add_embeddings(self, collection, ids, embeddings_list, metadatas, documents=None):
+                pass
+
+            async def search(self, collection, query_embedding, k=5, search_type='vector', query_text='', filter=None, vector_weight=None):
+                pass
+
+            async def delete_by_file_id(self, collection, file_id):
+                pass
+
+            async def delete_by_filter(self, collection, filter):
+                pass
+
+            async def get_or_create_collection(self, collection):
+                pass
+
+            async def delete_collection(self, collection):
+                pass
+
+        db = MinimalVectorDB()
+        # The default list_by_filter returns an empty list and -1 as the total
+        import asyncio
+        result = asyncio.get_event_loop().run_until_complete(
+            db.list_by_filter('test_collection')
+        )
+        assert result == ([], -1)
+
+
+class TestVectorDatabaseInterface:
+    """Tests for VectorDatabase interface contracts."""
+
+    @pytest.fixture
+    def mock_vector_db(self):
+        """Create a minimal mock VectorDatabase for testing."""
+        class MockVectorDB(VectorDatabase):
+            def __init__(self):
+                self.add_embeddings = AsyncMock()
+                self.search = AsyncMock(return_value={
+                    'ids': [['id1', 'id2']],
+                    'distances': [[0.1, 0.2]],
+                    'metadatas': [[{'key': 'val1'}, {'key': 'val2'}]]
+                })
+                self.delete_by_file_id = AsyncMock()
+                self.delete_by_filter = AsyncMock(return_value=5)
+                self.get_or_create_collection = AsyncMock()
+                self.delete_collection = AsyncMock()
+
+            async def add_embeddings(self, collection, ids, embeddings_list, metadatas, documents=None):
+                pass
+
+            async def search(self, collection, query_embedding, k=5, search_type='vector', query_text='', filter=None, vector_weight=None):
+                pass
+
+            async def delete_by_file_id(self, collection, file_id):
+                pass
+
+            async def delete_by_filter(self, collection, filter):
+                pass
+
+            async def get_or_create_collection(self, collection):
+                pass
+
+            async def delete_collection(self, collection):
+                pass
+
+        return MockVectorDB()
+
+    @pytest.mark.asyncio
+    async def test_add_embeddings_signature(self, mock_vector_db):
+        """add_embeddings has expected signature."""
+        await mock_vector_db.add_embeddings(
+            collection='test',
+            ids=['id1', 'id2'],
+            embeddings_list=[[0.1, 0.2], [0.3, 0.4]],
+            metadatas=[{'a': 1}, {'b': 2}],
+            documents=['doc1', 'doc2']
+        )
+        mock_vector_db.add_embeddings.assert_called_once()
+
+    @pytest.mark.asyncio
+    async def test_search_signature(self, mock_vector_db):
+        """search has expected signature with all optional params."""
+        import numpy as np
+
+        await mock_vector_db.search(
+            collection='test',
+            query_embedding=np.array([0.1, 0.2]),
+            k=10,
+            search_type='hybrid',
+            query_text='search text',
+            filter={'file_id': 'abc'},
+            vector_weight=0.7
+        )
+        mock_vector_db.search.assert_called_once()
+
+    @pytest.mark.asyncio
+    async def test_delete_by_filter_returns_int(self, mock_vector_db):
+        """delete_by_filter returns int count."""
+        result = await mock_vector_db.delete_by_filter('test', {'file_id': 'abc'})
+        assert isinstance(result, int)
@@ -0,0 +1,359 @@
+"""Tests for VDB backend filter conversion functions.
+
+Tests cover:
+- _build_qdrant_filter: Qdrant models.Filter conversion
+- _build_milvus_expr: Milvus boolean expression string conversion
+- _build_pg_conditions: PostgreSQL SQLAlchemy conditions conversion
+"""
+from __future__ import annotations
+
+from importlib import import_module
+
+
+def get_qdrant_module():
+    """Lazy import qdrant module."""
+    return import_module('langbot.pkg.vector.vdbs.qdrant')
+
+
+def get_milvus_module():
+    """Lazy import milvus module."""
+    return import_module('langbot.pkg.vector.vdbs.milvus')
+
+
+def get_pgvector_module():
+    """Lazy import pgvector module."""
+    return import_module('langbot.pkg.vector.vdbs.pgvector_db')
+
+
+class TestQdrantFilterConversion:
+    """Tests for _build_qdrant_filter function."""
+
+    def test_empty_filter_returns_none_must_and_must_not(self):
+        """Empty filter dict returns Filter with None must/must_not."""
+        qdrant_module = get_qdrant_module()
+
+        result = qdrant_module._build_qdrant_filter({})
+        assert result.must is None
+        assert result.must_not is None
+
+    def test_eq_operator_creates_must_condition(self):
+        """$eq operator creates FieldCondition in must list."""
+        qdrant_module = get_qdrant_module()
+        from qdrant_client import models
+
+        result = qdrant_module._build_qdrant_filter({'file_id': 'abc'})
+
+        assert result.must is not None
+        assert len(result.must) == 1
+        condition = result.must[0]
+        assert condition.key == 'file_id'
+        assert isinstance(condition.match, models.MatchValue)
+        assert condition.match.value == 'abc'
+
+    def test_ne_operator_creates_must_not_condition(self):
+        """$ne operator creates FieldCondition in must_not list."""
+        qdrant_module = get_qdrant_module()
+        from qdrant_client import models
+
+        result = qdrant_module._build_qdrant_filter({'status': {'$ne': 'deleted'}})
+
+        assert result.must_not is not None
+        assert len(result.must_not) == 1
+        condition = result.must_not[0]
+        assert condition.key == 'status'
+        assert isinstance(condition.match, models.MatchValue)
+        assert condition.match.value == 'deleted'
+
+    def test_in_operator_creates_match_any(self):
+        """$in operator creates MatchAny condition."""
+        qdrant_module = get_qdrant_module()
+        from qdrant_client import models
+
+        result = qdrant_module._build_qdrant_filter({'file_type': {'$in': ['pdf', 'docx']}})
+
+        assert result.must is not None
+        assert len(result.must) == 1
+        condition = result.must[0]
+        assert condition.key == 'file_type'
+        assert isinstance(condition.match, models.MatchAny)
+        assert condition.match.any == ['pdf', 'docx']
+
+    def test_nin_operator_creates_must_not_match_any(self):
+        """$nin operator creates MatchAny in must_not."""
+        qdrant_module = get_qdrant_module()
+        from qdrant_client import models
+
+        result = qdrant_module._build_qdrant_filter({'status': {'$nin': ['deleted', 'archived']}})
+
+        assert result.must_not is not None
+        assert len(result.must_not) == 1
+        condition = result.must_not[0]
+        assert condition.key == 'status'
+        assert isinstance(condition.match, models.MatchAny)
+        assert condition.match.any == ['deleted', 'archived']
+
+    def test_range_operators_create_range_condition(self):
+        """$gt, $gte, $lt, $lte create Range conditions."""
+        qdrant_module = get_qdrant_module()
+        from qdrant_client import models
+
+        # Test $gt
+        result = qdrant_module._build_qdrant_filter({'created_at': {'$gt': 100}})
+        condition = result.must[0]
+        assert isinstance(condition.range, models.Range)
+        assert condition.range.gt == 100
+
+        # Test $gte
+        result = qdrant_module._build_qdrant_filter({'created_at': {'$gte': 100}})
+        condition = result.must[0]
+        assert condition.range.gte == 100
+
+        # Test $lt
+        result = qdrant_module._build_qdrant_filter({'created_at': {'$lt': 100}})
+        condition = result.must[0]
+        assert condition.range.lt == 100
+
+        # Test $lte
+        result = qdrant_module._build_qdrant_filter({'created_at': {'$lte': 100}})
+        condition = result.must[0]
+        assert condition.range.lte == 100
+
+    def test_multiple_conditions_combined(self):
+        """Multiple conditions are combined in must/must_not."""
+        qdrant_module = get_qdrant_module()
+
+        result = qdrant_module._build_qdrant_filter({
+            'file_id': 'abc',
+            'status': {'$ne': 'deleted'},
+            'created_at': {'$gte': 100},
+        })
+
+        assert len(result.must) == 2  # file_id eq + created_at gte
+        assert len(result.must_not) == 1  # status ne
+
+    def test_implicit_eq_handled(self):
+        """Implicit $eq (bare value) is correctly handled."""
+        qdrant_module = get_qdrant_module()
+        from qdrant_client import models
+
+        result = qdrant_module._build_qdrant_filter({'field': 'value'})
+
+        assert result.must is not None
+        condition = result.must[0]
+        assert isinstance(condition.match, models.MatchValue)
+
+
+class TestMilvusFilterConversion:
+    """Tests for _build_milvus_expr function.
+
+    NOTE: the Milvus filter builder only supports the fields 'text', 'file_id',
+    and 'chunk_uuid' (plus the 'uuid' alias); the tests use only these.
+    """
+
+    def test_empty_filter_returns_empty_string(self):
+        """Empty filter dict returns empty string."""
+        milvus_module = get_milvus_module()
+
+        result = milvus_module._build_milvus_expr({})
+        assert result == ''
+
+    def test_eq_operator_expression(self):
+        """$eq operator creates == expression."""
+        milvus_module = get_milvus_module()
+
+        result = milvus_module._build_milvus_expr({'file_id': 'abc'})
+        assert result == 'file_id == "abc"'
+
+    def test_ne_operator_expression(self):
+        """$ne operator creates != expression."""
+        milvus_module = get_milvus_module()
+
+        result = milvus_module._build_milvus_expr({'file_id': {'$ne': 'deleted'}})
+        assert result == 'file_id != "deleted"'
+
+    def test_comparison_operators(self):
+        """$gt, $gte, $lt, $lte create comparison expressions."""
+        milvus_module = get_milvus_module()
+
+        assert milvus_module._build_milvus_expr({'chunk_uuid': {'$gt': 'uuid_100'}}) == 'chunk_uuid > "uuid_100"'
+        assert milvus_module._build_milvus_expr({'chunk_uuid': {'$gte': 'uuid_100'}}) == 'chunk_uuid >= "uuid_100"'
+        assert milvus_module._build_milvus_expr({'chunk_uuid': {'$lt': 'uuid_100'}}) == 'chunk_uuid < "uuid_100"'
+        assert milvus_module._build_milvus_expr({'chunk_uuid': {'$lte': 'uuid_100'}}) == 'chunk_uuid <= "uuid_100"'
+
+    def test_in_operator_expression(self):
+        """$in operator creates in [...] expression."""
+        milvus_module = get_milvus_module()
+
+        result = milvus_module._build_milvus_expr({'file_id': {'$in': ['pdf', 'docx']}})
+        assert result == 'file_id in ["pdf", "docx"]'
+
+    def test_nin_operator_expression(self):
+        """$nin operator creates not in [...] expression."""
+        milvus_module = get_milvus_module()
+
+        result = milvus_module._build_milvus_expr({'file_id': {'$nin': ['deleted', 'archived']}})
+        assert result == 'file_id not in ["deleted", "archived"]'
+
+    def test_multiple_conditions_joined_with_and(self):
+        """Multiple conditions are joined with 'and'."""
+        milvus_module = get_milvus_module()
+
+        result = milvus_module._build_milvus_expr({
+            'file_id': 'abc',
+            'chunk_uuid': {'$ne': 'def'},
+        })
+        assert ' and ' in result
+        assert 'file_id == "abc"' in result
+        assert 'chunk_uuid != "def"' in result
+
+    def test_string_value_escaped(self):
+        """String values are properly escaped."""
+        milvus_module = get_milvus_module()
+
+        # Test backslash escape
+        result = milvus_module._build_milvus_expr({'file_id': 'C:\\Users\\test'})
+        assert '\\\\' in result
+
+        # Test quote escape
+        result = milvus_module._build_milvus_expr({'file_id': 'test "quoted"'})
+        assert '\\"' in result
+
+    def test_text_field_supported(self):
+        """text field is supported."""
+        milvus_module = get_milvus_module()
+
+        result = milvus_module._build_milvus_expr({'text': 'some text'})
+        assert result == 'text == "some text"'
+
+    def test_milvus_literal_function(self):
+        """Test _milvus_literal helper."""
+        milvus_module = get_milvus_module()
+
+        assert milvus_module._milvus_literal('string') == '"string"'
+        assert milvus_module._milvus_literal(42) == '42'
+        assert milvus_module._milvus_literal(3.14) == '3.14'
+
+    def test_unsupported_field_dropped(self):
+        """Unsupported fields are dropped (not in _MILVUS_SUPPORTED_FIELDS)."""
+        milvus_module = get_milvus_module()
+
+        result = milvus_module._build_milvus_expr({'unknown_field': 'value'})
+        assert result == ''
+
+    def test_uuid_alias_resolved(self):
+        """'uuid' alias is resolved to 'chunk_uuid'."""
+        milvus_module = get_milvus_module()
+
+        result = milvus_module._build_milvus_expr({'uuid': 'abc'})
+        # The 'uuid' key must be rewritten to the 'chunk_uuid' field
+        assert result.startswith('chunk_uuid')
+
+
+class TestPgVectorFilterConversion:
+    """Tests for _build_pg_conditions function.
+
+    NOTE: the pgvector filter builder only supports the fields 'text', 'file_id',
+    and 'chunk_uuid' (plus the 'uuid' alias); the tests use only these.
+    """
+
+    def test_empty_filter_returns_empty_list(self):
+        """Empty filter dict returns empty list."""
+        pgvector_module = get_pgvector_module()
+
+        result = pgvector_module._build_pg_conditions({})
+        assert result == []
+
+    def test_eq_operator_creates_equality_condition(self):
+        """$eq operator creates SQLAlchemy == condition."""
+        pgvector_module = get_pgvector_module()
+
+        result = pgvector_module._build_pg_conditions({'file_id': 'abc'})
+
+        assert len(result) == 1
+        # Verify it's a SQLAlchemy BinaryExpression
+        from sqlalchemy.sql.expression import BinaryExpression
+        assert isinstance(result[0], BinaryExpression)
+
+    def test_ne_operator_creates_inequality_condition(self):
+        """$ne operator creates SQLAlchemy != condition."""
+        pgvector_module = get_pgvector_module()
+
+        result = pgvector_module._build_pg_conditions({'file_id': {'$ne': 'deleted'}})
+
+        assert len(result) == 1
+        # Operator should be ne (not equals)
+        assert '!=' in str(result[0]) or 'ne' in str(result[0].operator)
+
+    def test_comparison_operators(self):
+        """$gt, $gte, $lt, $lte create comparison conditions."""
+        pgvector_module = get_pgvector_module()
+
+        # Test all comparison operators with supported field
+        for op, expected_op in [
+            ('$gt', '>'),
+            ('$gte', '>='),
+            ('$lt', '<'),
+            ('$lte', '<='),
+        ]:
+            result = pgvector_module._build_pg_conditions({'chunk_uuid': {op: 'uuid_100'}})
+            assert len(result) == 1
+            assert f' {expected_op} ' in str(result[0])  # spaces keep '>' from matching '>='
+
+    def test_in_operator_creates_in_condition(self):
+        """$in operator creates SQLAlchemy in_ condition."""
+        pgvector_module = get_pgvector_module()
+
+        result = pgvector_module._build_pg_conditions({'file_id': {'$in': ['a', 'b', 'c']}})
+
+        assert len(result) == 1
+        assert 'IN' in str(result[0]).upper()
+
+    def test_nin_operator_creates_notin_condition(self):
+        """$nin operator creates SQLAlchemy notin_ condition."""
+        pgvector_module = get_pgvector_module()
+
+        result = pgvector_module._build_pg_conditions({'file_id': {'$nin': ['a', 'b']}})
+
+        assert len(result) == 1
+        assert 'NOT IN' in str(result[0]).upper()
+
+    def test_multiple_conditions_list(self):
+        """Multiple conditions return list of conditions."""
+        pgvector_module = get_pgvector_module()
+
+        result = pgvector_module._build_pg_conditions({
+            'file_id': 'abc',
+            'chunk_uuid': {'$ne': 'def'},
+        })
+
+        assert len(result) == 2
+
+    def test_unsupported_field_dropped(self):
+        """Unsupported fields are dropped (not in _PG_SUPPORTED_FIELDS)."""
+        pgvector_module = get_pgvector_module()
+
+        result = pgvector_module._build_pg_conditions({'unknown_field': 'value'})
+        assert result == []
+
+    def test_uuid_alias_resolved(self):
+        """'uuid' alias is resolved to 'chunk_uuid'."""
+        pgvector_module = get_pgvector_module()
+
+        result = pgvector_module._build_pg_conditions({'uuid': 'abc'})
+
+        assert len(result) == 1
+        # Should reference chunk_uuid column
+        assert 'chunk_uuid' in str(result[0])
+
+    def test_supported_fields_only(self):
+        """Only supported fields (text, file_id, chunk_uuid) are kept."""
+        pgvector_module = get_pgvector_module()
+
+        result = pgvector_module._build_pg_conditions({
+            'text': {'$ne': ''},
+            'file_id': 'abc',
+            'chunk_uuid': {'$in': ['x', 'y']},
+            'unsupported': 'value',
+        })
+
+        assert len(result) == 3  # Only supported fields