Feat/test build (#2174)

* fix(ci): update unit-test workflow paths to match current source layout

Replace stale pkg/** filter with src/langbot/** and add uv.lock.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(tests): update README to reflect current test layout

- Fix stale paths: tests/pipeline → tests/unit_tests/pipeline
- Update CI Python versions: 3.11, 3.12, 3.13
- Add test directory structure for box, config, platform, plugin, provider, storage
- Document pytest markers and uv commands
- Mention planned E2E tests

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(test): add shared test factories package

Create tests/factories/ with reusable test factories:
- FakeApp: mock application with all dependencies
- Message chains: text_chain, mention_chain, image_chain
- Query factories: text_query, group_text_query, command_query, etc.

No test changes - maintains backward compatibility.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(test): add fake provider factory

Add tests/factories/provider.py with:
- FakeProvider: deterministic fake LLM provider
- Error simulation: timeout, auth, rate-limit, malformed
- Request capture for assertions
- fake_model: mock model with attached provider
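
A deterministic fake provider with error simulation and request capture might look roughly like the sketch below. The class name `FakeProviderSketch`, the `fail_with` parameter, the `invoke` method, and both exception types are assumptions made for illustration; they are not taken from tests/factories/provider.py.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass, field


class FakeTimeoutError(Exception):
    """Hypothetical error standing in for a provider timeout."""


class FakeAuthError(Exception):
    """Hypothetical error standing in for an authentication failure."""


# Maps a failure-mode name to the exception the fake raises (assumed names).
_FAILURES = {'timeout': FakeTimeoutError, 'auth': FakeAuthError}


@dataclass
class FakeProviderSketch:
    """Deterministic stand-in for an LLM provider (illustrative only)."""

    reply: str = 'LANGBOT_FAKE_PONG'
    fail_with: str | None = None
    requests: list[dict] = field(default_factory=list)

    async def invoke(self, messages: list[dict]) -> str:
        # Record every request so a test can assert on what was sent.
        self.requests.append({'messages': messages})
        if self.fail_with is not None:
            raise _FAILURES[self.fail_with](self.fail_with)
        return self.reply


provider = FakeProviderSketch()
answer = asyncio.run(provider.invoke([{'role': 'user', 'content': 'ping'}]))
```

Because the reply is fixed and every request is recorded, a test can assert on both the response and the captured input without any network access.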

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(test): add fake platform factory

Add tests/factories/platform.py with:
- FakePlatform: simulated platform adapter
- Inbound message construction: friend/group/image
- Mention-bot flag simulation
- Outbound message capture for assertions
- Streaming output support simulation
- Send failure simulation

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(test): add comprehensive message/query factories

Extend tests/factories/message.py with:
- file_query: file attachment query
- unsupported_query: unknown message segment
- voice_query: audio/voice query
- at_all_query: group @All mention
- query_with_session: query with session object
- query_with_config: query with custom pipeline config

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(test): add fake message flow smoke test

Create tests/smoke/test_fake_message_flow.py:
- TestFakeMessageFlow: factory verification tests
- TestMessageFlowIntegration: minimal flow smoke test
- Tests FakeApp, FakeProvider, FakePlatform, query factories
- Verifies LANGBOT_FAKE_PONG marker response
- Captures outbound messages for assertions

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(test): add developer test-quick command

Add scripts/test-quick.sh and Makefile with:
- test-quick: runs ruff check + unit tests + smoke tests
- No real provider keys or platform accounts required
- Suitable for local branch self-test

Update tests/README.md:
- Document test-quick command
- Document test factories package
- Add smoke tests and factories directory structure

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(test): make test-quick reliable as developer gate

Fixes for D-001 acceptance issues:
1. test-quick.sh: use set -euo pipefail, uv run ruff, no tail pipe
2. Remove unused imports in factories (app.py, platform.py, provider.py)
3. Fix unused variable in smoke test
4. Add noqa: E402 to test_n8nsvapi.py lazy imports
5. Update smoke test docs: "minimal fake flow" not full pipeline

Now test-quick is a reliable gate: lint failures exit 1, test failures propagate.
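
The fail-fast behavior described here (stop at the first failing step and propagate its exit code) can be sketched in Python as follows. This is an illustrative equivalent, not the contents of scripts/test-quick.sh; the `run_gate` function and the placeholder step commands are assumptions.

```python
import subprocess
import sys


def run_gate(steps):
    """Run each command in order; return the first non-zero exit code, else 0."""
    for name, cmd in steps:
        code = subprocess.run(cmd).returncode
        if code != 0:
            print(f'gate failed at step {name!r} (exit {code})')
            return code
    return 0


# Placeholder commands stand in for the real lint, unit, and smoke steps.
steps = [
    ('lint', [sys.executable, '-c', 'pass']),
    ('unit', [sys.executable, '-c', 'import sys; sys.exit(3)']),
    ('smoke', [sys.executable, '-c', 'pass']),
]
exit_code = run_gate(steps)
```

Here the placeholder unit step fails, so the smoke step never runs and the caller sees the failing step's exit code.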

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(unit): add preproc and taskmgr unit tests

U-001: Pipeline Preprocessor tests
- Normal text message processing
- Empty message handling
- Image segment with/without vision model
- Model selection and fallback
- Variable extraction

U-004: Core Task Manager tests (pattern-based)
- Task creation and tracking patterns
- Task cancellation patterns
- Scope-based cancellation
- Task type filtering
- Pruning completed tasks
- Wait all tasks

The taskmgr tests use a pattern-based approach to avoid the circular import
in the source code (taskmgr → app → http_controller → migration → taskmgr).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(unit): add config loader unit tests

U-005: Config Loader tests
- Valid YAML config loading
- Valid JSON config loading
- Invalid YAML/JSON error behavior
- Missing config file creation from template
- Template completion for missing keys
- ConfigManager load/dump operations
- Exists check for both YAML and JSON

All tests use tmp_path fixture, no real project config.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(unit): add chat and command handler pattern tests

U-002: Chat Handler tests (pattern-based)
- Normal message event emission pattern
- prevent_default handling
- User message alteration pattern
- Runner selection pattern
- Streaming/non-streaming response patterns
- Exception handling modes (show-error, show-hint, hide)
- Message history update pattern
- Telemetry payload pattern

U-003: Command Handler tests (pattern-based)
- Command parsing and text extraction
- Event creation pattern
- Privilege/admin check pattern
- Command result handling (text, error, image)
- prevent_default handling
- String truncation helper

Uses pattern-based testing to avoid circular import issues in the source code:
importing the handler modules directly triggers a circular import chain.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style: fix unused imports after ruff auto-fix

Remove unused imports in test files:
- test_config_loader.py: remove unused os
- test_taskmgr.py: remove unused Mock
- test_preproc.py: remove unused unsupported_query, image_chain

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(unit): improve taskmgr tests to test real classes

U-004 improved: Tests now import and test actual classes:
- TaskContext: new(), trace(), to_dict(), placeholder()
- TaskWrapper: task creation, context, exception/result capture, cancel, to_dict
- AsyncTaskManager: create_task, create_user_task, cancel_task, cancel_by_scope
- Task pruning behavior

Uses pre-mocking technique:
- Mock langbot.pkg.core.app before import (breaks circular chain)
- Mock langbot.pkg.core.entities with proper Enum

All 24 tests now test real class behavior, not patterns.
taskmgr.py coverage should improve significantly.
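
The pre-mocking technique can be sketched as below: stub modules are registered in `sys.modules` before the module under test is imported, so its imports resolve to the stubs instead of walking the circular chain. The `demo_pkg` module names and the `Scope` enum are hypothetical placeholders; the real tests stub langbot.pkg.core.app and langbot.pkg.core.entities.

```python
import enum
import importlib
import sys
import types
from unittest.mock import MagicMock

# Hypothetical package stub; __path__ marks it as a package for submodule imports.
pkg = types.ModuleType('demo_pkg')
pkg.__path__ = []

# A MagicMock is enough for a module whose attributes are only passed around.
app_stub = MagicMock(name='demo_pkg.app')

# A module whose members are compared or iterated needs real objects.
entities_stub = types.ModuleType('demo_pkg.entities')


class Scope(enum.Enum):
    """A real Enum, so equality and lookup behave as they would in production."""

    APPLICATION = 'application'
    PLATFORM = 'platform'


entities_stub.Scope = Scope

# Register the stubs before anything tries to import the real modules.
sys.modules['demo_pkg'] = pkg
sys.modules['demo_pkg.app'] = app_stub
sys.modules['demo_pkg.entities'] = entities_stub

# Later imports now resolve to the stubs without executing real module code.
entities = importlib.import_module('demo_pkg.entities')
app = importlib.import_module('demo_pkg.app')
```

Using a real `Enum` rather than a mock matters when the code under test compares members or looks them up by value.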

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(test): consolidate FakeApp and add sys.modules isolation utility

- Extract tests/utils/import_isolation.py with isolated_sys_modules context manager
- Extend tests/factories/app.py FakeApp with handler-specific attributes
- Refactor test_chat_handler.py to use centralized FakeApp and cached imports
- Refactor test_command_handler.py with mock_execute_factory fixture
- Refactor test_smoke.py to move import-time sys.modules manipulation into fixture
- Add SQLite migration integration tests (G-002)
- Add HTTP API smoke integration tests (G-005)
- Update CI workflow to call pytest for SQLite migrations (G-004)
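
A context manager for `sys.modules` isolation could be sketched as follows. This is a minimal illustration under the assumption that the utility takes a dict of module name to stub; it is not the contents of tests/utils/import_isolation.py, and the name `isolated_sys_modules_sketch` is made up here.

```python
import contextlib
import sys
from unittest.mock import MagicMock


@contextlib.contextmanager
def isolated_sys_modules_sketch(mocks):
    """Temporarily install `mocks` into sys.modules, then restore the prior state."""
    saved = dict(sys.modules)
    sys.modules.update(mocks)
    try:
        yield
    finally:
        # Drop everything imported inside the block, then put back any
        # entries that the mocks had replaced.
        for name in list(sys.modules):
            if name not in saved:
                del sys.modules[name]
        sys.modules.update(saved)


stub = MagicMock()
with isolated_sys_modules_sketch({'hypothetical_heavy_dep': stub}):
    import hypothetical_heavy_dep

    inside = hypothetical_heavy_dep is stub
```

Restoring in a `finally` block keeps a failing test from leaking stubbed modules into later tests.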

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(test): add developer quality gate consolidation (G-007)

- Add scripts/test-integration-fast.sh for fast integration tests
- Add scripts/test-coverage.sh with 12% baseline threshold
- Update Makefile with test-integration-fast, test-coverage, test-all-local
- Update CI workflow with integration and coverage jobs
- Add smoke marker to pytest.ini
- Update tests/README.md with quality gate layers documentation
- Add tests/integration/pipeline/ for pipeline stage-chain tests

Quality gate layers:
- Quick: ruff + unit + smoke (~2 min)
- Fast Integration: SQLite/API/Pipeline (~3 min)
- Coverage: 12% threshold gate (~8 min)
- Full Local: all three combined

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(test): add PostgreSQL migration slow integration tests (G-003)

- Add tests/integration/persistence/test_migrations_postgres.py
- All tests marked with @pytest.mark.slow
- Tests skip when TEST_POSTGRES_URL is not set (no local PostgreSQL)
- Database isolation via clean_tables and clean_alembic_version fixtures
- Update CI workflow to use pytest instead of inline Python script
- Remove TODO(G-003) comment
- Update tests/README.md with PostgreSQL test documentation

Covered scenarios:
- Baseline stamp sets revision
- Upgrade from baseline to head
- Upgrade idempotent
- Get current on unstamped DB returns None

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(test): Phase 1.5 coverage expansion - COV-001 to COV-013

Coverage baseline raised from 13.65% to 26% (+12.35%)
Gate raised from 12% to 18%

Tasks completed:
- COV-001: Command system unit tests (100% coverage)
- COV-002: API service unit tests batch 1 (user/apikey/model/provider)
- COV-003: Provider model manager unit tests
- COV-004: Pipeline remaining stage tests (aggregator/cntfilter/longtext/msgtrun)
- COV-005: Storage and utils coverage pass
- COV-006: Gate ratchet 12%→15%
- COV-007: Gate ratchet 15%→18%
- COV-008: API service batch 2 (bot/pipeline/webhook/space/maintenance/mcp)
- COV-009: Blocked - API controller circular import issue documented
- COV-010: Plugin runtime unit tests (+0.08%)
- COV-011: RAG and vector unit tests (+0.68%)
- COV-012: Core boot and migration unit tests
- COV-013: Provider requester logic unit tests (+0.62%)

Key additions:
- tests/utils/import_isolation.py: sys.modules isolation for circular imports
- Provider requester mock tests: proved HTTP-dependent code can be tested locally
- Vector filter utilities: 100% coverage on pure functions
- API services: fake persistence pattern for unit testing

Blocked issue COV-009 documented in langbot-test-plan/1.5/issues/

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(phase1): add unit tests for telemetry, plugin, rag, persistence

Add initial unit tests for Phase 1 of test coverage improvement:
- telemetry: test initialization, payload sanitization, early returns (14.3% → 62.9%)
- plugin: test _parse_plugin_id static method
- rag: test _to_i18n_name static method
- persistence: test serialize_model with datetime handling

Overall core coverage: 41.9% → 42.2%

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(phase2): add unit tests for core, persistence, plugin, utils

- Add test_handler_helpers.py for plugin handler helpers (7 tests)
- Add test_mgr_methods.py for persistence manager (5 tests)
- Add test_app_config_validation.py for core app config (12 tests)
- Add test_knowledge_service.py for API knowledge service (22 tests)
- Add test_kbmgr.py for RAG knowledge base manager (39 tests)
- Add test_survey_manager.py for survey manager (22 tests)
- Add test_connector_methods.py for plugin connector (24 tests)
- Add test_funcschema.py for utils function schema (9 tests)
- Add test_platform.py for utils platform detection (7 tests)
- Add test_extract_deps.py for plugin deps extraction (7 tests)
- Add test_database_decorator.py for persistence decorator (7 tests)
- Add test_load_config.py for core config loading (19 tests)
- Add COVERAGE_EXCLUSIONS.md documenting external adapter exclusions
- Fix test_chat_session_limit.py path for portability

Coverage: core 28% → 30%, persistence 24% → 24.4%, plugin 27% → 28%
Total: 1082 tests passed, core module coverage 45.5%

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(integration): add API controller integration tests

- Add test_pipelines.py (10 tests) covering pipelines CRUD operations
  - GET/POST/PUT/DELETE on /api/v1/pipelines
  - Extensions endpoint
  - Metadata endpoint
  - Coverage: pipelines controller 27% → 80%

- Add test_providers.py (10 tests) covering provider/model management
  - Provider CRUD with model counts
  - LLM model CRUD
  - Coverage: providers controller 23% → 81%, models 29% → 45%

Tests use Quart TestClient with mocked services for real HTTP behavior
without external dependencies.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(integration): add knowledge, bots, and model endpoints tests

- Add test_knowledge.py (10 tests) covering knowledge base management
  - CRUD operations on /api/v1/knowledge/bases
  - Files management endpoints
  - Retrieve endpoint with validation
  - Coverage: knowledge/base.py 26% → 91%

- Add test_bots.py (9 tests) covering bot management
  - CRUD operations on /api/v1/platform/bots
  - Logs endpoint
  - Send message endpoint with validation
  - Coverage: platform/bots.py 24% → 87%

- Extend test_providers.py (+4 tests) for embedding/rerank models
  - Embedding models CRUD
  - Rerank models CRUD
  - Coverage: provider/models.py 29% → 60%

Total integration tests: 53 (smoke 12 + pipelines 10 + providers 14 + knowledge 10 + bots 9)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(integration): add embed and monitoring endpoint tests

Add integration tests for embed widget and monitoring API endpoints:
- test_embed.py: 15 tests for widget.js, logo, turnstile, messages, reset, feedback
- test_monitoring.py: 15 tests for overview, messages, llm-calls, sessions, errors, export

Coverage improvements:
- embed.py: 17% → 56%
- monitoring.py: 17% → 93%

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(e2e): add minimal startup E2E tests

Add E2E tests for LangBot startup flow:
- tests/e2e/utils/config_factory.py: minimal config generation
- tests/e2e/utils/process_manager.py: LangBot subprocess management
- tests/e2e/conftest.py: E2E fixtures (session-scoped process)
- tests/e2e/test_startup.py: 12 tests for startup verification

Tests verify:
- boot.py + stages execution
- database initialization (SQLite)
- API availability
- migrations applied

Uses embedded databases (SQLite, Chroma) - no external dependencies.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(quality): fix fake tests and add missing coverage

P0 fixes:
- telemetry: rewrite fake tests with real behavior verification (25 tests)
- config: delete copied-source tests, use proper imports (2 deleted)
- persistence: fix try-except pass to verify specific errors

P1 fixes:
- pipeline: add real FixedWindowAlgo tests instead of mocks (12 tests)
- provider: add SessionManager and ToolManager tests (25 tests)
- storage: add S3StorageProvider tests with moto mock (16 tests)
- plugin: add handler action tests for setting inheritance (15 tests)
- rag: add file storage and ZIP processing tests (21 tests)
- vector: add VDB filter conversion tests (30 tests)

P2 fixes:
- pipeline/msgtrun: strengthen assertions for exact message count
- api: add response structure validation in integration tests

New test files:
- provider/test_session_manager.py
- provider/test_tool_manager.py
- storage/test_s3storage.py
- plugin/test_handler_actions.py
- rag/test_file_storage.py
- vector/test_vdb_filter_conversion.py

Source code bugs documented:
- provider: TokenManager.next_token() ZeroDivisionError
- telemetry: send_tasks class variable shared state
- command: empty command IndexError, unused parameters
- utils: funcschema KeyError
- entity: vector.py independent declarative_base

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(test): update coverage stats and test structure

- Update coverage from 22% to 30%
- Add new test files to structure:
  - provider: session_manager, tool_manager
  - storage: s3storage
  - plugin: handler_actions
  - rag: file_storage
  - vector: vdb_filter_conversion
  - telemetry: rewritten tests
- Update module coverage percentages

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test: add 105 new unit tests for untested core functionality

Add comprehensive tests for B-class issues (core functionality untested):

Pipeline:
- test_pool.py: QueryPool ID generation, caching, async context (12 tests)
- test_ratelimit.py: Fixed timing-sensitive test tolerance
- test_pipelinemgr.py: Use real Pydantic StageProcessResult instead of Mock

Utils:
- test_version.py: Version comparison functions (20 tests)
- test_logcache.py: Log page management and retrieval (18 tests)
- test_httpclient.py: HTTP session pool management (10 tests)
- test_proxy.py: Proxy configuration from env and config (10 tests)
- test_image.py: URL parsing and base64 extraction (12 tests)
- test_pkgmgr.py: Pip command generation (8 tests)

Discover:
- test_engine.py: I18nString, Metadata, Component manifest (15 tests)

Test count: 1193 → 1298 (+105 tests)

Note: Some B-class issues cannot be tested due to circular import bugs
filed as GitHub issues #2175 (pipeline) and #2176 (persistence).

* test: tighten phase 1 coverage contracts

* test: align ci integration isolation

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
huanghuoguoguo
2026-05-16 12:05:54 +08:00
committed by GitHub
parent 4a4c0921a4
commit 17bbc8bf10
130 changed files with 32711 additions and 889 deletions


@@ -0,0 +1,210 @@
"""Tests for vector filter utilities."""

from __future__ import annotations

import pytest

from langbot.pkg.vector.filter_utils import (
    SUPPORTED_OPS,
    normalize_filter,
    strip_unsupported_fields,
)


class TestNormalizeFilter:
    """Tests for normalize_filter function."""

    def test_normalize_filter_empty_dict(self):
        """Empty dict returns empty list."""
        result = normalize_filter({})
        assert result == []

    def test_normalize_filter_none(self):
        """None returns empty list."""
        result = normalize_filter(None)
        assert result == []

    def test_normalize_filter_implicit_eq(self):
        """Bare value becomes implicit $eq."""
        result = normalize_filter({'file_id': 'abc123'})
        assert len(result) == 1
        assert result[0] == ('file_id', '$eq', 'abc123')

    def test_normalize_filter_explicit_eq(self):
        """Explicit $eq operator."""
        result = normalize_filter({'file_id': {'$eq': 'abc123'}})
        assert len(result) == 1
        assert result[0] == ('file_id', '$eq', 'abc123')

    def test_normalize_filter_comparison_operators(self):
        """Test a comparison operator ($gte); the others are covered by the all-ops test."""
        result = normalize_filter({'created_at': {'$gte': 1700000000}})
        assert len(result) == 1
        assert result[0] == ('created_at', '$gte', 1700000000)

    def test_normalize_filter_ne_operator(self):
        """Test $ne operator."""
        result = normalize_filter({'status': {'$ne': 'deleted'}})
        assert len(result) == 1
        assert result[0] == ('status', '$ne', 'deleted')

    def test_normalize_filter_in_operator(self):
        """Test $in operator with list value."""
        result = normalize_filter({'file_type': {'$in': ['pdf', 'docx', 'txt']}})
        assert len(result) == 1
        assert result[0] == ('file_type', '$in', ['pdf', 'docx', 'txt'])

    def test_normalize_filter_nin_operator(self):
        """Test $nin operator."""
        result = normalize_filter({'status': {'$nin': ['deleted', 'archived']}})
        assert len(result) == 1
        assert result[0] == ('status', '$nin', ['deleted', 'archived'])

    def test_normalize_filter_multiple_conditions(self):
        """Multiple top-level keys are AND-ed (returned as multiple triples)."""
        result = normalize_filter({
            'file_id': 'abc',
            'status': {'$ne': 'deleted'},
            'created_at': {'$gte': 1700000000}
        })
        assert len(result) == 3
        # Membership only: the order of the triples is not asserted here
        field_ops = [(field, op) for field, op, _ in result]
        assert ('file_id', '$eq') in field_ops
        assert ('status', '$ne') in field_ops
        assert ('created_at', '$gte') in field_ops

    def test_normalize_filter_unsupported_operator_raises(self):
        """Unsupported operator raises ValueError."""
        with pytest.raises(ValueError, match='Unsupported filter operator'):
            normalize_filter({'field': {'$regex': 'pattern'}})

    def test_normalize_filter_all_supported_ops(self):
        """Test all supported operators are recognized."""
        for op in SUPPORTED_OPS:
            if op in ('$in', '$nin'):
                filter_dict = {'field': {op: ['value1', 'value2']}}
            else:
                filter_dict = {'field': {op: 'value'}}
            result = normalize_filter(filter_dict)
            assert len(result) == 1
            assert result[0][1] == op


class TestStripUnsupportedFields:
    """Tests for strip_unsupported_fields function."""

    def test_strip_keeps_supported_fields(self):
        """Fields in supported_fields are kept."""
        triples = [
            ('file_id', '$eq', 'abc'),
            ('chunk_uuid', '$ne', 'def'),
        ]
        result = strip_unsupported_fields(triples, {'file_id', 'chunk_uuid'})
        assert len(result) == 2
        assert result == triples

    def test_strip_removes_unsupported_fields(self):
        """Fields not in supported_fields are removed."""
        triples = [
            ('file_id', '$eq', 'abc'),
            ('unknown_field', '$ne', 'def'),
        ]
        result = strip_unsupported_fields(triples, {'file_id'})
        assert len(result) == 1
        assert result[0] == ('file_id', '$eq', 'abc')

    def test_strip_empty_triples(self):
        """Empty triples list returns empty list."""
        result = strip_unsupported_fields([], {'file_id'})
        assert result == []

    def test_strip_all_unsupported(self):
        """All fields unsupported returns empty list."""
        triples = [
            ('unknown1', '$eq', 'a'),
            ('unknown2', '$eq', 'b'),
        ]
        result = strip_unsupported_fields(triples, {'file_id'})
        assert result == []

    def test_strip_with_field_aliases(self):
        """Field aliases are resolved before checking support."""
        triples = [
            ('uuid', '$eq', 'abc'),  # alias for chunk_uuid
            ('file_id', '$eq', 'def'),
        ]
        result = strip_unsupported_fields(
            triples,
            {'file_id', 'chunk_uuid'},
            field_aliases={'uuid': 'chunk_uuid'}
        )
        assert len(result) == 2
        # 'uuid' should be resolved to 'chunk_uuid'
        assert result[0] == ('chunk_uuid', '$eq', 'abc')
        assert result[1] == ('file_id', '$eq', 'def')

    def test_strip_alias_not_in_supported(self):
        """Alias resolved but still not in supported_fields is dropped."""
        triples = [
            ('uuid', '$eq', 'abc'),  # alias for chunk_uuid, but not supported
        ]
        result = strip_unsupported_fields(
            triples,
            {'file_id'},  # chunk_uuid not supported
            field_aliases={'uuid': 'chunk_uuid'}
        )
        assert result == []

    def test_strip_preserves_operator_and_value(self):
        """Strip only affects field name, not operator or value."""
        triples = [
            ('file_id', '$in', ['a', 'b', 'c']),
        ]
        result = strip_unsupported_fields(triples, {'file_id'})
        assert result[0] == ('file_id', '$in', ['a', 'b', 'c'])

    def test_strip_none_aliases(self):
        """None field_aliases is treated as empty dict."""
        triples = [
            ('file_id', '$eq', 'abc'),
        ]
        result = strip_unsupported_fields(triples, {'file_id'}, field_aliases=None)
        assert len(result) == 1
        assert result[0] == ('file_id', '$eq', 'abc')


class TestSupportedOpsConstant:
    """Tests for SUPPORTED_OPS constant."""

    def test_supported_ops_contains_expected(self):
        """SUPPORTED_OPS contains all expected operators."""
        expected = {'$eq', '$ne', '$gt', '$gte', '$lt', '$lte', '$in', '$nin'}
        assert SUPPORTED_OPS == expected

    def test_supported_ops_is_frozenset(self):
        """SUPPORTED_OPS is a frozenset for immutability."""
        from collections.abc import Set

        # collections.abc.Set matches both set and frozenset, so this checks
        # set-likeness rather than immutability.
        assert isinstance(SUPPORTED_OPS, Set)
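
The behavior these tests pin down can be summarized with a small reference sketch. It is inferred only from the assertions above and is not the code in langbot.pkg.vector.filter_utils; the `_sketch` suffixes mark the functions as hypothetical, and the exact error message wording beyond "Unsupported filter operator" is an assumption.

```python
from __future__ import annotations

# Local mirror of the operator set asserted in the tests above.
SUPPORTED_OPS = frozenset({'$eq', '$ne', '$gt', '$gte', '$lt', '$lte', '$in', '$nin'})


def normalize_filter_sketch(filter_dict):
    """Flatten a Mongo-style filter dict into (field, op, value) triples."""
    triples = []
    for field, condition in (filter_dict or {}).items():
        if isinstance(condition, dict):
            for op, value in condition.items():
                if op not in SUPPORTED_OPS:
                    raise ValueError(f'Unsupported filter operator: {op}')
                triples.append((field, op, value))
        else:
            # A bare value is shorthand for equality.
            triples.append((field, '$eq', condition))
    return triples


def strip_unsupported_fields_sketch(triples, supported_fields, field_aliases=None):
    """Resolve aliases, then keep only triples whose field is supported."""
    aliases = field_aliases or {}
    kept = []
    for field, op, value in triples:
        resolved = aliases.get(field, field)
        if resolved in supported_fields:
            kept.append((resolved, op, value))
    return kept


triples = normalize_filter_sketch({'uuid': 'abc', 'status': {'$ne': 'deleted'}})
kept = strip_unsupported_fields_sketch(triples, {'chunk_uuid'}, {'uuid': 'chunk_uuid'})
```

In this sketch the alias is resolved before the support check, which is why an aliased field survives only when its resolved name is supported.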


@@ -0,0 +1,338 @@
"""Tests for VectorDBManager provider selection logic.

Tests the initialization logic that selects the appropriate VDB backend
based on configuration, without actually creating real VDB instances.
"""

from __future__ import annotations

from unittest.mock import MagicMock

from tests.utils.import_isolation import isolated_sys_modules


class TestVectorDBManagerInitialization:
    """Tests for VectorDBManager.initialize provider selection."""

    def _create_mock_app(self, vdb_config: dict | None):
        """Create mock app with vdb configuration."""
        mock_app = MagicMock()
        mock_app.instance_config = MagicMock()
        mock_app.instance_config.data = MagicMock()
        mock_app.instance_config.data.get = MagicMock(return_value=vdb_config)
        mock_app.logger = MagicMock()
        mock_app.logger.info = MagicMock()
        mock_app.logger.warning = MagicMock()
        return mock_app

    def _make_vector_import_mocks(self):
        """Create mocks for VDB backends to prevent real imports."""
        mocks = {}

        # Mock core.app to break circular import
        mocks['langbot.pkg.core.app'] = MagicMock()

        # Mock all VDB backend implementations
        for backend in ['chroma', 'qdrant', 'seekdb', 'milvus', 'pgvector_db']:
            mocks[f'langbot.pkg.vector.vdbs.{backend}'] = MagicMock()

        return mocks

    def test_initialize_no_config_defaults_to_chroma(self):
        """No vdb config defaults to Chroma."""
        mock_app = self._create_mock_app(None)
        mocks = self._make_vector_import_mocks()

        # Create mock Chroma class
        mock_chroma_class = MagicMock()
        mocks['langbot.pkg.vector.vdbs.chroma'].ChromaVectorDatabase = mock_chroma_class

        with isolated_sys_modules(mocks):
            # Import after mocking
            from langbot.pkg.vector.mgr import VectorDBManager

            mgr = VectorDBManager(mock_app)

            # Run initialize synchronously for test
            import asyncio

            asyncio.get_event_loop().run_until_complete(mgr.initialize())

            # Chroma should be instantiated
            mock_chroma_class.assert_called_once_with(mock_app)
            mock_app.logger.warning.assert_called()

    def test_initialize_chroma_backend(self):
        """Explicit chroma config uses Chroma backend."""
        vdb_config = {'use': 'chroma'}
        mock_app = self._create_mock_app(vdb_config)
        mocks = self._make_vector_import_mocks()

        mock_chroma_class = MagicMock()
        mocks['langbot.pkg.vector.vdbs.chroma'].ChromaVectorDatabase = mock_chroma_class

        with isolated_sys_modules(mocks):
            from langbot.pkg.vector.mgr import VectorDBManager

            mgr = VectorDBManager(mock_app)

            import asyncio

            asyncio.get_event_loop().run_until_complete(mgr.initialize())

            mock_chroma_class.assert_called_once_with(mock_app)
            mock_app.logger.info.assert_called()

    def test_initialize_qdrant_backend(self):
        """Qdrant config uses Qdrant backend."""
        vdb_config = {'use': 'qdrant'}
        mock_app = self._create_mock_app(vdb_config)
        mocks = self._make_vector_import_mocks()

        mock_qdrant_class = MagicMock()
        mocks['langbot.pkg.vector.vdbs.qdrant'].QdrantVectorDatabase = mock_qdrant_class

        with isolated_sys_modules(mocks):
            from langbot.pkg.vector.mgr import VectorDBManager

            mgr = VectorDBManager(mock_app)

            import asyncio

            asyncio.get_event_loop().run_until_complete(mgr.initialize())

            mock_qdrant_class.assert_called_once_with(mock_app)

    def test_initialize_seekdb_backend(self):
        """SeekDB config uses SeekDB backend."""
        vdb_config = {'use': 'seekdb'}
        mock_app = self._create_mock_app(vdb_config)
        mocks = self._make_vector_import_mocks()

        mock_seekdb_class = MagicMock()
        mocks['langbot.pkg.vector.vdbs.seekdb'].SeekDBVectorDatabase = mock_seekdb_class

        with isolated_sys_modules(mocks):
            from langbot.pkg.vector.mgr import VectorDBManager

            mgr = VectorDBManager(mock_app)

            import asyncio

            asyncio.get_event_loop().run_until_complete(mgr.initialize())

            mock_seekdb_class.assert_called_once_with(mock_app)

    def test_initialize_milvus_backend_with_uri(self):
        """Milvus config with custom URI."""
        vdb_config = {
            'use': 'milvus',
            'milvus': {
                'uri': 'http://localhost:19530',
                'token': 'root:Milvus',
                'db_name': 'langbot_db'
            }
        }
        mock_app = self._create_mock_app(vdb_config)
        mocks = self._make_vector_import_mocks()

        mock_milvus_class = MagicMock()
        mocks['langbot.pkg.vector.vdbs.milvus'].MilvusVectorDatabase = mock_milvus_class

        with isolated_sys_modules(mocks):
            from langbot.pkg.vector.mgr import VectorDBManager

            mgr = VectorDBManager(mock_app)

            import asyncio

            asyncio.get_event_loop().run_until_complete(mgr.initialize())

            mock_milvus_class.assert_called_once_with(
                mock_app,
                uri='http://localhost:19530',
                token='root:Milvus',
                db_name='langbot_db'
            )

    def test_initialize_milvus_backend_defaults(self):
        """Milvus defaults when config not fully specified."""
        vdb_config = {'use': 'milvus'}
        mock_app = self._create_mock_app(vdb_config)
        mocks = self._make_vector_import_mocks()

        mock_milvus_class = MagicMock()
        mocks['langbot.pkg.vector.vdbs.milvus'].MilvusVectorDatabase = mock_milvus_class

        with isolated_sys_modules(mocks):
            from langbot.pkg.vector.mgr import VectorDBManager

            mgr = VectorDBManager(mock_app)

            import asyncio

            asyncio.get_event_loop().run_until_complete(mgr.initialize())

            # Should use default values
            mock_milvus_class.assert_called_once_with(
                mock_app,
                uri='./data/milvus.db',
                token=None,
                db_name='default'
            )

    def test_initialize_pgvector_with_connection_string(self):
        """pgvector with connection string."""
        vdb_config = {
            'use': 'pgvector',
            'pgvector': {
                'connection_string': 'postgresql://user:pass@host:5432/langbot'
            }
        }
        mock_app = self._create_mock_app(vdb_config)
        mocks = self._make_vector_import_mocks()

        mock_pgvector_class = MagicMock()
        mocks['langbot.pkg.vector.vdbs.pgvector_db'].PgVectorDatabase = mock_pgvector_class

        with isolated_sys_modules(mocks):
            from langbot.pkg.vector.mgr import VectorDBManager

            mgr = VectorDBManager(mock_app)

            import asyncio

            asyncio.get_event_loop().run_until_complete(mgr.initialize())

            mock_pgvector_class.assert_called_once_with(
                mock_app,
                connection_string='postgresql://user:pass@host:5432/langbot'
            )

    def test_initialize_pgvector_with_individual_params(self):
        """pgvector with individual connection parameters."""
        vdb_config = {
            'use': 'pgvector',
            'pgvector': {
                'host': 'db.example.com',
                'port': 5433,
                'database': 'vectordb',
                'user': 'admin',
                'password': 'secret'
            }
        }
        mock_app = self._create_mock_app(vdb_config)
        mocks = self._make_vector_import_mocks()

        mock_pgvector_class = MagicMock()
        mocks['langbot.pkg.vector.vdbs.pgvector_db'].PgVectorDatabase = mock_pgvector_class

        with isolated_sys_modules(mocks):
            from langbot.pkg.vector.mgr import VectorDBManager

            mgr = VectorDBManager(mock_app)

            import asyncio

            asyncio.get_event_loop().run_until_complete(mgr.initialize())

            mock_pgvector_class.assert_called_once_with(
                mock_app,
                host='db.example.com',
                port=5433,
                database='vectordb',
                user='admin',
                password='secret'
            )

    def test_initialize_pgvector_defaults(self):
        """pgvector defaults when no config params."""
        vdb_config = {'use': 'pgvector'}
        mock_app = self._create_mock_app(vdb_config)
        mocks = self._make_vector_import_mocks()

        mock_pgvector_class = MagicMock()
        mocks['langbot.pkg.vector.vdbs.pgvector_db'].PgVectorDatabase = mock_pgvector_class

        with isolated_sys_modules(mocks):
            from langbot.pkg.vector.mgr import VectorDBManager

            mgr = VectorDBManager(mock_app)

            import asyncio

            asyncio.get_event_loop().run_until_complete(mgr.initialize())

            mock_pgvector_class.assert_called_once_with(
                mock_app,
                host='localhost',
                port=5432,
                database='langbot',
                user='postgres',
                password='postgres'
            )

    def test_initialize_unknown_backend_defaults_to_chroma(self):
        """Unknown vdb type defaults to Chroma with warning."""
        vdb_config = {'use': 'unknown_backend'}
        mock_app = self._create_mock_app(vdb_config)
        mocks = self._make_vector_import_mocks()

        mock_chroma_class = MagicMock()
        mocks['langbot.pkg.vector.vdbs.chroma'].ChromaVectorDatabase = mock_chroma_class

        with isolated_sys_modules(mocks):
            from langbot.pkg.vector.mgr import VectorDBManager

            mgr = VectorDBManager(mock_app)

            import asyncio

            asyncio.get_event_loop().run_until_complete(mgr.initialize())

            mock_chroma_class.assert_called_once_with(mock_app)
            mock_app.logger.warning.assert_called()
            # Should warn about no valid backend
            warning_msg = mock_app.logger.warning.call_args[0][0]
            assert 'No valid' in warning_msg or 'defaulting' in warning_msg


class TestVectorDBManagerProxies:
    """Tests for VectorDBManager proxy methods."""

    def test_get_supported_search_types_no_vector_db(self):
        """get_supported_search_types returns vector when no vector_db."""
        mock_app = MagicMock()
        mock_app.instance_config = MagicMock()
        mock_app.instance_config.data = MagicMock()
        mock_app.instance_config.data.get = MagicMock(return_value=None)
        mock_app.logger = MagicMock()

        mocks = {'langbot.pkg.core.app': MagicMock()}
        for backend in ['chroma', 'qdrant', 'seekdb', 'milvus', 'pgvector_db']:
            mocks[f'langbot.pkg.vector.vdbs.{backend}'] = MagicMock()

        with isolated_sys_modules(mocks):
            from langbot.pkg.vector.mgr import VectorDBManager

            mgr = VectorDBManager(mock_app)
            mgr.vector_db = None  # Explicitly None

            result = mgr.get_supported_search_types()
            assert result == ['vector']

    def test_get_supported_search_types_with_vector_db(self):
        """get_supported_search_types delegates to vector_db."""
        mock_app = MagicMock()

        # Create mock vector_db with supported_search_types
        mock_vector_db = MagicMock()
        mock_vector_db.supported_search_types = MagicMock(
            return_value=[
                MagicMock(value='vector'),
                MagicMock(value='full_text'),
            ]
        )

        mocks = {'langbot.pkg.core.app': MagicMock()}
        for backend in ['chroma', 'qdrant', 'seekdb', 'milvus', 'pgvector_db']:
            mocks[f'langbot.pkg.vector.vdbs.{backend}'] = MagicMock()

        with isolated_sys_modules(mocks):
            from langbot.pkg.vector.mgr import VectorDBManager

            mgr = VectorDBManager(mock_app)
            mgr.vector_db = mock_vector_db

            result = mgr.get_supported_search_types()
            assert result == ['vector', 'full_text']

@@ -0,0 +1,173 @@
"""Tests for VectorDatabase base class and SearchType enum."""

from __future__ import annotations

from unittest.mock import AsyncMock

import pytest

from langbot.pkg.vector.vdb import SearchType, VectorDatabase


class TestSearchType:
    """Tests for SearchType enum."""

    def test_search_type_values(self):
        """Test SearchType enum values."""
        assert SearchType.VECTOR.value == 'vector'
        assert SearchType.FULL_TEXT.value == 'full_text'
        assert SearchType.HYBRID.value == 'hybrid'

    def test_search_type_is_string_enum(self):
        """SearchType is a string enum."""
        assert isinstance(SearchType.VECTOR, str)
        assert SearchType.VECTOR == 'vector'

    def test_search_type_from_string(self):
        """Can create SearchType from string."""
        assert SearchType('vector') == SearchType.VECTOR
        assert SearchType('full_text') == SearchType.FULL_TEXT
        assert SearchType('hybrid') == SearchType.HYBRID


class TestVectorDatabaseAbstractMethods:
    """Tests for VectorDatabase abstract methods."""

    def test_vector_database_is_abstract(self):
        """VectorDatabase is abstract and cannot be instantiated directly."""
        with pytest.raises(TypeError):
            VectorDatabase()

    def test_abstract_methods_required(self):
        """Subclass must implement all abstract methods."""

        class IncompleteVectorDB(VectorDatabase):
            pass

        with pytest.raises(TypeError):
            IncompleteVectorDB()

    def test_supported_search_types_default(self):
        """Default supported_search_types returns [VECTOR]."""

        class MinimalVectorDB(VectorDatabase):
            async def add_embeddings(self, collection, ids, embeddings_list, metadatas, documents=None):
                pass

            async def search(self, collection, query_embedding, k=5, search_type='vector', query_text='', filter=None, vector_weight=None):
                pass

            async def delete_by_file_id(self, collection, file_id):
                pass

            async def delete_by_filter(self, collection, filter):
                pass

            async def get_or_create_collection(self, collection):
                pass

            async def delete_collection(self, collection):
                pass

        db = MinimalVectorDB()
        assert db.supported_search_types() == [SearchType.VECTOR]

    @pytest.mark.asyncio
    async def test_list_by_filter_default_implementation(self):
        """list_by_filter has a default implementation returning ([], -1)."""

        class MinimalVectorDB(VectorDatabase):
            async def add_embeddings(self, collection, ids, embeddings_list, metadatas, documents=None):
                pass

            async def search(self, collection, query_embedding, k=5, search_type='vector', query_text='', filter=None, vector_weight=None):
                pass

            async def delete_by_file_id(self, collection, file_id):
                pass

            async def delete_by_filter(self, collection, filter):
                pass

            async def get_or_create_collection(self, collection):
                pass

            async def delete_collection(self, collection):
                pass

        db = MinimalVectorDB()

        # list_by_filter should return an empty list and -1 for total.
        # Awaiting directly avoids the deprecated asyncio.get_event_loop() call.
        result = await db.list_by_filter('test_collection')
        assert result == ([], -1)


class TestVectorDatabaseInterface:
    """Tests for VectorDatabase interface contracts."""

    @pytest.fixture
    def mock_vector_db(self):
        """Create a minimal mock VectorDatabase for testing."""

        class MockVectorDB(VectorDatabase):
            def __init__(self):
                self.add_embeddings = AsyncMock()
                self.search = AsyncMock(return_value={
                    'ids': [['id1', 'id2']],
                    'distances': [[0.1, 0.2]],
                    'metadatas': [[{'key': 'val1'}, {'key': 'val2'}]]
                })
                self.delete_by_file_id = AsyncMock()
                self.delete_by_filter = AsyncMock(return_value=5)
                self.get_or_create_collection = AsyncMock()
                self.delete_collection = AsyncMock()

            async def add_embeddings(self, collection, ids, embeddings_list, metadatas, documents=None):
                pass

            async def search(self, collection, query_embedding, k=5, search_type='vector', query_text='', filter=None, vector_weight=None):
                pass

            async def delete_by_file_id(self, collection, file_id):
                pass

            async def delete_by_filter(self, collection, filter):
                pass

            async def get_or_create_collection(self, collection):
                pass

            async def delete_collection(self, collection):
                pass

        return MockVectorDB()

    @pytest.mark.asyncio
    async def test_add_embeddings_signature(self, mock_vector_db):
        """add_embeddings has expected signature."""
        await mock_vector_db.add_embeddings(
            collection='test',
            ids=['id1', 'id2'],
            embeddings_list=[[0.1, 0.2], [0.3, 0.4]],
            metadatas=[{'a': 1}, {'b': 2}],
            documents=['doc1', 'doc2']
        )
        mock_vector_db.add_embeddings.assert_called_once()

    @pytest.mark.asyncio
    async def test_search_signature(self, mock_vector_db):
        """search has expected signature with all optional params."""
        import numpy as np

        await mock_vector_db.search(
            collection='test',
            query_embedding=np.array([0.1, 0.2]),
            k=10,
            search_type='hybrid',
            query_text='search text',
            filter={'file_id': 'abc'},
            vector_weight=0.7
        )
        mock_vector_db.search.assert_called_once()

    @pytest.mark.asyncio
    async def test_delete_by_filter_returns_int(self, mock_vector_db):
        """delete_by_filter returns int count."""
        result = await mock_vector_db.delete_by_filter('test', {'file_id': 'abc'})
        assert isinstance(result, int)

@@ -0,0 +1,359 @@
"""Tests for VDB backend filter conversion functions.
Tests cover:
- _build_qdrant_filter: Qdrant models.Filter conversion
- _build_milvus_expr: Milvus boolean expression string conversion
- _build_pg_conditions: PostgreSQL SQLAlchemy conditions conversion
"""

from __future__ import annotations

from importlib import import_module


def get_qdrant_module():
    """Lazy import qdrant module."""
    return import_module('langbot.pkg.vector.vdbs.qdrant')


def get_milvus_module():
    """Lazy import milvus module."""
    return import_module('langbot.pkg.vector.vdbs.milvus')


def get_pgvector_module():
    """Lazy import pgvector module."""
    return import_module('langbot.pkg.vector.vdbs.pgvector_db')


class TestQdrantFilterConversion:
    """Tests for _build_qdrant_filter function."""

    def test_empty_filter_returns_empty_must(self):
        """Empty filter dict returns Filter with None must/must_not."""
        qdrant_module = get_qdrant_module()

        result = qdrant_module._build_qdrant_filter({})
        assert result.must is None
        assert result.must_not is None

    def test_eq_operator_creates_must_condition(self):
        """$eq operator creates FieldCondition in must list."""
        qdrant_module = get_qdrant_module()
        from qdrant_client import models

        result = qdrant_module._build_qdrant_filter({'file_id': 'abc'})
        assert result.must is not None
        assert len(result.must) == 1

        condition = result.must[0]
        assert condition.key == 'file_id'
        assert isinstance(condition.match, models.MatchValue)
        assert condition.match.value == 'abc'

    def test_ne_operator_creates_must_not_condition(self):
        """$ne operator creates FieldCondition in must_not list."""
        qdrant_module = get_qdrant_module()
        from qdrant_client import models

        result = qdrant_module._build_qdrant_filter({'status': {'$ne': 'deleted'}})
        assert result.must_not is not None
        assert len(result.must_not) == 1

        condition = result.must_not[0]
        assert condition.key == 'status'
        assert isinstance(condition.match, models.MatchValue)
        assert condition.match.value == 'deleted'

    def test_in_operator_creates_match_any(self):
        """$in operator creates MatchAny condition."""
        qdrant_module = get_qdrant_module()
        from qdrant_client import models

        result = qdrant_module._build_qdrant_filter({'file_type': {'$in': ['pdf', 'docx']}})
        assert result.must is not None
        assert len(result.must) == 1

        condition = result.must[0]
        assert condition.key == 'file_type'
        assert isinstance(condition.match, models.MatchAny)
        assert condition.match.any == ['pdf', 'docx']

    def test_nin_operator_creates_must_not_match_any(self):
        """$nin operator creates MatchAny in must_not."""
        qdrant_module = get_qdrant_module()
        from qdrant_client import models

        result = qdrant_module._build_qdrant_filter({'status': {'$nin': ['deleted', 'archived']}})
        assert result.must_not is not None
        assert len(result.must_not) == 1

        condition = result.must_not[0]
        assert condition.key == 'status'
        assert isinstance(condition.match, models.MatchAny)
        assert condition.match.any == ['deleted', 'archived']

    def test_range_operators_create_range_condition(self):
        """$gt, $gte, $lt, $lte create Range conditions."""
        qdrant_module = get_qdrant_module()
        from qdrant_client import models

        # Test $gt
        result = qdrant_module._build_qdrant_filter({'created_at': {'$gt': 100}})
        condition = result.must[0]
        assert isinstance(condition.range, models.Range)
        assert condition.range.gt == 100

        # Test $gte
        result = qdrant_module._build_qdrant_filter({'created_at': {'$gte': 100}})
        condition = result.must[0]
        assert condition.range.gte == 100

        # Test $lt
        result = qdrant_module._build_qdrant_filter({'created_at': {'$lt': 100}})
        condition = result.must[0]
        assert condition.range.lt == 100

        # Test $lte
        result = qdrant_module._build_qdrant_filter({'created_at': {'$lte': 100}})
        condition = result.must[0]
        assert condition.range.lte == 100

    def test_multiple_conditions_combined(self):
        """Multiple conditions are combined in must/must_not."""
        qdrant_module = get_qdrant_module()

        result = qdrant_module._build_qdrant_filter({
            'file_id': 'abc',
            'status': {'$ne': 'deleted'},
            'created_at': {'$gte': 100},
        })
        assert len(result.must) == 2  # file_id eq + created_at gte
        assert len(result.must_not) == 1  # status ne

    def test_implicit_eq_handled(self):
        """Implicit $eq (bare value) is correctly handled."""
        qdrant_module = get_qdrant_module()
        from qdrant_client import models

        result = qdrant_module._build_qdrant_filter({'field': 'value'})
        assert result.must is not None

        condition = result.must[0]
        assert isinstance(condition.match, models.MatchValue)


class TestMilvusFilterConversion:
    """Tests for _build_milvus_expr function.

    NOTE: The Milvus backend only supports the fields 'text', 'file_id' and
    'chunk_uuid'. Tests use only these supported fields.
    """

    def test_empty_filter_returns_empty_string(self):
        """Empty filter dict returns empty string."""
        milvus_module = get_milvus_module()

        result = milvus_module._build_milvus_expr({})
        assert result == ''

    def test_eq_operator_expression(self):
        """$eq operator creates == expression."""
        milvus_module = get_milvus_module()

        result = milvus_module._build_milvus_expr({'file_id': 'abc'})
        assert result == 'file_id == "abc"'

    def test_ne_operator_expression(self):
        """$ne operator creates != expression."""
        milvus_module = get_milvus_module()

        result = milvus_module._build_milvus_expr({'file_id': {'$ne': 'deleted'}})
        assert result == 'file_id != "deleted"'

    def test_comparison_operators(self):
        """$gt, $gte, $lt, $lte create comparison expressions."""
        milvus_module = get_milvus_module()

        assert milvus_module._build_milvus_expr({'chunk_uuid': {'$gt': 'uuid_100'}}) == 'chunk_uuid > "uuid_100"'
        assert milvus_module._build_milvus_expr({'chunk_uuid': {'$gte': 'uuid_100'}}) == 'chunk_uuid >= "uuid_100"'
        assert milvus_module._build_milvus_expr({'chunk_uuid': {'$lt': 'uuid_100'}}) == 'chunk_uuid < "uuid_100"'
        assert milvus_module._build_milvus_expr({'chunk_uuid': {'$lte': 'uuid_100'}}) == 'chunk_uuid <= "uuid_100"'

    def test_in_operator_expression(self):
        """$in operator creates in [...] expression."""
        milvus_module = get_milvus_module()

        result = milvus_module._build_milvus_expr({'file_id': {'$in': ['pdf', 'docx']}})
        assert result == 'file_id in ["pdf", "docx"]'

    def test_nin_operator_expression(self):
        """$nin operator creates not in [...] expression."""
        milvus_module = get_milvus_module()

        result = milvus_module._build_milvus_expr({'file_id': {'$nin': ['deleted', 'archived']}})
        assert result == 'file_id not in ["deleted", "archived"]'

    def test_multiple_conditions_joined_with_and(self):
        """Multiple conditions are joined with 'and'."""
        milvus_module = get_milvus_module()

        result = milvus_module._build_milvus_expr({
            'file_id': 'abc',
            'chunk_uuid': {'$ne': 'def'},
        })
        assert 'and' in result
        assert 'file_id == "abc"' in result
        assert 'chunk_uuid != "def"' in result

    def test_string_value_escaped(self):
        """String values are properly escaped."""
        milvus_module = get_milvus_module()

        # Test backslash escape
        result = milvus_module._build_milvus_expr({'file_id': 'C:\\Users\\test'})
        assert '\\\\' in result

        # Test quote escape
        result = milvus_module._build_milvus_expr({'file_id': 'test "quoted"'})
        assert '\\"' in result

    def test_text_field_supported(self):
        """text field is supported."""
        milvus_module = get_milvus_module()

        result = milvus_module._build_milvus_expr({'text': 'some text'})
        assert result == 'text == "some text"'

    def test_milvus_literal_function(self):
        """Test _milvus_literal helper."""
        milvus_module = get_milvus_module()

        assert milvus_module._milvus_literal('string') == '"string"'
        assert milvus_module._milvus_literal(42) == '42'
        assert milvus_module._milvus_literal(3.14) == '3.14'

    def test_unsupported_field_dropped(self):
        """Unsupported fields are dropped (not in _MILVUS_SUPPORTED_FIELDS)."""
        milvus_module = get_milvus_module()

        result = milvus_module._build_milvus_expr({'unknown_field': 'value'})
        assert result == ''

    def test_uuid_alias_resolved(self):
        """'uuid' alias is resolved to 'chunk_uuid'."""
        milvus_module = get_milvus_module()

        result = milvus_module._build_milvus_expr({'uuid': 'abc'})
        # Check the prefix rather than asserting that 'uuid' is absent:
        # 'uuid' is a substring of 'chunk_uuid', so it always appears.
        assert result.startswith('chunk_uuid')


class TestPgVectorFilterConversion:
    """Tests for _build_pg_conditions function.

    NOTE: PGVector only supports fields: 'text', 'file_id', 'chunk_uuid'
    Tests use only these supported fields.
    """

    def test_empty_filter_returns_empty_list(self):
        """Empty filter dict returns empty list."""
        pgvector_module = get_pgvector_module()

        result = pgvector_module._build_pg_conditions({})
        assert result == []

    def test_eq_operator_creates_equality_condition(self):
        """$eq operator creates SQLAlchemy == condition."""
        pgvector_module = get_pgvector_module()

        result = pgvector_module._build_pg_conditions({'file_id': 'abc'})
        assert len(result) == 1

        # Verify it's a SQLAlchemy BinaryExpression
        from sqlalchemy.sql.expression import BinaryExpression

        assert isinstance(result[0], BinaryExpression)

    def test_ne_operator_creates_inequality_condition(self):
        """$ne operator creates SQLAlchemy != condition."""
        pgvector_module = get_pgvector_module()

        result = pgvector_module._build_pg_conditions({'file_id': {'$ne': 'deleted'}})
        assert len(result) == 1

        # Operator should be ne (not equals)
        assert '!=' in str(result[0]) or 'ne' in str(result[0].operator)

    def test_comparison_operators(self):
        """$gt, $gte, $lt, $lte create comparison conditions."""
        pgvector_module = get_pgvector_module()

        # Test all comparison operators with supported field
        for op, expected_op in [
            ('$gt', '>'),
            ('$gte', '>='),
            ('$lt', '<'),
            ('$lte', '<='),
        ]:
            result = pgvector_module._build_pg_conditions({'chunk_uuid': {op: 'uuid_100'}})
            assert len(result) == 1
            assert expected_op in str(result[0])

    def test_in_operator_creates_in_condition(self):
        """$in operator creates SQLAlchemy in_ condition."""
        pgvector_module = get_pgvector_module()

        result = pgvector_module._build_pg_conditions({'file_id': {'$in': ['a', 'b', 'c']}})
        assert len(result) == 1
        assert 'IN' in str(result[0]).upper()

    def test_nin_operator_creates_notin_condition(self):
        """$nin operator creates SQLAlchemy notin_ condition."""
        pgvector_module = get_pgvector_module()

        result = pgvector_module._build_pg_conditions({'file_id': {'$nin': ['a', 'b']}})
        assert len(result) == 1
        assert 'NOT IN' in str(result[0]).upper()

    def test_multiple_conditions_list(self):
        """Multiple conditions return list of conditions."""
        pgvector_module = get_pgvector_module()

        result = pgvector_module._build_pg_conditions({
            'file_id': 'abc',
            'chunk_uuid': {'$ne': 'def'},
        })
        assert len(result) == 2

    def test_unsupported_field_dropped(self):
        """Unsupported fields are dropped (not in _PG_SUPPORTED_FIELDS)."""
        pgvector_module = get_pgvector_module()

        result = pgvector_module._build_pg_conditions({'unknown_field': 'value'})
        assert result == []

    def test_uuid_alias_resolved(self):
        """'uuid' alias is resolved to 'chunk_uuid'."""
        pgvector_module = get_pgvector_module()

        result = pgvector_module._build_pg_conditions({'uuid': 'abc'})
        assert len(result) == 1
        # Should reference chunk_uuid column
        assert 'chunk_uuid' in str(result[0])

    def test_supported_fields_only(self):
        """Only supported fields (text, file_id, chunk_uuid) are kept."""
        pgvector_module = get_pgvector_module()

        result = pgvector_module._build_pg_conditions({
            'text': {'$ne': ''},
            'file_id': 'abc',
            'chunk_uuid': {'$in': ['x', 'y']},
            'unsupported': 'value',
        })
        assert len(result) == 3  # Only supported fields
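
Review note: the Milvus expression tests in this file pin down a small Mongo-style filter grammar (bare values as implicit `$eq`, comparison operators, `$in`/`$nin`, a `uuid` alias, unsupported fields dropped, conditions joined with `and`). As a reading aid, here is a minimal, self-contained sketch of a converter that would satisfy those assertions. It is not the code in `langbot.pkg.vector.vdbs.milvus`; the names `build_expr_sketch`, `literal`, `SUPPORTED_FIELDS`, `FIELD_ALIASES` and `COMPARISON_OPS` are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from typing import Any

# Hypothetical constants for illustration; the real module may differ.
SUPPORTED_FIELDS = {'text', 'file_id', 'chunk_uuid'}
FIELD_ALIASES = {'uuid': 'chunk_uuid'}
COMPARISON_OPS = {
    '$eq': '==',
    '$ne': '!=',
    '$gt': '>',
    '$gte': '>=',
    '$lt': '<',
    '$lte': '<=',
}


def literal(value: Any) -> str:
    """Render a value as an expression literal, escaping strings."""
    if isinstance(value, str):
        # Escape backslashes first so the quote escapes are not doubled.
        escaped = value.replace('\\', '\\\\').replace('"', '\\"')
        return f'"{escaped}"'
    return str(value)


def build_expr_sketch(filter_dict: dict[str, Any]) -> str:
    """Convert a Mongo-style filter dict into a boolean expression string."""
    parts: list[str] = []
    for key, condition in filter_dict.items():
        field = FIELD_ALIASES.get(key, key)
        if field not in SUPPORTED_FIELDS:
            continue  # Unsupported fields are dropped.

        # A bare value is treated as an implicit $eq.
        if not isinstance(condition, dict):
            condition = {'$eq': condition}

        for op, operand in condition.items():
            if op in COMPARISON_OPS:
                parts.append(f'{field} {COMPARISON_OPS[op]} {literal(operand)}')
            elif op in ('$in', '$nin'):
                items = ', '.join(literal(v) for v in operand)
                keyword = 'in' if op == '$in' else 'not in'
                parts.append(f'{field} {keyword} [{items}]')
            # Unknown operators are skipped in this sketch.

    return ' and '.join(parts)
```

Escaping backslashes before quotes matters: reversing the two `replace` calls would double the backslash that the quote escape introduces.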