# LangBot Test Suite

This directory contains the test suite for LangBot, with a focus on comprehensive unit testing of pipeline stages.

## Quality Gate Layers

LangBot uses a layered quality gate system for developers and CI:

| Layer | Command | What it runs | When to use |
|-------|---------|--------------|-------------|
| **Quick** | `make test-quick` or `bash scripts/test-quick.sh` | Ruff lint + Unit tests + Smoke tests | Before every commit |
| **Fast Integration** | `make test-integration-fast` or `bash scripts/test-integration-fast.sh` | SQLite/API/Pipeline integration (no external services) | Before PR, weekly |
| **Coverage Gate** | `make test-coverage` or `bash scripts/test-coverage.sh` | All tests with coverage, threshold: 18% | Before merge, CI |
| **Full Local** | `make test-all-local` | Quick + Integration + Coverage | Before major changes |

**Note**: PostgreSQL migration tests and slow tests are NOT part of the default local gates. They run in separate CI workflows.

### Developer Workflow

```bash
# Daily: Quick self-test
bash scripts/test-quick.sh

# Before PR: Full local gate
make test-all-local

# Or run each layer separately:
bash scripts/test-quick.sh             # ~2 min
bash scripts/test-integration-fast.sh  # ~3 min
bash scripts/test-coverage.sh          # ~8 min
```

### Coverage Baseline

- Current coverage threshold: **18%**
- Actual coverage: **30%**

This is a conservative baseline to prevent coverage regression. It does NOT represent the final quality target. Key modules have higher coverage:

- `pipeline.preproc.preproc`: 53%
- `pipeline.process.process`: 96%
- `pipeline.respback.respback`: 88%
- `telemetry.telemetry`: 87%
- `provider.session.sessionmgr`: 100%
- `provider.tools.toolmgr`: 83%
- `storage.providers.s3storage`: 80%

## Important Note

Due to circular import dependencies in the pipeline module structure, the test files use **lazy imports** via `importlib.import_module()` instead of direct imports. This ensures tests can run without triggering circular import errors.

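As a minimal sketch of the lazy-import pattern: the helper below defers the import until a test actually calls it, rather than importing at collection time. The helper name `lazy_import` is hypothetical and not part of the LangBot test suite, and the stdlib `json` module stands in for a pipeline module so the example runs on its own.

```python
import importlib
from functools import lru_cache
from types import ModuleType


@lru_cache(maxsize=None)
def lazy_import(module_path: str) -> ModuleType:
    """Import a module on first use instead of at test-collection time.

    Hypothetical helper for illustration; results are cached per path.
    """
    return importlib.import_module(module_path)


def test_lazy_import_example():
    # `json` is a stand-in; a real test would pass a pipeline module path.
    json_mod = lazy_import("json")
    assert json_mod.dumps({"ok": True}) == '{"ok": true}'
```

Caching the result keeps repeated lookups cheap while still postponing the first import until the test body runs.
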
## Structure

```
tests/
├── __init__.py
├── factories/                        # Shared test factories
│   ├── __init__.py                   # Factory exports
│   ├── app.py                        # FakeApp factory
│   ├── message.py                    # Message/query factories
│   ├── provider.py                   # FakeProvider factory
│   └── platform.py                   # FakePlatform factory
├── integration/                      # Integration tests (real resources)
│   ├── __init__.py
│   ├── api/                          # HTTP API tests
│   │   ├── __init__.py
│   │   └── test_smoke.py             # API smoke tests
│   ├── pipeline/                     # Pipeline stage-chain tests
│   │   ├── __init__.py
│   │   └── test_full_flow.py         # Full flow integration
│   └── persistence/                  # Database/persistence tests
│       ├── __init__.py
│       └── test_migrations.py        # Alembic migration tests
├── smoke/                            # Smoke tests (quick validation)
│   └── test_fake_message_flow.py
├── unit_tests/                       # Unit tests
│   ├── box/                          # Box module tests
│   ├── config/                       # Configuration tests
│   ├── pipeline/                     # Pipeline stage tests
│   │   └── conftest.py               # Shared fixtures and test infrastructure
│   ├── platform/                     # Platform adapter tests
│   ├── plugin/                       # Plugin system tests
│   │   └── test_handler_actions.py   # Action handler tests
│   ├── provider/                     # Provider tests
│   │   ├── test_session_manager.py   # SessionManager tests
│   │   └── test_tool_manager.py      # ToolManager tests
│   ├── rag/                          # RAG tests
│   │   └── test_file_storage.py      # File/ZIP storage tests
│   ├── storage/                      # Storage tests
│   │   └── test_s3storage.py         # S3StorageProvider tests
│   ├── vector/                       # Vector tests
│   │   └── test_vdb_filter_conversion.py  # VDB filter tests
│   └── telemetry/                    # Telemetry tests (rewritten)
├── utils/                            # Test utilities
│   ├── __init__.py
│   └── import_isolation.py           # sys.modules isolation for circular imports
└── README.md                         # This file
```

## Test Factories

The `tests/factories/` package provides reusable test factories:

```python
from tests.factories import (
    FakeApp,           # Mock application
    FakeProvider,      # Fake LLM provider
    FakePlatform,      # Fake platform adapter
    text_query,        # Create text query
    group_text_query,  # Create group query
    command_query,     # Create command query
)

# Create fake app
app = FakeApp()

# Create query with text
query = text_query("hello world")

# Create fake provider that returns specific response
provider = FakeProvider().returns("test response")

# Create fake platform for outbound capture.
# The await below must run inside an async test, and reply_chain is
# a message chain that the test builds beforehand.
platform = FakePlatform()
await platform.reply_message(query.message_event, reply_chain)
outbound = platform.get_outbound_messages()
```

See `tests/factories/__init__.py` for all available factories.

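To illustrate the idea behind a deterministic fake provider with request capture, here is a self-contained sketch. The class `SketchFakeProvider` and its `invoke` method are hypothetical and are not the real `FakeProvider` API; only the chained `.returns(...)` call mirrors the usage shown above.

```python
import asyncio


class SketchFakeProvider:
    """Hypothetical stand-in; the real FakeProvider may differ."""

    def __init__(self) -> None:
        self._response = ""
        self.requests: list[str] = []  # captured prompts for assertions

    def returns(self, response: str) -> "SketchFakeProvider":
        # Return self so the call can be chained off the constructor.
        self._response = response
        return self

    async def invoke(self, prompt: str) -> str:
        # Record the request, then answer deterministically.
        self.requests.append(prompt)
        return self._response


async def demo() -> tuple[str, list[str]]:
    provider = SketchFakeProvider().returns("test response")
    reply = await provider.invoke("hello world")
    return reply, provider.requests


reply, captured = asyncio.run(demo())
```

Because the fake records every prompt it receives, a test can assert both on the reply and on what was sent, without any network access or provider keys.
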
## Test Architecture

### Fixtures (`conftest.py`)

The test suite uses a centralized fixture system that provides:

- **MockApplication**: Comprehensive mock of the Application object with all dependencies
- **Mock objects**: Pre-configured mocks for Session, Conversation, Model, Adapter
- **Sample data**: Ready-to-use Query objects, message chains, and configurations
- **Helper functions**: Utilities for creating results and common assertions

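The kind of pre-configured mock listed above can be sketched with the standard library alone. The function name `make_mock_app` and the attribute names `logger`, `session_manager`, and `get_session` are illustrative assumptions, not the actual contents of `conftest.py`; in a real `conftest.py` the builder would typically be wrapped in a `@pytest.fixture`.

```python
import asyncio
from unittest.mock import AsyncMock, Mock


def make_mock_app() -> Mock:
    """Build a mock application; attribute names are illustrative only."""
    app = Mock()
    app.logger = Mock()
    app.session_manager = Mock()
    # Async collaborators need AsyncMock so that `await` works in tests.
    app.session_manager.get_session = AsyncMock(return_value={"id": "s1"})
    return app


async def load_session(app, query_id: str) -> dict:
    # Hypothetical code under test that awaits a collaborator on the app.
    return await app.session_manager.get_session(query_id)


app = make_mock_app()
session = asyncio.run(load_session(app, "q1"))
app.session_manager.get_session.assert_awaited_once_with("q1")
```
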
### Design Principles

1. **Isolation**: Each test is independent and doesn't rely on external systems
2. **Mocking**: All external dependencies are mocked to ensure fast, reliable tests
3. **Coverage**: Tests cover happy paths, edge cases, and error conditions
4. **Extensibility**: Easy to add new tests by reusing existing fixtures

## Running Tests

### Quick self-test for developers

For local branch validation without real provider keys:

```bash
make test-quick
```

or

```bash
bash scripts/test-quick.sh
```

This runs:

1. Ruff lint check
2. Unit tests
3. Smoke tests

Suitable for quick validation before committing.

### Using the test runner script (recommended for full coverage)

```bash
bash run_tests.sh
```

This script automatically:

- Activates the virtual environment
- Installs test dependencies if needed
- Runs tests with coverage
- Generates HTML coverage report

### Manual test execution

#### Run all unit tests

```bash
uv run pytest tests/unit_tests/ --cov=langbot --cov-report=xml --cov-report=term
```

#### Run specific test module

```bash
uv run pytest tests/unit_tests/pipeline/ -v
```

#### Run specific test file

```bash
uv run pytest tests/unit_tests/pipeline/test_bansess.py -v
```

#### Run with coverage

```bash
uv run pytest tests/unit_tests/pipeline/ --cov=langbot --cov-report=html
```

#### Run specific test

```bash
uv run pytest tests/unit_tests/pipeline/test_bansess.py::test_bansess_whitelist_allow -v
```

### Using markers

```bash
# Run only unit tests
uv run pytest tests/unit_tests/ -m unit

# Run only integration tests
uv run pytest tests/integration/ -m integration

# Run integration tests excluding slow ones
uv run pytest tests/integration/ -m "not slow" -q

# Skip slow tests
uv run pytest tests/unit_tests/ -m "not slow"
```

### Running integration tests

Integration tests validate real system behavior using real local resources (for example, temporary SQLite databases).

```bash
# Run all integration tests (excluding slow ones)
uv run pytest tests/integration/ -m "not slow" -q

# Run SQLite migration integration tests
uv run pytest tests/integration/persistence/test_migrations.py -q --tb=short

# Run API smoke integration tests
uv run pytest tests/integration/api/test_smoke.py -q

# Run pipeline full-flow integration tests
uv run pytest tests/integration/pipeline/test_full_flow.py -q

# Run with verbose output
uv run pytest tests/integration/ -v
```

Note: Integration tests:

- Use temporary databases (`tmp_path`) for persistence tests
- Use a fake app and fake services for API tests (no real provider/platform)
- Use a fake runner/provider for pipeline tests (no real LLM API)
- Do not require external services

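The temporary-database approach can be sketched with the standard library. This is an illustration only: `tempfile.TemporaryDirectory` stands in for pytest's `tmp_path` fixture, and the `items` table and both helper names are hypothetical rather than taken from the LangBot schema or tests.

```python
import sqlite3
import tempfile
from pathlib import Path


def create_temp_db(directory: Path) -> Path:
    """Create a throwaway SQLite database inside the given directory."""
    db_path = directory / "test.db"
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
        conn.execute("INSERT INTO items (name) VALUES (?)", ("example",))
        conn.commit()
    finally:
        conn.close()
    return db_path


def count_items(db_path: Path) -> int:
    """Count rows in the hypothetical items table."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    finally:
        conn.close()


# TemporaryDirectory plays the role of tmp_path: removed after the block.
with tempfile.TemporaryDirectory() as tmp:
    row_count = count_items(create_temp_db(Path(tmp)))
```

Each test gets its own database file, so tests cannot leak state into one another or into a developer's real data.
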
### Running migration tests locally

SQLite migration tests can be run locally without any external dependencies:

```bash
# SQLite migration tests (uses tmp_path, no external DB needed)
uv run pytest tests/integration/persistence/test_migrations.py -q --tb=short
```

PostgreSQL migration tests require an external PostgreSQL database:

```bash
# PostgreSQL migration tests (requires PostgreSQL service)
# Tests are marked as slow and skipped if TEST_POSTGRES_URL is not set
TEST_POSTGRES_URL=postgresql+asyncpg://user:pass@localhost:5432/test_db \
  uv run pytest tests/integration/persistence/test_migrations_postgres.py -q --tb=short

# Or skip by default (no PostgreSQL available)
uv run pytest tests/integration/persistence/test_migrations_postgres.py -q --tb=short
# Output: SKIPPED (TEST_POSTGRES_URL not set)
```

Note: PostgreSQL tests are **not** included in the fast integration gate because they:

- Require an external PostgreSQL service
- Are marked with `@pytest.mark.slow`
- Need the `TEST_POSTGRES_URL` environment variable

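The environment-based skip can be sketched as a small helper. The function name `postgres_skip_reason` is hypothetical, and the actual test module may implement the check differently; the comment at the end shows one way such a helper could feed pytest's `skipif` marker.

```python
import os
from typing import Mapping, Optional


def postgres_skip_reason(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return a skip reason when no test database URL is configured."""
    if not env.get("TEST_POSTGRES_URL"):
        return "TEST_POSTGRES_URL not set"
    return None


# In a test module this could drive a module-level marker, for example:
#   pytestmark = pytest.mark.skipif(
#       postgres_skip_reason() is not None,
#       reason="TEST_POSTGRES_URL not set",
#   )
```

Passing the environment as a parameter keeps the helper easy to test without modifying `os.environ`.
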
CI workflow `.github/workflows/test-migrations.yml` runs:

- SQLite tests in `test-migrations-sqlite` job (fast, no external services)
- PostgreSQL tests in `test-migrations-postgres` job (uses PostgreSQL service container)

### Running pipeline integration tests locally

Pipeline full-flow integration tests validate real stage interactions:

```bash
# Run pipeline integration tests (uses fake runner, no real LLM API)
uv run pytest tests/integration/pipeline/test_full_flow.py -q --tb=short

# Run with coverage for pipeline modules
uv run pytest tests/integration/pipeline \
  --cov=langbot.pkg.pipeline.preproc.preproc \
  --cov=langbot.pkg.pipeline.process.process \
  --cov=langbot.pkg.pipeline.respback.respback \
  --cov-report=term -q
```

These tests:

- Use `FakeRunner` class to simulate LLM responses without real API calls
- Import real `PreProcessor`, `MessageProcessor`, `SendResponseBackStage` stages
- Validate stage chain: PreProcessor → Processor → SendResponseBackStage
- Test prevent_default, exception handling, and full message flow
- Do not require real LLM provider keys

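A fake runner of this kind can be sketched as an async generator that replays canned chunks. The class `SketchFakeRunner`, its `run` signature, and the `collect` helper are assumptions for illustration and do not reproduce the real `FakeRunner` in the integration tests.

```python
import asyncio
from typing import AsyncGenerator


class SketchFakeRunner:
    """Hypothetical runner that replays canned chunks instead of calling an LLM."""

    def __init__(self, chunks: list[str]) -> None:
        self.chunks = chunks
        self.seen_queries: list[str] = []  # recorded inputs for assertions

    async def run(self, query_text: str) -> AsyncGenerator[str, None]:
        self.seen_queries.append(query_text)
        for chunk in self.chunks:
            yield chunk


async def collect(runner: SketchFakeRunner, query_text: str) -> str:
    # Drain the async generator the way a downstream stage might.
    return "".join([chunk async for chunk in runner.run(query_text)])


runner = SketchFakeRunner(["po", "ng"])
full_reply = asyncio.run(collect(runner, "ping"))
```

Yielding chunks rather than returning one string lets the same fake cover both streaming and non-streaming code paths.
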
### Known Issues

Some tests may encounter circular import errors. This is a known issue with the current module structure. The test infrastructure is designed to work around this using lazy imports, but if you encounter issues:

1. Make sure you're running from the project root directory
2. Ensure dependencies are installed: `uv sync --dev`
3. Try running a simple test first to verify the test infrastructure works

## CI/CD Integration

Tests are automatically run on:

- Pull request opened
- Pull request marked ready for review
- Push to PR branch
- Push to master/develop branches

The workflow runs tests on Python 3.11, 3.12, and 3.13 to ensure compatibility.

## Adding New Tests

### 1. For a new pipeline stage

Create a new test file `test_<stage_name>.py`:

```python
"""
<StageName> stage unit tests
"""

import pytest
from langbot.pkg.pipeline.<module>.<stage> import <StageClass>
from langbot.pkg.pipeline import entities as pipeline_entities


@pytest.mark.asyncio
async def test_stage_basic_flow(mock_app, sample_query):
    """Test basic flow"""
    stage = <StageClass>(mock_app)
    await stage.initialize({})

    result = await stage.process(sample_query, '<StageName>')

    assert result.result_type == pipeline_entities.ResultType.CONTINUE
```

### 2. For additional fixtures

Add new fixtures to the appropriate `conftest.py`:

```python
@pytest.fixture
def my_custom_fixture():
    """Description of fixture"""
    return create_test_data()
```

### 3. For test data

Use the helper functions in `conftest.py`:

```python
from langbot.pkg.pipeline import entities as pipeline_entities
from tests.unit_tests.pipeline.conftest import create_stage_result, assert_result_continue

# Inside a test that receives the sample_query fixture:
result = create_stage_result(
    result_type=pipeline_entities.ResultType.CONTINUE,
    query=sample_query
)

assert_result_continue(result)
```

## Best Practices

1. **Test naming**: Use descriptive names that explain what's being tested
2. **Arrange-Act-Assert**: Structure tests clearly with setup, execution, and verification
3. **One assertion per test**: Focus each test on a single behavior
4. **Mock appropriately**: Mock external dependencies, not the code under test
5. **Use fixtures**: Reuse common test data through fixtures
6. **Document tests**: Add docstrings explaining what each test validates

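As a short illustration of the naming, Arrange-Act-Assert, and docstring practices above, consider the following sketch. The `truncate` function is a hypothetical function under test defined here only so the example is self-contained; it is not a LangBot helper.

```python
def truncate(text: str, limit: int) -> str:
    """Hypothetical function under test: shorten text and mark the cut."""
    if len(text) <= limit:
        return text
    return text[:limit] + "..."


def test_truncate_marks_text_longer_than_limit():
    """Text longer than the limit is cut and suffixed with an ellipsis."""
    # Arrange
    text = "hello world"

    # Act
    result = truncate(text, limit=5)

    # Assert
    assert result == "hello..."
```
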
## Troubleshooting

### Import errors

Make sure you've installed the package in development mode:

```bash
uv sync --dev
```

### Async test failures

Ensure you're using `@pytest.mark.asyncio` decorator for async tests.

### Mock not working

Check that you're mocking at the right level and using `AsyncMock` for async functions.

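The difference between `Mock` and `AsyncMock` for awaited calls can be seen in a small sketch. The `fetch_reply` coroutine and the `send` method name are hypothetical and used only for illustration.

```python
import asyncio
from unittest.mock import AsyncMock, Mock


async def fetch_reply(client) -> str:
    # Hypothetical code under test: awaits an async method on a dependency.
    return await client.send("ping")


def run_with_async_mock() -> str:
    client = Mock()
    client.send = AsyncMock(return_value="pong")  # awaitable, as required
    reply = asyncio.run(fetch_reply(client))
    client.send.assert_awaited_once_with("ping")
    return reply


def run_with_plain_mock() -> bool:
    client = Mock()
    client.send = Mock(return_value="pong")  # returns a str, not an awaitable
    try:
        asyncio.run(fetch_reply(client))
    except TypeError:
        # Awaiting a plain string fails, which is the typical symptom.
        return True
    return False
```

If a test fails with a `TypeError` about an object that cannot be used in an `await` expression, a plain `Mock` standing in for an async function is a likely cause.
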
## Future Enhancements

- [x] Add integration tests for database migrations (SQLite)
- [x] Add PostgreSQL migration integration tests (G-003)
- [x] Add integration tests for full pipeline execution
- [x] Add API smoke integration tests
- [ ] Add E2E tests
- [ ] Add performance benchmarks
- [ ] Add mutation testing for better coverage quality
- [ ] Add property-based testing with Hypothesis