mirror of
https://github.com/langbot-app/LangBot.git
synced 2026-06-04 04:54:36 +00:00
- Update coverage from 22% to 30% - Add new test files to structure: - provider: session_manager, tool_manager - storage: s3storage - plugin: handler_actions - rag: file_storage - vector: vdb_filter_conversion - telemetry: rewritten tests - Update module coverage percentages Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
409 lines
13 KiB
Markdown
409 lines
13 KiB
Markdown
# LangBot Test Suite
|
|
|
|
This directory contains the test suite for LangBot, with a focus on comprehensive unit testing of pipeline stages.
|
|
|
|
## Quality Gate Layers
|
|
|
|
LangBot uses a layered quality gate system for developers and CI:
|
|
|
|
| Layer | Command | What it runs | When to use |
|
|
|-------|---------|--------------|-------------|
|
|
| **Quick** | `make test-quick` or `bash scripts/test-quick.sh` | Ruff lint + Unit tests + Smoke tests | Before every commit |
|
|
| **Fast Integration** | `make test-integration-fast` or `bash scripts/test-integration-fast.sh` | SQLite/API/Pipeline integration (no external services) | Before PR, weekly |
|
|
| **Coverage Gate** | `make test-coverage` or `bash scripts/test-coverage.sh` | All tests with coverage, threshold: 18% | Before merge, CI |
|
|
| **Full Local** | `make test-all-local` | Quick + Integration + Coverage | Before major changes |
|
|
|
|
**Note**: PostgreSQL migration tests and slow tests are NOT in local default gates. They run in separate CI workflows.
|
|
|
|
### Developer Workflow
|
|
|
|
```bash
|
|
# Daily: Quick self-test
|
|
bash scripts/test-quick.sh
|
|
|
|
# Before PR: Full local gate
|
|
make test-all-local
|
|
|
|
# Or run each layer separately:
|
|
bash scripts/test-quick.sh # ~2 min
|
|
bash scripts/test-integration-fast.sh # ~3 min
|
|
bash scripts/test-coverage.sh # ~8 min
|
|
```
|
|
|
|
### Coverage Baseline
|
|
|
|
Current coverage threshold: **18%**
|
|
Actual coverage: **30%**
|
|
|
|
This is a conservative baseline to prevent coverage regression. It does NOT represent the final quality target. Key modules have higher coverage:
|
|
- `pipeline.preproc.preproc`: 53%
|
|
- `pipeline.process.process`: 96%
|
|
- `pipeline.respback.respback`: 88%
|
|
- `telemetry.telemetry`: 87%
|
|
- `provider.session.sessionmgr`: 100%
|
|
- `provider.tools.toolmgr`: 83%
|
|
- `storage.providers.s3storage`: 80%
|
|
|
|
## Important Note
|
|
|
|
Due to circular import dependencies in the pipeline module structure, the test files use **lazy imports** via `importlib.import_module()` instead of direct imports. This ensures tests can run without triggering circular import errors.
|
|
|
|
## Structure
|
|
|
|
```
|
|
tests/
|
|
├── __init__.py
|
|
├── factories/ # Shared test factories
|
|
│ ├── __init__.py # Factory exports
|
|
│ ├── app.py # FakeApp factory
|
|
│ ├── message.py # Message/query factories
|
|
│ ├── provider.py # FakeProvider factory
|
|
│ └── platform.py # FakePlatform factory
|
|
├── integration/ # Integration tests (real resources)
|
|
│ ├── __init__.py
|
|
│ ├── api/ # HTTP API tests
|
|
│ │ ├── __init__.py
|
|
│ │ └── test_smoke.py # API smoke tests
|
|
│ ├── pipeline/ # Pipeline stage-chain tests
|
|
│ │ ├── __init__.py
|
|
│ │ └── test_full_flow.py # Full flow integration
|
|
│ └── persistence/ # Database/persistence tests
|
|
│ ├── __init__.py
|
|
│ └── test_migrations.py # Alembic migration tests
|
|
├── smoke/ # Smoke tests (quick validation)
|
|
│ └── test_fake_message_flow.py
|
|
├── unit_tests/ # Unit tests
|
|
│ ├── box/ # Box module tests
|
|
│ ├── config/ # Configuration tests
|
|
│ ├── pipeline/ # Pipeline stage tests
|
|
│ │ └── conftest.py # Shared fixtures and test infrastructure
|
|
│ ├── platform/ # Platform adapter tests
|
|
│ ├── plugin/ # Plugin system tests
|
|
│ │ └── test_handler_actions.py # Action handler tests
|
|
│ ├── provider/ # Provider tests
|
|
│ │ ├── test_session_manager.py # SessionManager tests
|
|
│ │ └── test_tool_manager.py # ToolManager tests
|
|
│ ├── rag/ # RAG tests
|
|
│ │ └── test_file_storage.py # File/ZIP storage tests
|
|
│ ├── storage/ # Storage tests
|
|
│ │ └── test_s3storage.py # S3StorageProvider tests
|
|
│ ├── vector/ # Vector tests
|
|
│ │ └── test_vdb_filter_conversion.py # VDB filter tests
|
|
│ └── telemetry/ # Telemetry tests (rewritten)
|
|
├── utils/ # Test utilities
|
|
│ ├── __init__.py
|
|
│ └── import_isolation.py # sys.modules isolation for circular imports
|
|
└── README.md # This file
|
|
```
|
|
|
|
## Test Factories
|
|
|
|
The `tests/factories/` package provides reusable test factories:
|
|
|
|
```python
|
|
from tests.factories import (
|
|
FakeApp, # Mock application
|
|
FakeProvider, # Fake LLM provider
|
|
FakePlatform, # Fake platform adapter
|
|
text_query, # Create text query
|
|
group_text_query, # Create group query
|
|
command_query, # Create command query
|
|
)
|
|
|
|
# Create fake app
|
|
app = FakeApp()
|
|
|
|
# Create query with text
|
|
query = text_query("hello world")
|
|
|
|
# Create fake provider that returns specific response
|
|
provider = FakeProvider().returns("test response")
|
|
|
|
# Create fake platform for outbound capture
|
|
platform = FakePlatform()
|
|
await platform.reply_message(query.message_event, reply_chain)
|
|
outbound = platform.get_outbound_messages()
|
|
```
|
|
|
|
See `tests/factories/__init__.py` for all available factories.
|
|
|
|
## Test Architecture
|
|
|
|
### Fixtures (`conftest.py`)
|
|
|
|
The test suite uses a centralized fixture system that provides:
|
|
|
|
- **MockApplication**: Comprehensive mock of the Application object with all dependencies
|
|
- **Mock objects**: Pre-configured mocks for Session, Conversation, Model, Adapter
|
|
- **Sample data**: Ready-to-use Query objects, message chains, and configurations
|
|
- **Helper functions**: Utilities for creating results and common assertions
|
|
|
|
### Design Principles
|
|
|
|
1. **Isolation**: Each test is independent and doesn't rely on external systems
|
|
2. **Mocking**: All external dependencies are mocked to ensure fast, reliable tests
|
|
3. **Coverage**: Tests cover happy paths, edge cases, and error conditions
|
|
4. **Extensibility**: Easy to add new tests by reusing existing fixtures
|
|
|
|
## Running Tests
|
|
|
|
### Quick self-test for developers
|
|
|
|
For local branch validation without real provider keys:
|
|
|
|
```bash
|
|
make test-quick
|
|
```
|
|
|
|
or
|
|
|
|
```bash
|
|
bash scripts/test-quick.sh
|
|
```
|
|
|
|
This runs:
|
|
1. Ruff lint check
|
|
2. Unit tests
|
|
3. Smoke tests
|
|
|
|
Suitable for quick validation before committing.
|
|
|
|
### Using the test runner script (recommended for full coverage)
|
|
```bash
|
|
bash run_tests.sh
|
|
```
|
|
|
|
This script automatically:
|
|
- Activates the virtual environment
|
|
- Installs test dependencies if needed
|
|
- Runs tests with coverage
|
|
- Generates HTML coverage report
|
|
|
|
### Manual test execution
|
|
|
|
#### Run all unit tests
|
|
```bash
|
|
uv run pytest tests/unit_tests/ --cov=langbot --cov-report=xml --cov-report=term
|
|
```
|
|
|
|
#### Run specific test module
|
|
```bash
|
|
uv run pytest tests/unit_tests/pipeline/ -v
|
|
```
|
|
|
|
#### Run specific test file
|
|
```bash
|
|
uv run pytest tests/unit_tests/pipeline/test_bansess.py -v
|
|
```
|
|
|
|
#### Run with coverage
|
|
```bash
|
|
uv run pytest tests/unit_tests/pipeline/ --cov=langbot --cov-report=html
|
|
```
|
|
|
|
#### Run specific test
|
|
```bash
|
|
uv run pytest tests/unit_tests/pipeline/test_bansess.py::test_bansess_whitelist_allow -v
|
|
```
|
|
|
|
### Using markers
|
|
|
|
```bash
|
|
# Run only unit tests
|
|
uv run pytest tests/unit_tests/ -m unit
|
|
|
|
# Run only integration tests
|
|
uv run pytest tests/integration/ -m integration
|
|
|
|
# Run integration tests excluding slow ones
|
|
uv run pytest tests/integration/ -m "not slow" -q
|
|
|
|
# Skip slow tests
|
|
uv run pytest tests/unit_tests/ -m "not slow"
|
|
```
|
|
|
|
### Running integration tests
|
|
|
|
Integration tests validate real system behavior with actual database/network resources.
|
|
|
|
```bash
|
|
# Run all integration tests (excluding slow ones)
|
|
uv run pytest tests/integration/ -m "not slow" -q
|
|
|
|
# Run SQLite migration integration tests
|
|
uv run pytest tests/integration/persistence/test_migrations.py -q --tb=short
|
|
|
|
# Run API smoke integration tests
|
|
uv run pytest tests/integration/api/test_smoke.py -q
|
|
|
|
# Run pipeline full-flow integration tests
|
|
uv run pytest tests/integration/pipeline/test_full_flow.py -q
|
|
|
|
# Run with verbose output
|
|
uv run pytest tests/integration/ -v
|
|
```
|
|
|
|
Note: Integration tests use:
|
|
- Temporary databases (tmp_path) for persistence tests
|
|
- Fake app/services for API tests (no real provider/platform)
|
|
- Fake runner/provider for pipeline tests (no real LLM API)
|
|
- Do not require external services
|
|
|
|
### Running migration tests locally
|
|
|
|
SQLite migration tests can be run locally without any external dependencies:
|
|
|
|
```bash
|
|
# SQLite migration tests (uses tmp_path, no external DB needed)
|
|
uv run pytest tests/integration/persistence/test_migrations.py -q --tb=short
|
|
```
|
|
|
|
PostgreSQL migration tests require an external PostgreSQL database:
|
|
|
|
```bash
|
|
# PostgreSQL migration tests (requires PostgreSQL service)
|
|
# Tests are marked as slow and skipped if TEST_POSTGRES_URL is not set
|
|
TEST_POSTGRES_URL=postgresql+asyncpg://user:pass@localhost:5432/test_db \
|
|
uv run pytest tests/integration/persistence/test_migrations_postgres.py -q --tb=short
|
|
|
|
# Or skip by default (no PostgreSQL available)
|
|
uv run pytest tests/integration/persistence/test_migrations_postgres.py -q --tb=short
|
|
# Output: SKIPPED (TEST_POSTGRES_URL not set)
|
|
```
|
|
|
|
Note: PostgreSQL tests are **not** included in fast integration gate because they:
|
|
- Require external PostgreSQL service
|
|
- Are marked with `@pytest.mark.slow`
|
|
- Need `TEST_POSTGRES_URL` environment variable
|
|
|
|
CI workflow `.github/workflows/test-migrations.yml` runs:
|
|
- SQLite tests in `test-migrations-sqlite` job (fast, no external services)
|
|
- PostgreSQL tests in `test-migrations-postgres` job (uses PostgreSQL service container)
|
|
|
|
### Running pipeline integration tests locally
|
|
|
|
Pipeline full-flow integration tests validate real stage interactions:
|
|
|
|
```bash
|
|
# Run pipeline integration tests (uses fake runner, no real LLM API)
|
|
uv run pytest tests/integration/pipeline/test_full_flow.py -q --tb=short
|
|
|
|
# Run with coverage for pipeline modules
|
|
uv run pytest tests/integration/pipeline \
|
|
--cov=langbot.pkg.pipeline.preproc.preproc \
|
|
--cov=langbot.pkg.pipeline.process.process \
|
|
--cov=langbot.pkg.pipeline.respback.respback \
|
|
--cov-report=term -q
|
|
```
|
|
|
|
These tests:
|
|
- Use `FakeRunner` class to simulate LLM responses without real API calls
|
|
- Import real `PreProcessor`, `MessageProcessor`, `SendResponseBackStage` stages
|
|
- Validate stage chain: PreProcessor → Processor → SendResponseBackStage
|
|
- Test prevent_default, exception handling, and full message flow
|
|
- Do not require real LLM provider keys
|
|
|
|
### Known Issues
|
|
|
|
Some tests may encounter circular import errors. This is a known issue with the current module structure. The test infrastructure is designed to work around this using lazy imports, but if you encounter issues:
|
|
|
|
1. Make sure you're running from the project root directory
|
|
2. Ensure dependencies are installed: `uv sync --dev`
|
|
3. Try running a simple test first to verify the test infrastructure works
|
|
|
|
## CI/CD Integration
|
|
|
|
Tests are automatically run on:
|
|
- Pull request opened
|
|
- Pull request marked ready for review
|
|
- Push to PR branch
|
|
- Push to master/develop branches
|
|
|
|
The workflow runs tests on Python 3.11, 3.12, and 3.13 to ensure compatibility.
|
|
|
|
## Adding New Tests
|
|
|
|
### 1. For a new pipeline stage
|
|
|
|
Create a new test file `test_<stage_name>.py`:
|
|
|
|
```python
|
|
"""
|
|
<StageName> stage unit tests
|
|
"""
|
|
|
|
import pytest
|
|
from langbot.pkg.pipeline.<module>.<stage> import <StageClass>
|
|
from langbot.pkg.pipeline import entities as pipeline_entities
|
|
|
|
|
|
@pytest.mark.asyncio
|
|
async def test_stage_basic_flow(mock_app, sample_query):
|
|
"""Test basic flow"""
|
|
stage = <StageClass>(mock_app)
|
|
await stage.initialize({})
|
|
|
|
result = await stage.process(sample_query, '<StageName>')
|
|
|
|
assert result.result_type == pipeline_entities.ResultType.CONTINUE
|
|
```
|
|
|
|
### 2. For additional fixtures
|
|
|
|
Add new fixtures to the appropriate `conftest.py`:
|
|
|
|
```python
|
|
@pytest.fixture
|
|
def my_custom_fixture():
|
|
"""Description of fixture"""
|
|
return create_test_data()
|
|
```
|
|
|
|
### 3. For test data
|
|
|
|
Use the helper functions in `conftest.py`:
|
|
|
|
```python
|
|
from tests.unit_tests.pipeline.conftest import create_stage_result, assert_result_continue
|
|
|
|
result = create_stage_result(
|
|
result_type=pipeline_entities.ResultType.CONTINUE,
|
|
query=sample_query
|
|
)
|
|
|
|
assert_result_continue(result)
|
|
```
|
|
|
|
## Best Practices
|
|
|
|
1. **Test naming**: Use descriptive names that explain what's being tested
|
|
2. **Arrange-Act-Assert**: Structure tests clearly with setup, execution, and verification
|
|
3. **One assertion per test**: Focus each test on a single behavior
|
|
4. **Mock appropriately**: Mock external dependencies, not the code under test
|
|
5. **Use fixtures**: Reuse common test data through fixtures
|
|
6. **Document tests**: Add docstrings explaining what each test validates
|
|
|
|
## Troubleshooting
|
|
|
|
### Import errors
|
|
Make sure you've installed the package in development mode:
|
|
```bash
|
|
uv sync --dev
|
|
```
|
|
|
|
### Async test failures
|
|
Ensure you're using `@pytest.mark.asyncio` decorator for async tests.
|
|
|
|
### Mock not working
|
|
Check that you're mocking at the right level and using `AsyncMock` for async functions.
|
|
|
|
## Future Enhancements
|
|
|
|
- [x] Add integration tests for database migrations (SQLite)
|
|
- [x] Add PostgreSQL migration integration tests (G-003)
|
|
- [x] Add integration tests for full pipeline execution
|
|
- [x] Add API smoke integration tests
|
|
- [ ] Add E2E tests
|
|
- [ ] Add performance benchmarks
|
|
- [ ] Add mutation testing for better coverage quality
|
|
- [ ] Add property-based testing with Hypothesis |