mirror of https://github.com/langbot-app/LangBot.git synced 2026-07-18 10:26:07 +00:00

youhuanghe 791d052687 feat(box/mcp): instance-based orphan cleanup, error classification, session API, and integration tests

## Changes

  ### Precise orphan container cleanup
  - Runtime generates a unique instance_id on startup
  - Every container gets a `langbot.box.instance_id` label
  - `cleanup_orphaned_containers()` only removes containers from
    previous instances, preserving containers owned by the current one
  - Containers from older versions (no label) are also cleaned up
  - `cleanup_orphaned_containers` added to `BaseSandboxBackend` as
    a no-op default method, removing hasattr duck-typing
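
The selection rule described above can be sketched roughly as follows. This is an illustration only, not the project's implementation: the container dictionaries, the `select_orphans` and `build_rm_command` helpers, and the `docker` default are assumptions; only the `langbot.box.instance_id` label name and the single `rm -f` batching come from the commit message.

```python
import uuid

# Label name taken from the commit message.
INSTANCE_LABEL = "langbot.box.instance_id"


def select_orphans(containers, current_instance_id):
    """Return names of containers not owned by the current runtime instance.

    `containers` is a hypothetical list of dicts with "name" and "labels" keys.
    """
    orphans = []
    for container in containers:
        owner = container.get("labels", {}).get(INSTANCE_LABEL)
        # A missing label (older version) and a different instance id
        # are both treated as orphaned.
        if owner != current_instance_id:
            orphans.append(container["name"])
    return orphans


def build_rm_command(names, runtime="docker"):
    """Batch all removals into one `rm -f` call; empty list if nothing to remove."""
    return [runtime, "rm", "-f", *names] if names else []


current_id = uuid.uuid4().hex  # unique id generated once at startup
containers = [
    {"name": "box-a", "labels": {INSTANCE_LABEL: current_id}},
    {"name": "box-b", "labels": {INSTANCE_LABEL: "previous-instance"}},
    {"name": "box-c", "labels": {}},
]
orphans = select_orphans(containers, current_id)
command = build_rm_command(orphans)
```

Here `box-a` survives because its label matches the current instance, while the container from a previous instance and the unlabeled one are both selected for removal.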

  ### Fine-grained MCP error classification
  - New `MCPSessionErrorPhase` enum with 7 phases: session_create,
    dep_install, process_start, relay_connect, mcp_init, runtime,
    tool_call
  - Each phase in `_init_box_stdio_server()` sets the error phase
    before re-raising, enabling precise failure diagnosis
  - `retry_count` tracked across retry attempts
  - `get_runtime_info_dict()` exposes `error_phase` and `retry_count`
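
As a rough sketch of the pattern, the phase can be recorded at the point of failure and the exception re-raised unchanged. The enum member names and string values, the `SessionState` class, and the `run_phase` helper are assumptions made for illustration; only the seven phase names, `error_phase`, `retry_count`, and `get_runtime_info_dict()` appear in the commit message.

```python
from enum import Enum


class MCPSessionErrorPhase(Enum):
    # The seven phases listed above; the string values are assumed.
    SESSION_CREATE = "session_create"
    DEP_INSTALL = "dep_install"
    PROCESS_START = "process_start"
    RELAY_CONNECT = "relay_connect"
    MCP_INIT = "mcp_init"
    RUNTIME = "runtime"
    TOOL_CALL = "tool_call"


class SessionState:
    """Hypothetical holder for the diagnostic fields."""

    def __init__(self):
        self.error_phase = None
        self.retry_count = 0

    def run_phase(self, phase, step):
        """Run one step; on failure, record the phase and re-raise."""
        try:
            return step()
        except Exception:
            self.error_phase = phase
            raise

    def get_runtime_info_dict(self):
        return {
            "error_phase": self.error_phase.value if self.error_phase else None,
            "retry_count": self.retry_count,
        }


def failing_install():
    raise RuntimeError("dependency install failed")


state = SessionState()
for _ in range(2):  # two attempts, both failing in the same phase
    try:
        state.run_phase(MCPSessionErrorPhase.SESSION_CREATE, lambda: "session-1")
        state.run_phase(MCPSessionErrorPhase.DEP_INSTALL, failing_install)
    except RuntimeError:
        state.retry_count += 1

info = state.get_runtime_info_dict()
```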

  ### GET /v1/sessions/{id} API
  - `BoxRuntime.get_session()` returns session details including
    managed process info when present
  - `handle_get_session` HTTP handler + route in server.py
  - `BoxRuntimeClient.get_session()` abstract method + remote impl

  ### stdio defaults to Box when runtime is available
  - `_uses_box_stdio()` checks `box_service.available` instead of
    requiring explicit `box` key in server_config
  - `BoxService.initialize()` catches runtime errors gracefully,
    sets `available=False` instead of crashing LangBot startup
  - When no container runtime exists, stdio MCP falls back to
    host-direct execution

  ### Code quality (from /simplify review)
  - Extracted `_VENV_DIRS` / `_VENV_BIN_DIRS` module-level constants
  - Removed dead `_box_network_mode()` method and unused `bc` variable
  - Fixed broken import `from ....box.models` → `from ...box.models`
  - Cached `_resolve_host_path()` result — computed once, passed through
  - Config hash now includes `host_path` field
  - Batched orphan cleanup into single `rm -f` command

  ### Session leak fix
  - `_cleanup_box_stdio_session()` now runs in `_lifecycle_loop`'s
    finally block, covering all exit paths (normal shutdown, error,
    retry, final failure)
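
The idea of putting cleanup in a `finally` block can be illustrated with a small self-contained loop. The `Lifecycle` class, its method names, and the choice to clean up once per attempt are assumptions for this sketch, not the actual `_lifecycle_loop` code.

```python
import asyncio


class Lifecycle:
    """Toy retry loop whose cleanup runs on every exit path."""

    def __init__(self, max_attempts=2):
        self.max_attempts = max_attempts
        self.cleanups = 0

    async def _cleanup_session(self):
        # Stand-in for releasing the sandbox session.
        self.cleanups += 1

    async def _run_once(self, attempt):
        # Always fails, to exercise the retry and final-failure paths.
        raise ConnectionError(f"attempt {attempt} failed")

    async def lifecycle_loop(self):
        for attempt in range(1, self.max_attempts + 1):
            try:
                await self._run_once(attempt)
                return True
            except ConnectionError:
                if attempt == self.max_attempts:
                    return False  # final failure
            finally:
                # Runs after success, after a retryable error, and after
                # the final failure, so no session is left behind.
                await self._cleanup_session()
        return False


loop = Lifecycle(max_attempts=2)
succeeded = asyncio.run(loop.lifecycle_loop())
```

With two failing attempts, the cleanup counter ends at 2: once for the retried attempt and once for the final failure.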

  ### Integration tests
  - 6 end-to-end tests covering managed process lifecycle, WebSocket
    stdio bidirectional IO, session cleanup verification, single
    session query, process exit detection, and orphan cleanup safety

2026-05-04 21:23:23 +08:00

integration_tests

feat(box/mcp): instance-based orphan cleanup, error classification, session API, and integration tests

2026-05-04 21:23:23 +08:00

unit_tests

feat(box/mcp): instance-based orphan cleanup, error classification, session API, and integration tests

2026-05-04 21:23:23 +08:00

__init__.py

feat: add comprehensive unit tests for pipeline stages (#1701)

2025-10-01 10:56:59 +08:00

README.md

feat: add comprehensive unit tests for pipeline stages (#1701)

2025-10-01 10:56:59 +08:00

README.md

LangBot Test Suite

This directory contains the test suite for LangBot, with a focus on comprehensive unit testing of pipeline stages.

Important Note

Because the pipeline module structure contains circular import dependencies, the test files use lazy imports via importlib.import_module() instead of direct imports. Deferring each import until a test actually needs it lets the tests load without triggering circular import errors in most cases; see Known Issues for the exceptions.
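
A minimal sketch of the lazy-import pattern is shown below. The `get_modules` helper name is an assumption, and standard-library modules stand in for the real targets so the snippet runs anywhere; a real test would pass dotted paths such as 'pkg.pipeline.entities' instead.

```python
from importlib import import_module


def get_modules():
    """Import dependencies at call time instead of at module import time."""
    # Stand-ins: replace with the project's dotted module paths.
    json_mod = import_module("json")
    path_mod = import_module("os.path")
    return json_mod, path_mod


def test_roundtrip():
    # Nothing is imported until the test body runs.
    json_mod, path_mod = get_modules()
    assert json_mod.loads(json_mod.dumps({"a": 1})) == {"a": 1}
    assert path_mod.basename("tests/pipeline/conftest.py") == "conftest.py"


test_roundtrip()
```

Because the imports happen inside the function, loading the test file itself does not pull in the modules involved in the import cycle.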

Structure

tests/
├── pipeline/                      # Pipeline stage tests
│   ├── conftest.py               # Shared fixtures and test infrastructure
│   ├── test_simple.py            # Basic infrastructure tests (always pass)
│   ├── test_bansess.py           # BanSessionCheckStage tests
│   ├── test_ratelimit.py         # RateLimit stage tests
│   ├── test_preproc.py           # PreProcessor stage tests
│   ├── test_respback.py          # SendResponseBackStage tests
│   ├── test_resprule.py          # GroupRespondRuleCheckStage tests
│   ├── test_pipelinemgr.py       # PipelineManager tests
│   └── test_stages_integration.py # Integration tests
└── README.md                      # This file

Test Architecture

Fixtures (`conftest.py`)

The test suite uses a centralized fixture system that provides:

MockApplication: Comprehensive mock of the Application object with all dependencies
Mock objects: Pre-configured mocks for Session, Conversation, Model, Adapter
Sample data: Ready-to-use Query objects, message chains, and configurations
Helper functions: Utilities for creating results and common assertions
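
To give a feel for what such a mock can look like, here is a small sketch built only on unittest.mock. The attribute names (`logger`, `sess_mgr.get_session`, `launcher_id`) are hypothetical and may not match the real MockApplication; in conftest.py a factory like this would typically be wrapped in a pytest fixture.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


def make_mock_app():
    """Build a stand-in Application whose dependencies are all mocks."""
    app = MagicMock(name="Application")
    app.logger = MagicMock()
    # Async methods need AsyncMock so that `await` works in the code under test.
    app.sess_mgr.get_session = AsyncMock(
        return_value=MagicMock(launcher_id="12345")
    )
    return app


app = make_mock_app()
session = asyncio.run(app.sess_mgr.get_session("some-query"))
```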

Design Principles

Isolation: Each test is independent and doesn't rely on external systems
Mocking: All external dependencies are mocked to ensure fast, reliable tests
Coverage: Tests cover happy paths, edge cases, and error conditions
Extensibility: Easy to add new tests by reusing existing fixtures

Running Tests

Using the test runner script (recommended)

bash run_tests.sh

This script automatically:

Activates the virtual environment
Installs test dependencies if needed
Runs tests with coverage
Generates an HTML coverage report

Manual test execution

Run all tests

pytest tests/pipeline/

Run only simple tests (no imports, always pass)

pytest tests/pipeline/test_simple.py -v

Run specific test file

pytest tests/pipeline/test_bansess.py -v

Run with coverage

pytest tests/pipeline/ --cov=pkg/pipeline --cov-report=html

Run specific test

pytest tests/pipeline/test_bansess.py::test_bansess_whitelist_allow -v

Known Issues

Some tests may encounter circular import errors. This is a known issue with the current module structure. The test infrastructure is designed to work around this using lazy imports, but if you encounter issues:

Make sure you're running from the project root directory
Ensure the virtual environment is activated
Try running test_simple.py first to verify the test infrastructure works

CI/CD Integration

Tests are automatically run on:

Pull request opened
Pull request marked ready for review
Push to PR branch
Push to master/develop branches

The workflow runs tests on Python 3.10, 3.11, and 3.12 to ensure compatibility.

Adding New Tests

1. For a new pipeline stage

Create a new test file test_<stage_name>.py:

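"""
<StageName> stage unit tests
"""

from importlib import import_module

import pytest


def get_modules():
    """Import lazily to avoid circular import errors (see Important Note)."""
    stage_module = import_module('pkg.pipeline.<module>.<stage>')
    pipeline_entities = import_module('pkg.pipeline.entities')
    return stage_module, pipeline_entities


@pytest.mark.asyncio
async def test_stage_basic_flow(mock_app, sample_query):
    """Test basic flow"""
    stage_module, pipeline_entities = get_modules()
    stage = stage_module.<StageClass>(mock_app)
    await stage.initialize({})

    result = await stage.process(sample_query, '<StageName>')

    assert result.result_type == pipeline_entities.ResultType.CONTINUE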

2. For additional fixtures

Add new fixtures to conftest.py:

@pytest.fixture
def my_custom_fixture():
    """Description of fixture"""
    return create_test_data()

3. For test data

Use the helper functions in conftest.py:

from tests.pipeline.conftest import create_stage_result, assert_result_continue

result = create_stage_result(
    result_type=pipeline_entities.ResultType.CONTINUE,
    query=sample_query
)

assert_result_continue(result)
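
For orientation, helpers of this kind could be as small as the sketch below. The `ResultType` enum and `StageResult` dataclass here are stand-ins defined only so the snippet runs on its own; the real helpers in conftest.py work with the project's pipeline entities and may differ in fields and behavior.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class ResultType(Enum):
    # Stand-in for the project's result type enum.
    CONTINUE = "continue"
    INTERRUPT = "interrupt"


@dataclass
class StageResult:
    # Stand-in result object; field names are assumed.
    result_type: ResultType
    new_query: Any


def create_stage_result(result_type, query):
    """Build a result object for use in assertions."""
    return StageResult(result_type=result_type, new_query=query)


def assert_result_continue(result):
    """Fail with a readable message unless the stage asked to continue."""
    assert result.result_type == ResultType.CONTINUE, (
        f"expected CONTINUE, got {result.result_type}"
    )


result = create_stage_result(ResultType.CONTINUE, query={"text": "hello"})
assert_result_continue(result)
```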

Best Practices

Test naming: Use descriptive names that explain what's being tested
Arrange-Act-Assert: Structure tests clearly with setup, execution, and verification
One assertion per test: Focus each test on a single behavior
Mock appropriately: Mock external dependencies, not the code under test
Use fixtures: Reuse common test data through fixtures
Document tests: Add docstrings explaining what each test validates
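
Several of these practices can be seen together in one short test. The `RateLimiter` class below is a toy written for this illustration and is unrelated to the project's RateLimit stage; asyncio.run is used in place of a pytest plugin so the snippet stays self-contained.

```python
import asyncio
from unittest.mock import AsyncMock


class RateLimiter:
    """Toy code under test: allows a request only while quota remains."""

    def __init__(self, store):
        self.store = store

    async def allow(self, session_id):
        remaining = await self.store.remaining(session_id)
        return remaining > 0


def test_allow_returns_false_when_quota_exhausted():
    """A session with no remaining quota is rejected."""
    # Arrange: mock the external store, not the limiter itself.
    store = AsyncMock()
    store.remaining.return_value = 0
    limiter = RateLimiter(store)

    # Act
    allowed = asyncio.run(limiter.allow("session-1"))

    # Assert: a single behavior is checked.
    assert allowed is False


test_allow_returns_false_when_quota_exhausted()
```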

Troubleshooting

Import errors

Make sure you've installed the package in development mode:

uv pip install -e .

Async test failures

Ensure async tests carry the @pytest.mark.asyncio decorator, which is provided by the pytest-asyncio plugin.

Mock not working

Check that you're patching the name where the code under test looks it up, and that you're using AsyncMock rather than MagicMock for async functions.
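
The difference between the two mock types shows up as soon as the code under test awaits the call. The `fetch_reply` function and the `client.send` method are invented for this sketch.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def fetch_reply(client):
    """Code under test: awaits an async client method."""
    return await client.send("ping")


def awaiting_magicmock_fails():
    client = MagicMock()
    client.send.return_value = "pong"  # plain value, not awaitable
    try:
        asyncio.run(fetch_reply(client))
    except TypeError:
        # `await "pong"` is invalid: a str cannot be awaited.
        return True
    return False


def awaiting_asyncmock_works():
    client = MagicMock()
    client.send = AsyncMock(return_value="pong")  # calling it returns a coroutine
    return asyncio.run(fetch_reply(client))
```

A TypeError mentioning an object that "can't be used in 'await' expression" is the usual sign that a MagicMock was used where an AsyncMock was needed.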

Future Enhancements

Add integration tests for full pipeline execution
Add performance benchmarks
Add mutation testing for better coverage quality
Add property-based testing with Hypothesis
