The merge from master brought in new unit tests that target pre-refactor
APIs on feat/sandbox. Reconcile each:
- factories/app.py: FakeApp now exposes a Mock skill_mgr (with empty .skills
dict + inert prompt-addition builder) and a Mock pipeline_service so the
PreProcessor skill-index injection branch can run end-to-end in tests.
- pipeline/conftest.py: eagerly import langbot.pkg.pipeline.pipelinemgr so
pipeline.stage is fully initialised before any individual stage test
(preproc, longtext, ...) tries to lazy-load it. Without this preload,
running test_preproc.py in isolation hit a circular-import error via the
stage -> app -> pipelinemgr -> stage chain.
- provider/test_tool_manager.py: ToolManager now probes four loaders
(native -> plugin -> mcp -> skill). Inject inert native + skill mocks in
the execute_func_call fixture and assert all four shutdowns fire.
- utils/test_paths.py: drop the three cwd-dependent _check_if_source_install
cases. The refactor walks Path(__file__).resolve().parents looking for
pyproject.toml + main.py, so cwd no longer factors in and there's no
file read to mock-fail. The positive case and caching test still apply.
- utils/test_version.py: delete entirely. is_newer and compare_version_str
were removed when VersionManager was refactored to use the Space API for
release checks (1b4107a9); the tests targeted a surface that no longer
exists.
Until now ``BoxService.get_status`` returned ``available: true`` whenever
the runtime connector was healthy, even if the runtime itself reported
``backend: { available: false }`` (operator selected nsjail without the
binary, Docker daemon crashed mid-session, E2B credentials wrong, ...).
The dashboard / ``useBoxStatus`` hook / skill_service gate consumed the
top-level flag and showed "connected" while every actual call to native
exec or skill management would fail.
The native-tool loader already polled ``status.backend.available``
independently and hid its tools correctly, but every other consumer
(dashboard banner, the disabled-state hint, the LLM-facing message)
disagreed with it.
Combine the two in the payload: ``available = self._available AND
status.backend.available``. When ``backend.available`` is false we now
also surface a ``connector_error`` that names the backend
("Configured sandbox backend \"nsjail\" is unavailable") so the dialog
shows the actionable reason instead of an empty error pane. The
detailed ``backend`` object is preserved unchanged for the dialog.
Internal ``box_service.available`` (used by ``skill_service`` writes,
``mcp_stdio.uses_box_stdio``, the reconnect callback) is intentionally
NOT changed — it still tracks connector health only, so a backend blip
does not trigger spurious reconnect loops.
Tests:
- ``test_get_status_downgrades_available_when_backend_dead`` — exercise
the new branch (connector OK, backend.available=false → top-level
available=false, connector_error mentions the backend name)
- ``test_get_status_keeps_available_true_when_backend_ok`` — guard
against regressing the happy path
Live-verified with ``box.backend: nsjail`` on macOS (no nsjail binary):
``GET /api/v1/box/status`` now returns ``available: false`` with the
named connector_error, instead of the previous misleading
``available: true``.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The contributor's original PR (#1917) appended an ``Available Skills``
index to the system prompt before the LLM saw the user message, so the
LLM could decide whether to activate a skill. ``7145447b`` removed the
text-marker activation flow and, together with it, the entire system
prompt injection — but the Tool Call replacement only put the available
skills inside the ``activate`` tool's description. In practice the LLM
ignores tool descriptions for selection and goes straight to native
tools, so user-visible skill activation silently broke.
Restore the injection, adapted for the Tool Call era:
- SkillManager regains ``get_skill_index(bound_skills)`` and
``build_skill_aware_prompt_addition(bound_skills)``. The addendum
carries only ``name (display_name): description`` for each
pipeline-visible skill plus one instruction line pointing at the
``activate`` tool. No SKILL.md contents — KV cache stays clean
- PreProcessor appends the addendum to the first system message (or
inserts a new one) of ``query.prompt.messages`` for the local-agent
runner. Handles plain-string and ContentElement[] bodies. Skips
cleanly when no skills are visible
- 3 new test_preproc cases: injection happens, bound-skills subset
honoured, empty addendum touches nothing. 280 passed
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Skills now flow exclusively through the Box runtime. Every read and write
method funnels through ``_box_service()``; when Box is unavailable
(disabled in config, connection failed, or simply not installed) the
operation either returns an empty surface (``list_skills`` → []) or
raises with a clear ``Box runtime ... not initialised / disabled /
unavailable: ...`` message via the new ``_require_box(action)`` helper.
Why: the legacy local-fallback path scanned ``data/skills/``, but Box
manages its own ``box.local.skills_root`` (default ``data/box/skills/``).
The two diverging directories caused stale / phantom skill lists when
Box flapped, and the local-fallback writes silently bypassed all the
sandboxing the operator had configured.
SkillService (``api/http/service/skill.py``):
- New ``_require_box(action)`` returns the box service or raises a
structured ValueError. ``_require_box_for_write`` kept as alias
- ``list_skills`` → returns [] when Box is down so the UI can render
the disabled banner cleanly
- ``get_skill`` / ``get_skill_by_name`` → return None
- All read-file / write-file / scan-dir / create / update / delete /
install / preview methods → ``_require_box`` then box delegate.
Local fallback bodies (shutil.copytree, tempfile.mkdtemp, preview
pipelines) removed entirely
SkillManager (``pkg/skill/manager.py``):
- ``reload_skills`` returns early with empty cache when Box is down.
data/skills/ discovery loop removed
- ``refresh_skill_from_disk`` now just reports cache presence; the
on-disk re-parse is gone since Box is the only writer
Tests:
- Drop 11 obsolete test_skill_service.py tests that exercised the
removed local-fallback paths (create/install/file/delete/update)
- Add list-empty + read-refused tests; flip the legacy-allow test to
legacy-refuses-too
- Rewrite refresh_skill_from_disk test to match the new behaviour
Several helper methods (_managed_skill_path, _resolve_skill_path,
_preview_skill_candidates, _install_preview_candidates, etc.) are now
unreachable; a follow-up commit will prune them so this diff stays
reviewable.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Make the Box sandbox runtime optional. When ``box.enabled`` is false in
config (or when an enabled Box fails to connect), every dependent feature
degrades to the same disabled-state UX rather than crashing or silently
falling back to less safe code paths.
Backend:
- config.yaml: new top-level ``box.enabled: true`` flag (default true)
- BoxService:
- Read box.enabled on construction
- initialize() short-circuits when disabled — no remote WS connect, no
stdio subprocess fork
- _on_runtime_disconnect is a no-op when disabled (no reconnect loop
on a deliberately-off service)
- get_status() now exposes ``enabled`` so the frontend can tell
"disabled in config" from "configured but failed"
- MCP stdio loader (mcp_stdio.uses_box_stdio): requires box_service to
be available, not just installed
- MCP _init_stdio_python_server: when ap.box_service exists but is
unavailable, refuse the stdio server with an actionable error instead
of silently falling through to host-stdio (which bypasses the sandbox
the operator asked for). Setups without ap.box_service installed at
all keep the legacy host-stdio fallback for pre-Box dev mode
- SkillService._require_box_for_write: refuses create/update/install/
write_skill_file when ap.box_service is installed but unavailable.
Distinguishes disabled vs failed in the error message so the UI can
surface the right hint. Legacy setups (no ap.box_service) keep the
local fallback path — that distinction is what keeps the existing
local-skills tests valid
Tests:
- Box disabled-state behavior (4 cases)
- Skill write refusal in disabled & failed states (7 cases)
- MCP stdio runtime info policy updated to match new refuse-when-down
behavior
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Box backends behave inconsistently when extra_mounts reference a
missing host directory (nsjail aborts the entire sandbox start, Docker
silently creates a root-owned empty dir on the host, E2B silently skips
the upload). The cache in skill_mgr.skills is only refreshed on
in-process mutations, so out-of-band changes — container rebuilds,
manual rm in the box volume, anything the LangBot API didn't drive —
leave a stale skill that later produces one of those bad mount paths.
- box/service.py: build_skill_extra_mounts now filters skills whose
package_root is not isdir on the LangBot-visible filesystem and logs
a warning, instead of passing the bad mount through to the backend
- skill/manager.py: reload_skills (Box path) drops skills whose
package_root is missing on the LangBot-side filesystem before they
reach the in-memory cache, with a summary warning
- api/http/controller/groups/skills.py: file/CRUD handlers now also
catch BoxError (RuntimeError subclass, previously slipping past
``except ValueError`` and surfacing as 500); list/get handlers gain
a try/except so a transient Box RPC failure becomes a clean 400
instead of a stack trace
Tests added for build_skill_extra_mounts (skip missing, skip empty,
no skill manager) and SkillManager.reload_skills (drop missing on Box
path). Full unit suite: 279 passed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The skill subsystem moved to Tool-Call activation and a Box-managed
skill store; several tests still asserted removed APIs and a sys.modules
stub leaked across the suite. Full unit suite now green (was 23 failing).
- test_skill_tools: drop TestSkillManagerActivation (text-marker API
removed); rewrite TestSkillActivationHelper around the current
skill.activation.register_activated_skill; replace the CRUD
TestSkillAuthoringToolLoader with TestSkillToolLoader covering the
current activate/register_skill tools and sandbox-availability gating
- test_tool_manager_native: ToolManager attr is skill_tool_loader (not
skill_authoring_tool_loader); native loader now exposes 6 tools
(exec/read/write/edit/glob/grep) and requires initialize() with a
backend-available get_status()
- test_localagent_sandbox_exec: remove obsolete activation-marker
leakage tests and their helper providers
- test_model_service / pipeline conftest: give the mocks skill_mgr=None
so PreProcessor's local-agent skill-binding guard short-circuits
- test_n8nsvapi: stop permanently overwriting sys.modules
('langbot.pkg.provider.runner' etc.); save and restore around the
import so other modules get the real LocalAgentRunner base class
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The /api/v1/system/debug/exec endpoint passes user-supplied HTTP body
directly to Python exec(), enabling arbitrary code execution for any
authenticated user when debug_mode is enabled. This is a critical
security risk (CWE-94): a single misconfiguration or compromised JWT
grants full server-side code execution.
Remove the endpoint entirely. The /debug/plugin/action endpoint (which
does not use exec()) is left intact as it serves a different, scoped
purpose.
Co-authored-by: Junyan Chin <rockchinq@gmail.com>
When the Box runtime disconnects, there is a race between the heartbeat
flipping _available=false and the frontend polling get_status(). If the
poll arrives first, client.get_status() throws a ConnectionClosedError
which propagated as a 500, causing the frontend to show a grey dot
(null status) instead of a red dot with error details.
Now get_status() catches RPC errors and returns available=false with
the exception message as connector_error. get_sessions() returns an
empty list when unavailable or on RPC failure.
Replace the per-message session_id with a template-based system
configurable per pipeline via 'Sandbox Scope' in the local-agent panel.
Default scope is per-chat ({launcher_type}_{launcher_id}).
Unify skill exec into the same container as default exec — skills are
mounted at /workspace/.skills/{name}/ via extra_mounts instead of
getting separate containers. All pipeline-bound skills are injected
at container creation time.
- Add box-session-id-template to pipeline metadata (select, 4 options, 8 languages)
- Add resolve_box_session_id() and build_skill_extra_mounts() to BoxService
- Rewrite native.py skill exec path to use execute_tool with shared session
- Update tests for new session_id format
- Add design doc: docs/review/box-session-scope.md
Remove the separate langbot_box_runtime Docker service. Box Runtime
now always launches as a local stdio subprocess, regardless of whether
LangBot runs in Docker or not. The WebSocket transport path is kept
only for explicit runtime_url configuration (remote deployment).
This simplifies deployment by eliminating cross-container path mapping
and network hops. Box Runtime is a pure scheduling process (talks to
Docker socket / nsjail), it does not execute user code or touch the
filesystem, so container isolation is unnecessary — unlike Plugin
Runtime.
* feat(skills): add Agent Skills management system
Implement comprehensive skills management feature inspired by agentskills spec:
Backend:
- Add Skill and SkillPipelineBinding database entities
- Add database migration (dbm018) for skills tables
- Implement SkillManager for skill loading, matching, and resolution
- Implement SkillService for CRUD operations
- Add skills API endpoints for skill and pipeline binding management
- Integrate skill index injection into pipeline preprocessor
- Add skill activation detection in LocalAgentRunner
Frontend:
- Add Skills page with listing, search, and type filter
- Add SkillDetailDialog for create/edit with preview
- Add SkillCard and SkillForm components
- Add skills API methods to BackendClient
- Add skills entry to sidebar navigation
- Add i18n translations (en-US, zh-Hans)
Features:
- Support skill and workflow types
- Sub-skill composition via {{INVOKE_SKILL: name}} syntax
- Progressive disclosure (index in prompt, full instructions on activation)
- Pipeline-specific skill bindings with priority
* fix: resolve cherry-pick conflicts for agentskills onto sandbox
- Remove non-existent external_kb service import
- Add skill_mgr mock to localagent sandbox_exec tests
- Keep database version at 24 (sandbox branch's latest)
* feat(skills): upgrade to package-backed skills with sandbox execution
Evolve the skills system from pure prompt-based to package-backed with
sandbox tool execution support:
- Add source_type/package_root/entry_file/skill_tools fields to Skill entity
- SkillManager loads SKILL.md from local package directories
- SkillToolLoader as 4th dispatch layer in ToolManager (query-scoped)
- LocalAgent injects skill tools into use_funcs on skill activation
- BoxService.execute_skill_tool() runs scripts in sandbox (ro mount, env params)
- Skill tool names auto-namespaced as skill__{skill}__{tool}
- API validation for package_root allowlist and entry path traversal
- Frontend source_type toggle, package_root input, skill_tools editor
- Migration renumbered to 025 with ALTER TABLE fallback for existing DBs
- Fix unclosed limitation section in i18n files
- Fix skills API methods misplaced outside BackendClient class
* fix: test info
* feat(skills): switch skills to package-backed storage and add import tooling
- skills 从 inline/package 双轨收敛成 package-first
- instructions 改为写入并读取 SKILL.md
- 新增本地目录扫描和 GitHub 安装 skill
- 前端把 skills 整合进 plugins 页,新增 SkillsComponent 和 GitHub 导入弹窗
- skill form 去掉 source_type / type 筛选,改成目录扫描驱动
- Box skill tool 挂载模式从 ro 改成 rw
- 测试和中英文文案同步更新
* feat: simplify langbot skill create and import
* refactor(skills): clean up legacy skill API and harden activation flow
* refactor(skills): remove skill dependency expansion and add skill_get
* fix: lint
* fix: delete
* fix(skills): align tool manager loader initialization
* refactor: remove sandbox execute skill
* fix(skills): hide activation markers and isolate skill activation flow
* refactor(skills): switch skill model to filesystem-backed packages
* refactor(skills): switch skill model to filesystem-backed packages
* refactor(skills): unify runtime skill access around filesystem paths
* refactor(skills): unify runtime skill access around filesystem paths
* feat(skills): align rw package design and fix skill activation, visibility, and lint issues
* refactor(skills): replace rich authoring API with import/reload flow and update
Box design doc
* feat(box): add sandbox_exec tool loop for local-agent calculations
* feat(box): add host workspace mounting and sandbox_exec guidance
* feat(box): add BoxProfile with resource limits and improved output truncation
- Implement head+tail output truncation (60/40 split) so LLM sees both
beginning and final results; add streaming byte-limited reads in backend
to prevent unbounded memory usage (_MAX_RAW_OUTPUT_BYTES = 1MB)
- Define BoxProfile model with locked fields and max_timeout_sec clamping
- Add four built-in profiles: default, offline_readonly, network_basic,
network_extended with differentiated resource and security constraints
- Add resource limit fields to BoxSpec (cpus, memory_mb, pids_limit,
read_only_rootfs) and pass corresponding container CLI flags
(--cpus, --memory, --pids-limit, --read-only, --tmpfs)
- Profile loaded from config (box.profile), applied in service layer
before BoxSpec validation; locked fields cannot be overridden by
tool-call parameters
* feat(box): add obs
* refactor(box): unify box service lifecycle and local runtime
management
* refactor(box): remove legacy in-process runtime code and clean up smells
After the architecture settled on always using an independent Box Runtime
service, several pieces of compatibility code and design shortcuts were
left behind. This commit cleans them up:
- Remove `LocalBoxRuntimeClient` and `create_box_runtime_client` from
production code (moved to test-only helper).
- Remove unused `_clip_bytes` method from backend.
- Remove `__langbot_session_placeholder__` hack by making `BoxSpec.cmd`
default to empty and validating non-empty only in `runtime.execute()`.
- Extract `get_box_config()` helper to eliminate 5× duplicated config
access boilerplate.
- Remove `session_id`/`host_path`/`host_path_mode` from the LLM-facing
tool schema to enforce request-scoped session isolation.
- Fix dual shutdown path: `NativeToolLoader.shutdown()` no longer calls
`box_service.shutdown()` (handled by `Application.dispose()`).
- Simplify `_assert_session_compatible` with a loop.
- Inline client creation in `BoxRuntimeConnector`.
- Remove redundant `BOX__RUNTIME_URL` env var from docker-compose
(auto-detected by code).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(box/mcp): integrate MCP stdio with Box sandbox — auto-isolation, dep install, security
## Summary
When Podman/Docker is available, all stdio-mode MCP servers now automatically
run inside Box containers with dependency installation, path rewriting, and
lifecycle management. When no container runtime exists, LangBot starts normally
and stdio MCP falls back to host-direct execution.
## What changed
### MCP stdio → Box integration (mcp.py)
- Add `MCPServerBoxConfig` pydantic model for structured box configuration
with validation and defaults (network, host_path_mode, timeouts, resources)
- Auto-infer `host_path` from command/args with venv detection: recognizes
`.venv/bin/python` patterns and walks up to the project root
- Rewrite host paths to container `/workspace` paths transparently
- Replace venv python commands with container-native `python`
- Auto-detect `pyproject.toml`/`setup.py`/`requirements.txt` and run
`pip install` inside the container before starting the MCP server
- Copy project to `/tmp` before install to handle read-only mounts
- Add retry with exponential backoff (3 retries, 2s/4s/8s delays)
- Add Box managed process health monitoring (poll every 5s)
- Fix session leak: `_cleanup_box_stdio_session()` now runs in `finally`
block of `_lifecycle_loop`, covering all exit paths
- Fix retry logic: `_ready_event` is only set after all retries exhaust
or on success, not on first failure
- Enhance `get_runtime_info_dict()` with `box_session_id` and `box_enabled`
### Box security (security.py — new)
- `validate_sandbox_security()` blocks dangerous host paths:
`/etc`, `/proc`, `/sys`, `/dev`, `/root`, `/boot`, `/run`,
docker.sock, podman socket
- Called at the start of `CLISandboxBackend.start_session()`
### Box models (models.py)
- Add `BoxHostMountMode.NONE` — skips volume mount entirely
- Adjust `validate_host_mount_consistency` to allow arbitrary workdir
when `host_path_mode=NONE`
### Box backend (backend.py)
- Add `validate_sandbox_security()` call in `start_session()`
- Add `langbot.box.config_hash` label on containers for drift detection
- Handle `BoxHostMountMode.NONE` — skip `-v` mount arg
- Add `cleanup_orphaned_containers()` to base class (no-op default) and
CLI implementation (single batched `rm -f` command)
### Box runtime (runtime.py)
- Call `cleanup_orphaned_containers()` during `initialize()` to remove
lingering containers from previous runs
### Box service (service.py)
- Graceful degradation: `initialize()` catches runtime errors and sets
`available=False` instead of crashing LangBot startup
- Add `available` property and guard on `execute_sandbox_tool()`
- Add `skip_host_mount_validation` parameter to `build_spec()` and
`create_session()` — MCP paths are admin-configured and trusted,
bypassing `allowed_host_mount_roots` restrictions meant for
LLM-generated sandbox_exec commands
### Default behavior
- stdio MCP servers automatically use Box when `box_service.available`
is True (Podman/Docker detected); no explicit `box` config needed
- When no container runtime exists, falls back to host-direct stdio
- MCP Box defaults: `network=on` (for pip install), `read_only_rootfs=false`
(for site-packages), `host_path_mode=ro`, `startup_timeout=120s`
### Tests
- `test_box_security.py`: blocked paths, safe paths, subpath rejection
- `test_mcp_box_integration.py`: config model, path rewriting, venv
unwrap, host_path inference, payload building, runtime info, box
availability check
- `test_box_service.py`: `BoxHostMountMode.NONE` validation tests
* feat(box/mcp): instance-based orphan cleanup, error classification, session API, and integration tests
## Changes
### Precise orphan container cleanup
- Runtime generates a unique instance_id on startup
- Every container gets a `langbot.box.instance_id` label
- `cleanup_orphaned_containers()` only removes containers from
previous instances, preserving containers owned by the current one
- Containers from older versions (no label) are also cleaned up
- `cleanup_orphaned_containers` added to `BaseSandboxBackend` as
a no-op default method, removing hasattr duck-typing
### Fine-grained MCP error classification
- New `MCPSessionErrorPhase` enum with 7 phases: session_create,
dep_install, process_start, relay_connect, mcp_init, runtime,
tool_call
- Each phase in `_init_box_stdio_server()` sets the error phase
before re-raising, enabling precise failure diagnosis
- `retry_count` tracked across retry attempts
- `get_runtime_info_dict()` exposes `error_phase` and `retry_count`
### GET /v1/sessions/{id} API
- `BoxRuntime.get_session()` returns session details including
managed process info when present
- `handle_get_session` HTTP handler + route in server.py
- `BoxRuntimeClient.get_session()` abstract method + remote impl
### stdio defaults to Box when runtime is available
- `_uses_box_stdio()` checks `box_service.available` instead of
requiring explicit `box` key in server_config
- `BoxService.initialize()` catches runtime errors gracefully,
sets `available=False` instead of crashing LangBot startup
- When no container runtime exists, stdio MCP falls back to
host-direct execution
### Code quality (from /simplify review)
- Extracted `_VENV_DIRS` / `_VENV_BIN_DIRS` module-level constants
- Removed dead `_box_network_mode()` method and unused `bc` variable
- Fixed broken import `from ....box.models` → `from ...box.models`
- Cached `_resolve_host_path()` result — computed once, passed through
- Config hash now includes `host_path` field
- Batched orphan cleanup into single `rm -f` command
### Session leak fix
- `_cleanup_box_stdio_session()` now runs in `_lifecycle_loop`'s
finally block, covering all exit paths (normal shutdown, error,
retry, final failure)
### Integration tests
- 6 end-to-end tests covering managed process lifecycle, WebSocket
stdio bidirectional IO, session cleanup verification, single
session query, process exit detection, and orphan cleanup safety
* refactor: use rpc
* fix: import
* refactor(box): clean up sandbox subsystem code quality and efficiency
- Fix O(n²) stderr trimming in runtime.py with running length tracker
- Remove dead code: RESERVED_CONTAINER_PATHS, _subprocess_wait_task,
unused config_hash computation, unused imports
- Deduplicate connection callback in BoxRuntimeConnector, parse URL once
- Use enum comparison instead of stringly-typed spec.network.value check
- Replace manual _result_to_dict/_session_to_dict with model_dump()
- Cache NativeToolLoader tool definition and sandbox system guidance
- Extract _is_path_under() helper to eliminate duplicated path checks
- Import SANDBOX_EXEC_TOOL_NAME from native.py instead of redefining
- Add JSON startswith guard in logging_utils to skip futile json.loads
- Fix ruff lint errors (F401 unused imports, F841 unused variables)
* fix: ruff
* refactor(sandbox): keep box logic out of pipeline and localagent
- Move sandbox system-prompt guidance from LocalAgentRunner into
BoxService.get_system_guidance() so all box domain knowledge stays
in the box module.
- Remove standalone logging_utils.py; merge format_result_log() into
MessageHandler base class alongside cut_str().
- Strip sandbox-specific JSON parsing from log formatting; tool
results now use generic truncation.
- Revert TYPE_CHECKING changes in stage.py and runner.py that were
unrelated to this feature.
- Skip two test files affected by a pre-existing circular import
(runner ↔ app) until the import cycle is resolved in a separate PR.
* refactor(box): move box runtime to langbot-plugin-sdk
Extract self-contained box runtime modules (actions, backend, client,
errors, models, runtime, security, server) to langbot-plugin-sdk and
update all imports to use `langbot_plugin.box.*`. Keep only service
and
connector in LangBot core as they depend on the Application context.
- Update docker-compose to use `langbot_plugin.box.server` entry
point
- Update pyproject.toml to use local SDK via `tool.uv.sources`
- Remove migrated source files and their unit/integration tests
- Update remaining test imports to match new module paths
* fix: ruff
* fix(box): tighten sandbox exposure and restore box integration coverage
* refactor(types): remove quoted annotations under postponed evaluation
* chore(sandbox): move MCP loader changes to follow-up branch
* refactor(plugins): simplify GitHub install flow to default master archive
* revert(api): restore plugin GitHub import flow in plugins controller
* Improve data-root handling and skill install previews
* Add managed skill authoring tools for local agents
* Refactor the skills UI around sidebar detail pages
* Document why managed skill authoring tools bypass box
* fix: lint
* feat(web): refactor plugin/skill install flows and fix skills page
- Fix sidebar skill icon
- Add skills route and error page component
- Refactor plugin GitHub install from dialog modal to inline card
- Add skill install dropdown menu (create/upload/github) in sidebar
- Wire sidebar → skills page communication via pendingSkillInstallAction context
- Add i18n keys for error page and skill install actions
* fix(web): persist sidebar collapsible section open state on navigation
Sections opened via sub-item navigation now retain their expanded state
when the user switches to a different section, instead of collapsing
because the isActive fallback becomes false.
---------
Co-authored-by: youhuanghe <1051233107@qq.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Junyan Qin <rockchinq@gmail.com>