Commit Graph

110 Commits

Author SHA1 Message Date
huanghuoguoguo
1113bafe28 fix: harden agent runner runtime boundaries 2026-06-13 00:37:21 +08:00
huanghuoguoguo
897a708a13 Fix agent runner host migration and runtime guards
Migrates legacy runner blocks into plugin runner configs, preserves run-scoped history boundaries, enforces operation/file authorization, and sanitizes inline attachment persistence. Also fixes plugin runner form dirty handling and adds regression coverage.
2026-06-13 00:31:54 +08:00
huanghuoguoguo
fa31ddfe9c Fix agent runner steering and lifecycle hardening 2026-06-13 00:31:54 +08:00
huanghuoguoguo
14c9a3a8c6 feat(agent-runner): audit steering injection 2026-06-13 00:31:54 +08:00
huanghuoguoguo
90fb7305d0 chore: commit workspace changes 2026-06-13 00:31:54 +08:00
huanghuoguoguo
ea96d37e60 feat(agent-runner): enforce typed host permissions 2026-06-13 00:31:54 +08:00
huanghuoguoguo
8938ef7412 fix(agent-runner): harden state and event APIs 2026-06-13 00:31:14 +08:00
huanghuoguoguo
c859fc37bb refactor(agent-runner): remove protocol_version from various components and update related documentation 2026-06-13 00:31:14 +08:00
huanghuoguoguo
8c291fc974 test(agent): harden runner persistence coverage 2026-06-13 00:31:14 +08:00
huanghuoguoguo
173dc58272 feat(agent-runner): expose skill resources through host context 2026-06-13 00:30:33 +08:00
huanghuoguoguo
1ea61adde6 test: cover host skill tool scoping 2026-06-13 00:30:33 +08:00
huanghuoguoguo
b0e576dbb8 refactor(agent-runner): use protocol version field 2026-06-13 00:30:33 +08:00
huanghuoguoguo
b793409bed refactor(provider): formalize tool lookup contract 2026-06-13 00:30:33 +08:00
huanghuoguoguo
dbefd3364e refactor agent runner orchestration boundaries 2026-06-13 00:29:27 +08:00
huanghuoguoguo
e916c2e463 fix(agent-runner): align plugin runner runtime boundaries 2026-06-13 00:29:27 +08:00
huanghuoguoguo
c0f5f30f57 feat(agent-runner): add bounded native tool artifacts 2026-06-13 00:27:57 +08:00
huanghuoguoguo
bd690a79f0 feat(agent-runner): expose effective prompt and transcript history 2026-06-13 00:27:57 +08:00
huanghuoguoguo
3dc579feb3 refactor(agent-runner): make agent binding and auth snapshot explicit 2026-06-13 00:27:57 +08:00
huanghuoguoguo
86d5148534 refactor(agent-runner): simplify event-first entry path 2026-06-13 00:27:57 +08:00
huanghuoguoguo
efdc3678b1 refactor(agent-runner): align config with agent semantics 2026-06-13 00:27:10 +08:00
huanghuoguoguo
c351a3daed refactor(agent-runner): remove host context windowing 2026-06-13 00:27:10 +08:00
huanghuoguoguo
bfa5db767c feat(agent-runner): normalize binding config boundaries 2026-06-13 00:27:10 +08:00
huanghuoguoguo
9caef840c7 fix: enforce agent run API permissions 2026-06-13 00:27:10 +08:00
huanghuoguoguo
0bc68f3d3a fix(agent-runner): authorize external runner tools 2026-06-13 00:27:10 +08:00
huanghuoguoguo
b27e9c80cb docs(agent-runner): align runner protocol boundaries 2026-06-13 00:27:10 +08:00
huanghuoguoguo
a2b38f5bf2 fix(agent-runner): stabilize event context and streams 2026-06-13 00:27:10 +08:00
huanghuoguoguo
2fd2c6aadc refactor(agent-runner): tighten protocol v1 runtime boundaries 2026-06-13 00:27:09 +08:00
huanghuoguoguo
f9e07df539 feat(agent-runner): align protocol adapter terminology 2026-06-13 00:27:09 +08:00
huanghuoguoguo
d8d811e307 feat(agent-runner): route pipeline runs through event-first flow
- run_from_query() now delegates to run(event, binding) instead of maintaining
  a separate legacy execution path
- Pipeline Query is converted to AgentEventEnvelope via PipelineCompatAdapter
- Pipeline config is converted to AgentBinding with StatePolicy
- bound_plugins authorization preserved from Pipeline
- Legacy compatibility fields preserved:
  - query_id → context.runtime.query_id → session registry
  - prompt → context.compatibility.extra.prompt (not top-level)
  - params → context.compatibility.extra.params (with proper filtering)
  - max-round → bootstrap.messages and compatibility.legacy_messages
- Pipeline path gains event-first host capabilities:
  - EventLog and Transcript writing
  - ArtifactStore registration
  - PersistentStateStore for state.updated
- Removed legacy handlers:
  - _handle_artifact_created_query() (replaced by _handle_artifact_created)
  - _handle_state_updated() (replaced by _handle_state_updated_event)

This change unifies the execution path while preserving backward compatibility
for Pipeline-based runners. EventGateway is not implemented in this branch;
only the event-first entry point is reserved.
2026-06-13 00:27:09 +08:00
huanghuoguoguo
f23f343edc feat(agent-runner): add persistent state APIs 2026-06-13 00:27:09 +08:00
huanghuoguoguo
a7d90d196f feat(agent-runner): scope event-first state by binding 2026-06-13 00:27:09 +08:00
huanghuoguoguo
a7a359fb41 feat(agent-runner): persist created artifacts 2026-06-13 00:27:09 +08:00
huanghuoguoguo
e2712a8993 feat(agent-runner): add artifact store pull APIs 2026-06-13 00:27:09 +08:00
huanghuoguoguo
085a767f97 feat(agent-runner): add event-first context facts and pull APIs
Add EventLog and Transcript persistence entities for storing auditable
event facts and conversation history projection. Implement event-first
AgentRunContext builder that produces Protocol v1 compliant context
payloads with required fields: event, delivery, context (ContextAccess).

Key changes:
- EventLog ORM: auditable event records with indexes
- Transcript ORM: conversation history projection with composite indexes
- AgentRunContextBuilder: Protocol v1 payload with delivery, context, bootstrap
- EventLogStore/TranscriptStore: async stores for fact sources
- Host action handlers: HISTORY_PAGE, HISTORY_SEARCH, EVENT_GET, EVENT_PAGE
- Context validation: build_context output validates via SDK AgentRunContext
- Alembic migration for event_log and transcript tables
- Alembic env.py imports all ORM models for autogenerate discovery

Legacy compatibility: max-round messages go into bootstrap.messages and
compatibility.legacy_messages, not top-level messages field.
2026-06-13 00:27:09 +08:00
huanghuoguoguo
1b35ca67c5 fix(agent-runner): package context for plugin execution 2026-06-13 00:27:09 +08:00
huanghuoguoguo
4c98889566 feat: make agent runner config schema driven 2026-06-13 00:27:09 +08:00
huanghuoguoguo
07dcc0ec03 chore(agent): remove v1 wording from runner internals 2026-06-13 00:27:09 +08:00
huanghuoguoguo
e5b511de8f feat(agent): reserve stable runner event names 2026-06-13 00:26:44 +08:00
huanghuoguoguo
195d1a9c8e feat(agent-runner): enrich plugin runner host context 2026-06-13 00:26:43 +08:00
huanghuoguoguo
d419ee4139 fix: log agent runner best-effort failures 2026-06-13 00:26:43 +08:00
huanghuoguoguo
0cdecbbf36 test: address agent runner review comments 2026-06-13 00:26:43 +08:00
huanghuoguoguo
76582a578e feat: support dynamic agent runner defaults 2026-06-13 00:26:43 +08:00
huanghuoguoguo
d560a1eff3 feat(plugin): implement INVOKE_RERANK handler with run-scoped authorization
- Add invoke_rerank action handler in plugin handler
- Validate rerank model access via run session
- Cap documents at 64 to respect the API limit
- Return results sorted by relevance score
2026-06-13 00:26:43 +08:00
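
The cap-and-sort behavior described in d560a1eff3 can be sketched roughly as follows. This is an illustration only, not the handler from the commit: the function names, the result shape, and the choice to truncate (rather than reject) oversized inputs are all assumptions. Only the limit of 64 and the sort by relevance score come from the message.

```python
from typing import Dict, List

MAX_RERANK_DOCUMENTS = 64  # limit stated in the commit message


def cap_documents(documents: List[str], limit: int = MAX_RERANK_DOCUMENTS) -> List[str]:
    """Keep at most `limit` documents (assumption: extra documents are dropped)."""
    return documents[:limit]


def sort_by_relevance(results: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Return results ordered from most to least relevant.

    Each result is a hypothetical dict with 'index' and 'relevance_score' keys.
    """
    return sorted(results, key=lambda r: r["relevance_score"], reverse=True)
```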
huanghuoguoguo
790588fa22 perf(agent-runner): improve session registry and orchestrator efficiency
- Add pre-computed _authorized_ids (frozenset) at session registration for O(1) lookup
- Refactor is_resource_allowed() from linear search to set membership check
- Add thread-safe locking to get_session_registry() singleton
- Cache _session_registry and _state_store references in orchestrator __init__
- Add asyncio.gather() for parallel resource building in AgentResourceBuilder
- Create shared test fixtures in tests/unit_tests/agent/conftest.py
- Update test files to import from shared conftest.py

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-13 00:26:43 +08:00
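
The three techniques named in 790588fa22 (a precomputed frozenset for membership checks, a lock-guarded singleton accessor, and asyncio.gather() for concurrent resource building) might look like the minimal sketch below. The class layout, field names other than _authorized_ids, and the loader function are hypothetical stand-ins, not the project's actual code.

```python
import asyncio
import threading
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional


@dataclass
class RunSession:
    run_id: str
    resource_ids: List[str]
    _authorized_ids: FrozenSet[str] = field(init=False, repr=False)

    def __post_init__(self) -> None:
        # Computed once at registration so later checks are set lookups.
        self._authorized_ids = frozenset(self.resource_ids)

    def is_resource_allowed(self, resource_id: str) -> bool:
        return resource_id in self._authorized_ids


class SessionRegistry:
    def __init__(self) -> None:
        self.sessions: Dict[str, RunSession] = {}


_registry: Optional[SessionRegistry] = None
_registry_lock = threading.Lock()


def get_session_registry() -> SessionRegistry:
    """Lazily create the singleton; the lock prevents double creation."""
    global _registry
    if _registry is None:
        with _registry_lock:
            if _registry is None:  # re-check after acquiring the lock
                _registry = SessionRegistry()
    return _registry


async def _load(kind: str) -> List[str]:
    await asyncio.sleep(0)  # stands in for real I/O
    return [f"{kind}-1"]


async def build_resources() -> Dict[str, List[str]]:
    # Both loaders run concurrently instead of one after the other.
    models, tools = await asyncio.gather(_load("model"), _load("tool"))
    return {"models": models, "tools": tools}
```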
huanghuoguoguo
78e6b9866b feat(agent-runner): integrate AgentRunner Protocol v1 with plugin system
Phase 0 integration complete - verified minimal loop with local-agent stub runner.

Changes:
- Add AgentRunOrchestrator for plugin-based agent execution
- Add AgentResultNormalizer for Protocol v1 result conversion
- Add AgentRunnerDescriptor for runner ID parsing (plugin:author/name/runner)
- Update chat handler to use new orchestrator instead of direct runner lookup
- Add plugin handler methods for list_agent_runners and run_agent
- Add connector methods for AgentRunner protocol forwarding
- Update pipeline API to include runner options in metadata
- Add integration docs and implementation plan

Integration verified:
- Runner: plugin:langbot/local-agent/default
- Input: "你好" ("Hello")
- Output: [stub] Echo: 你好
- Date: 2026-05-10 10:09

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-13 00:25:06 +08:00
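
A runner ID of the form plugin:author/name/runner, as mentioned in 78e6b9866b, could be parsed along these lines. This is a sketch under assumptions: the RunnerId dataclass, its field names, and the validation rules are hypothetical and do not reproduce AgentRunnerDescriptor itself.

```python
from dataclasses import dataclass

PLUGIN_PREFIX = "plugin:"


@dataclass(frozen=True)
class RunnerId:
    author: str
    plugin_name: str
    runner_name: str


def parse_runner_id(runner_id: str) -> RunnerId:
    """Split 'plugin:author/name/runner' into its three parts."""
    if not runner_id.startswith(PLUGIN_PREFIX):
        raise ValueError(f"runner id must start with {PLUGIN_PREFIX!r}: {runner_id!r}")
    parts = runner_id[len(PLUGIN_PREFIX):].split("/")
    # Assumption: exactly three non-empty segments are required.
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected author/name/runner, got {runner_id!r}")
    return RunnerId(*parts)
```

With the runner from the integration note, parse_runner_id("plugin:langbot/local-agent/default") yields author "langbot", plugin name "local-agent", and runner name "default".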
RockChinQ
2b6dcfe9c7 feat(survey): add bot_response_success_100 milestone trigger event
Counts successful non-WebSocket bot responses (persisted in the metadata
table as survey_bot_response_count, survives restarts) and fires the
bot_response_success_100 survey event once the instance reaches 100
responses. Counting stops after the milestone has been triggered.

Existing first_bot_response_success behavior unchanged. 6 new unit tests.
2026-06-12 09:40:07 -04:00
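
The counting rule described in 2b6dcfe9c7 can be illustrated with a small sketch. The key survey_bot_response_count and the event name come from the message; the function signature, the plain dict standing in for the metadata table, and the set of already-fired events are assumptions made for the example.

```python
from typing import Dict, Optional, Set

MILESTONE = 100
COUNT_KEY = "survey_bot_response_count"
EVENT_NAME = "bot_response_success_100"


def record_bot_response(
    metadata: Dict[str, int],
    fired_events: Set[str],
    is_websocket: bool,
) -> Optional[str]:
    """Count one successful response; return the event name when it fires."""
    # WebSocket responses are not counted, and counting stops once fired.
    if is_websocket or EVENT_NAME in fired_events:
        return None
    metadata[COUNT_KEY] = metadata.get(COUNT_KEY, 0) + 1
    if metadata[COUNT_KEY] >= MILESTONE:
        fired_events.add(EVENT_NAME)
        return EVENT_NAME
    return None
```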
RockChinQ
dd96da895c feat(telemetry): payload v2 with feature usage counters and instance heartbeat
Per-query events now carry event_type='query' and a features JSON object:
- tool_calls by source (native/plugin/mcp/skill) via ToolManager
- tool_call_rounds, kb usage (count/engine plugins/retrieved entries) via local-agent
- sandbox execs/errors via BoxService
- activated_skills and bound mcp_servers snapshots

New instance_heartbeat event (startup + daily) reports anonymous instance
profile: deploy platform, database/vdb kind, box backend/availability,
adapter type names, and resource counts. Respects space.disable_telemetry.

All collection helpers are defensive and never break the pipeline.
Verified: ruff, 37 telemetry unit tests (13 new), 504 box/provider/pipeline tests.
2026-06-12 08:11:43 -04:00
RockChinQ
47ade18596 fix(log): roll daily log file at midnight for long-running processes
The log filename was computed once at init_logging() startup and the
RotatingFileHandler only rotated by size, so a process running across
midnight kept appending every subsequent day's logs to the start-day
file (langbot-<start date>.log). No file ever appeared for the current
day until the process was restarted, confusing users into thinking
logging had stopped.

Replace RotatingFileHandler with DailyGroupedRotatingFileHandler, which
switches to langbot-<current date>.log when the local date changes while
still doing size-based numbered rotation within a day. On-disk naming
stays compatible with the maintenance log-retention cleanup
(LOG_FILE_PATTERN). Adds regression tests.
2026-06-10 04:58:11 -04:00
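
The idea behind 47ade18596 (switch to a new dated file when the local date changes, while keeping size-based rotation within a day) can be sketched on top of the standard library's RotatingFileHandler. This is not the DailyGroupedRotatingFileHandler from the commit: the class name, constructor parameters, default sizes, and the ISO date format in the filename are assumptions.

```python
import datetime as dt
import logging
import logging.handlers
import os
from typing import Callable


class DailySizeRotatingHandler(logging.handlers.RotatingFileHandler):
    """Writes to <prefix>-<date>.log and re-targets the file when the date changes."""

    def __init__(
        self,
        log_dir: str,
        prefix: str = "langbot",
        max_bytes: int = 10 * 1024 * 1024,  # hypothetical default
        backup_count: int = 5,  # hypothetical default
        today: Callable[[], dt.date] = dt.date.today,  # injectable for tests
    ) -> None:
        self._log_dir = log_dir
        self._prefix = prefix
        self._today = today
        self._current_date = today()
        super().__init__(
            self._path_for(self._current_date),
            maxBytes=max_bytes,
            backupCount=backup_count,
            encoding="utf-8",
            delay=True,  # open the file lazily on first write
        )

    def _path_for(self, date: dt.date) -> str:
        return os.path.join(self._log_dir, f"{self._prefix}-{date.isoformat()}.log")

    def emit(self, record: logging.LogRecord) -> None:
        # Handler.handle() already holds the handler lock when emit() runs.
        date = self._today()
        if date != self._current_date:
            if self.stream is not None:
                self.stream.close()
                self.stream = None  # reopened against the new filename
            self._current_date = date
            self.baseFilename = os.path.abspath(self._path_for(date))
        super().emit(record)
```

Size-based numbered rotation is inherited unchanged, so files within a day still roll to .1, .2, and so on once max_bytes is reached.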
Junyan Chin
8e558ad3a1 Feat/saas sandbox adaptation (#2234)
* fix(box): trust Box-reported skill paths when filesystem is not shared

In separated deployments (Docker Compose, k8s sidecar, --standalone-box,
remote runtime.endpoint) the Box runtime owns its own filesystem, so the
skill package_root it reports via list_skills is not resolvable on the
LangBot side. LangBot's reload_skills and build_skill_extra_mounts
validated those paths with os.path.isdir() against its own filesystem,
which silently dropped every skill in such deployments — breaking the
sandbox skill feature for the nsjail/SaaS backend.

Add BoxService.shares_filesystem_with_box, derived from the connector
transport (stdio = shared, WebSocket = separated), with an explicit
override seam for tests/embedders. Gate both isdir() guards on it: keep
local validation in shared-fs stdio mode, trust Box-reported paths
otherwise. The Box runtime only reports skills found on its own
filesystem, so those paths are valid there by construction.

Adds topology-derivation tests (real connector, no mocks) and
skill-retention tests for both shared and separated filesystems.
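
A minimal sketch of the gating described above, assuming free functions in place of the BoxService property and a hypothetical skill dict with a package_root key. The stdio/WebSocket mapping and the override seam follow the message; everything else is illustrative.

```python
import os
from typing import Dict, List, Optional


def shares_filesystem_with_box(transport: str, override: Optional[bool] = None) -> bool:
    """stdio means a shared filesystem; any other transport is treated as separated."""
    if override is not None:  # explicit seam for tests/embedders
        return override
    return transport == "stdio"


def retain_skills(skills: List[Dict[str, str]], shared_fs: bool) -> List[Dict[str, str]]:
    """Validate paths locally only when the filesystem is shared."""
    if not shared_fs:
        # Paths were reported by the Box runtime and are valid on its side.
        return list(skills)
    return [s for s in skills if os.path.isdir(s["package_root"])]
```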

* build(docker): ship a self-contained nsjail sandbox backend in the image

Compile nsjail 3.6 from source in a dedicated multi-stage build and carry
only the binary plus its runtime libs (libprotobuf32, libnl-route-3-200)
into the final image. This lets the Box runtime isolate sandboxed code via
nsjail user/mount/pid/net namespaces without a host Docker socket — the
prerequisite for running Box on LangBot Cloud (k8s), where mounting
docker.sock would grant node root and is not acceptable for multi-tenant.

The build toolchain (build-essential/bison/flex/protobuf-dev/libnl-dev)
stays in the nsjail-build stage and is not present in the shipped image.

Verified: image builds (583MB), nsjail --help exits 0, libraries resolve,
and the real NsjailBackend executes an isolated command end-to-end on a
v6.1/cgroup2 host matching LangBot Cloud prod (rlimit fallback path, since
container /sys/fs/cgroup is read-only; PID-namespace isolation confirmed).

* feat(box): SaaS guard to force a single global sandbox scope

Add system.limitation.force_box_session_id_template: when non-empty it
overrides every pipeline's box-session-id-template at resolve time, pinning
all queries to one shared sandbox (e.g. {global}). This is the authoritative,
unbypassable guard — it runs on every exec call, so editing the pipeline
config via API cannot escape it. The web UI locks the Sandbox Scope selector
via a combined box_scope_editable flag (box available AND not forced).
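
The override and the UI flag described above reduce to two small rules, sketched here as standalone functions. The function names and the sample pipeline template are hypothetical; only the setting's semantics (non-empty forced template wins, selector editable when Box is available and nothing is forced) come from the message.

```python
def resolve_box_session_template(pipeline_template: str, forced_template: str) -> str:
    """A non-empty forced template overrides whatever the pipeline configured."""
    return forced_template if forced_template else pipeline_template


def box_scope_editable(box_available: bool, forced_template: str) -> bool:
    """The selector is editable only if Box is available and no scope is forced."""
    return box_available and not forced_template
```

Because the override is applied at resolve time, a pipeline whose stored template is changed through the API still resolves to the forced value.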

* build(deps): pin langbot-plugin==0.4.2b1 (nsjail cgroup container-safety beta)

* fix(web): show forced sandbox scope + make disabled tooltip tap-friendly

When a SaaS deployment pins every pipeline to a fixed sandbox scope via
system.limitation.force_box_session_id_template, the Sandbox Scope selector was
correctly locked but still displayed the pipeline's stored value (e.g. the
per-chat default), misrepresenting the scope that the runtime actually enforces
on every exec. Coerce the displayed/saved value to the forced template so the
locked selector truthfully shows the active scope (e.g. Global).

Also fix the disabled_tooltip being invisible on touch devices: hover-only Radix
tooltips never open without a pointer, so the explanation of why the field is
locked could not be read on mobile. Wrap the info icon so a tap toggles the
tooltip while desktop hover still works.

* feat(web): hide sidebar new-version prompt for edition=cloud

Cloud instances are upgraded centrally by the operator, so surfacing a GitHub
'new version available' badge to tenants is misleading and actionable only by
the operator. Skip the release check entirely when edition=cloud.

* style(web): prettier formatting for DisabledTooltipIcon ternary

* chore(deps): bump langbot-plugin to 0.4.2b2

Picks up the SDK fix that creates a read-write host_path before the
nsjail bind-mount, fixing the SaaS MCP shared-workspace sandbox failure
(exec exit 255 with empty output when host_path didn't exist).

* chore(deps): bump langbot-plugin to 0.4.2b3

Picks up the nsjail /dev-node fix so stdio MCP servers (uvx-launched) can
start under force_global_sandbox instead of failing with 'Connection closed
/ please check URL'.

* fix(web): show real MCP runtime status on installed extensions list

The installed-extensions list badge keyed solely off the enable flag, so a
server that was still CONNECTING (or in ERROR) was shown as 'Connected'.
Reflect the actual runtime_info.status (connecting/connected/error/disabled)
with matching colors, and poll quietly every 3s while any MCP server is
connecting so the badge transitions without a manual refresh.

* chore(deps): bump langbot-plugin to 0.4.2b4

Picks up the 30s start_managed_process timeout so cold uvx MCP bootstraps
don't get torn down mid-install.

* style(web): satisfy prettier — parenthesize nullish-coalescing in ternary

* fix(mcp): isolate transient test sessions from the shared Box session

A config-page 'test' (server_name='_', no persisted UUID) ran in the same
shared 'mcp-shared' Box session as live MCP servers. A failing test (e.g.
empty args) churned that shared session and tore down healthy, already-
connected servers — leaving them stuck after exhausting their retries.

Mark UUID-less sessions as transient, give them their own isolated Box
session ('mcp-test-<uuid>'), and fully delete that session on cleanup so
tests can never disturb live servers and don't leak sessions.

* fix(mcp): tear down transient test session after test completes

A successful config-page test left its isolated 'mcp-test-<uuid>' Box
session running (the lifecycle task blocks until shutdown). Wrap the
transient test coroutine so it always shuts the session down afterward,
preventing isolated test sessions from leaking.
2026-06-09 19:30:17 +08:00
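
The last two items of 8e558ad3a1 (isolated sessions for UUID-less test runs, and guaranteed teardown afterward) can be sketched as follows. The session names mcp-shared and the mcp-test- prefix come from the message; the helper names, the use of uuid4, and the callback-based shape are assumptions.

```python
import asyncio
import uuid
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")

SHARED_SESSION = "mcp-shared"


def box_session_name(server_uuid: Optional[str]) -> str:
    """Persisted servers share one session; UUID-less tests get their own."""
    if server_uuid:
        return SHARED_SESSION
    return f"mcp-test-{uuid.uuid4()}"


async def run_transient_test(
    test: Callable[[], Awaitable[T]],
    shutdown: Callable[[], Awaitable[None]],
) -> T:
    """Run a config-page test and always shut its session down afterward."""
    try:
        return await test()
    finally:
        await shutdown()  # runs on success and on failure
```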
RockChinQ
7330732f62 fix(ci): bump migration head assertion to 0004, apply prettier
- Update test_migrations / test_migrations_postgres head assertion from
  0003 to 0004 after adding the mcp readme migration.
- Reformat MCPForm.tsx / MCPReadme.tsx to satisfy prettier/prettier.
2026-06-06 03:56:14 -04:00