feat(telemetry): payload v2 with feature usage counters and instance heartbeat

Per-query events now carry event_type='query' and a features JSON object:
- tool_calls by source (native/plugin/mcp/skill) via ToolManager
- tool_call_rounds, kb usage (count/engine plugins/retrieved entries) via local-agent
- sandbox execs/errors via BoxService
- activated_skills and bound mcp_servers snapshots

New instance_heartbeat event (startup + daily) reports anonymous instance
profile: deploy platform, database/vdb kind, box backend/availability,
adapter type names, and resource counts. Respects space.disable_telemetry.

All collection helpers are defensive and never break the pipeline.
Verified: ruff, 37 telemetry unit tests (13 new), 504 box/provider/pipeline tests.
This commit is contained in:
RockChinQ
2026-06-12 08:11:43 -04:00
parent bca710dbd4
commit dd96da895c
10 changed files with 488 additions and 0 deletions
+3
View File
@@ -12,6 +12,7 @@ import pydantic
from langbot_plugin.box.client import BoxRuntimeClient
from .connector import BoxRuntimeConnector, _get_box_config
from ..telemetry import features as telemetry_features
from langbot_plugin.box.errors import BoxError, BoxValidationError
from langbot_plugin.box.models import (
BUILTIN_PROFILES,
@@ -218,6 +219,7 @@ class BoxService:
f'query_id={query.query_id} '
f'summary={json.dumps(self._summarize_result(result), ensure_ascii=False)}'
)
telemetry_features.increment(query, 'sandbox', 'execs')
return self._serialize_result(result)
def resolve_box_session_id(self, query: pipeline_query.Query) -> str:
@@ -785,6 +787,7 @@ class BoxService:
# ── Observability ─────────────────────────────────────────────────
def _record_error(self, exc: Exception, query: pipeline_query.Query):
telemetry_features.increment(query, 'sandbox', 'errors')
self._recent_errors.append(
{
'timestamp': _dt.datetime.now(_UTC).isoformat(),