LangBot/tests/unit_tests
RockChinQ 39673444d2 fix(provider): capture streaming token usage; add token observability
The LiteLLM streaming requester captured usage only when a chunk had an
empty `choices` list. Many OpenAI-compatible gateways (e.g. new-api) and
providers send the final usage payload in a chunk that still carries an
empty-delta choice, so streamed calls through them always recorded 0
tokens in the monitoring logs and dashboard (non-streaming calls were
unaffected).
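
A minimal sketch of the capture rule described above, not the actual requester code. It assumes chunks are plain dicts (real LiteLLM chunks are objects), and `last_stream_usage` is a hypothetical name used only for illustration:

```python
from typing import Any, Iterable, Optional


def last_stream_usage(chunks: Iterable[dict[str, Any]]) -> Optional[dict[str, Any]]:
    """Return the last usage payload seen in a stream, or None if there was none."""
    usage = None
    for chunk in chunks:
        # The old condition, per the description above, was roughly
        #   if not chunk.get("choices") and chunk.get("usage"): ...
        # which skips usage that rides on a chunk with an empty-delta choice.
        chunk_usage = chunk.get("usage")
        if chunk_usage:
            usage = chunk_usage  # keep usage no matter what `choices` holds
    return usage


# Hypothetical stream: the final chunk carries both a choice and the usage.
chunks = [
    {"choices": [{"delta": {"content": "Hi"}}]},
    {
        "choices": [{"delta": {}, "finish_reason": "stop"}],
        "usage": {"prompt_tokens": 12, "completion_tokens": 3, "total_tokens": 15},
    },
]
print(last_stream_usage(chunks))
```

With the old empty-`choices` condition, the second chunk above would have been ignored and the call logged with 0 tokens.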

- Capture stream usage whenever a chunk carries it, regardless of choices
- Add robust _normalize_usage (dict/obj shapes, derive missing total_tokens)
- Register litellm in bootutils/deps.py (was in pyproject only)
- Add MonitoringService.get_token_statistics + /monitoring/token-statistics
  endpoint: summary, per-model breakdown, token timeseries, and a
  zero-token-success data-quality signal
- Add TokenMonitoring dashboard tab (summary tiles, stacked token chart,
  per-model table) + i18n (en/zh)
- Regression tests for stream usage capture and usage normalization
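
The normalization item above might look roughly like the following. This is a sketch under assumptions, not the `_normalize_usage` added by this commit: the field names follow the OpenAI usage schema, and missing prompt or completion counts are treated as 0.

```python
from types import SimpleNamespace
from typing import Any, Optional


def normalize_usage(usage: Any) -> Optional[dict[str, int]]:
    """Coerce a dict-shaped or attribute-shaped usage payload into a plain dict."""
    if usage is None:
        return None

    def read(name: str) -> Optional[int]:
        # Accept both {"prompt_tokens": 5} and objects with .prompt_tokens.
        if isinstance(usage, dict):
            value = usage.get(name)
        else:
            value = getattr(usage, name, None)
        return int(value) if value is not None else None

    prompt = read("prompt_tokens") or 0
    completion = read("completion_tokens") or 0
    total = read("total_tokens")
    if total is None:
        # Derive the total when the provider omits it.
        total = prompt + completion
    return {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        "total_tokens": total,
    }


# Dict payload with no total, and an object payload with one.
print(normalize_usage({"prompt_tokens": 5, "completion_tokens": 7}))
print(normalize_usage(SimpleNamespace(prompt_tokens=1, completion_tokens=2, total_tokens=3)))
```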

Verified end-to-end against a real OpenAI-compatible endpoint with
gpt-5.5 and claude-opus-4-8: tokens now recorded non-zero for both
streaming and non-streaming paths.
2026-06-05 09:13:57 -04:00