From 8749a9b56f6a7d3c8ab7925b724c9887f5b593aa Mon Sep 17 00:00:00 2001 From: huanghuoguoguo <60681390+huanghuoguoguo@users.noreply.github.com> Date: Thu, 25 Jun 2026 10:07:04 +0800 Subject: [PATCH] test(skills): prepare user path performance gate --- .gitignore | 1 + skills/README.md | 2 +- skills/docs/user-guide.md | 138 ++++++++ .../e2e/ensure-local-agent-pipeline.mjs | 325 +++++++++++++++++- skills/scripts/e2e/lib/langbot-e2e.mjs | 3 +- skills/skills.index.json | 9 +- .../references/service-startup.md | 6 +- .../pipeline-debug-chat-performance.yaml | 5 + .../performance-reliability-testing.md | 23 ++ 9 files changed, 493 insertions(+), 19 deletions(-) create mode 100644 skills/docs/user-guide.md diff --git a/.gitignore b/.gitignore index d0fe6acb6..97a64ba81 100644 --- a/.gitignore +++ b/.gitignore @@ -48,6 +48,7 @@ coverage.xml .coverage src/langbot/web/ testsdk/ +.qa/ # Build artifacts /dist diff --git a/skills/README.md b/skills/README.md index f45b52859..d287f6762 100644 --- a/skills/README.md +++ b/skills/README.md @@ -26,7 +26,7 @@ and LangBot's own Local Agent) working with the LangBot ecosystem. ## Quick start (for an AI agent) -1. Read this README, `AGENTS.md`, and `qa-agent-docs/` to understand the layout. +1. Read this README, `AGENTS.md`, and `docs/user-guide.md` to understand the layout. 2. Read `skills/.env` for shared local defaults. On a new machine, copy `skills/.env.example` to `skills/.env.local` (gitignored) and override machine-specific values there. Never commit secrets. diff --git a/skills/docs/user-guide.md b/skills/docs/user-guide.md new file mode 100644 index 000000000..d52a77247 --- /dev/null +++ b/skills/docs/user-guide.md @@ -0,0 +1,138 @@ +# LangBot QA Skills User Guide + +Use this guide as the first operational path after reading `README.md` and +`AGENTS.md`. + +## 1. Configure Local Inputs + +Read `skills/.env`, then create `skills/.env.local` for machine-local values. +Do not commit `.env.local`, browser profiles, reports, tokens, API keys, OAuth +state, or provider credentials. + +Minimum local fields for live browser QA: + +```bash +LANGBOT_REPO=/path/to/LangBot +LANGBOT_WEB_REPO=/path/to/LangBot/web +LANGBOT_BACKEND_URL=http://127.0.0.1:5300 +LANGBOT_FRONTEND_URL=http://127.0.0.1:3000 +LANGBOT_DEV_FRONTEND_URL=http://127.0.0.1:3000 +LANGBOT_BROWSER_PROFILE=/path/to/langbot-browser-profile +LANGBOT_CHROMIUM_EXECUTABLE=/path/to/chromium-or-playwright-chrome +LANGBOT_E2E_LOGIN_USER=qa-local@example.com +``` + +`LANGBOT_E2E_LOGIN_USER` is a local QA account. The setup automation uses the +LangBot recovery key from the active checkout to initialize or refresh that +local account and write a browser `localStorage` token. It does not need the +user's GitHub or Space credentials. + +## 2. Check Readiness + +From `skills/`: + +```bash +bin/lbs env show +bin/lbs env doctor +bin/lbs validate +bin/lbs index --check +``` + +`env doctor` should report reachable backend and frontend URLs before live +browser cases are run. Missing Space provider credentials are not a LangBot +product pass; classify them as `env_issue` and configure the local Space +provider before measuring Debug Chat performance. + +## 3. Start Services + +Start the backend from `LANGBOT_REPO`: + +```bash +cd "$LANGBOT_REPO" +uv run main.py +``` + +Start the standalone frontend from `LANGBOT_WEB_REPO` and point it at the +backend: + +```bash +cd "$LANGBOT_WEB_REPO" +VITE_API_BASE_URL="$LANGBOT_BACKEND_URL" pnpm dev --host 0.0.0.0 +``` + +If `VITE_API_BASE_URL` is missing, browser tests can load the Vite page but send +API requests to the frontend port, which produces false UI failures. + +## 4. Prepare User-Path Fixtures + +For local-agent Debug Chat cases and the user-path performance gate: + +```bash +node scripts/e2e/ensure-local-agent-pipeline.mjs --write-env +``` + +The script: + +- refreshes the local QA login and browser token; +- marks the local wizard as skipped; +- creates or updates a local QA pipeline; +- scans Space LLM models, tests candidates, and switches to the first working + Space model with tested fallback models; +- writes `LANGBOT_PIPELINE_URL`, `LANGBOT_PIPELINE_NAME`, and local-agent + pipeline/model variables into `skills/.env.local`; +- returns `env_issue` when no Space model can be scanned or tested. + +Useful model controls: + +```bash +LANGBOT_E2E_MODEL_TEST_LIMIT=8 +LANGBOT_E2E_MODEL_FALLBACK_COUNT=3 +LANGBOT_E2E_SKIP_MODEL_UUIDS=uuid-a,uuid-b +LANGBOT_E2E_SKIP_MODEL_NAMES=model-a,model-b +LANGBOT_E2E_SCAN_SPACE_MODELS=true +``` + +The setup writes a current-runtime compatibility `max-round` value into the +pipeline config because this backend still reads that field directly during +message truncation. Do not treat it as a long-term QA contract. + +## 5. Run Gates + +Fast contract gate, no live service required: + +```bash +bin/lbs suite run langbot-performance-contract-gate --run-id langbot-contract-local +``` + +Live backend gate: + +```bash +bin/lbs suite run langbot-live-backend-gate --run-id langbot-backend-local +``` + +Browser-visible user-path performance gate: + +```bash +bin/lbs suite plan langbot-user-path-performance-gate +bin/lbs suite run langbot-user-path-performance-gate --run-id langbot-user-path-local --include-manual-check +``` + +`manual_check` means the agent must confirm the declared preconditions for that +run window. When setup automation is declared, run output may stop early with +`env_issue`; fix that environment input before treating the product path as +measured. + +## 6. Read Results + +Suite reports live under `skills/reports/`. Evidence lives under +`skills/reports/evidence//`. + +For performance cases, inspect: + +- `metrics.json` for p50/p95/p99, error rate, and total duration; +- `automation-result.json` for threshold decisions and artifacts; +- `console.log` and `network.log` for frontend/API failures; +- backend logs for provider, runner, WebSocket, or persistence failures. + +Do not call a user-path performance result a LangBot overhead regression until +provider/tool/network time has been separated or ruled out. diff --git a/skills/scripts/e2e/ensure-local-agent-pipeline.mjs b/skills/scripts/e2e/ensure-local-agent-pipeline.mjs index 0962c6bf5..da4336211 100644 --- a/skills/scripts/e2e/ensure-local-agent-pipeline.mjs +++ b/skills/scripts/e2e/ensure-local-agent-pipeline.mjs @@ -10,6 +10,7 @@ import { ensureEvidence, evidencePaths, loadEnvFiles, + redact, resetAndAuthLocalUser, safeScreenshot, setBrowserToken, @@ -17,9 +18,12 @@ import { writeResult, } from "./lib/langbot-e2e.mjs"; -const RUNNER_ID = "plugin:langbot/local-agent/default"; +const RUNNER_ID = "local-agent"; +const SPACE_PROVIDER_UUID = "00000000-0000-0000-0000-000000000000"; const DEFAULT_PIPELINE_NAME = "Agent QA Local Agent Debug Chat"; const DEFAULT_LOCAL_PASSWORD = "LangBotE2ELocalPass!2026"; +const DEFAULT_MODEL_TEST_LIMIT = 8; +const DEFAULT_MODEL_FALLBACK_COUNT = 3; const caseId = "ensure-local-agent-pipeline"; await loadEnvFiles(); @@ -45,11 +49,18 @@ const result = { pipeline_url: "", runner_id: RUNNER_ID, selected_model_id: "", + selected_model_name: "", + fallback_model_ids: [], model_count: 0, + space_model_count: 0, + scanned_space_model_count: 0, + tested_model_count: 0, + model_tests: [], created: false, updated: false, wrote_env: false, auth: null, + wizard: null, browser_token_check: null, page_signal: "", evidence: { @@ -71,6 +82,7 @@ try { const user = env.LANGBOT_E2E_LOGIN_USER || ""; const password = env.LANGBOT_E2E_LOGIN_PASSWORD || DEFAULT_LOCAL_PASSWORD; if (!user) { + result.status = "env_issue"; throw new Error("LANGBOT_E2E_LOGIN_USER is required so this setup can create/update the pipeline via backend API."); } @@ -81,6 +93,13 @@ try { backend_token_check: auth.check, }; + const wizard = await skipWizard({ backendUrl, token: auth.token }); + result.wizard = wizard; + if (wizard.status !== "pass") { + result.status = "fail"; + throw new Error(wizard.reason || "Failed to mark the local QA wizard as skipped."); + } + const prepared = await ensureLocalAgentPipeline({ backendUrl, token: auth.token, @@ -99,6 +118,10 @@ try { LANGBOT_PIPELINE_NAME: result.pipeline_name || pipelineName, LANGBOT_LOCAL_AGENT_PIPELINE_URL: result.pipeline_url, LANGBOT_LOCAL_AGENT_PIPELINE_NAME: result.pipeline_name || pipelineName, + ...(result.selected_model_id ? { + LANGBOT_LOCAL_AGENT_MODEL_UUID: result.selected_model_id, + LANGBOT_E2E_MODEL_UUID: result.selected_model_id, + } : {}), }); result.wrote_env = true; } @@ -127,6 +150,21 @@ try { process.exit(result.status === "pass" ? 0 : result.status === "env_issue" ? 2 : 1); +async function skipWizard({ backendUrl, token }) { + const response = await apiJson(backendUrl, "/api/v1/system/wizard/completed", { + method: "POST", + token, + body: { status: "skipped" }, + }); + const ok = response.status < 400 && response.json.code === 0; + return { + status: ok ? "pass" : "fail", + http_status: response.status, + code: response.json.code ?? null, + reason: ok ? "Wizard marked skipped for local QA." : response.json.msg || "Wizard status update failed.", + }; +} + async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runnerId }) { const [pipelineList, modelList] = await Promise.all([ apiJson(backendUrl, "/api/v1/pipelines", { token }), @@ -149,7 +187,19 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne } const models = modelList.json.data?.models || []; - const selectedModel = models.find((model) => model.uuid) || null; + const skippedModelIds = new Set( + String(env.LANGBOT_E2E_SKIP_MODEL_UUIDS || "") + .split(",") + .map((item) => item.trim()) + .filter(Boolean), + ); + const skippedModelNames = new Set( + String(env.LANGBOT_E2E_SKIP_MODEL_NAMES || "") + .split(",") + .map((item) => item.trim()) + .filter(Boolean), + ); + const spaceModels = models.filter((model) => isSpaceModel(model) && !skippedModelIds.has(model.uuid)); const pipelines = pipelineList.json.data?.pipelines || []; let pipeline = pipelines.find((item) => item.name === pipelineName) || null; let created = false; @@ -170,6 +220,7 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne reason: createdResponse.json.msg || "Failed to create pipeline.", create_status: createdResponse.status, model_count: models.length, + space_model_count: spaceModels.length, }; } const pipelineId = createdResponse.json.data?.uuid || ""; @@ -183,6 +234,7 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne status: "fail", reason: "Pipeline was not created or resolved.", model_count: models.length, + space_model_count: spaceModels.length, }; } @@ -194,27 +246,37 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne get_status: loaded.status, pipeline_id: pipeline.uuid, model_count: models.length, + space_model_count: spaceModels.length, }; } pipeline = loaded.json.data.pipeline; const config = pipeline.config && typeof pipeline.config === "object" ? pipeline.config : {}; const ai = config.ai && typeof config.ai === "object" ? config.ai : {}; - const runnerConfig = ai.runner_config && typeof ai.runner_config === "object" ? ai.runner_config : {}; - const rawExistingLocalAgentConfig = runnerConfig[runnerId] && typeof runnerConfig[runnerId] === "object" - ? runnerConfig[runnerId] + const rawExistingLocalAgentConfig = ai["local-agent"] && typeof ai["local-agent"] === "object" + ? ai["local-agent"] : {}; const existingLocalAgentConfig = rawExistingLocalAgentConfig; const existingModel = existingLocalAgentConfig.model && typeof existingLocalAgentConfig.model === "object" ? existingLocalAgentConfig.model : {}; const requestedModelId = env.LANGBOT_LOCAL_AGENT_MODEL_UUID || env.LANGBOT_E2E_MODEL_UUID || ""; - const selectedModelId = requestedModelId || existingModel.primary || selectedModel?.uuid || ""; + const selected = await selectWorkingSpaceModel({ + backendUrl, + token, + models, + skippedModelIds, + skippedModelNames, + requestedModelId, + existingModelId: existingModel.primary || "", + }); + const selectedModelId = selected.selected_model_id || ""; const localAgentConfig = { timeout: 300, prompt: [{ role: "system", content: "You are a helpful assistant." }], "remove-think": false, "knowledge-bases": [], + "box-session-id-template": "{launcher_type}_{launcher_id}", "retrieval-top-k": 5, "rerank-model": "", "rerank-top-k": 5, @@ -227,9 +289,11 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne "context-keep-recent-tokens": 20000, "context-summary-tokens": 8000, ...existingLocalAgentConfig, + // Current backend truncation still reads this field directly. + "max-round": positiveInteger(existingLocalAgentConfig["max-round"], 10), model: { primary: selectedModelId, - fallbacks: requestedModelId ? [] : Array.isArray(existingModel.fallbacks) ? existingModel.fallbacks : [], + fallbacks: selected.fallback_model_ids || [], }, }; const updatedConfig = { @@ -239,12 +303,10 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne runner: { ...(ai.runner && typeof ai.runner === "object" ? ai.runner : {}), id: runnerId, + runner: runnerId, "expire-time": 0, }, - runner_config: { - ...runnerConfig, - [runnerId]: localAgentConfig, - }, + "local-agent": localAgentConfig, }, }; @@ -265,19 +327,31 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne update_status: updateResponse.status, pipeline_id: pipeline.uuid, model_count: models.length, + space_model_count: spaceModels.length, + scanned_space_model_count: selected.scanned_space_model_count, + tested_model_count: selected.tested_model_count, + model_tests: selected.model_tests, selected_model_id: selectedModelId, + selected_model_name: selected.selected_model_name, + fallback_model_ids: selected.fallback_model_ids, }; } return { status: selectedModelId ? "pass" : "env_issue", reason: selectedModelId - ? "Local-agent pipeline is configured for Debug Chat." - : "Pipeline was created but no LLM model is configured in this LangBot instance.", + ? `Local-agent pipeline is configured for Debug Chat with Space model ${selected.selected_model_name || selectedModelId} and ${selected.fallback_model_ids.length} fallback(s).` + : selected.reason || "No working Space LLM model is configured in this LangBot instance.", pipeline_id: pipeline.uuid, - pipeline_name: pipeline.name, + pipeline_name: pipelineName, model_count: models.length, + space_model_count: spaceModels.length, + scanned_space_model_count: selected.scanned_space_model_count, + tested_model_count: selected.tested_model_count, + model_tests: selected.model_tests, selected_model_id: selectedModelId, + selected_model_name: selected.selected_model_name, + fallback_model_ids: selected.fallback_model_ids, created, updated: true, }; @@ -287,6 +361,229 @@ function isApiFailure(response) { return response.status >= 400 || (response.json.code !== undefined && response.json.code !== 0); } +function isSpaceModel(model) { + const provider = model?.provider && typeof model.provider === "object" ? model.provider : {}; + return model?.provider_uuid === SPACE_PROVIDER_UUID + || provider.uuid === SPACE_PROVIDER_UUID + || provider.requester === "space-chat-completions" + || provider.name === "LangBot Models"; +} + +async function selectWorkingSpaceModel({ + backendUrl, + token, + models, + skippedModelIds, + skippedModelNames, + requestedModelId, + existingModelId, +}) { + const modelTests = []; + const testLimit = positiveInteger(env.LANGBOT_E2E_MODEL_TEST_LIMIT, DEFAULT_MODEL_TEST_LIMIT); + const fallbackCount = positiveInteger(env.LANGBOT_E2E_MODEL_FALLBACK_COUNT, DEFAULT_MODEL_FALLBACK_COUNT); + const workingModels = []; + const spaceModels = rankModels(models.filter((model) => ( + model.uuid + && isSpaceModel(model) + && !skippedModelIds.has(model.uuid) + && !skippedModelNames.has(model.name) + ))); + const requestedModel = requestedModelId + ? spaceModels.find((model) => model.uuid === requestedModelId) || null + : null; + const existingModel = existingModelId + ? spaceModels.find((model) => model.uuid === existingModelId) || null + : null; + const candidates = uniqueCandidates([ + ...(requestedModel ? [existingCandidate(requestedModel, "requested")] : []), + ...(existingModel ? [existingCandidate(existingModel, "existing-pipeline")] : []), + ...spaceModels.map((model) => existingCandidate(model, "configured-space")), + ]); + + let scanResult = { status: "skipped", models: [], reason: "" }; + if (env.LANGBOT_E2E_SCAN_SPACE_MODELS !== "false") { + scanResult = await scanSpaceModels({ backendUrl, token }); + if (scanResult.status === "pass") { + const knownNames = new Set(spaceModels.map((model) => model.name)); + candidates.push(...scanResult.models + .filter((model) => model.name && !knownNames.has(model.name) && !skippedModelNames.has(model.name)) + .map((model) => scannedCandidate(model))); + } + } + + const unique = uniqueCandidates(candidates); + for (const candidate of unique.slice(0, testLimit)) { + const test = await ensureAndTestModel({ backendUrl, token, candidate }); + modelTests.push(test); + if (test.status === "pass" && test.model_uuid) { + workingModels.push(test); + if (workingModels.length >= fallbackCount + 1) break; + } + } + + if (workingModels.length > 0) { + const [primary, ...fallbacks] = workingModels; + return { + status: "pass", + reason: "", + selected_model_id: primary.model_uuid, + selected_model_name: primary.model_name, + fallback_model_ids: fallbacks.map((model) => model.model_uuid), + scanned_space_model_count: scanResult.models.length, + tested_model_count: modelTests.length, + model_tests: modelTests, + }; + } + + const baseReason = unique.length === 0 + ? scanResult.reason || "No Space LLM model candidates are available." + : `No working Space LLM model found after testing ${modelTests.length} candidate(s).`; + return { + status: "env_issue", + reason: requestedModelId && !requestedModel + ? `Requested Space LLM model ${requestedModelId} is missing or skipped; ${baseReason}` + : baseReason, + selected_model_id: "", + selected_model_name: "", + fallback_model_ids: [], + scanned_space_model_count: scanResult.models.length, + tested_model_count: modelTests.length, + model_tests: modelTests, + }; +} + +async function scanSpaceModels({ backendUrl, token }) { + const response = await apiJson( + backendUrl, + `/api/v1/provider/providers/${encodeURIComponent(SPACE_PROVIDER_UUID)}/scan-models?type=llm`, + { token }, + ); + if (isApiFailure(response)) { + return { + status: "env_issue", + models: [], + reason: safeReason(response.json.msg || response.json.message || "Failed to scan Space LLM models."), + }; + } + return { + status: "pass", + models: response.json.data?.models || [], + reason: "", + }; +} + +async function ensureAndTestModel({ backendUrl, token, candidate }) { + let modelUuid = candidate.uuid || ""; + let created = false; + if (!modelUuid) { + const create = await apiJson(backendUrl, "/api/v1/provider/models/llm", { + method: "POST", + token, + body: { + name: candidate.name, + provider_uuid: SPACE_PROVIDER_UUID, + abilities: candidate.abilities || [], + context_length: candidate.context_length ?? null, + extra_args: {}, + prefered_ranking: positiveInteger(candidate.prefered_ranking, 0), + }, + }); + modelUuid = create.json.data?.uuid || ""; + if (isApiFailure(create) || !modelUuid) { + return modelTestResult(candidate, { + status: "fail", + reason: safeReason(create.json.msg || "Failed to create scanned Space model."), + http_status: create.status, + }); + } + created = true; + } + + const test = await apiJson(backendUrl, `/api/v1/provider/models/llm/${encodeURIComponent(modelUuid)}/test`, { + method: "POST", + token, + body: { extra_args: {} }, + }); + const passed = !isApiFailure(test); + if (!passed && created) { + await apiJson(backendUrl, `/api/v1/provider/models/llm/${encodeURIComponent(modelUuid)}`, { + method: "DELETE", + token, + }).catch(() => {}); + } + return modelTestResult(candidate, { + status: passed ? "pass" : "fail", + reason: passed ? "" : safeReason(test.json.msg || test.json.message || "Space model test failed."), + http_status: test.status, + model_uuid: modelUuid, + created, + }); +} + +function modelTestResult(candidate, details) { + return { + source: candidate.source, + model_uuid: details.model_uuid || candidate.uuid || "", + model_name: candidate.name, + status: details.status, + reason: details.reason || "", + http_status: details.http_status ?? null, + created: Boolean(details.created), + }; +} + +function existingCandidate(model, source) { + return { + source, + uuid: model.uuid, + name: model.name, + abilities: model.abilities || [], + context_length: model.context_length, + prefered_ranking: model.prefered_ranking, + }; +} + +function scannedCandidate(model) { + return { + source: "scanned-space", + uuid: "", + name: model.name || model.id, + abilities: model.abilities || [], + context_length: model.context_length, + prefered_ranking: model.prefered_ranking, + }; +} + +function uniqueCandidates(candidates) { + const seen = new Set(); + const result = []; + for (const candidate of candidates) { + const key = candidate.uuid ? `uuid:${candidate.uuid}` : `name:${candidate.name}`; + if (!candidate.name || seen.has(key)) continue; + seen.add(key); + result.push(candidate); + } + return result; +} + +function rankModels(models) { + return [...models].sort((left, right) => { + const leftRank = Number.isFinite(Number(left.prefered_ranking)) ? Number(left.prefered_ranking) : 9999; + const rightRank = Number.isFinite(Number(right.prefered_ranking)) ? Number(right.prefered_ranking) : 9999; + if (leftRank !== rightRank) return leftRank - rightRank; + return String(left.name || "").localeCompare(String(right.name || "")); + }); +} + +function positiveInteger(value, fallback) { + const parsed = Number(value); + return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback; +} + +function safeReason(value) { + return redact(String(value || "")).slice(0, 1000); +} + async function upsertEnvLocal(path, updates) { let text = ""; try { diff --git a/skills/scripts/e2e/lib/langbot-e2e.mjs b/skills/scripts/e2e/lib/langbot-e2e.mjs index fc7a52e4f..a7584c904 100644 --- a/skills/scripts/e2e/lib/langbot-e2e.mjs +++ b/skills/scripts/e2e/lib/langbot-e2e.mjs @@ -72,6 +72,7 @@ export async function writeResult(paths, result) { } export async function loadEnvFiles(paths = ["skills/.env", "skills/.env.local"]) { + const processEnvKeys = new Set(Object.keys(env)); for (const path of paths) { let text = ""; try { @@ -86,7 +87,7 @@ export async function loadEnvFiles(paths = ["skills/.env", "skills/.env.local"]) if (equals <= 0) continue; const key = trimmed.slice(0, equals).trim(); const value = trimmed.slice(equals + 1).trim().replace(/^["']|["']$/g, ""); - if (!(key in env)) env[key] = value; + if (!processEnvKeys.has(key)) env[key] = value; } } } diff --git a/skills/skills.index.json b/skills/skills.index.json index 190cf1305..9a2cbd13d 100644 --- a/skills/skills.index.json +++ b/skills/skills.index.json @@ -1057,8 +1057,13 @@ "metrics" ], "automation": "scripts/e2e/pipeline-debug-chat.mjs", - "setup_automation": [], - "setup_provides_env": [], + "setup_automation": [ + "node:scripts/e2e/ensure-local-agent-pipeline.mjs --write-env" + ], + "setup_provides_env": [ + "LANGBOT_PIPELINE_URL", + "LANGBOT_PIPELINE_NAME" + ], "evidence_required": [ "ui", "screenshot", diff --git a/skills/skills/langbot-env-setup/references/service-startup.md b/skills/skills/langbot-env-setup/references/service-startup.md index 4f7b3ec27..b63960cdb 100644 --- a/skills/skills/langbot-env-setup/references/service-startup.md +++ b/skills/skills/langbot-env-setup/references/service-startup.md @@ -53,7 +53,7 @@ Start the new frontend from the web repo: ```bash cd "$LANGBOT_WEB_REPO" -npm run dev +VITE_API_BASE_URL="$LANGBOT_BACKEND_URL" pnpm dev --host 0.0.0.0 ``` Healthy startup includes: @@ -68,6 +68,10 @@ Quick check: curl -I --max-time 3 "$LANGBOT_FRONTEND_URL" ``` +If `VITE_API_BASE_URL` is missing, Vite still serves the page but frontend API +calls may go to the frontend port instead of the backend port. That produces +false browser failures in login, wizard, pipeline, and Debug Chat cases. + ## Completion Signal Environment setup is not complete until the required frontend/backend URLs are reachable and the chosen browser-control path can open the WebUI. diff --git a/skills/skills/langbot-testing/cases/pipeline-debug-chat-performance.yaml b/skills/skills/langbot-testing/cases/pipeline-debug-chat-performance.yaml index a1a4944b5..266cbb57d 100644 --- a/skills/skills/langbot-testing/cases/pipeline-debug-chat-performance.yaml +++ b/skills/skills/langbot-testing/cases/pipeline-debug-chat-performance.yaml @@ -39,6 +39,11 @@ automation_debug_chat_response_p95_ms: "120000" automation_debug_chat_max_error_rate: "0" metrics_thresholds_json: '{"response_p95_ms":{"max":120000},"error_rate":{"max":0}}' load_profile_json: '{"prompts":1,"browser":true,"path":"Pipeline Debug Chat","metric":"send-to-visible-completion"}' +setup_automation: + - "node:scripts/e2e/ensure-local-agent-pipeline.mjs --write-env" +setup_provides_env: + - LANGBOT_PIPELINE_URL + - LANGBOT_PIPELINE_NAME preconditions: - "LANGBOT_PIPELINE_URL or LANGBOT_PIPELINE_NAME points to the pipeline intended for this Debug Chat performance run." - "The target pipeline is safe to reset Debug Chat history for this run." diff --git a/skills/skills/langbot-testing/references/performance-reliability-testing.md b/skills/skills/langbot-testing/references/performance-reliability-testing.md index 6517858d8..db325318f 100644 --- a/skills/skills/langbot-testing/references/performance-reliability-testing.md +++ b/skills/skills/langbot-testing/references/performance-reliability-testing.md @@ -159,6 +159,29 @@ provider latency, model route health, plugin/runtime logs, WebSocket behavior, and browser console/network evidence before attributing the whole duration to LangBot. +### User-Path Gate Runbook + +1. Start the backend and frontend. The frontend must be launched with + `VITE_API_BASE_URL="$LANGBOT_BACKEND_URL"` so browser API calls reach the + backend. +2. Run `node scripts/e2e/ensure-local-agent-pipeline.mjs --write-env`. The + setup refreshes the local QA login, skips the wizard, prepares a Debug Chat + pipeline, scans Space models, tests candidates, writes tested fallback + models, and writes the selected pipeline/model env values to + `skills/.env.local`. +3. If setup returns `env_issue`, read `model_tests` and provider errors first. + A missing Space key, failed Space scan, or unavailable model route is not a + LangBot performance regression. +4. Run + `bin/lbs suite run langbot-user-path-performance-gate --include-manual-check`. +5. Interpret `response_p95_ms` as browser-visible send-to-completion time. It + includes provider latency; use backend logs and model test evidence to + separate LangBot overhead from the external model route. + +The setup keeps a `max-round` value in the generated pipeline config only +because the current backend truncator still reads that field directly. Do not +use it as a quality requirement for future local-agent behavior. + ## Running The First Gate Start with the reusable suite: