test(skills): prepare user path performance gate

This commit is contained in:
huanghuoguoguo
2026-06-25 10:07:04 +08:00
parent 67437c2f5a
commit 8749a9b56f
9 changed files with 493 additions and 19 deletions
+1
View File
@@ -48,6 +48,7 @@ coverage.xml
.coverage
src/langbot/web/
testsdk/
.qa/
# Build artifacts
/dist
+1 -1
View File
@@ -26,7 +26,7 @@ and LangBot's own Local Agent) working with the LangBot ecosystem.
## Quick start (for an AI agent)
1. Read this README, `AGENTS.md`, and `qa-agent-docs/` to understand the layout.
1. Read this README, `AGENTS.md`, and `docs/user-guide.md` to understand the layout.
2. Read `skills/.env` for shared local defaults. On a new machine, copy
`skills/.env.example` to `skills/.env.local` (gitignored) and override
machine-specific values there. Never commit secrets.
+138
View File
@@ -0,0 +1,138 @@
# LangBot QA Skills User Guide
Use this guide as the first operational path after reading `README.md` and
`AGENTS.md`.
## 1. Configure Local Inputs
Read `skills/.env`, then create `skills/.env.local` for machine-local values.
Do not commit `.env.local`, browser profiles, reports, tokens, API keys, OAuth
state, or provider credentials.
Minimum local fields for live browser QA:
```bash
LANGBOT_REPO=/path/to/LangBot
LANGBOT_WEB_REPO=/path/to/LangBot/web
LANGBOT_BACKEND_URL=http://127.0.0.1:5300
LANGBOT_FRONTEND_URL=http://127.0.0.1:3000
LANGBOT_DEV_FRONTEND_URL=http://127.0.0.1:3000
LANGBOT_BROWSER_PROFILE=/path/to/langbot-browser-profile
LANGBOT_CHROMIUM_EXECUTABLE=/path/to/chromium-or-playwright-chrome
LANGBOT_E2E_LOGIN_USER=qa-local@example.com
```
`LANGBOT_E2E_LOGIN_USER` is a local QA account. The setup automation uses the
LangBot recovery key from the active checkout to initialize or refresh that
local account and write a browser `localStorage` token. It does not need the
user's GitHub or Space credentials.
## 2. Check Readiness
From `skills/`:
```bash
bin/lbs env show
bin/lbs env doctor
bin/lbs validate
bin/lbs index --check
```
`env doctor` should report reachable backend and frontend URLs before live
browser cases are run. Missing Space provider credentials are not a LangBot
product pass; classify them as `env_issue` and configure the local Space
provider before measuring Debug Chat performance.
## 3. Start Services
Start the backend from `LANGBOT_REPO`:
```bash
cd "$LANGBOT_REPO"
uv run main.py
```
Start the standalone frontend from `LANGBOT_WEB_REPO` and point it at the
backend:
```bash
cd "$LANGBOT_WEB_REPO"
VITE_API_BASE_URL="$LANGBOT_BACKEND_URL" pnpm dev --host 0.0.0.0
```
If `VITE_API_BASE_URL` is missing, browser tests can load the Vite page but send
API requests to the frontend port, which produces false UI failures.
## 4. Prepare User-Path Fixtures
For local-agent Debug Chat cases and the user-path performance gate:
```bash
node scripts/e2e/ensure-local-agent-pipeline.mjs --write-env
```
The script:
- refreshes the local QA login and browser token;
- marks the local wizard as skipped;
- creates or updates a local QA pipeline;
- scans Space LLM models, tests candidates, and switches to the first working
Space model with tested fallback models;
- writes `LANGBOT_PIPELINE_URL`, `LANGBOT_PIPELINE_NAME`, and local-agent
pipeline/model variables into `skills/.env.local`;
- returns `env_issue` when no Space model can be scanned or tested.
Useful model controls:
```bash
LANGBOT_E2E_MODEL_TEST_LIMIT=8
LANGBOT_E2E_MODEL_FALLBACK_COUNT=3
LANGBOT_E2E_SKIP_MODEL_UUIDS=uuid-a,uuid-b
LANGBOT_E2E_SKIP_MODEL_NAMES=model-a,model-b
LANGBOT_E2E_SCAN_SPACE_MODELS=true
```
The setup writes a current-runtime compatibility `max-round` value into the
pipeline config because this backend still reads that field directly during
message truncation. Do not treat it as a long-term QA contract.
## 5. Run Gates
Fast contract gate, no live service required:
```bash
bin/lbs suite run langbot-performance-contract-gate --run-id langbot-contract-local
```
Live backend gate:
```bash
bin/lbs suite run langbot-live-backend-gate --run-id langbot-backend-local
```
Browser-visible user-path performance gate:
```bash
bin/lbs suite plan langbot-user-path-performance-gate
bin/lbs suite run langbot-user-path-performance-gate --run-id langbot-user-path-local --include-manual-check
```
`manual_check` means the agent must confirm the declared preconditions for that
run window. When setup automation is declared, run output may stop early with
`env_issue`; fix that environment input before treating the product path as
measured.
## 6. Read Results
Suite reports live under `skills/reports/`. Evidence lives under
`skills/reports/evidence/<run-id>/`.
For performance cases, inspect:
- `metrics.json` for p50/p95/p99, error rate, and total duration;
- `automation-result.json` for threshold decisions and artifacts;
- `console.log` and `network.log` for frontend/API failures;
- backend logs for provider, runner, WebSocket, or persistence failures.
Do not call a user-path performance result a LangBot overhead regression until
provider/tool/network time has been separated or ruled out.
@@ -10,6 +10,7 @@ import {
ensureEvidence,
evidencePaths,
loadEnvFiles,
redact,
resetAndAuthLocalUser,
safeScreenshot,
setBrowserToken,
@@ -17,9 +18,12 @@ import {
writeResult,
} from "./lib/langbot-e2e.mjs";
const RUNNER_ID = "plugin:langbot/local-agent/default";
const RUNNER_ID = "local-agent";
const SPACE_PROVIDER_UUID = "00000000-0000-0000-0000-000000000000";
const DEFAULT_PIPELINE_NAME = "Agent QA Local Agent Debug Chat";
const DEFAULT_LOCAL_PASSWORD = "LangBotE2ELocalPass!2026";
const DEFAULT_MODEL_TEST_LIMIT = 8;
const DEFAULT_MODEL_FALLBACK_COUNT = 3;
const caseId = "ensure-local-agent-pipeline";
await loadEnvFiles();
@@ -45,11 +49,18 @@ const result = {
pipeline_url: "",
runner_id: RUNNER_ID,
selected_model_id: "",
selected_model_name: "",
fallback_model_ids: [],
model_count: 0,
space_model_count: 0,
scanned_space_model_count: 0,
tested_model_count: 0,
model_tests: [],
created: false,
updated: false,
wrote_env: false,
auth: null,
wizard: null,
browser_token_check: null,
page_signal: "",
evidence: {
@@ -71,6 +82,7 @@ try {
const user = env.LANGBOT_E2E_LOGIN_USER || "";
const password = env.LANGBOT_E2E_LOGIN_PASSWORD || DEFAULT_LOCAL_PASSWORD;
if (!user) {
result.status = "env_issue";
throw new Error("LANGBOT_E2E_LOGIN_USER is required so this setup can create/update the pipeline via backend API.");
}
@@ -81,6 +93,13 @@ try {
backend_token_check: auth.check,
};
const wizard = await skipWizard({ backendUrl, token: auth.token });
result.wizard = wizard;
if (wizard.status !== "pass") {
result.status = "fail";
throw new Error(wizard.reason || "Failed to mark the local QA wizard as skipped.");
}
const prepared = await ensureLocalAgentPipeline({
backendUrl,
token: auth.token,
@@ -99,6 +118,10 @@ try {
LANGBOT_PIPELINE_NAME: result.pipeline_name || pipelineName,
LANGBOT_LOCAL_AGENT_PIPELINE_URL: result.pipeline_url,
LANGBOT_LOCAL_AGENT_PIPELINE_NAME: result.pipeline_name || pipelineName,
...(result.selected_model_id ? {
LANGBOT_LOCAL_AGENT_MODEL_UUID: result.selected_model_id,
LANGBOT_E2E_MODEL_UUID: result.selected_model_id,
} : {}),
});
result.wrote_env = true;
}
@@ -127,6 +150,21 @@ try {
process.exit(result.status === "pass" ? 0 : result.status === "env_issue" ? 2 : 1);
async function skipWizard({ backendUrl, token }) {
const response = await apiJson(backendUrl, "/api/v1/system/wizard/completed", {
method: "POST",
token,
body: { status: "skipped" },
});
const ok = response.status < 400 && response.json.code === 0;
return {
status: ok ? "pass" : "fail",
http_status: response.status,
code: response.json.code ?? null,
reason: ok ? "Wizard marked skipped for local QA." : response.json.msg || "Wizard status update failed.",
};
}
async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runnerId }) {
const [pipelineList, modelList] = await Promise.all([
apiJson(backendUrl, "/api/v1/pipelines", { token }),
@@ -149,7 +187,19 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
}
const models = modelList.json.data?.models || [];
const selectedModel = models.find((model) => model.uuid) || null;
const skippedModelIds = new Set(
String(env.LANGBOT_E2E_SKIP_MODEL_UUIDS || "")
.split(",")
.map((item) => item.trim())
.filter(Boolean),
);
const skippedModelNames = new Set(
String(env.LANGBOT_E2E_SKIP_MODEL_NAMES || "")
.split(",")
.map((item) => item.trim())
.filter(Boolean),
);
const spaceModels = models.filter((model) => isSpaceModel(model) && !skippedModelIds.has(model.uuid));
const pipelines = pipelineList.json.data?.pipelines || [];
let pipeline = pipelines.find((item) => item.name === pipelineName) || null;
let created = false;
@@ -170,6 +220,7 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
reason: createdResponse.json.msg || "Failed to create pipeline.",
create_status: createdResponse.status,
model_count: models.length,
space_model_count: spaceModels.length,
};
}
const pipelineId = createdResponse.json.data?.uuid || "";
@@ -183,6 +234,7 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
status: "fail",
reason: "Pipeline was not created or resolved.",
model_count: models.length,
space_model_count: spaceModels.length,
};
}
@@ -194,27 +246,37 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
get_status: loaded.status,
pipeline_id: pipeline.uuid,
model_count: models.length,
space_model_count: spaceModels.length,
};
}
pipeline = loaded.json.data.pipeline;
const config = pipeline.config && typeof pipeline.config === "object" ? pipeline.config : {};
const ai = config.ai && typeof config.ai === "object" ? config.ai : {};
const runnerConfig = ai.runner_config && typeof ai.runner_config === "object" ? ai.runner_config : {};
const rawExistingLocalAgentConfig = runnerConfig[runnerId] && typeof runnerConfig[runnerId] === "object"
? runnerConfig[runnerId]
const rawExistingLocalAgentConfig = ai["local-agent"] && typeof ai["local-agent"] === "object"
? ai["local-agent"]
: {};
const existingLocalAgentConfig = rawExistingLocalAgentConfig;
const existingModel = existingLocalAgentConfig.model && typeof existingLocalAgentConfig.model === "object"
? existingLocalAgentConfig.model
: {};
const requestedModelId = env.LANGBOT_LOCAL_AGENT_MODEL_UUID || env.LANGBOT_E2E_MODEL_UUID || "";
const selectedModelId = requestedModelId || existingModel.primary || selectedModel?.uuid || "";
const selected = await selectWorkingSpaceModel({
backendUrl,
token,
models,
skippedModelIds,
skippedModelNames,
requestedModelId,
existingModelId: existingModel.primary || "",
});
const selectedModelId = selected.selected_model_id || "";
const localAgentConfig = {
timeout: 300,
prompt: [{ role: "system", content: "You are a helpful assistant." }],
"remove-think": false,
"knowledge-bases": [],
"box-session-id-template": "{launcher_type}_{launcher_id}",
"retrieval-top-k": 5,
"rerank-model": "",
"rerank-top-k": 5,
@@ -227,9 +289,11 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
"context-keep-recent-tokens": 20000,
"context-summary-tokens": 8000,
...existingLocalAgentConfig,
// Current backend truncation still reads this field directly.
"max-round": positiveInteger(existingLocalAgentConfig["max-round"], 10),
model: {
primary: selectedModelId,
fallbacks: requestedModelId ? [] : Array.isArray(existingModel.fallbacks) ? existingModel.fallbacks : [],
fallbacks: selected.fallback_model_ids || [],
},
};
const updatedConfig = {
@@ -239,12 +303,10 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
runner: {
...(ai.runner && typeof ai.runner === "object" ? ai.runner : {}),
id: runnerId,
runner: runnerId,
"expire-time": 0,
},
runner_config: {
...runnerConfig,
[runnerId]: localAgentConfig,
},
"local-agent": localAgentConfig,
},
};
@@ -265,19 +327,31 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
update_status: updateResponse.status,
pipeline_id: pipeline.uuid,
model_count: models.length,
space_model_count: spaceModels.length,
scanned_space_model_count: selected.scanned_space_model_count,
tested_model_count: selected.tested_model_count,
model_tests: selected.model_tests,
selected_model_id: selectedModelId,
selected_model_name: selected.selected_model_name,
fallback_model_ids: selected.fallback_model_ids,
};
}
return {
status: selectedModelId ? "pass" : "env_issue",
reason: selectedModelId
? "Local-agent pipeline is configured for Debug Chat."
: "Pipeline was created but no LLM model is configured in this LangBot instance.",
? `Local-agent pipeline is configured for Debug Chat with Space model ${selected.selected_model_name || selectedModelId} and ${selected.fallback_model_ids.length} fallback(s).`
: selected.reason || "No working Space LLM model is configured in this LangBot instance.",
pipeline_id: pipeline.uuid,
pipeline_name: pipeline.name,
pipeline_name: pipelineName,
model_count: models.length,
space_model_count: spaceModels.length,
scanned_space_model_count: selected.scanned_space_model_count,
tested_model_count: selected.tested_model_count,
model_tests: selected.model_tests,
selected_model_id: selectedModelId,
selected_model_name: selected.selected_model_name,
fallback_model_ids: selected.fallback_model_ids,
created,
updated: true,
};
@@ -287,6 +361,229 @@ function isApiFailure(response) {
return response.status >= 400 || (response.json.code !== undefined && response.json.code !== 0);
}
function isSpaceModel(model) {
const provider = model?.provider && typeof model.provider === "object" ? model.provider : {};
return model?.provider_uuid === SPACE_PROVIDER_UUID
|| provider.uuid === SPACE_PROVIDER_UUID
|| provider.requester === "space-chat-completions"
|| provider.name === "LangBot Models";
}
async function selectWorkingSpaceModel({
backendUrl,
token,
models,
skippedModelIds,
skippedModelNames,
requestedModelId,
existingModelId,
}) {
const modelTests = [];
const testLimit = positiveInteger(env.LANGBOT_E2E_MODEL_TEST_LIMIT, DEFAULT_MODEL_TEST_LIMIT);
const fallbackCount = positiveInteger(env.LANGBOT_E2E_MODEL_FALLBACK_COUNT, DEFAULT_MODEL_FALLBACK_COUNT);
const workingModels = [];
const spaceModels = rankModels(models.filter((model) => (
model.uuid
&& isSpaceModel(model)
&& !skippedModelIds.has(model.uuid)
&& !skippedModelNames.has(model.name)
)));
const requestedModel = requestedModelId
? spaceModels.find((model) => model.uuid === requestedModelId) || null
: null;
const existingModel = existingModelId
? spaceModels.find((model) => model.uuid === existingModelId) || null
: null;
const candidates = uniqueCandidates([
...(requestedModel ? [existingCandidate(requestedModel, "requested")] : []),
...(existingModel ? [existingCandidate(existingModel, "existing-pipeline")] : []),
...spaceModels.map((model) => existingCandidate(model, "configured-space")),
]);
let scanResult = { status: "skipped", models: [], reason: "" };
if (env.LANGBOT_E2E_SCAN_SPACE_MODELS !== "false") {
scanResult = await scanSpaceModels({ backendUrl, token });
if (scanResult.status === "pass") {
const knownNames = new Set(spaceModels.map((model) => model.name));
candidates.push(...scanResult.models
.filter((model) => model.name && !knownNames.has(model.name) && !skippedModelNames.has(model.name))
.map((model) => scannedCandidate(model)));
}
}
const unique = uniqueCandidates(candidates);
for (const candidate of unique.slice(0, testLimit)) {
const test = await ensureAndTestModel({ backendUrl, token, candidate });
modelTests.push(test);
if (test.status === "pass" && test.model_uuid) {
workingModels.push(test);
if (workingModels.length >= fallbackCount + 1) break;
}
}
if (workingModels.length > 0) {
const [primary, ...fallbacks] = workingModels;
return {
status: "pass",
reason: "",
selected_model_id: primary.model_uuid,
selected_model_name: primary.model_name,
fallback_model_ids: fallbacks.map((model) => model.model_uuid),
scanned_space_model_count: scanResult.models.length,
tested_model_count: modelTests.length,
model_tests: modelTests,
};
}
const baseReason = unique.length === 0
? scanResult.reason || "No Space LLM model candidates are available."
: `No working Space LLM model found after testing ${modelTests.length} candidate(s).`;
return {
status: "env_issue",
reason: requestedModelId && !requestedModel
? `Requested Space LLM model ${requestedModelId} is missing or skipped; ${baseReason}`
: baseReason,
selected_model_id: "",
selected_model_name: "",
fallback_model_ids: [],
scanned_space_model_count: scanResult.models.length,
tested_model_count: modelTests.length,
model_tests: modelTests,
};
}
async function scanSpaceModels({ backendUrl, token }) {
const response = await apiJson(
backendUrl,
`/api/v1/provider/providers/${encodeURIComponent(SPACE_PROVIDER_UUID)}/scan-models?type=llm`,
{ token },
);
if (isApiFailure(response)) {
return {
status: "env_issue",
models: [],
reason: safeReason(response.json.msg || response.json.message || "Failed to scan Space LLM models."),
};
}
return {
status: "pass",
models: response.json.data?.models || [],
reason: "",
};
}
async function ensureAndTestModel({ backendUrl, token, candidate }) {
let modelUuid = candidate.uuid || "";
let created = false;
if (!modelUuid) {
const create = await apiJson(backendUrl, "/api/v1/provider/models/llm", {
method: "POST",
token,
body: {
name: candidate.name,
provider_uuid: SPACE_PROVIDER_UUID,
abilities: candidate.abilities || [],
context_length: candidate.context_length ?? null,
extra_args: {},
prefered_ranking: positiveInteger(candidate.prefered_ranking, 0),
},
});
modelUuid = create.json.data?.uuid || "";
if (isApiFailure(create) || !modelUuid) {
return modelTestResult(candidate, {
status: "fail",
reason: safeReason(create.json.msg || "Failed to create scanned Space model."),
http_status: create.status,
});
}
created = true;
}
const test = await apiJson(backendUrl, `/api/v1/provider/models/llm/${encodeURIComponent(modelUuid)}/test`, {
method: "POST",
token,
body: { extra_args: {} },
});
const passed = !isApiFailure(test);
if (!passed && created) {
await apiJson(backendUrl, `/api/v1/provider/models/llm/${encodeURIComponent(modelUuid)}`, {
method: "DELETE",
token,
}).catch(() => {});
}
return modelTestResult(candidate, {
status: passed ? "pass" : "fail",
reason: passed ? "" : safeReason(test.json.msg || test.json.message || "Space model test failed."),
http_status: test.status,
model_uuid: modelUuid,
created,
});
}
function modelTestResult(candidate, details) {
return {
source: candidate.source,
model_uuid: details.model_uuid || candidate.uuid || "",
model_name: candidate.name,
status: details.status,
reason: details.reason || "",
http_status: details.http_status ?? null,
created: Boolean(details.created),
};
}
function existingCandidate(model, source) {
return {
source,
uuid: model.uuid,
name: model.name,
abilities: model.abilities || [],
context_length: model.context_length,
prefered_ranking: model.prefered_ranking,
};
}
function scannedCandidate(model) {
return {
source: "scanned-space",
uuid: "",
name: model.name || model.id,
abilities: model.abilities || [],
context_length: model.context_length,
prefered_ranking: model.prefered_ranking,
};
}
function uniqueCandidates(candidates) {
const seen = new Set();
const result = [];
for (const candidate of candidates) {
const key = candidate.uuid ? `uuid:${candidate.uuid}` : `name:${candidate.name}`;
if (!candidate.name || seen.has(key)) continue;
seen.add(key);
result.push(candidate);
}
return result;
}
function rankModels(models) {
return [...models].sort((left, right) => {
const leftRank = Number.isFinite(Number(left.prefered_ranking)) ? Number(left.prefered_ranking) : 9999;
const rightRank = Number.isFinite(Number(right.prefered_ranking)) ? Number(right.prefered_ranking) : 9999;
if (leftRank !== rightRank) return leftRank - rightRank;
return String(left.name || "").localeCompare(String(right.name || ""));
});
}
function positiveInteger(value, fallback) {
const parsed = Number(value);
return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}
function safeReason(value) {
return redact(String(value || "")).slice(0, 1000);
}
async function upsertEnvLocal(path, updates) {
let text = "";
try {
+2 -1
View File
@@ -72,6 +72,7 @@ export async function writeResult(paths, result) {
}
export async function loadEnvFiles(paths = ["skills/.env", "skills/.env.local"]) {
const processEnvKeys = new Set(Object.keys(env));
for (const path of paths) {
let text = "";
try {
@@ -86,7 +87,7 @@ export async function loadEnvFiles(paths = ["skills/.env", "skills/.env.local"])
if (equals <= 0) continue;
const key = trimmed.slice(0, equals).trim();
const value = trimmed.slice(equals + 1).trim().replace(/^["']|["']$/g, "");
if (!(key in env)) env[key] = value;
if (!processEnvKeys.has(key)) env[key] = value;
}
}
}
+7 -2
View File
@@ -1057,8 +1057,13 @@
"metrics"
],
"automation": "scripts/e2e/pipeline-debug-chat.mjs",
"setup_automation": [],
"setup_provides_env": [],
"setup_automation": [
"node:scripts/e2e/ensure-local-agent-pipeline.mjs --write-env"
],
"setup_provides_env": [
"LANGBOT_PIPELINE_URL",
"LANGBOT_PIPELINE_NAME"
],
"evidence_required": [
"ui",
"screenshot",
@@ -53,7 +53,7 @@ Start the new frontend from the web repo:
```bash
cd "$LANGBOT_WEB_REPO"
npm run dev
VITE_API_BASE_URL="$LANGBOT_BACKEND_URL" pnpm dev --host 0.0.0.0
```
Healthy startup includes:
@@ -68,6 +68,10 @@ Quick check:
curl -I --max-time 3 "$LANGBOT_FRONTEND_URL"
```
If `VITE_API_BASE_URL` is missing, Vite still serves the page but frontend API
calls may go to the frontend port instead of the backend port. That produces
false browser failures in login, wizard, pipeline, and Debug Chat cases.
## Completion Signal
Environment setup is not complete until the required frontend/backend URLs are reachable and the chosen browser-control path can open the WebUI.
@@ -39,6 +39,11 @@ automation_debug_chat_response_p95_ms: "120000"
automation_debug_chat_max_error_rate: "0"
metrics_thresholds_json: '{"response_p95_ms":{"max":120000},"error_rate":{"max":0}}'
load_profile_json: '{"prompts":1,"browser":true,"path":"Pipeline Debug Chat","metric":"send-to-visible-completion"}'
setup_automation:
- "node:scripts/e2e/ensure-local-agent-pipeline.mjs --write-env"
setup_provides_env:
- LANGBOT_PIPELINE_URL
- LANGBOT_PIPELINE_NAME
preconditions:
- "LANGBOT_PIPELINE_URL or LANGBOT_PIPELINE_NAME points to the pipeline intended for this Debug Chat performance run."
- "The target pipeline is safe to reset Debug Chat history for this run."
@@ -159,6 +159,29 @@ provider latency, model route health, plugin/runtime logs, WebSocket behavior,
and browser console/network evidence before attributing the whole duration to
LangBot.
### User-Path Gate Runbook
1. Start the backend and frontend. The frontend must be launched with
`VITE_API_BASE_URL="$LANGBOT_BACKEND_URL"` so browser API calls reach the
backend.
2. Run `node scripts/e2e/ensure-local-agent-pipeline.mjs --write-env`. The
setup refreshes the local QA login, skips the wizard, prepares a Debug Chat
pipeline, scans Space models, tests candidates, writes tested fallback
models, and writes the selected pipeline/model env values to
`skills/.env.local`.
3. If setup returns `env_issue`, read `model_tests` and provider errors first.
A missing Space key, failed Space scan, or unavailable model route is not a
LangBot performance regression.
4. Run
`bin/lbs suite run langbot-user-path-performance-gate --include-manual-check`.
5. Interpret `response_p95_ms` as browser-visible send-to-completion time. It
includes provider latency; use backend logs and model test evidence to
separate LangBot overhead from the external model route.
The setup keeps a `max-round` value in the generated pipeline config only
because the current backend truncator still reads that field directly. Do not
use it as a quality requirement for future local-agent behavior.
## Running The First Gate
Start with the reusable suite: