mirror of
https://github.com/langbot-app/LangBot.git
synced 2026-06-25 06:54:19 +00:00
test(skills): prepare user path performance gate
@@ -48,6 +48,7 @@ coverage.xml
.coverage
src/langbot/web/
testsdk/
.qa/

# Build artifacts
/dist
@@ -26,7 +26,7 @@ and LangBot's own Local Agent) working with the LangBot ecosystem.

## Quick start (for an AI agent)

1. Read this README, `AGENTS.md`, and `qa-agent-docs/` to understand the layout.
1. Read this README, `AGENTS.md`, and `docs/user-guide.md` to understand the layout.
2. Read `skills/.env` for shared local defaults. On a new machine, copy
   `skills/.env.example` to `skills/.env.local` (gitignored) and override
   machine-specific values there. Never commit secrets.
@@ -0,0 +1,138 @@
# LangBot QA Skills User Guide

Use this guide as the first operational path after reading `README.md` and
`AGENTS.md`.

## 1. Configure Local Inputs

Read `skills/.env`, then create `skills/.env.local` for machine-local values.
Do not commit `.env.local`, browser profiles, reports, tokens, API keys, OAuth
state, or provider credentials.

Minimum local fields for live browser QA:

```bash
LANGBOT_REPO=/path/to/LangBot
LANGBOT_WEB_REPO=/path/to/LangBot/web
LANGBOT_BACKEND_URL=http://127.0.0.1:5300
LANGBOT_FRONTEND_URL=http://127.0.0.1:3000
LANGBOT_DEV_FRONTEND_URL=http://127.0.0.1:3000
LANGBOT_BROWSER_PROFILE=/path/to/langbot-browser-profile
LANGBOT_CHROMIUM_EXECUTABLE=/path/to/chromium-or-playwright-chrome
LANGBOT_E2E_LOGIN_USER=qa-local@example.com
```

`LANGBOT_E2E_LOGIN_USER` is a local QA account. The setup automation uses the
LangBot recovery key from the active checkout to initialize or refresh that
local account and write a browser `localStorage` token. It does not need the
user's GitHub or Space credentials.

## 2. Check Readiness

From `skills/`:

```bash
bin/lbs env show
bin/lbs env doctor
bin/lbs validate
bin/lbs index --check
```

`env doctor` should report reachable backend and frontend URLs before live
browser cases are run. Missing Space provider credentials are not a LangBot
product pass; classify them as `env_issue` and configure the local Space
provider before measuring Debug Chat performance.

## 3. Start Services

Start the backend from `LANGBOT_REPO`:

```bash
cd "$LANGBOT_REPO"
uv run main.py
```

Start the standalone frontend from `LANGBOT_WEB_REPO` and point it at the
backend:

```bash
cd "$LANGBOT_WEB_REPO"
VITE_API_BASE_URL="$LANGBOT_BACKEND_URL" pnpm dev --host 0.0.0.0
```

If `VITE_API_BASE_URL` is missing, browser tests can load the Vite page but send
API requests to the frontend port, which produces false UI failures.

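The failure mode can be illustrated with plain URL resolution. This is a minimal sketch, assuming the frontend resolves API paths against `VITE_API_BASE_URL` and falls back to the page origin when the value is empty. `resolveApiUrl` is a hypothetical helper for illustration, not LangBot frontend code.

```javascript
// Hypothetical helper: resolve an API path against the configured base URL,
// falling back to the page origin when no base URL is configured.
function resolveApiUrl(apiPath, apiBaseUrl, pageOrigin) {
  const base = apiBaseUrl && apiBaseUrl.trim() ? apiBaseUrl : pageOrigin;
  return new URL(apiPath, base).toString();
}

const frontend = "http://127.0.0.1:3000"; // LANGBOT_FRONTEND_URL from the example above
const backend = "http://127.0.0.1:5300"; // LANGBOT_BACKEND_URL from the example above

// With the base URL set, the request targets the backend port.
const withBase = resolveApiUrl("/api/v1/pipelines", backend, frontend);
// Without it, the same request lands on the Vite dev server port.
const withoutBase = resolveApiUrl("/api/v1/pipelines", "", frontend);

console.log(withBase);
console.log(withoutBase);
```

When a browser case fails right after page load, comparing the request host in `network.log` with `LANGBOT_BACKEND_URL` is a quick way to spot this misconfiguration.
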
## 4. Prepare User-Path Fixtures

For local-agent Debug Chat cases and the user-path performance gate:

```bash
node scripts/e2e/ensure-local-agent-pipeline.mjs --write-env
```

The script:

- refreshes the local QA login and browser token;
- marks the local wizard as skipped;
- creates or updates a local QA pipeline;
- scans Space LLM models, tests candidates, and switches to the first working
  Space model with tested fallback models;
- writes `LANGBOT_PIPELINE_URL`, `LANGBOT_PIPELINE_NAME`, and local-agent
  pipeline/model variables into `skills/.env.local`;
- returns `env_issue` when no Space model can be scanned or tested.

Useful model controls:

```bash
LANGBOT_E2E_MODEL_TEST_LIMIT=8
LANGBOT_E2E_MODEL_FALLBACK_COUNT=3
LANGBOT_E2E_SKIP_MODEL_UUIDS=uuid-a,uuid-b
LANGBOT_E2E_SKIP_MODEL_NAMES=model-a,model-b
LANGBOT_E2E_SCAN_SPACE_MODELS=true
```

The setup writes a current-runtime compatibility `max-round` value into the
pipeline config because this backend still reads that field directly during
message truncation. Do not treat it as a long-term QA contract.

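How the model controls interact can be sketched as follows. This is a simplified, synchronous approximation, assuming the test limit caps how many candidates are tried and the first working model becomes the primary. `pickModels`, `parseList`, and the `isWorking` callback are hypothetical stand-ins; the real script tests each model through the backend API.

```javascript
// Any value that is not a positive integer falls back to the default.
function positiveInteger(value, fallback) {
  const parsed = Number(value);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// Split a comma-separated env value into a Set of trimmed, non-empty entries.
function parseList(value) {
  return new Set(String(value || "").split(",").map((item) => item.trim()).filter(Boolean));
}

// Hypothetical stand-in for the selection step: `isWorking` replaces the
// backend model test call.
function pickModels(candidates, env, isWorking) {
  const limit = positiveInteger(env.LANGBOT_E2E_MODEL_TEST_LIMIT, 8);
  const fallbackCount = positiveInteger(env.LANGBOT_E2E_MODEL_FALLBACK_COUNT, 3);
  const skipIds = parseList(env.LANGBOT_E2E_SKIP_MODEL_UUIDS);
  const skipNames = parseList(env.LANGBOT_E2E_SKIP_MODEL_NAMES);
  const eligible = candidates.filter((m) => !skipIds.has(m.uuid) && !skipNames.has(m.name));
  const working = [];
  let tested = 0;
  for (const model of eligible.slice(0, limit)) {
    tested += 1;
    if (isWorking(model)) working.push(model.uuid);
    // Stop once one primary plus the requested number of fallbacks is found.
    if (working.length >= fallbackCount + 1) break;
  }
  const [primary = "", ...fallbacks] = working;
  return { status: primary ? "pass" : "env_issue", primary, fallbacks, tested };
}

const picked = pickModels(
  [
    { uuid: "uuid-a", name: "model-a" },
    { uuid: "uuid-b", name: "model-b" },
    { uuid: "uuid-c", name: "model-c" },
    { uuid: "uuid-d", name: "model-d" },
  ],
  { LANGBOT_E2E_MODEL_FALLBACK_COUNT: "1", LANGBOT_E2E_SKIP_MODEL_UUIDS: "uuid-a" },
  (model) => model.name !== "model-b", // pretend model-b fails its test
);
console.log(picked);
```

In this example `uuid-a` is skipped, `model-b` fails, `uuid-c` becomes the primary, and `uuid-d` is the single fallback. An empty result maps to `env_issue`, not to a product failure.
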
## 5. Run Gates

Fast contract gate, no live service required:

```bash
bin/lbs suite run langbot-performance-contract-gate --run-id langbot-contract-local
```

Live backend gate:

```bash
bin/lbs suite run langbot-live-backend-gate --run-id langbot-backend-local
```

Browser-visible user-path performance gate:

```bash
bin/lbs suite plan langbot-user-path-performance-gate
bin/lbs suite run langbot-user-path-performance-gate --run-id langbot-user-path-local --include-manual-check
```

`manual_check` means the agent must confirm the declared preconditions for that
run window. When setup automation is declared, run output may stop early with
`env_issue`; fix that environment input before treating the product path as
measured.

## 6. Read Results

Suite reports live under `skills/reports/`. Evidence lives under
`skills/reports/evidence/<run-id>/`.

For performance cases, inspect:

- `metrics.json` for p50/p95/p99, error rate, and total duration;
- `automation-result.json` for threshold decisions and artifacts;
- `console.log` and `network.log` for frontend/API failures;
- backend logs for provider, runner, WebSocket, or persistence failures.

Do not call a user-path performance result a LangBot overhead regression until
provider/tool/network time has been separated or ruled out.
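Separating provider time from LangBot overhead can be sketched as below. This is an illustration only: the sample shape (`total_ms`, `provider_ms`) is hypothetical and is not the documented layout of `metrics.json`, and the nearest-rank percentile is one common definition, not necessarily the one the gate uses.

```javascript
// Nearest-rank percentile over a list of durations in milliseconds.
function percentile(values, p) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Hypothetical split: browser-visible time minus provider time per prompt.
function overheadMs(samples) {
  return samples.map((sample) => sample.total_ms - sample.provider_ms);
}

// Made-up sample data for illustration; not measured results.
const samples = [
  { total_ms: 4200, provider_ms: 3900 },
  { total_ms: 5100, provider_ms: 4700 },
  { total_ms: 9800, provider_ms: 9300 },
  { total_ms: 4600, provider_ms: 4250 },
];

const totalP95 = percentile(samples.map((sample) => sample.total_ms), 95);
const overheadP95 = percentile(overheadMs(samples), 95);
console.log({ totalP95, overheadP95 });
```

With these made-up numbers the slow total is dominated by provider time while the residual overhead stays small, which is the kind of evidence needed before calling something a LangBot regression.
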
@@ -10,6 +10,7 @@ import {
  ensureEvidence,
  evidencePaths,
  loadEnvFiles,
  redact,
  resetAndAuthLocalUser,
  safeScreenshot,
  setBrowserToken,
@@ -17,9 +18,12 @@ import {
  writeResult,
} from "./lib/langbot-e2e.mjs";

const RUNNER_ID = "plugin:langbot/local-agent/default";
const RUNNER_ID = "local-agent";
const SPACE_PROVIDER_UUID = "00000000-0000-0000-0000-000000000000";
const DEFAULT_PIPELINE_NAME = "Agent QA Local Agent Debug Chat";
const DEFAULT_LOCAL_PASSWORD = "LangBotE2ELocalPass!2026";
const DEFAULT_MODEL_TEST_LIMIT = 8;
const DEFAULT_MODEL_FALLBACK_COUNT = 3;
const caseId = "ensure-local-agent-pipeline";

await loadEnvFiles();
@@ -45,11 +49,18 @@ const result = {
  pipeline_url: "",
  runner_id: RUNNER_ID,
  selected_model_id: "",
  selected_model_name: "",
  fallback_model_ids: [],
  model_count: 0,
  space_model_count: 0,
  scanned_space_model_count: 0,
  tested_model_count: 0,
  model_tests: [],
  created: false,
  updated: false,
  wrote_env: false,
  auth: null,
  wizard: null,
  browser_token_check: null,
  page_signal: "",
  evidence: {
@@ -71,6 +82,7 @@ try {
  const user = env.LANGBOT_E2E_LOGIN_USER || "";
  const password = env.LANGBOT_E2E_LOGIN_PASSWORD || DEFAULT_LOCAL_PASSWORD;
  if (!user) {
    result.status = "env_issue";
    throw new Error("LANGBOT_E2E_LOGIN_USER is required so this setup can create/update the pipeline via backend API.");
  }

@@ -81,6 +93,13 @@ try {
    backend_token_check: auth.check,
  };

  const wizard = await skipWizard({ backendUrl, token: auth.token });
  result.wizard = wizard;
  if (wizard.status !== "pass") {
    result.status = "fail";
    throw new Error(wizard.reason || "Failed to mark the local QA wizard as skipped.");
  }

  const prepared = await ensureLocalAgentPipeline({
    backendUrl,
    token: auth.token,
@@ -99,6 +118,10 @@ try {
      LANGBOT_PIPELINE_NAME: result.pipeline_name || pipelineName,
      LANGBOT_LOCAL_AGENT_PIPELINE_URL: result.pipeline_url,
      LANGBOT_LOCAL_AGENT_PIPELINE_NAME: result.pipeline_name || pipelineName,
      ...(result.selected_model_id ? {
        LANGBOT_LOCAL_AGENT_MODEL_UUID: result.selected_model_id,
        LANGBOT_E2E_MODEL_UUID: result.selected_model_id,
      } : {}),
    });
    result.wrote_env = true;
  }
@@ -127,6 +150,21 @@ try {

process.exit(result.status === "pass" ? 0 : result.status === "env_issue" ? 2 : 1);

async function skipWizard({ backendUrl, token }) {
  const response = await apiJson(backendUrl, "/api/v1/system/wizard/completed", {
    method: "POST",
    token,
    body: { status: "skipped" },
  });
  const ok = response.status < 400 && response.json.code === 0;
  return {
    status: ok ? "pass" : "fail",
    http_status: response.status,
    code: response.json.code ?? null,
    reason: ok ? "Wizard marked skipped for local QA." : response.json.msg || "Wizard status update failed.",
  };
}

async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runnerId }) {
  const [pipelineList, modelList] = await Promise.all([
    apiJson(backendUrl, "/api/v1/pipelines", { token }),
@@ -149,7 +187,19 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
  }

  const models = modelList.json.data?.models || [];
  const selectedModel = models.find((model) => model.uuid) || null;
  const skippedModelIds = new Set(
    String(env.LANGBOT_E2E_SKIP_MODEL_UUIDS || "")
      .split(",")
      .map((item) => item.trim())
      .filter(Boolean),
  );
  const skippedModelNames = new Set(
    String(env.LANGBOT_E2E_SKIP_MODEL_NAMES || "")
      .split(",")
      .map((item) => item.trim())
      .filter(Boolean),
  );
  const spaceModels = models.filter((model) => isSpaceModel(model) && !skippedModelIds.has(model.uuid));
  const pipelines = pipelineList.json.data?.pipelines || [];
  let pipeline = pipelines.find((item) => item.name === pipelineName) || null;
  let created = false;
@@ -170,6 +220,7 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
        reason: createdResponse.json.msg || "Failed to create pipeline.",
        create_status: createdResponse.status,
        model_count: models.length,
        space_model_count: spaceModels.length,
      };
    }
    const pipelineId = createdResponse.json.data?.uuid || "";
@@ -183,6 +234,7 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
      status: "fail",
      reason: "Pipeline was not created or resolved.",
      model_count: models.length,
      space_model_count: spaceModels.length,
    };
  }

@@ -194,27 +246,37 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
      get_status: loaded.status,
      pipeline_id: pipeline.uuid,
      model_count: models.length,
      space_model_count: spaceModels.length,
    };
  }
  pipeline = loaded.json.data.pipeline;

  const config = pipeline.config && typeof pipeline.config === "object" ? pipeline.config : {};
  const ai = config.ai && typeof config.ai === "object" ? config.ai : {};
  const runnerConfig = ai.runner_config && typeof ai.runner_config === "object" ? ai.runner_config : {};
  const rawExistingLocalAgentConfig = runnerConfig[runnerId] && typeof runnerConfig[runnerId] === "object"
    ? runnerConfig[runnerId]
  const rawExistingLocalAgentConfig = ai["local-agent"] && typeof ai["local-agent"] === "object"
    ? ai["local-agent"]
    : {};
  const existingLocalAgentConfig = rawExistingLocalAgentConfig;
  const existingModel = existingLocalAgentConfig.model && typeof existingLocalAgentConfig.model === "object"
    ? existingLocalAgentConfig.model
    : {};
  const requestedModelId = env.LANGBOT_LOCAL_AGENT_MODEL_UUID || env.LANGBOT_E2E_MODEL_UUID || "";
  const selectedModelId = requestedModelId || existingModel.primary || selectedModel?.uuid || "";
  const selected = await selectWorkingSpaceModel({
    backendUrl,
    token,
    models,
    skippedModelIds,
    skippedModelNames,
    requestedModelId,
    existingModelId: existingModel.primary || "",
  });
  const selectedModelId = selected.selected_model_id || "";
  const localAgentConfig = {
    timeout: 300,
    prompt: [{ role: "system", content: "You are a helpful assistant." }],
    "remove-think": false,
    "knowledge-bases": [],
    "box-session-id-template": "{launcher_type}_{launcher_id}",
    "retrieval-top-k": 5,
    "rerank-model": "",
    "rerank-top-k": 5,
@@ -227,9 +289,11 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
    "context-keep-recent-tokens": 20000,
    "context-summary-tokens": 8000,
    ...existingLocalAgentConfig,
    // Current backend truncation still reads this field directly.
    "max-round": positiveInteger(existingLocalAgentConfig["max-round"], 10),
    model: {
      primary: selectedModelId,
      fallbacks: requestedModelId ? [] : Array.isArray(existingModel.fallbacks) ? existingModel.fallbacks : [],
      fallbacks: selected.fallback_model_ids || [],
    },
  };
  const updatedConfig = {
@@ -239,12 +303,10 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
      runner: {
        ...(ai.runner && typeof ai.runner === "object" ? ai.runner : {}),
        id: runnerId,
        runner: runnerId,
        "expire-time": 0,
      },
      runner_config: {
        ...runnerConfig,
        [runnerId]: localAgentConfig,
      },
      "local-agent": localAgentConfig,
    },
  };

@@ -265,19 +327,31 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
      update_status: updateResponse.status,
      pipeline_id: pipeline.uuid,
      model_count: models.length,
      space_model_count: spaceModels.length,
      scanned_space_model_count: selected.scanned_space_model_count,
      tested_model_count: selected.tested_model_count,
      model_tests: selected.model_tests,
      selected_model_id: selectedModelId,
      selected_model_name: selected.selected_model_name,
      fallback_model_ids: selected.fallback_model_ids,
    };
  }

  return {
    status: selectedModelId ? "pass" : "env_issue",
    reason: selectedModelId
      ? "Local-agent pipeline is configured for Debug Chat."
      : "Pipeline was created but no LLM model is configured in this LangBot instance.",
      ? `Local-agent pipeline is configured for Debug Chat with Space model ${selected.selected_model_name || selectedModelId} and ${selected.fallback_model_ids.length} fallback(s).`
      : selected.reason || "No working Space LLM model is configured in this LangBot instance.",
    pipeline_id: pipeline.uuid,
    pipeline_name: pipeline.name,
    pipeline_name: pipelineName,
    model_count: models.length,
    space_model_count: spaceModels.length,
    scanned_space_model_count: selected.scanned_space_model_count,
    tested_model_count: selected.tested_model_count,
    model_tests: selected.model_tests,
    selected_model_id: selectedModelId,
    selected_model_name: selected.selected_model_name,
    fallback_model_ids: selected.fallback_model_ids,
    created,
    updated: true,
  };
@@ -287,6 +361,229 @@ function isApiFailure(response) {
  return response.status >= 400 || (response.json.code !== undefined && response.json.code !== 0);
}

function isSpaceModel(model) {
  const provider = model?.provider && typeof model.provider === "object" ? model.provider : {};
  return model?.provider_uuid === SPACE_PROVIDER_UUID
    || provider.uuid === SPACE_PROVIDER_UUID
    || provider.requester === "space-chat-completions"
    || provider.name === "LangBot Models";
}

async function selectWorkingSpaceModel({
  backendUrl,
  token,
  models,
  skippedModelIds,
  skippedModelNames,
  requestedModelId,
  existingModelId,
}) {
  const modelTests = [];
  const testLimit = positiveInteger(env.LANGBOT_E2E_MODEL_TEST_LIMIT, DEFAULT_MODEL_TEST_LIMIT);
  const fallbackCount = positiveInteger(env.LANGBOT_E2E_MODEL_FALLBACK_COUNT, DEFAULT_MODEL_FALLBACK_COUNT);
  const workingModels = [];
  const spaceModels = rankModels(models.filter((model) => (
    model.uuid
    && isSpaceModel(model)
    && !skippedModelIds.has(model.uuid)
    && !skippedModelNames.has(model.name)
  )));
  const requestedModel = requestedModelId
    ? spaceModels.find((model) => model.uuid === requestedModelId) || null
    : null;
  const existingModel = existingModelId
    ? spaceModels.find((model) => model.uuid === existingModelId) || null
    : null;
  const candidates = uniqueCandidates([
    ...(requestedModel ? [existingCandidate(requestedModel, "requested")] : []),
    ...(existingModel ? [existingCandidate(existingModel, "existing-pipeline")] : []),
    ...spaceModels.map((model) => existingCandidate(model, "configured-space")),
  ]);

  let scanResult = { status: "skipped", models: [], reason: "" };
  if (env.LANGBOT_E2E_SCAN_SPACE_MODELS !== "false") {
    scanResult = await scanSpaceModels({ backendUrl, token });
    if (scanResult.status === "pass") {
      const knownNames = new Set(spaceModels.map((model) => model.name));
      candidates.push(...scanResult.models
        .filter((model) => model.name && !knownNames.has(model.name) && !skippedModelNames.has(model.name))
        .map((model) => scannedCandidate(model)));
    }
  }

  const unique = uniqueCandidates(candidates);
  for (const candidate of unique.slice(0, testLimit)) {
    const test = await ensureAndTestModel({ backendUrl, token, candidate });
    modelTests.push(test);
    if (test.status === "pass" && test.model_uuid) {
      workingModels.push(test);
      if (workingModels.length >= fallbackCount + 1) break;
    }
  }

  if (workingModels.length > 0) {
    const [primary, ...fallbacks] = workingModels;
    return {
      status: "pass",
      reason: "",
      selected_model_id: primary.model_uuid,
      selected_model_name: primary.model_name,
      fallback_model_ids: fallbacks.map((model) => model.model_uuid),
      scanned_space_model_count: scanResult.models.length,
      tested_model_count: modelTests.length,
      model_tests: modelTests,
    };
  }

  const baseReason = unique.length === 0
    ? scanResult.reason || "No Space LLM model candidates are available."
    : `No working Space LLM model found after testing ${modelTests.length} candidate(s).`;
  return {
    status: "env_issue",
    reason: requestedModelId && !requestedModel
      ? `Requested Space LLM model ${requestedModelId} is missing or skipped; ${baseReason}`
      : baseReason,
    selected_model_id: "",
    selected_model_name: "",
    fallback_model_ids: [],
    scanned_space_model_count: scanResult.models.length,
    tested_model_count: modelTests.length,
    model_tests: modelTests,
  };
}

async function scanSpaceModels({ backendUrl, token }) {
  const response = await apiJson(
    backendUrl,
    `/api/v1/provider/providers/${encodeURIComponent(SPACE_PROVIDER_UUID)}/scan-models?type=llm`,
    { token },
  );
  if (isApiFailure(response)) {
    return {
      status: "env_issue",
      models: [],
      reason: safeReason(response.json.msg || response.json.message || "Failed to scan Space LLM models."),
    };
  }
  return {
    status: "pass",
    models: response.json.data?.models || [],
    reason: "",
  };
}

async function ensureAndTestModel({ backendUrl, token, candidate }) {
  let modelUuid = candidate.uuid || "";
  let created = false;
  if (!modelUuid) {
    const create = await apiJson(backendUrl, "/api/v1/provider/models/llm", {
      method: "POST",
      token,
      body: {
        name: candidate.name,
        provider_uuid: SPACE_PROVIDER_UUID,
        abilities: candidate.abilities || [],
        context_length: candidate.context_length ?? null,
        extra_args: {},
        prefered_ranking: positiveInteger(candidate.prefered_ranking, 0),
      },
    });
    modelUuid = create.json.data?.uuid || "";
    if (isApiFailure(create) || !modelUuid) {
      return modelTestResult(candidate, {
        status: "fail",
        reason: safeReason(create.json.msg || "Failed to create scanned Space model."),
        http_status: create.status,
      });
    }
    created = true;
  }

  const test = await apiJson(backendUrl, `/api/v1/provider/models/llm/${encodeURIComponent(modelUuid)}/test`, {
    method: "POST",
    token,
    body: { extra_args: {} },
  });
  const passed = !isApiFailure(test);
  if (!passed && created) {
    await apiJson(backendUrl, `/api/v1/provider/models/llm/${encodeURIComponent(modelUuid)}`, {
      method: "DELETE",
      token,
    }).catch(() => {});
  }
  return modelTestResult(candidate, {
    status: passed ? "pass" : "fail",
    reason: passed ? "" : safeReason(test.json.msg || test.json.message || "Space model test failed."),
    http_status: test.status,
    model_uuid: modelUuid,
    created,
  });
}

function modelTestResult(candidate, details) {
  return {
    source: candidate.source,
    model_uuid: details.model_uuid || candidate.uuid || "",
    model_name: candidate.name,
    status: details.status,
    reason: details.reason || "",
    http_status: details.http_status ?? null,
    created: Boolean(details.created),
  };
}

function existingCandidate(model, source) {
  return {
    source,
    uuid: model.uuid,
    name: model.name,
    abilities: model.abilities || [],
    context_length: model.context_length,
    prefered_ranking: model.prefered_ranking,
  };
}

function scannedCandidate(model) {
  return {
    source: "scanned-space",
    uuid: "",
    name: model.name || model.id,
    abilities: model.abilities || [],
    context_length: model.context_length,
    prefered_ranking: model.prefered_ranking,
  };
}

function uniqueCandidates(candidates) {
  const seen = new Set();
  const result = [];
  for (const candidate of candidates) {
    const key = candidate.uuid ? `uuid:${candidate.uuid}` : `name:${candidate.name}`;
    if (!candidate.name || seen.has(key)) continue;
    seen.add(key);
    result.push(candidate);
  }
  return result;
}

function rankModels(models) {
  return [...models].sort((left, right) => {
    const leftRank = Number.isFinite(Number(left.prefered_ranking)) ? Number(left.prefered_ranking) : 9999;
    const rightRank = Number.isFinite(Number(right.prefered_ranking)) ? Number(right.prefered_ranking) : 9999;
    if (leftRank !== rightRank) return leftRank - rightRank;
    return String(left.name || "").localeCompare(String(right.name || ""));
  });
}

function positiveInteger(value, fallback) {
  const parsed = Number(value);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

function safeReason(value) {
  return redact(String(value || "")).slice(0, 1000);
}

async function upsertEnvLocal(path, updates) {
  let text = "";
  try {
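The candidate ordering produced by `rankModels` and `uniqueCandidates` can be checked on a small input. The sketch below repeats the two helpers from the diff so it runs on its own; the model list and `source` labels are made-up sample data, not real Space models.

```javascript
// Lower `prefered_ranking` sorts first; a missing ranking counts as 9999; ties break by name.
function rankModels(models) {
  return [...models].sort((left, right) => {
    const leftRank = Number.isFinite(Number(left.prefered_ranking)) ? Number(left.prefered_ranking) : 9999;
    const rightRank = Number.isFinite(Number(right.prefered_ranking)) ? Number(right.prefered_ranking) : 9999;
    if (leftRank !== rightRank) return leftRank - rightRank;
    return String(left.name || "").localeCompare(String(right.name || ""));
  });
}

// Keep the first occurrence of each candidate, keyed by uuid when present, else by name.
function uniqueCandidates(candidates) {
  const seen = new Set();
  const result = [];
  for (const candidate of candidates) {
    const key = candidate.uuid ? `uuid:${candidate.uuid}` : `name:${candidate.name}`;
    if (!candidate.name || seen.has(key)) continue;
    seen.add(key);
    result.push(candidate);
  }
  return result;
}

// Sample data for illustration only.
const ranked = rankModels([
  { uuid: "u3", name: "zeta", prefered_ranking: 2 },
  { uuid: "u1", name: "beta" },
  { uuid: "u2", name: "alpha", prefered_ranking: 2 },
]);

const ordered = uniqueCandidates([
  { uuid: "u1", name: "beta", source: "requested" },
  ...ranked.map((model) => ({ ...model, source: "configured-space" })),
  { uuid: "", name: "gamma", source: "scanned-space" },
  { uuid: "", name: "gamma", source: "scanned-space" },
]);

const orderedNames = ordered.map((candidate) => candidate.name);
console.log(orderedNames);
```

The requested model keeps its place at the front because the first occurrence wins during de-duplication. One caveat with this comparator: `Number(null)` is `0`, so a model whose `prefered_ranking` is `null` sorts first instead of last.
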
@@ -72,6 +72,7 @@ export async function writeResult(paths, result) {
}

export async function loadEnvFiles(paths = ["skills/.env", "skills/.env.local"]) {
  const processEnvKeys = new Set(Object.keys(env));
  for (const path of paths) {
    let text = "";
    try {
@@ -86,7 +87,7 @@ export async function loadEnvFiles(paths = ["skills/.env", "skills/.env.local"])
      if (equals <= 0) continue;
      const key = trimmed.slice(0, equals).trim();
      const value = trimmed.slice(equals + 1).trim().replace(/^["']|["']$/g, "");
      if (!(key in env)) env[key] = value;
      if (!processEnvKeys.has(key)) env[key] = value;
    }
  }
}

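The two guards in this hunk give different precedence. With `!(key in env)`, the first file to define a key wins, so `skills/.env.local` could not override `skills/.env`. With the snapshot of process keys, later files override earlier ones while values already present in the real environment stay untouched. The sketch below illustrates the snapshot variant. `parseEnvText` and `mergeEnv` are hypothetical names, and the comment and blank-line handling is an assumption because those lines are not visible in the hunk.

```javascript
// Parse KEY=value lines. Skipping blanks and "#" comments is an assumption;
// the corresponding lines are elided in the hunk above.
function parseEnvText(text) {
  const entries = [];
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith("#")) continue;
    const equals = trimmed.indexOf("=");
    if (equals <= 0) continue;
    const key = trimmed.slice(0, equals).trim();
    const value = trimmed.slice(equals + 1).trim().replace(/^["']|["']$/g, "");
    entries.push([key, value]);
  }
  return entries;
}

// Later files override earlier files, but keys that were already set in the
// caller's process environment are never replaced.
function mergeEnv(processEnv, fileTexts) {
  const merged = { ...processEnv };
  const processKeys = new Set(Object.keys(processEnv));
  for (const text of fileTexts) {
    for (const [key, value] of parseEnvText(text)) {
      if (!processKeys.has(key)) merged[key] = value;
    }
  }
  return merged;
}

const merged = mergeEnv(
  { LANGBOT_BACKEND_URL: "http://10.0.0.5:5300" }, // exported in the shell
  [
    "LANGBOT_BACKEND_URL=http://127.0.0.1:5300\nLANGBOT_FRONTEND_URL=http://127.0.0.1:3000", // skills/.env
    "# machine-local override\nLANGBOT_FRONTEND_URL=\"http://127.0.0.1:3001\"", // skills/.env.local
  ],
);
console.log(merged);
```

Here the shell value for `LANGBOT_BACKEND_URL` survives, and `skills/.env.local` overrides the shared default for `LANGBOT_FRONTEND_URL`.
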
@@ -1057,8 +1057,13 @@
        "metrics"
      ],
      "automation": "scripts/e2e/pipeline-debug-chat.mjs",
      "setup_automation": [],
      "setup_provides_env": [],
      "setup_automation": [
        "node:scripts/e2e/ensure-local-agent-pipeline.mjs --write-env"
      ],
      "setup_provides_env": [
        "LANGBOT_PIPELINE_URL",
        "LANGBOT_PIPELINE_NAME"
      ],
      "evidence_required": [
        "ui",
        "screenshot",
@@ -53,7 +53,7 @@ Start the new frontend from the web repo:

```bash
cd "$LANGBOT_WEB_REPO"
npm run dev
VITE_API_BASE_URL="$LANGBOT_BACKEND_URL" pnpm dev --host 0.0.0.0
```

Healthy startup includes:
@@ -68,6 +68,10 @@ Quick check:
curl -I --max-time 3 "$LANGBOT_FRONTEND_URL"
```

If `VITE_API_BASE_URL` is missing, Vite still serves the page but frontend API
calls may go to the frontend port instead of the backend port. That produces
false browser failures in login, wizard, pipeline, and Debug Chat cases.

## Completion Signal

Environment setup is not complete until the required frontend/backend URLs are reachable and the chosen browser-control path can open the WebUI.

@@ -39,6 +39,11 @@ automation_debug_chat_response_p95_ms: "120000"
automation_debug_chat_max_error_rate: "0"
metrics_thresholds_json: '{"response_p95_ms":{"max":120000},"error_rate":{"max":0}}'
load_profile_json: '{"prompts":1,"browser":true,"path":"Pipeline Debug Chat","metric":"send-to-visible-completion"}'
setup_automation:
  - "node:scripts/e2e/ensure-local-agent-pipeline.mjs --write-env"
setup_provides_env:
  - LANGBOT_PIPELINE_URL
  - LANGBOT_PIPELINE_NAME
preconditions:
  - "LANGBOT_PIPELINE_URL or LANGBOT_PIPELINE_NAME points to the pipeline intended for this Debug Chat performance run."
  - "The target pipeline is safe to reset Debug Chat history for this run."
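One plausible reading of `metrics_thresholds_json` is that each named metric must not exceed its `max`. The sketch below works under that assumption. `evaluateThresholds` is a hypothetical helper, not the gate's actual evaluator, and treating a missing metric as a violation is a design choice of this example.

```javascript
// Compare measured metrics against {"metric": {"max": number}} rules.
function evaluateThresholds(metrics, thresholds) {
  const violations = [];
  for (const [name, rule] of Object.entries(thresholds)) {
    const value = metrics[name];
    if (typeof value !== "number" || Number.isNaN(value)) {
      // Assumption: an unmeasured metric cannot pass the gate.
      violations.push({ name, reason: "missing metric" });
    } else if (typeof rule.max === "number" && value > rule.max) {
      violations.push({ name, reason: `${value} > max ${rule.max}` });
    }
  }
  return { status: violations.length === 0 ? "pass" : "fail", violations };
}

// Threshold string copied from the case front matter above.
const thresholds = JSON.parse('{"response_p95_ms":{"max":120000},"error_rate":{"max":0}}');

// Made-up measurements for illustration.
const withinBudget = evaluateThresholds({ response_p95_ms: 45000, error_rate: 0 }, thresholds);
const tooSlow = evaluateThresholds({ response_p95_ms: 130000, error_rate: 0 }, thresholds);
console.log(withinBudget.status, tooSlow.status, tooSlow.violations);
```

A `max` of `0` for `error_rate` means a single failed prompt fails the case, so provider outages should be classified as `env_issue` before this check is interpreted.
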
@@ -159,6 +159,29 @@ provider latency, model route health, plugin/runtime logs, WebSocket behavior,
and browser console/network evidence before attributing the whole duration to
LangBot.

### User-Path Gate Runbook

1. Start the backend and frontend. The frontend must be launched with
   `VITE_API_BASE_URL="$LANGBOT_BACKEND_URL"` so browser API calls reach the
   backend.
2. Run `node scripts/e2e/ensure-local-agent-pipeline.mjs --write-env`. The
   setup refreshes the local QA login, skips the wizard, prepares a Debug Chat
   pipeline, scans Space models, tests candidates, writes tested fallback
   models, and writes the selected pipeline/model env values to
   `skills/.env.local`.
3. If setup returns `env_issue`, read `model_tests` and provider errors first.
   A missing Space key, failed Space scan, or unavailable model route is not a
   LangBot performance regression.
4. Run
   `bin/lbs suite run langbot-user-path-performance-gate --include-manual-check`.
5. Interpret `response_p95_ms` as browser-visible send-to-completion time. It
   includes provider latency; use backend logs and model test evidence to
   separate LangBot overhead from the external model route.

The setup keeps a `max-round` value in the generated pipeline config only
because the current backend truncator still reads that field directly. Do not
use it as a quality requirement for future local-agent behavior.

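Steps 2 and 3 of the runbook can be scripted around the setup script's exit code. The diff for `ensure-local-agent-pipeline.mjs` exits with `0` for `pass`, `2` for `env_issue`, and `1` otherwise. The sketch below maps those codes to a next action; `classifySetupExit` and `nextStep` are hypothetical helpers and the action strings are illustrative.

```javascript
// Exit codes as used by the setup script: 0 = pass, 2 = env_issue, anything else = fail.
function classifySetupExit(code) {
  if (code === 0) return "pass";
  if (code === 2) return "env_issue";
  return "fail";
}

// Hypothetical mapping from setup status to the next runbook action.
function nextStep(status) {
  switch (status) {
    case "pass":
      return "run langbot-user-path-performance-gate";
    case "env_issue":
      return "inspect model_tests and provider errors, then fix the environment";
    default:
      return "investigate the setup failure before measuring";
  }
}

const outcomes = [0, 1, 2].map((code) => {
  const status = classifySetupExit(code);
  return { code, status, next: nextStep(status) };
});
console.log(outcomes);
```

Keeping `env_issue` distinct from `fail` prevents a missing Space key from being reported as a performance regression.
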
## Running The First Gate

Start with the reusable suite: