test(skills): prepare user path performance gate

2026-06-25 06:54:19 +00:00 · 2026-06-25 10:07:04 +08:00
parent 67437c2f5a
commit 8749a9b56f
9 changed files with 493 additions and 19 deletions
@@ -48,6 +48,7 @@ coverage.xml
 .coverage
 src/langbot/web/
 testsdk/
+.qa/

 # Build artifacts
 /dist
@@ -26,7 +26,7 @@ and LangBot's own Local Agent) working with the LangBot ecosystem.

 ## Quick start (for an AI agent)

-1. Read this README, `AGENTS.md`, and `qa-agent-docs/` to understand the layout.
+1. Read this README, `AGENTS.md`, and `docs/user-guide.md` to understand the layout.
 2. Read `skills/.env` for shared local defaults. On a new machine, copy
   `skills/.env.example` to `skills/.env.local` (gitignored) and override
   machine-specific values there. Never commit secrets.
@@ -0,0 +1,138 @@
+# LangBot QA Skills User Guide
+
+Use this guide as the first operational path after reading `README.md` and
+`AGENTS.md`.
+
+## 1. Configure Local Inputs
+
+Read `skills/.env`, then create `skills/.env.local` for machine-local values.
+Do not commit `.env.local`, browser profiles, reports, tokens, API keys, OAuth
+state, or provider credentials.
+
+Minimum local fields for live browser QA:
+
+```bash
+LANGBOT_REPO=/path/to/LangBot
+LANGBOT_WEB_REPO=/path/to/LangBot/web
+LANGBOT_BACKEND_URL=http://127.0.0.1:5300
+LANGBOT_FRONTEND_URL=http://127.0.0.1:3000
+LANGBOT_DEV_FRONTEND_URL=http://127.0.0.1:3000
+LANGBOT_BROWSER_PROFILE=/path/to/langbot-browser-profile
+LANGBOT_CHROMIUM_EXECUTABLE=/path/to/chromium-or-playwright-chrome
+LANGBOT_E2E_LOGIN_USER=qa-local@example.com
+```
+
+`LANGBOT_E2E_LOGIN_USER` is a local QA account. The setup automation uses the
+LangBot recovery key from the active checkout to initialize or refresh that
+local account and write a browser `localStorage` token. It does not need the
+user's GitHub or Space credentials.
+
+## 2. Check Readiness
+
+From `skills/`:
+
+```bash
+bin/lbs env show
+bin/lbs env doctor
+bin/lbs validate
+bin/lbs index --check
+```
+
+`env doctor` should report reachable backend and frontend URLs before live
+browser cases are run. Missing Space provider credentials are an environment
+problem, not a LangBot product result; classify them as `env_issue` and
+configure the local Space provider before measuring Debug Chat performance.
+
+## 3. Start Services
+
+Start the backend from `LANGBOT_REPO`:
+
+```bash
+cd "$LANGBOT_REPO"
+uv run main.py
+```
+
+Start the standalone frontend from `LANGBOT_WEB_REPO` and point it at the
+backend:
+
+```bash
+cd "$LANGBOT_WEB_REPO"
+VITE_API_BASE_URL="$LANGBOT_BACKEND_URL" pnpm dev --host 0.0.0.0
+```
+
+If `VITE_API_BASE_URL` is missing, browser tests can load the Vite page but send
+API requests to the frontend port, which produces false UI failures.
+
+## 4. Prepare User-Path Fixtures
+
+For local-agent Debug Chat cases and the user-path performance gate:
+
+```bash
+node scripts/e2e/ensure-local-agent-pipeline.mjs --write-env
+```
+
+The script:
+
+- refreshes the local QA login and browser token;
+- marks the local wizard as skipped;
+- creates or updates a local QA pipeline;
+- scans Space LLM models, tests candidates, and switches to the first working
+  Space model with tested fallback models;
+- writes `LANGBOT_PIPELINE_URL`, `LANGBOT_PIPELINE_NAME`, and local-agent
+  pipeline/model variables into `skills/.env.local`;
+- returns `env_issue` when no Space model can be scanned or tested.
+
+Useful model controls:
+
+```bash
+LANGBOT_E2E_MODEL_TEST_LIMIT=8
+LANGBOT_E2E_MODEL_FALLBACK_COUNT=3
+LANGBOT_E2E_SKIP_MODEL_UUIDS=uuid-a,uuid-b
+LANGBOT_E2E_SKIP_MODEL_NAMES=model-a,model-b
+LANGBOT_E2E_SCAN_SPACE_MODELS=true
+```
+
+The setup writes a `max-round` value into the pipeline config for compatibility
+with the current runtime: this backend still reads that field directly during
+message truncation. Do not treat it as a long-term QA contract.
+
+## 5. Run Gates
+
+Fast contract gate, no live service required:
+
+```bash
+bin/lbs suite run langbot-performance-contract-gate --run-id langbot-contract-local
+```
+
+Live backend gate:
+
+```bash
+bin/lbs suite run langbot-live-backend-gate --run-id langbot-backend-local
+```
+
+Browser-visible user-path performance gate:
+
+```bash
+bin/lbs suite plan langbot-user-path-performance-gate
+bin/lbs suite run langbot-user-path-performance-gate --run-id langbot-user-path-local --include-manual-check
+```
+
+`manual_check` means the agent must confirm the declared preconditions for that
+run window. When setup automation is declared, run output may stop early with
+`env_issue`; fix that environment input before treating the product path as
+measured.
+
+## 6. Read Results
+
+Suite reports live under `skills/reports/`. Evidence lives under
+`skills/reports/evidence/<run-id>/`.
+
+For performance cases, inspect:
+
+- `metrics.json` for p50/p95/p99, error rate, and total duration;
+- `automation-result.json` for threshold decisions and artifacts;
+- `console.log` and `network.log` for frontend/API failures;
+- backend logs for provider, runner, WebSocket, or persistence failures.
+
+Do not call a user-path performance result a LangBot overhead regression until
+provider/tool/network time has been separated or ruled out.
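The model selection that section 4 of the guide describes (test candidates up to a limit, take the first working model as primary, keep the next working ones as fallbacks) can be sketched as below. This is a minimal illustration, not the setup script's code: `pickModels` and `isWorking` are hypothetical names, the health check is modeled as a synchronous predicate for brevity, and the defaults of 8 and 3 are taken from the `LANGBOT_E2E_MODEL_TEST_LIMIT` and `LANGBOT_E2E_MODEL_FALLBACK_COUNT` examples in the guide.

```javascript
// Hypothetical sketch: pick a primary model plus tested fallbacks.
// `isWorking` stands in for whatever health check the real setup performs.
function pickModels(candidates, isWorking, { testLimit = 8, fallbackCount = 3 } = {}) {
  const working = [];
  let tested = 0;
  // Only the first `testLimit` candidates are ever tested.
  for (const candidate of candidates.slice(0, testLimit)) {
    tested += 1;
    if (isWorking(candidate)) {
      working.push(candidate);
      // Stop once one primary and `fallbackCount` fallbacks have passed.
      if (working.length >= fallbackCount + 1) break;
    }
  }
  const [primary = null, ...fallbacks] = working;
  return {
    // No working model is an environment problem, not a product failure.
    status: primary ? "pass" : "env_issue",
    primary,
    fallbacks,
    tested,
  };
}

const picked = pickModels(["a", "b", "c", "d", "e"], (name) => name !== "a", {
  testLimit: 4,
  fallbackCount: 1,
});
console.log(picked);
```

With these inputs the sketch tests `a`, `b`, and `c`, then stops: `b` becomes the primary and `c` the only fallback. Returning `env_issue` when nothing passes mirrors the guide's rule that an unavailable model route should not be reported as a measured product result.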
@@ -10,6 +10,7 @@ import {
  ensureEvidence,
  evidencePaths,
  loadEnvFiles,
+  redact,
  resetAndAuthLocalUser,
  safeScreenshot,
  setBrowserToken,
@@ -17,9 +18,12 @@ import {
  writeResult,
 } from "./lib/langbot-e2e.mjs";

-const RUNNER_ID = "plugin:langbot/local-agent/default";
+const RUNNER_ID = "local-agent";
+const SPACE_PROVIDER_UUID = "00000000-0000-0000-0000-000000000000";
 const DEFAULT_PIPELINE_NAME = "Agent QA Local Agent Debug Chat";
 const DEFAULT_LOCAL_PASSWORD = "LangBotE2ELocalPass!2026";
+const DEFAULT_MODEL_TEST_LIMIT = 8;
+const DEFAULT_MODEL_FALLBACK_COUNT = 3;
 const caseId = "ensure-local-agent-pipeline";

 await loadEnvFiles();
@@ -45,11 +49,18 @@ const result = {
  pipeline_url: "",
  runner_id: RUNNER_ID,
  selected_model_id: "",
+  selected_model_name: "",
+  fallback_model_ids: [],
  model_count: 0,
+  space_model_count: 0,
+  scanned_space_model_count: 0,
+  tested_model_count: 0,
+  model_tests: [],
  created: false,
  updated: false,
  wrote_env: false,
  auth: null,
+  wizard: null,
  browser_token_check: null,
  page_signal: "",
  evidence: {
@@ -71,6 +82,7 @@ try {
  const user = env.LANGBOT_E2E_LOGIN_USER || "";
  const password = env.LANGBOT_E2E_LOGIN_PASSWORD || DEFAULT_LOCAL_PASSWORD;
  if (!user) {
+    result.status = "env_issue";
    throw new Error("LANGBOT_E2E_LOGIN_USER is required so this setup can create/update the pipeline via backend API.");
  }

@@ -81,6 +93,13 @@ try {
    backend_token_check: auth.check,
  };

+  const wizard = await skipWizard({ backendUrl, token: auth.token });
+  result.wizard = wizard;
+  if (wizard.status !== "pass") {
+    result.status = "fail";
+    throw new Error(wizard.reason || "Failed to mark the local QA wizard as skipped.");
+  }
+
  const prepared = await ensureLocalAgentPipeline({
    backendUrl,
    token: auth.token,
@@ -99,6 +118,10 @@ try {
      LANGBOT_PIPELINE_NAME: result.pipeline_name || pipelineName,
      LANGBOT_LOCAL_AGENT_PIPELINE_URL: result.pipeline_url,
      LANGBOT_LOCAL_AGENT_PIPELINE_NAME: result.pipeline_name || pipelineName,
+      ...(result.selected_model_id ? {
+        LANGBOT_LOCAL_AGENT_MODEL_UUID: result.selected_model_id,
+        LANGBOT_E2E_MODEL_UUID: result.selected_model_id,
+      } : {}),
    });
    result.wrote_env = true;
  }
@@ -127,6 +150,21 @@ try {

 process.exit(result.status === "pass" ? 0 : result.status === "env_issue" ? 2 : 1);

+async function skipWizard({ backendUrl, token }) {
+  const response = await apiJson(backendUrl, "/api/v1/system/wizard/completed", {
+    method: "POST",
+    token,
+    body: { status: "skipped" },
+  });
+  const ok = response.status < 400 && response.json.code === 0;
+  return {
+    status: ok ? "pass" : "fail",
+    http_status: response.status,
+    code: response.json.code ?? null,
+    reason: ok ? "Wizard marked skipped for local QA." : response.json.msg || "Wizard status update failed.",
+  };
+}
+
 async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runnerId }) {
  const [pipelineList, modelList] = await Promise.all([
    apiJson(backendUrl, "/api/v1/pipelines", { token }),
@@ -149,7 +187,19 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
  }

  const models = modelList.json.data?.models || [];
-  const selectedModel = models.find((model) => model.uuid) || null;
+  const skippedModelIds = new Set(
+    String(env.LANGBOT_E2E_SKIP_MODEL_UUIDS || "")
+      .split(",")
+      .map((item) => item.trim())
+      .filter(Boolean),
+  );
+  const skippedModelNames = new Set(
+    String(env.LANGBOT_E2E_SKIP_MODEL_NAMES || "")
+      .split(",")
+      .map((item) => item.trim())
+      .filter(Boolean),
+  );
+  const spaceModels = models.filter((model) => isSpaceModel(model) && !skippedModelIds.has(model.uuid) && !skippedModelNames.has(model.name));
  const pipelines = pipelineList.json.data?.pipelines || [];
  let pipeline = pipelines.find((item) => item.name === pipelineName) || null;
  let created = false;
@@ -170,6 +220,7 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
        reason: createdResponse.json.msg || "Failed to create pipeline.",
        create_status: createdResponse.status,
        model_count: models.length,
+        space_model_count: spaceModels.length,
      };
    }
    const pipelineId = createdResponse.json.data?.uuid || "";
@@ -183,6 +234,7 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
      status: "fail",
      reason: "Pipeline was not created or resolved.",
      model_count: models.length,
+      space_model_count: spaceModels.length,
    };
  }

@@ -194,27 +246,37 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
      get_status: loaded.status,
      pipeline_id: pipeline.uuid,
      model_count: models.length,
+      space_model_count: spaceModels.length,
    };
  }
  pipeline = loaded.json.data.pipeline;

  const config = pipeline.config && typeof pipeline.config === "object" ? pipeline.config : {};
  const ai = config.ai && typeof config.ai === "object" ? config.ai : {};
-  const runnerConfig = ai.runner_config && typeof ai.runner_config === "object" ? ai.runner_config : {};
-  const rawExistingLocalAgentConfig = runnerConfig[runnerId] && typeof runnerConfig[runnerId] === "object"
-    ? runnerConfig[runnerId]
+  const rawExistingLocalAgentConfig = ai["local-agent"] && typeof ai["local-agent"] === "object"
+    ? ai["local-agent"]
    : {};
  const existingLocalAgentConfig = rawExistingLocalAgentConfig;
  const existingModel = existingLocalAgentConfig.model && typeof existingLocalAgentConfig.model === "object"
    ? existingLocalAgentConfig.model
    : {};
  const requestedModelId = env.LANGBOT_LOCAL_AGENT_MODEL_UUID || env.LANGBOT_E2E_MODEL_UUID || "";
-  const selectedModelId = requestedModelId || existingModel.primary || selectedModel?.uuid || "";
+  const selected = await selectWorkingSpaceModel({
+    backendUrl,
+    token,
+    models,
+    skippedModelIds,
+    skippedModelNames,
+    requestedModelId,
+    existingModelId: existingModel.primary || "",
+  });
+  const selectedModelId = selected.selected_model_id || "";
  const localAgentConfig = {
    timeout: 300,
    prompt: [{ role: "system", content: "You are a helpful assistant." }],
    "remove-think": false,
    "knowledge-bases": [],
+    "box-session-id-template": "{launcher_type}_{launcher_id}",
    "retrieval-top-k": 5,
    "rerank-model": "",
    "rerank-top-k": 5,
@@ -227,9 +289,11 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
    "context-keep-recent-tokens": 20000,
    "context-summary-tokens": 8000,
    ...existingLocalAgentConfig,
+    // Current backend truncation still reads this field directly.
+    "max-round": positiveInteger(existingLocalAgentConfig["max-round"], 10),
    model: {
      primary: selectedModelId,
-      fallbacks: requestedModelId ? [] : Array.isArray(existingModel.fallbacks) ? existingModel.fallbacks : [],
+      fallbacks: selected.fallback_model_ids || [],
    },
  };
  const updatedConfig = {
@@ -239,12 +303,10 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
      runner: {
        ...(ai.runner && typeof ai.runner === "object" ? ai.runner : {}),
        id: runnerId,
+        runner: runnerId,
        "expire-time": 0,
      },
-      runner_config: {
-        ...runnerConfig,
-        [runnerId]: localAgentConfig,
-      },
+      "local-agent": localAgentConfig,
    },
  };

@@ -265,19 +327,31 @@ async function ensureLocalAgentPipeline({ backendUrl, token, pipelineName, runne
      update_status: updateResponse.status,
      pipeline_id: pipeline.uuid,
      model_count: models.length,
+      space_model_count: spaceModels.length,
+      scanned_space_model_count: selected.scanned_space_model_count,
+      tested_model_count: selected.tested_model_count,
+      model_tests: selected.model_tests,
      selected_model_id: selectedModelId,
+      selected_model_name: selected.selected_model_name,
+      fallback_model_ids: selected.fallback_model_ids,
    };
  }

  return {
    status: selectedModelId ? "pass" : "env_issue",
    reason: selectedModelId
-      ? "Local-agent pipeline is configured for Debug Chat."
-      : "Pipeline was created but no LLM model is configured in this LangBot instance.",
+      ? `Local-agent pipeline is configured for Debug Chat with Space model ${selected.selected_model_name || selectedModelId} and ${selected.fallback_model_ids.length} fallback(s).`
+      : selected.reason || "No working Space LLM model is configured in this LangBot instance.",
    pipeline_id: pipeline.uuid,
-    pipeline_name: pipeline.name,
+    pipeline_name: pipelineName,
    model_count: models.length,
+    space_model_count: spaceModels.length,
+    scanned_space_model_count: selected.scanned_space_model_count,
+    tested_model_count: selected.tested_model_count,
+    model_tests: selected.model_tests,
    selected_model_id: selectedModelId,
+    selected_model_name: selected.selected_model_name,
+    fallback_model_ids: selected.fallback_model_ids,
    created,
    updated: true,
  };
@@ -287,6 +361,229 @@ function isApiFailure(response) {
  return response.status >= 400 || (response.json.code !== undefined && response.json.code !== 0);
 }

+function isSpaceModel(model) {
+  const provider = model?.provider && typeof model.provider === "object" ? model.provider : {};
+  return model?.provider_uuid === SPACE_PROVIDER_UUID
+    || provider.uuid === SPACE_PROVIDER_UUID
+    || provider.requester === "space-chat-completions"
+    || provider.name === "LangBot Models";
+}
+
+async function selectWorkingSpaceModel({
+  backendUrl,
+  token,
+  models,
+  skippedModelIds,
+  skippedModelNames,
+  requestedModelId,
+  existingModelId,
+}) {
+  const modelTests = [];
+  const testLimit = positiveInteger(env.LANGBOT_E2E_MODEL_TEST_LIMIT, DEFAULT_MODEL_TEST_LIMIT);
+  const fallbackCount = positiveInteger(env.LANGBOT_E2E_MODEL_FALLBACK_COUNT, DEFAULT_MODEL_FALLBACK_COUNT);
+  const workingModels = [];
+  const spaceModels = rankModels(models.filter((model) => (
+    model.uuid
+      && isSpaceModel(model)
+      && !skippedModelIds.has(model.uuid)
+      && !skippedModelNames.has(model.name)
+  )));
+  const requestedModel = requestedModelId
+    ? spaceModels.find((model) => model.uuid === requestedModelId) || null
+    : null;
+  const existingModel = existingModelId
+    ? spaceModels.find((model) => model.uuid === existingModelId) || null
+    : null;
+  const candidates = uniqueCandidates([
+    ...(requestedModel ? [existingCandidate(requestedModel, "requested")] : []),
+    ...(existingModel ? [existingCandidate(existingModel, "existing-pipeline")] : []),
+    ...spaceModels.map((model) => existingCandidate(model, "configured-space")),
+  ]);
+
+  let scanResult = { status: "skipped", models: [], reason: "" };
+  if (env.LANGBOT_E2E_SCAN_SPACE_MODELS !== "false") {
+    scanResult = await scanSpaceModels({ backendUrl, token });
+    if (scanResult.status === "pass") {
+      const knownNames = new Set(spaceModels.map((model) => model.name));
+      candidates.push(...scanResult.models
+        .filter((model) => model.name && !knownNames.has(model.name) && !skippedModelNames.has(model.name))
+        .map((model) => scannedCandidate(model)));
+    }
+  }
+
+  const unique = uniqueCandidates(candidates);
+  for (const candidate of unique.slice(0, testLimit)) {
+    const test = await ensureAndTestModel({ backendUrl, token, candidate });
+    modelTests.push(test);
+    if (test.status === "pass" && test.model_uuid) {
+      workingModels.push(test);
+      if (workingModels.length >= fallbackCount + 1) break;
+    }
+  }
+
+  if (workingModels.length > 0) {
+    const [primary, ...fallbacks] = workingModels;
+    return {
+      status: "pass",
+      reason: "",
+      selected_model_id: primary.model_uuid,
+      selected_model_name: primary.model_name,
+      fallback_model_ids: fallbacks.map((model) => model.model_uuid),
+      scanned_space_model_count: scanResult.models.length,
+      tested_model_count: modelTests.length,
+      model_tests: modelTests,
+    };
+  }
+
+  const baseReason = unique.length === 0
+    ? scanResult.reason || "No Space LLM model candidates are available."
+    : `No working Space LLM model found after testing ${modelTests.length} candidate(s).`;
+  return {
+    status: "env_issue",
+    reason: requestedModelId && !requestedModel
+      ? `Requested Space LLM model ${requestedModelId} is missing or skipped; ${baseReason}`
+      : baseReason,
+    selected_model_id: "",
+    selected_model_name: "",
+    fallback_model_ids: [],
+    scanned_space_model_count: scanResult.models.length,
+    tested_model_count: modelTests.length,
+    model_tests: modelTests,
+  };
+}
+
+async function scanSpaceModels({ backendUrl, token }) {
+  const response = await apiJson(
+    backendUrl,
+    `/api/v1/provider/providers/${encodeURIComponent(SPACE_PROVIDER_UUID)}/scan-models?type=llm`,
+    { token },
+  );
+  if (isApiFailure(response)) {
+    return {
+      status: "env_issue",
+      models: [],
+      reason: safeReason(response.json.msg || response.json.message || "Failed to scan Space LLM models."),
+    };
+  }
+  return {
+    status: "pass",
+    models: response.json.data?.models || [],
+    reason: "",
+  };
+}
+
+async function ensureAndTestModel({ backendUrl, token, candidate }) {
+  let modelUuid = candidate.uuid || "";
+  let created = false;
+  if (!modelUuid) {
+    const create = await apiJson(backendUrl, "/api/v1/provider/models/llm", {
+      method: "POST",
+      token,
+      body: {
+        name: candidate.name,
+        provider_uuid: SPACE_PROVIDER_UUID,
+        abilities: candidate.abilities || [],
+        context_length: candidate.context_length ?? null,
+        extra_args: {},
+        prefered_ranking: positiveInteger(candidate.prefered_ranking, 0),
+      },
+    });
+    modelUuid = create.json.data?.uuid || "";
+    if (isApiFailure(create) || !modelUuid) {
+      return modelTestResult(candidate, {
+        status: "fail",
+        reason: safeReason(create.json.msg || "Failed to create scanned Space model."),
+        http_status: create.status,
+      });
+    }
+    created = true;
+  }
+
+  const test = await apiJson(backendUrl, `/api/v1/provider/models/llm/${encodeURIComponent(modelUuid)}/test`, {
+    method: "POST",
+    token,
+    body: { extra_args: {} },
+  });
+  const passed = !isApiFailure(test);
+  if (!passed && created) {
+    await apiJson(backendUrl, `/api/v1/provider/models/llm/${encodeURIComponent(modelUuid)}`, {
+      method: "DELETE",
+      token,
+    }).catch(() => {});
+  }
+  return modelTestResult(candidate, {
+    status: passed ? "pass" : "fail",
+    reason: passed ? "" : safeReason(test.json.msg || test.json.message || "Space model test failed."),
+    http_status: test.status,
+    model_uuid: modelUuid,
+    created,
+  });
+}
+
+function modelTestResult(candidate, details) {
+  return {
+    source: candidate.source,
+    model_uuid: details.model_uuid || candidate.uuid || "",
+    model_name: candidate.name,
+    status: details.status,
+    reason: details.reason || "",
+    http_status: details.http_status ?? null,
+    created: Boolean(details.created),
+  };
+}
+
+function existingCandidate(model, source) {
+  return {
+    source,
+    uuid: model.uuid,
+    name: model.name,
+    abilities: model.abilities || [],
+    context_length: model.context_length,
+    prefered_ranking: model.prefered_ranking,
+  };
+}
+
+function scannedCandidate(model) {
+  return {
+    source: "scanned-space",
+    uuid: "",
+    name: model.name || model.id,
+    abilities: model.abilities || [],
+    context_length: model.context_length,
+    prefered_ranking: model.prefered_ranking,
+  };
+}
+
+function uniqueCandidates(candidates) {
+  const seen = new Set();
+  const result = [];
+  for (const candidate of candidates) {
+    const key = candidate.uuid ? `uuid:${candidate.uuid}` : `name:${candidate.name}`;
+    if (!candidate.name || seen.has(key)) continue;
+    seen.add(key);
+    result.push(candidate);
+  }
+  return result;
+}
+
+function rankModels(models) {
+  return [...models].sort((left, right) => {
+    const leftRank = Number.isFinite(Number(left.prefered_ranking)) ? Number(left.prefered_ranking) : 9999;
+    const rightRank = Number.isFinite(Number(right.prefered_ranking)) ? Number(right.prefered_ranking) : 9999;
+    if (leftRank !== rightRank) return leftRank - rightRank;
+    return String(left.name || "").localeCompare(String(right.name || ""));
+  });
+}
+
+function positiveInteger(value, fallback) {
+  const parsed = Number(value);
+  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
+}
+
+function safeReason(value) {
+  return redact(String(value || "")).slice(0, 1000);
+}
+
 async function upsertEnvLocal(path, updates) {
  let text = "";
  try {
@@ -72,6 +72,7 @@ export async function writeResult(paths, result) {
 }

 export async function loadEnvFiles(paths = ["skills/.env", "skills/.env.local"]) {
+  const processEnvKeys = new Set(Object.keys(env));
  for (const path of paths) {
    let text = "";
    try {
@@ -86,7 +87,7 @@ export async function loadEnvFiles(paths = ["skills/.env", "skills/.env.local"])
      if (equals <= 0) continue;
      const key = trimmed.slice(0, equals).trim();
      const value = trimmed.slice(equals + 1).trim().replace(/^["']|["']$/g, "");
-      if (!(key in env)) env[key] = value;
+      if (!processEnvKeys.has(key)) env[key] = value;
    }
  }
 }
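The `loadEnvFiles` hunk above replaces the `key in env` check with a snapshot of the keys that were present before any file was read. Read that way, the effect appears to be that values from the real process environment still win, while a later file such as `skills/.env.local` can now override an earlier one such as `skills/.env`. The sketch below illustrates that precedence on in-memory text. `mergeEnvFiles` is a hypothetical helper, and the comment and blank-line handling is an assumption, because those lines are not visible in the hunk.

```javascript
// Hypothetical sketch of the precedence rule:
// process env > later file > earlier file.
function mergeEnvFiles(processEnv, fileTexts) {
  const env = { ...processEnv };
  // Snapshot taken before any file is applied, as in the hunk above.
  const processEnvKeys = new Set(Object.keys(processEnv));
  for (const text of fileTexts) {
    for (const line of text.split(/\r?\n/)) {
      const trimmed = line.trim();
      // Assumption: blank lines and `#` comments are skipped.
      if (!trimmed || trimmed.startsWith("#")) continue;
      const equals = trimmed.indexOf("=");
      if (equals <= 0) continue;
      const key = trimmed.slice(0, equals).trim();
      const value = trimmed.slice(equals + 1).trim().replace(/^["']|["']$/g, "");
      // Only keys from the original process env are protected from overrides.
      if (!processEnvKeys.has(key)) env[key] = value;
    }
  }
  return env;
}

const merged = mergeEnvFiles(
  { LANGBOT_BACKEND_URL: "http://127.0.0.1:5300" },
  [
    "LANGBOT_BACKEND_URL=http://example.invalid\nLANGBOT_PIPELINE_NAME=shared",
    "# machine-local override\nLANGBOT_PIPELINE_NAME='local'",
  ],
);
console.log(merged);
```

Here `LANGBOT_BACKEND_URL` keeps its process value and `LANGBOT_PIPELINE_NAME` ends up as `local`. With the previous `key in env` check, the first file to set a key would have won, so the local override would have been ignored.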
@@ -1057,8 +1057,13 @@
            "metrics"
          ],
          "automation": "scripts/e2e/pipeline-debug-chat.mjs",
-          "setup_automation": [],
-          "setup_provides_env": [],
+          "setup_automation": [
+            "node:scripts/e2e/ensure-local-agent-pipeline.mjs --write-env"
+          ],
+          "setup_provides_env": [
+            "LANGBOT_PIPELINE_URL",
+            "LANGBOT_PIPELINE_NAME"
+          ],
          "evidence_required": [
            "ui",
            "screenshot",
@@ -53,7 +53,7 @@ Start the new frontend from the web repo:

 ```bash
 cd "$LANGBOT_WEB_REPO"
-npm run dev
+VITE_API_BASE_URL="$LANGBOT_BACKEND_URL" pnpm dev --host 0.0.0.0
 ```

 Healthy startup includes:
@@ -68,6 +68,10 @@ Quick check:
 curl -I --max-time 3 "$LANGBOT_FRONTEND_URL"
 ```

+If `VITE_API_BASE_URL` is missing, Vite still serves the page but frontend API
+calls may go to the frontend port instead of the backend port. That produces
+false browser failures in login, wizard, pipeline, and Debug Chat cases.
+
 ## Completion Signal

 Environment setup is not complete until the required frontend/backend URLs are reachable and the chosen browser-control path can open the WebUI.
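The `VITE_API_BASE_URL` note in the hunk above can be illustrated with plain URL resolution. This is a sketch under an assumption, not the frontend's actual request code: it supposes that the frontend builds API URLs relative to the configured base and falls back to the page origin when no base is set. `resolveApiUrl` is a hypothetical helper; the ports are the ones used in the guide's sample `.env.local` values.

```javascript
// Hypothetical helper: resolve an API path with and without a configured base URL.
function resolveApiUrl(path, pageOrigin, apiBaseUrl) {
  // Assumption: an empty or missing base means requests are relative to the page.
  const base = apiBaseUrl && apiBaseUrl.trim() ? apiBaseUrl : pageOrigin;
  return new URL(path, base).toString();
}

const frontend = "http://127.0.0.1:3000";
const backend = "http://127.0.0.1:5300";

// Without a base, the request targets the frontend dev server.
const withoutBase = resolveApiUrl("/api/v1/pipelines", frontend, "");
// With the base set, the same path reaches the backend.
const withBase = resolveApiUrl("/api/v1/pipelines", frontend, backend);

console.log(withoutBase); // http://127.0.0.1:3000/api/v1/pipelines
console.log(withBase);    // http://127.0.0.1:5300/api/v1/pipelines
```

Under that assumption, a browser case can load the page and still fail every API call, which is why such failures belong to environment setup rather than to the product path being measured.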
@@ -39,6 +39,11 @@ automation_debug_chat_response_p95_ms: "120000"
 automation_debug_chat_max_error_rate: "0"
 metrics_thresholds_json: '{"response_p95_ms":{"max":120000},"error_rate":{"max":0}}'
 load_profile_json: '{"prompts":1,"browser":true,"path":"Pipeline Debug Chat","metric":"send-to-visible-completion"}'
+setup_automation:
+  - "node:scripts/e2e/ensure-local-agent-pipeline.mjs --write-env"
+setup_provides_env:
+  - LANGBOT_PIPELINE_URL
+  - LANGBOT_PIPELINE_NAME
 preconditions:
  - "LANGBOT_PIPELINE_URL or LANGBOT_PIPELINE_NAME points to the pipeline intended for this Debug Chat performance run."
  - "The target pipeline is safe to reset Debug Chat history for this run."
@@ -159,6 +159,29 @@ provider latency, model route health, plugin/runtime logs, WebSocket behavior,
 and browser console/network evidence before attributing the whole duration to
 LangBot.

+### User-Path Gate Runbook
+
+1. Start the backend and frontend. The frontend must be launched with
+   `VITE_API_BASE_URL="$LANGBOT_BACKEND_URL"` so browser API calls reach the
+   backend.
+2. Run `node scripts/e2e/ensure-local-agent-pipeline.mjs --write-env`. The
+   setup refreshes the local QA login, skips the wizard, prepares a Debug Chat
+   pipeline, scans Space models, tests candidates, writes tested fallback
+   models, and writes the selected pipeline/model env values to
+   `skills/.env.local`.
+3. If setup returns `env_issue`, read `model_tests` and provider errors first.
+   A missing Space key, failed Space scan, or unavailable model route is not a
+   LangBot performance regression.
+4. Run
+   `bin/lbs suite run langbot-user-path-performance-gate --include-manual-check`.
+5. Interpret `response_p95_ms` as browser-visible send-to-completion time. It
+   includes provider latency; use backend logs and model test evidence to
+   separate LangBot overhead from the external model route.
+
+The setup keeps a `max-round` value in the generated pipeline config only
+because the current backend truncator still reads that field directly. Do not
+use it as a quality requirement for future local-agent behavior.
+
 ## Running The First Gate

 Start with the reusable suite: