docs(docker): add Box sandbox runtime to k8s manifest and deploy guide

The k8s manifest was missing the Box runtime that backs the sandbox
tools, the activate skill tool, skill add/edit and stdio MCP. Add a
langbot-box Deployment/Service (port 5410), wire langbot to it via
BOX__RUNTIME__ENDPOINT (explicit Service name since the in-container
default langbot_box uses an underscore, invalid for k8s DNS), and share
the Box workspace root as a node hostPath pinned via podAffinity so the
node Docker daemon resolves bind-mount paths consistently. Document the
component, the shared-FS constraint, security implications and readiness
checks in README_K8S.md (zh + en).
RockChinQ
2026-06-07 11:18:27 -04:00
parent e223edeb45
commit 5c3a619e2d
2 changed files with 269 additions and 2 deletions

@@ -30,10 +30,18 @@ Kubernetes 部署包含以下组件:
- WebSocket 通信(端口 5400)
- 插件数据持久化卷
3. **持久化存储**:
3. **langbot-box**: Box 沙箱运行时服务(可选)
- WebSocket 通信(端口 5410)
- 为 LangBot 提供沙箱工具(exec / read / write / edit / glob / grep)、`activate` 技能工具、技能新增/编辑,以及 stdio 模式的 MCP 服务器
- 通过挂载节点的 Docker socket 创建沙箱容器(镜像仅自带 Docker CLI,不含 dockerd / nsjail)
- 使用 hostPath 作为工作区根目录,并通过 podAffinity 与 langbot 调度到同一节点(详见下方「Box 沙箱运行时」一节)
- 如不需要沙箱能力,可不部署此组件,并在 langbot 上设置 `BOX__ENABLED=false`
4. **持久化存储**:
- `langbot-data`: LangBot 主数据
- `langbot-plugins`: 插件文件
- `langbot-plugin-runtime-data`: 插件运行时数据
- Box 工作区根目录使用节点上的 hostPath(`/app/data/box`),而非 PVC(原因见下方说明)
### 快速开始
@@ -152,6 +160,48 @@ resources:
cpu: "2000m"
```
### Box 沙箱运行时
`langbot-box` 为 LangBot 提供代码沙箱能力,支撑以下功能:
- 原生沙箱工具:`exec` / `read` / `write` / `edit` / `glob` / `grep`
- 技能(Skill)的 `activate` 工具、技能的新增与编辑
- stdio 模式的 MCP 服务器
它是**可选组件**。不部署时,LangBot 仍可正常运行,仪表盘和技能列表只读可见,但上述沙箱相关能力会被禁用。此时请在 `langbot` Deployment 上设置 `BOX__ENABLED=false`(或在 `data/config.yaml` 中设置 `box.enabled: false`),以匹配实际部署。
#### 工作原理与关键约束
LangBot 官方镜像**只内置了 Docker CLI**(不含 dockerd,也不含 nsjail)。因此 Box 运行时通过挂载到节点的 Docker socket(`/var/run/docker.sock`)来创建沙箱容器——它本身不在 Pod 内跑 dockerd。
由此带来一个**关键约束**:负责创建沙箱容器的是**节点上的 Docker 守护进程**,它解析 bind-mount 路径时使用的是**节点文件系统**的视角。所以 Box 工作区根目录必须在以下三处是**同一个绝对路径**:
1. 节点上的实际路径
2. `langbot-box` 容器内的挂载路径
3. 它创建的每个沙箱容器内的挂载路径
这正是本 manifest 不用普通 PVC、而用 `hostPath`(固定在 `/app/data/box`)的原因:Pod 内 PVC 的路径只存在于 Pod 的 mount namespace 中,节点的 dockerd 看不到。同时,`langbot` 与 `langbot-box` 通过 `podAffinity` 被强制调度到**同一节点**,以共享这个 hostPath。
#### 连接与配置
- `langbot` 通过 WebSocket 连接 Box 运行时,端点由 ConfigMap 中的 `BOX__RUNTIME__ENDPOINT: ws://langbot-box:5410` 指定。
> 注意:容器内默认主机名是 `langbot_box`(带下划线),而下划线不是合法的 Kubernetes DNS 名称,因此这里**必须显式**用合法的 Service 名 `langbot-box` 指定端点。
- Box 运行时**不读取**自身的 `box.local.*` / `BOX__*` 环境变量;它的配置由 LangBot 在连接时通过 INIT RPC 下发。因此 `BOX__LOCAL__*`(`HOST_ROOT` / `DEFAULT_WORKSPACE` / `SKILLS_ROOT` / `ALLOWED_MOUNT_ROOTS`)都配置在 `langbot` Deployment 上,其中 `HOST_ROOT` 必须与两侧的 `box-root` 挂载路径一致(`/app/data/box`)。
#### 安全提示
挂载节点的 Docker socket 会让 Box 运行时(以及在沙箱中执行的任意代码)获得对节点的**实质 root 权限**。请仅在你信任该工作负载的节点上部署 Box,最好使用专用节点池并配合污点/容忍(taint/toleration)隔离。若需要更强的隔离边界,可将 `box.backend` 切换为 `e2b`(设置 `E2B_API_KEY`),并移除 docker.sock 挂载与 hostPath。
#### 验证 Box 是否就绪
```bash
# 查看 Box 运行时日志(应能看到选用的 backend,例如 "using backend: docker")
kubectl logs -n langbot -l app=langbot-box -f
# 在 langbot 日志中确认已连上 Box 运行时
kubectl logs -n langbot -l app=langbot | grep -i "box runtime"
```
### 常用操作
#### 查看日志
@@ -343,10 +393,18 @@ The Kubernetes deployment includes the following components:
- WebSocket communication (port 5400)
- Plugin data persistence volume
3. **Persistent Storage**:
3. **langbot-box**: Box sandbox runtime service (optional)
- WebSocket communication (port 5410)
- Backs LangBot's sandbox tools (exec / read / write / edit / glob / grep), the `activate` skill tool, skill add/edit, and stdio-mode MCP servers
- Creates sandbox containers via the node's mounted Docker socket (the image ships only the Docker CLI — no dockerd / nsjail)
- Uses a hostPath as its workspace root and is co-scheduled with langbot on the same node via podAffinity (see the "Box sandbox runtime" section below)
- If you do not need the sandbox, skip this component and set `BOX__ENABLED=false` on langbot
4. **Persistent Storage**:
- `langbot-data`: LangBot main data
- `langbot-plugins`: Plugin files
- `langbot-plugin-runtime-data`: Plugin runtime data
- The Box workspace root uses a node hostPath (`/app/data/box`), not a PVC (see the explanation below)
### Quick Start
@@ -465,6 +523,48 @@ resources:
cpu: "2000m"
```
### Box Sandbox Runtime
`langbot-box` provides the code-sandbox capability backing the following LangBot features:
- Native sandbox tools: `exec` / `read` / `write` / `edit` / `glob` / `grep`
- The skill `activate` tool, and skill add/edit
- stdio-mode MCP servers
It is an **optional component**. Without it, LangBot still runs and the dashboard / skills list remain visible (read-only), but the sandbox features above are disabled. In that case set `BOX__ENABLED=false` on the `langbot` Deployment (or `box.enabled: false` in `data/config.yaml`) so the configuration matches what is actually deployed.
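As a rough sketch of the opt-out path described above, the helpers below wrap the two `kubectl` calls involved. The function names are hypothetical, and the sketch assumes the Deployment is named `langbot` and lives in the `langbot` namespace, as in this manifest.

```bash
# Hypothetical helper: disable the Box integration on the langbot Deployment.
# Assumes Deployment "langbot" in namespace "langbot".
disable_box() {
  kubectl set env deployment/langbot -n langbot BOX__ENABLED=false
}

# Hypothetical helper: remove the optional Box Deployment and Service.
remove_box() {
  kubectl delete deployment,service langbot-box -n langbot
}
```

Calling `disable_box` followed by `remove_box` should leave LangBot running without the sandbox. The `BOX__RUNTIME__ENDPOINT` entry in the ConfigMap can then be removed by hand.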
#### How it works & the key constraint
The official LangBot image ships **only the Docker CLI** (no dockerd, no nsjail). The Box runtime therefore creates sandbox containers by talking to the node's Docker daemon over the mounted socket (`/var/run/docker.sock`) — it does not run dockerd inside the Pod.
This imposes a **key constraint**: the daemon that creates sandbox containers is the **node's Docker daemon**, which resolves bind-mount paths against the **node filesystem**. So the Box workspace root must be the **same absolute path** in all three places:
1. The actual path on the node
2. The mount path inside the `langbot-box` container
3. The mount path inside every sandbox container it spawns
This is exactly why this manifest uses a `hostPath` (fixed at `/app/data/box`) instead of a regular PVC: a PVC path only exists inside the Pod's mount namespace, which the node's dockerd cannot see. `langbot` and `langbot-box` are also pinned to the **same node** via `podAffinity` so they share this hostPath.
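The same-path constraint can be spot-checked from the cluster side. The sketch below is one possible check, not part of the manifest: it reads the `box-root` hostPath and mountPath from both Deployments and fails if any of them differ. The function names are hypothetical, and it assumes the volume is called `box-root` and is mounted in the first container of each Deployment.

```bash
# Print the hostPath backing the box-root volume of Deployment $1.
box_root_host_path() {
  kubectl get deployment "$1" -n langbot \
    -o jsonpath='{.spec.template.spec.volumes[?(@.name=="box-root")].hostPath.path}'
}

# Print where the first container of Deployment $1 mounts box-root.
box_root_mount_path() {
  kubectl get deployment "$1" -n langbot \
    -o jsonpath='{.spec.template.spec.containers[0].volumeMounts[?(@.name=="box-root")].mountPath}'
}

# Succeed only if both Deployments use one identical path
# for the node-side hostPath and the in-container mountPath.
check_box_root() {
  expected="$(box_root_host_path langbot-box)"
  [ -n "$expected" ] || return 1
  for dep in langbot langbot-box; do
    [ "$(box_root_host_path "$dep")" = "$expected" ] || return 1
    [ "$(box_root_mount_path "$dep")" = "$expected" ] || return 1
  done
}
```

For example, `check_box_root && echo "box root consistent"` prints the message only when every path agrees. The check cannot see the path used inside sandbox containers, which the Box runtime derives from `HOST_ROOT`.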
#### Connection & configuration
- `langbot` connects to the Box runtime over WebSocket, using the endpoint from the ConfigMap: `BOX__RUNTIME__ENDPOINT: ws://langbot-box:5410`.
> Note: the in-container default hostname is `langbot_box` (with an underscore), which is **not** a valid Kubernetes DNS name. The endpoint must therefore be set **explicitly** to the valid Service name `langbot-box`.
- The Box runtime does **not** read its own `box.local.*` / `BOX__*` environment variables; its configuration is pushed by LangBot via the INIT RPC on connect. So `BOX__LOCAL__*` (`HOST_ROOT` / `DEFAULT_WORKSPACE` / `SKILLS_ROOT` / `ALLOWED_MOUNT_ROOTS`) are set on the `langbot` Deployment, where `HOST_ROOT` must match the `box-root` mountPath on both sides (`/app/data/box`).
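To illustrate why `langbot_box` cannot be used as a Service hostname, here is a minimal sketch of a DNS label check. The helper name `is_dns_label` is hypothetical. The pattern follows the RFC 1035 label form that Kubernetes has traditionally required for Service names: lowercase letters, digits and `-`, starting with a letter, ending with a letter or digit, at most 63 characters.

```bash
# Succeed if $1 is a valid RFC 1035 DNS label (illustrative helper only).
is_dns_label() {
  [ "${#1}" -le 63 ] &&
    printf '%s\n' "$1" | grep -Eq '^[a-z]([-a-z0-9]*[a-z0-9])?$'
}
```

With this helper, `is_dns_label langbot-box` succeeds while `is_dns_label langbot_box` fails, because the underscore is outside the allowed character set.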
#### Security note
Mounting the node's Docker socket grants the Box runtime (and any code executed in the sandbox) effective root on the node. Only deploy Box on nodes you trust for this workload, ideally a dedicated node pool isolated with taints/tolerations. For a stronger isolation boundary, switch `box.backend` to `e2b` (set `E2B_API_KEY`) and remove both the docker.sock mount and the hostPath.
#### Verify Box is ready
```bash
# Box runtime logs (should show the selected backend, e.g. "using backend: docker")
kubectl logs -n langbot -l app=langbot-box -f
# Confirm langbot connected to the Box runtime
kubectl logs -n langbot -l app=langbot | grep -i "box runtime"
```
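Beyond reading logs, it may help to confirm that the rollout finished and that `podAffinity` placed both Pods on one node. The helpers below are a sketch with hypothetical names. They assume the `langbot` namespace and the `app=langbot` / `app=langbot-box` labels from this manifest, with one Pod per Deployment.

```bash
# Wait until the langbot-box Deployment has finished rolling out.
wait_for_box() {
  kubectl rollout status deployment/langbot-box -n langbot --timeout=120s
}

# Print the node name of the first Pod matching label selector $1.
pod_node() {
  kubectl get pods -n langbot -l "$1" -o jsonpath='{.items[0].spec.nodeName}'
}

# Succeed only if langbot and langbot-box are scheduled on the same node.
same_node() {
  box_node="$(pod_node app=langbot-box)"
  [ -n "$box_node" ] && [ "$box_node" = "$(pod_node app=langbot)" ]
}
```

A typical sequence would be `wait_for_box && same_node && echo "Box is co-scheduled with langbot"`.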
### Common Operations
#### View Logs

@@ -83,6 +83,11 @@ metadata:
data:
TZ: "Asia/Shanghai"
PLUGIN__RUNTIME_WS_URL: "ws://langbot-plugin-runtime:5400/control/ws"
# Box sandbox runtime endpoint. LangBot connects to the Box runtime over
# WebSocket. The hostname MUST match the langbot-box Service name. Note the
# in-container default ("langbot_box") uses an underscore, which is an
# invalid Kubernetes DNS name — so the endpoint is always set explicitly here.
BOX__RUNTIME__ENDPOINT: "ws://langbot-box:5410"
---
# Deployment for LangBot Plugin Runtime
@@ -169,6 +174,136 @@ spec:
protocol: TCP
name: runtime
---
# Deployment for LangBot Box (sandbox) runtime
#
# The Box runtime backs LangBot's sandbox tools (exec / read / write / edit /
# glob / grep), the `activate` skill tool, skill add/edit, and stdio-mode MCP
# servers. It is OPTIONAL: if you do not deploy it, set `BOX__ENABLED=false` on
# the langbot Deployment (or `box.enabled: false` in config.yaml) so the
# dashboard renders cleanly with sandbox features disabled.
#
# IMPORTANT — how the sandbox actually runs:
# The bundled image ships only the Docker CLI (no dockerd, no nsjail). The Box
# runtime therefore creates sandbox containers by talking to a Docker daemon
# over the mounted socket (`/var/run/docker.sock`). Because that daemon
# resolves bind-mount paths on the NODE filesystem, the Box workspace root
# must be the SAME absolute path inside the box container, inside every
# sandbox container it spawns, AND on the node. That is why this manifest uses
# a hostPath at a fixed absolute path (/app/data/box) and pins langbot + box
# to the same node via podAffinity. A normal PVC will NOT work for the box
# workspace, because the node's dockerd cannot see paths that exist only
# inside the pod's mount namespace.
#
# Security note: mounting the host Docker socket grants the Box runtime (and any
# code executed in the sandbox) effective root on the node. Only deploy Box on
# nodes you trust for this workload, ideally a dedicated node pool. For a
# stronger isolation boundary, switch box.backend to 'e2b' (set E2B_API_KEY) and
# drop the docker.sock mount + hostPath entirely.
apiVersion: apps/v1
kind: Deployment
metadata:
name: langbot-box
namespace: langbot
labels:
app: langbot-box
spec:
replicas: 1
selector:
matchLabels:
app: langbot-box
template:
metadata:
labels:
app: langbot-box
spec:
# Pin to the same node as langbot so they share the hostPath box root.
affinity:
podAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchLabels:
app: langbot
topologyKey: kubernetes.io/hostname
containers:
- name: langbot-box
image: rockchin/langbot:latest
imagePullPolicy: Always
# Launched through the same CLI entry point as the plugin runtime.
# No flag => WebSocket control transport (default), listening on 5410.
command: ["uv", "run", "--no-sync", "-m", "langbot_plugin.cli.__init__", "box"]
ports:
- containerPort: 5410
name: box-rpc
protocol: TCP
env:
- name: TZ
valueFrom:
configMapKeyRef:
name: langbot-config
key: TZ
# The Box runtime does NOT read box.local.* / BOX__* from its own env;
# it receives its configuration from LangBot via the INIT RPC action.
# Do not add BOX__* here — they would be silently ignored.
volumeMounts:
# Box workspace root — identical path on node, box, and sandbox
# containers (see the IMPORTANT note above).
- name: box-root
mountPath: /app/data/box
# Host Docker socket — the sandbox backend uses it to create containers.
- name: docker-sock
mountPath: /var/run/docker.sock
resources:
requests:
memory: "256Mi"
cpu: "100m"
limits:
memory: "1Gi"
cpu: "1000m"
livenessProbe:
tcpSocket:
port: 5410
initialDelaySeconds: 20
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
tcpSocket:
port: 5410
initialDelaySeconds: 10
periodSeconds: 5
timeoutSeconds: 3
failureThreshold: 3
volumes:
- name: box-root
hostPath:
path: /app/data/box
type: DirectoryOrCreate
- name: docker-sock
hostPath:
path: /var/run/docker.sock
type: Socket
restartPolicy: Always
---
# Service for LangBot Box runtime
apiVersion: v1
kind: Service
metadata:
name: langbot-box
namespace: langbot
labels:
app: langbot-box
spec:
type: ClusterIP
selector:
app: langbot-box
ports:
- port: 5410
targetPort: 5410
protocol: TCP
name: box-rpc
---
# Deployment for LangBot
apiVersion: apps/v1
@@ -213,11 +348,36 @@ spec:
configMapKeyRef:
name: langbot-config
key: PLUGIN__RUNTIME_WS_URL
# Box (sandbox) runtime endpoint. Connects LangBot to the langbot-box
# Service over WebSocket. Remove this (and the langbot-box Deployment)
# and set BOX__ENABLED=false if you do not want the sandbox.
- name: BOX__RUNTIME__ENDPOINT
valueFrom:
configMapKeyRef:
name: langbot-config
key: BOX__RUNTIME__ENDPOINT
# box.local.* config — forwarded to the Box runtime via INIT RPC. The
# host_root MUST match the box-root hostPath mountPath below AND the box
# Deployment's box-root mountPath, so that skill package paths resolve
# identically on both sides and on the node's Docker daemon.
- name: BOX__LOCAL__HOST_ROOT
value: "/app/data/box"
- name: BOX__LOCAL__DEFAULT_WORKSPACE
value: "default"
- name: BOX__LOCAL__SKILLS_ROOT
value: "skills"
- name: BOX__LOCAL__ALLOWED_MOUNT_ROOTS
value: "/app/data/box"
volumeMounts:
- name: data
mountPath: /app/data
- name: plugins
mountPath: /app/plugins
# Same node-level box root as the langbot-box Deployment. Mounted over
# the data PVC's /app/data/box subpath so both LangBot and the Box
# runtime (and the node's dockerd) agree on one absolute path.
- name: box-root
mountPath: /app/data/box
resources:
requests:
memory: "1Gi"
@@ -250,6 +410,13 @@ spec:
- name: plugins
persistentVolumeClaim:
claimName: langbot-plugins
# Node-level box workspace root, shared with the langbot-box Deployment.
# hostPath (not PVC) because the node's Docker daemon must see the same
# absolute path when bind-mounting workspaces into sandbox containers.
- name: box-root
hostPath:
path: /app/data/box
type: DirectoryOrCreate
restartPolicy: Always
---