Commit Graph

3031 Commits

Author SHA1 Message Date
dependabot[bot] 7a844682b3 chore(deps): bump github.com/shirou/gopsutil/v4 from 4.26.5 to 4.26.6 (#5730)
Bumps [github.com/shirou/gopsutil/v4](https://github.com/shirou/gopsutil) from 4.26.5 to 4.26.6.
- [Release notes](https://github.com/shirou/gopsutil/releases)
- [Commits](https://github.com/shirou/gopsutil/compare/v4.26.5...v4.26.6)

---
updated-dependencies:
- dependency-name: github.com/shirou/gopsutil/v4
  dependency-version: 4.26.6
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-07-02 18:26:39 +02:00
dependabot[bot] 6626bf4a07 chore(deps): bump google.golang.org/grpc from 1.81.1 to 1.82.0 (#5729)
Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.81.1 to 1.82.0.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.81.1...v1.82.0)

---
updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-version: 1.82.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-07-02 18:26:13 +02:00
MHSanaei c0df365524 chore(frontend): bump minor npm deps
Update frontend dependencies to newer patch/minor versions in package.json and refresh package-lock accordingly. This includes runtime libraries (i18next, react-router-dom, recharts) and tooling updates (typescript-eslint, vite) to keep the frontend stack current and aligned.
2026-07-02 18:24:09 +02:00
Vitaliy Pavlov 5361b56e5e fix(update): avoid full dnf system upgrade (#5717) 2026-07-02 18:20:15 +02:00
nima1024m 9e13b32c34 fix: make all self-managed file downloads/installs atomic, with real completion status (#5711)
* fix(script): download the live x-ui.sh script atomically before replacing it

update_menu(), update_shell(), and update.sh's update_x-ui() all overwrote
/usr/bin/x-ui in place via `curl -o`, truncating and rewriting the same
inode a currently-running x-ui process may still be reading from. A
network hiccup or slow write during that overwrite leaves a
half-old/half-new script on disk, which then fails with bogus syntax
errors on the next run. Download to /usr/bin/x-ui-temp and `mv -f` into
place instead, matching the atomic pattern install.sh already uses.

Also fixes update_menu() checking chmod's exit code instead of curl's,
which meant a failed download could still report "Update successful."

* fix(script): close remaining gaps in the atomic script-update path

Code review of the previous commit found the atomic mv fix was itself
incomplete:

- None of the mv -f calls checked their exit status, so a failed move
  fell through to chmod and "success" messaging while /usr/bin/x-ui
  stayed on the old file.
- update_shell()'s `[[ -s x-ui-temp ]]` guard couldn't tell "curl -z
  got a 304, nothing to do" from "a stale temp file survived an
  earlier crashed run" -- the latter could get moved into place with
  no freshness check.
- update_menu(), update_shell(), and update_x-ui() all hardcoded the
  same /usr/bin/x-ui-temp path, so two concurrent updates (e.g. a
  cron auto-update racing an interactive menu update) could collide.
- update.sh's update_x-ui() was missing the non-empty-file guard
  update_shell() already had.

x-ui.sh's update_menu() and update_shell() now share a
replace_xui_script() helper that uses a PID-suffixed temp path
(/usr/bin/x-ui-temp.$$), pre-cleans it before every attempt, and
checks the exit status of curl, the non-empty test, and mv before
treating the update as successful. update.sh's update_x-ui() gets the
same sequence inlined (it's fetched as a standalone script and can't
call x-ui.sh's function), closing the missing-guard gap and using its
own unique temp path.

* fix(script,panel): harden the remaining self-update download paths

install.sh had the same unguarded /usr/bin/x-ui-temp overwrite the two
already-fixed scripts had: no exit-status check on mv, and a fixed temp
name shared with x-ui.sh/update.sh's (now-unique) temp files. Give it
its own PID-suffixed temp path, an empty-file guard, and an mv
exit-status check, matching the pattern used there.

Audited the web dashboard's Go-native updater (panel.go) for the same
bug class: it already uses os.CreateTemp for a genuinely unique temp
file and cleans up via both a deferred Remove and a shell EXIT trap, so
it was never exposed to the fixed-path race. It was missing a check
for a zero-byte download (a 200 OK with an empty body would chmod +x
and exec an empty script) -- added that alongside the existing size
cap.

Not addressed here: once startUpdate()'s child process starts, the Go
service releases it and returns success immediately. If update.sh
fails partway through, the still-running old panel keeps answering
/status, so the frontend's poll can report success with no update
having happened. Fixing that needs update.sh to signal completion
status back and the frontend to check it -- a separate follow-up.

* feat(panel): report real completion status for the web self-update

Fixes the fire-and-forget gap flagged in the atomic-overwrite fix: once
startUpdate() launches update.sh detached, the Go service had no way to
learn whether it actually succeeded. If update.sh failed partway
(network drop, disk full, permission denied), the still-running old
panel kept answering /status, so the frontend's poll reported success
with nothing having changed.

update.sh now writes its outcome to a small JSON status file
(/etc/x-ui/update-status.json by default) via `trap ... EXIT`, which
covers every exit path in the script -- including the two bare `exit 1`
call sites that don't go through the existing _fail() helper. The Go
service generates a run ID before launching, passes it and the status
path to update.sh via XUI_UPDATE_RUN_ID/XUI_UPDATE_STATUS_FILE, and a
new GET /panel/api/server/getUpdateStatus endpoint reports it back. The
frontend now polls that instead of blindly trusting HTTP reachability,
and shows a distinct error or "couldn't confirm" message instead of
silently reloading into a false success.

Adversarial review of this surfaced three more issues, fixed here:
- No lock stopped two concurrent /updatePanel calls from launching two
  update.sh runs that would race each other on the actual update work
  (tar extraction, service unit swap). Added an in-memory guard with a
  5-minute self-expiring window, so a run that never reaches a terminal
  state doesn't lock out retries indefinitely.
- XUI_UPDATE_RUN_ID is read from the environment and was interpolated
  unquoted into the status JSON; a malformed value would produce
  invalid JSON. Now validated as digits-only before use.
- The run ID is a UnixNano timestamp (19 digits), sent as a raw JSON
  number it would lose precision in JavaScript (past
  Number.MAX_SAFE_INTEGER), letting two different runs round to the
  same value on the wire and defeat the whole comparison. It's now a
  decimal string end to end (Go, the status file, and the generated
  frontend type).

install.sh's equivalent temp-file/mv path and the Go-native
downloadPanelUpdater() path were audited for the same bug classes
during this work; findings from that audit were addressed separately.

* fix(panel): release the update lock as soon as the run finishes

An exhaustive multi-angle review of the whole branch (12 finder angles,
3-vote adversarial verification, a fresh-eyes sweep) surfaced a real
bug in the concurrency guard added in the previous commit, plus several
smaller issues; this fixes what's actionable now.

The bug: acquireUpdateSlot only ever released on the 5-minute stale
timeout or if launching itself failed. If update.sh launched fine but
failed fast (bad GitHub API response, "x-ui not installed", any of its
early exit paths), the status file correctly reported "failed" within
seconds, but a retry was still rejected with "a panel update is
already in progress" for up to 5 more minutes -- the guard never
looked at the very status file this branch built to know a run was
done. It now tracks which run ID currently holds the slot and checks
that run's own status before falling back to the timeout, so a fast
failure clears the way for an immediate retry. Added a regression test
for this, plus one confirming a stale, unrelated runID can't be
mistaken for the current run finishing.

Also:
- Added a genuinely concurrent test for the guard: 200 goroutines
  racing acquireUpdateSlot, asserting exactly one wins. The previous
  tests only ever called it from one goroutine, so they gave no signal
  if the mutex's check-then-set were silently broken -- verified this
  by temporarily removing the lock and confirming the old tests still
  passed while the new one caught it immediately under -race.
- Removed the redundant upfront "pending" status write: GetUpdateStatus
  already defaults a missing/stale file to pending, and the frontend
  matches by run ID regardless, so the write changed no observable
  behavior. Deleted writeUpdateStatus entirely since that was its only
  caller.
- Renamed replace_xui_script()'s unclear "conditional" parameter to
  use_if_modified_since, matching what it actually controls.
- Added HTTP-level tests for the new getUpdateStatus endpoint,
  including a regression test that the runId wire format is a JSON
  string (decoding into a Go string field fails outright if it were
  ever a bare number). updatePanel's actual launch path is not
  covered: on a Linux test runner it would make a real network call
  and could exec a real update.sh, so only its non-Linux guard path is
  safely testable without mocking.

Not fixed here, tracked separately: the same unsafe-overwrite pattern
this branch eliminated for /usr/bin/x-ui is still present for the
systemd unit file install in update.sh and install.sh (lower severity
since systemd only reads it on daemon-reload, not continuously); and
startUpdate's systemd-run-vs-detached-fallback branching has no test
coverage since testing it safely needs dependency injection this fix
doesn't warrant bundling in.

* fix(script): make systemd unit file installation atomic

Same anti-pattern as the /usr/bin/x-ui overwrite fixed earlier: every
site that lands the systemd unit at ${xui_service}/x-ui.service --
copying it from the extracted release tarball, or falling back to a
GitHub download per distro family -- wrote straight onto the live
path via cp/curl, no temp file, no verification. A network drop
mid-download or an interrupted cp leaves the unit file truncated;
systemd then fails to parse it on the next daemon-reload/start,
leaving the panel unable to come up until an operator manually
re-copies a good unit file.

Lower severity than the /usr/bin/x-ui case (systemd only reads this
file on demand at daemon-reload time, not continuously the way bash
interprets a running script line by line), but it's the identical
gap, just left uncovered when that fix landed.

Added a small shared helper in both update.sh and install.sh --
_install_xui_service_unit() -- covering both source types (cp from
the tarball, curl from GitHub): write to a PID-suffixed temp file,
verify the copy/download succeeded and the result is non-empty, then
mv -f into place and check that exit status too, matching the pattern
already used for /usr/bin/x-ui. All 4 cp sites and the 3-way curl
fallback in each file now go through it; verified no other site
writes new content to the unit path (the remaining ${xui_service}
references are a pre-install existence check, an rm during old-version
cleanup, and the chown/chmod that already ran after the file is safely
in place -- none of those need atomicity).

Verified with bash -n on both files, plus a standalone scratch test
exercising cp-success, cp-with-missing-source, cp-with-empty-source,
and curl-failure paths: on every failure the previous, good unit file
content is left untouched and no temp file is leaked behind.

* fix(script): make Alpine's OpenRC init script install atomic; drop a stray comment

A final maximum-rigor review of the whole PR (12 finder angles including
a repo-wide sweep for any remaining instance of the bug class this PR
fixes) found two more real issues:

- Alpine's /etc/init.d/x-ui startup script is downloaded via a bare
  `curl -fLRo` straight onto the live path in both update.sh and
  install.sh -- the exact same unguarded-overwrite pattern already
  fixed for /usr/bin/x-ui and the systemd unit file, just left
  uncovered on the OpenRC side. A network drop mid-download truncates
  the live init script; OpenRC then fails to source/execute it on the
  next start, leaving the panel unable to come up. Fixed with the same
  temp-file + non-empty check + mv -f (with its own exit-status check)
  pattern used everywhere else in this PR. Verified with bash -n and a
  standalone scratch-script test covering success, empty-download, and
  destination-preserved-on-failure paths.

- internal/web/service/panel/panel_test.go had one line-level `//`
  comment on a call site, which the root CLAUDE.md's hard rule ("No //
  line comments in committed Go/TS... rename instead of annotating")
  explicitly prohibits. The comment duplicated context already stated
  in the test's own doc comment two lines above, so it's simply
  removed rather than reworded.

Also flagged, deliberately not bundled here since it's a different
subsystem: x-ui.sh's update_geofiles() downloads Xray's live
geoip.dat/geosite.dat with the same unguarded curl -o pattern. Tracked
as its own follow-up.

* fix(script): make geo-data file downloads atomic

Same anti-pattern as /usr/bin/x-ui, the systemd unit file, and the
Alpine init script fixed in prior PRs: update_geofiles() downloaded
Xray's live geoip.dat/geosite.dat (and the IR/RU variants) with curl
writing straight onto the exact path Xray reads at runtime
(internal/xray/process.go's GetGeoipPath/GetGeositePath), no temp
file, no verification. The existing check only inspected the reported
HTTP status via -w '%{http_code}', not file integrity, so a network
drop mid-download could leave a truncated .dat file on disk that
passes the status check. Xray then fails to parse it on the next
restart/reload, breaking any routing rules that reference geoip:/
geosite:.

The -z conditional-GET usage needed care here: the original code
pointed both -z and -o at the same live path. Fixed by pointing -z at
the live file (to keep the "already current" freshness check) while
-o writes to a PID-suffixed temp file, matching the pattern already
proven in x-ui.sh's replace_xui_script(). Verified with a local HTTP
server that a 304 response leaves the temp file untouched/nonexistent
(so the existing "already up to date" branch still works unchanged),
and added a non-empty check plus a checked mv -f before treating a
download as installed.

Verified with bash -n and an end-to-end scratch test against a local
server covering: fresh download, 304-not-modified, empty response
body, and a 404 -- confirming a failure at any stage leaves the
previous good .dat file completely untouched and no temp file behind.

* fix(script): verify the release tarball extraction, not just the download

The final maximum-rigor review found the most significant remaining gap
in this whole effort: update.sh and install.sh check the tarball
download's exit status, but never check tar's exit status, and never
verify the extracted x-ui binary actually exists before continuing.
Worse, by the time extraction runs, the previous installation has
already been stopped and deleted -- there's no rollback. A truncated
download that still passes curl's own check, or a tar failure (disk
full, killed process), left the panel silently in a broken half-state:
chmod/config/service-install all continued to run against a missing or
empty binary, with no error surfaced anywhere. This is the same bug
class as everything else in this PR (unverified write to a path
something then depends on), just for the tarball itself rather than a
single file -- and it also covers the geo-data files this PR already
fixed once for the interactive/cron path, since they ship inside this
same tarball on every panel update.

Added: a non-empty check on the downloaded archive (both files, both
install.sh call sites) and a check that tar succeeded and produced a
non-empty x-ui binary before proceeding, failing loudly with a message
that explicitly says the previous install is already gone, since
silently continuing here is worse than anywhere else in this PR.

This doesn't make the multi-file extraction fully atomic (that would
mean extracting to a temp directory and atomically swapping the whole
install tree into place, a materially larger restructuring than
anything else in this PR) -- but it closes the "fails silently, user
discovers it days later when Xray can't start" gap, which was the
actual reported problem this whole effort traces back to.

Also fixed, all much smaller:
- replace_xui_script() in x-ui.sh implicitly returned chmod's exit
  status instead of success, so a successful atomic install could be
  reported as failed if chmod transiently failed after the mv already
  landed the new script. Added an explicit `return 0`.
- update_geofiles() had no default case branch; an unrecognized
  argument would silently reuse whatever dat_files/dat_source values a
  previous call left in the un-scoped globals instead of failing.
  Currently unreachable (all three call sites pass fixed literals) but
  cheap, defensive, and worth having.
- internal/web/controller/server.go's updatePanel has one branch (an
  unparseable "dev" form value) that's both untested and safe to test
  on any platform, since it's rejected before any real exec/network
  call. Added the missing test case.

Verified: bash -n on all three scripts; an empirical scratch test
covering an empty downloaded archive, a corrupt (non-gzip) archive,
and a successfully-extracting-but-empty archive, confirming each is
caught before the script proceeds; full go build/vet/test -race
across the whole module; frontend generation confirmed still in sync.

* fix(panel): base the update-slot staleness fallback on process liveness

Addresses the automated review on the upstream PR (MHSanaei/3x-ui#5711).

Blocking finding: acquireUpdateSlot's staleness fallback freed the
update slot purely on elapsed wall-clock time (5 minutes), with no
check on whether the update.sh process it launched was actually still
running. update.sh runs install_base() (apt-get/dnf/pacman update and
install) before update_x-ui even starts, plus several GitHub
downloads (release tarball, x-ui.sh, and possibly a service unit or
x-ui.rc) -- on a slow or throttled host, a small VPS being the typical
deployment target for this project, that alone can plausibly exceed 5
minutes with nothing wrong. A second /updatePanel call arriving in
that window (an admin retrying after the frontend's 90s poll times
out, or overlapping master-node bulk-update calls) would launch a
second update.sh, racing the exact rm/tar/mv/systemctl sequence this
whole PR exists to make safe.

Fixed by recording the launched process's PID (detached-fallback path
only; the systemd-run path's own process has already exited by the
time startUpdate returns, so it never learns update.sh's real PID) and
checking it via the standard POSIX kill(pid, 0) liveness probe before
treating a run as stale, following the existing panel_unix.go /
panel_other.go platform-split pattern already used for
setDetachedProcess. A confirmed-alive process now keeps the slot held
past updateStaleAfter (raised from 5 to 20 minutes as a safer baseline
for the systemd-run path, which still has no way to check liveness
directly). updateHardCeiling (2 hours) is an absolute backstop so a
genuinely wedged run can never lock out retries permanently even on
the PID-tracked path.

Added two regression tests exercising the new logic (gated to Linux,
since processAlive is a no-op stub elsewhere): a live PID keeps the
slot held past the stale window, and the hard ceiling overrides
liveness. Traced both by hand against the new acquireUpdateSlot logic;
could not execute-verify processAlive itself on this Windows dev
machine (no WSL distro installed, and installing one felt
disproportionate to validate kill(pid, 0), an extremely well-established
POSIX primitive), but cross-compiled clean for linux/amd64 and this
repo's CI runs the real test suite on Linux.

Also fixed, both suggestions from the same review:
- install.sh: two failure paths right after tarball extraction were
  exiting without cleaning up the already-downloaded x-ui.sh temp file
  (xui_script_temp), leaving it behind. Every other new failure branch
  in this PR removes its temp file before exiting; these two now do
  too.
- frontend/src/pages/api-docs/endpoints.ts: updatePanel's doc entry
  did not reflect that a successful response now carries an obj with
  runId. Added an inline response example matching the existing
  pattern used for other ad hoc (non-schema-backed) responses like
  getWebCertFiles.

Verified: go build/vet clean on both windows (native) and a linux/amd64
cross-compile; full go test ./... clean; go test -race on the panel
and controller packages; bash -n on all three shell scripts; npm run
gen confirms the openapi.json diff is exactly the new response example
with no stray changes to src/generated; TestAPIRoutesDocumented still
passes.
2026-07-02 18:19:33 +02:00
nima1024m ade74eb321 fix(balancers): keep mixed strategies on one observer (#5674)
* fix(balancers): keep mixed strategies on one observer

Xray resolves Observatory and Burst Observatory through the same global observer feature. When any burst-required strategy is present, keep all observer-backed balancer selectors on burstObservatory and remove the regular observatory so mixed leastPing configs cannot generate two competing observer blocks.

* test(balancers): cover observer strategy combinations

Exercise the observer sync matrix for random, round-robin, leastPing, and leastLoad balancers. Include mixed and stale-observer cases so the panel keeps only the observer type that Xray should consume.

* fix(balancers): clarify observer empty state

Update the Observatory tab empty hint to describe the actual auto-managed cases. Least Ping, Least Load, and fallback Random or Round-robin balancers now explain why an observer is added before the balancer can choose a target.

* fix(balancers): remove mixed observer switch

Show only the observer settings panel that matches the current balancer requirements. Legacy configs that still contain both observatory blocks now display a warning instead of a tab switch, since saving balancers normalizes the config back to one global observer.

* test(balancers): cover observer cleanup on deletion

Add direct balancer deletion and outbound cascade cases for leastLoad, fallback, and mixed leastPing scenarios. These tests pin that the final unneeded observer is removed, burst switches back to regular observatory when only leastPing remains, and burst remains when a burst-required balancer survives.
2026-07-02 18:18:30 +02:00
MHSanaei 97e2c9e7ba fix(web): sync the VLESS generate-key dropdown with the encryption field
The auth-kind dropdown in the VLESS "Generate Key" block was hardcoded to
x25519 on mount, while the "Already selected" text next to it was derived
independently from settings.encryption. Editing an inbound whose encryption
uses another kind (e.g. ML-KEM-768) showed a mismatched dropdown, and
clicking Generate without noticing would produce a keypair of the wrong
kind for the inbound.

Extract the encryption-string parsing into a shared pure helper
(lib/xray/vless-encryption), use it both for the selected-auth label and to
initialize/sync the dropdown, so the two can no longer diverge. When the
encryption is none or unparseable the dropdown keeps its x25519 default.

Closes #5744
2026-07-02 17:37:04 +02:00
MHSanaei 5e8327e728 fix(settings): include savePayload in the category body memo deps
react-hooks/exhaustive-deps flagged the omission; a stale closure could
hand SecurityTab an outdated save callback after a mutation state change.
2026-07-02 17:16:12 +02:00
MHSanaei 7c12700c7d fix(sub): resolve subscription clients and stats from normalized tables
A subscription fetch inside a large inbound cost seconds because every
layer re-parsed the inbound's full settings JSON: getInboundsBySubId
preloaded the whole client_traffics table of each matched inbound,
matchingClients parsed all clients to filter by subId, and then every
per-protocol generator (raw links, JSON outbounds, Clash proxies) parsed
the blob again per link — once to find the client by email and once for
inbound-level fields like encryption or method. At 500k clients in one
inbound that was 13s per raw fetch and 8.5s per JSON fetch; at 100k,
2.6s/1.7s. After this change both cost ~70ms at 100k.

matchingClients now resolves through the indexed clients/client_inbounds
tables (ListForInboundBySubId, ordered by clients.id like ListForInbound
— the same source the running Xray users are built from), and the
per-request SubService carries two caches: clientsByInbound, primed by
matchingClients/inboundLinks so clientForLink resolves a client without
parsing settings (with the old full-parse as fallback, which also fixes
the export-all-links path that re-parsed the blob once per client), and
settingsByInbound, a once-per-request shallow decode that skips
materializing the clients array entirely. The ClientStats preload is
replaced by loading only the subscriber's traffic rows (indexed
clients.sub_id); statsForClient's per-email DB fallback (#5567) covers
any miss, and the case-insensitive email dedupe keeps the #5134
guarantee for case-differing duplicate rows.
2026-07-02 16:58:00 +02:00
MHSanaei c0d17e132d fix(job): batch ip-limit per-email lookups and persistence
processObserved paid four round-trips per observed email every 10s scan:
an inbound-resolving join, a tracking-row read, an autocommit Save (one
fsync each under synchronous=FULL), and — worst of all — a full JSON
parse of the owning inbound's settings blob just to read that one
client's limitIp. On a big single inbound that parse alone made a scan
cost ~1.5s per online client.

The scan now front-loads three chunked batch queries (clients.limit_ip,
email->inbound through the client_inbounds relation keeping the lowest
inbound id like the old First(), and the tracking rows) and writes every
inbound_client_ips change inside one transaction, so M observed emails
cost a handful of queries and a single fsync. The per-email LIKE fallback
remains for emails missing from the relation, preserving the #4963
stale-email cleanup. limitIp now comes from the clients table (same
source B3 gates on) instead of the settings blob, and xray disconnects
for banned clients run after the commit so their network round-trips
never extend the write transaction node syncs contend with.
2026-07-02 16:39:31 +02:00
MHSanaei fc5be5b9e4 feat(web): broadcast delta client stats above a snapshot threshold
Both 5s broadcasters (the local traffic poll and the node traffic sync)
shipped the complete client_traffics table on every cycle while a browser
was connected. At 500k clients that is a 1.7s full-table read plus an
86MB marshal per job per poll — and the hub drops any payload over 10MB
and sends an invalidate the frontend ignores for these message types, so
past ~55k clients all of it was pure waste and the UI got nothing.

Installs at or below 5000 clients (clientStatsSnapshotMaxClients) keep
the exact full-snapshot behavior — it exists because a pure delta feed
left UI rows stale when nothing moved in a cycle (see GetAllClientTraffics)
— and the payload now carries snapshot=true. Above the threshold the jobs
send only this cycle's active rows (the xray poll's active emails, or the
emails online on the synced nodes) with snapshot=false, and scope the
last-online map to those rows; the initial full map still arrives over
REST and the clients page refetches every 5s.

GetActiveClientTraffics gains the overlayGlobalTraffic pass so delta rows
carry the same cross-panel usage as snapshot rows. The node job also
stops reading the full last-online map before the has-clients gate, which
was a wasted full-table read on every tick with no dashboard open.

Frontend: useClients keeps its live summary strictly snapshot-driven
(snapshot=false payloads skip the allClientStats replace and the summary
falls back to the server-computed one); the per-row page merge and the
inbounds-page merges already handle deltas.
2026-07-02 16:34:01 +02:00
MHSanaei c3cc8b4374 fix(job): gate ip-limit scan on clients.limit_ip instead of parsing all settings
hasLimitIp ran settings LIKE '%limitIp%' and JSON-parsed every matching
inbound's settings blob — and since clients marshal limitIp without
omitempty, every inbound matched, so each 10s scan loaded and parsed
every settings blob in the database (~75MB of JSON at 500k clients) just
to decide whether any limit exists.

It now probes the normalized clients table (limit_ip > 0, Limit(1) count
like depletedCond does), which SyncInbound and the legacy seeder keep in
sync with the settings JSON. Semantics note: a limitIp that exists only
in settings JSON with no clients row no longer enables enforcement — the
enforcement path itself already resolves clients through the same
normalized tables.
2026-07-02 16:24:18 +02:00
MHSanaei 97588dd0b9 fix(traffic): disable depleted clients by id instead of a second full scan
disableInvalidClients evaluated the depleted predicate twice per poll:
once to SELECT the rows (for xray removal and settings sync) and again in
the UPDATE that flips enable off — each a full client_traffics scan, the
second also re-running the cross-panel EXISTS subquery when global rows
exist.

The UPDATE now flips the already-collected rows by primary key in
sqlInChunk batches, sorted for stable lock order. Same rows, same
RowsAffected, half the scan cost; id-based matching also stays correct
for rows with empty emails.
2026-07-02 16:24:18 +02:00
MHSanaei fb1d055b06 fix(traffic): persist delayed-start expiry only for converted clients
addClientTraffic's second pass wrote expiry_time for every polled row via
UPDATE ... WHERE expiry_time < 0 — a no-op statement per active client on
every 5s poll, since almost all rows carry a positive expiry. At 10k
active clients that was 10k pointless indexed UPDATEs per poll.

adjustTraffics now returns the emails it actually converted this tick and
the persistence pass writes exactly those, in sorted order to keep
concurrent writers lock-compatible on Postgres. Behavior is unchanged:
unconverted rows never matched the WHERE clause anyway.
2026-07-02 16:24:18 +02:00
MHSanaei 4fc301682f test(scale): cover traffic poll, ws payloads, ip-limit job, sub and xray config at 500k
The paths that run continuously in production had no scale coverage: the
5s traffic poll (AddTraffic with its auto-renew and depleted scans), the
websocket snapshot the job broadcasts while a browser is connected, the
10s ip-limit job (hasLimitIp LIKE scan + per-email settings parse), a
subscription fetch inside a huge inbound, and the full Xray config build.

New benchmarks reuse the XUI_SCALE_TEST / XUI_DB_TYPE gating and stay
log-only. Sizes default to 10k/100k; XUI_SCALE_SIZES=500000 raises the
ladder without editing code. seedScaleDataset writes inbounds, clients,
client_inbounds and client_traffics directly in one transaction instead
of SyncInbound, so a 500k seed takes seconds. XUI_SCALE_DB_PATH persists
the seeded SQLite file for manual smoke runs against a live panel.
2026-07-02 16:12:46 +02:00
MHSanaei 28f7690224 docs: move architecture map into docs/ and refresh it against the live tree
The architecture/code map previously lived in .claude/CLAUDE.md, which was
gitignored (local-only) and auto-loaded into every agent session alongside
the root CLAUDE.md. Track it in docs/architecture.md instead and reference
it from CLAUDE.md so it is read on demand.

While moving it, fact-check the whole map against the current tree:
- add the missing internal/eventbus and internal/tunnelmonitor packages,
  the service/email subpackage, and util/wirecodec
- document node mTLS (tls_client.go, node_mtls.go, setting_mtls.go) and the
  fourth TLS verify mode
- add the Host, ClientExternalLink, NodeClientIp and ClientGlobalTraffic
  models plus their symptom-index rows
- correct the cron table (check_cpu_usage is 1m not 10s; add
  check_memory_usage and free_os_memory), the middleware chain
  (MaxBodyBytes, ConfigEnvelope, CSRF) and the controller route prefixes
- refresh the sub/ and service/ file listings, frontend pages (hosts/,
  index/), CI workflow list, and replace stale exact line counts with
  rounded sizes
2026-07-02 14:21:21 +02:00
MHSanaei 92303094fd feat(settings): let users clear stored secrets from the UI
Redacted secrets (SMTP password, Telegram bot token, LDAP password) are
always served blank to the browser, so the update path treats a blank
submission as "unchanged" and silently restores the stored value. That
made a once-set secret impossible to remove without editing the database
— e.g. switching to a passwordless localhost SMTP relay kept sending the
old credentials forever.

Blank stays "unchanged"; clearing is now its own signal. The update
request carries explicit clear flags (request-scoped fields on the
controller form, so they are never persisted as settings rows), and
preserveRedactedSecrets skips the restore for a flagged secret. Each
secret field gets a Clear/Undo button that arms the flag; typing a new
value disarms it. The 2FA token keeps its existing behavior: it is
already clearable by disabling 2FA.

Closes #5724
2026-07-02 13:57:34 +02:00
MHSanaei fb3a1559b2 fix(sub): default https:// for scheme-less support and profile URLs
A support URL saved without a scheme (e.g. "t.me/handle") is served
verbatim in the subscription Support-Url header and page data, and client
apps resolve it relative to the subscription domain — clicking it lands
on "https://panel.example/t.me/handle". Same hazard for the profile URL.

Default the scheme to https:// when none is present, both when saving the
settings and when reading already-stored values, so existing databases are
covered without a migration. Deliberate non-http schemes (tg://, mailto:,
tel:) pass through untouched, which is why these two fields don't go
through SanitizeHTTPURL's http(s)-only validation.

Closes #5738
2026-07-02 13:47:10 +02:00
MHSanaei a335456cd3 fix(settings): repair legacy path settings that block every settings save
A subJsonPath (or subPath/subClashPath/webBasePath) stored without its
leading/trailing slash — written before the slash rules existed, or
restored from an old backup — fails the frontend's whole-form validation,
so every save on the Settings page is rejected client-side. The backend's
CheckValid would normalize the value, but a save request never reaches it,
leaving the panel wedged until someone edits the database by hand.

Normalize the stored path rows at startup, mirroring CheckValid's slash
rules. The pass is idempotent and not seeder-gated, since a restored
backup can reintroduce bad values at any time.

Also add the missing pages.settings.validation.pathLeadingSlash key to
all 13 locales — the validation error used to render as its raw key.

Closes #5726
2026-07-02 13:42:03 +02:00
MHSanaei 9a3a12b260 fix(node): stop Postgres deadlocks and deleted-client resurrection in node sync
Two defects in the node traffic sync, both hit hard on busy
master+multi-node Postgres deployments:

Client-IP merges deadlocked. Each node syncs on its own goroutine and
shared clients appear in several nodes' reports, but MergeInboundClientIps
and upsertNodeClientIps locked rows in whatever order each node's report
arrived. Two concurrent merges taking the same rows in opposite order is
exactly what Postgres aborts with SQLSTATE 40P01 ("merge client ips from
<node> failed: deadlock detected"). Both merges now process emails in
sorted order so every transaction acquires row locks in one global order.

Deleted clients resurrected with zeroed traffic. A snapshot fetched just
before a deletion still names the deleted email; applying it after the
delete committed re-added the client. The delete tombstone existed for
precisely this race but only zeroed the seed counters: the sync still
recreated the client_traffics row, and worse, adopted the node's stale
settings JSON wholesale, putting the client back in the central inbound
as if it were brand new with 0 traffic. Snapshot application now skips
row creation for tombstoned emails on known inbounds and strips
tombstoned clients from adopted settings; fresh node-adoption semantics
(rows seeded at zero) are unchanged.

The mass-disconnect part of the report is the forced node restart on
auto-disable, removed separately in 4d6f2ddd.

Closes #5739
2026-07-02 13:37:06 +02:00
MHSanaei 4d6f2ddd97 fix(node): stop force-restarting a node's Xray when its clients auto-disable
When a depleted or expired client lived on a node, the master pushed the
updated inbound (client flipped off) to the node and then also told the
node to fully restart Xray. The push alone already applies the disable:
the node updates that one inbound on its running core. The extra restart
dropped every live connection on the node each time any of its clients
crossed a quota or expiry, and a restart that failed to come back left
the node forwarding nothing until someone restarted Xray by hand.

This mirrors e5b56c94, which removed the same forced restart from the
local auto-disable path; remote nodes now get the same graceful
reconcile-by-push treatment.

Closes #5740
2026-07-02 13:27:36 +02:00
MHSanaei 62f303905e fix(scripts): pass --force to acme.sh --installcert so it survives sudo
acme.sh guards every non-install command behind _checkSudo: when a
non-root user runs the panel scripts via sudo, it prints the sudo wiki
warning and exits before doing anything, unless FORCE is set. All our
--issue calls already pass --force and were unaffected, but none of the
--installcert calls did, so issuance succeeded and installation then
aborted silently, ending in "Certificate files not found after
installation". FORCE has no other effect on the installcert path, so
mirror the --issue calls and pass --force everywhere we install certs.

Closes #5741
2026-07-02 13:20:31 +02:00
MHSanaei c8ef1b1f68 feat(reality): derive a stable per-client spiderX for shared links
The inbound's spiderX now acts as a per-client seed: exports emit
sha256(seed|subKey) truncated to a 15-hex "/path", so a client's spx no
longer changes on every subscription fetch (#5718) while different
clients stop sharing one fingerprintable value. The form gains a
regenerate button that rotates every client's path at once.

The frontend link builders derive through the same function
(lib/xray/spider-x.ts, @noble/hashes) keyed on subId-then-email like
the Go subKey, so panel QR/copy links and subscription output agree —
cross-language vector tests lock both sides byte-for-byte. streamData
now tolerates malformed stored stream settings (unparseable JSON, null
tls/reality settings) instead of panicking the subscription request.
2026-07-02 12:53:08 +02:00
MHSanaei 64c306037f feat(wireguard): make client allowedIPs editable with validation
The WireGuard peer address was allocated server-side and shown read-only
in the client editor, so changing it required hand-editing the inbound's
raw settings JSON (#5715). The backend add/update paths already honored a
submitted allowedIPs; only the form withheld it.

Make the field editable (comma-separated, empty still auto-assigns) and
validate submissions server-side: entries must parse as an IP or CIDR,
bare addresses normalize to single-host prefixes, and an address already
used by another peer on the inbound is rejected.

Closes #5715
2026-07-02 09:45:54 +02:00
MHSanaei 8dd3b31ee8 fix(node): show the activated first-use deadline on the Clients page
With "start after first use" on a node inbound, the node activates the
absolute deadline and the master adopts it into client_traffics via the
sync CASE merge — but the client record (what the Clients page reads) was
only refreshed by SyncInbound from the snapshot's settings JSON. A node
whose JSON still carried the negative duration (stale conversion, older
node build, or a mixed local+node attachment) kept rewriting the record
back to "not started" even though the DB held the real deadline (#5714).

Lift the activated deadline from client_traffics onto still-negative
client records at the end of every node sync, after SyncInbound has run.
Intentional resets back to delayed start are unaffected: editing a client
also resets client_traffics to the negative duration, so the lift's
expiry_time > 0 guard never matches.

Closes #5714
2026-07-02 09:36:07 +02:00
MHSanaei e5b56c9444 fix(xray): reconcile client auto-disable through the API instead of a forced restart
When a client expired or hit its traffic limit, XrayTrafficJob called
RestartXray(true), stopping the whole process and dropping every live
connection on every inbound (#5712 reported this as XHTTP on 443 dying) —
even though disableInvalidClients had already removed the user from the
running core over gRPC. The force restart existed only to re-sync the
process's config snapshot.

Switch the job to a non-forced restart and teach ComputeHotDiff to express
a client-only inbound change as per-user AlterInbound operations for
vless/vmess/trojan, so the reconcile is a no-op RemoveUser plus a snapshot
update rather than a handler swap that would still blip that inbound's
listener. Anything beyond the clients list still falls back to handler
replacement or a full restart as before.

Closes #5712
2026-07-02 09:26:53 +02:00
MHSanaei 1153d5db8c fix(groups): keep group traffic totals stable across client resets and deletes
ListGroups displays live_sum(client_traffics) minus the group's stored
reset baseline, but only ResetGroupTraffic ever moved the baseline. Any
client-level operation that zeroed or deleted traffic rows (single/bulk
reset, client delete, removing a client's last inbound) shrank the live
sum and silently subtracted that client's history from the group total.

Shift the baseline down by the removed counters inside the same
transaction, so group totals only change through group reset. Derived
groups without a stored row get one with a negative baseline, which the
existing clamp handles.

Closes #5675
2026-07-02 09:17:47 +02:00
MHSanaei 539bcc897c fix(inbounds): apply the legacy xhttp session-key migration when editing
rawInboundToFormValues injected the stored xhttpSettings blob into the form
store without running it through XHttpStreamSettingsSchema, so the
sessionPlacement/sessionKey -> sessionIDPlacement/sessionIDKey rename from
xray-core v26.6.22 (and the v3.4.0 field defaults) never applied on the
edit path. Inbounds saved before the rename opened with blank session
fields, and the stale keys could ride back on save even though the core no
longer reads them. Parse the sub-object through the schema on load, and
lift any stale legacy keys in normalizeXhttpForWire as a backstop.

Closes #5621
2026-07-01 23:11:58 +02:00
MHSanaei 273f88721e fix(database): stop noisy per-startup errors in the Postgres server log
Two statements failed server-side on every panel start after a SQLite to
Postgres migration, flooding the postgres log even though the Go side
suppressed them:

- resyncPostgresSequences issued SELECT MAX(id) against client_inbounds,
  whose composite primary key has no id column; Postgres validates the
  SELECT list at parse time, so the WHERE pg_get_serial_sequence(...) guard
  never got a chance to no-op it. Skip models whose GORM schema maps no id
  column before issuing the statement.

- AutoMigrate detects existing columns via information_schema filtered by
  table_catalog = CURRENT_DATABASE(), which misdetects on some setups and
  re-issues ALTER TABLE ... ADD for columns that already exist. HasColumn/
  HasIndex query without that filter and are reliable (the existing
  duplicate-column suppressor depends on exactly that), so skip AutoMigrate
  outright when the table, every column, and every index already exist.

Closes #5665
2026-07-01 23:07:05 +02:00
MHSanaei 1f2e3e1447 fix(sub): use configured spiderX instead of always randomizing
applyShareRealityParams and SubJsonService.realityData generated a fresh
random spx on every export, so share links, "export all links", and JSON
subscriptions never matched a spiderX configured on the inbound and two
exports of the same client disagreed with each other. Read the value from
realitySettings.settings like pbk/fp/pqv and keep the random value only as
a fallback when none is configured.

Closes #5718
2026-07-01 23:07:05 +02:00
MHSanaei 49773c18de fix(xray): force full restart for inbounds with a VLESS reverse client
Hot-applying an inbound change swaps it via DelInbound+AddInbound on
the running core. That unregisters any client's reverse.tag handler
on the xray-core side without closing the bridge's already-established
connection, so the reverse tunnel is silently orphaned until someone
manually restarts xray. diffInbounds now bails out of the hot-apply
path whenever the old or new inbound carries a reverse-tagged client,
falling back to a full restart, which actually drops the socket and
lets the bridge redial on its own.

Also scope the .claude ignore rule to its contents (.claude/*) instead
of the whole directory, so individual files under .claude/ can be
tracked selectively.
2026-07-01 14:02:13 +02:00
MHSanaei 427613b308 chore(ci): upgrade claude-bot to Sonnet 5 and set explicit effort levels
Sonnet 5 reaches near-Opus quality on coding/agentic work at lower cost;
pin effort explicitly (xhigh/max) instead of relying on model defaults.
2026-07-01 00:43:27 +02:00
MHSanaei f3a57d4c57 3.4.2 v3.4.2 2026-06-29 20:28:08 +02:00
MHSanaei 86813758cc fix(node): stop the offline-sync toast firing on saves to online nodes
IsNodePending fed the user-facing "saved locally, node offline, will
sync on reconnect" toast off three conditions, one of which was the
node's config_dirty flag. But every node-backed client/inbound edit
marks the node dirty unconditionally inside its write transaction — it
is the reconcile self-heal marker, set even for edits pushed live to a
healthy online node. The controller reads that freshly-set flag right
after the save, so the warning fired on every save to a node-backed
inbound regardless of the node actually being online.

Drop the dirty term so the predicate reflects only what the message
claims: the node being unreachable (offline or disabled). Offline and
disabled nodes still mark dirty and still surface the toast.

Add regression tests: online+dirty must not be pending; offline and
disabled must be.
2026-06-29 18:35:38 +02:00
MHSanaei 8332ba67ae chore(deps): bump antd to 6.5 and migrate deprecated component props
Upgrade frontend deps (antd 6.4.5 -> 6.5.0, Ant Design icons, TanStack
Query, i18next, eslint) and fasthttp 1.71 -> 1.72.

AntD 6.5 deprecated several Input/Card/Space props, so adapt the panel UI:
- Input/InputNumber addonBefore/addonAfter -> prefix/suffix
- Card bordered -> variant="outlined"
- Space direction -> orientation
- swap the hand-rolled Telegram SVG for the new TelegramFilled icon
- guard SettingListItem against cloning aria-labelledby onto a Fragment,
  which only accepts key/children
2026-06-29 16:57:55 +02:00
MHSanaei d8221a8153 fix(sub): bake Host VLESS Route into subscription UUIDs
The Host VLESS Route field was stored and shown in the panel but never applied to any generated subscription (raw, JSON, Clash), so the UUID was emitted unmodified (#5655).

Xray reads the route from the UUID's 3rd group (bytes 6-7, net.PortFromBytes) and masks those bytes to zero before authenticating, so a value can be baked into the share/JSON/Clash UUIDs without breaking the user match. A shared applyVlessRoute helper encodes a single 0-65535 value as the 3rd group; empty/invalid/non-UUID input is left unchanged, so legacy data never yields a broken link and no DB migration is needed.

The field was wrongly validated as a multi-segment port spec (that form belongs to the separate server-side routing rule). It is now a single value 0-65535, with frontend validation, link-preview parity (genVlessLink/hostToExternalProxyEntry), hint + error translations across all 13 locales, and tests on every path.

Closes #5655
2026-06-29 14:32:23 +02:00
MHSanaei 789e92cddc fix(clients): re-enable depleted clients on API renewal (#5619)
Renewing a subscription via POST /panel/api/clients/bulkAdjust extended a client's expiry/quota but left it disabled. The enforcement loop disables a depleted client across client_traffics, client_records and the inbound settings JSON (and pushes that to the node), while BulkAdjust only updated expiry/total and never cleared enable. On a node its UpdateUser push was built from the stale ClientRecord (Enable=false), which the next traffic poll merged back onto the master, so the client never recovered.

BulkAdjust now re-enables a client only when it was disabled because it was depleted and the adjustment lifts it back within limits, computed as a set-difference of the production depletedCond predicate and applied through the canonical BulkSetEnable (run after the per-inbound loop, since lockInbound is non-reentrant). Manually-disabled or still-depleted clients stay disabled.

Update now writes the clients.enable column explicitly so re-enabling sticks for inbound-less clients and stops feeding a stale record into node pushes.
2026-06-29 13:39:03 +02:00
nima1024m 7a5d6da28c fix(xray): clean stale routing references when a balancer or outbound is deleted (#5648)
* feat(xray): reference-cleanup helpers for entity deletion

When an outbound or balancer is deleted on the Xray page, routing rules and
balancers that reference it must be repaired in the same edit, or the saved
config breaks the core: a dangling balancerTag stops Router.Init (whole core
down), a dangling outboundTag black-holes matched traffic at the dispatcher.

Add pure plan*/apply* helpers that compute and apply the cleanup. A rule is
kept when a destination (outboundTag or balancerTag) remains and dropped when
none does. Deleting an outbound cascades: emptying a balancer selector removes
that balancer too, then repairs its rules in one pass against the full removed
set; fallbackTag and dialerProxy references are cleared and observatories
re-synced.

* fix(balancers): clean routing rules referencing a deleted balancer

Deleting a balancer left routing rules pointing at its balancerTag. xray-core's
Router.Init then fails ("balancer <tag> not found"), the core won't restart and
every inbound drops — the saved config passes CheckXrayConfig (JSON shape only),
so it breaks only on the next restart.

The delete confirm now lists the affected rules (modified vs removed) next to
the existing observatory warning and applies planBalancerDeletion's cleanup: a
rule keeps its outboundTag when present, otherwise the whole rule is dropped.
Adds the shared DeletionImpactList and refCleanup strings across all 13 locales.

* fix(outbounds): clean rules, balancer selectors and dialerProxy on outbound delete

Deleting an outbound left routing rules pointing at its outboundTag (matched
traffic black-holed at the dispatcher), plus stale references in balancer
selectors / fallbackTag and other outbounds' dialerProxy.

The delete confirm now shows planOutboundDeletion's impact and applies the
cascade: rules keep a remaining balancerTag (else are dropped), the tag is
pulled from balancer selectors and fallbacks, dialerProxy references are
cleared, and a balancer whose selector is emptied is removed along with its
own now-targetless rules.

* refactor(xray): share one rule classifier across preview and apply

Code review flagged that the keep/drop predicate was transcribed twice — in
ruleImpacts (the delete-modal preview) and in applyCleanup (the mutation) — kept
in sync only by a parity test. Extract a single classifyRule() that both call,
so the preview can never disagree with what apply actually does.

Also harden balancersEmptiedBy to skip tagless balancers: an empty/missing tag
would otherwise enter the removed set as "" and silently drop every other
tagless balancer (only reachable via a hand-edited config, but a silent data
loss). And remove observersRemovedByDeletingBalancer, orphaned once BalancersTab
switched to planBalancerDeletion.

* fix(xray): null-guard reference cleanup against unvalidated configs

The PR review noted that classifyRule and applyCleanup dereferenced rule /
balancer entries directly, while the sibling propagateOutboundTagRename uses
optional chaining — because fetchXrayConfig falls back to the unvalidated parsed
object when Zod validation fails, a stray null in rules / balancers can survive
into the editor and would throw during the delete preview/apply.

Match that defensive style: classifyRule and balancersEmptiedBy read through
optional chaining, the balancer loop skips nullish entries, and the dialerProxy
walk guards the outbound. A delete on a hand-edited config with null entries now
degrades gracefully instead of throwing.
2026-06-29 12:52:18 +02:00
nima1024m 71aca2018a feat(a11y): screen-reader & keyboard accessibility across the panel (#5486) (#5652)
* feat(a11y): label list, toolbar & dashboard actions for screen readers

Phase 1 of #5486 (Android TalkBack support). Icon-only controls across
the management surfaces previously announced only their untranslated
icon name (e.g. "edit", "ellipsis") or nothing at all.

- Add aria-label to icon-only row-action and toolbar buttons across
  inbounds, clients, groups, hosts, nodes and xray
  (outbounds/routing/dns/balancers) lists, plus the dashboard cards.
- Make clickable bare icons and AntD Card actions keyboard-operable via
  role/tabIndex + Enter/Space (new activateOnKey helper); convert mobile
  dropdown triggers to buttons so they open from the keyboard.
- Fix the sidebar hamburger's mislabeled aria-label (was the dashboard
  label) and translate previously-hardcoded outbound menu labels.

New i18n keys in all 13 locales: sort, menu.openMenu,
pages.xray.outbound.moveToTop.

* feat(a11y): label modal, QR and copy/download controls for screen readers

Phase 2 of #5486. Modal and overlay controls relied on tooltips (not a
reliable accessible name) or were bare clickable icons with no keyboard
or screen-reader support.

- Add aria-label to copy/QR/download/info icon buttons in the inbound and
  client info modals, sub-links modal, QR panel, backup/log modals, and
  to the bare search/select inputs of the attach/detach client modals.
- Make click-to-copy QR codes and the IP-log refresh/clear, geofile
  reload and log refresh icons keyboard-operable (role/tabIndex +
  Enter/Space) with translated labels.
- Label the 2FA code input; drop the QrPanel download-image string
  fallback now that the key exists.

New i18n key in all 13 locales: downloadImage.

* feat(a11y): label form fields and shared form components for screen readers

Phase 3 of #5486. Form controls and shared form widgets were largely
unlabelled, and several remove controls were not keyboard-operable.

- SettingListItem now ties its title to the control via aria-labelledby,
  giving accessible names to the ~90 settings-tab inputs at once.
- InputAddon gains button semantics (role/tabIndex/Enter+Space) and an
  ariaLabel prop when used as an interactive remove control.
- Sparkline charts expose a role="img" summary of their latest values.
- Add aria-label to add/remove/regenerate icon buttons and bare
  inputs/selects across inbound, client and xray (dns/routing/balancer/
  outbound) forms; make clickable remove icons keyboard-operable; mark
  decorative help/target icons aria-hidden; label the JSON editor,
  date-time clear button, header-map remove, notification select-all and
  remark token chips.

New i18n keys in all 13 locales: regenerate, jsonEditor,
pages.xray.balancer.{costMatch,costValue,costRegexp}.

* chore(a11y): add eslint-plugin-jsx-a11y harness and fix flagged interactions

Phase 4 of #5486. Adds eslint-plugin-jsx-a11y (recommended ruleset,
scoped to .tsx) so screen-reader/keyboard regressions fail lint.

- Make the mobile node-card header a proper keyboard disclosure
  (role=button, aria-expanded, Enter/Space activation that ignores
  clicks on the nested action buttons) and drop the now-redundant
  stop-propagation click handlers the linter flagged on card-action
  wrappers in the node, client and inbound mobile cards.
- Disable jsx-a11y/no-autofocus: the autofocus on the login field and
  modal primary inputs is intentional focus management that helps
  screen-reader and keyboard users land on the right control.

make lint passes with the a11y ruleset enforced.

* feat(a11y): cover remaining deferred spots (settings tabs, sockopt, API docs)

Completes the panel sweep for #5486 by labelling the spots previously
left out of phases 1-4:

- NotifyTimeField (Telegram notifications): the mode, interval, unit and
  custom-cron inputs now carry aria-labels.
- The Sockopt toggle in transport options.
- Settings category tabs in icons-only (mobile) mode now expose the tab
  name as the icon's aria-label instead of the raw icon name.
- The Swagger API-docs view is wrapped in a labelled region landmark.

New i18n keys in all 13 locales: pages.settings.notifyTime.{interval,unit}.

* feat(a11y): label shared xray form components and remark field

Code review surfaced frontend/src/lib/xray/forms/ — shared form components
used by the host and inbound JSON forms — which the initial audit missed.

- FinalMaskForm (TCP/UDP final-mask editor): label the icon-only add and
  regenerate buttons and make all six remove icons keyboard-operable
  (role/tabIndex/Enter+Space); adds useTranslation to its sub-components.
- CustomSockoptList: the remove icon is now keyboard-operable.
- SniffingFields: aria-label on the otherwise label-less destOverride select.
- RemarkTemplateField: aria-label on the remark-variable picker button.

New i18n key in all 13 locales: pages.inbounds.sniffingDestOverride.

* feat(a11y): label client info modal and WireGuard config block

After rebasing onto the WireGuard client-config feature, re-apply the
ClientInfoModal copy/QR/IP-log aria-labels (the modal was restructured
upstream, so the original labels did not carry over) and label the new
ConfigBlock component's copy/download/QR actions. ConfigBlock's action
wrapper keeps its stop-propagation handler (a non-interactive guard for
the Collapse header) under a scoped jsx-a11y exception.

* fix(frontend): let npm install jsx-a11y under ESLint 10

eslint-plugin-jsx-a11y@6.10.2 declares a peer range that stops at ESLint 9,
but the panel is on ESLint 10, so `npm ci` aborts with ERESOLVE even though
the plugin runs fine on ESLint 10 with flat config. Add an npm override so
jsx-a11y accepts the project's ESLint version. This keeps normal peer
resolution (recharts' react-is peer still auto-installs) — no global
legacy-peer-deps and no manual react-is pin needed.

* fix(a11y): size mobile row triggers and move node expand role to chevron

Address automated review on #5652:
- add size="small" to the inbound/client/node mobile-card "more" dropdown
  triggers so they match the adjacent small Switch and the established
  desktop RowActions pattern.
- move the node card-head disclosure semantics (role/tabIndex/aria-expanded/
  keyboard) onto the chevron affordance so the expand control is no longer a
  role="button" wrapping the Switch, info button and dropdown. Mouse
  click-anywhere-to-expand is preserved on the header div.
2026-06-29 12:51:29 +02:00
MHSanaei 6c71b725da fix(clients): hide WireGuard config after detaching the WG inbound
The client info and QR modals rendered a WireGuard config whenever the
client still carried leftover WG key material (privateKey / publicKey /
allowedIPs / preSharedKey / keepAlive), regardless of whether a WireGuard
inbound was actually attached. After detaching the WG inbound the config
kept showing, built with an empty endpoint port and public key.

Gate wgConfigText on an attached WireGuard inbound (wgInbound) being
present, not just isWireguardClient(client), in both ClientInfoModal and
ClientQrModal.

Also rename the i18n key pages.clients.conf -> config and add the missing
pages.clients keys (wireguardConfig, config, bulkFlow, bulkFlowNoChange,
bulkFlowDisable) to all 12 non-English locales so each one matches en-US.
2026-06-29 01:15:37 +02:00
MHSanaei a329882e0e feat(wireguard): client config UX, collapsible config card, configurable DNS
Land the WireGuard client-config UX work on main (the upstream PR #5642
branch could not be pushed to).

- Reusable collapsible ConfigBlock (copy/download/QR, actions aligned right)
  for the client .conf, used by client info and the public sub page.
- Correct .conf: canonical PresharedKey casing and DNS sourced from the inbound
  (configurable per-inbound, default 1.1.1.1, 1.0.0.1).
- Configurable per-inbound DNS for WireGuard (schema + form + backend hint via
  InboundOption.WgDns); inert at the Xray layer.
- Public sub page now shows the WireGuard config, rebuilt from the share link;
  the Go wireguard:// link carries dns/presharedkey/keepalive for completeness.
- QR enabled for the wireguard:// link; link rows are compact like other protocols.
- Client information order is subscription, copy URL, WireGuard config; the
  redundant config tab is removed from the add/edit client modal.
- Drop the Inbound Information and QR Code row actions for WireGuard inbounds.
2026-06-29 00:50:34 +02:00
Nikan Zeyaei 60c54827aa feat: ldap skip tls verify (#5637)
* feat(ldap): add InsecureSkipVerify field and tlsConfig helper

Extract the inline TLS config at both LDAPS dial sites (FetchVlessFlags,
AuthenticateUser) into a tlsConfig(cfg) helper, and add a new
Config.InsecureSkipVerify bool that flows through to
tls.Config.InsecureSkipVerify. This unblocks enterprise environments
(e.g. Microsoft AD CS with internal CAs) where the server certificate
chain cannot be imported into the system trust store.

Behavior is identical when InsecureSkipVerify is false (the default) -
pure refactor + plumbing. The helper is unit-testable without a live
server, which is why it is extracted.

Closes https://github.com/MHSanaei/3x-ui/issues/5538

* feat(settings): add LdapInsecureSkipVerify setting

Plumb the new LDAP skip-TLS-verify toggle through the settings stack:
- AllSetting struct field (json/form tag: ldapInsecureSkipVerify)
- defaultValueMap default ("false")
- GetLdapInsecureSkipVerify() getter
- ldap_sync_job wiring into ldaputil.Config (FetchVlessFlags path)
- panel/user.go wiring into ldaputil.Config (AuthenticateUser path;
  the original issue's file list missed this)

Persistence is handled by UpdateAllSetting's reflect loop, matching
the existing pattern used by ldapUseTLS (no explicit setter).

Closes https://github.com/MHSanaei/3x-ui/issues/5538

* feat(ui): add Skip TLS verification switch in LDAP settings

Wire the new ldapInsecureSkipVerify setting into the hand-written
frontend model and Zod schema, and render it as a new Switch in
GeneralTab right under "Use TLS (LDAPS)". The switch is disabled
when TLS is off (the setting is meaningless without LDAPS) and shows
an insecure-warning description to make the security implication
visible to operators.

Also adds a Vitest round-trip test pinning schema acceptance and
model default-to-false behavior.

Closes https://github.com/MHSanaei/3x-ui/issues/5538

* chore(i18n): add Skip TLS verification strings to all locales

Add pages.settings.ldap.skipTlsVerify and skipTlsVerifyDesc to all 13
backend-served translation files, matching the existing repo
convention of keeping LDAP keys present in every locale (en-US, fa-IR,
ru-RU, zh-CN, zh-TW, pt-BR, ar-EG, uk-UA, id-ID, tr-TR, vi-VN, ja-JP,
es-ES). No translation-parity test exists in CI, but every other
LDAP key is replicated across all files, so this keeps the
invariant intact.

Closes https://github.com/MHSanaei/3x-ui/issues/5538

* chore(codegen): regenerate frontend artifacts

Regenerate frontend/src/generated/{zod,types,schemas,examples}.ts
and frontend/public/openapi.json via `npm run gen` to reflect the
new ldapInsecureSkipVerify field. The codegen CI job runs
`git diff --exit-code` on these files; failing to commit them would
break the build.

Closes https://github.com/MHSanaei/3x-ui/issues/5538
2026-06-28 18:10:38 +02:00
n0ctal aef35ee0de fix(sync): mark node dirty inside the mutation transaction (atomic ConfigDirty) (#5611)
* fix(sync): mark node dirty inside the mutation transaction

ConfigDirty is currently set by MarkNodeDirty AFTER the mutation, on a
separate DB handle outside the mutation's transaction. A crash or error
between the committed change and the mark leaves a committed config
change that never reconciles to the node (silent drift). Add
MarkNodeDirtyTx(tx, id) and call it inside each mutation's transaction so
the dirty mark commits atomically with the change.

* fix(test): initialize DB in TestResolveInboundAddress and group gorm import

Two CI failures on this branch:

- race (-shuffle=on): TestResolveInboundAddress reaches resolveInboundAddress -> configuredPublicHost -> GetSubDomain, which reads the global DB. The test never initialized one, relying on another sub-package test to do so first; under shuffle it ran first and nil-dereferenced gorm. Call initSubDB(t) so it is self-sufficient (empty DB yields an empty subDomain, so the subscriber-host fallback still holds).

- golangci goimports: gorm.io/gorm was grouped with the github.com/mhsanaei/3x-ui local imports in node_dirty_test.go. Move it into the third-party group.
2026-06-28 15:18:28 +02:00
n0ctal 2b10808fbd fix(settings): require re-2FA confirmation for sensitive setting changes (#5610)
* fix(settings): require server-side 2fa for sensitive changes

* fix(lint): group third-party imports separately from local (goimports)

golangci-lint goimports flagged setting.go and setting_security_test.go because xlzd/gotp and gorm.io/gorm were mixed into the github.com/mhsanaei/3x-ui local-prefix group. Move them into the third-party group so the local imports stand alone.
2026-06-28 15:17:15 +02:00
nima1024m 25a86b9ee2 feat(balancers): tabbed Observatory/Burst Observatory form (#5627)
* feat(balancers): tabbed Observatory/Burst form replacing raw JSON

Replace the raw JSON editor for the Observatory / Burst Observatory sections
with a proper Ant Design form, and split the Balancers page into two sub-tabs:
"Balancer Settings" (the existing table) and "Observatory".

Observers stay fully auto-managed by balancer strategy through the existing
syncObservatories logic: users edit only the tunable probe fields, the
subjectSelector is shown read-only since it is derived from the balancers, and
deleting the last balancer that needs an observer now warns in the confirm
dialog that the observer will be removed too. Overlapping selectors keep an
observer alive while any balancer still references it.

Also add the previously missing pingConfig.httpMethod field (HEAD/GET) and
translations for the new strings across all 13 locales.

* refactor(balancers): tighten httpMethod typing and align connectivity default

Address automated review feedback on the Observatory form:
- Use the ObservatoryHttpMethodSchema enum for pingConfig.httpMethod instead of
  a free-form z.string(), and drive the HTTP method Select from its options.
  Removes the previously dead enum export and the duplicate local list, and
  types the field as 'HEAD' | 'GET'.
- Align the schema's connectivity default with DEFAULT_BURST_OBSERVATORY (the
  hicloud URL) so it matches what burst observers are actually created with.

No behavior change.
2026-06-28 15:02:18 +02:00
nima1024m 51ffba5961 fix(balancers): defer validation errors until touched or save (#5626)
The Add Balancer modal parsed its empty initial state through
BalancerFormSchema on mount and bound Form.Item validateStatus/help
directly to the result, so "Tag is required" and "Pick at least one
outbound" rendered the moment the modal opened, before any user input.

Gate the inline errors behind per-field touched tracking plus a
submit-attempted flag, and drop the disabled Create button so a save
attempt surfaces the errors (matching RuleFormModal). The existing
key-based remount in BalancersTab resets the flags on each open.

Add a regression test asserting no errors on open and errors only
after a save attempt.
2026-06-28 15:01:53 +02:00
n0ctal 5713c09980 fix(runtime): refresh cached node remotes on identity change (#5614) 2026-06-28 15:01:18 +02:00
n0ctal 7f8cbf4c4b fix(web): tighten database restore body-cap exemption (#5609) 2026-06-28 15:00:55 +02:00
MHSanaei bbfbd7eba6 Bump minimum eligible Xray version
Update Xray release filtering to only include versions at or above v26.6.27 (previously v26.4.25). Also mark `google.golang.org/protobuf` as a direct dependency in `go.mod` by removing the `// indirect` annotation.
2026-06-28 14:57:43 +02:00
MHSanaei 79069d2b64 fix(wireguard): allocate client IPs in the existing peer subnet
defaultWireguardClients always allocated new tunnel addresses from the
hardcoded 10.0.0.0/24 base, so a legacy or migrated inbound whose peers
live in a different subnet (e.g. 172.16.0.0/24) got new clients in an
unrelated, unroutable range. Derive the allocation base from the existing
peers' /24 and fall back to 10.0.0.0/24 only when there are none.
2026-06-28 14:41:24 +02:00