Commit Graph

10 Commits

Author SHA1 Message Date
MHSanaei 6a032bcb2a perf(scale): speed up traffic, auto-renew, and node bulk ops at 50k-100k clients
Local hot paths:
- autoRenewClients: replace the O(clients x expired) inner scan with an
  email->traffic map lookup (quadratic at scale).
- node traffic sync: scope the client_traffics email-membership query to the
  snapshot's emails instead of plucking the whole table every poll.
- add a (expiry_time, reset) index for the per-tick auto-renew filter.
- SQLite: add cache_size/mmap_size/temp_store pragmas (env-tunable); keep the
  single-file DELETE journal and synchronous=FULL defaults.
- scale benchmarks now run on SQLite too via XUI_SCALE_TEST=1 (shared
  setupScaleDB/resetScaleTables helpers), not just Postgres.

Node paths:
- bulk add/delete/adjust on a node-attached inbound folded one HTTP RPC per
  client; above nodeBulkPushThreshold (32) mark the node dirty and let one
  ReconcileNode push converge it instead of O(M) sequential round-trips.
  Small ops keep the live per-client path. Also hoist nodePushPlan out of the
  per-email delete loop.
- ReconcileNode skips inbounds whose wire payload is unchanged (per-tag
  fingerprint on Remote), guarded by node-side tag presence so a restarted
  node is still re-seeded.

Tests: auto-renew multi-inbound correctness, node-path dispatch (large ops
fold to dirty, small ops push live) via a manager runtime override seam, and
reconcile delta-skip.
2026-06-20 10:35:46 +02:00
Younes fb03b0e9f1 fix(traffic): prevent phantom quota consumption from stale node data (#5412)
Three related bugs caused inflated traffic counters and spurious quota
hits on multi-node setups, most visibly when a client email was renamed
while a node was offline or its PostgreSQL deadlocked.

**Fix 1 — phantom quota (root cause)** `setRemoteTrafficLocked`
new-row path: when master had no `client_traffics` row for an email
that a node reported, it seeded the row with `Up: cs.Up` — importing
the node's full accumulated counter as if it were fresh quota usage.
If the node retained stale data from a previously-deleted account (e.g.
a failed deletion during an outage), the ghost 50 GB appeared on the
new client immediately and triggered `disableInvalidClients` the same
tick. Fixed by seeding at `Up: 0`; the current node value still becomes
the baseline so only future increments count.

**Fix 2 — PostgreSQL deadlock** `addClientTraffic` did a
read-modify-write via `tx.Save(slice)`, issuing UPDATEs in slice order.
Two concurrent goroutines locking the same rows in opposite order
deadlock on PostgreSQL (SQLite avoids this with file-level
serialisation). Replaced with atomic per-email
`UPDATE SET up=up+?, down=down+?` statements. Also preserves the
delayed-start ExpiryTime conversion that `adjustTraffics` computes
in-memory but the old Save path persisted to the DB.

**Fix 3 & 4 — stale `inbound_id` filters** `autoRenewClients` used
`WHERE inbound_id NOT IN (node inbounds)` to skip node clients, but
`client_traffics.inbound_id` is set once on INSERT and never refreshed.
Replaced with an email-based subquery through `client_inbounds` (the
authoritative source). Also added a safe type assertion for
`settings["clients"].([]any)` that previously panicked on nil.

**Fix 5 — stale `inbound_id` in reset** `resetAllClientTrafficsLocked`
used `WHERE inbound_id = ?` to find which emails to reset; same staleness
problem. Replaced with the `client_inbounds` join for email lookup;
the `inbounds.last_traffic_reset_time` update still correctly uses the
inbound ID directly on the `inbounds` table.

Tests updated to reflect the new seeding-at-zero semantics and a new
`TestGhostData_NoPhantomTraffic` test reproduces the exact 50 GB
phantom scenario.
2026-06-20 00:36:35 +02:00
MHSanaei 7fe082a7f1 fix(nodes): stop multi-attached client traffic inflating across node inbounds
Xray counts client traffic globally per email, so a client attached to
several of a node's inbounds has its single shared counter copied onto
every inbound by the node's enriched inbound list. When those copies
diverge (legacy per-inbound rows surviving a v3.2.x->v3.3.x upgrade, or
any drift) the per-inbound delta loop read the lower sibling as a
node-counter reset and re-added its full value, inflating the client far
past real usage (#5274).

Fold each email to its per-field node-wide max before the delta loop so
every occurrence is equal: the per-email baseline dedup then holds and
the reset clamp never misfires.
2026-06-15 19:31:57 +02:00
Rouzbeh† 628406117e fix(nodes): sync "start after first connect" expiry so un-activated nodes do not reset it (#5319)
* fix(nodes): stop un-activated nodes from resetting "start after first connect" expiry

In a multi-node setup a client is attached to inbounds on several nodes, but
its `client_traffics` row is shared per-email (the column is `gorm:"unique"`).
With "start expiry after first connect", the expiry is stored as a negative
duration and each node converts it to an absolute deadline (now+duration) the
first time the client connects *there*.

The master's per-node traffic merge wrote `expiry_time = ?` unconditionally for
every node sync. So a node where the client never connected keeps reporting the
un-activated negative duration and clobbers the absolute deadline that the node
where the client *did* connect had already activated — last writer wins. The
shared row flip-flops and usually lands back on the negative value, so the main
panel shows the timer "not started" while the active node counts down, and the
subscription (which reads this row and recomputes negative as now+duration on
every fetch) reports a perpetually-resetting, wrong expiry and usage.

Guard the merge so an un-activated (<= 0) value reported by a node can never
reset an already-activated absolute deadline. A positive node value is still
adopted, so a node that legitimately moves the deadline forward (traffic reset /
auto-renew) still propagates. The rule lives in both the SQL CASE used by the
merge and a small `mergeActivationExpiry` helper (kept in lockstep) that the
structural-change check reuses so the guard does not trigger spurious config
re-pushes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(nodes): cast expiry merge params to BIGINT for Postgres

The "start after first connect" merge guard introduced the comparison
`? <= 0` in the client_traffics expiry_time CASE. There Postgres infers
the parameter type as int4 from the literal 0, so binding a real expiry
value — a negative start-after-connect duration or a positive absolute
deadline (~1.7e12 ms) — overflows int4 and the whole setRemoteTrafficLocked
transaction fails, breaking node traffic and expiry sync on Postgres.
SQLite (dynamic typing) was unaffected.

Wrap both params in CAST(? AS BIGINT) (portable across SQLite and
Postgres) so the parameter is typed bigint, matching the explicit casts
the sibling GreatestExpr/ClientTrafficEnableMergeExpr helpers already use.

Verified against Postgres 16: TestNodeFirstConnectExpiry_NotClobbered
failed before this change and passes after; SQLite suite unchanged.

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Sanaei <ho3ein.sanaei@gmail.com>
2026-06-15 15:46:19 +02:00
MHSanaei f1a4286e2f feat(sub): per-inbound sort order for subscription links
Add a subSortIndex field to inbounds that controls the order of links
in subscription output only: the raw sub body, the HTML sub page, and
the JSON/Clash formats (all served from the same query). Lower values
come first; ties keep id order. The panel inbound list is unaffected.

The value is editable in the inbound form next to the share-address
fields, propagates to nodes via wireInbound, and follows the usual
node-sync rules (copied on import, mirrored while not dirty, never a
structural change).

Rescoped from #5214 by @Ponywka.
2026-06-12 12:03:22 +02:00
MHSanaei 21143a6d72 fix(node-sync): keep node baseline while a sibling inbound still reports the email (#5202)
The orphan sweeps in setRemoteTrafficLocked deleted the (node, email)
baseline row unconditionally whenever an email was missing from one
inbound's snapshot stats — even though baselines are keyed per node, not
per inbound. For a client attached to two inbounds of the same node whose
stats the node reports under only one of them, the sweep for the other
inbound deleted the baseline at the end of every sync cycle. Depending on
inbound order, the baseline written earlier in the same transaction was
wiped each time, so the next cycle computed delta against a missing
baseline (zero) and the client's traffic froze permanently.

Scope both sweeps to the union of emails across the whole snapshot: a
baseline is only dropped when the email left the node entirely.
2026-06-11 21:20:38 +02:00
animesha3 554d85c2f7 feat: allow selecting inbounds synchronized from nodes (#5178)
* feat: select node inbounds for synchronization

Allow node owners to import either all remote inbounds or an explicit tag-based selection. Add remote inbound discovery, persistence, snapshot filtering, API documentation, tests, and localized UI labels.

* fix

* fix: scope node reconcile and orphan sweep to selected inbound tags

In 'selected' sync mode unselected inbounds never enter the panel DB, so
ReconcileNode treated them as undesired and deleted them from the node the
first time it went config-dirty. Reconcile now only sweeps remote tags that
are part of the selection; everything else on the node is unmanaged.

Panel-created or renamed inbounds on a selected-mode node also vanished:
their tag was outside the selection, so the next traffic pull filtered them
out of the snapshot and the orphan sweep silently dropped the central row.
AddInbound/UpdateInbound now allow the tag on the node before committing.

---------

Co-authored-by: Sanaei <ho3ein.sanaei@gmail.com>
2026-06-11 20:48:26 +02:00
iYuan 2a7342baa9 feat: add inbound share address strategy (#5162)
* feat: add inbound share address strategy

Allow node-managed inbounds to choose whether exported share links use the node address, routable listen address, or a custom endpoint. Preserve locally configured share address fields during remote node traffic sync.

Refs #5161

Refs #4891

* fix: preserve inbound share address settings

Forward share address fields to remote nodes, keep existing values when older update payloads omit them, align localhost handling between frontend and subscriptions, and preserve share address settings when cloning inbounds.

* fix: keep share address strategy out of subscriptions

Limit the new share address strategy to direct exported share links and QR codes. Restore subscription address resolution to the existing panel-owned behavior and update the UI help text accordingly.

* fix: address share address review feedback

* fix: validate custom share address

* fix

---------

Co-authored-by: Sanaei <ho3ein.sanaei@gmail.com>
2026-06-11 20:24:15 +02:00
MHSanaei 8258a26fbf fix(node-sync): keep shared client traffic row when email still lives on other inbounds
client_traffics is the per-email accumulator shared across every inbound
and node the client is attached to. setRemoteTrafficLocked deleted it
unguarded in two sweeps — when a node inbound vanished from the snapshot
(node reinstall, tag change, another master's reconcile on a shared
node) and when an email left one inbound's stats — even though the
email was still attached elsewhere. The next sync then re-seeded the
row with that node's counter alone, so the panel showed the last
changed panel's number instead of the summed total.

Guard both sweeps with emailUsedByOtherInbounds, matching what the
manual-edit path (updateClientTraffics) already does. Truly removed
clients are still cleaned up by the zero-attachment sweep.
2026-06-11 14:28:09 +02:00
Sanaei 41645255f1 refactor: focused service files, leaf subpackages, and an internal/ layout (#5167)
* refactor(service): split client.go into focused files

client.go had grown to 4455 lines mixing ~10 responsibilities. Split it
verbatim into cohesive same-package files (no behavior change):

  client.go            foundation: ClientService, ClientWithAttachments,
                       ClientCreatePayload, ErrClientNotInInbound, sqlInChunk
  client_locks.go      inbound mutation locks, delete tombstones, compactOrphans
  client_lookup.go     read-only lookups (GetByID, List, EffectiveFlow, ...)
  client_link.go       inbound association sync (SyncInbound, DetachInbound, ...)
  client_crud.go       single-client CRUD + validation + protocol defaults
  client_inbound_apply.go  low-level inbound-settings mutators + by-email setters
  client_bulk.go       bulk attach/detach/adjust/delete/create + DelDepleted
  client_traffic.go    traffic-reset paths
  client_groups.go     client group management
  client_paging.go     paged listing, filtering, sorting, summary

Every declaration moved unchanged (verified: identical func/type/const/var
signature set before vs after). Imports redistributed per file via goimports.
go build ./..., go vet, and go test ./web/service/... all pass.

* refactor(service): split inbound.go into focused files

inbound.go was 4100 lines. Split it verbatim into cohesive same-package
files (no behavior change):

  inbound.go             core inbound CRUD + InboundService (keeps pkg doc)
  inbound_protocol.go    protocol / stream capability helpers
  inbound_node.go        node/runtime/remote coordination + online tracking
  inbound_traffic.go     traffic accounting, reset, client stats
  inbound_client_ips.go  per-client IP tracking
  inbound_clients.go     client lookups within inbounds + copy-clients
  inbound_disable.go     auto-disable invalid inbounds/clients
  inbound_migration.go   DB migrations
  inbound_sublink.go     subscription link providers
  inbound_util.go        generic slice/string helpers

Identical func/type/const/var signature set before vs after; package doc
comment preserved on inbound.go. Imports redistributed via goimports.
Build, vet, and go test ./web/service/... all pass.

* refactor(service): split tgbot.go into focused files

tgbot.go was 3738 lines dominated by a 1246-line answerCallback. Split it
verbatim into cohesive same-package files (no behavior change):

  tgbot.go           lifecycle, bot setup, caches, small utils
  tgbot_router.go    incoming update / command / callback dispatch
  tgbot_send.go      outbound messaging primitives
  tgbot_client.go    client views, actions, subscription links
  tgbot_inbound.go   inbound listing / pickers
  tgbot_report.go    server usage, exhausted, online, backups, notifications

Identical func/type/const/var signature set before vs after. Imports
redistributed via goimports. Build, vet, and go test ./web/service/... pass.

* refactor(client): dedupe single-field by-email setters

ResetClientIpLimitByEmail, ResetClientExpiryTimeByEmail, and
ResetClientTrafficLimitByEmail shared an identical ~50-line body that
resolves the inbound by email, confirms the client exists, rewrites a
single-client settings payload, and delegates to UpdateInboundClient.

Extract that into applyClientFieldByEmail(inboundSvc, email, mutate) and
reduce each setter to a 3-line wrapper. Behavior is unchanged: same checks
and error strings, same single-client payload contract, same totalGB guard.

SetClientTelegramUserID (resolves by traffic id, different error text) and
ToggleClientEnableByEmail/SetClientEnableByEmail (different return shape and
a pre-read of the old state) intentionally keep their own bodies.

* refactor(service): extract panel/ subpackage

Move the panel-administration leaf services out of the flat service
package into web/service/panel/ (package panel):

  user.go         UserService (auth / 2FA / LDAP)
  panel.go        PanelService (restart / self-update) + version helpers
  panel_other.go  non-unix RestartPanel
  panel_unix.go   unix RestartPanel
  api_token.go    ApiTokenService
  websocket.go    WebSocketService
  panel_test.go   version/shellQuote unit tests

These are leaves: they depend on core (SettingService, Release) but no
core file references them, so the extraction creates no import cycle.
Core references are now qualified (service.SettingService, service.Release);
callers in main.go, web/web.go, and web/controller/* updated to panel.*.
Build, vet, and go test ./web/... pass.

* refactor(service): extract integration/ subpackage

Move the external-provider integration leaves into web/service/integration/
(package integration):

  warp.go        WarpService (Cloudflare WARP)
  nord.go        NordService (NordVPN)
  custom_geo.go  CustomGeoService (custom geo asset management)
  *_test.go      custom_geo / panel-proxy tests

These depend on core (SettingService, ServerService, XraySettingService) but
no core file references them. xray_setting.go stays in core because it calls
the unexported SettingService.saveSetting. The shared isBlockedIP SSRF helper
(used by core url_safety.go and by custom_geo) now has a small copy in each
package rather than being exported. Core references qualified; callers in
web/web.go, web/job/*, and web/controller/* updated to integration.*.
Build, vet, and go test ./web/... pass.

* refactor(service): extract tgbot/ subpackage

Move the Telegram bot (6 files + test) into web/service/tgbot/ (package
tgbot). It is a leaf: it embeds five core services (Inbound/Client/Setting/
Server/Xray) and the core never references it, so no import cycle.

To support the package boundary without changing behavior:
  - core exposes XrayProcess() *xray.Process so tgbot keeps calling the
    exact same running-process methods it used via the package-level `p`;
  - three core methods tgbot calls are exported: ClientService.checkIs-
    EnabledByEmail -> CheckIsEnabledByEmail, InboundService.getAllEmails ->
    GetAllEmails (callers updated in-package);
  - tgbot's embedded-field types and the few core type refs (Status,
    ClientCreatePayload, SanitizePublicHTTPURL) are now service-qualified.

Callers in main.go, web/web.go, web/job/*, and web/controller/* updated to
tgbot.*. Build, vet, and go test ./web/... pass.

* refactor(service): extract outbound/ subpackage

OutboundService (outbound.go) imports only neutral packages (config,
database, model, xray) and its production code is referenced by no core or
sibling service file — only by web/controller/xray_setting.go and
web/job/xray_traffic_job.go. Move it to web/service/outbound/ (package
outbound); no core qualification needed inside. Callers updated to outbound.*.

The one coupling was a tiny pure test helper, outboundsContainTag, used by
both outbound.go and the core outbound_subscription_test.go; it now has a
small copy in that test file rather than being shared across the boundary.
Build, vet, and go test ./web/... pass.

* refactor(util): move wireguard into its own subpackage

util/wireguard.go was the lone file of the root `util` package (24 lines,
one exported func GenerateWireguardKeypair), while every other util concern
lives in a focused subpackage (util/common, util/crypto, util/netsafe, ...).
Move it to util/wireguard/ (package wireguard) for consistency; its only
importer, web/service/integration/warp.go, is updated. The root `util`
package no longer exists.

* refactor(sub): drop redundant sub prefix from filenames

Inside package sub the subXxx.go prefix just repeats the package name
(like client_*.go did inside service). Rename for consistency; content and
type names are unchanged:

  subController.go    -> controller.go
  subService.go       -> service.go
  subClashService.go  -> clash_service.go
  subJsonService.go   -> json_service.go
  (+ matching _test.go files)

* refactor(controller): rename xui.go -> spa.go

XUIController serves the panel's single-page-app shell; spa.go names that
role plainly (the other controller files are domain-named). File rename only
— the type stays XUIController. api_docs_test.go keys route base paths by
filename, so its "xui.go" case is updated to "spa.go".

* refactor: move backend packages under internal/

Adopt the idiomatic Go application layout: the backend packages now live
under internal/ (a boundary the toolchain enforces), signalling private
implementation instead of a library-style flat root. No runtime behavior
changes — only import paths and a few build/config paths move.

Moved: config, database, logger, mtproto, sub, util, web, xray -> internal/.
main.go stays at the repo root and tools/openapigen stays under tools/ (both
still import internal/* because the internal rule keys off the module root).
The module path github.com/mhsanaei/3x-ui/v3 is unchanged; 149 .go files had
their import prefix rewritten to .../internal/<pkg>.

Couplings the Go compiler can't see, updated to the new layout:
  - frontend i18n imports of web/translation (react.ts, setup.components.ts)
  - vite outDir + eslint/tsconfig ignore globs -> internal/web/dist
  - Dockerfile COPY paths for web/dist and web/translation
  - locale.go os.DirFS("web") disk fallback -> "internal/web"
  - .gitignore and ci.yml go:embed stub for internal/web/dist
  - api_docs_test.go repo-root relative walk (one level deeper)
  - tools/openapigen filesystem package paths; ApiTokenView repointed to the
    web/service/panel subpackage and codegen regenerated (clears a stale
    type the ci.yml codegen check was failing on)

Verified: go build/vet/test (all packages), and frontend typecheck, lint,
vitest (478 tests), and production build into internal/web/dist.

* fix(config): keep test runs from writing logs into the source tree

GetLogFolder() returns a CWD-relative "./log" on Windows. Under `go test`
the working directory is each package's own folder, so InitLogger (called by
tests in web/job, web/service, xray, web/websocket) created stray log/
directories scattered through the source tree (e.g. internal/web/job/log/).

Redirect to a shared temp folder when testing.Testing() reports a test run.
Production behavior is unchanged: Windows still uses ./log next to the binary
and Linux /var/log/x-ui. The log files were always gitignored (*.log) and
never committed; this just stops the noise at the source.

* docs: move subscription-template guide out of root into docs/

sub_templates/ was a top-level folder holding only a README and no actual
templates (3x-ui ships none by design), referenced nowhere and unlinked from
any doc — it read like an empty placeholder cluttering the repo root.

Move the guide to docs/custom-subscription-templates.md (a proper docs home),
reword its intro to read as documentation rather than a folder note, link it
from the Features list in README.md, and drop the empty sub_templates/ folder.

* fix: update stale web/ path references after the internal/ move

The internal/ migration rewrote Go import paths but left some references to
the old top-level layout in docs, comments, and a few runtime disk paths.

Functional (dev-mode only): the disk-serving fallbacks that read the Vite
build from disk when running from source still pointed at web/dist/, which
moved to internal/web/dist/ — so `os.DirFS`/`os.Stat`/`os.ReadFile` in
internal/web/web.go and internal/sub/{sub,controller}.go are corrected.
Production was unaffected (it serves the embedded FS; verified by the Docker
build), but `go run` with a live frontend build silently fell back to embed.

Docs/comments: frontend/README.md, CONTRIBUTING.md, the claude-issue-bot and
release workflows, the openapigen -root help text, and assorted Go comments
now reference internal/web, internal/database, internal/sub, internal/xray,
etc. Package-name mentions (the "web" package), root paths (main.go,
frontend/, install scripts, /etc/x-ui), routes (/panel/api/xray), and the
historical "web/assets no longer exists" note were intentionally left as-is.

* refactor(web): remove the legacy /xui -> /panel redirect middleware

RedirectMiddleware existed only for backward compatibility with the old
`/xui` URL scheme (301-redirecting /xui and /xui/API to /panel and
/panel/api). That cutover was long ago, so drop the middleware, its
registration in initRouter, and the now-inaccurate "URL redirection"
mention in the middleware package doc. Old /xui URLs now 404 like any other
unknown path. HTTPS auto-redirect and auth redirects are unrelated and stay.

* build: fix .dockerignore for internal/ layout and exclude runtime dir

- web/dist -> internal/web/dist: the embedded frontend moved under internal/,
  so the stale exclude no longer matched and the locally-built dist could be
  sent to the build context (the frontend stage rebuilds it fresh anyway).
- exclude x-ui/: the local runtime directory (SQLite db, geo .dat files, xray
  binaries, certs — ~150MB) was being shipped into the build context for no
  reason. Verified the pattern excludes only the directory and still keeps
  x-ui.sh, which the Dockerfile copies to /usr/bin/x-ui.
2026-06-10 15:19:22 +02:00