Commit Graph

15 Commits

Author SHA1 Message Date
MHSanaei fa1a19c03c style: adopt golangci-lint v2 and resolve all findings
Add .golangci.yml (v2): the standard linters plus bodyclose, errorlint, noctx, misspell, rowserrcheck, sqlclosecheck, unconvert, usestdlibvars, with gofumpt + goimports formatters. Enable the std-error-handling exclusion preset for idiomatic Close/Remove/Setenv ignores; scope-exclude SA1019 (parser.ParseDir in tools/openapigen) and ST1005 (intentional capitalized user-facing error copy that tests assert verbatim). No inline nolint directives were introduced.

Resolve all 217 findings behavior-preserving: gofumpt/goimports formatting, explicit blank assignment on intentionally ignored errors, errors.Is/errors.As and %w wrapping, context-aware stdlib calls (CommandContext/QueryContext/NewRequestWithContext/Dialer), staticcheck simplifications, removed redundant conversions, http.StatusOK and http.MethodGet, inlined the go:fix intPtr helper, and deferred sql rows Close. Add a golangci CI job mirroring the existing Go jobs.
2026-06-27 15:42:22 +02:00
MHSanaei b32837e523 fix(node): import per-client traffic history on first sync of a node-hosted inbound
On the first sync of a node-hosted inbound, the central inbound adopted the
node's full lifetime counter but every client_traffics row was seeded at 0 (with
the delta baseline set to the node's current counter). So adding or migrating a
node that already had traffic kept the inbound total correct while every
per-client counter restarted from zero, and the master under-reported per-client
usage by the entire pre-attach history.

Seed a new client_traffics row from the node counter only when the inbound was
created during the same sync (a genuine node-add / inbound re-import); a client
reappearing under a pre-existing inbound still seeds 0, preserving the ghost
protection in TestGhostData_NoPhantomTraffic. The seed is additionally gated on
the delete tombstone so a just-deleted client cannot be resurrected if its
inbound is recreated. Baseline still equals the seeded value, so the next sync
delta is 0 and no traffic is double counted.

Adds TestNodeAdd_ImportsClientHistoryWithNewInbound and
TestNodeAdd_TombstonedClientNotResurrected.
2026-06-25 21:19:27 +02:00
Sanaei adc64bb804 fix(nodes): cloned-node attribution, node-hosted client display (online/speed/counts), and sync robustness (#5488)
* fix(nodes): keep cloned nodes (shared panelGuid) in separate attribution buckets

#4983 keys online/inbound attribution by panelGuid, assuming it is globally unique. Cloned node servers ship an identical panelGuid in their copied settings, so the master collapsed several physical nodes into one bucket: GetMergedNodeTrees merged their online sets under one key and every inbound on those nodes (same origin_node_guid) read that merged set, so the inbound page showed online cross-attributed and counts inflated.

Fall back to the node-unique synthNodeGuid(node.Id) whenever a node's panelGuid is shared by another of the master's direct nodes. Applied consistently at originGuidFor (origin_node_guid write), the online-tree key plus a self-key remap for nodes that report a GUID-keyed tree, effectiveNodeGuid, and recountByGuid's inbound bucketing. sharedNodeGuids computes the collision set. Online now works without node changes; making panelGuids unique restores real-GUID identity and also fixes GUID-keyed IP attribution.

* fix(nodes): extend duplicate-GUID hardening to master collisions, IP attribution, and a heartbeat warning

Builds on the node-vs-node fix: a node's GUID is now also treated as ambiguous when it equals the master's own panelGuid (a node cloned from the master), so the master's local clients and that node can't merge. Centralized as ambiguousNodeGuids(nodes, selfGuid) + effectiveNodeKey(node).

Applied the same node-unique fallback to the GUID-keyed IP attribution that #4983 added but the prior commit left collapsing: MergeClientIpsByGuid remaps a cloned node's own subtree to its node-unique key, nodeGuidNameMap resolves names by that key, and node deletion purges both keys. Added a throttled heartbeat warning so the operator is told to regenerate a duplicate panelGuid. Tests cover master-collision, effectiveNodeKey, and the IP remap.

* fix(node-sync): log the client-IP-attribution 404 once per node, not every cycle

Old-build nodes lack panel/api/clients/clientIpsByGuid and answer 404 on every IP-sync cycle (~10s), which floods the debug log now that the IP phase actually runs. Note the missing endpoint once per node (re-armed if the node later recovers or is upgraded) and keep logging genuine fetch errors.

* fix(nodes): remap a cloned node's own-panelGuid origin so the inbound page shows online

These nodes report their OWN inbounds with their own panelGuid as OriginNodeGuid, so originGuidFor returned the shared GUID verbatim and never remapped it. origin_node_guid stayed the shared GUID while online was keyed under the node-unique key, so the inbound page (which reads the stored origin_node_guid) looked up an empty bucket and showed everyone offline — even though the Nodes page (which derives the key live) was correct. Treat an origin equal to the node's own panelGuid as the node's own inbound and resolve it through selfKey; keep only a genuinely different (descendant) origin across hops.

* fix(node-sync): don't delete a node's central inbounds when its snapshot is empty

The central-inbound sweep deletes any central inbound whose tag is absent from the node's snapshot, with no guard for an empty snapshot. A node mid-restart or with a transient DB error (e.g. Postgres 57P01) can return an empty inbound list with success=true, which wiped all of that node's central inbounds and their clients (and reset traffic history on re-create) — observed on the Germany node: 0 clients but still 44 online (online survives because it comes from the snapshot's online tree, not the central inbound). Skip the sweep entirely when the snapshot reports zero inbounds; a real per-inbound deletion still sweeps via a non-empty snapshot that omits one tag.

* fix(email): stay silent when SMTP notifications are disabled

The event subscriber is registered unconditionally and only checked the per-event list (smtpEnabledEvents, default login.attempt,cpu.high) — not the smtpEnable master toggle. Login events are always published, so a panel with smtpEnable=false still attempted a send on every login and logged 'email subscriber: send failed: smtp host not configured'. Gate HandleEvent on GetSmtpEnable() so a disabled-SMTP panel does nothing, matching the comment where the subscriber is registered.

* fix(nodes): count only expired/exhausted as 'ended', not disabled clients

The per-node depleted (ended) count folded disabled clients in with expired/exhausted (expired || exhausted || !Enable), so the Nodes page 'ended' chip was inflated and inconsistent with the inbound page, where disabled and depleted are separate buckets. Count only expired/exhausted in both GetAll and recountByGuid so 'ended' means the same thing on both pages.

* feat(nodes): show live speed for node-hosted inbounds

Inbound speed is computed on the dashboard from a 'traffics' delta feed, which only the local Xray poll produced — so node-hosted inbounds showed no speed. The node sync now diffs successive per-inbound cumulative totals (it polls @5s, same as the local poll) and broadcasts the byte deltas as a separate 'nodeTraffics' field, keyed by the central tag the dashboard already matches. The frontend applies 'traffics' to local inbounds and 'nodeTraffics' to node inbounds within their own scope, so the two 5s polls don't clobber each other and idle inbounds still clear. Deltas clamp to 0 on a reset; a node that fails to sync keeps a stale total so its delta is 0 (no phantom speed).

* fix(nodes): normalize node-inbound speed by elapsed time to avoid recovery spikes

Adversarial review found that a node's cumulative inbound counter keeps climbing while the master can't reach it, so the first delta after a gap (node outage, skipped poll, slow node) spans more than one 5s window but was still divided by the dashboard's fixed 5s — rendering an impossible one-tick speed spike on recovery (and a 2x over-report after a skipped poll). Now each delta is normalized to the fixed window using the real elapsed time since the inbound's counter last changed, so a backlog shows the true average rate over the gap. The change timestamp advances only on actual movement, so idle stretches average correctly when traffic resumes; resets rebaseline. Also moves the maybePushGlobals doc comment back onto its function.

* fix(inbounds): keep last speed across page navigation instead of blanking

Speed is delta-derived, so it can't be recomputed until the first poll after mount. The websocket subscription and speed state are page-scoped (useWebSocket lives in InboundsPage), so leaving to another page and returning blanked the Speed column for up to one 5s poll. Cache the last speed map across mounts (module scope, 15s recency guard) and seed the state from it, so returning shows the last throughput immediately and the next poll refreshes it. Applies to both local and node-hosted inbound speed.

* fix(inbounds): rebalance table column widths so it fills width without gaps

Inbound list columns had small fixed widths summing far below the table's
full width, so AntD spread the leftover space evenly into wide empty gaps.
Widen the content-heavy columns (protocol, clients, traffic, node) so the
slack lands there, keep the small ones (id, port, enable) tight, and make
scroll.x track the visible columns' total so the table never collapses
below content and adapts when conditional columns are hidden.

* feat(nodes): show active/disabled client counts on the nodes page like inbounds

The nodes page only showed total/online/ended, and (since ended now excludes disabled) disabled clients were invisible there. Compute per-node active and disabled counts — in both GetAll and recountByGuid, with the same depleted-wins-over-disabled precedence the inbound page uses so the buckets stay mutually exclusive — and render total/active/disabled/ended/online chips matching the inbound page (table column + mobile stats modal).

* fix(nodes): count active/disabled/ended by client email, not stale inbound_id

The per-node client breakdown filtered client_traffics by inbound_id, but that column goes stale after an inbound is delete+recreated (e.g. the Germany node), so almost every traffic row pointed at a dead inbound id and the counts collapsed — active showed ~5 instead of ~1100. Classify each node client via client_inbounds -> clients joined to client_traffics by EMAIL (the reliable key), deduped per node/guid, in both GetAll and recountByGuid. Now active/disabled/ended on the nodes page match the inbound page. Added a regression test that proves matching works with a deliberately stale inbound_id.

* style(nodes): widen Clients column so the count chips fit one tidy line

After adding the active/disabled chips, the 5 chips (total/active/disabled/ended/online) no longer fit the 160px Clients column and wrapped to two lines. Widen it to 220 and drop the Space wrap so they render on a single line like the inbound page, and zero the total tag's margin for even spacing. Same principle as 79ff283 (give the content column enough width).

* style(nodes): tighten Clients chip spacing to match the inbound page

AntD's default tag side-padding (~8px) put a wide gap between the count chips. Apply the inbound page's compact padding ('0 2px') + client-count-tag (tabular-nums) to each chip and narrow the column to 180 so the numbers sit close together like the inbound list instead of floating apart.
2026-06-22 20:20:55 +02:00
MHSanaei 7d23a2c15b perf: prevent cron job overlap, auto-set GOMEMLIMIT, fix tgbot userStates race
cron: SkipIfStillRunning stops a slow 5s/10s job from overlapping itself and racing the shared xrayAPI (grpc conn leak) and the StatsLastValues map (fatal concurrent map write). memlimit: auto-detect a Go soft memory limit from XUI_MEMORY_LIMIT, the cgroup limit, or system RAM (about 90 percent); opt-in pprof via XUI_PPROF. tgbot: userStates now goes through a mutex-guarded store with TTL pruning (was raced by worker-pool and delayed-delete goroutines). check_client_ip: prefilter inbounds by settings LIKE limitIp instead of loading and JSON-parsing all of them every scan. minor: prune StatsLastValues, RateLimiter.lastSent, reportedRemoteTagConflict. docker-compose: document the memory knobs.
2026-06-22 02:48:58 +02:00
Sanaei 679d2e1cca fix: resolve a batch of open bug-tagged issues (traffic accounting, share strategy, sub address, CPU) (#5477)
* fix(node): never re-add a node's full counter on reset/restart (#5456, #5476, #5390)

When a node's per-client counter dips below the master's stored baseline
(node reboot, xray restart, or a reset propagated to the node), the delta
accounting clamped delta to the node's whole current counter and re-added it
to the master total — double-counting a client's lifetime usage in a single
sync and often pushing them over quota. Treat a backward-moving counter as a
reset: add 0 and rebaseline to the reported value, so only genuine post-reset
usage accrues.

Resets also now clear the per-node NodeClientTraffic baseline (ResetClient
TrafficByEmail, resetClientTrafficLocked, BulkResetTraffic, resetAllClient
TrafficsLocked), mirroring the delete paths. Without this the node's pre-reset
cumulative — including traffic it had counted but not yet synced — leaks back
onto the master after a reset, which is the 'reset reverts after a while'
report. The next sync then takes the clean delta=0 + rebaseline path regardless
of node state.

Updates TestNodeCounterReset (was _Clamped, now _NoReAdd) to assert rebaseline
instead of re-add, and adds TestCentralResetClearsNodeBaseline_NoLeak.

* fix(inbound): keep persisted node share strategy on edit (#5375)

Opening the edit modal silently reverted shareAddrStrategy from 'node' to
'listen'. The downgrade effect fires before the form settles: availableNodes
is an empty placeholder until /nodes/list resolves, and Form.useWatch('protocol')
is briefly empty on the first edit render — both transiently make the node
option look unavailable, so the effect clobbered the saved value.

Gate the downgrade on availableNodesFetched (threaded from useNodesQuery through
InboundsPage) and on the protocol watch being settled, so a persisted strategy
is only downgraded when the node option is genuinely unavailable. Adds a
rerender-based regression test covering the nodes-loading race.

* <3

* perf(traffic): skip cross-panel quota subquery when no globals exist (#5392, #5389)

disableInvalidClients ran a correlated EXISTS against client_global_traffics
on the full client_traffics table every 5s. On a panel no master pushes to,
that table is empty so the subquery can never match — yet it forced a full
scan that pegged Postgres at 100% CPU on large client counts. Probe the table
first and drop the EXISTS branch when it's empty (the common case), and add an
idx_client_global_email index so the subquery is an index lookup when globals
are present. Cross-panel enforcement is unchanged (TestGlobalUsage_DisablesClient).

This also relieves #5389 ('traffic writer queue full' / panel freeze): the
heavy query runs inside the serialized traffic write, so a slow DB backs the
shared writer queue up until request handlers block.

* fix(sub): don't advertise a leaked client IP for local wildcard inbounds (#5425)

For a local inbound with no node, no custom share address, and a wildcard/blank
listen, resolveInboundAddress fell straight through to the subscriber's request
host. Behind NAT/proxy/CDN that Host can be the requesting client's own IP, so
the subscription wrote the client's address into the inbound instead of the
server's — while the panel's own share link (which doesn't use the request host)
stayed correct.

Prefer the admin's configured public host (Sub/Web domain) over the raw request
host for this last-resort fallback. With no configured host the request host
still stands, so existing single-domain setups are unaffected.
2026-06-22 00:22:28 +02:00
MHSanaei 6a032bcb2a perf(scale): speed up traffic, auto-renew, and node bulk ops at 50k-100k clients
Local hot paths:
- autoRenewClients: replace the O(clients x expired) inner scan with an
  email->traffic map lookup (quadratic at scale).
- node traffic sync: scope the client_traffics email-membership query to the
  snapshot's emails instead of plucking the whole table every poll.
- add a (expiry_time, reset) index for the per-tick auto-renew filter.
- SQLite: add cache_size/mmap_size/temp_store pragmas (env-tunable); keep the
  single-file DELETE journal and synchronous=FULL defaults.
- scale benchmarks now run on SQLite too via XUI_SCALE_TEST=1 (shared
  setupScaleDB/resetScaleTables helpers), not just Postgres.

Node paths:
- bulk add/delete/adjust on a node-attached inbound folded one HTTP RPC per
  client; above nodeBulkPushThreshold (32) mark the node dirty and let one
  ReconcileNode push converge it instead of O(M) sequential round-trips.
  Small ops keep the live per-client path. Also hoist nodePushPlan out of the
  per-email delete loop.
- ReconcileNode skips inbounds whose wire payload is unchanged (per-tag
  fingerprint on Remote), guarded by node-side tag presence so a restarted
  node is still re-seeded.

Tests: auto-renew multi-inbound correctness, node-path dispatch (large ops
fold to dirty, small ops push live) via a manager runtime override seam, and
reconcile delta-skip.
2026-06-20 10:35:46 +02:00
Younes fb03b0e9f1 fix(traffic): prevent phantom quota consumption from stale node data (#5412)
Three related bugs caused inflated traffic counters and spurious quota
hits on multi-node setups, most visibly when a client email was renamed
while a node was offline or its PostgreSQL deadlocked.

**Fix 1 — phantom quota (root cause)** `setRemoteTrafficLocked`
new-row path: when master had no `client_traffics` row for an email
that a node reported, it seeded the row with `Up: cs.Up` — importing
the node's full accumulated counter as if it were fresh quota usage.
If the node retained stale data from a previously-deleted account (e.g.
a failed deletion during an outage), the ghost 50 GB appeared on the
new client immediately and triggered `disableInvalidClients` the same
tick. Fixed by seeding at `Up: 0`; the current node value still becomes
the baseline so only future increments count.

**Fix 2 — PostgreSQL deadlock** `addClientTraffic` did a
read-modify-write via `tx.Save(slice)`, issuing UPDATEs in slice order.
Two concurrent goroutines locking the same rows in opposite order
deadlock on PostgreSQL (SQLite avoids this with file-level
serialisation). Replaced with atomic per-email
`UPDATE SET up=up+?, down=down+?` statements. Also preserves the
delayed-start ExpiryTime conversion that `adjustTraffics` computes
in-memory but the old Save path persisted to the DB.

**Fix 3 & 4 — stale `inbound_id` filters** `autoRenewClients` used
`WHERE inbound_id NOT IN (node inbounds)` to skip node clients, but
`client_traffics.inbound_id` is set once on INSERT and never refreshed.
Replaced with an email-based subquery through `client_inbounds` (the
authoritative source). Also added a safe type assertion for
`settings["clients"].([]any)` that previously panicked on nil.

**Fix 5 — stale `inbound_id` in reset** `resetAllClientTrafficsLocked`
used `WHERE inbound_id = ?` to find which emails to reset; same staleness
problem. Replaced with the `client_inbounds` join for email lookup;
the `inbounds.last_traffic_reset_time` update still correctly uses the
inbound ID directly on the `inbounds` table.

Tests updated to reflect the new seeding-at-zero semantics and a new
`TestGhostData_NoPhantomTraffic` test reproduces the exact 50 GB
phantom scenario.
2026-06-20 00:36:35 +02:00
MHSanaei 7fe082a7f1 fix(nodes): stop multi-attached client traffic inflating across node inbounds
Xray counts client traffic globally per email, so a client attached to
several of a node's inbounds has its single shared counter copied onto
every inbound by the node's enriched inbound list. When those copies
diverge (legacy per-inbound rows surviving a v3.2.x->v3.3.x upgrade, or
any drift) the per-inbound delta loop read the lower sibling as a
node-counter reset and re-added its full value, inflating the client far
past real usage (#5274).

Fold each email to its per-field node-wide max before the delta loop so
every occurrence is equal: the per-email baseline dedup then holds and
the reset clamp never misfires.
2026-06-15 19:31:57 +02:00
Rouzbeh† 628406117e fix(nodes): sync "start after first connect" expiry so un-activated nodes do not reset it (#5319)
* fix(nodes): stop un-activated nodes from resetting "start after first connect" expiry

In a multi-node setup a client is attached to inbounds on several nodes, but
its `client_traffics` row is shared per-email (the column is `gorm:"unique"`).
With "start expiry after first connect", the expiry is stored as a negative
duration and each node converts it to an absolute deadline (now+duration) the
first time the client connects *there*.

The master's per-node traffic merge wrote `expiry_time = ?` unconditionally for
every node sync. So a node where the client never connected keeps reporting the
un-activated negative duration and clobbers the absolute deadline that the node
where the client *did* connect had already activated — last writer wins. The
shared row flip-flops and usually lands back on the negative value, so the main
panel shows the timer "not started" while the active node counts down, and the
subscription (which reads this row and recomputes negative as now+duration on
every fetch) reports a perpetually-resetting, wrong expiry and usage.

Guard the merge so an un-activated (<= 0) value reported by a node can never
reset an already-activated absolute deadline. A positive node value is still
adopted, so a node that legitimately moves the deadline forward (traffic reset /
auto-renew) still propagates. The rule lives in both the SQL CASE used by the
merge and a small `mergeActivationExpiry` helper (kept in lockstep) that the
structural-change check reuses so the guard does not trigger spurious config
re-pushes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(nodes): cast expiry merge params to BIGINT for Postgres

The "start after first connect" merge guard introduced the comparison
`? <= 0` in the client_traffics expiry_time CASE. There Postgres infers
the parameter type as int4 from the literal 0, so binding a real expiry
value — a negative start-after-connect duration or a positive absolute
deadline (~1.7e12 ms) — overflows int4 and the whole setRemoteTrafficLocked
transaction fails, breaking node traffic and expiry sync on Postgres.
SQLite (dynamic typing) was unaffected.

Wrap both params in CAST(? AS BIGINT) (portable across SQLite and
Postgres) so the parameter is typed bigint, matching the explicit casts
the sibling GreatestExpr/ClientTrafficEnableMergeExpr helpers already use.

Verified against Postgres 16: TestNodeFirstConnectExpiry_NotClobbered
failed before this change and passes after; SQLite suite unchanged.

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Sanaei <ho3ein.sanaei@gmail.com>
2026-06-15 15:46:19 +02:00
MHSanaei f1a4286e2f feat(sub): per-inbound sort order for subscription links
Add a subSortIndex field to inbounds that controls the order of links
in subscription output only: the raw sub body, the HTML sub page, and
the JSON/Clash formats (all served from the same query). Lower values
come first; ties keep id order. The panel inbound list is unaffected.

The value is editable in the inbound form next to the share-address
fields, propagates to nodes via wireInbound, and follows the usual
node-sync rules (copied on import, mirrored while not dirty, never a
structural change).

Rescoped from #5214 by @Ponywka.
2026-06-12 12:03:22 +02:00
MHSanaei 21143a6d72 fix(node-sync): keep node baseline while a sibling inbound still reports the email (#5202)
The orphan sweeps in setRemoteTrafficLocked deleted the (node, email)
baseline row unconditionally whenever an email was missing from one
inbound's snapshot stats — even though baselines are keyed per node, not
per inbound. For a client attached to two inbounds of the same node whose
stats the node reports under only one of them, the sweep for the other
inbound deleted the baseline at the end of every sync cycle. Depending on
inbound order, the baseline written earlier in the same transaction was
wiped each time, so the next cycle computed delta against a missing
baseline (zero) and the client's traffic froze permanently.

Scope both sweeps to the union of emails across the whole snapshot: a
baseline is only dropped when the email left the node entirely.
2026-06-11 21:20:38 +02:00
animesha3 554d85c2f7 feat: allow selecting inbounds synchronized from nodes (#5178)
* feat: select node inbounds for synchronization

Allow node owners to import either all remote inbounds or an explicit tag-based selection. Add remote inbound discovery, persistence, snapshot filtering, API documentation, tests, and localized UI labels.

* fix

* fix: scope node reconcile and orphan sweep to selected inbound tags

In 'selected' sync mode unselected inbounds never enter the panel DB, so
ReconcileNode treated them as undesired and deleted them from the node the
first time it went config-dirty. Reconcile now only sweeps remote tags that
are part of the selection; everything else on the node is unmanaged.

Panel-created or renamed inbounds on a selected-mode node also vanished:
their tag was outside the selection, so the next traffic pull filtered them
out of the snapshot and the orphan sweep silently dropped the central row.
AddInbound/UpdateInbound now allow the tag on the node before committing.

---------

Co-authored-by: Sanaei <ho3ein.sanaei@gmail.com>
2026-06-11 20:48:26 +02:00
iYuan 2a7342baa9 feat: add inbound share address strategy (#5162)
* feat: add inbound share address strategy

Allow node-managed inbounds to choose whether exported share links use the node address, routable listen address, or a custom endpoint. Preserve locally configured share address fields during remote node traffic sync.

Refs #5161

Refs #4891

* fix: preserve inbound share address settings

Forward share address fields to remote nodes, keep existing values when older update payloads omit them, align localhost handling between frontend and subscriptions, and preserve share address settings when cloning inbounds.

* fix: keep share address strategy out of subscriptions

Limit the new share address strategy to direct exported share links and QR codes. Restore subscription address resolution to the existing panel-owned behavior and update the UI help text accordingly.

* fix: address share address review feedback

* fix: validate custom share address

* fix

---------

Co-authored-by: Sanaei <ho3ein.sanaei@gmail.com>
2026-06-11 20:24:15 +02:00
MHSanaei 8258a26fbf fix(node-sync): keep shared client traffic row when email still lives on other inbounds
client_traffics is the per-email accumulator shared across every inbound
and node the client is attached to. setRemoteTrafficLocked deleted it
unguarded in two sweeps — when a node inbound vanished from the snapshot
(node reinstall, tag change, another master's reconcile on a shared
node) and when an email left one inbound's stats — even though the
email was still attached elsewhere. The next sync then re-seeded the
row with that node's counter alone, so the panel showed the last
changed panel's number instead of the summed total.

Guard both sweeps with emailUsedByOtherInbounds, matching what the
manual-edit path (updateClientTraffics) already does. Truly removed
clients are still cleaned up by the zero-attachment sweep.
2026-06-11 14:28:09 +02:00
Sanaei 41645255f1 refactor: focused service files, leaf subpackages, and an internal/ layout (#5167)
* refactor(service): split client.go into focused files

client.go had grown to 4455 lines mixing ~10 responsibilities. Split it
verbatim into cohesive same-package files (no behavior change):

  client.go            foundation: ClientService, ClientWithAttachments,
                       ClientCreatePayload, ErrClientNotInInbound, sqlInChunk
  client_locks.go      inbound mutation locks, delete tombstones, compactOrphans
  client_lookup.go     read-only lookups (GetByID, List, EffectiveFlow, ...)
  client_link.go       inbound association sync (SyncInbound, DetachInbound, ...)
  client_crud.go       single-client CRUD + validation + protocol defaults
  client_inbound_apply.go  low-level inbound-settings mutators + by-email setters
  client_bulk.go       bulk attach/detach/adjust/delete/create + DelDepleted
  client_traffic.go    traffic-reset paths
  client_groups.go     client group management
  client_paging.go     paged listing, filtering, sorting, summary

Every declaration moved unchanged (verified: identical func/type/const/var
signature set before vs after). Imports redistributed per file via goimports.
go build ./..., go vet, and go test ./web/service/... all pass.

* refactor(service): split inbound.go into focused files

inbound.go was 4100 lines. Split it verbatim into cohesive same-package
files (no behavior change):

  inbound.go             core inbound CRUD + InboundService (keeps pkg doc)
  inbound_protocol.go    protocol / stream capability helpers
  inbound_node.go        node/runtime/remote coordination + online tracking
  inbound_traffic.go     traffic accounting, reset, client stats
  inbound_client_ips.go  per-client IP tracking
  inbound_clients.go     client lookups within inbounds + copy-clients
  inbound_disable.go     auto-disable invalid inbounds/clients
  inbound_migration.go   DB migrations
  inbound_sublink.go     subscription link providers
  inbound_util.go        generic slice/string helpers

Identical func/type/const/var signature set before vs after; package doc
comment preserved on inbound.go. Imports redistributed via goimports.
Build, vet, and go test ./web/service/... all pass.

* refactor(service): split tgbot.go into focused files

tgbot.go was 3738 lines dominated by a 1246-line answerCallback. Split it
verbatim into cohesive same-package files (no behavior change):

  tgbot.go           lifecycle, bot setup, caches, small utils
  tgbot_router.go    incoming update / command / callback dispatch
  tgbot_send.go      outbound messaging primitives
  tgbot_client.go    client views, actions, subscription links
  tgbot_inbound.go   inbound listing / pickers
  tgbot_report.go    server usage, exhausted, online, backups, notifications

Identical func/type/const/var signature set before vs after. Imports
redistributed via goimports. Build, vet, and go test ./web/service/... pass.

* refactor(client): dedupe single-field by-email setters

ResetClientIpLimitByEmail, ResetClientExpiryTimeByEmail, and
ResetClientTrafficLimitByEmail shared an identical ~50-line body that
resolves the inbound by email, confirms the client exists, rewrites a
single-client settings payload, and delegates to UpdateInboundClient.

Extract that into applyClientFieldByEmail(inboundSvc, email, mutate) and
reduce each setter to a 3-line wrapper. Behavior is unchanged: same checks
and error strings, same single-client payload contract, same totalGB guard.

SetClientTelegramUserID (resolves by traffic id, different error text) and
ToggleClientEnableByEmail/SetClientEnableByEmail (different return shape and
a pre-read of the old state) intentionally keep their own bodies.

* refactor(service): extract panel/ subpackage

Move the panel-administration leaf services out of the flat service
package into web/service/panel/ (package panel):

  user.go         UserService (auth / 2FA / LDAP)
  panel.go        PanelService (restart / self-update) + version helpers
  panel_other.go  non-unix RestartPanel
  panel_unix.go   unix RestartPanel
  api_token.go    ApiTokenService
  websocket.go    WebSocketService
  panel_test.go   version/shellQuote unit tests

These are leaves: they depend on core (SettingService, Release) but no
core file references them, so the extraction creates no import cycle.
Core references are now qualified (service.SettingService, service.Release);
callers in main.go, web/web.go, and web/controller/* updated to panel.*.
Build, vet, and go test ./web/... pass.

* refactor(service): extract integration/ subpackage

Move the external-provider integration leaves into web/service/integration/
(package integration):

  warp.go        WarpService (Cloudflare WARP)
  nord.go        NordService (NordVPN)
  custom_geo.go  CustomGeoService (custom geo asset management)
  *_test.go      custom_geo / panel-proxy tests

These depend on core (SettingService, ServerService, XraySettingService) but
no core file references them. xray_setting.go stays in core because it calls
the unexported SettingService.saveSetting. The shared isBlockedIP SSRF helper
(used by core url_safety.go and by custom_geo) now has a small copy in each
package rather than being exported. Core references qualified; callers in
web/web.go, web/job/*, and web/controller/* updated to integration.*.
Build, vet, and go test ./web/... pass.

* refactor(service): extract tgbot/ subpackage

Move the Telegram bot (6 files + test) into web/service/tgbot/ (package
tgbot). It is a leaf: it embeds five core services (Inbound/Client/Setting/
Server/Xray) and the core never references it, so no import cycle.

To support the package boundary without changing behavior:
  - core exposes XrayProcess() *xray.Process so tgbot keeps calling the
    exact same running-process methods it used via the package-level `p`;
  - three core methods tgbot calls are exported: ClientService.checkIs-
    EnabledByEmail -> CheckIsEnabledByEmail, InboundService.getAllEmails ->
    GetAllEmails (callers updated in-package);
  - tgbot's embedded-field types and the few core type refs (Status,
    ClientCreatePayload, SanitizePublicHTTPURL) are now service-qualified.

Callers in main.go, web/web.go, web/job/*, and web/controller/* updated to
tgbot.*. Build, vet, and go test ./web/... pass.

* refactor(service): extract outbound/ subpackage

OutboundService (outbound.go) imports only neutral packages (config,
database, model, xray) and its production code is referenced by no core or
sibling service file — only by web/controller/xray_setting.go and
web/job/xray_traffic_job.go. Move it to web/service/outbound/ (package
outbound); no core qualification needed inside. Callers updated to outbound.*.

The one coupling was a tiny pure test helper, outboundsContainTag, used by
both outbound.go and the core outbound_subscription_test.go; it now has a
small copy in that test file rather than being shared across the boundary.
Build, vet, and go test ./web/... pass.

* refactor(util): move wireguard into its own subpackage

util/wireguard.go was the lone file of the root `util` package (24 lines,
one exported func GenerateWireguardKeypair), while every other util concern
lives in a focused subpackage (util/common, util/crypto, util/netsafe, ...).
Move it to util/wireguard/ (package wireguard) for consistency; its only
importer, web/service/integration/warp.go, is updated. The root `util`
package no longer exists.

* refactor(sub): drop redundant sub prefix from filenames

Inside package sub the subXxx.go prefix just repeats the package name
(like client_*.go did inside service). Rename for consistency; content and
type names are unchanged:

  subController.go    -> controller.go
  subService.go       -> service.go
  subClashService.go  -> clash_service.go
  subJsonService.go   -> json_service.go
  (+ matching _test.go files)

* refactor(controller): rename xui.go -> spa.go

XUIController serves the panel's single-page-app shell; spa.go names that
role plainly (the other controller files are domain-named). File rename only
— the type stays XUIController. api_docs_test.go keys route base paths by
filename, so its "xui.go" case is updated to "spa.go".

* refactor: move backend packages under internal/

Adopt the idiomatic Go application layout: the backend packages now live
under internal/ (a boundary the toolchain enforces), signalling private
implementation instead of a library-style flat root. No runtime behavior
changes — only import paths and a few build/config paths move.

Moved: config, database, logger, mtproto, sub, util, web, xray -> internal/.
main.go stays at the repo root and tools/openapigen stays under tools/ (both
still import internal/* because the internal rule keys off the module root).
The module path github.com/mhsanaei/3x-ui/v3 is unchanged; 149 .go files had
their import prefix rewritten to .../internal/<pkg>.

Couplings the Go compiler can't see, updated to the new layout:
  - frontend i18n imports of web/translation (react.ts, setup.components.ts)
  - vite outDir + eslint/tsconfig ignore globs -> internal/web/dist
  - Dockerfile COPY paths for web/dist and web/translation
  - locale.go os.DirFS("web") disk fallback -> "internal/web"
  - .gitignore and ci.yml go:embed stub for internal/web/dist
  - api_docs_test.go repo-root relative walk (one level deeper)
  - tools/openapigen filesystem package paths; ApiTokenView repointed to the
    web/service/panel subpackage and codegen regenerated (clears a stale
    type the ci.yml codegen check was failing on)

Verified: go build/vet/test (all packages), and frontend typecheck, lint,
vitest (478 tests), and production build into internal/web/dist.

* fix(config): keep test runs from writing logs into the source tree

GetLogFolder() returns a CWD-relative "./log" on Windows. Under `go test`
the working directory is each package's own folder, so InitLogger (called by
tests in web/job, web/service, xray, web/websocket) created stray log/
directories scattered through the source tree (e.g. internal/web/job/log/).

Redirect to a shared temp folder when testing.Testing() reports a test run.
Production behavior is unchanged: Windows still uses ./log next to the binary
and Linux /var/log/x-ui. The log files were always gitignored (*.log) and
never committed; this just stops the noise at the source.

* docs: move subscription-template guide out of root into docs/

sub_templates/ was a top-level folder holding only a README and no actual
templates (3x-ui ships none by design), referenced nowhere and unlinked from
any doc — it read like an empty placeholder cluttering the repo root.

Move the guide to docs/custom-subscription-templates.md (a proper docs home),
reword its intro to read as documentation rather than a folder note, link it
from the Features list in README.md, and drop the empty sub_templates/ folder.

* fix: update stale web/ path references after the internal/ move

The internal/ migration rewrote Go import paths but left some references to
the old top-level layout in docs, comments, and a few runtime disk paths.

Functional (dev-mode only): the disk-serving fallbacks that read the Vite
build from disk when running from source still pointed at web/dist/, which
moved to internal/web/dist/ — so `os.DirFS`/`os.Stat`/`os.ReadFile` in
internal/web/web.go and internal/sub/{sub,controller}.go are corrected.
Production was unaffected (it serves the embedded FS; verified by the Docker
build), but `go run` with a live frontend build silently fell back to embed.

Docs/comments: frontend/README.md, CONTRIBUTING.md, the claude-issue-bot and
release workflows, the openapigen -root help text, and assorted Go comments
now reference internal/web, internal/database, internal/sub, internal/xray,
etc. Package-name mentions (the "web" package), root paths (main.go,
frontend/, install scripts, /etc/x-ui), routes (/panel/api/xray), and the
historical "web/assets no longer exists" note were intentionally left as-is.

* refactor(web): remove the legacy /xui -> /panel redirect middleware

RedirectMiddleware existed only for backward compatibility with the old
`/xui` URL scheme (301-redirecting /xui and /xui/API to /panel and
/panel/api). That cutover was long ago, so drop the middleware, its
registration in initRouter, and the now-inaccurate "URL redirection"
mention in the middleware package doc. Old /xui URLs now 404 like any other
unknown path. HTTPS auto-redirect and auth redirects are unrelated and stay.

* build: fix .dockerignore for internal/ layout and exclude runtime dir

- web/dist -> internal/web/dist: the embedded frontend moved under internal/,
  so the stale exclude no longer matched and the locally-built dist could be
  sent to the build context (the frontend stage rebuilds it fresh anyway).
- exclude x-ui/: the local runtime directory (SQLite db, geo .dat files, xray
  binaries, certs — ~150MB) was being shipped into the build context for no
  reason. Verified the pattern excludes only the directory and still keeps
  x-ui.sh, which the Dockerfile copies to /usr/bin/x-ui.
2026-06-10 15:19:22 +02:00