Jun 14 2026

Development Update — June 14

A day of grounding the dmsg-only default in reality: as the network leans harder on the overlay, the failure modes that used to be masked by a plain-HTTP fallback now have to be fixed properly. Two of them were boot-and-reconnect wedges: a dmsg-only visor that spun forever trying to reach the transport discovery, and a SUDPH client that kept retrying a closed socket without ever recovering. Alongside those, a new inject_pk mode makes forwarded websites HTTP-aware (the backend finally learns who is connecting), the router’s setup-failure diagnostics are made panic-safe and then richer, and a couple of CLI/UI papercuts get smoothed.

Skywire: Forwarded Websites Learn Who Is Connecting

3104 feat(serve): inject authenticated caller PK into HTTP-mode forwarded ports turns a forwarded website from a blind TCP splice into something a backend can reason about. The visor already holds the noise-authenticated caller PK — it uses it for the per-port whitelist — but the backend never saw it. inject_pk makes the forward HTTP-aware: the visor reverse-proxies to the backend and stamps X-Skywire-Remote-PK (the caller’s 66-hex key) and X-Skywire-Transport (dmsg or skynet), stripping any client-supplied copy first so they can’t be spoofed over this path. A forwarded site can then do per-PK auth or personalization in any language by reading one header — the moral equivalent of an mTLS-terminating proxy exposing the verified client cert. The trust model is documented carefully: the header is only authoritative if the backend is reachable only through the visor (bind loopback), and sites also on clearnet should separate the skywire ingress by listener — a loopback-only Caddy listener the visor forwards to — not by source IP, which fails open under tunnels and containers. It ships with serve --inject-pk (and --preserve-host), a guide, and a runnable pk-aware-website example.

Skywire: The dmsg-only TPD Boot Wedge

3108 fix(visor): dmsg-only TPD boot wedge — drop the plain-HTTP fallback in connectToTpDisc fixes a wedge that only the dmsg-only default could trigger. connectToTpDisc tried DMSG-HTTP first and, on any miss, fell through to a plain-HTTP retrier — but in a dmsg-only config (Transport.Discovery empty, only Transport.DiscoveryDmsg set) that fallback built its client against the empty Discovery URL, which resolves to http://localhost. The request still went out over the dmsg-HTTP RoundTripper, which rejects host localhost as a dmsg address, and with tries=0 the retrier spun on that error forever — so transport_manager.Serve never returned and the whole visor init wedged, RPC never binding. It only triggered when the DMSG-first attempt happened to miss (e.g. dmsg sessions to the TPD’s server not yet established at boot), which made it intermittent and maddening. The fix collapses the two paths into one: resolve the TPD URL once (prefer the dmsg address), build the matching client by scheme, and retry that single client — no silent transport fallback, and a clear error if neither is configured.

Skywire: SUDPH Reconnects on a Closed Socket

3106 fix(ar): rebuild SUDPH packet-filter conn on reconnect (it’s kcp-closed) fixes a permanent wedge in the SUDPH address-resolver link. The client reused c.sudphConn across reconnects, guarded by if c.sudphConn == nil, to keep the local UDP port stable. But a kcp client session closes its underlying conn on Close, so when an arConn died and was closed, kcp closed the shared sudphConn too. The nil-guard still saw it as non-nil, so every reconnect reused the dead socket and failed forever with “use of closed network connection,” leaving the visor’s SUDPH/AR registration permanently wedged and producing a steady stream of handshake-timeout noise from peers that couldn’t reach it. It was observed live retrying the closed socket every minute for 13+ minutes without recovering. The fix builds a fresh filter conn on every connectSUDPH. The local UDP port comes from the shared listener, so it stays stable across rebuilds, and port stability was the only thing the reuse was ever meant to protect. The reuse never actually provided that stability; it only kept a closed conn around to be retried.

Skywire: Panic-Safe, Then Richer, Route-Setup Diagnostics

Route-group setup has been intermittently failing — the initiator times out (“Remote handshake not received”) while the responder’s noise wrap reads “no data captured,” the leading suspect being a reverse-leg route-ID mismatch where the responder stamps reverse packets with an ID the initiator has no rule for. Because the failure is hard to reproduce on demand, the approach was to instrument rather than guess.

3107 diag(router): log route-group rule linkage on setup-failure (both ends) enriches both setup-failure logs with the local rule linkage (the IDs each end stamps and listens for, plus the transports), so the next real occurrence is fully diagnosable by pairing the two ends: a mismatch is immediate if the responder’s reverse-next doesn’t equal the initiator’s reverse-key.

3109 fix(router): panic-safe route-group setup-failure diagnostics (NextRouteID on Consume rule) is the same-day correction: that new logging called type-specific rule accessors (NextRouteID(), NextTransportID()) directly, and on the reverse leg the rule is a Consume rule, for which those accessors panic. Both diagnostics fire on failure paths, so any visor that hit a route-group wrap-failure or handshake timeout now panicked and crashed instead of merely logging the failure. The fix renders each rule through the panic-safe Rule.String(), which prints the relevant IDs per rule type with no type assumptions: strictly more diagnostic detail than the hand-picked fields, and no crash.
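The difference between a type-specific accessor and a per-type String() can be sketched with a simplified rule type. Everything here is hypothetical (the rule struct, its fields, the output format); it only models the property the text describes, that the accessor panics on a Consume rule while String() handles every rule type.

```go
package main

import "fmt"

type ruleType int

const (
	ruleConsume ruleType = iota
	ruleForward
)

// rule is a simplified, hypothetical stand-in for a routing rule.
type rule struct {
	typ         ruleType
	keyRouteID  uint32
	nextRouteID uint32 // meaningful only for forward rules in this model
}

// NextRouteID is a type-specific accessor: it panics on any rule type that
// has no next route ID, which is what made the first diagnostics unsafe.
func (r rule) NextRouteID() uint32 {
	if r.typ != ruleForward {
		panic("NextRouteID called on a non-forward rule")
	}
	return r.nextRouteID
}

// String switches on the rule type and prints only the fields that type has,
// so it is safe to call on any rule, including from a failure path.
func (r rule) String() string {
	switch r.typ {
	case ruleConsume:
		return fmt.Sprintf("consume(key=%d)", r.keyRouteID)
	case ruleForward:
		return fmt.Sprintf("forward(key=%d, next=%d)", r.keyRouteID, r.nextRouteID)
	default:
		return fmt.Sprintf("unknown(type=%d, key=%d)", int(r.typ), r.keyRouteID)
	}
}

// describeFailure builds a log line from whatever rules are present without
// assuming their types.
func describeFailure(rules []rule) string {
	out := "setup failed; rules:"
	for _, r := range rules {
		out += " " + r.String()
	}
	return out
}

func main() {
	rules := []rule{
		{typ: ruleForward, keyRouteID: 3, nextRouteID: 7},
		{typ: ruleConsume, keyRouteID: 9},
	}
	fmt.Println(describeFailure(rules))
}
```

Logging code that runs only on failure paths is rarely exercised, so routing it through one total function per type keeps a diagnostic from turning a logged failure into a crash.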

Skywire: CLI & UI Papercuts

3105 fix(cli): reconnect visor log --follow instead of silently stalling — the follow loop polled one net/rpc client, and net/rpc does not auto-reconnect, so once the underlying conn dropped (an idle-socket timeout, observed after ~8 min) every subsequent call errored forever while the loop retried the same dead client. The stream silently stalled — process alive, no output — while the visor kept running. It now re-dials on error so the stream resumes on the next tick, and resets the cursor when the entry counter goes backwards so a visor restart mid-follow re-bootstraps from the fresh buffer.

3101 fix(visor): stable app-list order (sort AppStates by name) — AppStates() ranged the apps map directly, so Go’s randomized map iteration returned a different order on every call. The hypervisor UI polls this, and its stable JS sort preserves input order for tied keys — so tied apps reshuffled each poll, rows “moved around,” and with pagination they crossed page boundaries (apps appearing and disappearing). Sorting the names before building the states gives every consumer a deterministic order.