add impairment profile chapter
@@ -552,9 +552,9 @@ than per-connection latency.
Several rankings invert relative to raw throughput. ZeroTier
finishes faster than WireGuard (9.22\,s vs.\ 9.45\,s) despite
6\,\% fewer raw Mbps and 1\,000$\times$ more retransmits. Yggdrasil
is the clearest example: it has the
fourth-highest VPN throughput at 795\,Mbps, yet lands at 24\,\% overhead
because its
2.2\,ms latency adds up over the many small sequential HTTP requests
that constitute a Nix cache download.
@@ -815,57 +815,603 @@ what would otherwise be idle gaps in any individual flow, squeezing
out throughput that no single stream could reach alone.

\section{Impact of Network Impairment}
\label{sec:impairment}

This section examines how each VPN responds to the Low, Medium, and
High impairment profiles defined in Chapter~\ref{Methodology}.
The impairment profiles from Table~\ref{tab:impairment_profiles} are
applied to the full benchmark suite. Baseline results from
Section~\ref{sec:baseline} serve as the reference.
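
To make the setup concrete, the sketch below shows how a per-machine
impairment profile of this kind could be applied with \texttt{tc}~netem
from Python. It is an illustration rather than the benchmark suite's
actual tooling: the interface name \texttt{eth0} and the helper names
are assumptions, and the values correspond to the Low profile
(2\,ms delay, 2\,ms jitter, 0.25\,\% loss, 0.5\,\% reordering per
machine).

\begin{verbatim}
import subprocess

# Low profile, per machine (values from the text); illustrative only.
LOW_PROFILE = ["delay", "2ms", "2ms", "loss", "0.25%", "reorder", "0.5%"]

def apply_profile(interface: str, profile: list[str]) -> None:
    """Attach a netem qdisc carrying the impairment parameters."""
    subprocess.run(
        ["tc", "qdisc", "replace", "dev", interface, "root", "netem", *profile],
        check=True,
    )

def clear_profile(interface: str) -> None:
    """Remove the netem qdisc, restoring the unimpaired baseline."""
    subprocess.run(["tc", "qdisc", "del", "dev", interface, "root"], check=True)

if __name__ == "__main__":
    apply_profile("eth0", LOW_PROFILE)
\end{verbatim}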

\subsection{Ping}

% RTT and packet loss across impairment profiles.
Table~\ref{tab:ping_impairment} lists average round-trip times across
all four profiles. Most VPNs track the expected increase closely:
tc~netem adds roughly 4\,ms, 8\,ms, and 15\,ms of round-trip delay
at Low, Medium, and High respectively, and Internal's measured values
(4.82, 9.38, 15.49\,ms) confirm this.
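
As a sanity check, assuming each machine delays its own egress traffic
by the profile's one-way value, a ping pays that delay twice (once per
direction), so the expected RTT is roughly

\[
\mathrm{RTT}_{\text{expected}} \;\approx\; \mathrm{RTT}_{\text{baseline}}
+ 2\,d_{\text{netem}},
\qquad
\text{e.g.\ Internal at Low: } 0.60 + 2 \times 2.0 = 4.60\,\text{ms}
\;\text{vs.\ } 4.82\,\text{ms measured}.
\]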

\begin{table}[H]
\centering
\caption{Average ping RTT (ms) across impairment profiles, sorted
  by High-profile RTT}
\label{tab:ping_impairment}
\begin{tabular}{lrrrr}
\hline
\textbf{VPN} & \textbf{Baseline} & \textbf{Low} &
\textbf{Medium} & \textbf{High} \\
\hline
Internal & 0.60 & 4.82 & 9.38 & 15.49 \\
Tinc & 1.19 & 5.32 & 9.85 & 15.92 \\
Nebula & 1.25 & 5.38 & 9.99 & 15.96 \\
WireGuard & 1.20 & 5.36 & 9.88 & 15.99 \\
Headscale & 1.64 & 5.82 & 10.39 & 16.07 \\
VpnCloud & 1.13 & 5.41 & 10.35 & 16.21 \\
ZeroTier & 1.28 & 5.34 & 10.02 & 16.54 \\
Yggdrasil & 2.20 & 6.73 & 11.99 & 20.20 \\
Hyprspace & 1.79 & 6.15 & 10.76 & 24.49 \\
EasyTier & 1.33 & 6.27 & 14.13 & 26.60 \\
Mycelium & 34.90 & 23.42 & 43.88 & 33.05 \\
\hline
\end{tabular}
\end{table}

% PLOT: line chart
% File: Figures/impairment/Ping Average RTT Heatmap.png
% Data: Average ping RTT for all 11 VPNs at baseline, low, medium, high
% Show: Most VPNs in a tight parallel band; Mycelium's non-monotonic curve;
%       EasyTier and Hyprspace diverging upward at high impairment

Mycelium defies the pattern. Its RTT \emph{drops} from 34.9\,ms at
baseline to 23.4\,ms at Low impairment, a 33\% improvement where
every other VPN gets slower. It then rises to 43.9\,ms at Medium
before falling again to 33.0\,ms at High. The baseline analysis
(Section~\ref{sec:mycelium_routing}) showed that Mycelium's latency
comes from a bimodal routing distribution: one path runs at 1.63\,ms
while two others route through the global overlay at
${\sim}$51\,ms. The impairment appears to push Mycelium's path
discovery toward shorter routes, so a larger share of traffic takes
the direct path. The non-monotonic pattern is consistent with a path
selection algorithm that responds to measured link quality, but not
linearly with degradation severity.

Mycelium also achieves 0\% ping packet loss at Low and Medium
impairment, while most VPNs show 0.1--3.2\% loss at those profiles.
At High impairment, Mycelium's loss jumps to 11.1\%.

EasyTier accumulates 11\,ms of excess latency at High impairment
beyond what tc~netem accounts for. Its average RTT of 26.6\,ms and
maximum of 290\,ms (vs.\ ${\sim}$40\,ms for WireGuard) point to a
userspace scheduling or retry mechanism that introduces escalating
variance. EasyTier's RTT standard deviation reaches 44.6\,ms at
High, the worst jitter of any VPN.

Hyprspace shows 11.1\% ping packet loss at every impairment level ---
Low, Medium, and High alike. With 9~measurement runs (3~machine
pairs $\times$ 3~runs of 100~packets), 11.1\% equals exactly 1/9:
one run per profile fails completely while the other eight report zero
loss. This binary pass/fail behavior is consistent with the buffer
bloat diagnosis from Section~\ref{sec:hyprspace_bloat}. When buffers
fill, an entire path stalls rather than degrading gradually.

\subsection{TCP Throughput}

% TCP iperf3: throughput, retransmits, congestion window.
Table~\ref{tab:tcp_impairment} presents single-stream TCP throughput
across all four profiles. The baseline performance tiers from
Section~\ref{sec:baseline} dissolve almost immediately under
impairment.

\begin{table}[H]
\centering
\caption{Single-stream TCP throughput (Mbps) across impairment
  profiles, sorted by baseline. Retention is the
  Low-to-baseline ratio.}
\label{tab:tcp_impairment}
\begin{tabular}{lrrrrr}
\hline
\textbf{VPN} & \textbf{Baseline} & \textbf{Low} &
\textbf{Medium} & \textbf{High} & \textbf{Retention} \\
\hline
Internal & 934 & 333 & 29.6 & 4.25 & 35.7\% \\
WireGuard & 864 & 54.7 & 8.77 & 2.63 & 6.3\% \\
ZeroTier & 814 & 63.7 & 12.0 & 4.01 & 7.8\% \\
Headscale & 800 & 274 & 41.5 & 4.21 & 34.3\% \\
Yggdrasil & 795 & 13.2 & 6.08 & 3.40 & 1.7\% \\
\hline
Nebula & 706 & 49.8 & 7.82 & 2.60 & 7.1\% \\
EasyTier & 636 & 156 & 17.4 & 3.59 & 24.6\% \\
VpnCloud & 539 & 58.2 & 8.33 & 1.86 & 10.8\% \\
\hline
Hyprspace & 368 & 4.42 & 2.05 & 1.39 & 1.2\% \\
Tinc & 336 & 54.4 & 5.53 & 2.77 & 16.2\% \\
Mycelium & 259 & 16.2 & 3.87 & 2.73 & 6.3\% \\
\hline
\end{tabular}
\end{table}
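
The Retention column in Table~\ref{tab:tcp_impairment} is simply the
Low-profile throughput divided by the baseline, e.g.\ for Internal

\[
\text{Retention} \;=\; \frac{\text{Low}}{\text{Baseline}}
\;=\; \frac{333}{934} \;\approx\; 35.7\,\%.
\]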

% PLOT: line chart
% File: Figures/impairment/TCP Throughput Heatmap.png
% Data: Single-stream TCP throughput for all 11 VPNs at baseline,
%       low, medium, high
% Show: Headscale crossing above Internal at medium impairment;
%       Yggdrasil's cliff from baseline to low; convergence of all
%       VPNs at high impairment

Yggdrasil crashes from 795\,Mbps to 13.2\,Mbps at Low impairment, a
98.3\% throughput loss from adding just 2\,ms latency, 2\,ms jitter,
0.25\% packet loss, and 0.5\% reordering per machine. Even Mycelium,
the slowest VPN at baseline (259\,Mbps), retains more throughput at
Low than Yggdrasil does. The jumbo overlay MTU of 32\,731~bytes,
which inflated baseline metrics
(Section~\ref{sec:baseline}), becomes a liability under impairment:
each lost or reordered outer packet triggers retransmission of
${\sim}$24$\times$ more inner-layer data than a standard
1\,400-byte MTU VPN would lose.
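
The amplification factor follows directly from the ratio of the two
MTUs:

\[
\frac{32\,731\ \text{bytes}}{1\,400\ \text{bytes}} \;\approx\; 23.4
\;\approx\; 24\times .
\]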

Headscale retains 34.3\% of its baseline throughput at Low, nearly
matching Internal's 35.7\%. At Medium impairment, Headscale
(41.5\,Mbps) overtakes Internal (29.6\,Mbps) --- a VPN outperforming
the bare-metal baseline.
Section~\ref{sec:tailscale_degraded} investigates this anomaly in
detail.

At High impairment, the throughput range compresses from 675\,Mbps at
baseline to just 2.9\,Mbps. Internal leads at 4.25\,Mbps; Hyprspace
trails at 1.39\,Mbps. The impairment profile itself becomes the
bottleneck. With 2.5\% packet loss and 5\% reordering per machine,
every implementation is TCP-loss-limited, and architectural
differences that matter at gigabit speeds become irrelevant.

\subsection{UDP Throughput}

% UDP iperf3: throughput, jitter, packet loss.
The UDP stress test (\texttt{-b~0}) suffers from widespread failures
under impairment. Hyprspace and Mycelium, which already failed at
baseline, continue to fail at all profiles. Tinc and ZeroTier fail
at most non-baseline profiles. The sparse dataset limits
conclusions, but one pattern stands out.

Kernel-level implementations maintain throughput regardless of
impairment. Internal holds ${\sim}$950\,Mbps across all profiles
where data exists. Headscale sustains 700--876\,Mbps and WireGuard
850--908\,Mbps; % TODO: verify WireGuard UDP range -- analysis doc says 850-898, possible digit transposition
both rely on WireGuard's in-kernel UDP handling with
proper backpressure. Userspace VPNs collapse: EasyTier drops from
865 to 435 to 38.5 to 6.1\,Mbps across successive profiles.
Yggdrasil, already pathological at baseline (98.7\% loss), crashes to
12.3\,Mbps at Low and fails entirely at Medium and High.

% PLOT: heatmap
% File: Figures/impairment/UDP Receiver Throughput Heatmap.png
% Data: UDP receiver throughput for all 11 VPNs at baseline, low,
%       medium, high (grey/hatched cells for failures)
% Show: Kernel-level VPNs (Internal, WireGuard, Headscale) maintaining
%       high throughput across all profiles; userspace VPNs failing or
%       collapsing; the large number of empty cells

The failure rate of this benchmark under impairment makes it more
useful as a robustness indicator than a throughput measurement. A VPN
that cannot complete a 30-second UDP flood under 0.25\% packet loss
has fundamental flow-control problems that will surface under real
workloads too, even if the symptoms are milder.
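
For reference, the flood in question corresponds to an unlimited-rate
iperf3 UDP run. The sketch below shows how such a run could be driven
and its summary parsed from iperf3's JSON output; it is an
illustration, not the suite's actual tooling, and the exact JSON field
names can vary between iperf3 versions.

\begin{verbatim}
import json
import subprocess

def udp_stress(server: str, seconds: int = 30) -> dict:
    """Run an unlimited-rate UDP test (-b 0) and return summary stats."""
    result = subprocess.run(
        ["iperf3", "-c", server, "-u", "-b", "0",
         "-t", str(seconds), "--json"],
        capture_output=True, text=True, check=True,
    )
    summary = json.loads(result.stdout)["end"]["sum"]
    return {
        "mbps": summary["bits_per_second"] / 1e6,
        "jitter_ms": summary["jitter_ms"],
        "lost_percent": summary["lost_percent"],
    }

if __name__ == "__main__":
    print(udp_stress("10.0.0.2"))
\end{verbatim}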

\subsection{Parallel TCP}

% Parallel iperf3: throughput under contention (A->B, B->C, C->A).
Table~\ref{tab:parallel_impairment} shows aggregate throughput across
three concurrent bidirectional links (six unidirectional flows). The
Headscale anomaly from the single-stream results is amplified here.

\begin{table}[H]
\centering
\caption{Parallel TCP throughput (Mbps) across impairment profiles.
  Three concurrent bidirectional links produce six unidirectional
  flows.}
\label{tab:parallel_impairment}
\begin{tabular}{lrrrr}
\hline
\textbf{VPN} & \textbf{Baseline} & \textbf{Low} &
\textbf{Medium} & \textbf{High} \\
\hline
Internal & 1398 & 277 & 82.6 & 10.4 \\
Headscale & 1228 & 718 & 113 & 20.0 \\
WireGuard & 1281 & 173 & 24.5 & 8.39 \\
Yggdrasil & 1265 & 38.7 & 16.7 & 8.95 \\
ZeroTier & 1206 & 176 & 35.4 & 7.97 \\
EasyTier & 927 & 473 & 57.4 & 10.7 \\
Hyprspace & 803 & 2.87 & 6.94 & 3.62 \\
VpnCloud & 763 & 174 & 23.7 & 8.25 \\
Nebula & 648 & 103 & 15.3 & 4.93 \\
Mycelium & 569 & 72.7 & 7.51 & 3.69 \\
Tinc & 563 & 168 & 23.7 & 8.25 \\
\hline
\end{tabular}
\end{table}

% PLOT: heatmap
% File: Figures/impairment/Parallel TCP Throughput Heatmap.png
% Data: Parallel TCP throughput for all 11 VPNs at baseline, low,
%       medium, high
% Show: Headscale dominating at low impairment (718 Mbps vs Internal's
%       277); EasyTier as runner-up (473 Mbps); Hyprspace's collapse
%       to 2.87 Mbps

At Low impairment, Headscale reaches 718\,Mbps --- 2.6$\times$ Internal
(277\,Mbps) and 4.1$\times$ WireGuard (173\,Mbps). At Medium,
Headscale (113\,Mbps) still leads Internal (82.6\,Mbps) by 37\%.
The single-stream anomaly from
Section~\ref{sec:tailscale_degraded} compounds when multiple flows
each independently benefit from Headscale's congestion control
tuning.

EasyTier is the second-most resilient VPN under parallel load, at
473\,Mbps at Low (51\% of baseline). Both EasyTier and Headscale
retain more than half their baseline parallel throughput at Low
impairment; no other VPN exceeds 30\%.

Hyprspace collapses from 803\,Mbps to 2.87\,Mbps at Low, a 99.6\%
loss. The buffer bloat that plagues single-stream transfers
(Section~\ref{sec:hyprspace_bloat}) becomes catastrophic when six
concurrent flows compete for the same bloated buffers.

The High-profile convergence effect is even more pronounced here than
in single-stream mode. Tinc and VpnCloud land at identical
8.25\,Mbps despite differing by 200\,Mbps at baseline.

\subsection{QUIC Performance}

% qperf: bandwidth, TTFB, connection establishment time.
Headscale and Nebula failed the qperf QUIC benchmark at baseline
(Section~\ref{sec:baseline}) and continue to fail across all
impairment profiles.

Yggdrasil's QUIC bandwidth drops from 745\,Mbps at baseline to
7.67\,Mbps at Low, 3.45\,Mbps at Medium, and 2.17\,Mbps at High ---
the same cliff observed in its TCP results, driven by the same
jumbo-MTU amplification of outer-layer packet loss.

At High impairment, WireGuard (23.2\,Mbps), VpnCloud (23.4\,Mbps),
ZeroTier (23.0\,Mbps), and Tinc (23.4\,Mbps) converge to within
0.4\,Mbps of each other. At baseline these four span a 500\,Mbps
range. QUIC's own congestion control, operating atop the
already-degraded outer link, becomes the sole limiter.

% PLOT: heatmap
% File: Figures/impairment/QUIC Bandwidth Heatmap.png
% Data: QPerf QUIC bandwidth for VPNs with data at all four profiles
%       (WireGuard, VpnCloud, ZeroTier, Tinc, Yggdrasil, Internal)
% Show: Yggdrasil's cliff from baseline to low; convergence of
%       WireGuard, VpnCloud, ZeroTier, Tinc at high (~23 Mbps)

\subsection{Video Streaming}

% RIST: bitrate, dropped frames, packets recovered, quality score.
Table~\ref{tab:rist_impairment} presents RIST video quality scores
across profiles. The actual encoding bitrate of ${\sim}$3.3\,Mbps
sits well within every VPN's throughput budget even at High
impairment, so quality differences reflect packet delivery reliability
rather than bandwidth limits.

\begin{table}[H]
\centering
\caption{RIST video streaming quality (\%) across impairment
  profiles, sorted by High-profile quality}
\label{tab:rist_impairment}
\begin{tabular}{lrrrr}
\hline
\textbf{VPN} & \textbf{Baseline} & \textbf{Low} &
\textbf{Medium} & \textbf{High} \\
\hline
Mycelium & 100.0 & 100.0 & 100.0 & 99.9 \\
EasyTier & 100.0 & 100.0 & 96.2 & 85.5 \\
Internal & 100.0 & 99.2 & 89.3 & 80.2 \\
ZeroTier & 100.0 & 99.3 & 89.9 & 80.2 \\
VpnCloud & 100.0 & 99.2 & 89.7 & 80.1 \\
WireGuard & 100.0 & 99.3 & 90.0 & 80.0 \\
Hyprspace & 100.0 & 92.9 & 87.9 & 78.1 \\
Tinc & 100.0 & 99.3 & 90.0 & 77.8 \\
Nebula & 99.8 & 98.8 & 85.6 & 72.1 \\
Yggdrasil & 100.0 & 94.7 & 71.4 & 43.3 \\
Headscale & 13.1 & 13.0 & 13.0 & 13.0 \\
\hline
\end{tabular}
\end{table}

% PLOT: heatmap
% File: Figures/impairment/Video Streaming Quality Heatmap.png
% Data: RIST quality for all 11 VPNs at baseline, low, medium, high
% Show: Headscale stuck at 13% (red row); Mycelium stuck near 100%
%       (green row); gradual degradation for the bulk; Yggdrasil's
%       steep decline to 43%

Headscale stays at ${\sim}$13\% across all four profiles: 13.1\%,
13.0\%, 13.0\%, 13.0\%. The profile-independence confirms the
baseline diagnosis from Section~\ref{sec:baseline}. The failure is
structural --- likely MTU fragmentation in the DERP relay layer ---
and cannot worsen because it is already saturated. Adding latency or
loss on top of an 87\% packet drop floor changes nothing.

Mycelium delivers 99.9\% quality even at High impairment, better than
Internal (80.2\%) and every other VPN. At 3.3\,Mbps, even
Mycelium's degraded overlay paths can sustain the stream. Its
overlay retransmission mechanism, which cripples bulk TCP transfers,
works well for steady low-bandwidth UDP flows. RIST's own forward
error correction handles whatever Mycelium's retransmissions miss.

Yggdrasil degrades the most steeply: 100\% at baseline, 94.7\% at
Low, 71.4\% at Medium, 43.3\% at High. The jumbo MTU that hurt TCP
throughput also hurts here --- large overlay packets carrying RIST
data are more likely to be lost or reordered at the outer layer, and
RIST's FEC cannot recover from the resulting burst losses.

\subsection{Application-Level Download}

% Nix cache: download duration for Firefox package.
Table~\ref{tab:nix_impairment} shows Nix binary cache download times
across profiles. This HTTP-heavy workload, dominated by many
short-lived TCP connections, is more sensitive to per-connection
latency than to raw bandwidth.
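
A rough first-order model makes the latency sensitivity explicit: if
the download issues $n$ short-lived requests and each request pays on
the order of $k$ round trips for connection setup and request/response
turnarounds, then an added per-path delay of $\Delta\mathrm{RTT}$
contributes roughly

\[
\Delta T \;\approx\; n \cdot k \cdot \Delta\mathrm{RTT}
\]

of extra wall-clock time, independent of available bandwidth. The
values of $n$ and $k$ are workload-specific and are used here only
symbolically.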

\begin{table}[H]
\centering
\caption{Nix binary cache download time (seconds) across impairment
  profiles, sorted by Low-profile time. ``--'' marks a failed
  run.}
\label{tab:nix_impairment}
\begin{tabular}{lrrrr}
\hline
\textbf{VPN} & \textbf{Baseline} & \textbf{Low} &
\textbf{Medium} & \textbf{High} \\
\hline
Internal & 8.53 & 11.9 & 58.6 & -- \\
Headscale & 9.79 & 13.5 & 48.8 & 219 \\
EasyTier & 9.39 & 22.1 & 141 & -- \\
VpnCloud & 9.39 & 27.9 & 163 & -- \\
WireGuard & 9.45 & 28.8 & 161 & -- \\
Nebula & 9.15 & 30.8 & 180 & 547 \\
Tinc & 10.0 & 30.9 & 166 & 496 \\
ZeroTier & 9.22 & 36.2 & 141 & -- \\
Mycelium & 10.1 & 79.5 & -- & -- \\
Yggdrasil & 10.6 & 230 & -- & -- \\
Hyprspace & 11.9 & -- & 170 & -- \\
\hline
\end{tabular}
\end{table}

% PLOT: heatmap
% File: Figures/impairment/Nix Cache Download Time Heatmap.png
% Data: Nix cache download time for all VPNs at baseline, low, medium,
%       high (hatched/absent bars for failures)
% Show: Headscale as the only VPN completing all four profiles;
%       Headscale beating Internal at medium (48.8 vs 58.6 s);
%       Yggdrasil's 22x slowdown at low impairment

Headscale is the only VPN to complete all four profiles. At Medium
impairment, it finishes in 48.8~seconds --- faster than Internal's
58.6~seconds. Internal itself fails at High impairment while
Headscale completes in 219~seconds. Only Nebula (547\,s) and Tinc
(496\,s) also survive High impairment.

Yggdrasil's download time explodes from 10.6\,s to 230\,s at Low
impairment, a 22$\times$ slowdown. Every HTTP request incurs the
latency penalty from Yggdrasil's impairment-amplified
retransmissions. Mycelium also degrades severely (10.1\,s to
79.5\,s, an 8$\times$ increase), consistent with its overlay routing
overhead, which compounds over hundreds of sequential HTTP
connections.

The failure map reveals a clean gradient: more demanding profiles
knock out more VPNs. At Low, 10 of 11 complete (Hyprspace fails).
At Medium, 9 complete. At High, only 3 survive (Headscale, Nebula,
Tinc). Internal's failure at High is the most surprising --- the
bare-metal baseline cannot sustain a multi-connection HTTP workload
under severe degradation, but Headscale, shielded by its userspace
TCP stack, can. Section~\ref{sec:tailscale_degraded} explains why.

\section{Tailscale Under Degraded Conditions}

% The central finding: Tailscale outperforming the raw Linux
% networking stack under impairment.
\label{sec:tailscale_degraded}

\subsection{Observed Anomaly}

% Present the data showing Tailscale exceeding internal baseline
% throughput under Medium/High impairment.
At Medium impairment, Headscale delivers 41.5\,Mbps single-stream TCP
throughput --- 40\% more than Internal's 29.6\,Mbps. A VPN built
atop WireGuard outperforms the bare-metal connection it tunnels
through. The anomaly is consistent across benchmarks:
Table~\ref{tab:headscale_anomaly} summarizes the comparison.

\begin{table}[H]
\centering
\caption{Headscale vs.\ Internal vs.\ WireGuard under impairment
  (18.12.2025 run). For TCP benchmarks, higher is better. For
  Nix cache, lower is better; ``--'' marks a failed run.}
\label{tab:headscale_anomaly}
\begin{tabular}{llrrr}
\hline
\textbf{Benchmark} & \textbf{Profile} & \textbf{Internal} &
\textbf{Headscale} & \textbf{WireGuard} \\
\hline
Single TCP (Mbps) & Low & 333 & 274 & 54.7 \\
Single TCP (Mbps) & Medium & 29.6 & 41.5 & 8.77 \\
Single TCP (Mbps) & High & 4.25 & 4.21 & 2.63 \\
Parallel TCP (Mbps) & Low & 277 & 718 & 173 \\
Parallel TCP (Mbps) & Medium & 82.6 & 113 & 24.5 \\
Nix cache (s) & Medium & 58.6 & 48.8 & 161 \\
Nix cache (s) & High & -- & 219 & -- \\
\hline
\end{tabular}
\end{table}

% TODO: Needs to be created, use the tools/ folder
% PLOT: line chart
% File: Figures/impairment/headscale-vs-internal-across-profiles.png
% Data: Single-stream TCP throughput for Internal, Headscale, and
%       WireGuard across all four profiles
% Show: Headscale crossing above Internal at medium impairment;
%       WireGuard far below both; convergence at high
% Y-axis: log scale

In parallel TCP at Low impairment, Headscale reaches 718\,Mbps vs.\
Internal's 277\,Mbps (2.6$\times$). The Nix cache download at
Medium takes Headscale 48.8\,s vs.\ Internal's 58.6\,s (17\%
faster). At High impairment, Internal fails the Nix cache entirely
while Headscale completes in 219\,s.

WireGuard, which shares Headscale's cryptographic layer, shows no
such advantage: 54.7\,Mbps at Low, 8.77\,Mbps at Medium. Whatever
protects Headscale is not the encryption or the tunnel --- it is
something in Tailscale's userspace networking stack.

The retransmit data provides the first clue. At Medium impairment,
Headscale's retransmit percentage is approximately 2.4\%, matching
Internal's ${\sim}$2.4\%. WireGuard's is 5.2\%. Headscale achieves
Internal's retransmit efficiency while delivering higher throughput
--- fewer spurious retransmissions leave more bandwidth for actual
data.

\subsection{Congestion Control Analysis}

% Reno vs CUBIC, RACK disabled to avoid spurious retransmits
% under reordering.
Tailscale uses a userspace TCP/IP stack derived from Google's gVisor
(netstack). This stack does not inherit the host kernel's TCP
parameters. Three defaults differ from the Linux kernel in ways that
matter under packet reordering:

\begin{itemize}
\bitem{\texttt{tcp\_reordering}:} gVisor uses 10; the Linux kernel
  defaults to~3. This parameter controls how many out-of-order
  packets TCP tolerates before treating the event as a loss. With
  tc~netem injecting 0.5--2.5\% reordering per machine, bursts of
  3+ reordered packets are frequent. The kernel's threshold of~3
  causes spurious fast retransmits and congestion window reductions
  for packets that are merely reordered, not lost (a simplified
  sketch of this threshold follows the list).
\bitem{\texttt{tcp\_recovery} (RACK):} gVisor disables it; the
  Linux kernel enables it by default. RACK uses timing-based loss
  detection that is more aggressive than the pure sequence-based
  approach gVisor uses. Under reordering, RACK's timing heuristics
  can falsely classify delayed packets as lost.
\bitem{\texttt{tcp\_early\_retrans} (TLP):} gVisor disables it; the
  kernel enables it. Tail Loss Probe retransmits the last
  unacknowledged segment after a short probe timeout, a speculative
  retransmission that can worsen congestion when the link is
  already impaired.
\end{itemize}
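
The first threshold can be illustrated with a deliberately simplified
sketch of the duplicate-ACK heuristic. Real stacks adjust the
threshold dynamically, so this illustrates the principle rather than
either implementation:

\begin{verbatim}
def treats_burst_as_loss(reordered_burst: int, dupack_threshold: int) -> bool:
    """Simplified dupACK heuristic: a burst of out-of-order packets at
    least as long as the threshold is (mis)classified as packet loss,
    triggering a fast retransmit and a congestion-window reduction."""
    return reordered_burst >= dupack_threshold

# A burst of three reordered packets: spurious retransmit with the
# Linux default of 3, tolerated with gVisor's threshold of 10.
assert treats_burst_as_loss(3, 3) is True
assert treats_burst_as_loss(3, 10) is False
\end{verbatim}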

The combined effect: under network conditions with packet reordering,
the default Linux TCP stack fires retransmits and cuts the congestion
window far more often than necessary. Each false positive shrinks the
window and reduces throughput. Tailscale's gVisor stack tolerates
more reordering before reacting, so its congestion window stays larger
and throughput stays higher.

This explains why the anomaly grows with impairment severity. At
baseline, there is no reordering, so the threshold difference is
irrelevant and Internal's kernel-level processing advantage dominates.
As reordering increases from 0.5\% (Low) to 2.5\% (Medium) per
machine, the kernel's aggressive loss detection fires more often, and
the throughput gap shifts in Headscale's favor.

\subsection{Tuned Kernel Parameters}

% Re-run results with tuned buffer sizes and congestion control
% on the internal baseline, showing the gap closes.
Two follow-up benchmark runs applied Tailscale's gVisor TCP
parameters to the host kernel via sysctl (a sketch of the
reorder-only settings follows the list):

\begin{itemize}
\bitem{Full gVisor (27.02.2026):} All parameters ---
  \texttt{tcp\_reordering=10}, \texttt{tcp\_recovery=0},
  \texttt{tcp\_early\_retrans=0}, plus enlarged buffer sizes
  (\texttt{tcp\_rmem}, \texttt{tcp\_wmem}, \texttt{rmem\_max},
  \texttt{wmem\_max}). Tested on Internal, Headscale, WireGuard,
  Tinc, and ZeroTier.
\bitem{Reorder-only (06.03.2026):} Only
  \texttt{tcp\_reordering=10}, \texttt{tcp\_recovery=0}, and
  \texttt{tcp\_early\_retrans=0}. Buffer sizes left at kernel
  defaults. Tested on Internal and Headscale only.
\end{itemize}
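
For concreteness, the reorder-only configuration amounts to three
sysctl values. The sketch below shows one way they could be applied;
the helper is illustrative and not the configuration mechanism used in
the benchmark runs:

\begin{verbatim}
from pathlib import Path

# Reorder-only tuning: raise the reordering tolerance, disable RACK
# and TLP. Values match the 06.03.2026 run described above.
REORDER_ONLY = {
    "net/ipv4/tcp_reordering": "10",
    "net/ipv4/tcp_recovery": "0",
    "net/ipv4/tcp_early_retrans": "0",
}

def apply_sysctls(settings: dict[str, str]) -> None:
    """Write each value to its /proc/sys entry (requires root)."""
    for key, value in settings.items():
        Path("/proc/sys", key).write_text(value)

if __name__ == "__main__":
    apply_sysctls(REORDER_ONLY)
\end{verbatim}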

Table~\ref{tab:kernel_tuning_internal} shows how Internal responds
to the tuning. Both follow-up runs used the same impairment profiles
and hardware as the original 18.12.2025 run.

\begin{table}[H]
\centering
\caption{Internal (no VPN) throughput across three kernel
  configurations. ``Default'' is the 18.12.2025 run with stock
  Linux TCP parameters.}
\label{tab:kernel_tuning_internal}
\begin{tabular}{llrrr}
\hline
\textbf{Metric} & \textbf{Profile} & \textbf{Default} &
\textbf{Full gVisor} & \textbf{Reorder-only} \\
\hline
Single TCP (Mbps) & Baseline & 934 & 934 & 934 \\
Single TCP (Mbps) & Low & 333 & 363 & 354 \\
Single TCP (Mbps) & Medium & 29.6 & 64.2 & 72.7 \\
Parallel TCP (Mbps) & Low & 277 & 893 & 902 \\
Parallel TCP (Mbps) & Medium & 82.6 & 226 & 211 \\
Retransmit \% & Medium & ${\sim}$2.4 & 1.21 & 1.11 \\
Nix cache (s) & Medium & 58.6 & 29.7 & 29.1 \\
\hline
\end{tabular}
\end{table}

% TODO: Needs to be created
% PLOT: grouped bar chart
% File: Figures/impairment/kernel-tuning-internal-throughput.png
% Data: Internal single-stream TCP at baseline/low/medium across
%       original, full gVisor, and reorder-only configurations
% Show: Dramatic jump at medium (29.6 -> 64.2 -> 72.7 Mbps);
%       baseline unchanged; modest improvement at low
% Y-axis: linear scale

Internal's Medium-impairment throughput jumps from 29.6 to
72.7\,Mbps --- a 146\% increase from a three-line sysctl change. The
retransmit percentage drops from ${\sim}$2.4\% to 1.11\%; most of the
original retransmissions were spurious. The Nix cache download at
Medium halves from 58.6\,s to 29.1\,s.

Parallel TCP sees an even larger gain. Internal at Low impairment
climbs from 277 to 902\,Mbps, a 226\% increase that now exceeds
Headscale's original 718\,Mbps. With six concurrent flows each
independently benefiting from the higher reordering threshold, the
aggregate improvement compounds.

The anomaly reverses. At every impairment level and benchmark, tuned
Internal now meets or exceeds Headscale. At Medium impairment:
Internal 72.7\,Mbps vs.\ Headscale 50.1\,Mbps (Internal 45\% ahead),
where the original result had Headscale 40\% ahead. The Nix cache
flips too: Internal completes in 29.1\,s vs.\ Headscale's 36.3\,s,
where the original had Headscale 17\% faster.

% TODO: Needs to be created
% PLOT: before/after comparison
% File: Figures/impairment/headscale-gap-reversal.png
% Data: Internal vs Headscale throughput ratio at each impairment
%       level, original vs tuned (reorder-only)
% Show: The crossover from "Headscale wins" (ratio < 1) to "Internal
%       wins" (ratio > 1) at medium impairment after tuning
% Y-axis: ratio (Internal / Headscale), 1.0 as break-even

The reorder-only configuration (06.03) matches or exceeds the full
gVisor configuration (27.02) on most metrics; the two exceptions are
single-stream TCP at Low (354 vs.\ 363\,Mbps) and parallel TCP at
Medium (211 vs.\ 226\,Mbps), both within 7\%. Internal
reaches 72.7\,Mbps at Medium with reorder-only vs.\ 64.2\,Mbps with
full gVisor. The enlarged buffer sizes are unnecessary and may
introduce mild buffer bloat that partially offsets the reordering
benefit. The entire Headscale advantage is explained by three kernel
parameters: \texttt{tcp\_reordering}, \texttt{tcp\_recovery}, and
\texttt{tcp\_early\_retrans}.

Other VPNs benefit less from the kernel tuning. WireGuard's Medium
throughput rises from 8.77 to 12.2\,Mbps (+39\%) and Tinc's from
5.53 to 11.5\,Mbps (+108\%). ZeroTier is essentially unchanged (12.0 to
11.5\,Mbps). The tuning helps the kernel TCP stack, but VPNs that
add their own encapsulation overhead and userspace processing have
independent bottlenecks that the sysctl parameters cannot remove.

Headscale itself gets modestly faster with kernel tuning (+21\% at
Medium) but slightly slower at Low impairment ($-$5\%). Its
userspace gVisor stack already optimizes for reordering tolerance.
When the kernel stack also increases its tolerance, the two layers of
tuning may interact suboptimally --- both independently delay
retransmits, which can cause compound delays on the
kernel-to-Headscale socket path.

\section{Source Code Analysis}