add impairment profile chapter

2026-03-19 17:34:36 +01:00
parent 8a6d676e93
commit 01e27502bd
+566 -20
@@ -552,9 +552,9 @@ than per-connection latency.
Several rankings invert relative to raw throughput. ZeroTier
finishes faster than WireGuard (9.22\,s vs.\ 9.45\,s) despite
-30\,\% fewer raw Mbps and 1\,000$\times$ more retransmits. Yggdrasil
+6\,\% fewer raw Mbps and 1\,000$\times$ more retransmits. Yggdrasil
is the clearest example: it has the
-third-highest throughput at 795\,Mbps, yet lands at 24\,\% overhead
+fourth-highest VPN throughput at 795\,Mbps, yet lands at 24\,\% overhead
because its
2.2\,ms latency adds up over the many small sequential HTTP requests
that constitute a Nix cache download.
@@ -815,57 +815,603 @@ what would otherwise be idle gaps in any individual flow, squeezing
out throughput that no single stream could reach alone.
\section{Impact of Network Impairment}
\label{sec:impairment}
This section examines how each VPN responds to the Low, Medium, and
High impairment profiles defined in Chapter~\ref{Methodology}
(Table~\ref{tab:impairment_profiles}). Each profile is applied to the
full benchmark suite, with the baseline results from
Section~\ref{sec:baseline} serving as the reference.
\subsection{Ping}
% RTT and packet loss across impairment profiles.
Table~\ref{tab:ping_impairment} lists average round-trip times across
all four profiles. Most VPNs track the expected increase closely:
tc~netem adds roughly 4\,ms, 8\,ms, and 15\,ms of round-trip delay
at Low, Medium, and High respectively, and Internal's measured values
(4.82, 9.38, 15.49\,ms) confirm this.
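The exact profile definitions are given in Chapter~\ref{Methodology};
as a reminder of the mechanism, the Low profile corresponds to a
tc~netem configuration of roughly the following form, applied on
every machine (a sketch reconstructed from the parameter values cited
in this chapter; the interface name and exact qdisc placement are
illustrative):
\begin{verbatim}
# Low profile on each machine's egress (interface name illustrative)
tc qdisc add dev eth0 root netem \
    delay 2ms 2ms loss 0.25% reorder 0.5%
\end{verbatim}
Because both endpoints of a ping are impaired, a per-machine delay of
2\,ms shows up as roughly 4\,ms of added round-trip time, which is
what the Internal row reflects.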
\begin{table}[H]
\centering
\caption{Average ping RTT (ms) across impairment profiles, sorted
by High-profile RTT}
\label{tab:ping_impairment}
\begin{tabular}{lrrrr}
\hline
\textbf{VPN} & \textbf{Baseline} & \textbf{Low} &
\textbf{Medium} & \textbf{High} \\
\hline
Internal & 0.60 & 4.82 & 9.38 & 15.49 \\
Tinc & 1.19 & 5.32 & 9.85 & 15.92 \\
Nebula & 1.25 & 5.38 & 9.99 & 15.96 \\
WireGuard & 1.20 & 5.36 & 9.88 & 15.99 \\
Headscale & 1.64 & 5.82 & 10.39 & 16.07 \\
VpnCloud & 1.13 & 5.41 & 10.35 & 16.21 \\
ZeroTier & 1.28 & 5.34 & 10.02 & 16.54 \\
Yggdrasil & 2.20 & 6.73 & 11.99 & 20.20 \\
Hyprspace & 1.79 & 6.15 & 10.76 & 24.49 \\
EasyTier & 1.33 & 6.27 & 14.13 & 26.60 \\
Mycelium & 34.90 & 23.42 & 43.88 & 33.05 \\
\hline
\end{tabular}
\end{table}
% PLOT: line chart
% File: Figures/impairment/Ping Average RTT Heatmap.png
% Data: Average ping RTT for all 11 VPNs at baseline, low, medium, high
% Show: Most VPNs in a tight parallel band; Mycelium's non-monotonic curve;
% EasyTier and Hyprspace diverging upward at high impairment
Mycelium defies the pattern. Its RTT \emph{drops} from 34.9\,ms at
baseline to 23.4\,ms at Low impairment, a 33\% improvement where
every other VPN gets slower. It then rises to 43.9\,ms at Medium
before falling again to 33.0\,ms at High. The baseline analysis
(Section~\ref{sec:mycelium_routing}) showed that Mycelium's latency
comes from a bimodal routing distribution: one path runs at 1.63\,ms
while two others route through the global overlay at
${\sim}$51\,ms. The impairment appears to push Mycelium's path
discovery toward shorter routes, so a larger share of traffic takes
the direct path. The non-monotonic pattern is consistent with a path
selection algorithm that responds to measured link quality, but not
linearly with degradation severity.
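The averages are consistent with a simple two-mode mixture. If a
fraction $p$ of paths takes the direct route with RTT $r_d$ and the
rest traverses the global overlay at $r_o$, the expected mean is
\[
  \bar{r} = p\,r_d + (1-p)\,r_o .
\]
At baseline ($r_d \approx 1.6$\,ms, $r_o \approx 51$\,ms,
$p = \tfrac{1}{3}$) this gives ${\sim}$34.6\,ms, close to the
measured 34.9\,ms. Solving the same relation at Low impairment,
assuming netem's ${\sim}$4.8\,ms is added to both modes
($r_d \approx 6.4$\,ms, $r_o \approx 56$\,ms) and using the measured
23.4\,ms, yields $p \approx 0.66$, i.e.\ roughly two of the three
paths switching to the direct route. This is only a
back-of-the-envelope consistency check, but it supports the
path-selection reading of the non-monotonic curve.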
Mycelium also achieves 0\% ping packet loss at Low and Medium
impairment, while most VPNs show 0.1--3.2\% loss at those profiles.
At High impairment, Mycelium's loss jumps to 11.1\%.
EasyTier accumulates 11\,ms of excess latency at High impairment
beyond what tc~netem accounts for. Its average RTT of 26.6\,ms and
maximum of 290\,ms (vs.\ ${\sim}$40\,ms for WireGuard) point to a
userspace scheduling or retry mechanism that introduces escalating
variance. EasyTier's RTT standard deviation reaches 44.6\,ms at
High, the worst jitter of any VPN.
Hyprspace shows 11.1\% ping packet loss at every impairment level ---
Low, Medium, and High alike. With 9~measurement runs (3~machine
pairs $\times$ 3~runs of 100~packets), 11.1\% equals exactly 1/9:
one run per profile fails completely while the other eight report zero
loss. This binary pass/fail behavior is consistent with the buffer
bloat diagnosis from Section~\ref{sec:hyprspace_bloat}. When buffers
fill, an entire path stalls rather than degrading gradually.
\subsection{TCP Throughput}
% TCP iperf3: throughput, retransmits, congestion window.
Table~\ref{tab:tcp_impairment} presents single-stream TCP throughput
across all four profiles. The baseline performance tiers from
Section~\ref{sec:baseline} dissolve almost immediately under
impairment.
\begin{table}[H]
\centering
\caption{Single-stream TCP throughput (Mbps) across impairment
profiles, sorted by baseline. Retention is the
Low-to-baseline ratio.}
\label{tab:tcp_impairment}
\begin{tabular}{lrrrrr}
\hline
\textbf{VPN} & \textbf{Baseline} & \textbf{Low} &
\textbf{Medium} & \textbf{High} & \textbf{Retention} \\
\hline
Internal & 934 & 333 & 29.6 & 4.25 & 35.7\% \\
WireGuard & 864 & 54.7 & 8.77 & 2.63 & 6.3\% \\
ZeroTier & 814 & 63.7 & 12.0 & 4.01 & 7.8\% \\
Headscale & 800 & 274 & 41.5 & 4.21 & 34.3\% \\
Yggdrasil & 795 & 13.2 & 6.08 & 3.40 & 1.7\% \\
\hline
Nebula & 706 & 49.8 & 7.82 & 2.60 & 7.1\% \\
EasyTier & 636 & 156 & 17.4 & 3.59 & 24.6\% \\
VpnCloud & 539 & 58.2 & 8.33 & 1.86 & 10.8\% \\
\hline
Hyprspace & 368 & 4.42 & 2.05 & 1.39 & 1.2\% \\
Tinc & 336 & 54.4 & 5.53 & 2.77 & 16.2\% \\
Mycelium & 259 & 16.2 & 3.87 & 2.73 & 6.3\% \\
\hline
\end{tabular}
\end{table}
% PLOT: line chart
% File: Figures/impairment/TCP Throughput Heatmap.png
% Data: Single-stream TCP throughput for all 11 VPNs at baseline,
% low, medium, high
% Show: Headscale crossing above Internal at medium impairment;
% Yggdrasil's cliff from baseline to low; convergence of all
% VPNs at high impairment
Yggdrasil crashes from 795\,Mbps to 13.2\,Mbps at Low impairment, a
98.3\% throughput loss from adding just 2\,ms latency, 2\,ms jitter,
0.25\% packet loss, and 0.5\% reordering per machine. Even Mycelium,
the slowest VPN at baseline (259\,Mbps), retains more throughput at
Low than Yggdrasil does. The jumbo overlay MTU of 32\,731~bytes,
which inflated baseline metrics
(Section~\ref{sec:baseline}), becomes a liability under impairment:
each lost or reordered outer packet triggers retransmission of
${\sim}$24$\times$ more inner-layer data than a standard
1\,400-byte MTU VPN would lose.
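The ${\sim}$24$\times$ figure follows directly from the MTU ratio:
\[
  \frac{32\,731\ \text{B}}{1\,400\ \text{B}} \approx 23.4 ,
\]
so one lost or reordered outer packet can invalidate the equivalent
of roughly two dozen standard-MTU segments of inner TCP data.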
Headscale retains 34.3\% of its baseline throughput at Low, nearly
matching Internal's 35.7\%. At Medium impairment, Headscale
(41.5\,Mbps) overtakes Internal (29.6\,Mbps) --- a VPN outperforming
the bare-metal baseline.
Section~\ref{sec:tailscale_degraded} investigates this anomaly in
detail.
At High impairment, the throughput range compresses from 675\,Mbps at
baseline to just 2.9\,Mbps. Internal leads at 4.25\,Mbps; Hyprspace
trails at 1.39\,Mbps. The impairment profile itself becomes the
bottleneck. With 2.5\% packet loss and 5\% reordering per machine,
every implementation is TCP-loss-limited, and architectural
differences that matter at gigabit speeds become irrelevant.
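The convergence is roughly what a loss-limited TCP model predicts. A
Mathis-style estimate (an approximation, assuming an MSS of about
1\,400~bytes, the ${\sim}$16\,ms High-profile RTT from
Table~\ref{tab:ping_impairment}, and the 2.5\% per-machine loss rate)
gives
\[
  \text{BW} \approx \frac{\text{MSS}}{\text{RTT}} \cdot
  \frac{1.22}{\sqrt{p}}
  = \frac{1\,400\ \text{B}}{16\ \text{ms}} \cdot
  \frac{1.22}{\sqrt{0.025}}
  \approx 5\,\text{Mbps},
\]
the same order of magnitude as every VPN's measured High-profile
throughput; with the effective end-to-end loss closer to 5\%
(both machines impaired), the estimate drops below 4\,Mbps.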
\subsection{UDP Throughput}
% UDP iperf3: throughput, jitter, packet loss.
The UDP stress test (\texttt{-b~0}) suffers from widespread failures
under impairment. Hyprspace and Mycelium, which already failed at
baseline, continue to fail at all profiles. Tinc and ZeroTier fail
at most non-baseline profiles. The sparse dataset limits
conclusions, but one pattern stands out.
Kernel-level implementations maintain throughput regardless of
impairment. Internal holds ${\sim}$950\,Mbps across all profiles
where data exists. Headscale sustains 700--876\,Mbps and WireGuard
850--908\,Mbps; % TODO: verify WireGuard UDP range -- analysis doc says 850-898, possible digit transposition
both rely on WireGuard's in-kernel UDP handling with
proper backpressure. Userspace VPNs collapse: EasyTier drops from
865 to 435 to 38.5 to 6.1\,Mbps across successive profiles.
Yggdrasil, already pathological at baseline (98.7\% loss), crashes to
12.3\,Mbps at Low and fails entirely at Medium and High.
% PLOT: heatmap
% File: Figures/impairment/UDP Receiver Throughput Heatmap.png
% Data: UDP receiver throughput for all 11 VPNs at baseline, low,
% medium, high (grey/hatched cells for failures)
% Show: Kernel-level VPNs (Internal, WireGuard, Headscale) maintaining
% high throughput across all profiles; userspace VPNs failing or
% collapsing; the large number of empty cells
The failure rate of this benchmark under impairment makes it more
useful as a robustness indicator than a throughput measurement. A VPN
that cannot complete a 30-second UDP flood under 0.25\% packet loss
has fundamental flow-control problems that will surface under real
workloads too, even if the symptoms are milder.
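For reference, the stress test corresponds to an unlimited-rate UDP
flood of roughly the following form (a sketch; the peer address is a
placeholder, and only the \texttt{-b~0} setting and 30-second
duration are taken from this chapter):
\begin{verbatim}
# UDP stress test: send as fast as possible for 30 s over the tunnel
iperf3 -c <peer-vpn-address> -u -b 0 -t 30
\end{verbatim}
Surviving this pattern requires working backpressure between the
sending socket and the tunnel device, which is what the kernel-level
implementations provide and several userspace ones do not.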
\subsection{Parallel TCP}
% Parallel iperf3: throughput under contention (A->B, B->C, C->A).
Table~\ref{tab:parallel_impairment} shows aggregate throughput across
three concurrent bidirectional links (six unidirectional flows). The
Headscale anomaly from the single-stream results is amplified here.
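The contention pattern can be reproduced with one bidirectional
\texttt{iperf3} session per machine pair, launched concurrently on
all three machines (a sketch; addresses, orchestration, and duration
are illustrative, and \texttt{--bidir} requires iperf3~3.7 or newer):
\begin{verbatim}
# Run on A, B, and C simultaneously: the A->B, B->C, C->A triangle
iperf3 -c <next-peer-vpn-address> --bidir
\end{verbatim}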
\begin{table}[H]
\centering
\caption{Parallel TCP throughput (Mbps) across impairment profiles.
Three concurrent bidirectional links produce six unidirectional
flows.}
\label{tab:parallel_impairment}
\begin{tabular}{lrrrr}
\hline
\textbf{VPN} & \textbf{Baseline} & \textbf{Low} &
\textbf{Medium} & \textbf{High} \\
\hline
Internal & 1398 & 277 & 82.6 & 10.4 \\
Headscale & 1228 & 718 & 113 & 20.0 \\
WireGuard & 1281 & 173 & 24.5 & 8.39 \\
Yggdrasil & 1265 & 38.7 & 16.7 & 8.95 \\
ZeroTier & 1206 & 176 & 35.4 & 7.97 \\
EasyTier & 927 & 473 & 57.4 & 10.7 \\
Hyprspace & 803 & 2.87 & 6.94 & 3.62 \\
VpnCloud & 763 & 174 & 23.7 & 8.25 \\
Nebula & 648 & 103 & 15.3 & 4.93 \\
Mycelium & 569 & 72.7 & 7.51 & 3.69 \\
Tinc & 563 & 168 & 23.7 & 8.25 \\
\hline
\end{tabular}
\end{table}
% PLOT: heatmap
% File: Figures/impairment/Parallel TCP Throughput Heatmap.png
% Data: Parallel TCP throughput for all 11 VPNs at baseline, low,
% medium, high
% Show: Headscale dominating at low impairment (718 Mbps vs Internal's
% 277); EasyTier as runner-up (473 Mbps); Hyprspace's collapse
% to 2.87 Mbps
At Low impairment, Headscale reaches 718\,Mbps --- 2.6$\times$
Internal (277\,Mbps) and 4.1$\times$ WireGuard (173\,Mbps). At
Medium, Headscale (113\,Mbps) still leads Internal (82.6\,Mbps) by
37\%.
The single-stream anomaly from
Section~\ref{sec:tailscale_degraded} compounds when multiple flows
each independently benefit from Headscale's congestion control
tuning.
EasyTier is the second-most resilient VPN under parallel load, at
473\,Mbps at Low (51\% of baseline). Both EasyTier and Headscale
retain more than half their baseline parallel throughput at Low
impairment; no other VPN exceeds 30\%.
Hyprspace collapses from 803\,Mbps to 2.87\,Mbps at Low, a 99.6\%
loss. The buffer bloat that plagues single-stream transfers
(Section~\ref{sec:hyprspace_bloat}) becomes catastrophic when six
concurrent flows compete for the same bloated buffers.
The High-profile convergence effect is even more pronounced here than
in single-stream mode. Tinc and VpnCloud land at identical
8.25\,Mbps despite differing by 200\,Mbps at baseline.
\subsection{QUIC Performance}
% qperf: bandwidth, TTFB, connection establishment time.
Headscale and Nebula failed the qperf QUIC benchmark at baseline
(Section~\ref{sec:baseline}) and continue to fail across all
impairment profiles.
Yggdrasil's QUIC bandwidth drops from 745\,Mbps at baseline to
7.67\,Mbps at Low, 3.45\,Mbps at Medium, and 2.17\,Mbps at High ---
the same cliff observed in its TCP results, driven by the same
jumbo-MTU amplification of outer-layer packet loss.
At High impairment, WireGuard (23.2\,Mbps), VpnCloud (23.4\,Mbps),
ZeroTier (23.0\,Mbps), and Tinc (23.4\,Mbps) converge to within
0.4\,Mbps of each other. At baseline these four span a 500\,Mbps
range. QUIC's own congestion control, operating atop the
already-degraded outer link, becomes the sole limiter.
% PLOT: heatmap
% File: Figures/impairment/QUIC Bandwidth Heatmap.png
% Data: QPerf QUIC bandwidth for VPNs with data at all four profiles
% (WireGuard, VpnCloud, ZeroTier, Tinc, Yggdrasil, Internal)
% Show: Yggdrasil's cliff from baseline to low; convergence of
% WireGuard, VpnCloud, ZeroTier, Tinc at high (~23 Mbps)
\subsection{Video Streaming}
% RIST: bitrate, dropped frames, packets recovered, quality score.
Table~\ref{tab:rist_impairment} presents RIST video quality scores
across profiles. The actual encoding bitrate of ${\sim}$3.3\,Mbps
sits well within every VPN's throughput budget even at High
impairment, so quality differences reflect packet delivery reliability
rather than bandwidth limits.
\begin{table}[H]
\centering
\caption{RIST video streaming quality (\%) across impairment
profiles, sorted by High-profile quality}
\label{tab:rist_impairment}
\begin{tabular}{lrrrr}
\hline
\textbf{VPN} & \textbf{Baseline} & \textbf{Low} &
\textbf{Medium} & \textbf{High} \\
\hline
Mycelium & 100.0 & 100.0 & 100.0 & 99.9 \\
EasyTier & 100.0 & 100.0 & 96.2 & 85.5 \\
Internal & 100.0 & 99.2 & 89.3 & 80.2 \\
ZeroTier & 100.0 & 99.3 & 89.9 & 80.2 \\
VpnCloud & 100.0 & 99.2 & 89.7 & 80.1 \\
WireGuard & 100.0 & 99.3 & 90.0 & 80.0 \\
Hyprspace & 100.0 & 92.9 & 87.9 & 78.1 \\
Tinc & 100.0 & 99.3 & 90.0 & 77.8 \\
Nebula & 99.8 & 98.8 & 85.6 & 72.1 \\
Yggdrasil & 100.0 & 94.7 & 71.4 & 43.3 \\
Headscale & 13.1 & 13.0 & 13.0 & 13.0 \\
\hline
\end{tabular}
\end{table}
% PLOT: heatmap
% File: Figures/impairment/Video Streaming Quality Heatmap.png
% Data: RIST quality for all 11 VPNs at baseline, low, medium, high
% Show: Headscale stuck at 13% (red row); Mycelium stuck near 100%
% (green row); gradual degradation for the bulk; Yggdrasil's
% steep decline to 43%
Headscale stays at ${\sim}$13\% across all four profiles: 13.1\%,
13.0\%, 13.0\%, 13.0\%. The profile-independence confirms the
baseline diagnosis from Section~\ref{sec:baseline}. The failure is
structural --- likely MTU fragmentation in the DERP relay layer ---
and cannot worsen because it is already saturated. Adding latency or
loss on top of an 87\% packet drop floor changes nothing.
Mycelium delivers 99.9\% quality even at High impairment, better than
Internal (80.2\%) and every other VPN. At 3.3\,Mbps, even
Mycelium's degraded overlay paths can sustain the stream. Its
overlay retransmission mechanism, which cripples bulk TCP transfers,
works well for steady low-bandwidth UDP flows. RIST's own forward
error correction handles whatever Mycelium's retransmissions miss.
Yggdrasil degrades the most steeply: 100\% at baseline, 94.7\% at
Low, 71.4\% at Medium, 43.3\% at High. The jumbo MTU that hurt TCP
throughput also hurts here --- large overlay packets carrying RIST
data are more likely to be lost or reordered at the outer layer, and
RIST's FEC cannot recover from the resulting burst losses.
\subsection{Application-Level Download}
% Nix cache: download duration for Firefox package.
Table~\ref{tab:nix_impairment} shows Nix binary cache download times
across profiles. This HTTP-heavy workload, dominated by many
short-lived TCP connections, is more sensitive to per-connection
latency than to raw bandwidth.
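A single run of this benchmark amounts to fetching one package
closure from a cache that is reachable only over the tunnel, for
example (a hypothetical command form; the cache address, port, and
exact invocation are placeholders rather than the benchmark's actual
tooling):
\begin{verbatim}
# Fetch the Firefox closure from a binary cache served over the VPN
nix copy --from 'http://<cache-host-vpn-address>:5000' nixpkgs#firefox
\end{verbatim}
Every store path in the closure requires its own narinfo and NAR
requests, so the per-connection latency penalty is paid hundreds of
times per download.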
\begin{table}[H]
\centering
\caption{Nix binary cache download time (seconds) across impairment
profiles, sorted by Low-profile time. ``--'' marks a failed
run.}
\label{tab:nix_impairment}
\begin{tabular}{lrrrr}
\hline
\textbf{VPN} & \textbf{Baseline} & \textbf{Low} &
\textbf{Medium} & \textbf{High} \\
\hline
Internal & 8.53 & 11.9 & 58.6 & -- \\
Headscale & 9.79 & 13.5 & 48.8 & 219 \\
EasyTier & 9.39 & 22.1 & 141 & -- \\
VpnCloud & 9.39 & 27.9 & 163 & -- \\
WireGuard & 9.45 & 28.8 & 161 & -- \\
Nebula & 9.15 & 30.8 & 180 & 547 \\
Tinc & 10.0 & 30.9 & 166 & 496 \\
ZeroTier & 9.22 & 36.2 & 141 & -- \\
Mycelium & 10.1 & 79.5 & -- & -- \\
Yggdrasil & 10.6 & 230 & -- & -- \\
Hyprspace & 11.9 & -- & 170 & -- \\
\hline
\end{tabular}
\end{table}
% PLOT: heatmap
% File: Figures/impairment/Nix Cache Download Time Heatmap.png
% Data: Nix cache download time for all VPNs at baseline, low, medium,
% high (hatched/absent bars for failures)
% Show: Headscale as the only VPN completing all four profiles;
% Headscale beating Internal at medium (48.8 vs 58.6 s);
% Yggdrasil's 22x slowdown at low impairment
Headscale is the only VPN to complete all four profiles. At Medium
impairment, it finishes in 48.8~seconds --- faster than Internal's
58.6~seconds. Internal itself fails at High impairment while
Headscale completes in 219~seconds. Only Nebula (547\,s) and Tinc
(496\,s) also survive High impairment.
Yggdrasil's download time explodes from 10.6\,s to 230\,s at Low
impairment, a 22$\times$ slowdown. Every HTTP request incurs the
latency penalty from Yggdrasil's impairment-amplified
retransmissions. Mycelium also degrades severely (10.1\,s to
79.5\,s, an 8$\times$ increase), consistent with its overlay routing
overhead, which compounds over hundreds of sequential HTTP
connections.
The failure map reveals a broadly monotonic gradient: more demanding
profiles knock out more VPNs. At Low, 10 of 11 complete (only
Hyprspace fails, though it later completes at Medium). At Medium,
9 complete. At High, only 3 survive (Headscale, Nebula,
Tinc). Internal's failure at High is the most surprising --- the
bare-metal baseline cannot sustain a multi-connection HTTP workload
under severe degradation, but Headscale, shielded by its userspace
TCP stack, can. Section~\ref{sec:tailscale_degraded} explains why.
\section{Tailscale Under Degraded Conditions}
% The central finding: Tailscale outperforming the raw Linux
% networking stack under impairment.
\label{sec:tailscale_degraded}
\subsection{Observed Anomaly}
% Present the data showing Tailscale exceeding internal baseline
% throughput under Medium/High impairment.
At Medium impairment, Headscale delivers 41.5\,Mbps single-stream TCP
throughput --- 40\% more than Internal's 29.6\,Mbps. A VPN built
atop WireGuard outperforms the bare-metal connection it tunnels
through. The anomaly is consistent across benchmarks:
Table~\ref{tab:headscale_anomaly} summarizes the comparison.
\begin{table}[H]
\centering
\caption{Headscale vs.\ Internal vs.\ WireGuard under impairment
(18.12.2025 run). For TCP benchmarks, higher is better. For
Nix cache, lower is better; ``--'' marks a failed run.}
\label{tab:headscale_anomaly}
\begin{tabular}{llrrr}
\hline
\textbf{Benchmark} & \textbf{Profile} & \textbf{Internal} &
\textbf{Headscale} & \textbf{WireGuard} \\
\hline
Single TCP (Mbps) & Low & 333 & 274 & 54.7 \\
Single TCP (Mbps) & Medium & 29.6 & 41.5 & 8.77 \\
Single TCP (Mbps) & High & 4.25 & 4.21 & 2.63 \\
Parallel TCP (Mbps) & Low & 277 & 718 & 173 \\
Parallel TCP (Mbps) & Medium & 82.6 & 113 & 24.5 \\
Nix cache (s) & Medium & 58.6 & 48.8 & 161 \\
Nix cache (s) & High & -- & 219 & -- \\
\hline
\end{tabular}
\end{table}
% TODO: Needs to be created, use the tools/ folder
% PLOT: line chart
% File: Figures/impairment/headscale-vs-internal-across-profiles.png
% Data: Single-stream TCP throughput for Internal, Headscale, and
% WireGuard across all four profiles
% Show: Headscale crossing above Internal at medium impairment;
% WireGuard far below both; convergence at high
% Y-axis: log scale
In parallel TCP at Low impairment, Headscale reaches 718\,Mbps vs.\
Internal's 277\,Mbps (2.6$\times$). The Nix cache download at
Medium takes Headscale 48.8\,s vs.\ Internal's 58.6\,s (17\%
faster). At High impairment, Internal fails the Nix cache entirely
while Headscale completes in 219\,s.
WireGuard, which shares Headscale's cryptographic layer, shows no
such advantage: 54.7\,Mbps at Low, 8.77\,Mbps at Medium. Whatever
protects Headscale is not the encryption or the tunnel --- it is
something in Tailscale's userspace networking stack.
The retransmit data provides the first clue. At Medium impairment,
Headscale's retransmit percentage is approximately 2.4\%, matching
Internal's ${\sim}$2.4\%. WireGuard's is 5.2\%. Headscale achieves
Internal's retransmit efficiency while delivering higher throughput
--- fewer spurious retransmissions leave more bandwidth for actual
data.
\subsection{Congestion Control Analysis}
% Reno vs CUBIC, RACK disabled to avoid spurious retransmits
% under reordering.
Tailscale uses a userspace TCP/IP stack derived from Google's gVisor
(netstack). This stack does not inherit the host kernel's TCP
parameters. Three defaults differ from the Linux kernel in ways that
matter under packet reordering:
\begin{itemize}
\bitem{\texttt{tcp\_reordering}:} gVisor uses 10; the Linux kernel
defaults to~3. This parameter controls how many out-of-order
packets TCP tolerates before treating the event as a loss. With
tc~netem injecting 0.5--2.5\% reordering per machine, bursts of
3+ reordered packets are frequent. The kernel's threshold of~3
causes spurious fast retransmits and congestion window reductions
for packets that are merely reordered, not lost.
\bitem{\texttt{tcp\_recovery} (RACK):} gVisor disables it; the
Linux kernel enables it by default. RACK uses timing-based loss
detection that is more aggressive than the pure sequence-based
approach gVisor uses. Under reordering, RACK's timing heuristics
can falsely classify delayed packets as lost.
\bitem{\texttt{tcp\_early\_retrans} (TLP):} gVisor disables it; the
kernel enables it. Tail Loss Probe sends speculative retransmits
on idle connections, which can worsen congestion when the link is
already impaired.
\end{itemize}
The combined effect: under network conditions with packet reordering,
the default Linux TCP stack fires retransmits and cuts the congestion
window far more often than necessary. Each false positive shrinks the
window and reduces throughput. Tailscale's gVisor stack tolerates
more reordering before reacting, so its congestion window stays larger
and throughput stays higher.
This explains why the anomaly grows with impairment severity. At
baseline, there is no reordering, so the threshold difference is
irrelevant and Internal's kernel-level processing advantage dominates.
As reordering increases from 0.5\% (Low) to 2.5\% (Medium) per
machine, the kernel's aggressive loss detection fires more often, and
the throughput gap shifts in Headscale's favor.
\subsection{Tuned Kernel Parameters}
% Re-run results with tuned buffer sizes and congestion control
% on the internal baseline, showing the gap closes.
Two follow-up benchmark runs applied Tailscale's gVisor TCP
parameters to the host kernel via sysctl:
\begin{itemize}
\bitem{Full gVisor (27.02.2026):} All parameters ---
\texttt{tcp\_reordering=10}, \texttt{tcp\_recovery=0},
\texttt{tcp\_early\_retrans=0}, plus enlarged buffer sizes
(\texttt{tcp\_rmem}, \texttt{tcp\_wmem}, \texttt{rmem\_max},
\texttt{wmem\_max}). Tested on Internal, Headscale, WireGuard,
Tinc, and ZeroTier.
\bitem{Reorder-only (06.03.2026):} Only
\texttt{tcp\_reordering=10}, \texttt{tcp\_recovery=0}, and
\texttt{tcp\_early\_retrans=0}. Buffer sizes left at kernel
defaults. Tested on Internal and Headscale only.
\end{itemize}
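Expressed as host sysctls, the reorder-only configuration amounts to
the following (a sketch; a persistent configuration via
\texttt{/etc/sysctl.d} uses the same keys):
\begin{verbatim}
# Adopt gVisor's reordering tolerance in the Linux TCP stack
sysctl -w net.ipv4.tcp_reordering=10    # kernel default: 3
sysctl -w net.ipv4.tcp_recovery=0       # disable RACK loss detection
sysctl -w net.ipv4.tcp_early_retrans=0  # disable tail loss probes
\end{verbatim}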
Table~\ref{tab:kernel_tuning_internal} shows how Internal responds
to the tuning. Both follow-up runs used the same impairment profiles
and hardware as the original 18.12.2025 run.
\begin{table}[H]
\centering
\caption{Internal (no VPN) throughput across three kernel
configurations. ``Default'' is the 18.12.2025 run with stock
Linux TCP parameters.}
\label{tab:kernel_tuning_internal}
\begin{tabular}{llrrr}
\hline
\textbf{Metric} & \textbf{Profile} & \textbf{Default} &
\textbf{Full gVisor} & \textbf{Reorder-only} \\
\hline
Single TCP (Mbps) & Baseline & 934 & 934 & 934 \\
Single TCP (Mbps) & Low & 333 & 363 & 354 \\
Single TCP (Mbps) & Medium & 29.6 & 64.2 & 72.7 \\
Parallel TCP (Mbps) & Low & 277 & 893 & 902 \\
Parallel TCP (Mbps) & Medium & 82.6 & 226 & 211 \\
Retransmit \% & Medium & ${\sim}$2.4 & 1.21 & 1.11 \\
Nix cache (s) & Medium & 58.6 & 29.7 & 29.1 \\
\hline
\end{tabular}
\end{table}
% TODO: Needs to be created
% PLOT: grouped bar chart
% File: Figures/impairment/kernel-tuning-internal-throughput.png
% Data: Internal single-stream TCP at baseline/low/medium across
% original, full gVisor, and reorder-only configurations
% Show: Dramatic jump at medium (29.6 -> 64.2 -> 72.7 Mbps);
% baseline unchanged; modest improvement at low
% Y-axis: linear scale
Internal's Medium-impairment throughput jumps from 29.6 to
72.7\,Mbps --- a 146\% increase from a three-line sysctl change. The
retransmit percentage drops from ${\sim}$2.4\% to 1.11\%; most of the
original retransmissions were spurious. The Nix cache download at
Medium halves from 58.6\,s to 29.1\,s.
Parallel TCP sees an even larger gain. Internal at Low impairment
climbs from 277 to 902\,Mbps, a 226\% increase that now exceeds
Headscale's original 718\,Mbps. With six concurrent flows each
independently benefiting from the higher reordering threshold, the
aggregate improvement compounds.
The anomaly reverses. At every impairment level and benchmark, tuned
Internal now meets or exceeds Headscale. At Medium impairment:
Internal 72.7\,Mbps vs.\ Headscale 50.1\,Mbps (Internal 45\% ahead),
where the original result had Headscale 40\% ahead. The Nix cache
flips too: Internal completes in 29.1\,s vs.\ Headscale's 36.3\,s,
where the original had Headscale 17\% faster.
% TODO: Needs to be created
% PLOT: before/after comparison
% File: Figures/impairment/headscale-gap-reversal.png
% Data: Internal vs Headscale throughput ratio at each impairment
% level, original vs tuned (reorder-only)
% Show: The crossover from "Headscale wins" (ratio < 1) to "Internal
% wins" (ratio > 1) at medium impairment after tuning
% Y-axis: ratio (Internal / Headscale), 1.0 as break-even
The reorder-only configuration (06.03) matches or exceeds the full
gVisor configuration (27.02) at most metrics; the two exceptions are
single-stream TCP at Low (354 vs.\ 363\,Mbps) and parallel TCP at
Medium (211 vs.\ 226\,Mbps), both within 7\%. Internal
reaches 72.7\,Mbps at Medium with reorder-only vs.\ 64.2\,Mbps with
full gVisor. The enlarged buffer sizes are unnecessary and may
introduce mild buffer bloat that partially offsets the reordering
benefit. The entire Headscale advantage is explained by three kernel
parameters: \texttt{tcp\_reordering}, \texttt{tcp\_recovery}, and
\texttt{tcp\_early\_retrans}.
Other VPNs benefit less from the kernel tuning. WireGuard's Medium
throughput rises from 8.77 to 12.2\,Mbps (+39\%) and Tinc's from
5.53 to 11.5\,Mbps (+108\%). ZeroTier is essentially unchanged (12.0
to 11.5\,Mbps). The tuning helps the kernel TCP stack, but VPNs that
add their own encapsulation overhead and userspace processing have
independent bottlenecks that the sysctl parameters cannot remove.
Headscale itself gets modestly faster with kernel tuning (+21\% at
Medium) but slightly slower at Low impairment ($-$5\%). Its
userspace gVisor stack already optimizes for reordering tolerance.
When the kernel stack also increases its tolerance, the two layers of
tuning may interact suboptimally --- both independently delay
retransmits, which can cause compound delays on the
kernel-to-Headscale socket path.
\section{Source Code Analysis}