All experiments were conducted on three bare-metal servers with
identical specifications:

\begin{itemize}
\bitem{CPU:} Intel Model 94, 4 cores / 8 threads
\bitem{Memory:} 64 GB RAM
\bitem{Network:} 1 Gbps Ethernet (e1000e driver; one machine uses r8169)
\bitem{Cryptographic acceleration:} AES-NI, AVX, AVX2, PCLMULQDQ, RDRAND, SSE4.2
\end{itemize}
may differ on systems without these features.

\subsection{Network Topology}

The three machines are connected via a direct 1 Gbps LAN on the same
network segment. Each machine has a publicly reachable IPv4 address,
which is used to deploy configuration changes via Clan. This baseline
topology provides a controlled environment with minimal latency and no
packet loss, allowing the overhead introduced by each VPN implementation
to be measured in isolation. Figure~\ref{fig:mesh_topology} illustrates
the full-mesh connectivity between the three machines.

\begin{figure}[H]
\centering
\begin{tikzpicture}[
  node/.style={
    draw, rounded corners, minimum width=2.2cm, minimum height=1cm,
    font=\ttfamily\bfseries, align=center
  },
  link/.style={thick, <->}
]
  % Nodes in an equilateral triangle
  \node[node] (luna) at (0, 3.5) {luna};
  \node[node] (yuki) at (-3, 0) {yuki};
  \node[node] (lom) at (3, 0) {lom};

  % Mesh links
  \draw[link] (luna) -- node[left, font=\small] {1 Gbps} (yuki);
  \draw[link] (luna) -- node[right, font=\small] {1 Gbps} (lom);
  \draw[link] (yuki) -- node[below, font=\small] {1 Gbps} (lom);
\end{tikzpicture}
\caption{Full-mesh network topology of the three benchmark machines}
\label{fig:mesh_topology}
\end{figure}

To simulate real-world network conditions, Linux traffic control
(\texttt{tc netem}) is used to inject latency, jitter, packet loss,
and reordering.
for understanding the cost of mesh coordination and NAT traversal logic.

VPNs were selected based on:
\begin{itemize}
\bitem{NAT traversal capability:} All selected VPNs can establish
connections between peers behind NAT without manual port forwarding.
\bitem{Decentralization:} Preference for solutions without mandatory
central servers, though coordinated-mesh VPNs were included for comparison.
\bitem{Active development:} Only VPNs with recent commits and
maintained releases were considered.
\bitem{Linux support:} All VPNs must run on Linux.
\end{itemize}

\subsection{Configuration Methodology}

Each VPN is built from source within the Nix flake, ensuring that all
dependencies are pinned to exact versions. VPNs not packaged in nixpkgs
(Hyprspace, EasyTier, VpnCloud) have dedicated build expressions
under \texttt{pkgs/} in the flake.

Cryptographic material (WireGuard keys, Nebula certificates, ZeroTier
work that relied exclusively on iperf3.

\subsection{Ping}

Measures ICMP round-trip latency and packet delivery reliability.

\begin{itemize}
\bitem{Method:} 100 ICMP echo requests at 200 ms intervals,
1-second per-packet timeout, repeated for 3 runs.
\bitem{Metrics:} RTT (min, avg, max, mdev), packet loss percentage,
per-packet RTTs.
\end{itemize}
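The method above corresponds to a standard iputils \texttt{ping} invocation. A minimal sketch, assuming the \texttt{-c} (count), \texttt{-i} (interval), and \texttt{-W} (per-packet timeout) flags; the hostname is hypothetical:

```python
import subprocess

def ping_command(host: str, count: int = 100, interval_s: float = 0.2,
                 timeout_s: int = 1) -> list[str]:
    # 100 ICMP echo requests at 200 ms intervals with a 1 s
    # per-packet timeout, as in the method description.
    return ["ping", "-c", str(count), "-i", str(interval_s),
            "-W", str(timeout_s), host]

def run_ping(host: str) -> str:
    # One of the 3 runs; stdout contains per-packet RTTs plus the
    # min/avg/max/mdev summary line that the metrics are parsed from.
    return subprocess.run(ping_command(host), capture_output=True,
                          text=True, check=True).stdout
```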

\subsection{TCP iPerf3}

Measures bulk TCP throughput with iperf3,
a tool commonly used in networking research to measure performance.

\begin{itemize}
\bitem{Method:} 30-second bidirectional test in zero-copy mode
(\texttt{-Z}) to minimize CPU overhead.
\bitem{Metrics:} Throughput (bits/s), retransmits, congestion
window, and CPU utilization.
\end{itemize}
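A minimal sketch of this invocation, assuming a reasonably recent iperf3 with the \texttt{--bidir}, \texttt{-Z} (zero-copy), and \texttt{-J} (JSON output) flags; the server name is hypothetical:

```python
import json
import subprocess

def iperf3_tcp_command(server: str, duration_s: int = 30) -> list[str]:
    # 30 s bidirectional TCP test in zero-copy mode; JSON output
    # lets throughput and retransmit counters be parsed afterwards.
    return ["iperf3", "-c", server, "-t", str(duration_s),
            "--bidir", "-Z", "-J"]

def run_tcp_test(server: str) -> dict:
    out = subprocess.run(iperf3_tcp_command(server),
                         capture_output=True, text=True, check=True)
    return json.loads(out.stdout)
```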

\subsection{UDP iPerf3}

Measures bulk UDP throughput with the same flags as the TCP iPerf3 benchmark.

\begin{itemize}
\bitem{Method:} Same as the TCP test, plus the unlimited target
bandwidth (\texttt{-b 0}) and 64-bit counter flags.
\bitem{Metrics:} Throughput (bits/s), jitter, packet loss, and CPU
utilization.
\end{itemize}

\subsection{Parallel iPerf3}

Tests concurrent overlay network traffic by running TCP streams on all machines
simultaneously in a circular pattern (A$\rightarrow$B,
B$\rightarrow$C, C$\rightarrow$A) for 60 seconds. This simulates
contention across the overlay network.

\begin{itemize}
\bitem{Method:} 60-second bidirectional test in zero-copy mode
(\texttt{-Z}) to minimize CPU overhead.
\bitem{Metrics:} Throughput (bits/s), retransmits, congestion
window, and CPU utilization.
\end{itemize}
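The circular pattern can be expressed as a small helper, so every machine sends and receives exactly one stream at the same time:

```python
def circular_pairs(machines: list[str]) -> list[tuple[str, str]]:
    # [A, B, C] -> (A,B), (B,C), (C,A): each machine is the sender
    # of one stream and the receiver of another, simultaneously.
    return [(m, machines[(i + 1) % len(machines)])
            for i, m in enumerate(machines)]
```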

\subsection{QPerf}

Measures connection-level QUIC performance rather
than bulk UDP or TCP throughput.

\begin{itemize}
\bitem{Method:} One qperf process per CPU core in parallel, each
running for 30 seconds. Bandwidth from all cores is summed per second.
\bitem{Metrics:} Total bandwidth (Mbps), CPU usage, time to first
byte (TTFB), connection establishment time.
\end{itemize}
Measures real-time multimedia streaming performance.

\begin{itemize}
\bitem{Method:} The sender generates a 4K ($3840\times2160$) test
pattern at 30 fps using ffmpeg with H.264 encoding (ultrafast preset,
zerolatency tuning) at 25 Mbps target bitrate. The stream is transmitted
over the RIST protocol to a receiver on the target machine for 30 seconds.
\bitem{Encoding metrics:} Actual bitrate, frame rate, dropped frames.
\bitem{Network metrics:} Packets dropped, packets recovered via
RIST retransmission, RTT, quality score (0--100), received bitrate.
\end{itemize}
realistic test of VPN behavior under multimedia workloads.

\subsection{Nix Cache Download}

Measures sustained HTTP download performance of many small files
using a real-world workload.

\begin{itemize}
\bitem{Method:} A Harmonia Nix binary cache server on the target
machine serves the Firefox package. The client downloads it via
\texttt{nix copy} through the VPN. Benchmarked with hyperfine:
1 warmup run followed by 2 timed runs. The local cache and Nix's
SQLite metadata are cleared between runs.
\bitem{Metrics:} Mean duration (seconds), standard deviation,
min/max duration.
\end{itemize}

This benchmark tests realistic HTTP traffic patterns and sustained
sequential download performance, complementing the synthetic throughput
tests.
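A hedged sketch of how such a hyperfine invocation might be assembled, assuming hyperfine's \texttt{--warmup}, \texttt{--runs}, and \texttt{--prepare} options; the cache URL, store path, and cleanup command below are placeholders, not the orchestrator's actual values:

```python
def nix_cache_benchmark(cache_url: str, store_path: str,
                        clear_cmd: str) -> list[str]:
    # 1 warmup run + 2 timed runs; --prepare runs clear_cmd between
    # runs to reset local cache state (clear_cmd is a placeholder).
    nix_copy = f"nix copy --from {cache_url} {store_path}"
    return ["hyperfine", "--warmup", "1", "--runs", "2",
            "--prepare", clear_cmd, nix_copy]
```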

\section{Network Impairment Profiles}

Four impairment profiles simulate a range of network conditions.
The effective round-trip impairment is approximately doubled.

\begin{table}[H]
\centering
\begin{tabular}{lccccc}
\hline
\textbf{Profile} & \textbf{Latency} & \textbf{Jitter} &
\textbf{Loss} & \textbf{Reorder} & \textbf{Correlation} \\
\hline
Baseline & - & - & - & - & - \\
Low & 2 ms & 2 ms & 0.25\% & 0.5\% & 25\% \\
Medium & 4 ms & 7 ms & 1.0\% & 2.5\% & 50\% \\
High & 6 ms & 15 ms & 2.5\% & 5\% & 50\% \\
\hline
\end{tabular}
\end{table}

The correlation column controls how strongly each packet's impairment
depends on the preceding packet. At 0\% correlation, loss and
reordering events are independent; at higher values they occur in
bursts, because a packet that was lost or reordered increases the
probability that the next packet suffers the same fate. This produces
realistic bursty degradation rather than uniformly distributed drops.

The ``Low'' profile approximates a well-provisioned continental
connection, ``Medium'' represents intercontinental links or congested
networks, and ``High'' simulates severely degraded conditions.
The benchmark suite is fully automated via a Python orchestrator.

\begin{enumerate}
\item Applies TC rules via context manager (guarantees cleanup)
\item Waits 30 seconds for stabilization
\item Executes each benchmark three times sequentially,
once per machine pair: $A\to B$, then
$B\to C$, lastly $C\to A$
\item Clears TC rules
\end{enumerate}
\item Collects results and metadata
\end{enumerate}
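The context-manager pattern from the first step of the inner loop might look like the following sketch; the command runner is injectable here for clarity, and the actual orchestrator internals are not shown in the source:

```python
import contextlib
import subprocess

@contextlib.contextmanager
def tc_rules(dev: str, netem_args: list[str], run=subprocess.run):
    # Apply the netem qdisc on entry; the finally-block guarantees
    # the rules are removed even if a benchmark raises.
    run(["tc", "qdisc", "add", "dev", dev, "root", "netem",
         *netem_args], check=True)
    try:
        yield
    finally:
        run(["tc", "qdisc", "del", "dev", dev, "root"], check=False)
```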

Figure~\ref{fig:orchestrator_flow} illustrates this procedure as a
flowchart.

\begin{figure}[H]
\centering
\begin{tikzpicture}[
  box/.style={
    draw, rounded corners, minimum width=4.8cm, minimum height=0.9cm,
    font=\small, align=center, fill=white
  },
  decision/.style={
    draw, diamond, aspect=2.5, minimum width=3cm,
    font=\small, align=center, fill=white, inner sep=1pt
  },
  arr/.style={->, thick},
  every node/.style={font=\small}
]
  % Main flow
  \node[box] (clean) at (0, 0) {Clean state directories};
  \node[box] (deploy) at (0, -1.5) {Deploy VPN via Clan};
  \node[box] (restart) at (0, -3) {Restart VPN services\\(up to 3 attempts)};
  \node[box] (verify) at (0, -4.5) {Verify connectivity\\(120\,s timeout)};

  % Inner loop
  \node[decision] (profile) at (0, -6.3) {Next impairment\\profile?};
  \node[box] (tc) at (0, -8.3) {Apply TC rules};
  \node[box] (wait) at (0, -9.8) {Wait 30\,s};
  \node[box] (bench) at (0, -11.3) {Run benchmarks\\$A{\to}B,\;
    B{\to}C,\; C{\to}A$};
  \node[box] (clear) at (0, -12.8) {Clear TC rules};

  % After loop
  \node[box] (collect) at (0, -14.8) {Collect results};

  % Arrows -- main spine
  \draw[arr] (clean) -- (deploy);
  \draw[arr] (deploy) -- (restart);
  \draw[arr] (restart) -- (verify);
  \draw[arr] (verify) -- (profile);
  \draw[arr] (profile) -- node[right] {yes} (tc);
  \draw[arr] (tc) -- (wait);
  \draw[arr] (wait) -- (bench);
  \draw[arr] (bench) -- (clear);

  % Loop back
  \draw[arr] (clear) -- ++(3.8, 0) |- (profile);

  % Exit loop
  \draw[arr] (profile) -- ++(-3.2, 0) node[above, pos=0.3] {no}
    |- (collect);
\end{tikzpicture}
\caption{Flowchart of the benchmark orchestrator procedure for a
single VPN}
\label{fig:orchestrator_flow}
\end{figure}

\subsection{Retry Logic}

Tests use a retry wrapper with up to 2 retries (3 total attempts),

be identified during analysis.
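A minimal sketch of such a retry wrapper; the delay between attempts is an assumed parameter, not taken from the source:

```python
import time

def with_retries(fn, attempts: int = 3, delay_s: float = 5.0):
    # Up to 2 retries (3 total attempts). The last exception is
    # re-raised so persistent failures can be identified later.
    last_exc = None
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:
            last_exc = exc
            if attempt < attempts - 1:
                time.sleep(delay_s)  # delay_s is an assumed default
    raise last_exc
```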
Each metric is summarized as a statistics dictionary containing:

\begin{itemize}
\bitem{min / max:} Extreme values observed
\bitem{average:} Arithmetic mean across samples
\bitem{p25 / p50 / p75:} Quartiles via Python's
\texttt{statistics.quantiles()} function
\end{itemize}
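Such a dictionary can be built directly from the standard library (the function name \texttt{summarize} is illustrative):

```python
import statistics

def summarize(samples: list[float]) -> dict[str, float]:
    # The statistics dictionary described above: extremes, mean, and
    # quartiles. quantiles(..., n=4) returns the three cut points
    # p25, p50, p75.
    p25, p50, p75 = statistics.quantiles(samples, n=4)
    return {"min": min(samples), "max": max(samples),
            "average": statistics.mean(samples),
            "p25": p25, "p50": p50, "p75": p75}
```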

Aggregation differs by benchmark type. Benchmarks that execute
multiple discrete runs (ping, with 3 runs of 100 packets each, and
nix-cache, with 2 timed runs via hyperfine) first compute statistics
within each run, then average the resulting statistics across runs.
Concretely, if ping produces three runs with mean RTTs of
5.1, 5.3, and 5.0\,ms, the reported average is the mean of
those three values (5.13\,ms). The reported minimum is the
single lowest RTT observed across all three runs.
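The worked example above can be reproduced directly; the per-packet samples in the second half are hypothetical, chosen only to illustrate the minimum rule:

```python
import statistics

# Per-run mean RTTs from the ping example in the text (ms):
run_means = [5.1, 5.3, 5.0]
reported_avg = statistics.mean(run_means)      # ~5.13 ms

# Hypothetical per-packet RTTs for the three runs (ms); the reported
# minimum is the single lowest RTT across all runs.
runs = [[5.1, 5.0, 5.2], [5.3, 5.4, 5.2], [5.0, 4.9, 5.1]]
reported_min = min(min(run) for run in runs)
```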

Benchmarks that produce continuous per-second samples, such as qperf
and RIST streaming, pool all per-second measurements from a single
execution into one series before computing statistics. For qperf,
bandwidth is first summed across CPU cores for each second, and
statistics are then computed over the resulting time series.
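The per-core summing step can be sketched as follows (data layout assumed: one list of per-second Mbps samples per core):

```python
from collections import defaultdict

def pooled_bandwidth(per_core: dict[int, list[float]]) -> list[float]:
    # Sum per-second bandwidth samples across CPU cores, yielding
    # the single time series that statistics are computed over.
    totals = defaultdict(float)
    for samples in per_core.values():
        for second, mbps in enumerate(samples):
            totals[second] += mbps
    return [totals[s] for s in sorted(totals)]
```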

The analysis reports empirical percentiles (p25, p50, p75) alongside
min/max bounds rather than parametric confidence intervals. This
choice is deliberate: benchmark latency and throughput distributions
are often skewed or multimodal, making assumptions of normality
unreliable. The interquartile range (p25--p75) conveys the spread of
typical observations, while min and max capture outlier behavior.
The nix-cache benchmark additionally reports standard deviation via
hyperfine's built-in statistical output.

\section{Source Code Analysis}
cryptographic hashes (\texttt{narHash}) and commit SHAs for each input.
Key pinned inputs include:

\begin{itemize}
\bitem{nixpkgs:} Follows \texttt{clan-core/nixpkgs}, ensuring a
single version across the dependency graph
\bitem{clan-core:} The Clan framework, pinned to a specific commit
\bitem{VPN sources:} Hyprspace, EasyTier, Nebula locked to
exact commits
\bitem{Build infrastructure:} flake-parts, treefmt-nix, disko,
nixos-facter-modules
\end{itemize}

Custom packages not in nixpkgs (qperf, VpnCloud, iperf with auth patches,
EasyTier, Hyprspace) are built from source within the flake.

\subsection{Declarative System Configuration}