create charts for methodology section

2026-02-25 17:50:40 +01:00
parent c1c94fdf78
commit 841973f26f
2 changed files with 193 additions and 63 deletions

View File

@@ -21,11 +21,11 @@ All experiments were conducted on three bare-metal servers with
identical specifications:
\begin{itemize}
    \bitem{CPU:} Intel Model 94, 4 cores / 8 threads
    \bitem{Memory:} 64 GB RAM
    \bitem{Network:} 1 Gbps Ethernet (e1000e driver; one machine
    uses r8169)
    \bitem{Cryptographic acceleration:} AES-NI, AVX, AVX2, PCLMULQDQ,
    RDRAND, SSE4.2
\end{itemize}
@@ -36,9 +36,35 @@ may differ on systems without these features.
\subsection{Network Topology}
The three machines are connected via a direct 1 Gbps LAN on the same
network segment. Each machine has a publicly reachable IPv4 address,
which is used to deploy configuration changes via Clan. This baseline
topology provides a controlled environment with minimal latency and no
packet loss, allowing the overhead introduced by each VPN implementation
to be measured in isolation. Figure~\ref{fig:mesh_topology} illustrates
the full-mesh connectivity between the three machines.
\begin{figure}[H]
\centering
\begin{tikzpicture}[
node/.style={
draw, rounded corners, minimum width=2.2cm, minimum height=1cm,
font=\ttfamily\bfseries, align=center
},
link/.style={thick, <->}
]
% Nodes in an equilateral triangle
\node[node] (luna) at (0, 3.5) {luna};
\node[node] (yuki) at (-3, 0) {yuki};
\node[node] (lom) at (3, 0) {lom};
% Mesh links
\draw[link] (luna) -- node[left, font=\small] {1 Gbps} (yuki);
\draw[link] (luna) -- node[right, font=\small] {1 Gbps} (lom);
\draw[link] (yuki) -- node[below, font=\small] {1 Gbps} (lom);
\end{tikzpicture}
\caption{Full-mesh network topology of the three benchmark machines}
\label{fig:mesh_topology}
\end{figure}
To simulate real-world network conditions, Linux traffic control
(\texttt{tc netem}) is used to inject latency, jitter, packet loss,
@@ -85,20 +111,20 @@ for understanding the cost of mesh coordination and NAT traversal logic.
VPNs were selected based on:
\begin{itemize}
    \bitem{NAT traversal capability:} All selected VPNs can establish
    connections between peers behind NAT without manual port forwarding.
    \bitem{Decentralization:} Preference for solutions without mandatory
    central servers, though coordinated-mesh VPNs were included for comparison.
    \bitem{Active development:} Only VPNs with recent commits and
    maintained releases were considered.
    \bitem{Linux support:} All VPNs must run on Linux.
\end{itemize}
\subsection{Configuration Methodology}
Each VPN is built from source within the Nix flake, ensuring that all
dependencies are pinned to exact versions. VPNs not packaged in nixpkgs
(Hyprspace, EasyTier, VpnCloud) have dedicated build expressions
under \texttt{pkgs/} in the flake.
Cryptographic material (WireGuard keys, Nebula certificates, ZeroTier
@@ -122,43 +148,63 @@ work that relied exclusively on iperf3.
\subsection{Ping}
Measures ICMP round-trip latency and packet delivery reliability.
\begin{itemize}
    \bitem{Method:} 100 ICMP echo requests at 200 ms intervals,
    1-second per-packet timeout, repeated for 3 runs.
    \bitem{Metrics:} RTT (min, avg, max, mdev), packet loss percentage,
    per-packet RTTs.
\end{itemize}
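In the Python orchestrator, parsing the ping summary could look like the sketch below. This is illustrative only: the regexes target GNU/iputils ping output, and the function name is a hypothetical helper, not taken from the thesis's actual code.

```python
import re

# The orchestrator would invoke something like:
#   ping -c 100 -i 0.2 -W 1 <target>
# (the exact invocation is an assumption based on the parameters above)
def parse_ping_summary(output: str) -> dict:
    """Extract RTT statistics and packet loss from ping's summary lines."""
    rtt = re.search(r"= ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms", output)
    loss = re.search(r"([\d.]+)% packet loss", output)
    return {
        "rtt_min": float(rtt.group(1)),
        "rtt_avg": float(rtt.group(2)),
        "rtt_max": float(rtt.group(3)),
        "rtt_mdev": float(rtt.group(4)),
        "loss_pct": float(loss.group(1)),
    }
```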
\subsection{TCP iPerf3}
Measures bulk TCP throughput with iperf3, a tool commonly used in
networking research to measure network performance.
\begin{itemize}
    \bitem{Method:} 30-second bidirectional test in zero-copy mode
    (\texttt{-Z}) to minimize CPU overhead.
    \bitem{Metrics:} Throughput (bits/s), retransmits, congestion
    window, and CPU utilization.
\end{itemize}
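For illustration, the client invocation implied by this description might be assembled as below. The exact flag set the suite uses is an assumption; the long options shown are iperf3's equivalents of \texttt{-Z}, bidirectional mode, and JSON output.

```python
def iperf3_tcp_cmd(server: str, seconds: int = 30) -> list[str]:
    """Build an iperf3 client command matching the TCP benchmark description.

    The flag selection is an illustrative assumption, not the suite's code.
    """
    return ["iperf3", "--client", server, "--time", str(seconds),
            "--bidir",       # bidirectional test
            "--zerocopy",    # -Z: minimize CPU overhead
            "--json"]        # machine-readable output for the orchestrator
```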
\subsection{UDP iPerf3}
Measures bulk UDP throughput with the same flags as the TCP iPerf3 benchmark.
\begin{itemize}
    \bitem{Method:} As the TCP test, plus unlimited target bandwidth
    (\texttt{-b 0}) and 64-bit counters.
    \bitem{Metrics:} Throughput (bits/s), jitter, packet loss, and CPU
    utilization.
\end{itemize}
\subsection{Parallel iPerf3}
Tests concurrent overlay network traffic by running TCP streams on all
machines simultaneously in a circular pattern (A$\rightarrow$B,
B$\rightarrow$C, C$\rightarrow$A) for 60 seconds. This simulates
contention across the overlay network.
\begin{itemize}
    \bitem{Method:} 60-second bidirectional test in zero-copy mode
    (\texttt{-Z}) to minimize CPU overhead.
    \bitem{Metrics:} Throughput (bits/s), retransmits, congestion
    window, and CPU utilization.
\end{itemize}
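The circular pattern generalizes to any ring of machines. A minimal sketch, with a hypothetical helper name not drawn from the suite's code:

```python
def circular_pairs(machines: list[str]) -> list[tuple[str, str]]:
    """Sender/receiver pairs in a circular pattern: A->B, B->C, C->A.

    Each machine sends to the next one, and the last wraps to the first,
    so every machine is simultaneously a sender and a receiver.
    """
    return [(machines[i], machines[(i + 1) % len(machines)])
            for i in range(len(machines))]
```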
\subsection{QPerf}
Measures connection-level QUIC performance rather
than bulk UDP or TCP throughput.
\begin{itemize}
    \bitem{Method:} One qperf process per CPU core in parallel, each
    running for 30 seconds. Bandwidth from all cores is summed per second.
    \bitem{Metrics:} Total bandwidth (Mbps), CPU usage, time to first
    byte (TTFB), connection establishment time.
\end{itemize}
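The per-second summation across cores can be sketched as follows, assuming per-core sample lists as input; the real shape of qperf's output is not shown in this excerpt, so the data model here is an assumption.

```python
from collections import defaultdict

def sum_per_second(core_samples: dict[int, list[float]]) -> list[float]:
    """Sum per-second bandwidth samples (Mbps) across all CPU cores.

    core_samples maps core id -> list of that core's per-second bandwidth
    values. Returns one total-bandwidth value per second, in time order.
    """
    total = defaultdict(float)
    for samples in core_samples.values():
        for second, mbps in enumerate(samples):
            total[second] += mbps
    return [total[s] for s in sorted(total)]
```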
@@ -167,12 +213,12 @@ Measures connection-level performance rather than bulk throughput.
Measures real-time multimedia streaming performance.
\begin{itemize}
    \bitem{Method:} The sender generates a 4K ($3840\times2160$) test
    pattern at 30 fps using ffmpeg with H.264 encoding (ultrafast preset,
    zerolatency tuning) at 25 Mbps target bitrate. The stream is transmitted
    over the RIST protocol to a receiver on the target machine for 30 seconds.
    \bitem{Encoding metrics:} Actual bitrate, frame rate, dropped frames.
    \bitem{Network metrics:} Packets dropped, packets recovered via
    RIST retransmission, RTT, quality score (0--100), received bitrate.
\end{itemize}
@@ -182,22 +228,19 @@ realistic test of VPN behavior under multimedia workloads.
\subsection{Nix Cache Download}
Measures sustained HTTP download performance of many small files
using a real-world workload.
\begin{itemize}
    \bitem{Method:} A Harmonia Nix binary cache server on the target
    machine serves the Firefox package. The client downloads it via
    \texttt{nix copy} through the VPN. Benchmarked with hyperfine:
    1 warmup run followed by 2 timed runs. The local cache and Nix's
    SQLite metadata are cleared between runs.
    \bitem{Metrics:} Mean duration (seconds), standard deviation,
    min/max duration.
\end{itemize}
\section{Network Impairment Profiles}
Four impairment profiles simulate a range of network conditions, from
@@ -215,14 +258,21 @@ effective round-trip impairment is approximately doubled.
\textbf{Profile} & \textbf{Latency} & \textbf{Jitter} &
\textbf{Loss} & \textbf{Reorder} & \textbf{Correlation} \\
\hline
Baseline & - & - & - & - & - \\
Low & 2 ms & 2 ms & 0.25\% & 0.5\% & 25\% \\
Medium & 4 ms & 7 ms & 1.0\% & 2.5\% & 50\% \\
High & 6 ms & 15 ms & 2.5\% & 5\% & 50\% \\
\hline
\end{tabular}
\end{table}
The correlation column controls how strongly each packet's impairment
depends on the preceding packet. At 0\% correlation, loss and
reordering events are independent; at higher values they occur in
bursts, because a packet that was lost or reordered increases the
probability that the next packet suffers the same fate. This produces
realistic bursty degradation rather than uniformly distributed drops.
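A profile row from the table maps directly onto a \texttt{tc netem} invocation. The sketch below shows one plausible encoding; the dictionary layout and function name are illustrative assumptions, not the orchestrator's actual data model.

```python
# Profile values from the table; the encoding is an illustrative assumption.
PROFILES = {
    "low":    {"delay": "2ms", "jitter": "2ms",  "loss": "0.25%", "reorder": "0.5%", "corr": "25%"},
    "medium": {"delay": "4ms", "jitter": "7ms",  "loss": "1.0%",  "reorder": "2.5%", "corr": "50%"},
    "high":   {"delay": "6ms", "jitter": "15ms", "loss": "2.5%",  "reorder": "5%",   "corr": "50%"},
}

def netem_command(iface: str, profile: str) -> list[str]:
    """Build the `tc qdisc` invocation for one impairment profile."""
    p = PROFILES[profile]
    return ["tc", "qdisc", "add", "dev", iface, "root", "netem",
            "delay", p["delay"], p["jitter"], p["corr"],   # latency, jitter, correlation
            "loss", p["loss"], p["corr"],                  # correlated (bursty) loss
            "reorder", p["reorder"], p["corr"]]            # correlated reordering
```

Note that netem's \texttt{reorder} option only takes effect when a delay is configured, which all three non-baseline profiles satisfy.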
The ``Low'' profile approximates a well-provisioned continental
connection, ``Medium'' represents intercontinental links or congested
networks, and ``High'' simulates severely degraded conditions such as
@@ -249,12 +299,70 @@ The benchmark suite is fully automated via a Python orchestrator
\begin{enumerate}
    \item Applies TC rules via context manager (guarantees cleanup)
    \item Waits 30 seconds for stabilization
    \item Executes each benchmark three times sequentially,
    once per machine pair: $A\to B$, then
    $B\to C$, lastly $C\to A$
    \item Clears TC rules
\end{enumerate}
\item Collects results and metadata
\end{enumerate}
Figure~\ref{fig:orchestrator_flow} illustrates this procedure as a
flowchart.
\begin{figure}[H]
\centering
\begin{tikzpicture}[
box/.style={
draw, rounded corners, minimum width=4.8cm, minimum height=0.9cm,
font=\small, align=center, fill=white
},
decision/.style={
draw, diamond, aspect=2.5, minimum width=3cm,
font=\small, align=center, fill=white, inner sep=1pt
},
arr/.style={->, thick},
every node/.style={font=\small}
]
% Main flow
\node[box] (clean) at (0, 0) {Clean state directories};
\node[box] (deploy) at (0, -1.5) {Deploy VPN via Clan};
\node[box] (restart) at (0, -3) {Restart VPN services\\(up to 3 attempts)};
\node[box] (verify) at (0, -4.5) {Verify connectivity\\(120\,s timeout)};
% Inner loop
\node[decision] (profile) at (0, -6.3) {Next impairment\\profile?};
\node[box] (tc) at (0, -8.3) {Apply TC rules};
\node[box] (wait) at (0, -9.8) {Wait 30\,s};
\node[box] (bench) at (0, -11.3) {Run benchmarks\\$A{\to}B,\;
B{\to}C,\; C{\to}A$};
\node[box] (clear) at (0, -12.8) {Clear TC rules};
% After loop
\node[box] (collect) at (0, -14.8) {Collect results};
% Arrows -- main spine
\draw[arr] (clean) -- (deploy);
\draw[arr] (deploy) -- (restart);
\draw[arr] (restart) -- (verify);
\draw[arr] (verify) -- (profile);
\draw[arr] (profile) -- node[right] {yes} (tc);
\draw[arr] (tc) -- (wait);
\draw[arr] (wait) -- (bench);
\draw[arr] (bench) -- (clear);
% Loop back
\draw[arr] (clear) -- ++(3.8, 0) |- (profile);
% Exit loop
\draw[arr] (profile) -- ++(-3.2, 0) node[above, pos=0.3] {no}
|- (collect);
\end{tikzpicture}
\caption{Flowchart of the benchmark orchestrator procedure for a
single VPN}
\label{fig:orchestrator_flow}
\end{figure}
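The ``Apply TC rules'' / ``Clear TC rules'' steps rely on a context manager that guarantees cleanup. A minimal sketch, where the \texttt{run} parameter is injected for testability and the function name is hypothetical:

```python
import subprocess
from contextlib import contextmanager

@contextmanager
def tc_rules(iface: str, netem_args: list[str], run=subprocess.run):
    """Apply a netem qdisc on entry and always remove it on exit,
    so a failing benchmark cannot leave impairment rules behind."""
    run(["tc", "qdisc", "add", "dev", iface, "root", "netem", *netem_args],
        check=True)
    try:
        yield
    finally:
        # cleanup runs even if the benchmark raised or was interrupted
        run(["tc", "qdisc", "del", "dev", iface, "root", "netem"], check=False)
```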
\subsection{Retry Logic}
Tests use a retry wrapper with up to 2 retries (3 total attempts),
@@ -267,18 +375,35 @@ be identified during analysis.
Each metric is summarized as a statistics dictionary containing:
\begin{itemize}
    \bitem{min / max:} Extreme values observed
    \bitem{average:} Arithmetic mean across samples
    \bitem{p25 / p50 / p75:} Quartiles via Python's
    \texttt{statistics.quantiles()} function
\end{itemize}
Aggregation differs by benchmark type. Benchmarks that execute
multiple discrete runs, namely ping (3 runs of 100 packets each) and
nix-cache (2 timed runs via hyperfine), first compute statistics
within each run, then average the resulting statistics across runs.
Concretely, if ping produces three runs with mean RTTs of
5.1, 5.3, and 5.0\,ms, the reported average is the mean of
those three values (5.13\,ms). The reported minimum is the
single lowest RTT observed across all three runs.

Benchmarks that produce continuous per-second samples, such as qperf
and RIST streaming, pool all per-second measurements from a single
execution into one series before computing statistics. For qperf,
bandwidth is first summed across CPU cores for each second, and
statistics are then computed over the resulting time series.
The analysis reports empirical percentiles (p25, p50, p75) alongside
min/max bounds rather than parametric confidence intervals. This
choice is deliberate: benchmark latency and throughput distributions
are often skewed or multimodal, making assumptions of normality
unreliable. The interquartile range (p25--p75) conveys the spread of
typical observations, while min and max capture outlier behavior.
The nix-cache benchmark additionally reports standard deviation via
hyperfine's built-in statistical output.
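The aggregation scheme just described can be made concrete with Python's \texttt{statistics} module. The function names below are hypothetical helpers, not the thesis's actual code, and \texttt{quantiles()} is used with its default exclusive method.

```python
import statistics

def summarize(samples: list[float]) -> dict:
    """Per-series summary: min/max, mean, and empirical quartiles."""
    q1, q2, q3 = statistics.quantiles(samples, n=4)
    return {"min": min(samples), "max": max(samples),
            "average": statistics.mean(samples),
            "p25": q1, "p50": q2, "p75": q3}

def aggregate_runs(per_run: list[list[float]]) -> dict:
    """Multi-run aggregation: statistics within each run, then averaged
    across runs; min/max are taken over all samples of all runs."""
    run_stats = [summarize(r) for r in per_run]
    agg = {k: statistics.mean(s[k] for s in run_stats) for k in run_stats[0]}
    agg["min"] = min(min(r) for r in per_run)
    agg["max"] = max(max(r) for r in per_run)
    return agg
```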
\section{Source Code Analysis}
@@ -345,17 +470,17 @@ cryptographic hashes (\texttt{narHash}) and commit SHAs for each input.
Key pinned inputs include:
\begin{itemize}
    \bitem{nixpkgs:} Follows \texttt{clan-core/nixpkgs}, ensuring a
    single version across the dependency graph
    \bitem{clan-core:} The Clan framework, pinned to a specific commit
    \bitem{VPN sources:} Hyprspace, EasyTier, Nebula locked to
    exact commits
    \bitem{Build infrastructure:} flake-parts, treefmt-nix, disko,
    nixos-facter-modules
\end{itemize}
Custom packages not in nixpkgs (qperf, VpnCloud, iperf with auth patches,
EasyTier, Hyprspace) are built from source within the flake.
\subsection{Declarative System Configuration}

View File

@@ -59,6 +59,8 @@
\usepackage{svg}
\usepackage{acronym}
\usepackage{subcaption} % For subfigures
\usepackage{tikz}
\usetikzlibrary{shapes.geometric}
\usepackage[backend=bibtex,style=numeric,natbib=true]{biblatex} %
% Use the bibtex backend with the numeric citation style (which
@@ -70,7 +72,10 @@
\usepackage[autostyle=true]{csquotes} % Required to generate
% language-dependent quotes in the bibliography

\newcommand{\bitem}[1]{
    \item \textbf{#1}}
\setcounter{secnumdepth}{1} % Only number chapters and sections, not subsections
%---------------------------------------------------------------------------------------- %----------------------------------------------------------------------------------------
% MARGIN SETTINGS % MARGIN SETTINGS
@@ -333,8 +338,8 @@ and Management}} % Your department's name and URL, this is used in
% Include the chapters of the thesis as separate files from the Chapters folder
% Uncomment the lines as you write the chapters
\include{Chapters/Introduction}
\include{Chapters/Preliminaries}
\include{Chapters/Methodology}
%\include{Chapters/Chapter1}
%\include{Chapters/Chapter2}