create charts for methodology section

2026-02-25 17:50:40 +01:00
parent c1c94fdf78
commit 841973f26f
2 changed files with 193 additions and 63 deletions

View File

@@ -21,11 +21,11 @@ All experiments were conducted on three bare-metal servers with
identical specifications:
\begin{itemize}
    \bitem{CPU:} Intel Model 94, 4 cores / 8 threads
    \bitem{Memory:} 64 GB RAM
    \bitem{Network:} 1 Gbps Ethernet (e1000e driver; one machine
    uses r8169)
    \bitem{Cryptographic acceleration:} AES-NI, AVX, AVX2, PCLMULQDQ,
    RDRAND, SSE4.2
\end{itemize}
@@ -36,9 +36,35 @@ may differ on systems without these features.
\subsection{Network Topology}
The three machines are connected via a direct 1 Gbps LAN on the same
network segment. Each machine has a publicly reachable IPv4 address,
which is used to deploy configuration changes via Clan. This baseline
topology provides a controlled environment with minimal latency and no
packet loss, allowing the overhead introduced by each VPN implementation
to be measured in isolation. Figure~\ref{fig:mesh_topology} illustrates
the full-mesh connectivity between the three machines.
\begin{figure}[H]
\centering
\begin{tikzpicture}[
node/.style={
draw, rounded corners, minimum width=2.2cm, minimum height=1cm,
font=\ttfamily\bfseries, align=center
},
link/.style={thick, <->}
]
% Nodes in an equilateral triangle
\node[node] (luna) at (0, 3.5) {luna};
\node[node] (yuki) at (-3, 0) {yuki};
\node[node] (lom) at (3, 0) {lom};
% Mesh links
\draw[link] (luna) -- node[left, font=\small] {1 Gbps} (yuki);
\draw[link] (luna) -- node[right, font=\small] {1 Gbps} (lom);
\draw[link] (yuki) -- node[below, font=\small] {1 Gbps} (lom);
\end{tikzpicture}
\caption{Full-mesh network topology of the three benchmark machines}
\label{fig:mesh_topology}
\end{figure}
To simulate real-world network conditions, Linux traffic control
(\texttt{tc netem}) is used to inject latency, jitter, packet loss,
@@ -85,20 +111,20 @@ for understanding the cost of mesh coordination and NAT traversal logic.
VPNs were selected based on:
\begin{itemize}
    \bitem{NAT traversal capability:} All selected VPNs can establish
    connections between peers behind NAT without manual port forwarding.
    \bitem{Decentralization:} Preference for solutions without mandatory
    central servers, though coordinated-mesh VPNs were included for comparison.
    \bitem{Active development:} Only VPNs with recent commits and
    maintained releases were considered.
    \bitem{Linux support:} All VPNs must run on Linux.
\end{itemize}
\subsection{Configuration Methodology}
Each VPN is built from source within the Nix flake, ensuring that all
dependencies are pinned to exact versions. VPNs not packaged in nixpkgs
(Hyprspace, EasyTier, VpnCloud) have dedicated build expressions
under \texttt{pkgs/} in the flake.
Cryptographic material (WireGuard keys, Nebula certificates, ZeroTier
@@ -122,43 +148,63 @@ work that relied exclusively on iperf3.
\subsection{Ping}
Measures ICMP round-trip latency and packet delivery reliability.
\begin{itemize}
    \bitem{Method:} 100 ICMP echo requests at 200 ms intervals,
    1-second per-packet timeout, repeated for 3 runs.
    \bitem{Metrics:} RTT (min, avg, max, mdev), packet loss percentage,
    per-packet RTTs.
\end{itemize}
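In the Python orchestrator, parsing the ping summary could look like the sketch below. This is illustrative only: the regexes target GNU/iputils ping output, and the function name is a hypothetical helper, not taken from the thesis's actual code.

```python
import re

# The orchestrator would invoke something like:
#   ping -c 100 -i 0.2 -W 1 <target>
# (the exact invocation is an assumption based on the parameters above)
def parse_ping_summary(output: str) -> dict:
    """Extract RTT statistics and packet loss from ping's summary lines."""
    rtt = re.search(r"= ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms", output)
    loss = re.search(r"([\d.]+)% packet loss", output)
    return {
        "rtt_min": float(rtt.group(1)),
        "rtt_avg": float(rtt.group(2)),
        "rtt_max": float(rtt.group(3)),
        "rtt_mdev": float(rtt.group(4)),
        "loss_pct": float(loss.group(1)),
    }
```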
\subsection{TCP iPerf3}
Measures bulk TCP throughput with iperf3, a tool commonly used in
networking research to measure network performance.
\begin{itemize}
    \bitem{Method:} 30-second bidirectional test in zero-copy mode
    (\texttt{-Z}) to minimize CPU overhead.
    \bitem{Metrics:} Throughput (bits/s), retransmits, congestion
    window, and CPU utilization.
\end{itemize}
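For illustration, the client invocation implied by this description might be assembled as below. The exact flag set the suite uses is an assumption; the long options shown are iperf3's equivalents of \texttt{-Z}, bidirectional mode, and JSON output.

```python
def iperf3_tcp_cmd(server: str, seconds: int = 30) -> list[str]:
    """Build an iperf3 client command matching the TCP benchmark description.

    The flag selection is an illustrative assumption, not the suite's code.
    """
    return ["iperf3", "--client", server, "--time", str(seconds),
            "--bidir",       # bidirectional test
            "--zerocopy",    # -Z: minimize CPU overhead
            "--json"]        # machine-readable output for the orchestrator
```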
\subsection{UDP iPerf3}
Measures bulk UDP throughput with the same flags as the TCP iPerf3 benchmark.
\begin{itemize}
    \bitem{Method:} As the TCP test, plus unlimited target bandwidth
    (\texttt{-b 0}) and 64-bit counters.
    \bitem{Metrics:} Throughput (bits/s), jitter, packet loss, and CPU
    utilization.
\end{itemize}
\subsection{Parallel iPerf3}
Tests concurrent overlay network traffic by running TCP streams on all
machines simultaneously in a circular pattern (A$\rightarrow$B,
B$\rightarrow$C, C$\rightarrow$A) for 60 seconds. This simulates
contention across the overlay network.
\begin{itemize}
    \bitem{Method:} 60-second bidirectional test in zero-copy mode
    (\texttt{-Z}) to minimize CPU overhead.
    \bitem{Metrics:} Throughput (bits/s), retransmits, congestion
    window, and CPU utilization.
\end{itemize}
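The circular pattern generalizes to any ring of machines. A minimal sketch, with a hypothetical helper name not drawn from the suite's code:

```python
def circular_pairs(machines: list[str]) -> list[tuple[str, str]]:
    """Sender/receiver pairs in a circular pattern: A->B, B->C, C->A.

    Each machine sends to the next one, and the last wraps to the first,
    so every machine is simultaneously a sender and a receiver.
    """
    return [(machines[i], machines[(i + 1) % len(machines)])
            for i in range(len(machines))]
```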
\subsection{QPerf}
Measures connection-level QUIC performance rather
than bulk UDP or TCP throughput.
\begin{itemize}
    \bitem{Method:} One qperf process per CPU core in parallel, each
    running for 30 seconds. Bandwidth from all cores is summed per second.
    \bitem{Metrics:} Total bandwidth (Mbps), CPU usage, time to first
    byte (TTFB), connection establishment time.
\end{itemize}
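The per-second summation across cores can be sketched as follows, assuming per-core sample lists as input; the real shape of qperf's output is not shown in this excerpt, so the data model here is an assumption.

```python
from collections import defaultdict

def sum_per_second(core_samples: dict[int, list[float]]) -> list[float]:
    """Sum per-second bandwidth samples (Mbps) across all CPU cores.

    core_samples maps core id -> list of that core's per-second bandwidth
    values. Returns one total-bandwidth value per second, in time order.
    """
    total = defaultdict(float)
    for samples in core_samples.values():
        for second, mbps in enumerate(samples):
            total[second] += mbps
    return [total[s] for s in sorted(total)]
```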
@@ -167,12 +213,12 @@ Measures connection-level performance rather than bulk throughput.
Measures real-time multimedia streaming performance.
\begin{itemize}
    \bitem{Method:} The sender generates a 4K ($3840\times2160$) test
    pattern at 30 fps using ffmpeg with H.264 encoding (ultrafast preset,
    zerolatency tuning) at 25 Mbps target bitrate. The stream is transmitted
    over the RIST protocol to a receiver on the target machine for 30 seconds.
    \bitem{Encoding metrics:} Actual bitrate, frame rate, dropped frames.
    \bitem{Network metrics:} Packets dropped, packets recovered via
    RIST retransmission, RTT, quality score (0--100), received bitrate.
\end{itemize}
@@ -182,22 +228,19 @@ realistic test of VPN behavior under multimedia workloads.
\subsection{Nix Cache Download}
Measures sustained HTTP download performance of many small files
using a real-world workload.
\begin{itemize}
    \bitem{Method:} A Harmonia Nix binary cache server on the target
    machine serves the Firefox package. The client downloads it via
    \texttt{nix copy} through the VPN. Benchmarked with hyperfine:
    1 warmup run followed by 2 timed runs. The local cache and Nix's
    SQLite metadata are cleared between runs.
    \bitem{Metrics:} Mean duration (seconds), standard deviation,
    min/max duration.
\end{itemize}
\section{Network Impairment Profiles}
Four impairment profiles simulate a range of network conditions, from
@@ -215,14 +258,21 @@ effective round-trip impairment is approximately doubled.
\textbf{Profile} & \textbf{Latency} & \textbf{Jitter} &
\textbf{Loss} & \textbf{Reorder} & \textbf{Correlation} \\
\hline
Baseline & - & - & - & - & - \\
Low & 2 ms & 2 ms & 0.25\% & 0.5\% & 25\% \\
Medium & 4 ms & 7 ms & 1.0\% & 2.5\% & 50\% \\
High & 6 ms & 15 ms & 2.5\% & 5\% & 50\% \\
\hline
\end{tabular}
\end{table}
The correlation column controls how strongly each packet's impairment
depends on the preceding packet. At 0\% correlation, loss and
reordering events are independent; at higher values they occur in
bursts, because a packet that was lost or reordered increases the
probability that the next packet suffers the same fate. This produces
realistic bursty degradation rather than uniformly distributed drops.
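A profile row from the table maps directly onto a \texttt{tc netem} invocation. The sketch below shows one plausible encoding; the dictionary layout and function name are illustrative assumptions, not the orchestrator's actual data model.

```python
# Profile values from the table; the encoding is an illustrative assumption.
PROFILES = {
    "low":    {"delay": "2ms", "jitter": "2ms",  "loss": "0.25%", "reorder": "0.5%", "corr": "25%"},
    "medium": {"delay": "4ms", "jitter": "7ms",  "loss": "1.0%",  "reorder": "2.5%", "corr": "50%"},
    "high":   {"delay": "6ms", "jitter": "15ms", "loss": "2.5%",  "reorder": "5%",   "corr": "50%"},
}

def netem_command(iface: str, profile: str) -> list[str]:
    """Build the `tc qdisc` invocation for one impairment profile."""
    p = PROFILES[profile]
    return ["tc", "qdisc", "add", "dev", iface, "root", "netem",
            "delay", p["delay"], p["jitter"], p["corr"],   # latency, jitter, correlation
            "loss", p["loss"], p["corr"],                  # correlated (bursty) loss
            "reorder", p["reorder"], p["corr"]]            # correlated reordering
```

Note that netem's \texttt{reorder} option only takes effect when a delay is configured, which all three non-baseline profiles satisfy.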
The ``Low'' profile approximates a well-provisioned continental
connection, ``Medium'' represents intercontinental links or congested
networks, and ``High'' simulates severely degraded conditions such as
@@ -249,12 +299,70 @@ The benchmark suite is fully automated via a Python orchestrator
\begin{enumerate}
    \item Applies TC rules via context manager (guarantees cleanup)
    \item Waits 30 seconds for stabilization
    \item Executes each benchmark three times sequentially,
    once per machine pair: $A\to B$, then
    $B\to C$, lastly $C\to A$
    \item Clears TC rules
\end{enumerate}
\item Collects results and metadata
\end{enumerate}
Figure~\ref{fig:orchestrator_flow} illustrates this procedure as a
flowchart.
\begin{figure}[H]
\centering
\begin{tikzpicture}[
box/.style={
draw, rounded corners, minimum width=4.8cm, minimum height=0.9cm,
font=\small, align=center, fill=white
},
decision/.style={
draw, diamond, aspect=2.5, minimum width=3cm,
font=\small, align=center, fill=white, inner sep=1pt
},
arr/.style={->, thick},
every node/.style={font=\small}
]
% Main flow
\node[box] (clean) at (0, 0) {Clean state directories};
\node[box] (deploy) at (0, -1.5) {Deploy VPN via Clan};
\node[box] (restart) at (0, -3) {Restart VPN services\\(up to 3 attempts)};
\node[box] (verify) at (0, -4.5) {Verify connectivity\\(120\,s timeout)};
% Inner loop
\node[decision] (profile) at (0, -6.3) {Next impairment\\profile?};
\node[box] (tc) at (0, -8.3) {Apply TC rules};
\node[box] (wait) at (0, -9.8) {Wait 30\,s};
\node[box] (bench) at (0, -11.3) {Run benchmarks\\$A{\to}B,\;
B{\to}C,\; C{\to}A$};
\node[box] (clear) at (0, -12.8) {Clear TC rules};
% After loop
\node[box] (collect) at (0, -14.8) {Collect results};
% Arrows -- main spine
\draw[arr] (clean) -- (deploy);
\draw[arr] (deploy) -- (restart);
\draw[arr] (restart) -- (verify);
\draw[arr] (verify) -- (profile);
\draw[arr] (profile) -- node[right] {yes} (tc);
\draw[arr] (tc) -- (wait);
\draw[arr] (wait) -- (bench);
\draw[arr] (bench) -- (clear);
% Loop back
\draw[arr] (clear) -- ++(3.8, 0) |- (profile);
% Exit loop
\draw[arr] (profile) -- ++(-3.2, 0) node[above, pos=0.3] {no}
|- (collect);
\end{tikzpicture}
\caption{Flowchart of the benchmark orchestrator procedure for a
single VPN}
\label{fig:orchestrator_flow}
\end{figure}
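The ``Apply TC rules'' / ``Clear TC rules'' steps rely on a context manager that guarantees cleanup. A minimal sketch, where the \texttt{run} parameter is injected for testability and the function name is hypothetical:

```python
import subprocess
from contextlib import contextmanager

@contextmanager
def tc_rules(iface: str, netem_args: list[str], run=subprocess.run):
    """Apply a netem qdisc on entry and always remove it on exit,
    so a failing benchmark cannot leave impairment rules behind."""
    run(["tc", "qdisc", "add", "dev", iface, "root", "netem", *netem_args],
        check=True)
    try:
        yield
    finally:
        # cleanup runs even if the benchmark raised or was interrupted
        run(["tc", "qdisc", "del", "dev", iface, "root", "netem"], check=False)
```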
\subsection{Retry Logic}
Tests use a retry wrapper with up to 2 retries (3 total attempts),
@@ -267,18 +375,35 @@ be identified during analysis.
Each metric is summarized as a statistics dictionary containing:
\begin{itemize}
    \bitem{min / max:} Extreme values observed
    \bitem{average:} Arithmetic mean across samples
    \bitem{p25 / p50 / p75:} Quartiles via Python's
    \texttt{statistics.quantiles()} function
\end{itemize}
Aggregation differs by benchmark type. Benchmarks that execute
multiple discrete runs, namely ping (3 runs of 100 packets each) and
nix-cache (2 timed runs via hyperfine), first compute statistics
within each run, then average the resulting statistics across runs.
Concretely, if ping produces three runs with mean RTTs of
5.1, 5.3, and 5.0\,ms, the reported average is the mean of
those three values (5.13\,ms). The reported minimum is the
single lowest RTT observed across all three runs.

Benchmarks that produce continuous per-second samples, such as qperf
and RIST streaming, pool all per-second measurements from a single
execution into one series before computing statistics. For qperf,
bandwidth is first summed across CPU cores for each second, and
statistics are then computed over the resulting time series.
The analysis reports empirical percentiles (p25, p50, p75) alongside
min/max bounds rather than parametric confidence intervals. This
choice is deliberate: benchmark latency and throughput distributions
are often skewed or multimodal, making assumptions of normality
unreliable. The interquartile range (p25--p75) conveys the spread of
typical observations, while min and max capture outlier behavior.
The nix-cache benchmark additionally reports standard deviation via
hyperfine's built-in statistical output.
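The aggregation scheme just described can be made concrete with Python's \texttt{statistics} module. The function names below are hypothetical helpers, not the thesis's actual code, and \texttt{quantiles()} is used with its default exclusive method.

```python
import statistics

def summarize(samples: list[float]) -> dict:
    """Per-series summary: min/max, mean, and empirical quartiles."""
    q1, q2, q3 = statistics.quantiles(samples, n=4)
    return {"min": min(samples), "max": max(samples),
            "average": statistics.mean(samples),
            "p25": q1, "p50": q2, "p75": q3}

def aggregate_runs(per_run: list[list[float]]) -> dict:
    """Multi-run aggregation: statistics within each run, then averaged
    across runs; min/max are taken over all samples of all runs."""
    run_stats = [summarize(r) for r in per_run]
    agg = {k: statistics.mean(s[k] for s in run_stats) for k in run_stats[0]}
    agg["min"] = min(min(r) for r in per_run)
    agg["max"] = max(max(r) for r in per_run)
    return agg
```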
\section{Source Code Analysis}
@@ -345,17 +470,17 @@ cryptographic hashes (\texttt{narHash}) and commit SHAs for each input.
Key pinned inputs include:
\begin{itemize}
    \bitem{nixpkgs:} Follows \texttt{clan-core/nixpkgs}, ensuring a
    single version across the dependency graph
    \bitem{clan-core:} The Clan framework, pinned to a specific commit
    \bitem{VPN sources:} Hyprspace, EasyTier, Nebula locked to
    exact commits
    \bitem{Build infrastructure:} flake-parts, treefmt-nix, disko,
    nixos-facter-modules
\end{itemize}
Custom packages not in nixpkgs (qperf, VpnCloud, iperf with auth patches,
EasyTier, Hyprspace) are built from source within the flake.
\subsection{Declarative System Configuration}

View File

@@ -59,6 +59,8 @@
\usepackage{svg}
\usepackage{acronym}
\usepackage{subcaption} % For subfigures
\usepackage{tikz}
\usetikzlibrary{shapes.geometric}
\usepackage[backend=bibtex,style=numeric,natbib=true]{biblatex} %
% Use the bibtex backend with the numeric citation style (which
@@ -70,7 +72,10 @@
\usepackage[autostyle=true]{csquotes} % Required to generate
% language-dependent quotes in the bibliography

\newcommand{\bitem}[1]{
    \item \textbf{#1}}
\setcounter{secnumdepth}{1} % Only number chapters and sections, not subsections
%---------------------------------------------------------------------------------------- %----------------------------------------------------------------------------------------
% MARGIN SETTINGS % MARGIN SETTINGS
@@ -333,8 +338,8 @@ and Management}} % Your department's name and URL, this is used in
% Include the chapters of the thesis as separate files from the Chapters folder
% Uncomment the lines as you write the chapters
\include{Chapters/Introduction}
\include{Chapters/Preliminaries}
\include{Chapters/Methodology}
%\include{Chapters/Chapter1}
%\include{Chapters/Chapter2}