Compare commits

...

3 Commits

Author SHA1 Message Date
910a7b2a81 several fixups discussed on tuesday 2026-03-06 17:56:36 +01:00
f29d810240 several fixups discussed on tuesday 2026-03-06 17:56:30 +01:00
0a0ca0800a several fixups discussed on tuesday 2026-03-06 17:53:26 +01:00
7 changed files with 127 additions and 131 deletions

View File

@@ -1,4 +1,4 @@
\chapter{Conclusion} % Main chapter title
\label{Conclusion}

View File

@@ -1,4 +1,4 @@
\chapter{Discussion} % Main chapter title
\label{Discussion}

View File

@@ -72,7 +72,6 @@ and reordering. These impairments are applied symmetrically on all
machines, meaning effective round-trip impairment is approximately
double the per-machine values.
\subsection{Configuration Methodology}
Each VPN is built from source within the Nix flake, ensuring that all
@@ -82,7 +81,7 @@ under \texttt{pkgs/} in the flake.
Cryptographic material (WireGuard keys, Nebula certificates, ZeroTier
identities) is generated deterministically via Clan's vars generator
system.
Generated keys are stored in version control under
\texttt{vars/per-machine/\{name\}/} and read at NixOS evaluation time,
@@ -93,109 +92,92 @@ making key material part of the reproducible configuration.
The benchmark suite includes both synthetic throughput tests and
real-world workloads. This combination addresses a limitation of prior
work that relied exclusively on iperf3.
Table~\ref{tab:benchmark_suite} summarizes each benchmark.
\begin{table}[H]
\centering
\caption{Benchmark suite overview}
\label{tab:benchmark_suite}
\begin{tabular}{llll}
\hline
\textbf{Benchmark} & \textbf{Protocol} & \textbf{Duration} &
\textbf{Key Metrics} \\
\hline
Ping & ICMP & 3 runs $\times$ 100 pkts & RTT, packet loss \\
TCP iPerf3 & TCP & 30 s & Throughput, retransmits, CPU \\
UDP iPerf3 & UDP & 30 s & Throughput, jitter, packet loss \\
Parallel iPerf3 & TCP & 60 s & Throughput under contention \\
QPerf & QUIC & 30 s & Bandwidth, TTFB, conn. time \\
RIST Streaming & RIST & 30 s & Bitrate, dropped frames, RTT \\
Nix Cache Download & HTTP & 2 runs & Download duration \\
\hline
\end{tabular}
\end{table}
The first four benchmarks use well-known network testing tools.
The remaining three target workloads that are closer to real-world
usage. The subsections below describe the configuration details
that the table does not capture.
\subsection{Ping}
Sends 100 ICMP echo requests at 200\,ms intervals with a 1-second
per-packet timeout, repeated for 3 runs.
\subsection{TCP and UDP iPerf3}
Both tests run for 30 seconds in bidirectional mode with zero-copy
(\texttt{-Z}) to minimize CPU overhead. The UDP variant additionally
sets unlimited target bandwidth (\texttt{-b 0}) and enables 64-bit
counters.
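The Ping and iPerf3 invocations described above can be sketched as argument lists. This is a hedged sketch: "target" is a placeholder host name, and the thesis's actual wrapper scripts around these tools are not shown here.

```python
# Placeholder peer address; the real benchmark targets the VPN
# overlay address of the remote machine.
TARGET = "target"

# 100 echo requests, 200 ms apart, 1 s per-packet timeout (one run;
# the benchmark repeats this for 3 runs).
ping_cmd = ["ping", "-c", "100", "-i", "0.2", "-W", "1", TARGET]

# 30 s bidirectional TCP test with zero-copy to minimize CPU overhead.
tcp_cmd = ["iperf3", "-c", TARGET, "-t", "30", "--bidir", "-Z"]

# UDP variant: same flags plus unlimited target bandwidth and
# 64-bit counters.
udp_cmd = tcp_cmd + ["-u", "-b", "0", "--udp-counters-64bit"]
```

Each list would be handed to something like `subprocess.run(...)` by the benchmark harness.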
\subsection{Parallel iPerf3}
Runs TCP streams on all three machines simultaneously in a circular
pattern (A$\rightarrow$B, B$\rightarrow$C, C$\rightarrow$A) for
60 seconds with zero-copy (\texttt{-Z}). This creates contention
across the overlay network, stressing shared resources that
single-stream tests leave idle.
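The circular traffic pattern can be expressed compactly: each machine sends to its successor, with the last machine wrapping around to the first. Machine names here are illustrative.

```python
machines = ["A", "B", "C"]

# Pair each machine with its successor, wrapping around:
# A->B, B->C, C->A.
pairs = list(zip(machines, machines[1:] + machines[:1]))
# pairs == [('A', 'B'), ('B', 'C'), ('C', 'A')]
```

The same expression generalizes to any number of machines in the ring.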
\subsection{QPerf}
Spawns one qperf process per CPU core, each running for 30 seconds.
Per-core bandwidth is summed per second. Unlike the iPerf3 tests,
QPerf targets QUIC connection-level performance, capturing time to
first byte and connection establishment time alongside throughput.
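The per-second summation across cores can be sketched as follows; the sample values are illustrative, not measured data.

```python
# Per-core bandwidth samples in Mbps, one value per second of the run
# (illustrative numbers, not measured data).
per_core = {
    0: [480.0, 510.0, 495.0],
    1: [470.0, 500.0, 505.0],
}

# Total bandwidth for each second is the sum over all cores of that
# second's sample.
totals = [sum(second) for second in zip(*per_core.values())]
```

With the sample data above, `totals` is `[950.0, 1010.0, 1000.0]`.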
\subsection{RIST Video Streaming}
Generates a 4K ($3840\times2160$) H.264 test pattern at 30\,fps
(ultrafast preset, zerolatency tuning, 25\,Mbps target bitrate) with
ffmpeg and transmits it over the RIST protocol for 30 seconds. RIST
(Reliable Internet Stream Transport) is designed for low-latency
video contribution over unreliable networks, making it a realistic
test of VPN behavior under multimedia workloads. In addition to
standard network metrics, the benchmark records encoding-side
statistics (actual bitrate, frame rate, dropped frames) and
RIST-specific counters (packets recovered via retransmission, quality
score).
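A sender invocation matching this description could look roughly like the sketch below. The RIST URL (host and port) and the lavfi test source are assumptions based on the text, not the thesis's exact command; a real run also needs a matching receiver on the target machine.

```python
# Hypothetical ffmpeg sender sketch; "rist://target:1968" is a
# placeholder, not the thesis's actual receiver address.
sender_cmd = [
    "ffmpeg", "-re",
    # 4K test pattern at 30 fps from ffmpeg's lavfi source.
    "-f", "lavfi", "-i", "testsrc2=size=3840x2160:rate=30",
    # H.264, ultrafast preset, zerolatency tuning, 25 Mbps target.
    "-c:v", "libx264", "-preset", "ultrafast", "-tune", "zerolatency",
    "-b:v", "25M",
    # MPEG-TS muxing over the RIST protocol.
    "-f", "mpegts", "rist://target:1968",
]
```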
\subsection{Nix Cache Download}
A Harmonia Nix binary cache server on the target machine serves the
Firefox package. The client downloads it via \texttt{nix copy}
through the VPN, exercising many small HTTP requests rather than a
single bulk transfer. Benchmarked with hyperfine (1 warmup run,
2 timed runs); the local Nix store and SQLite metadata are cleared
between runs.
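The hyperfine invocation could be sketched like this. The cache URL, the store-cleanup prepare step, and the installable name are assumptions for illustration, not the thesis's exact setup.

```python
# Hypothetical benchmark sketch; URL, cleanup command, and installable
# are placeholders.
download = "nix copy --from http://target:5000 nixpkgs#firefox"
bench_cmd = [
    "hyperfine",
    "--warmup", "1",          # 1 warmup run
    "--runs", "2",            # 2 timed runs
    # Placeholder for the step that clears the local Nix store and
    # its SQLite metadata between runs.
    "--prepare", "clear-local-store",
    download,
]
```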
\section{Network Impairment Profiles}
To evaluate VPN performance under different network conditions, four
impairment profiles are defined, ranging from an unmodified baseline
to a severely degraded link. All impairments are injected with Linux
traffic control (\texttt{tc netem}) on the egress side of every
machine's primary interface.
Table~\ref{tab:impairment_profiles} lists the per-machine values.
Because impairments are applied on both ends of a connection, the
effective round-trip impact is roughly double the listed values.
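A netem invocation for a mild profile could look roughly like the sketch below (2 ms delay, 2 ms jitter, 0.25 % loss, as in the Low-profile examples discussed in the text). The interface name and the reorder percentage are placeholders, not the thesis's exact values.

```python
# Hypothetical tc netem sketch; "eth0" and the reorder value are
# placeholders.
netem_cmd = [
    "tc", "qdisc", "add", "dev", "eth0", "root", "netem",
    "delay", "2ms", "2ms",   # fixed delay + jitter
    "loss", "0.25%",         # independent random loss
    "reorder", "0.25%",      # requires delay to be set, as above
]
```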
\begin{table}[H]
\centering
@@ -214,12 +196,30 @@ effective round-trip impairment is approximately doubled.
\end{tabular}
\end{table}
Each column in Table~\ref{tab:impairment_profiles} controls one
aspect of the simulated degradation:
\begin{itemize}
\item \textbf{Latency} is a constant delay added to every outgoing
packet. For example, 2\,ms on each machine adds roughly 4\,ms to
the round trip.
\item \textbf{Jitter} introduces random variation on top of the
fixed latency. A packet on the Low profile may see anywhere
between 0 and 4\,ms of total added delay instead of exactly
2\,ms.
\item \textbf{Loss} is the fraction of packets that are silently
dropped. At 0.25\,\% (Low profile), roughly 1 in 400 packets is
discarded.
\item \textbf{Reorder} is the fraction of packets that arrive out
of sequence. \texttt{tc netem} achieves this by giving selected
packets a shorter delay than their predecessors, so they overtake
earlier packets.
\item \textbf{Correlation} determines whether impairment events are
independent or bursty. At 0\,\%, each packet's fate is decided
independently. At higher values, a packet that was lost or
reordered raises the probability that the next packet suffers the
same fate, producing the burst patterns typical of real networks.
\end{itemize}
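The effect of correlation can be illustrated with a simplified Markov-style model: with probability equal to the correlation, a packet repeats the previous packet's fate; otherwise it is dropped independently. This is a sketch, not netem's actual correlated-random-number algorithm, but it shows the key property: the long-run loss rate stays at the configured probability while drops group into bursts.

```python
import random

def simulate_loss(n, p, corr, seed=0):
    """Simplified correlated-loss sketch (not netem's algorithm).
    With probability `corr` a packet repeats the previous packet's
    fate; otherwise it is lost independently with probability `p`.
    The stationary loss rate remains p for any corr < 1."""
    rng = random.Random(seed)
    lost, prev = [], False
    for _ in range(n):
        prev = prev if rng.random() < corr else (rng.random() < p)
        lost.append(prev)
    return lost

def mean_burst(lost):
    """Average length of consecutive-loss runs."""
    bursts, run = [], 0
    for x in lost:
        if x:
            run += 1
        elif run:
            bursts.append(run)
            run = 0
    if run:
        bursts.append(run)
    return sum(bursts) / len(bursts) if bursts else 0.0

independent = simulate_loss(100_000, 0.0025, corr=0.0)
bursty = simulate_loss(100_000, 0.0025, corr=0.9)
```

Both traces lose about 0.25 % of packets, but the correlated trace concentrates those losses into much longer bursts.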
A 30-second stabilization period follows TC application before
measurements begin, allowing queuing disciplines to settle.
@@ -348,8 +348,6 @@ typical observations, while min and max capture outlier behavior.
The nix-cache benchmark additionally reports standard deviation via
hyperfine's built-in statistical output.
\section{Source Code Analysis}
To complement the performance benchmarks with architectural
@@ -517,8 +515,6 @@ wall-clock duration, number of attempts, VPN restart count and
duration, connectivity wait time, source and target machine names,
and on failure, the relevant service logs.
\section{VPNs Under Test}
VPNs were selected based on:
@@ -532,7 +528,6 @@ VPNs were selected based on:
\bitem{Linux support:} All VPNs must run on Linux.
\end{itemize}
Ten VPN implementations were selected for evaluation, spanning a range
of architectures from centralized coordination to fully decentralized
mesh topologies. Table~\ref{tab:vpn_selection} summarizes the selection.

View File

@@ -12,8 +12,6 @@ The chapter concludes with findings from the source code analysis.
\section{Baseline Performance}
% Under the baseline impairment profile (no added latency, loss, or
% reordering), the overhead introduced by each VPN relative to the
% internal (no VPN) baseline and WireGuard can be measured in isolation.
@@ -97,4 +95,4 @@ High impairment profiles defined in Chapter~\ref{Methodology}.
\section{Summary of Findings}
% Brief summary table or ranking of VPNs by key metrics.
% Save deeper interpretation for a Discussion chapter.

View File

@@ -0,0 +1,29 @@
%----------------------------------------------------------------------------------------
% GERMAN ABSTRACT PAGE
%----------------------------------------------------------------------------------------
\begingroup
\renewcommand{\abstractname}{Zusammenfassung}
\begin{abstract}
\addchaptertocentry{Zusammenfassung}
Diese Arbeit untersucht Peer-to-Peer-Mesh-VPNs mithilfe eines
reproduzierbaren, Nix-basierten Frameworks, das auf einem
Deployment-System namens Clan aufbaut. Wir evaluieren zehn
VPN-Implementierungen, darunter Tailscale (über Headscale),
Hyprspace, Nebula, Tinc und ZeroTier, unter vier
Netzwerkbeeinträchtigungsprofilen mit variierendem Paketverlust,
Paketumsortierung, Latenz und Jitter, was über 300 einzelne
Messungen in sieben Benchmarks ergibt.
Unsere Analyse zeigt, dass Tailscale unter beeinträchtigten
Bedingungen den Standard-Netzwerkstack des Linux-Kernels
übertrifft, was auf seinen Userspace-IP-Stack mit optimierten
Parametern zurückzuführen ist. Wir bestätigen dies, indem wir die
Benchmarks mit entsprechend angepassten Kernel-Parametern erneut
durchführen und vergleichbare Durchsatzgewinne beobachten. Die
Untersuchung deckte zudem eine kritische Sicherheitslücke in einem
der evaluierten VPNs auf.
\end{abstract}
\endgroup

View File

@@ -6,6 +6,7 @@ extend-exclude = [
"**/facter-report.nix",
"**/key.json",
"pkgs/clan-cli/clan_lib/machines/test_suggestions.py",
"Chapters/Zusammenfassung.tex",
]
[default.extend-words]

View File

@@ -95,7 +95,8 @@
% THESIS INFORMATION
%----------------------------------------------------------------------------------------
\thesistitle{An Analysis of P2P VPN Implementation} % Your thesis
% title, this is used in the title
% and abstract, print it elsewhere with \ttitle
%\supervisor{\textsc{Ber Lorke}} % Your supervisor's name, this is
% used in the title page, print it elsewhere with \supname
@@ -248,35 +249,7 @@ and Management}} % Your department's name and URL, this is used in
\end{abstract}
\input{Chapters/Zusammenfassung}
%----------------------------------------------------------------------------------------
% ACKNOWLEDGEMENTS