diff --git a/Chapters/Methodology.tex b/Chapters/Methodology.tex index 3f9235f..e93c7dd 100755 --- a/Chapters/Methodology.tex +++ b/Chapters/Methodology.tex @@ -2,243 +2,386 @@ \chapter{Methodology} % Main chapter title -\label{Methodology} % Change X to a consecutive number; for -% referencing this chapter elsewhere, use \ref{ChapterX} +\label{Methodology} -%---------------------------------------------------------------------------------------- -% SECTION 1 -%---------------------------------------------------------------------------------------- +This chapter describes the methodology used to benchmark peer-to-peer +overlay VPN implementations. The experimental design prioritizes +reproducibility at every layer---from dependency management to network +conditions---enabling independent verification of results and +facilitating future comparative studies. -This chapter describes the methodology used to evaluate and analyze -the Clan framework. A summary of the logical flow of this research is -depicted in Figure \ref{fig:clan_thesis_argumentation_tree}. +\section{Experimental Setup} -\begin{figure}[H] - \centering - \includesvg[width=1\textwidth, - keepaspectratio]{Figures/clan_thesis_argumentation_tree.drawio.svg} - \caption{Argumentation Tree for the Clan Thesis} - \label{fig:clan_thesis_argumentation_tree} -\end{figure} +\subsection{Hardware Configuration} -The structure of this study adopts a multi-faceted approach, -addressing several interrelated challenges in enhancing the -reliability and manageability of \ac{P2P} networks. -The primary objective is to assess how the Clan framework effectively -addresses these challenges. +All experiments were conducted on three bare-metal servers with +nominally identical specifications: -The research methodology consists of two main components: -\begin{enumerate} - \item \textbf{Development of a Theoretical Model} \\ - A theoretical model of the Clan framework will be constructed.
- This includes a formal specification of the system's foundational - axioms, outlining the principles and properties that guide its - design. From these axioms, key theorems will be derived, along - with their boundary conditions. The aim is to understand the - mechanisms underpinning the framework and establish a basis for - its evaluation. - - \item \textbf{Empirical Validation of the Theoretical Model} \\ - Practical experiments will be conducted to validate the - predictions of the theoretical model. These experiments will - evaluate how well the model aligns with observed performance in - real-world settings. This step is crucial to identifying the - model’s strengths and limitations. -\end{enumerate} - -The methodology will particularly examine three core components of -the Clan framework: \begin{itemize} - \item \textbf{Clan Deployment System} \\ - The deployment system is the core of the Clan framework, enabling - the configuration and management of distributed software - components. It simplifies complex configurations through Python - code, which abstracts the intricacies of the Nix language. - Central to this system is the "inventory," a mergeable data - structure designed for ensuring consistent service configurations - across nodes without conflicts. This component will be analyzed - for its design, functionality, efficiency, scalability, and fault - resilience. - - \item \textbf{Overlay Networks / Mesh VPNs} \\ - Overlay networks, also known as "Mesh VPNs," are critical for - secure communication in Clan’s \ac{P2P} deployment. The study - will evaluate their performance in terms of security, - scalability, and resilience to network disruptions. Specifically, - the assessment will include how well these networks handle - traffic in environments where no device has a public IP address, - as well as the impact of node failures on overall - connectivity. 
The analysis will focus on: - \begin{itemize} - \item \textbf{ZeroTier}: A globally distributed "Ethernet Switch". - \item \textbf{Mycelium}: An end-to-end encrypted IPv6 overlay network. - \item \textbf{Hyprspace}: A lightweight VPN leveraging IPFS and libp2p. - \end{itemize} - - Other Mesh VPN solutions may be considered as comparison: - \begin{itemize} - \item \textbf{Tailscale}: A secure network for teams. - \item \textbf{Nebula Lightouse}: A scalable overlay networking - tool with a focus on performance - \end{itemize} - \item \textbf{Data Mesher} \\ - The Data Mesher is responsible for data synchronization across - nodes, ensuring eventual consistency in Clan’s decentralized network. This - component will be evaluated for synchronization speed, fault - tolerance, and conflict resolution mechanisms. Additionally, it - will be analyzed for its resilience in scenarios involving - malicious nodes, measuring how effectively it prevents and - mitigates manipulation or integrity violations during data - replication and distribution. + \item \textbf{CPU:} Intel (CPUID model 94, Skylake generation), 4 cores / 8 threads + \item \textbf{Memory:} 64 GB RAM + \item \textbf{Network:} 1 Gbps Ethernet (e1000e driver; one machine uses r8169) + \item \textbf{Cryptographic acceleration:} AES-NI, AVX, AVX2, PCLMULQDQ, + RDRAND, SSE4.2 \end{itemize} -\section{Related Work} +The presence of hardware cryptographic acceleration is relevant because +many VPN implementations leverage AES-NI for encryption, and the results +may differ on systems without these features. -The Clan framework operates within the realm of software deployment -and peer-to-peer networking, -necessitating a deep understanding of existing methodologies in these -areas to tackle contemporary challenges. -This section will discuss related works encompassing system -deployment, peer data management, -and low maintenance structured peer-to-peer overlays, which inform -the development and positioning of the Clan framework.
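+The listed flags can be verified directly on each machine; a minimal
+check against \texttt{/proc/cpuinfo} (flag spellings as reported by the
+Linux kernel) is:
+
+\begin{verbatim}
+# List the crypto-relevant CPU flags advertised by the kernel
+grep -o -w -E 'aes|avx2|avx|pclmulqdq|rdrand|sse4_2' /proc/cpuinfo \
+  | sort -u
+\end{verbatim}
+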
+\subsection{Network Topology} + +The three machines are connected via a direct 1 Gbps LAN on the same +network segment. This baseline topology provides a controlled environment +with minimal latency and no packet loss, allowing the overhead introduced +by each VPN implementation to be measured in isolation. + +To simulate real-world network conditions, Linux traffic control +(\texttt{tc netem}) is used to inject latency, jitter, packet loss, +and reordering. These impairments are applied symmetrically on all +machines, meaning effective round-trip impairment is approximately +double the per-machine values. + +\section{VPNs Under Test} + +Ten VPN implementations were selected for evaluation, spanning a range +of architectures from centralized coordination to fully decentralized +mesh topologies. Table~\ref{tab:vpn_selection} summarizes the selection. + +\begin{table}[H] + \centering + \caption{VPN implementations included in the benchmark} + \label{tab:vpn_selection} + \begin{tabular}{lll} + \hline + \textbf{VPN} & \textbf{Architecture} & \textbf{Notes} \\ + \hline + Tailscale (Headscale) & Coordinated mesh & Open-source coordination server \\ + ZeroTier & Coordinated mesh & Global virtual Ethernet \\ + Nebula & Lighthouse-based mesh & Slack's overlay network \\ + Tinc & Decentralized mesh & Established since 1998 \\ + Yggdrasil & Fully decentralized & Spanning-tree routing \\ + Mycelium & Fully decentralized & End-to-end encrypted IPv6 overlay \\ + Hyprspace & Fully decentralized & libp2p-based, IPFS-compatible \\ + EasyTier & Decentralized mesh & Rust-based, multi-protocol \\ + VpnCloud & Decentralized mesh & Lightweight, kernel bypass option \\ + WireGuard & Point-to-point & Reference baseline (not a mesh VPN) \\ + \hline + Internal (no VPN) & N/A & Baseline for raw network performance \\ + \hline + \end{tabular} +\end{table} + +WireGuard is included as a reference point despite not being a mesh VPN. 
+Its minimal overhead and widespread adoption make it a useful comparison +for understanding the cost of mesh coordination and NAT traversal logic. + +\subsection{Selection Criteria} + +VPNs were selected based on: +\begin{itemize} + \item \textbf{NAT traversal capability:} All selected VPNs can establish + connections between peers behind NAT without manual port forwarding. + \item \textbf{Decentralization:} Preference for solutions without mandatory + central servers, though coordinated-mesh VPNs were included for comparison. + \item \textbf{Active development:} Only VPNs with recent commits and + maintained releases were considered. + \item \textbf{Linux support:} All VPNs must run on Linux. +\end{itemize} + +\subsection{Configuration Methodology} + +Each VPN is built from source within the Nix flake, ensuring that all +dependencies are pinned to exact versions. Components not packaged in nixpkgs +(the VPNs Hyprspace, EasyTier, and VpnCloud, as well as the qperf benchmark +tool) have dedicated build expressions +under \texttt{pkgs/} in the flake. + +Cryptographic material (WireGuard keys, Nebula certificates, ZeroTier +identities) is generated deterministically via Clan's vars generator +system. For example, WireGuard keys are generated as: + +\begin{verbatim} +wg genkey > "$out/private-key" +wg pubkey < "$out/private-key" > "$out/public-key" +\end{verbatim} + +Generated keys are stored in version control under +\texttt{vars/per-machine/\{name\}/} and read at NixOS evaluation time, +making key material part of the reproducible configuration. + +\section{Benchmark Suite} + +The benchmark suite includes both synthetic throughput tests and +real-world workloads. This combination addresses a limitation of prior +work that relied exclusively on iperf3. + +\subsection{Ping} + +Measures round-trip latency and packet delivery reliability. + +\begin{itemize} + \item \textbf{Method:} 100 ICMP echo requests at 200 ms intervals, + 1-second per-packet timeout, repeated for 3 runs.
+ \item \textbf{Metrics:} RTT (min, avg, max, mdev), packet loss percentage, + per-packet RTTs. +\end{itemize} + +\subsection{iPerf3} + +Measures bulk data transfer throughput. + +\textbf{TCP variant:} 30-second bidirectional test with RSA authentication +and zero-copy mode (\texttt{-Z}) to minimize CPU overhead. + +\textbf{UDP variant:} Same configuration with unlimited target bandwidth +(\texttt{-b 0}) and 64-bit counters. + +\textbf{Parallel TCP variant:} Tests concurrent mesh traffic by running +TCP streams on all machines simultaneously in a circular pattern +(A$\rightarrow$B, B$\rightarrow$C, C$\rightarrow$A) for 60 seconds. +This simulates contention across the mesh. + +\begin{itemize} + \item \textbf{Metrics:} Throughput (bits/s), retransmits, congestion window, + jitter (UDP), packet loss (UDP). +\end{itemize} + +\subsection{qPerf} + +Measures connection-level performance rather than bulk throughput. + +\begin{itemize} + \item \textbf{Method:} One qperf instance per CPU core in parallel, each + running for 30 seconds. Bandwidth from all cores is summed per second. + \item \textbf{Metrics:} Total bandwidth (Mbps), CPU usage, time to first + byte (TTFB), connection establishment time. +\end{itemize} + +\subsection{RIST Video Streaming} + +Measures real-time multimedia streaming performance. + +\begin{itemize} + \item \textbf{Method:} The sender generates a 4K (3840$\times$2160) test + pattern at 30 fps using ffmpeg with H.264 encoding (ultrafast preset, + zerolatency tuning) at 25 Mbps target bitrate. The stream is transmitted + over the RIST protocol to a receiver on the target machine for 30 seconds. + \item \textbf{Encoding metrics:} Actual bitrate, frame rate, dropped frames. + \item \textbf{Network metrics:} Packets dropped, packets recovered via + RIST retransmission, RTT, quality score (0--100), received bitrate. 
+\end{itemize} + +RIST (Reliable Internet Stream Transport) is a protocol designed for +low-latency video contribution over unreliable networks, making it a +realistic test of VPN behavior under multimedia workloads. + +\subsection{Nix Cache Download} + +Measures sustained download performance using a real-world workload. + +\begin{itemize} + \item \textbf{Method:} A Harmonia Nix binary cache server on the target + machine serves the Firefox package. The client downloads it via + \texttt{nix copy} through the VPN. Benchmarked with hyperfine: + 1 warmup run followed by 2 timed runs. The local cache and Nix's + SQLite metadata are cleared between runs. + \item \textbf{Metrics:} Mean duration (seconds), standard deviation, + min/max duration. +\end{itemize} + +This benchmark tests realistic HTTP traffic patterns and sustained +sequential download performance, complementing the synthetic throughput +tests. + +\section{Network Impairment Profiles} + +Four impairment profiles simulate a range of network conditions, from +ideal to severely degraded. Impairments are applied via Linux traffic +control (\texttt{tc netem}) on every machine's primary interface. +Table~\ref{tab:impairment_profiles} shows the per-machine values; +effective round-trip impairment is approximately doubled. 
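+As a concrete illustration, the ``Medium'' profile corresponds roughly
+to the following \texttt{netem} invocation. The interface name
+\texttt{eth0} is a placeholder, and attaching the correlation value to
+the delay and reorder parameters is an assumption about the exact
+\texttt{netem} mapping:
+
+\begin{verbatim}
+# Apply the "Medium" profile to one machine's egress
+tc qdisc add dev eth0 root netem delay 4ms 7ms 50% \
+   loss 1% reorder 2.5% 50%
+
+# Remove all impairments again
+tc qdisc del dev eth0 root
+\end{verbatim}
+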
+ +\begin{table}[H] + \centering + \caption{Network impairment profiles (per-machine egress values)} + \label{tab:impairment_profiles} + \begin{tabular}{lccccc} + \hline + \textbf{Profile} & \textbf{Latency} & \textbf{Jitter} & + \textbf{Loss} & \textbf{Reorder} & \textbf{Correlation} \\ + \hline + Baseline & --- & --- & --- & --- & --- \\ + Low & 2 ms & 2 ms & 0.25\% & 0.5\% & 25\% \\ + Medium & 4 ms & 7 ms & 1.0\% & 2.5\% & 50\% \\ + High & 12 ms & 30 ms & 5.0\% & 10\% & 50\% \\ + \hline + \end{tabular} +\end{table} + +The ``Low'' profile approximates a well-provisioned continental +connection, ``Medium'' represents intercontinental links or congested +networks, and ``High'' simulates severely degraded conditions such as +satellite links or highly congested mobile networks. + +A 30-second stabilization period follows TC application before +measurements begin, allowing queuing disciplines to settle. + +\section{Experimental Procedure} + +\subsection{Automation} + +The benchmark suite is fully automated via a Python orchestrator +(\texttt{vpn\_bench/}). For each VPN under test, the orchestrator: + +\begin{enumerate} + \item Cleans all state directories from previous VPN runs + \item Deploys the VPN configuration to all machines via Clan + \item Restarts the VPN service on every machine (with retry: + up to 3 attempts, 2-second backoff) + \item Verifies VPN connectivity via a connection-check service + (120-second timeout) + \item For each impairment profile: + \begin{enumerate} + \item Applies TC rules via context manager (guarantees cleanup) + \item Waits 30 seconds for stabilization + \item Executes all benchmarks + \item Clears TC rules + \end{enumerate} + \item Collects results and metadata +\end{enumerate} + +\subsection{Retry Logic} + +Tests use a retry wrapper with up to 2 retries (3 total attempts), +5-second initial delay, and 700-second maximum total time. 
The number +of attempts is recorded in test metadata so that retried results can +be identified during analysis. + +\subsection{Statistical Analysis} + +Each metric is summarized as a statistics dictionary containing: + +\begin{itemize} + \item \textbf{min / max:} Extreme values observed + \item \textbf{average:} Arithmetic mean across samples + \item \textbf{p25 / p50 / p75:} Quartiles via \texttt{statistics.quantiles()} +\end{itemize} + +Multi-run tests (ping, nix-cache) aggregate across runs. Per-second +tests (qperf, RIST) aggregate across all per-second samples. + +The approach uses empirical percentiles rather than parametric +confidence intervals, which is appropriate for benchmark data that +may not follow a normal distribution. The nix-cache test (via hyperfine) +additionally reports standard deviation. + +\section{Reproducibility} + +Reproducibility is ensured at every layer of the experimental stack. + +\subsection{Dependency Pinning} + +Every external dependency is pinned via \texttt{flake.lock}, which records +cryptographic hashes (\texttt{narHash}) and commit SHAs for each input. +Key pinned inputs include: + +\begin{itemize} + \item \textbf{nixpkgs:} Follows \texttt{clan-core/nixpkgs}, ensuring a + single version across the dependency graph + \item \textbf{clan-core:} The Clan framework, pinned to a specific commit + \item \textbf{VPN sources:} Hyprspace, EasyTier, Nebula locked to exact commits + \item \textbf{Build infrastructure:} flake-parts, treefmt-nix, disko, + nixos-facter-modules +\end{itemize} + +Custom packages not in nixpkgs (qperf, VpnCloud, iperf with auth patches, +phantun, EasyTier, Hyprspace) are built from source within the flake. + +\subsection{Declarative System Configuration} + +Each benchmark machine runs NixOS, where the entire operating system is +defined declaratively. There is no imperative package installation or +configuration drift. 
Given the same NixOS configuration, two machines +will have identical software, services, and kernel parameters. + +Machine deployment is atomic: the system either switches to the new +configuration entirely or rolls back. + +\subsection{Inventory-Driven Topology} + +Clan's inventory system maps machines to service roles declaratively. +For each VPN, the orchestrator writes an inventory entry assigning +machines to roles (e.g., Nebula lighthouse vs.\ peer). The Clan module +system translates this into NixOS configuration---systemd services, +firewall rules, peer lists, and key references. The same inventory +entry always produces the same NixOS configuration. + +\subsection{State Isolation} + +Before installing a new VPN, the orchestrator deletes all state +directories from previous runs, including VPN-specific directories +(\texttt{/var/lib/zerotier-one}, \texttt{/var/lib/nebula}, etc.) and +benchmark directories. This prevents cross-contamination between tests. + +\subsection{Data Provenance} + +Every test result includes metadata recording: + +\begin{itemize} + \item Wall-clock duration + \item Number of attempts (1 = first try succeeded) + \item VPN restart attempts and duration + \item Connectivity wait duration + \item Source and target machine names + \item Service logs (on failure) +\end{itemize} + +Results are organized hierarchically by VPN, TC profile, and machine +pair. Each profile directory contains a \texttt{tc\_settings.json} +snapshot of the exact impairment parameters applied. + +\section{Related Work} \subsection{Nix: A Safe and Policy-Free System for Software Deployment} Nix addresses significant issues in software deployment by utilizing -a technique that employs cryptographic -hashes to ensure unique paths for component instances \cite{dolstra_nix_2004}. -The system is distinguished by its features, such as concurrent -installation of multiple versions and variants, -atomic upgrades, and safe garbage collection. 
-These capabilities lead to a flexible deployment system that -harmonizes source and binary deployments. -Nix conceptualizes deployment without imposing rigid policies, -thereby offering adaptable strategies for component management. -This contrasts with many prevailing systems that are constrained by -policy-specific designs, -making Nix an easily extensible, safe and versatile deployment solution -for configuration files and software. - -As Clan makes extensive use of Nix for deployment, understanding the -foundations and principles of Nix is crucial for evaluating inner workings. +cryptographic hashes to ensure unique paths for component instances +\cite{dolstra_nix_2004}. Features such as concurrent installation of +multiple versions, atomic upgrades, and safe garbage collection make +Nix a flexible deployment system. This work uses Nix to ensure that +all VPN builds and system configurations are deterministic. \subsection{NixOS: A Purely Functional Linux Distribution} -NixOS is an extension of the principles established by Nix, -presenting a Linux distribution that manages system configurations -using purely functional methods \cite{dolstra_nixos_2008}. This model -ensures that system -configurations are reproducible and isolated -from stateful interactions typical in imperative models of package management. -Because NixOS configurations are built by pure functions, they can overcome the -challenges of easily rolling back changes, deploying multiple package versions -side-by-side, and achieving deterministic configuration reproduction . -The solution is particularly compelling in environments necessitating rigorous -reproducibility and minimal configuration drift—a valuable feature -for distributed networks . - -Clan also leverages NixOS for system configuration and deployment, -making it essential to understand how NixOS's functional model works. 
- -\subsection{Disnix: A Toolset for Distributed Deployment} - -Disnix extends the Nix philosophy to the challenge of distributed -deployment, offering a toolset that enables system administrators and -developers to perform automatic deployment of service-oriented -systems across a network of machines \cite{van_der_burg_disnix_2014}. -Disnix leverages the features of Nix to manage complex intra-dependencies. -Meaning dependencies that exist on a network level instead on a binary level. -The overlap with the Clan framework is evident in the focus on deployment, how -they differ will be explored in the evaluation of Clan's deployment system. - -\subsection{State of the Art in Software Defined Networking} - -The work by Bakhshi \cite{bakhshi_state_2017} surveys the -foundational principles and recent developments in Software Defined -Networking (SDN). It describes SDN as a paradigm that separates the -control plane from the data plane, enabling centralized, programmable -control over network behavior. The paper focuses on the architectural -components of SDN, including the three-layer abstraction model—the -application layer, control layer, and data layer—and highlights the -role of SDN controllers such as OpenDaylight, Floodlight, and Ryu. - -A key contribution of the paper is its identification of challenges -and open research questions in SDN. These include issues related to -scalability, fault tolerance, and the security risks introduced by -centralized control. - -This work is relevant to evaluating Clan’s role as a -Software Defined Network deployment tool and as a -comparison point against the state of the art. - -\subsection{Low Maintenance Peer-to-Peer Overlays} - -Structured Peer-to-Peer (P2P) overlay networks offer scalability and -efficiency but often require significant maintenance to handle -challenges such as peer churn and mismatched logical and physical -topologies. Shukla et al. 
propose a novel approach to designing -Distributed Hash Table (DHT)-based P2P overlays by integrating -Software Defined Networks (SDNs) to dynamically adjust -application-specific network policies and rules -\cite{shukla_towards_2021}. This method reduces maintenance overhead -by aligning overlay topology with the underlying physical network, -thus improving performance and reducing communication costs. - -The relevance of this work to Clan lies in its addressing of -operational complexity in managing P2P networks. +NixOS extends Nix principles to Linux system configuration +\cite{dolstra_nixos_2008}. System configurations are reproducible and +isolated from stateful interactions typical in imperative package +management. This property is essential for ensuring identical test +environments across benchmark runs. \subsection{Full-Mesh VPN Performance Evaluation} -The work by Kjorveziroski et al. \cite{kjorveziroski_full-mesh_2024} -provides a comprehensive evaluation of full-mesh VPN solutions, -specifically focusing on their use as underlay networks for -distributed systems, such as Kubernetes clusters. Their benchmarks -analyze the performance of VPNs with built-in NAT traversal -capabilities, including ZeroTier, emphasizing throughput, reliability -under packet loss, and behavior when relay mechanisms are used. For -the Clan framework, these insights are particularly relevant in -assessing the performance and scalability of its Overlay Networks -component. By benchmarking ZeroTier alongside its peers, the paper -offers an established reference point for evaluating how Mesh VPN -solutions like ZeroTier perform under conditions similar to the -intricacies of peer-to-peer systems managed by Clan. +Kjorveziroski et al.\ \cite{kjorveziroski_full-mesh_2024} provide a +comprehensive evaluation of full-mesh VPN solutions for distributed +systems. Their benchmarks analyze throughput, reliability under packet +loss, and relay behavior for VPNs including ZeroTier. 
-\subsection{AMC: Towards Trustworthy and Explorable CRDT Applications} +This thesis extends their work in several ways: +\begin{itemize} + \item Broader VPN selection with emphasis on fully decentralized + architectures + \item Real-world workloads (video streaming, package downloads) + beyond synthetic iperf3 tests + \item Multiple impairment profiles to characterize behavior under + varying network conditions + \item Fully reproducible experimental framework via Nix/NixOS/Clan +\end{itemize} -Jeffery and Mortier \cite{jeffery_amc_2023} present the Automerge -Model Checker (AMC), a tool aimed at verifying and dynamically -exploring the correctness of applications built on Conflict-Free -Replicated Data Types (CRDTs). Their work addresses critical -challenges associated with implementing and optimizing -operation-based (op-based) CRDTs, particularly emphasizing how these -optimizations can inadvertently introduce subtle bugs in distributed -systems despite rigorous testing methods like fuzz testing. As part -of their contributions, they implemented the "Automerge" library in -Rust, an op-based CRDT framework that exposes a JSON-like API and -supports local-first and asynchronous collaborative operations. +\subsection{Low Maintenance Peer-to-Peer Overlays} -This paper is particularly relevant to the development and evaluation -of the Data Mesher component of the Clan framework, which utilizes -state-based (or value-based) CRDTs for synchronizing distributed data -across peer-to-peer nodes. While Automerge addresses issues pertinent -to op-based CRDTs, the discussion on verification techniques, edge -case handling, and model-checking methodologies provides -cross-cutting insights to the complexities of ops based CRDTs and is -a good argument for using simpler state based CRDTs. - -\subsection{Keep CALM and CRDT On} - -The work by Laddad et al. \cite{laddad_keep_2022} complements and -expands upon concepts presented in the AMC paper. 
By revisiting the -foundations of CRDTs, the authors address limitations related to -reliance on eventual consistency and propose techniques to -distinguish between safe and unsafe queries using monotonicity -results derived from the CALM Theorem. This inquiry is highly -relevant for the Data Mesher component of Clan, as it delves into -operational and observable consistency guarantees that can optimize -both efficiency and safety in distributed query execution. -Specifically, the insights on query models and coordination-free -approaches advance the understanding of how CRDT-based systems, like -the Data Mesher, manage distributed state effectively without -compromising safety guarantees. +Shukla et al.\ propose integrating Software Defined Networks with +DHT-based P2P overlays to reduce maintenance overhead +\cite{shukla_towards_2021}. Their work on aligning overlay topology +with physical networks is relevant to understanding the performance +characteristics of mesh VPNs that must discover and maintain peer +connectivity dynamically.