\chapter{Methodology} % Main chapter title

\label{Methodology} % For referencing this chapter elsewhere, use \ref{Methodology}

%----------------------------------------------------------------------------------------
% SECTION 1
%----------------------------------------------------------------------------------------

This chapter describes the methodology used to evaluate the Clan
framework and to benchmark peer-to-peer overlay VPN implementations.
The experimental design prioritizes reproducibility at every
layer---from dependency management to network conditions---enabling
independent verification of results and facilitating future
comparative studies. A summary of the logical flow of this research
is depicted in Figure~\ref{fig:clan_thesis_argumentation_tree}.

\begin{figure}[H]
\centering
\includesvg[width=1\textwidth,
keepaspectratio]{Figures/clan_thesis_argumentation_tree.drawio.svg}
\caption{Argumentation Tree for the Clan Thesis}
\label{fig:clan_thesis_argumentation_tree}
\end{figure}

The structure of this study adopts a multi-faceted approach,
addressing several interrelated challenges in enhancing the
reliability and manageability of \ac{P2P} networks. The primary
objective is to assess how effectively the Clan framework addresses
these challenges.

The research methodology consists of two main components:
\begin{enumerate}
\item \textbf{Development of a Theoretical Model} \\
A theoretical model of the Clan framework will be constructed.
This includes a formal specification of the system's foundational
axioms, outlining the principles and properties that guide its
design. From these axioms, key theorems will be derived, along
with their boundary conditions. The aim is to understand the
mechanisms underpinning the framework and establish a basis for
its evaluation.

\item \textbf{Empirical Validation of the Theoretical Model} \\
Practical experiments will be conducted to validate the
predictions of the theoretical model. These experiments will
evaluate how well the model aligns with observed performance in
real-world settings. This step is crucial to identifying the
model's strengths and limitations.
\end{enumerate}

The methodology will particularly examine three core components of
the Clan framework:
\begin{itemize}
\item \textbf{Clan Deployment System} \\
The deployment system is the core of the Clan framework, enabling
the configuration and management of distributed software
components. It simplifies complex configurations through Python
code, which abstracts the intricacies of the Nix language.
Central to this system is the ``inventory'', a mergeable data
structure designed to ensure consistent service configurations
across nodes without conflicts. This component will be analyzed
for its design, functionality, efficiency, scalability, and fault
resilience.

\item \textbf{Overlay Networks / Mesh VPNs} \\
Overlay networks, also known as ``Mesh VPNs'', are critical for
secure communication in Clan's \ac{P2P} deployment. The study
will evaluate their performance in terms of security,
scalability, and resilience to network disruptions. Specifically,
the assessment will include how well these networks handle
traffic in environments where no device has a public IP address,
as well as the impact of node failures on overall
connectivity. The analysis will focus on:
\begin{itemize}
\item \textbf{ZeroTier}: A globally distributed ``Ethernet switch''.
\item \textbf{Mycelium}: An end-to-end encrypted IPv6 overlay network.
\item \textbf{Hyprspace}: A lightweight VPN leveraging IPFS and libp2p.
\end{itemize}

Other Mesh VPN solutions may be considered for comparison:
\begin{itemize}
\item \textbf{Tailscale}: A secure network for teams.
\item \textbf{Nebula Lighthouse}: A scalable overlay networking
tool with a focus on performance.
\end{itemize}

\item \textbf{Data Mesher} \\
The Data Mesher is responsible for data synchronization across
nodes, ensuring eventual consistency in Clan's decentralized
network. This component will be evaluated for synchronization
speed, fault tolerance, and conflict resolution mechanisms.
Additionally, it will be analyzed for its resilience in scenarios
involving malicious nodes, measuring how effectively it prevents
and mitigates manipulation or integrity violations during data
replication and distribution.
\end{itemize}

\section{Experimental Setup}

\subsection{Hardware Configuration}

All experiments were conducted on three bare-metal servers with
identical specifications:

\begin{itemize}
\item \textbf{CPU:} Intel Model 94, 4 cores / 8 threads
\item \textbf{Memory:} 64 GB RAM
\item \textbf{Network:} 1 Gbps Ethernet (e1000e driver; one machine uses r8169)
\item \textbf{Cryptographic acceleration:} AES-NI, AVX, AVX2, PCLMULQDQ,
RDRAND, SSE4.2
\end{itemize}

The presence of hardware cryptographic acceleration is relevant because
many VPN implementations leverage AES-NI for encryption, and the results
may differ on systems without these features.

\subsection{Network Topology}

The three machines are connected via a direct 1 Gbps LAN on the same
network segment. This baseline topology provides a controlled environment
with minimal latency and no packet loss, allowing the overhead introduced
by each VPN implementation to be measured in isolation.

To simulate real-world network conditions, Linux traffic control
(\texttt{tc netem}) is used to inject latency, jitter, packet loss,
and reordering. These impairments are applied symmetrically on all
machines, meaning the effective round-trip impairment is approximately
double the per-machine values.

\section{VPNs Under Test}

Ten VPN implementations were selected for evaluation, spanning a range
of architectures from centralized coordination to fully decentralized
mesh topologies. Table~\ref{tab:vpn_selection} summarizes the selection.

\begin{table}[H]
\centering
\caption{VPN implementations included in the benchmark}
\label{tab:vpn_selection}
\begin{tabular}{lll}
\hline
\textbf{VPN} & \textbf{Architecture} & \textbf{Notes} \\
\hline
Tailscale (Headscale) & Coordinated mesh & Open-source coordination server \\
ZeroTier & Coordinated mesh & Global virtual Ethernet \\
Nebula & Lighthouse-based mesh & Slack's overlay network \\
Tinc & Decentralized mesh & Established since 1998 \\
Yggdrasil & Fully decentralized & Spanning-tree routing \\
Mycelium & Fully decentralized & End-to-end encrypted IPv6 overlay \\
Hyprspace & Fully decentralized & libp2p-based, IPFS-compatible \\
EasyTier & Decentralized mesh & Rust-based, multi-protocol \\
VpnCloud & Decentralized mesh & Lightweight, kernel bypass option \\
WireGuard & Point-to-point & Reference baseline (not a mesh VPN) \\
\hline
Internal (no VPN) & N/A & Baseline for raw network performance \\
\hline
\end{tabular}
\end{table}

WireGuard is included as a reference point despite not being a mesh VPN.
Its minimal overhead and widespread adoption make it a useful comparison
for understanding the cost of mesh coordination and NAT traversal logic.

\subsection{Selection Criteria}

VPNs were selected based on:
\begin{itemize}
\item \textbf{NAT traversal capability:} All selected VPNs can establish
connections between peers behind NAT without manual port forwarding.
\item \textbf{Decentralization:} Preference for solutions without mandatory
central servers, though coordinated-mesh VPNs were included for comparison.
\item \textbf{Active development:} Only VPNs with recent commits and
maintained releases were considered.
\item \textbf{Linux support:} All VPNs must run on Linux.
\end{itemize}

\subsection{Configuration Methodology}

Each VPN is built from source within the Nix flake, ensuring that all
dependencies are pinned to exact versions. VPNs not packaged in nixpkgs
(Hyprspace, EasyTier, VpnCloud, qperf) have dedicated build expressions
under \texttt{pkgs/} in the flake.

Cryptographic material (WireGuard keys, Nebula certificates, ZeroTier
identities) is generated deterministically via Clan's vars generator
system. For example, WireGuard keys are generated as:

\begin{verbatim}
wg genkey > "$out/private-key"
wg pubkey < "$out/private-key" > "$out/public-key"
\end{verbatim}

Generated keys are stored in version control under
\texttt{vars/per-machine/\{name\}/} and read at NixOS evaluation time,
making key material part of the reproducible configuration.

\section{Benchmark Suite}

The benchmark suite includes both synthetic throughput tests and
real-world workloads. This combination addresses a limitation of prior
work that relied exclusively on iperf3.

\subsection{Ping}

Measures round-trip latency and packet delivery reliability.

\begin{itemize}
\item \textbf{Method:} 100 ICMP echo requests at 200 ms intervals,
1-second per-packet timeout, repeated for 3 runs.
\item \textbf{Metrics:} RTT (min, avg, max, mdev), packet loss percentage,
per-packet RTTs.
\end{itemize}
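The parameters above translate directly into a \texttt{ping} invocation. A minimal sketch in the orchestrator's language; the function name and structure are illustrative assumptions, not the actual vpn\_bench code:

```python
def build_ping_command(target: str, count: int = 100,
                       interval_s: float = 0.2,
                       timeout_s: int = 1) -> list[str]:
    """Return the argv for one ping run: 100 echo requests,
    200 ms apart, 1-second per-packet timeout."""
    return [
        "ping",
        "-c", str(count),       # number of ICMP echo requests
        "-i", str(interval_s),  # inter-packet interval in seconds
        "-W", str(timeout_s),   # per-packet timeout in seconds
        target,
    ]
```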

\subsection{iPerf3}

Measures bulk data transfer throughput.

\textbf{TCP variant:} 30-second bidirectional test with RSA authentication
and zero-copy mode (\texttt{-Z}) to minimize CPU overhead.

\textbf{UDP variant:} Same configuration with unlimited target bandwidth
(\texttt{-b 0}) and 64-bit counters.

\textbf{Parallel TCP variant:} Tests concurrent mesh traffic by running
TCP streams on all machines simultaneously in a circular pattern
(A$\rightarrow$B, B$\rightarrow$C, C$\rightarrow$A) for 60 seconds.
This simulates contention across the mesh.

\begin{itemize}
\item \textbf{Metrics:} Throughput (bits/s), retransmits, congestion window,
jitter (UDP), packet loss (UDP).
\end{itemize}
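The circular stream pattern generalizes to any machine count: each machine sends to its successor, wrapping around. A small sketch (the helper name is an illustrative assumption):

```python
def circular_pairs(machines: list[str]) -> list[tuple[str, str]]:
    """Each machine streams to its successor, wrapping around,
    e.g. [A, B, C] -> (A,B), (B,C), (C,A)."""
    n = len(machines)
    return [(machines[i], machines[(i + 1) % n]) for i in range(n)]
```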

\subsection{qPerf}

Measures connection-level performance rather than bulk throughput.

\begin{itemize}
\item \textbf{Method:} One qperf instance per CPU core in parallel, each
running for 30 seconds. Bandwidth from all cores is summed per second.
\item \textbf{Metrics:} Total bandwidth (Mbps), CPU usage, time to first
byte (TTFB), connection establishment time.
\end{itemize}
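The per-second summation across cores can be sketched as follows; the data layout (a mapping from core id to per-second Mbps samples) is an assumption for illustration:

```python
from collections import defaultdict

def total_bandwidth_per_second(samples: dict[int, list[float]]) -> list[float]:
    """Sum per-second Mbps samples across all per-core qperf
    instances; returns one total per second."""
    totals: dict[int, float] = defaultdict(float)
    for per_second in samples.values():
        for second, mbps in enumerate(per_second):
            totals[second] += mbps
    return [totals[s] for s in sorted(totals)]
```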

\subsection{RIST Video Streaming}

Measures real-time multimedia streaming performance.

\begin{itemize}
\item \textbf{Method:} The sender generates a 4K (3840$\times$2160) test
pattern at 30 fps using ffmpeg with H.264 encoding (ultrafast preset,
zerolatency tuning) at a 25 Mbps target bitrate. The stream is transmitted
over the RIST protocol to a receiver on the target machine for 30 seconds.
\item \textbf{Encoding metrics:} Actual bitrate, frame rate, dropped frames.
\item \textbf{Network metrics:} Packets dropped, packets recovered via
RIST retransmission, RTT, quality score (0--100), received bitrate.
\end{itemize}

RIST (Reliable Internet Stream Transport) is a protocol designed for
low-latency video contribution over unreliable networks, making it a
realistic test of VPN behavior under multimedia workloads.

\subsection{Nix Cache Download}

Measures sustained download performance using a real-world workload.

\begin{itemize}
\item \textbf{Method:} A Harmonia Nix binary cache server on the target
machine serves the Firefox package. The client downloads it via
\texttt{nix copy} through the VPN. Benchmarked with hyperfine:
1 warmup run followed by 2 timed runs. The local cache and Nix's
SQLite metadata are cleared between runs.
\item \textbf{Metrics:} Mean duration (seconds), standard deviation,
min/max duration.
\end{itemize}

This benchmark tests realistic HTTP traffic patterns and sustained
sequential download performance, complementing the synthetic throughput
tests.

\section{Network Impairment Profiles}

Four impairment profiles simulate a range of network conditions, from
ideal to severely degraded. Impairments are applied via Linux traffic
control (\texttt{tc netem}) on every machine's primary interface.
Table~\ref{tab:impairment_profiles} shows the per-machine values;
effective round-trip impairment is approximately doubled.

\begin{table}[H]
\centering
\caption{Network impairment profiles (per-machine egress values)}
\label{tab:impairment_profiles}
\begin{tabular}{lccccc}
\hline
\textbf{Profile} & \textbf{Latency} & \textbf{Jitter} &
\textbf{Loss} & \textbf{Reorder} & \textbf{Correlation} \\
\hline
Baseline & --- & --- & --- & --- & --- \\
Low & 2 ms & 2 ms & 0.25\% & 0.5\% & 25\% \\
Medium & 4 ms & 7 ms & 1.0\% & 2.5\% & 50\% \\
High & 12 ms & 30 ms & 5.0\% & 10\% & 50\% \\
\hline
\end{tabular}
\end{table}

The ``Low'' profile approximates a well-provisioned continental
connection, ``Medium'' represents intercontinental links or congested
networks, and ``High'' simulates severely degraded conditions such as
satellite links or highly congested mobile networks.

A 30-second stabilization period follows TC application before
measurements begin, allowing queuing disciplines to settle.
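As one illustration, a profile row maps onto a \texttt{tc netem} invocation roughly as follows. The interface name and helper are illustrative assumptions, not the orchestrator's actual code:

```python
def netem_command(dev: str, latency_ms: float, jitter_ms: float,
                  loss_pct: float, reorder_pct: float,
                  correlation_pct: float) -> list[str]:
    """Build the tc netem argv for one impairment profile row,
    e.g. the 'Medium' profile: 4 ms latency, 7 ms jitter,
    1.0% loss, 2.5% reorder with 50% correlation."""
    return ["tc", "qdisc", "add", "dev", dev, "root", "netem",
            "delay", f"{latency_ms}ms", f"{jitter_ms}ms",
            "loss", f"{loss_pct}%",
            "reorder", f"{reorder_pct}%", f"{correlation_pct}%"]
```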

\section{Experimental Procedure}

\subsection{Automation}

The benchmark suite is fully automated via a Python orchestrator
(\texttt{vpn\_bench/}). For each VPN under test, the orchestrator:

\begin{enumerate}
\item Cleans all state directories from previous VPN runs
\item Deploys the VPN configuration to all machines via Clan
\item Restarts the VPN service on every machine (with retry:
up to 3 attempts, 2-second backoff)
\item Verifies VPN connectivity via a connection-check service
(120-second timeout)
\item For each impairment profile:
\begin{enumerate}
\item Applies TC rules via a context manager (guarantees cleanup)
\item Waits 30 seconds for stabilization
\item Executes all benchmarks
\item Clears TC rules
\end{enumerate}
\item Collects results and metadata
\end{enumerate}
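The cleanup guarantee in the per-profile loop can be sketched with a context manager; \texttt{apply\_tc} and \texttt{clear\_tc} are stand-ins for the orchestrator's real helpers, not its actual API:

```python
import contextlib

@contextlib.contextmanager
def tc_profile(machines, profile, apply_tc, clear_tc):
    """Apply TC rules to every machine, then clear them on exit
    even if a benchmark inside the block raises."""
    for m in machines:
        apply_tc(m, profile)      # apply impairment rules
    try:
        yield                     # stabilize, then run benchmarks
    finally:
        for m in machines:
            clear_tc(m)           # always clear rules, even on error
```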

\subsection{Retry Logic}

Tests use a retry wrapper with up to 2 retries (3 total attempts),
5-second initial delay, and 700-second maximum total time. The number
of attempts is recorded in test metadata so that retried results can
be identified during analysis.
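A minimal sketch of such a wrapper, assuming a simple doubling backoff and omitting the total-time cap for brevity; names are illustrative, not the actual vpn\_bench API:

```python
import time

def run_with_retry(test_fn, max_attempts: int = 3,
                   initial_delay_s: float = 5.0) -> dict:
    """Run test_fn up to max_attempts times, recording the attempt
    count in the returned metadata."""
    delay = initial_delay_s
    for attempt in range(1, max_attempts + 1):
        try:
            result = test_fn()
            return {"result": result, "attempts": attempt}
        except Exception:
            if attempt == max_attempts:
                raise               # exhausted all attempts
            time.sleep(delay)
            delay *= 2              # assumed backoff policy
```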

\subsection{Statistical Analysis}

Each metric is summarized as a statistics dictionary containing:

\begin{itemize}
\item \textbf{min / max:} Extreme values observed
\item \textbf{average:} Arithmetic mean across samples
\item \textbf{p25 / p50 / p75:} Quartiles via \texttt{statistics.quantiles()}
\end{itemize}
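Building this dictionary with Python's standard library can be sketched as follows (the function name is an assumption):

```python
import statistics

def summarize(samples: list[float]) -> dict[str, float]:
    """Summarize a metric's samples as the statistics dictionary
    described above, using statistics.quantiles() for quartiles."""
    q1, q2, q3 = statistics.quantiles(samples, n=4)
    return {
        "min": min(samples),
        "max": max(samples),
        "average": statistics.fmean(samples),
        "p25": q1,
        "p50": q2,
        "p75": q3,
    }
```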

Multi-run tests (ping, nix-cache) aggregate across runs. Per-second
tests (qperf, RIST) aggregate across all per-second samples.

The approach uses empirical percentiles rather than parametric
confidence intervals, which is appropriate for benchmark data that
may not follow a normal distribution. The nix-cache test (via hyperfine)
additionally reports standard deviation.

\section{Reproducibility}

Reproducibility is ensured at every layer of the experimental stack.

\subsection{Dependency Pinning}

Every external dependency is pinned via \texttt{flake.lock}, which records
cryptographic hashes (\texttt{narHash}) and commit SHAs for each input.
Key pinned inputs include:

\begin{itemize}
\item \textbf{nixpkgs:} Follows \texttt{clan-core/nixpkgs}, ensuring a
single version across the dependency graph
\item \textbf{clan-core:} The Clan framework, pinned to a specific commit
\item \textbf{VPN sources:} Hyprspace, EasyTier, Nebula locked to exact commits
\item \textbf{Build infrastructure:} flake-parts, treefmt-nix, disko,
nixos-facter-modules
\end{itemize}

Custom packages not in nixpkgs (qperf, VpnCloud, iperf with auth patches,
phantun, EasyTier, Hyprspace) are built from source within the flake.

\subsection{Declarative System Configuration}

Each benchmark machine runs NixOS, where the entire operating system is
defined declaratively. There is no imperative package installation or
configuration drift. Given the same NixOS configuration, two machines
will have identical software, services, and kernel parameters.

Machine deployment is atomic: the system either switches to the new
configuration entirely or rolls back.

\subsection{Inventory-Driven Topology}

Clan's inventory system maps machines to service roles declaratively.
For each VPN, the orchestrator writes an inventory entry assigning
machines to roles (e.g., Nebula lighthouse vs.\ peer). The Clan module
system translates this into NixOS configuration---systemd services,
firewall rules, peer lists, and key references. The same inventory
entry always produces the same NixOS configuration.

\subsection{State Isolation}

Before installing a new VPN, the orchestrator deletes all state
directories from previous runs, including VPN-specific directories
(\texttt{/var/lib/zerotier-one}, \texttt{/var/lib/nebula}, etc.) and
benchmark directories. This prevents cross-contamination between tests.

\subsection{Data Provenance}

Every test result includes metadata recording:

\begin{itemize}
\item Wall-clock duration
\item Number of attempts (1 = first try succeeded)
\item VPN restart attempts and duration
\item Connectivity wait duration
\item Source and target machine names
\item Service logs (on failure)
\end{itemize}

Results are organized hierarchically by VPN, TC profile, and machine
pair. Each profile directory contains a \texttt{tc\_settings.json}
snapshot of the exact impairment parameters applied.

\section{Related Work}

The Clan framework operates within the realm of software deployment
and peer-to-peer networking, necessitating a deep understanding of
existing methodologies in these areas. This section discusses related
work on system deployment, peer data management, and low-maintenance
structured peer-to-peer overlays, which informs the development and
positioning of the Clan framework.

\subsection{Nix: A Safe and Policy-Free System for Software Deployment}

Nix addresses significant issues in software deployment by employing
cryptographic hashes to ensure unique paths for component instances
\cite{dolstra_nix_2004}. The system is distinguished by features such
as concurrent installation of multiple versions and variants, atomic
upgrades, and safe garbage collection. These capabilities yield a
flexible deployment system that harmonizes source and binary
deployments. Nix conceptualizes deployment without imposing rigid
policies, thereby offering adaptable strategies for component
management. This contrasts with many prevailing systems that are
constrained by policy-specific designs, making Nix an easily
extensible, safe, and versatile deployment solution for configuration
files and software.

As Clan makes extensive use of Nix for deployment, understanding the
foundations and principles of Nix is crucial for evaluating its inner
workings. This work also uses Nix to ensure that all VPN builds and
system configurations are deterministic.

\subsection{NixOS: A Purely Functional Linux Distribution}

NixOS extends the principles established by Nix, presenting a Linux
distribution that manages system configurations using purely
functional methods \cite{dolstra_nixos_2008}. This model ensures that
system configurations are reproducible and isolated from the stateful
interactions typical of imperative models of package management.
Because NixOS configurations are built by pure functions, they
overcome the challenges of easily rolling back changes, deploying
multiple package versions side by side, and achieving deterministic
configuration reproduction. The approach is particularly compelling
in environments requiring rigorous reproducibility and minimal
configuration drift---a valuable property for distributed networks,
and essential for ensuring identical test environments across
benchmark runs.

Clan also leverages NixOS for system configuration and deployment,
making it essential to understand how NixOS's functional model works.

\subsection{Disnix: A Toolset for Distributed Deployment}

Disnix extends the Nix philosophy to the challenge of distributed
deployment, offering a toolset that enables system administrators and
developers to perform automatic deployment of service-oriented
systems across a network of machines \cite{van_der_burg_disnix_2014}.
Disnix leverages the features of Nix to manage complex
inter-dependencies, meaning dependencies that exist at the network
level rather than at the binary level. The overlap with the Clan
framework is evident in the shared focus on deployment; how the two
differ will be explored in the evaluation of Clan's deployment system.

\subsection{State of the Art in Software Defined Networking}

The work by Bakhshi \cite{bakhshi_state_2017} surveys the
foundational principles and recent developments in Software Defined
Networking (SDN). It describes SDN as a paradigm that separates the
control plane from the data plane, enabling centralized, programmable
control over network behavior. The paper focuses on the architectural
components of SDN, including the three-layer abstraction model---the
application layer, control layer, and data layer---and highlights the
role of SDN controllers such as OpenDaylight, Floodlight, and Ryu.

A key contribution of the paper is its identification of challenges
and open research questions in SDN. These include issues related to
scalability, fault tolerance, and the security risks introduced by
centralized control.

This work is relevant to evaluating Clan's role as a Software Defined
Network deployment tool and as a comparison point against the state
of the art.

\subsection{Low Maintenance Peer-to-Peer Overlays}

Structured Peer-to-Peer (P2P) overlay networks offer scalability and
efficiency but often require significant maintenance to handle
challenges such as peer churn and mismatched logical and physical
topologies. Shukla et al.\ propose a novel approach to designing
Distributed Hash Table (DHT)-based P2P overlays by integrating
Software Defined Networks (SDNs) to dynamically adjust
application-specific network policies and rules
\cite{shukla_towards_2021}. This method reduces maintenance overhead
by aligning the overlay topology with the underlying physical
network, thus improving performance and reducing communication costs.

The relevance of this work to Clan lies in its treatment of the
operational complexity of managing P2P networks. Its emphasis on
aligning overlay topology with the physical network is also relevant
to understanding the performance characteristics of mesh VPNs that
must discover and maintain peer connectivity dynamically.

\subsection{Full-Mesh VPN Performance Evaluation}

The work by Kjorveziroski et al.\ \cite{kjorveziroski_full-mesh_2024}
provides a comprehensive evaluation of full-mesh VPN solutions,
specifically focusing on their use as underlay networks for
distributed systems such as Kubernetes clusters. Their benchmarks
analyze the performance of VPNs with built-in NAT traversal
capabilities, including ZeroTier, emphasizing throughput, reliability
under packet loss, and behavior when relay mechanisms are used. For
the Clan framework, these insights are particularly relevant in
assessing the performance and scalability of its Overlay Networks
component. By benchmarking ZeroTier alongside its peers, the paper
offers an established reference point for evaluating how Mesh VPN
solutions perform under conditions similar to those of the
peer-to-peer systems managed by Clan.

This thesis extends their work in several ways:
\begin{itemize}
\item Broader VPN selection, with emphasis on fully decentralized
architectures
\item Real-world workloads (video streaming, package downloads)
beyond synthetic iperf3 tests
\item Multiple impairment profiles to characterize behavior under
varying network conditions
\item A fully reproducible experimental framework via Nix/NixOS/Clan
\end{itemize}

\subsection{AMC: Towards Trustworthy and Explorable CRDT Applications}

Jeffery and Mortier \cite{jeffery_amc_2023} present the Automerge
Model Checker (AMC), a tool aimed at verifying and dynamically
exploring the correctness of applications built on Conflict-Free
Replicated Data Types (CRDTs). Their work addresses critical
challenges in implementing and optimizing operation-based (op-based)
CRDTs, particularly emphasizing how these optimizations can
inadvertently introduce subtle bugs in distributed systems despite
rigorous testing methods such as fuzz testing. As part of their
contributions, they implemented the ``Automerge'' library in Rust, an
op-based CRDT framework that exposes a JSON-like API and supports
local-first and asynchronous collaborative operations.

This paper is particularly relevant to the development and evaluation
of the Data Mesher component of the Clan framework, which utilizes
state-based (or value-based) CRDTs for synchronizing distributed data
across peer-to-peer nodes. While Automerge addresses issues pertinent
to op-based CRDTs, its discussion of verification techniques,
edge-case handling, and model-checking methodologies provides
cross-cutting insight into the complexities of op-based CRDTs and is
a strong argument for using simpler state-based CRDTs.

\subsection{Keep CALM and CRDT On}

The work by Laddad et al.\ \cite{laddad_keep_2022} complements and
expands upon concepts presented in the AMC paper. By revisiting the
foundations of CRDTs, the authors address limitations related to
reliance on eventual consistency and propose techniques to
distinguish between safe and unsafe queries using monotonicity
results derived from the CALM Theorem. This inquiry is highly
relevant for the Data Mesher component of Clan, as it delves into
operational and observable consistency guarantees that can optimize
both efficiency and safety in distributed query execution.
Specifically, the insights on query models and coordination-free
approaches advance the understanding of how CRDT-based systems, like
the Data Mesher, manage distributed state effectively without
compromising safety guarantees.