several fixups discussed on tuesday

2026-03-06 17:53:26 +01:00
parent b6ac4e20bf
commit 0a0ca0800a


@@ -93,109 +93,91 @@ making key material part of the reproducible configuration.
The benchmark suite includes both synthetic throughput tests and
real-world workloads. This combination addresses a limitation of prior
work that relied exclusively on iperf3.
Table~\ref{tab:benchmark_suite} summarises each benchmark.
\begin{table}[H]
\centering
\caption{Benchmark suite overview}
\label{tab:benchmark_suite}
\begin{tabular}{llll}
\hline
\textbf{Benchmark} & \textbf{Protocol} & \textbf{Duration} & \textbf{Key Metrics} \\
\hline
Ping & ICMP & 3 runs $\times$ 100 pkts & RTT, packet loss \\
TCP iPerf3 & TCP & 30 s & Throughput, retransmits, CPU \\
UDP iPerf3 & UDP & 30 s & Throughput, jitter, packet loss \\
Parallel iPerf3 & TCP & 60 s & Throughput under contention \\
QPerf & QUIC & 30 s & Bandwidth, TTFB, conn. time \\
RIST Streaming & RIST & 30 s & Bitrate, dropped frames, RTT \\
Nix Cache Download & HTTP & 2 runs & Download duration \\
\hline
\end{tabular}
\end{table}
The first four benchmarks use well-known network testing tools.
The remaining three target workloads that are closer to real-world
usage. The subsections below describe the configuration details
that the table does not capture.
\subsection{Ping}

Sends 100 ICMP echo requests at 200\,ms intervals with a 1-second
per-packet timeout, repeated for 3 runs.
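The parameters above map onto a standard \texttt{ping} invocation. A minimal sketch (the peer address is a placeholder; flags are the usual iputils options):

```python
def ping_cmd(host, count=100, interval_s=0.2, timeout_s=1):
    """Assemble the ping invocation described above; host is hypothetical."""
    return ["ping", "-c", str(count),      # number of echo requests
            "-i", str(interval_s),         # 200 ms between packets
            "-W", str(timeout_s),          # 1 s per-packet timeout
            host]
```

In the suite this command would be run 3 times per target, with RTT statistics parsed from ping's summary line.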
\subsection{TCP and UDP iPerf3}
Both tests run for 30 seconds in bidirectional mode with zero-copy
(\texttt{-Z}) to minimize CPU overhead. The UDP variant additionally
sets unlimited target bandwidth (\texttt{-b 0}) and enables 64-bit
counters.
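The flags described above can be assembled along these lines (server address is a placeholder; JSON output is an assumption about how the suite collects results):

```python
def iperf3_cmd(server, udp=False, duration_s=30):
    """iperf3 invocation per the text; server address is hypothetical."""
    cmd = ["iperf3", "-c", server, "-t", str(duration_s),
           "--bidir",   # bidirectional test
           "-Z",        # zero-copy sends to reduce CPU overhead
           "--json"]    # machine-readable output (assumption)
    if udp:
        cmd += ["-u", "-b", "0",           # UDP with unlimited target bandwidth
                "--udp-counters-64bit"]    # 64-bit packet counters
    return cmd
```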
\subsection{Parallel iPerf3}

Runs TCP streams on all three machines simultaneously in a circular
pattern (A$\rightarrow$B, B$\rightarrow$C, C$\rightarrow$A) for
60 seconds with zero-copy (\texttt{-Z}). This creates contention
across the overlay network, stressing shared resources that
single-stream tests leave idle.
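The circular sender/receiver assignment generalizes to any host count; a one-line sketch:

```python
def circular_pairs(hosts):
    """Each host sends to its successor, wrapping around: A->B, B->C, C->A."""
    return [(hosts[i], hosts[(i + 1) % len(hosts)]) for i in range(len(hosts))]
```

With three hosts this yields exactly the pattern above; every machine is simultaneously a sender and a receiver.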
\subsection{QPerf}

Spawns one qperf process per CPU core, each running for 30 seconds.
Per-core bandwidth is summed per second. Unlike the iPerf3 tests,
QPerf targets QUIC connection-level performance, capturing time to
first byte and connection establishment time alongside throughput.
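The per-second aggregation across cores can be sketched as follows (the input layout, one per-second series per qperf process, is an assumption about the suite's data format):

```python
def sum_per_second(per_core_mbps):
    """Sum per-core bandwidth samples into one per-second total.

    per_core_mbps: one list of per-second Mbps readings per qperf process.
    Series are aligned on the shortest one, since processes may not
    finish in exactly the same second.
    """
    n = min(len(core) for core in per_core_mbps)
    return [sum(core[t] for core in per_core_mbps) for t in range(n)]
```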
\subsection{RIST Video Streaming}

Generates a 4K ($3840\times2160$) H.264 test pattern at 30\,fps
(ultrafast preset, zerolatency tuning, 25\,Mbps target bitrate) with
ffmpeg and transmits it over the RIST protocol for 30 seconds. RIST
(Reliable Internet Stream Transport) is designed for low-latency
video contribution over unreliable networks, making it a realistic
test of VPN behavior under multimedia workloads. In addition to
standard network metrics, the benchmark records encoding-side
statistics (actual bitrate, frame rate, dropped frames) and
RIST-specific counters (packets recovered via retransmission,
quality score).
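The sender side can be sketched as an ffmpeg invocation (the receiver URL is a placeholder, and an ffmpeg build with librist support is assumed; \texttt{testsrc2} stands in for whatever pattern generator the suite actually uses):

```python
def rist_sender_cmd(receiver_url, duration_s=30):
    """ffmpeg sender sketch per the text; receiver_url is hypothetical."""
    return ["ffmpeg",
            "-f", "lavfi", "-i", "testsrc2=size=3840x2160:rate=30",  # 4K pattern
            "-t", str(duration_s),
            "-c:v", "libx264", "-preset", "ultrafast",
            "-tune", "zerolatency",
            "-b:v", "25M",                 # 25 Mbps target bitrate
            "-f", "mpegts", receiver_url]  # e.g. "rist://10.0.0.2:5000"
```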
\subsection{Nix Cache Download}

A Harmonia Nix binary cache server on the target machine serves the
Firefox package. The client downloads it via \texttt{nix copy}
through the VPN, exercising many small HTTP requests rather than a
single bulk transfer. Benchmarked with hyperfine (1 warmup run,
2 timed runs); the local Nix store and SQLite metadata are cleared
between runs.
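A sketch of the hyperfine wrapper, assuming flake-style installables; the cache URL is a placeholder and the store-clearing command is abstracted away, since the text does not spell it out:

```python
def hyperfine_cmd(cache_url, clear_cmd):
    """hyperfine invocation per the text: 1 warmup run, 2 timed runs,
    clearing local state before each run. Both arguments are placeholders."""
    bench = f"nix copy --from {cache_url} nixpkgs#firefox"  # installable assumed
    return ["hyperfine", "--warmup", "1", "--runs", "2",
            "--prepare", clear_cmd,  # clears the Nix store and SQLite metadata
            bench]
```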
\section{Network Impairment Profiles}

To evaluate VPN performance under different network conditions, four
impairment profiles are defined, ranging from an unmodified baseline
to a severely degraded link. All impairments are injected with Linux
traffic control (\texttt{tc netem}) on the egress side of every
machine's primary interface.
Table~\ref{tab:impairment_profiles} lists the per-machine values.
Because impairments are applied on both ends of a connection, the
effective round-trip impact is roughly double the listed values.
\begin{table}[H]
\centering
@@ -214,12 +196,30 @@ effective round-trip impairment is approximately doubled.
\end{tabular}
\end{table}
Each column in Table~\ref{tab:impairment_profiles} controls one
aspect of the simulated degradation:

\begin{itemize}
\item \textbf{Latency} is a constant delay added to every outgoing
packet. For example, 2\,ms on each machine adds roughly 4\,ms to
the round trip.
\item \textbf{Jitter} introduces random variation on top of the
fixed latency. A packet on the Low profile may see anywhere
between 0 and 4\,ms of total added delay instead of exactly
2\,ms.
\item \textbf{Loss} is the fraction of packets that are silently
dropped. At 0.25\,\% (Low profile), roughly 1 in 400 packets is
discarded.
\item \textbf{Reorder} is the fraction of packets that arrive out
of sequence. \texttt{tc netem} achieves this by giving selected
packets a shorter delay than their predecessors, so they overtake
earlier packets.
\item \textbf{Correlation} determines whether impairment events are
independent or bursty. At 0\,\%, each packet's fate is decided
independently. At higher values, a packet that was lost or
reordered raises the probability that the next packet suffers the
same fate, producing the burst patterns typical of real networks.
\end{itemize}
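The profile columns translate into a single \texttt{tc netem} command per machine. A sketch of that mapping (interface name and the sample values below are illustrative; the command must run as root):

```python
def netem_cmd(iface, delay_ms, jitter_ms, loss_pct, reorder_pct, corr_pct):
    """Assemble the tc netem invocation for one impairment profile."""
    cmd = ["tc", "qdisc", "add", "dev", iface, "root", "netem",
           "delay", f"{delay_ms}ms", f"{jitter_ms}ms"]  # latency +/- jitter
    if loss_pct > 0:
        cmd += ["loss", f"{loss_pct}%", f"{corr_pct}%"]       # correlated loss
    if reorder_pct > 0:
        cmd += ["reorder", f"{reorder_pct}%", f"{corr_pct}%"] # needs delay above
    return cmd
```

For the baseline profile the loss and reorder clauses are omitted entirely, leaving only the (zero) delay term.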
A 30-second stabilization period follows TC application before A 30-second stabilization period follows TC application before
measurements begin, allowing queuing disciplines to settle. measurements begin, allowing queuing disciplines to settle.