\subsection{Measuring Congestion Performance}

\subsubsection{Introducing Netgauge}

To show the benefit of this work, a benchmark had to be developed to
measure the performance of the ESP protocol specifically in the
presence of network congestion. As a starting point for this
benchmark, ``Netgauge'' was chosen, a network performance analysis
tool developed here at the chair of computer architecture.

Netgauge had to be extended before it could provoke network
congestion in a cluster environment. Given a set of peers to measure
bandwidth between, its original behavior was to split them at random
into two equal-sized groups, assign each peer in one group a partner
in the other group and average the network performance over all of
these pairs, thus measuring bisection bandwidth. This scheme of
operation does not generate any network congestion in a cluster
environment, as explained in section~\ref{network-congestion}.
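
The pairing scheme described above can be sketched as follows. This is
only an illustration under stated assumptions, not Netgauge's actual
code: the function name \fname{pair\_peers()} is hypothetical, and the
random split is realized here with a Fisher-Yates shuffle based on
\fname{rand()}.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: randomly splits peers 0..n-1 (n even) into two
 * equal-sized groups and stores each peer's partner in partner[]. */
void pair_peers(int n, int *partner)
{
    assert(n > 0 && n % 2 == 0);

    int *order = malloc((size_t)n * sizeof *order);
    assert(order != NULL);
    for (int i = 0; i < n; i++)
        order[i] = i;

    /* Fisher-Yates shuffle: produces a random permutation of the
     * peers (the slight modulo bias of rand() is ignored here). */
    for (int i = n - 1; i > 0; i--) {
        int j = rand() % (i + 1);
        int tmp = order[i];
        order[i] = order[j];
        order[j] = tmp;
    }

    /* The first half of the permutation forms one group, the second
     * half the other; the k-th peers of both groups become partners. */
    for (int k = 0; k < n / 2; k++) {
        int a = order[k];
        int b = order[k + n / 2];
        partner[a] = b;
        partner[b] = a;
    }
    free(order);
}
```

In this sketch every peer exchanges data with exactly one partner, so
no peer ever has to receive from several senders at once.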

\begin{figure}
  \centering
  \input{graphics/netgauge/netgauge-structure.pdf_t}
  \caption[The structure of Netgauge]{The structure of Netgauge. A
    component being on top of another means that it makes use of
    services supplied by the lower component. Only one module and one
    pattern may be chosen for an actual run of the benchmark.}
  \label{fig:netgauge-structure}
\end{figure}

Netgauge was therefore reworked to support different communication
patterns. It is now organized as a framework supporting the
development of two kinds of components: transport modules and
communication patterns, where every communication pattern can use any
transport module for the actual data transmission. The architecture
is shown in figure~\ref{fig:netgauge-structure}.

\subsubsection{Transport Modules}
\label{netgauge-modules}

Transport modules are responsible for transmitting the data over the
various networks and protocols which Netgauge supports. To this end,
all modules share a common interface, defined as a set of function
prototypes which each module has to implement.

An important aspect of the transport modules is the abstraction from
the addressing scheme which the underlying protocol uses. The rest of
the program identifies the hosts involved in the current benchmark by
the means of integer numbers ranging within $[0, \enspace n-1]$, where
$n$ is the number of hosts involved in the current run of the
benchmark. For example, the transport modules for the TCP/IP and ESP
protocols internally use look-up tables to map the given integers to
the file descriptors which are used for the actual communication.
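
Such a look-up table could look roughly like the following sketch. The
type \fname{struct peer\_table} and the functions operating on it are
hypothetical names chosen for illustration; the actual data structures
of the TCP/IP and ESP modules may differ.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-module state: maps peer numbers 0..n-1 to the file
 * descriptors of the connections to those peers. */
struct peer_table {
    int  n;   /* number of peers in the current run */
    int *fd;  /* fd[i] is the descriptor for peer i, or -1 if unset */
};

/* Allocates the table for n peers; returns 0 on success, -1 on error. */
int peer_table_init(struct peer_table *t, int n)
{
    assert(n > 0);
    t->fd = malloc((size_t)n * sizeof *t->fd);
    if (t->fd == NULL)
        return -1;
    t->n = n;
    for (int i = 0; i < n; i++)
        t->fd[i] = -1;
    return 0;
}

/* Registers the descriptor of the connection to the given peer. */
void peer_table_set(struct peer_table *t, int peer, int fd)
{
    assert(peer >= 0 && peer < t->n);
    t->fd[peer] = fd;
}

/* Returns the descriptor for the peer, or -1 if the peer number is
 * out of range or no connection has been registered for it. */
int peer_table_lookup(const struct peer_table *t, int peer)
{
    if (peer < 0 || peer >= t->n)
        return -1;
    return t->fd[peer];
}

void peer_table_free(struct peer_table *t)
{
    free(t->fd);
    t->fd = NULL;
    t->n = 0;
}
```

With such a table in place, the rest of the program never sees a file
descriptor; it only passes peer numbers to the module.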

The transport module is responsible for establishing connections
between the peers involved in the benchmark in a way that allows every
peer to send data to every other peer. Because socket connections are
bidirectional by default, for the TCP/IP and the ESP protocols it is
sufficient if peer $i$ makes connections to peers $[i+1, \enspace
n-1]$, with $i \in \{0, 1, \ldots, n-2\}$.
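
The following sketch illustrates this connection rule, assuming that
for every pair of peers the lower-numbered one initiates the
connection, so that peer $0$ connects to all other peers and peer
$n-1$ only accepts connections. The function names are hypothetical.

```c
#include <assert.h>

/* Returns 1 if peer `self` should actively connect to peer `other`,
 * 0 if it should wait for `other` to connect instead. */
int should_connect(int self, int other)
{
    assert(self >= 0 && other >= 0 && self != other);
    return self < other;
}

/* Number of connections peer `self` initiates among n peers, namely
 * those to peers self+1 .. n-1. */
int outgoing_connections(int self, int n)
{
    assert(self >= 0 && self < n);
    return n - 1 - self;
}

/* Total number of connections in the resulting full mesh, which
 * equals n * (n - 1) / 2. */
int total_connections(int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += outgoing_connections(i, n);
    return total;
}
```

Every unordered pair of peers is thus connected exactly once, and the
bidirectional connection serves both directions of transmission.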

The most important communication functions, which every module has to
implement, are:

\begin{itemize}
\item \fname{int sendto(int dst, void *buffer, int size)} \\
  This function sends data to another peer. It takes the peer to send
  data to, a pointer to a buffer containing the data to transmit and,
  as the last parameter, the amount of data to send. The return value
  is the number of bytes actually sent by the call, or an error code.
\item \fname{int recvfrom(int src, void *buffer, int size)} \\
  This function receives data from another peer. It is complementary
  to the \fname{sendto()} function and takes the peer to receive data
  from, a pointer to where the data should be stored and the amount of
  data to receive. Similar to the \fname{sendto()} function, it
  returns the number of bytes actually received.
\item \fname{int select(int count, int *clients, unsigned long timeout)} \\
  This mimics the POSIX \fname{select(...)} function, but with a
  simpler interface. Its purpose is to multiplex the communication
  between a receiver and several senders. It is given an array of
  senders from which incoming messages are expected and returns one of
  them from which some data has already arrived.
\end{itemize}
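
For a socket-based module, the simplified \fname{select()} could be
built on top of the POSIX function of the same name, roughly as in the
following sketch. Several details are assumptions made for
illustration: the function is called \fname{ng\_select()} to avoid a
name clash, the table mapping peers to file descriptors is passed
explicitly instead of being module-internal state, the timeout is
taken to be in milliseconds, and $-1$ is returned if no sender has
data pending.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Waits until one of the peers listed in clients[0..count-1] has data
 * pending and returns its peer number, or -1 on timeout or error.
 * fd_of_peer[] maps peer numbers to file descriptors. */
int ng_select(const int *fd_of_peer, int count, const int *clients,
              unsigned long timeout_ms)
{
    fd_set readable;
    int maxfd = -1;

    FD_ZERO(&readable);
    for (int i = 0; i < count; i++) {
        int fd = fd_of_peer[clients[i]];
        assert(fd >= 0 && fd < FD_SETSIZE);
        FD_SET(fd, &readable);
        if (fd > maxfd)
            maxfd = fd;
    }

    struct timeval tv;
    tv.tv_sec  = (time_t)(timeout_ms / 1000);
    tv.tv_usec = (suseconds_t)((timeout_ms % 1000) * 1000);

    /* POSIX select() clears the bits of all descriptors that are not
     * readable, so only peers with pending data remain set. */
    if (select(maxfd + 1, &readable, NULL, NULL, &tv) <= 0)
        return -1;

    for (int i = 0; i < count; i++)
        if (FD_ISSET(fd_of_peer[clients[i]], &readable))
            return clients[i];
    return -1;
}
```

The caller works purely with peer numbers; the translation to file
descriptors stays hidden inside the module.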

These functions are entered into a \fname{struct} as function
pointers, together with some auxiliary information. The implementation
of the communication pattern to be benchmarked then makes use of them
to perform the desired data transmissions. The lifecycle of a
transport module is as follows:

\begin{enumerate}
\item The module gets a chance to parse its supported command line
  options, if it has any, via the module-supplied \fname{getopt()}
  function.
\item The \fname{init()} function is called.  Here the module may
  check its prerequisites (e.g. if a network protocol is available)
  and set it up.  If this function returns non-zero (indicating a
  problem) the module's life ends here (in particular, the
  \fname{shutdown()} function will not be called).
\item Otherwise, a call to \fname{setup\_channels()} will instruct the
  module to set up channels between all peers. If this function
  returns non-zero (indicating an error) the next function called will
  be \fname{shutdown()}, being the last in the module's lifecycle.
\item If setting up the channels succeeds, the main program will call
  the data transmission functions in an order the module cannot
  predict, controlled by the communication pattern currently used.
\item When the communication pattern is finished with its data
  transmission phase, the main program will call \fname{shutdown()} to
  give the module a chance to free all occupied resources.
\end{enumerate}

The functions \fname{getopt()}, \fname{init()},
\fname{setup\_channels()} and \fname{shutdown()} are
optional.  This means that if such a function is not set by the module
(the function pointer is \NULL), no function call will be made and the
main program will go on as if the call had been made and had
succeeded.
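
The lifecycle rules above can be summarized in a short sketch. The
layout of \fname{struct ng\_module}, the signatures of the lifecycle
functions and the driver \fname{ng\_run\_module()} are assumptions
made for illustration and need not match the actual Netgauge sources.
The return value of \fname{getopt()} is ignored here, as its error
handling is not specified above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure a transport module fills in. */
struct ng_module {
    const char *name;
    int  (*getopt)(int argc, char **argv);
    int  (*init)(void);
    int  (*setup_channels)(void);
    void (*shutdown)(void);
    int  (*sendto)(int dst, void *buffer, int size);
    int  (*recvfrom)(int src, void *buffer, int size);
    int  (*select)(int count, int *clients, unsigned long timeout);
};

/* Runs one module through its lifecycle. `run_pattern` stands in for
 * the data transmission phase of the communication pattern. Returns 0
 * on success and non-zero if init() or setup_channels() failed. */
int ng_run_module(const struct ng_module *m, int argc, char **argv,
                  void (*run_pattern)(const struct ng_module *))
{
    assert(m != NULL);

    if (m->getopt != NULL)
        m->getopt(argc, argv);

    /* A missing init() counts as success; if init() fails, the
     * lifecycle ends without calling shutdown(). */
    if (m->init != NULL && m->init() != 0)
        return 1;

    int rc = 0;
    if (m->setup_channels != NULL && m->setup_channels() != 0)
        rc = 1;                /* skip transmission, still shut down */
    else if (run_pattern != NULL)
        run_pattern(m);

    if (m->shutdown != NULL)
        m->shutdown();
    return rc;
}

/* Helpers for trying out the driver: a step that always fails and a
 * shutdown() that counts how often it was called. */
int shutdown_calls = 0;
int always_fail(void) { return -1; }
void count_shutdown(void) { shutdown_calls++; }
```

A module that sets none of the optional pointers passes through the
driver as if all of its lifecycle functions had succeeded.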

\subsubsection{Communication Patterns}
\label{netgauge-patterns}

Communication patterns (``patterns'') implement the actual workings of
a specific benchmark. They control when a peer sends data to another
peer, when to receive data and which times and throughputs to measure.
To achieve this, patterns can make use of the functionality the rest
of Netgauge provides.

In particular, a pattern uses the transport module which was selected
for data transmission, as well as the supporting features Netgauge
provides for gathering the statistical information of the benchmark.

\subsubsection{A Communication Pattern to provoke Network Congestion}
\label{netgauge-congestion-pattern}

Having the new design of Netgauge as a foundation, it was an easy task
to implement a pattern which can be used for measurements on network
congestion, especially in a cluster environment. As already mentioned,
the effect of network congestion in a cluster is that the per-port
buffer(s) on the switch cannot hold the data coming in for the host on
the specific port and thus some packets have to be dropped. This
situation shall be provoked by the communication pattern developed
here.

To achieve this, the following scheme of operation was chosen for this
communication pattern: From the set of peers taking part in the
current run of the benchmark, one is chosen at random to act as the
server, and all other peers become clients to this server.
Communication happens only between the server and the clients; there
is no inter-client communication.

The algorithm the server executes is fairly simple. It consumes all
data it receives from the clients, and once the full message from a
client has been received, it sends a single-byte message back to that
client. The full message is sent only towards the server, with just a
single byte travelling back, because the way back to the clients is
not impaired by congestion: a full-size reply would add uncongested
transfer time to the measurement and thus obscure the true magnitude
of the network congestion effects. The only complexity on the server
side arises from the need to multiplex the communication using the
\fname{select()} function the module provides. The pseudo code for
the communication pattern is given in
figure~\ref{fig:one-many-pseudo}.  It is worth mentioning that the
sockets have to be set to non-blocking operation, as otherwise the
server would wait in line 7 for all of the data from the client whose
packet reaches the server first, instead of serving the other clients
in between. Each of these algorithms is executed in a loop several
times, and the measurements are averaged to compensate for temporal
fluctuations.
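
How a client might turn its time stamps into an averaged throughput is
sketched below. The function names are hypothetical, and it is an
assumption of this sketch that the per-round throughputs, rather than
the per-round times, are averaged.

```c
#include <assert.h>
#include <stddef.h>

/* Throughput of one round in bytes per second: the message size
 * divided by the time between starting to send and receiving the
 * server's one-byte reply. Times are given in seconds. */
double round_throughput(double bytes, double t_start, double t_end)
{
    assert(t_end > t_start);
    return bytes / (t_end - t_start);
}

/* Arithmetic mean over the throughputs of all rounds. */
double mean_throughput(const double *samples, size_t count)
{
    assert(count > 0);
    double sum = 0.0;
    for (size_t i = 0; i < count; i++)
        sum += samples[i];
    return sum / (double)count;
}
```

Since the reply is a single byte, the measured interval is dominated
by the transfer of the full message towards the server.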


\begin{figure}
  \centering
  \lstset{language=C, frame=single, numbers=left, stepnumber=3}
\begin{lstlisting}[caption=Server side]
while (!<enough data received from all clients>)
{
  /* select() from the clients */
  client = select(<set of clients>);

  /* receive client's data */
  size = recvfrom(client, <buffer to recv to>,
     <data size of current run>);

  <record how much data was received from the client>

  if (<this client is done>) {
    /* ping back */
    sendto(client, <buffer>, 1);
  }
}
\end{lstlisting}

\begin{lstlisting}[caption=Client side]
<take before-starting time>

/* send data to server */
sendto(<server>, <buffer>, <data size of current run>);

/* wait for ping */
recvfrom(<server>, <buffer>, 1);

<take after-receiving time>
<compute and store throughput>
\end{lstlisting}
  \caption[Pseudo code of the one-many pattern]{Pseudo code of the
    one-many communication pattern as developed for Netgauge to
    generate a network congestion situation.}
  \label{fig:one-many-pseudo}
\end{figure}
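
The bookkeeping hidden behind the placeholders on the server side can
be sketched as follows. With non-blocking sockets a message may arrive
in several pieces, so the server has to remember how much each client
has delivered so far. The type \fname{struct server\_state} and its
functions are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-run server state. */
struct server_state {
    int   nclients;
    long  msg_size;   /* bytes every client sends in this run */
    long *received;   /* received[c]: bytes received from client c */
    int   done;       /* number of clients whose message is complete */
};

/* Returns 0 on success, -1 if memory could not be allocated. */
int server_state_init(struct server_state *s, int nclients, long msg_size)
{
    assert(nclients > 0 && msg_size > 0);
    s->received = calloc((size_t)nclients, sizeof *s->received);
    if (s->received == NULL)
        return -1;
    s->nclients = nclients;
    s->msg_size = msg_size;
    s->done = 0;
    return 0;
}

/* Records `size` bytes received from `client`. Returns 1 exactly once
 * per client, namely when its message becomes complete and the
 * one-byte reply should be sent; returns 0 otherwise. */
int server_account(struct server_state *s, int client, long size)
{
    assert(client >= 0 && client < s->nclients && size >= 0);
    long before = s->received[client];
    s->received[client] += size;
    if (before < s->msg_size && s->received[client] >= s->msg_size) {
        s->done++;
        return 1;
    }
    return 0;
}

/* Termination condition of the server loop. */
int server_finished(const struct server_state *s)
{
    return s->done == s->nclients;
}
```

In the server loop, \fname{server\_account()} would be called after
each \fname{recvfrom()}, and \fname{server\_finished()} would form the
loop condition.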


\subsubsection{Results and Comparison}

\begin{figure}
  \centering
  \subfigure[Old implementation of Netgauge] {
    \includegraphics[width=0.85\textwidth]{graphics/netgauge/ng-old-tcpip.pdf}
  }

  \subfigure[Pattern ``one-one'' from the new implementation] {
    \includegraphics[width=0.85\textwidth]{graphics/netgauge/ng-new-tcpip.pdf}
  }
  \caption[Comparison of old and new Netgauge]{This plot shows the
    comparison results for the old and the new implementation of
    Netgauge, using the TCP/IP protocol on cluster 1.}
  \label{fig:cmp-ng-tcp}
\end{figure}

With the possibility to implement custom communication patterns,
Netgauge's fields of application have multiplied. Because a
communication pattern is independent of the networking protocol and
even of the hardware used, Netgauge is well suited for comparing
different interconnection techniques. Conversely, the transport module
being independent of the communication pattern makes it easy to
measure different aspects of networking performance while having to
implement a transport module only once.

An important aspect when making changes to a benchmark tool is the
comparability of the measured results. To validate the results of the
new Netgauge, several test runs were performed, each once with the old
version of Netgauge and once with the new one. For the new
implementation the communication pattern ``one-one'' was created,
which mimics the behavior of the old implementation. Some of the
results of these test runs are shown in figure~\ref{fig:cmp-ng-tcp},
and the results for the one-many pattern are shown in
section~\ref{ca-stress-test}.

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "main"
%%% IspellDict: "english"
%%% End: 

