Dipl.-Inf. Torsten Hoefler

Evaluation of publicly available Barrier-Algorithms and Improvement of the Barrier-Operation for large-scale Cluster-Systems with special Attention on InfiniBand Networks

Note

Please always use the following URL when citing:

http://nbn-resolving.de/urn:nbn:de:swb:ch1-200500738

Abstract (English)

The MPI_Barrier collective operation, part of the MPI-1.1 standard, is
critical to the performance of every parallel application that uses it.
The latency of this operation adds directly to the application run time
and cannot be overlapped with computation. Unsatisfactory barrier
latency can therefore degrade overall MPI performance. The main goals of
this work are to lower the barrier latency for InfiniBand networks by
analyzing well-known barrier algorithms with regard to their suitability
for InfiniBand networks, to enhance the barrier operation by using
standard InfiniBand operations as much as possible, and to design a
constant-time barrier for InfiniBand with special hardware support.
This three-part structure is retained throughout the thesis. The first
part evaluates publicly known models and proposes a new, more accurate
model (LoP) for InfiniBand. All barrier algorithms are evaluated within
both the well-known LogP model and this new model. Two new algorithms
that promise better performance have been developed. A constant-time
barrier integrated into InfiniBand and a cheap separate barrier network
are proposed in the hardware section. All results have been implemented
inside the Open MPI framework. This work led to three new Open MPI
collective modules. The first implements different barrier algorithms,
which are dynamically benchmarked and selected during the startup phase
to maximize performance. The second offers a special barrier
implementation for InfiniBand with RDMA and performs up to 40% better
than the best previously published solution. The third offers a
constant-time barrier in a separate network, built from commodity
components, with a latency of only 2.5 microseconds. Each component has
its own strengths and can be used to improve barrier performance
significantly.
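
To illustrate the kind of analysis the abstract describes, the following is a minimal sketch of one well-known barrier algorithm, the dissemination barrier, together with a simplified LogP-style cost estimate. The abstract does not say which algorithms the thesis evaluates, so the choice of algorithm is an assumption. The function names and the per-round cost of L + 2o (one send overhead, one network latency, one receive overhead, with the gap g ignored) are also illustrative assumptions, not the thesis's LoP model or its Open MPI code.

```python
def dissemination_rounds(p):
    """Rounds needed by a dissemination barrier: ceil(log2(p)) for p >= 1."""
    return (p - 1).bit_length()


def simulate_dissemination(p):
    """Track which ranks' arrivals each rank has learned about."""
    known = [{rank} for rank in range(p)]
    for r in range(dissemination_rounds(p)):
        dist = 1 << r
        # All messages of a round are sent before any of them is processed.
        snapshot = [set(s) for s in known]
        for rank in range(p):
            # In round r, each rank receives from (rank - 2^r) mod p.
            known[rank] |= snapshot[(rank - dist) % p]
    return known


def logp_estimate(p, L, o):
    """Simplified LogP cost (assumption): one message, o + L + o, per round."""
    return dissemination_rounds(p) * (L + 2 * o)
```

Calling `simulate_dissemination(p)` returns, for every rank, the set of ranks it has heard from; the barrier is complete when each set contains all `p` ranks. The estimate grows logarithmically with the number of processes, which is the behavior a constant-time hardware barrier is meant to avoid.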

Further metadata

Keywords: Barrier, Collective Operations, Collectives, InfiniBand, Kollektive Operationen, LoP Modell, LogGP, LogGPC, LogP, MPI_Barrier, Open MPI, RDMA
SWD subject headings: Cluster Server; MPI <Schnittstelle>; Netzwerk <Graphentheorie>
DDC classification: 004
Institution(s)
University: TU Chemnitz
Faculty: Faculty of Computer Science (Fakultät für Informatik)
Advisors: Dipl.-Inf. Torsten Mehlan, Dipl.-Inf. Frank Mietke
Reviewer: Prof. Dr.-Ing. Wolfgang Rehm
Document type: Diploma thesis (Diplomarbeit)
Language: English
Date of submission (to the faculty): 01.04.2005
Publication date (online): 28.06.2005
Persistent URN: urn:nbn:de:swb:ch1-200500738
