Vacancy No. IPE 01-20

IPE 01-20 Internship or Master Thesis: Optimizing network performance for future data acquisition systems

Job description

Progress in virtually all areas of physics research relies on recording and analyzing enormous amounts of data. This is equally true for high-energy physics at the LHC, for planned future lepton and neutrino detectors, and for experiments at high-intensity light sources such as the EU-XFEL or PETRA III. Recent improvements in detector instrumentation provide unprecedented detail to researchers. At the same time, data rates far outpace improvements in the performance of storage systems. Online data reduction is therefore crucial for the next generation of detectors.

We aim to integrate data acquisition workflows more closely with cloud-enabled HPC centers. The goal of this work is to build infrastructure that pushes data from the detector directly into the local HPC data center and relies on HPC resources for data processing and reduction. Rapid advances in Ethernet technology provide sufficient readout bandwidth, but data distribution methods relying on RDMA technologies are required to use the network capacity efficiently. One of the challenges is to design an efficient protocol that facilitates communication between the DAQ hardware and the data-processing cluster and simplifies the development of scalable data reduction modules. As a pilot project, we aim to enable the deployment of extremely complex machine learning models that can be executed across multiple nodes and accelerated using FPGAs, GPUs, and/or custom neuro-computers. We aim to enable real-time data reduction and classification of data streams with rates in the range of 10 to 20 GB/s per detector (multi-detector systems are envisaged).

The student is expected to perform a subset of the following tasks:

Benchmark high-speed communication protocols, e.g. UDP, SCTP, QUIC. Research available high-throughput alternatives to the standard Linux network stack, e.g. DPDK or LibVMA.

The latest Mellanox adapters allow part of the packet processing to be offloaded to hardware. Investigate the provided features and assess whether they can be used to further increase network throughput.

Evaluate available RDMA technologies, e.g. RoCE or iWARP, and their extensions for delivering data directly to compute accelerators such as FPGAs or GPUs.

Design an application-layer protocol integrating Ethernet-connected detectors with data-processing clusters. The protocol should include a control channel for setting and reading detector parameters (registers) and a high-speed data streaming channel.
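To illustrate the kind of protocol meant here, the following is a minimal sketch of a message framing that carries both register access (control channel) and detector data (streaming channel). The header layout, the magic value, the message type codes and all function names are assumptions made for illustration only; they are not part of any existing IPE protocol.

```python
import struct

# Hypothetical wire format (an assumption, not an existing protocol):
# magic (2 bytes), message type (1 byte), flags (1 byte),
# sequence number (4 bytes), payload length (4 bytes), network byte order.
HEADER = struct.Struct("!HBBII")
MAGIC = 0xDA01  # arbitrary illustrative value

# Hypothetical message types
MSG_REG_READ = 1   # control channel: read a detector register
MSG_REG_WRITE = 2  # control channel: write a detector register
MSG_DATA = 3       # streaming channel: block of detector data


def pack_message(msg_type, seq, payload=b"", flags=0):
    """Prepend the fixed-size header to a payload."""
    return HEADER.pack(MAGIC, msg_type, flags, seq, len(payload)) + payload


def unpack_message(buf):
    """Parse one message and return (type, flags, seq, payload)."""
    if len(buf) < HEADER.size:
        raise ValueError("truncated header")
    magic, msg_type, flags, seq, length = HEADER.unpack_from(buf)
    if magic != MAGIC:
        raise ValueError("bad magic")
    payload = buf[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return msg_type, flags, seq, payload


def reg_write(seq, address, value):
    """Control message: set a 32-bit register to a 32-bit value."""
    return pack_message(MSG_REG_WRITE, seq, struct.pack("!II", address, value))


def reg_read(seq, address):
    """Control message: request the value of a 32-bit register."""
    return pack_message(MSG_REG_READ, seq, struct.pack("!I", address))


def data_packet(seq, samples):
    """Streaming message: opaque block of detector data."""
    return pack_message(MSG_DATA, seq, samples)
```

The sequence number in the header would let a receiver detect lost or reordered datagrams on an unreliable transport such as UDP. A real design would also have to address acknowledgement of control messages and fragmentation of large frames, which this sketch leaves out.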

Evaluate different methods to scale data flow across multiple cluster nodes. Assess scalability potential, fault tolerance, costs and simplicity of implementation.
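As a starting point for such a comparison, the sketch below shows three simple ways to assign incoming frames to cluster nodes. The function names and the idea of identifying frames by an integer id or streams by a byte-string key are assumptions made for illustration; an actual evaluation would likely also cover hardware-assisted distribution.

```python
import zlib


def round_robin_node(frame_id, num_nodes):
    """Spread consecutive frames evenly over all nodes."""
    return frame_id % num_nodes


def hashed_node(stream_key, num_nodes):
    """Pin all frames of one stream (bytes key) to the same node.

    CRC32 is used because it is stable across processes and hosts.
    """
    return zlib.crc32(stream_key) % num_nodes


def surviving_node(frame_id, nodes, failed):
    """Round-robin over the nodes that are still alive (simple failover)."""
    alive = [n for n in nodes if n not in failed]
    if not alive:
        raise RuntimeError("no processing nodes available")
    return alive[frame_id % len(alive)]
```

Round-robin balances load well but scatters the frames of one stream across nodes. Hashing keeps each stream on one node at the cost of possible imbalance. The failover variant remaps frames when a node drops out, which is simple to implement but moves work between the remaining nodes.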

Personal qualification

Required skills: Good understanding of networking and socket programming. Strong knowledge of the C or C++ programming language. A deep understanding of the Linux network stack and prior experience with DPDK and/or RDMA technologies are a plus.

Organizational unit

Institute for Data Processing and Electronics (IPE)

Starting date

on appointment

Contract Duration

according to the study regulations

Contact person in line-management

Suren Chilingaryan, suren.chilingaryan@kit.edu, Phone: +49 721 / 608 26579

Andreas Kopmann, andreas.kopmann@kit.edu, Phone: +49 721 / 608 24910

Application

Please apply online using the button below for this vacancy number IPE 01-20.
Personnel support is provided by Ms Schaber, phone: +49 721 608-25184.

Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen, Germany

If qualified, severely disabled persons will be preferred.