Machine Learning Benchmark.

10.11.2022

Written by Jakob Buschhaus

Cloud&Heat operates a wide variety of hardware setups and provisions computing resources via the security-hardened cloud operating system SecuStack. We also continuously optimize our infrastructure for the execution of Machine Learning (ML) workloads. Depending on the use case, we provide different hardware setups by combining, e.g., V100, A100, A10 or T4 graphics cards with NVMe block storage, HDD block storage, or a local SSD. To compare different hardware configurations, we use two separate MLCommons benchmarks. These benchmarks test the overall performance of the entire setup rather than only individual components. The first is a training benchmark, which trains an ML model for image segmentation. The second is an inference benchmark, which determines the turnaround time of an ML process for image classification and recognition.

Training Benchmark

The benchmark was tested on Ubuntu 20.04 and represents a 3D medical image segmentation task. The model is a variant of the U-Net3D model, based on the paper “No New-Net”. The KiTS19 dataset from the Kidney Tumor Segmentation Challenge 2019 is used to train the model.

Overview of the implementation of the benchmark:

First, the Nvidia driver, Nvidia Container Toolkit, Nvidia Docker2 and Docker must be installed.

The corresponding repository must be downloaded from GitHub:

git clone https://github.com/mmarcinkiewicz/training.git

The image for the Docker container is built from the existing Dockerfile:

docker build -t unet-3d .

Download the KiTS19 dataset.

The data should then be structured as follows:

data
|__ case_00000
|        |__ imaging.nii.gz
|        |__ segmentation.nii.gz
|__ case_00001
|        |__ imaging.nii.gz
|        |__ segmentation.nii.gz
...
|__ case_00209
|        |__ imaging.nii.gz
|        |__ segmentation.nii.gz
|__ kits.json
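
Before starting the container, it can help to confirm that the download matches this layout. The following is a minimal sketch, not part of the benchmark itself: the function name `find_layout_problems` is hypothetical, and the case range 0 to 209 is taken from the listing above.

```python
from pathlib import Path

# File names expected inside every case directory (from the listing above).
EXPECTED_FILES = ("imaging.nii.gz", "segmentation.nii.gz")


def find_layout_problems(data_dir, first=0, last=209):
    """Return the relative paths that are missing from the expected layout."""
    root = Path(data_dir)
    missing = []
    for index in range(first, last + 1):
        case_dir = root / f"case_{index:05d}"
        for name in EXPECTED_FILES:
            if not (case_dir / name).is_file():
                missing.append(f"{case_dir.name}/{name}")
    # kits.json sits next to the case directories.
    if not (root / "kits.json").is_file():
        missing.append("kits.json")
    return missing
```

An empty list means every case directory contains both files and `kits.json` is present.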

Then

sudo docker run

is used to start an interactive session in the container.

Various directories from the host are mounted into the container so that the previously downloaded data is accessible there.
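
As a rough illustration of how such a call could be assembled, the sketch below builds a `docker run` command with bind mounts. The options `-it`, `--rm`, `--gpus` and `-v` are standard `docker run` options, but this post does not state which ones the benchmark prescribes, so treat the selection as an assumption. The helper name `docker_run_command`, the host paths and the container mount points are hypothetical placeholders.

```python
import shlex


def docker_run_command(image, mounts, use_sudo=True):
    """Build an interactive `docker run` call with one bind mount per entry."""
    command = ["sudo"] if use_sudo else []
    # -it: interactive terminal, --rm: remove the container on exit,
    # --gpus all: expose all GPUs (assumes the Nvidia Container Toolkit).
    command += ["docker", "run", "-it", "--rm", "--gpus", "all"]
    for host_dir, container_dir in mounts.items():
        command += ["-v", f"{host_dir}:{container_dir}"]
    command += [image, "/bin/bash"]
    return command


# Hypothetical host directories and mount points, for illustration only.
example_mounts = {
    "/home/user/kits19/data": "/raw_data",
    "/home/user/kits19/preprocessed": "/data",
    "/home/user/kits19/results": "/results",
}
print(shlex.join(docker_run_command("unet-3d", example_mounts)))
```

Printing the command rather than executing it makes it easy to review the mounts before starting the session.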

The previously downloaded data is prepared in the container for the ML process and stored as a NumPy array.

Finally, the ML model is trained in the container:

bash run_and_time.sh <seed>

Run this command for each seed in the range {1..9}. Training should converge to the target accuracy of 0.908.

The quality metric is the mean (composite) Dice score for classes 1 (kidney) and 2 (kidney tumor).
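
To make the metric concrete, here is a small pure-Python sketch of a mean Dice score over the two classes. It works on flat lists of voxel labels; the benchmark itself operates on 3D tensors and its exact formulation (for example smoothing terms) may differ. Returning 1.0 when a class is absent from both prediction and target is an assumption of this sketch.

```python
def dice_score(prediction, target, cls):
    """Dice = 2 * |P and T| / (|P| + |T|) for a single class label."""
    pred_mask = [label == cls for label in prediction]
    target_mask = [label == cls for label in target]
    intersection = sum(p and t for p, t in zip(pred_mask, target_mask))
    denominator = sum(pred_mask) + sum(target_mask)
    # Assumption: a class absent from both inputs counts as a perfect match.
    return 1.0 if denominator == 0 else 2.0 * intersection / denominator


def mean_dice(prediction, target, classes=(1, 2)):
    """Average the per-class Dice scores (1 = kidney, 2 = kidney tumor)."""
    return sum(dice_score(prediction, target, c) for c in classes) / len(classes)


# Toy example with six voxels: class 1 scores 2/3, class 2 scores 4/5.
predicted = [0, 1, 1, 2, 2, 0]
reference = [0, 1, 2, 2, 2, 0]
print(round(mean_dice(predicted, reference), 4))  # 0.7333
```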

Afterwards, we measure the training time to compare the different hardware setups.
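
The seed loop and the timing can be sketched as follows. This is an illustrative wrapper, not part of the benchmark: the names `build_command` and `time_runs` are hypothetical, and the sketch assumes it is executed inside the container in the directory that contains `run_and_time.sh`.

```python
import subprocess
import time

SEEDS = range(1, 10)  # seeds 1..9, as described above


def build_command(seed):
    """Command line for one training run with the given seed."""
    return ["bash", "run_and_time.sh", str(seed)]


def time_runs(seeds=SEEDS, runner=subprocess.run):
    """Run the training once per seed and return wall-clock seconds per seed."""
    durations = {}
    for seed in seeds:
        start = time.perf_counter()
        # check=True raises an error if a training run exits with a failure.
        runner(build_command(seed), check=True)
        durations[seed] = time.perf_counter() - start
    return durations
```

Calling `time_runs()` returns a dictionary mapping each seed to its duration, which can then be compared across hardware setups. The `runner` parameter exists only so that the loop can be exercised without starting a real training run.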

Inference Benchmark

The benchmark was tested on Ubuntu 20.04. It uses the COCO (Common Objects in Context) dataset, introduced in the paper “Microsoft COCO: Common objects in context”. The model is ssd-resnet34, which assigns images from the dataset to a category, e.g. “bottle”. The ssd-resnet-34-1200-onnx model is a multiscale SSD based on the ResNet-34 backbone network and is intended for object detection. It was pre-trained on the COCO image dataset in the PyTorch framework and converted to ONNX format.

Overview of the implementation of the benchmark:

This benchmark is not executed in a container like the previous one, so besides Torch and NumPy, only CUDA must be installed as a basic requirement.

The corresponding repository must be downloaded from GitHub:

git clone https://github.com/mlperf/inference.git

ONNX Runtime ( https://github.com/microsoft/onnxruntime ) is used as the ML accelerator.

The benchmark is then set up by executing various Python scripts as defined by the benchmark.

The ML model can be downloaded directly in the terminal.

Annotation and validation data are downloaded:
- Annotation data is the label data used to check the accuracy of the model (.json format)
- Validation data is the data the model actually processes for evaluation (.jpg format)

The resolution of the data is increased to 1200x1200, which is a requirement for the ssd-resnet34 model (new format of the validation data: .png).

The environment variables MODEL_DIR and DATA_DIR are set so that the executed bash script knows where the model and the preprocessed COCO-1200 dataset are located.

Run the benchmark in the shell with

sudo -E ./run_local.sh onnxruntime ssd-resnet34 gpu

- Backends other than ONNX Runtime can also be used
- Models other than ssd-resnet34 are applicable
- The benchmark can also run on CPU
- The bash script accepts a variety of options (e.g. --count, --time, --accuracy)
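
The last two steps, setting the environment variables and starting the script, can be sketched in Python as shown below. The helper name `build_inference_run` and the example paths are hypothetical. The sketch also leaves out the `sudo -E` prefix used above and passes the environment explicitly instead.

```python
import os


def build_inference_run(model_dir, data_dir, backend="onnxruntime",
                        model="ssd-resnet34", device="gpu", extra_args=()):
    """Return the run_local.sh command and the environment it should run in."""
    env = dict(os.environ)
    # The script locates the model and the preprocessed dataset through these.
    env["MODEL_DIR"] = str(model_dir)
    env["DATA_DIR"] = str(data_dir)
    command = ["./run_local.sh", backend, model, device, *extra_args]
    return command, env


# Placeholder paths, for illustration only.
command, env = build_inference_run("/path/to/models", "/path/to/coco-1200",
                                   extra_args=("--accuracy",))
print(" ".join(command))
```

The returned pair can be handed to `subprocess.run(command, env=env, check=True)` from the directory that contains `run_local.sh`. Swapping the `device` argument to `cpu` corresponds to the CPU variant mentioned in the list.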

Functioning of the benchmark:

1. Benchmark knows the model, dataset, and preprocessing.

2. Benchmark hands dataset sample IDs to LoadGen.

3. LoadGen starts generating queries of sample IDs.

4. Benchmark creates requests to backend.

5. Result is post-processed and forwarded to LoadGen.

6. LoadGen outputs logs for analysis.
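
The six steps can be mimicked with a toy simulation. The class `ToyLoadGen` and the helper functions below are hypothetical stand-ins written for this illustration; they are not the API of the real LoadGen library, and the "model" simply doubles a number.

```python
import time


class ToyLoadGen:
    """Stand-in for LoadGen: issues one query per sample ID and logs latency."""

    def __init__(self, sample_ids):
        self.sample_ids = list(sample_ids)  # step 2: IDs handed over
        self.log = []

    def run(self, issue_query):
        for sample_id in self.sample_ids:  # step 3: generate queries
            start = time.perf_counter()
            result = issue_query(sample_id)
            latency = time.perf_counter() - start
            self.log.append({"sample_id": sample_id, "result": result,
                             "latency_s": latency})
        return self.log  # step 6: log entries for analysis


def make_issue_query(dataset, backend, postprocess):
    """Step 1: the benchmark knows the dataset, model and processing."""
    def issue_query(sample_id):
        raw = backend(dataset[sample_id])  # step 4: request to the backend
        return postprocess(raw)            # step 5: post-process the result
    return issue_query


def fake_backend(sample):
    """Hypothetical model: doubles its input."""
    return sample * 2


dataset = {0: 10, 1: 20}
loadgen = ToyLoadGen(dataset.keys())
log = loadgen.run(make_issue_query(dataset, fake_backend, str))
print([entry["result"] for entry in log])  # ['20', '40']
```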

GitHub Training Benchmark

GitHub Inference Benchmark

If you are interested in a benchmark of your own compute infrastructure, don’t hesitate to contact us.
