Gabriel Dax

Machine Learning Research Engineer — CUDA · TensorRT · Distributed Training

I build and optimize deep learning systems end-to-end: from custom CUDA kernels and multi-node distributed training (32 GPUs / DDP / NCCL) to production inference with TensorRT. Currently at Fraunhofer IIS (Munich), working on industrial defect detection, model compression, and vision-language models. PhD from TU Munich — Algorithmic Information Theory in Spatial ML.

CUDA · C++20 · PyTorch DDP · TensorRT · NCCL · NVIDIA DALI · Slurm HPC · Model Compression · CLIP / SAM