GPU ML Engineer – HPC & Distributed Systems

Oriole Networks Ltd

City of London

On-site

GBP 60,000 - 80,000

Full time

28 days ago

Job summary

A technology company in the UK is seeking a talented Machine Learning Engineer to improve the performance of its AI/ML software stack. The role involves designing custom GPU communication kernels and collaborating on large-scale deep learning models, with a strong emphasis on GPU programming and optimization. Applicants should be proficient in C++, Python, and CUDA programming, and have expertise in deep learning frameworks.

Qualifications

Expertise in high-performance computing or machine learning projects.
Strong understanding of GPU memory hierarchies and kernel optimization.
Solid experience in deploying and optimizing distributed deep learning workloads.

Responsibilities

Design and optimize custom GPU communication kernels.
Develop distributed communication frameworks for deep learning models.
Collaborate with hardware teams for integration of optimized kernels.

Skills

C++

Python

CUDA programming

GPU debugging

Knowledge of communication libraries

Distributed deep learning frameworks

Tools

CUDA-GDB

CUDA-MEMCHECK

Nsight Systems

Docker

Kubernetes

SLURM
