Senior HPC Performance Engineer, Multi-GPU Debugging

NVIDIA Corporation

Remote

USD 60,000 - 140,000

Full time

Today

Job summary

A leading tech company is seeking an experienced engineer specializing in performance analysis to work on multi-GPU clusters. The role involves conducting performance characterization, evaluating candidate solutions, and developing tools for data visualization. The ideal candidate will hold a master's degree or PhD, have extensive knowledge of parallel programming, and have experience with CUDA and various communication runtimes. Join us to be part of a dynamic team pushing the boundaries of AI technology.

Benefits

Highly competitive salaries
Extensive benefits package
Promotes diversity and inclusion

Qualifications

  • 3+ years of experience with parallel programming and at least one communication runtime.
  • Good understanding of computer system architecture and HW-SW interactions.
  • Experience debugging performance issues across the HW/SW stack.

Responsibilities

  • Conduct in-depth performance characterization and analysis on large multi-GPU clusters.
  • Evaluate proof-of-concepts; conduct trade-off analysis when multiple solutions are available.
  • Collect performance data; build tools to visualize and analyze the information.

Skills

Parallel programming
Performance benchmarking
CUDA programming
Scripting (Python)
Cloud provisioning

Education

M.S. or PhD in Computer Science or related field

Tools

MPI
NCCL
UCX
Kubernetes
SLURM
Docker