Senior HPC Performance Engineer

Requirements

  • M.S. (or equivalent experience) or PhD in Computer Science or a related field, with relevant performance engineering and HPC experience
  • 3+ years of experience with parallel programming and at least one communication runtime (MPI, NCCL, UCX, NVSHMEM)
  • Experience conducting performance benchmarking and triage on large scale HPC clusters
  • Good understanding of computer system architecture, HW-SW interactions and operating systems principles (aka systems software fundamentals)
  • Ability to implement micro-benchmarks in C/C++ and to read and modify the code base when required
  • Ability to debug performance issues across the entire HW/SW stack. Proficiency in a scripting language, preferably Python
  • Familiar with containers, cloud provisioning and scheduling tools (Kubernetes, SLURM, Ansible, Docker)
  • Adaptability and passion for learning new areas and tools. Flexibility to work and communicate effectively across different teams and time zones

Nice to Haves

  • Practical experience with InfiniBand/Ethernet networks in areas like RDMA, topologies, and congestion control
  • Experience debugging network issues in large scale deployments
  • Familiarity with CUDA programming and/or GPUs
  • Experience with deep learning frameworks such as PyTorch and TensorFlow

What you'll be doing

  • Conduct in-depth performance characterization and analysis on large multi-GPU and multi-node clusters
  • Study the interaction of our libraries with all HW (GPU, CPU, Networking) and SW components in the stack
  • Evaluate proofs of concept and conduct trade-off analysis when multiple solutions are available
  • Triage and root-cause performance issues reported by our customers
  • Collect large volumes of performance data; build tools and infrastructure to visualize and analyze the information
  • Collaborate with a very dynamic team across multiple time zones

Perks and Benefits

  • Highly competitive salaries
  • Extensive benefits package
  • Work environment that promotes diversity, inclusion, and flexibility

NVIDIA

Remote - Germany (Remote)

Experience: Senior
Posted: May 23, 2025
Docker
Golang
Kubernetes
Nodejs
Python
dataengineering
