Senior DevTech Engineer - Windows LLM and GenAI Open-Source Ecosystem

This job is offline

Requirements:

  • 5+ years of professional experience in local GPU deployment, profiling, and optimization
  • BS or MS degree in Computer Science, Engineering, or related degree
  • Strong proficiency in C/C++ and Python, software design, and programming techniques
  • Familiarity with and development experience on the Windows operating system
  • Proven theoretical understanding of Transformer architectures - specifically LLMs and Generative AI - and convolutional neural networks
  • Experience working with open-source LLM and GenAI software, e.g., PyTorch or llama.cpp
  • Experience with CUDA and NVIDIA's Nsight GPU profiling and debugging suite
  • Strong verbal and written communication skills in English, plus organizational skills, with a logical approach to problem-solving, time management, and task prioritization
  • Excellent interpersonal skills
  • Some travel is required for conferences and for on-site visits with external partners

What you'll be doing:

  • Improve the Windows LLM & GenAI user experience on NVIDIA RTX by working on feature and performance enhancements to open-source software (OSS), including but not limited to projects like PyTorch, llama.cpp, and ComfyUI
  • Engage with internal product teams and external OSS maintainers to align on and prioritize OSS enhancements
  • Work closely with internal engineering teams and external app developers on solving local end-to-end LLM & Generative AI GPU deployment challenges, using techniques like quantization or distillation
  • Apply powerful profiling and debugging tools to analyze the most demanding GPU-accelerated end-to-end AI applications and detect insufficient GPU utilization that results in suboptimal runtime performance
  • Conduct hands-on training sessions, develop sample code, and host presentations that give guidance on efficient end-to-end AI deployment targeting optimal runtime performance
  • Guide developers of AI applications in applying methodologies for efficient adoption of DL frameworks, targeting maximal utilization of GPU Tensor Cores for the best possible inference performance
  • Collaborate with GPU driver and architecture teams as well as NVIDIA research to influence next-generation GPU features by providing real-world workflows and giving feedback on partner and customer needs

Nice to haves:

  • Experience with GPU-accelerated AI inference driven by NVIDIA APIs, specifically cuDNN, CUTLASS, TensorRT
  • Confirmed expert knowledge in Vulkan and/or DX12
  • Familiarity with WSL2, Docker
  • Detailed knowledge of the latest generation GPU architectures
  • Experience with AI deployment on NPUs and ARM architectures

Perks and Benefits:

  • Highly competitive salaries
  • Extensive benefits package
  • Work environment that promotes diversity, inclusion, and flexibility
  • Equal opportunity employer committed to fostering a supportive and empowering workplace for all

NVIDIA

Germany

Experience: Senior
Posted: November 28, 2024
Cpp
Docker
Python
machinelearning
