DevTech Engineer - Windows LLM and GenAI Open-Source Ecosystem

This job is offline

Requirements

  • 5+ years of professional experience in local GPU deployment, profiling, and optimization.
  • BS or MS degree in Computer Science, Engineering, or related degree.
  • Strong proficiency in C/C++ and Python, as well as in software design and programming techniques.
  • Familiarity with and development experience on the Windows operating system.
  • Proven theoretical understanding of Transformer architectures, specifically LLMs and Generative AI, and convolutional neural networks.
  • Experience working with open-source LLM and GenAI software, e.g., PyTorch or llama.cpp.
  • Experience with CUDA and NVIDIA's Nsight GPU profiling and debugging suite.
  • Strong verbal and written communication skills in English, good organizational skills, and a logical approach to problem-solving, time management, and task prioritization.
  • Excellent interpersonal skills.
  • Some travel required for conferences and on-site visits with external partners.

Nice to Haves

  • Experience with GPU-accelerated AI inference driven by NVIDIA APIs, specifically cuDNN, CUTLASS, TensorRT.
  • Proven expert knowledge of Vulkan and/or DX12.
  • Familiarity with WSL2 and Docker.
  • Detailed knowledge of the latest generation GPU architectures.
  • Experience with AI deployment on NPUs and ARM architectures.

What You'll Be Doing

  • Improve the Windows LLM & GenAI user experience on NVIDIA RTX by working on feature and performance enhancements of open-source software (OSS), including but not limited to projects like PyTorch, llama.cpp, and ComfyUI.
  • Engage with internal product teams and external OSS maintainers to align on and prioritize OSS enhancements.
  • Work closely with internal engineering teams and external app developers on solving local end-to-end LLM & Generative AI GPU deployment challenges, using techniques like quantization or distillation.
  • Apply powerful profiling and debugging tools to analyze the most demanding GPU-accelerated end-to-end AI applications and detect insufficient GPU utilization that results in suboptimal runtime performance.
  • Conduct hands-on training, develop sample code, and host presentations to provide guidance on efficient end-to-end AI deployment targeting optimal runtime performance.
  • Guide developers of AI applications in applying methodologies for efficient adoption of DL frameworks, targeting maximal utilization of GPU Tensor Cores for the best possible inference performance.
  • Collaborate with GPU driver and architecture teams as well as NVIDIA research to influence next-generation GPU features by providing real-world workflows and giving feedback on partner and customer needs.

Perks and Benefits

NVIDIA is at the forefront of breakthroughs in Artificial Intelligence, High-Performance Computing, and Visualization. Our teams are composed of driven, innovative professionals dedicated to pushing the boundaries of technology. We offer highly competitive salaries, an extensive benefits package, and a work environment that promotes diversity, inclusion, and flexibility. As an equal opportunity employer, we are committed to fostering a supportive and empowering workplace for all.


NVIDIA

Germany

Experience: Senior
Posted: November 28, 2024
Tags: C++, Docker, Python, machine learning
