BS, MS, or PhD in Computer Science, AI, Applied Math, or a related field, or equivalent experience.
10+ years of hands-on experience in AI for natural language processing (NLP) and large language models (LLMs).
Strong problem-solving, debugging, performance analysis, test design, and documentation skills.
Solid mathematical foundations and expertise in AI/DL algorithms.
Excellent written and verbal communication skills, with the ability to work both independently and collaboratively in a fast-paced environment.
Nice to Have:
Experience evaluating LLM accuracy against public benchmarks (e.g., the Open LLM Leaderboard or HELM).
Hands-on experience with inference and deployment stacks such as TensorRT, ONNX, or Triton Inference Server.
Passion for DevOps/MLOps practices in deep learning product development.
Experience running large-scale workloads in high-performance computing (HPC) clusters.
Strong understanding of Linux environments and containerization technologies like Docker.
What You'll Be Doing:
Collaborate closely with our partners and the open-source community to deliver flagship models as highly optimized NVIDIA Inference Microservices (NIM).
Research and develop innovative deep learning methodologies to accurately evaluate new model families across diverse domains.
Analyze, influence, and enhance AI/DL libraries, frameworks, and APIs, ensuring consistency with engineering best practices.
Research, prototype, and build robust tools and infrastructure pipelines to support our ground-breaking AI initiatives.
Perks and Benefits:
Outstanding opportunity to craft the future of AI at a fast-growing company.
Work with world-class software engineers and partners.
Deliver the most advanced models with lightning-fast inference.
Access to the most powerful, enterprise-grade GPU clusters capable of hundreds of PetaFLOPS.
Gain early access to unreleased hardware, helping shape NVIDIA's roadmap and the broader AI landscape.