Principal Engineer - Systems for ML Inference and Training Optimization, Deep Science for Systems and Services
Requirements
10+ years of software development experience with demonstrated progression in technical leadership and impact.
Expert-level proficiency in C/C++ and low-level systems programming with proven track record of delivering order-of-magnitude (10× or greater) performance improvements in production systems.
Extensive experience with CUDA programming, GPU architecture, assembly-level optimization (e.g., NVIDIA PTX), and kernel development across multiple hardware platforms.
Demonstrated ability to lead organization-level technical initiatives spanning multiple teams, building consensus on contentious technical decisions and driving architectural strategy.
Nice to Have
Master's degree (or higher) in Computer Science, Computer Engineering, or related technical field with 15+ years of performance engineering experience.
Experience optimizing ML inference and/or training workloads (LLMs, Transformers, CNNs) across diverse hardware: GPUs, AWS Neuron/Inferentia, and other accelerators.
Deep expertise across multiple hardware architectures and platforms (x86, ARM, multiple GPU generations, SoCs, custom accelerators) with ability to quickly master new hardware platforms.
Track record of developing portable, high-performance libraries, tools, or frameworks used across engineering organizations or open-source projects with significant adoption.
Experience leading large-scale optimization initiatives or coordinating performance engineering efforts across multiple teams and organizations.
Proven ability to establish deep understanding of complex systems and create performance measurement/analysis tools that provide critical insights for organization-wide use.
Entrepreneurial experience including startup founding, CTO role, or driving technical vision in product development environments.
What You'll Be Doing
Define and drive the technical strategy and architectural roadmap for ML inference and training optimization across multiple teams.
Lead the design and architecture of kernel-level optimizations spanning NVIDIA GPUs, AWS Inferentia/Trainium, and emerging AI accelerators.
Tackle the most difficult performance challenges and drive systems-level innovation.
Establish deep understanding of new hardware platforms and influence hardware selection decisions.
Guide the career growth of senior engineers and develop the next generation of performance engineering leaders.
Perks and Benefits
Work/Life Balance: We value work-life harmony and strive for flexibility in our working culture.
Inclusive Team Culture: Embrace diversity and foster a culture of inclusion through employee-led affinity groups and learning experiences.
Mentorship and Career Growth: Access endless knowledge-sharing, mentorship, and resources to help you develop professionally.