Requirements:

Bachelor’s degree or equivalent practical experience.
2 years of experience with software development in one or more programming languages, or 1 year of experience with an advanced degree.
1 year of experience with ML infrastructure (e.g., model deployment, model evaluation, data processing, debugging).
Experience in C++.
Experience with performance, systems data analysis, visualization tools, or debugging.

Nice to have:

Master's degree or PhD in Computer Science or related technical fields.
5 years of experience with data structures and algorithms.
Experience in Machine Learning and High Performance Computing (HPC).
Experience optimizing distributed programs at large scale, and experience with compilers and compiler construction.
Excellent skills in debugging and programming concurrent/parallel computations on accelerators, including but not limited to VLIW and vector machines, GPUs, or DSPs.

Responsibilities:

Write product or system development code for the TPU compiler (in C++).
Participate in, or lead, design reviews with peers and stakeholders to decide among available technologies.
Contribute to a compiler that scales out machine learning models across accelerators such as TPUs and Graphics Processing Units (GPUs) at Google and Cloud.
Conduct static and runtime performance analysis of important large-scale production models.
Design and implement performance optimizations and critical features that increase the velocity of important production teams.

Benefits:

Opportunities to switch teams and projects as you and the fast-paced business grow and evolve.
Versatile work environment.
Development of leadership qualities.
Equal employment opportunity with a culture of belonging.
Global collaboration and communication.