Site Reliability Engineer, Platforms Infrastructure Engineering
Requirements:
Bachelor’s degree in Computer Science, a related field, or equivalent practical experience.
2 years of experience with data structures/algorithms and software development in one or more programming languages.
Nice-to-haves:
Experience in one or more of the following: C, C++, Java, Python, Go, Perl or Ruby.
Experience with Unix-based operating systems.
Experience analyzing and troubleshooting systems.
Experience with on-call production incident management.
What you'll be doing:
Ensure the safety, reliability, availability, and performance of new hardware platforms throughout their lifecycle, from concept to end of life (EOL).
Drive an understanding of production reliability into platform design and development, through consulting, model development, and automation.
Own the characterization and qualification of new platforms. Improve reliability over time through a closer understanding of each platform's performance and capabilities.
Develop per-platform capability-focused SLOs, monitoring, and alerts to create coherency and consistency in spite of significantly increased platform heterogeneity.
Learn about the software and hardware that underpin Google’s production systems and interact with the development and SRE teams that make the magic happen.
Perks and benefits:
Opportunity to work on large-scale, fault-tolerant systems.
Chance to manage project priorities, deadlines, and deliverables.
Design, develop, test, deploy, maintain, and enhance software solutions.