Bachelor’s degree in Computer Science, a related field, or equivalent practical experience.
5 years of experience with software development in one or more programming languages.
3 years of experience in designing, analyzing, and troubleshooting large-scale distributed systems.
2 years of experience leading projects and providing technical leadership.
Nice to haves:
Experience in designing, analyzing, and troubleshooting large-scale distributed systems.
Excellent organizational, production, and project management skills with the ability to execute on multiple projects in an organized fashion.
Systematic problem-solving approach, with effective verbal and written communication skills.
Excellent leadership, communication and collaboration skills.
What you'll be doing:
Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
Support services before launch through system design consulting, software platform and framework development, capacity planning, and launch reviews.
Maintain services post-launch by measuring and monitoring availability, latency and overall system health.
Scale systems sustainably through automation, and drive changes that improve reliability and velocity.
Practice sustainable incident response and conduct blameless postmortems.
Perks and benefits:
Work in Site Reliability Engineering, a discipline that combines software and systems engineering.
A culture of intellectual curiosity, problem-solving, and openness.
Opportunity to manage complex challenges unique to Google Cloud.
A team that promotes self-direction and offers support and mentorship for learning and growth.