Senior Software Engineer, Site Reliability Engineering, Colossus SRE
Requirements
Bachelor’s degree in Computer Science, a related field, or equivalent practical experience.
5 years of experience with software development in one or more programming languages.
3 years of experience in designing, analyzing, and troubleshooting large-scale distributed systems.
2 years of experience leading projects and providing technical leadership.
Nice to haves
Experience working with large-scale storage systems, file systems, or critical infrastructure.
Experience in C++.
Experience in system architecture, performance analysis, capacity planning, and debugging production issues.
Understanding of system and service observability, monitoring, and data analysis.
Ability to lead technical projects, set technical direction, and drive a culture of technical excellence.
What you'll be doing
Carry out Site Reliability Engineering work and projects across the service to improve its performance and reliability.
Work with partner development teams, Site Reliability Engineering (SRE) teams, and SRE leadership to design and deliver programs and projects in a scalable, reliable, and secure manner.
Be a full member of the Colossus SRE on-call rotation. Support Colossus at global scale and ensure that its many dependent systems can always deliver.
Drive the technical direction of the Colossus SRE team.
Help shape the team's culture and goals by bringing your own knowledge and perspective and by advocating for SRE best practices.
Perks and Benefits
SRE culture of intellectual curiosity, problem solving, and openness.
An environment that encourages collaboration, thinking big, and taking risks in a blame-free setting.
Support and mentorship for personal growth and learning.