Senior Systems Engineer, Site Reliability Engineering
This job is offline
Requirements:
Bachelor’s degree in Computer Science, a related field, or equivalent practical experience
5 years of programming experience in one or more languages
3 years of experience designing, analyzing, and troubleshooting distributed systems and working with Unix/Linux systems internals and administration or networking
2 years of experience leading projects
Nice to haves:
Experience working in computing, distributed systems, storage, or networking
Expertise in designing, analyzing, and troubleshooting large-scale distributed systems
Ability to debug and optimize code and to automate routine tasks
Systematic problem-solving approach and effective verbal and written communication skills
What you'll be doing:
Improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement
Provide guidance to other team members on managing the availability and performance of mission-critical services
Maintain services once they are live by measuring and monitoring availability, latency, and overall system health
Scale systems sustainably through mechanisms like automation, and evolve systems by driving changes that improve reliability and velocity
Manage support services before they go live through activities such as system design consulting and developing software platforms and frameworks