Site Reliability Engineer (L4/L5)

Requirements

3+ years of experience as a Site Reliability Engineer or in a similar role
Strong scripting and programming skills (Python, Go, Java or JavaScript/Node.js)
Experience with complex sociotechnical systems and their successful operations at scale
Experience with incident management and response
Experience with infrastructure-as-code tools like Terraform and with container tooling like Docker and Kubernetes
Experience with cloud platforms like AWS, microservices architecture, and enterprise software solutions like Slack & GSuite
Excellent communication & collaboration skills and a continuous improvement mindset
Proven ability to cultivate relationships through influence
Proven ability to troubleshoot complex issues and implement effective solutions
Familiarity with Human Factors Engineering
Ability to grow expertise, influence & educate others

What You'll Be Doing

Design, implement, and maintain scalable and reliable infrastructure to support our services.
Collaborate with engineering and product teams to integrate observability, reliability, and security considerations into the entire software development lifecycle.
Develop and implement automation tools for monitoring, deployment, and incident response to ensure efficient and reliable operations.
Conduct or participate in capacity planning, performance analysis, and system tuning to optimize system reliability.
Participate in on-call rotations and contribute to incident response, diagnosis, and resolution.
Implement and improve monitoring and alerting systems to proactively identify and address potential issues.
Implement and maintain robust disaster recovery and business continuity plans.
Continuously evaluate and recommend improvements to enhance system observability and reliability.
Proactively identify sources of instability in distributed systems and analyze how complex systems fail from a reliability and resilience perspective.
Engage with product teams to diagnose operational surprises and drive improvements.
Implement and maintain a robust incident response framework, including blame-aware incident reviews to learn from operational surprises.
Champion a growth mindset and continuous learning culture, encouraging proactive innovation and ongoing skill development.

Nice to Haves

N/A

Perks and Benefits

Inclusion is a core value at Netflix
Equal-opportunity employer promoting diversity and inclusion

Netflix

Warsaw, Poland

Experience: Mid-level

Posted: May 7, 2025

Tags: AWS, Docker, Go, Java, JavaScript, Kubernetes, Node.js, Python, Terraform, Site Reliability
