Site Reliability Engineer, ML Infrastructure, Large Models SRE

This job is offline

Requirements:

Bachelor's degree in Computer Science or a related technical field or equivalent practical experience.
5 years of experience with software development in one or more programming languages.
3 years of experience in designing, analyzing, and troubleshooting distributed systems.
2 years of experience leading projects and providing technical leadership.

Nice to haves:

Experience in Large Language Models/Machine Learning tooling and infrastructure.
Experience in automation, monitoring, and incident response.
Experience in C++, Java, Python, or Go.
Understanding of Site Reliability Engineering (SRE) principles and best practices.
Excellent communication, project management, and stakeholder management skills.

What you'll be doing:

Design, build, and maintain scalable and reliable Large Model infrastructure.
Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
Join the on-call rotation, participate in sustainable incident response, and practice blameless postmortems.
Implement best practices in SRE, including automation, monitoring, and incident response.

Perks and benefits:

Google is proud to be an equal opportunity and affirmative action employer.
Opportunity to work on challenging projects with unique scale at Google.
Collaborative and intellectually stimulating work environment.

Google

London, UK

Experience: Mid-level

Posted: August 1, 2025

Golang

Java

Nodejs

Python

sitereliability
