Senior Site Reliability Engineer - Production Engineering (Remote - Ireland)
Requirements:
Mastery of Linux (we use Ubuntu, but any distro is fine), with a view to debugging ambiguous OS behaviours.
Command of your favorite modern programming language (Python, TypeScript, Ruby, Go, Rust, Java, C++, etc.), with an appreciation for delivering safe and secure services.
A solid understanding of the fundamental technologies for delivering services on the Internet (TCP/IP, HTTP, DNS, etc.).
Experience with public cloud platforms (we use AWS and GCP, but others are also fine) and related tooling (Terraform, Puppet, Chef, Ansible, etc.).
Experience with Linux containerisation and orchestration (e.g., Docker, Podman and Kubernetes).
Self-motivated to investigate, fix and improve Yelp in an ever changing environment.
Leading, collaborating on, and sharing technical activities with teams.
Own the total lifecycle of a system.
What you'll be doing:
Work with engineers across Yelp to support new features and services.
Integrate tools to monitor platform stability and performance.
Help scale our Kubernetes clusters and AWS-based infrastructure while maintaining our platform's SLOs.
Ensure the reliability of Yelp’s primary datastores (MySQL and Cassandra).
Troubleshoot site issues using industry-leading tools like Splunk, Grafana, and Prometheus.
Automate everything with Python, Puppet, Git, Jenkins, Terraform and more!
Develop custom tools when off-the-shelf solutions don’t work at our scale, and contribute upstream to open source projects.
Design and implement new systems, tests, and procedures.
Foster and build a fun, diverse, and inclusive culture that reflects Yelp’s values.
Bring your curiosity, tenacity and experience.
Participate in light on-call rotations - we have geographically distributed SRE teams for follow-the-sun support, which reduces the need to be on-call 24h a day!
Perks and benefits:
Full responsibility for projects from day one, a collaborative team, and a dynamic work environment.
Competitive salary, a pension scheme, and an optional employee stock purchase plan.
25 days paid holiday (rising to 29 with service), plus one floating holiday.
€150 monthly reimbursement to help cover remote working expenses.
€95 caregiver reimbursement to support dependent care for families.
Private health insurance, including dental and vision.
Flexible working hours and meeting-free Wednesdays.
Regular 3-day Hackathons, bi-weekly learning groups, and productivity spending to support and encourage your career growth.
Opportunities to participate in digital events and conferences.
€95 per month to use toward qualifying wellness expenses.