Software Engineer II, Reliability

Requirements:

  • Have experience operating cloud-native production systems and services
  • Write production-quality code (e.g. Python, Go, or similar) to automate operations and improve reliability
  • Understand common failure modes in distributed systems, such as dependency failures, resource exhaustion, and partial outages
  • Have experience working with containerized workloads and platforms (e.g. Kubernetes) in production environments
  • Are comfortable participating in on-call rotations and diagnosing straightforward production issues
  • Have experience using observability tools and responding to alerts
  • Are familiar with SRE concepts such as SLIs, SLOs, and error budgets, and are learning how to apply them in practice
  • Have hands-on experience with infrastructure as code or declarative configuration (e.g. Terraform, Kubernetes manifests)
  • Can follow incident response processes and contribute meaningfully during outages
  • Are comfortable receiving feedback, learning from incidents, and improving your systems over time

Nice to Have:

  • Experience supporting security-sensitive systems or internal platforms
  • Familiarity with AWS or other cloud providers
  • Exposure to messaging or asynchronous systems (e.g. Kafka, RabbitMQ, Celery)
  • Interest in performance testing, capacity planning, or resilience work
  • Practical experience with algorithms and data structures

What You'll Be Doing:

  • Build, operate, and improve production systems with a focus on reliability, scalability, and performance
  • Apply software engineering principles to automate operational tasks and reduce manual toil
  • Contribute to the design and implementation of systems using established SRE best practices
  • Help define and measure SLIs and SLOs for services you support
  • Improve observability through metrics, dashboards, logging, and tracing
  • Participate in on-call rotations and respond to production incidents with guidance and support
  • Assist with incident investigation and contribute to post-incident reviews and follow-up actions
  • Perform basic analysis around system behavior, capacity usage, and scaling characteristics
  • Identify reliability issues or operational pain points and work with teammates to address them
  • Collaborate with product, platform, and security engineers to ship reliable systems
  • Write and maintain clear operational runbooks and system documentation

Perks and Benefits:

  • This role is based in Dublin, Ireland and follows a hybrid working model
  • Klaviyo supports work authorization and relocation for this position
  • The company’s total compensation package may include participation in its annual cash bonus plan, equity, and a comprehensive range of health, welfare, and wellbeing benefits, based on eligibility

Klaviyo

Dublin, Ireland

Experience: Mid-level
Posted: March 31, 2026
Last seen: an hour ago
AWS
Django
FastAPI
Golang
Kubernetes
MySQL
Python
React
Redis
Terraform
backend

Why we track Klaviyo

Klaviyo is a marketing automation platform focused on e-commerce. It has engineering teams in London and Dublin and offers strong compensation. Its engineering challenges include large-scale data processing, ML-powered personalization, and real-time messaging infrastructure.
