Experience as a Site Reliability Engineer, DevOps Engineer, or Software Engineer focused on infrastructure in a large-scale distributed environment.
Strong software development skills in a language such as Swift, Go, or Python, and a high degree of comfort with shell scripting (Bash).
Hands-on experience building and managing systems with containers and container orchestration tools (Docker, Kubernetes).
Deep understanding of networking (TCP/IP, DNS, HTTP) and experience using observability tools (monitoring, logging, tracing) to diagnose complex issues.
Excellent problem-solving and communication skills, with a strong sense of ownership and drive.
Preferred Qualifications
Proven experience leading initiatives to reduce technical debt, refactor systems, or improve performance and latency.
Expertise in performance analysis and capacity planning for global, distributed systems.
Experience with large-scale distributed databases (e.g., Cassandra, FoundationDB) or messaging systems (e.g., Kafka).
Demonstrated ability to lead incident response for high-impact outages.
Familiarity with Generative AI (GenAI) or Large Language Models (LLMs) as a way to accelerate operational tasks, such as automating runbooks, generating scripts, or analyzing incident data.
What You'll Be Doing
Shape the future of how Apple delivers software to millions of customers.
Work with a team dedicated to engineering excellence, reusable design, and simplicity.
Build the next generation of release technologies that power Apple's development lifecycle.
Mentor team members and collaborate to build resilient, high-quality systems.
Perks and Benefits
Passionate about solving complex problems at scale.