In-depth experience in a Site Reliability Engineering role
In-depth experience running distributed services in a large-scale Linux/Unix environment
Understanding of SRE principles and goals, along with prior on-call experience
Strong programming skills in Python and extensive experience supporting and debugging Java-based applications in cloud and Kubernetes environments
Nice to Haves
Deep understanding of, and experience with, one or more Big Data technologies such as Hadoop, Spark, or Flink
Fast learner with excellent analytical, problem-solving, and interpersonal skills
Experience working with geographically distributed teams and implementing high-level projects and migrations
Strong communication skills and the ability to deliver high-quality results on time
What You'll Be Doing
Configure, tune, and fix multi-tiered systems for optimal application performance, stability, and availability
Manage jobs and applications on bare-metal and cloud computing platforms for data processing
Work with exabytes of data, petabytes of memory, and tens of thousands of jobs to enable data analytics for Apple's global products