Distributed Systems Engineer - Data Platform - Analytics and Alerts
Requirements
3+ years of software development experience with distributed systems and scalable APIs
Strong programming skills (Go preferred), with a deep understanding of software development best practices for building performant, customer-facing services
Hands-on experience with modern observability stacks, including Prometheus and Grafana, and a strong understanding of handling high-cardinality metrics at scale
Strong knowledge of SQL, including extensive experience with complex query optimization
A solid foundation in computer science, including algorithms, data structures, distributed systems, and concurrency
Strong analytical and problem-solving skills, with a willingness to debug, troubleshoot, and learn about complex problems at high scale
Ability to work collaboratively in a team environment and communicate effectively with other teams across Cloudflare
Nice to Haves
Experience developing and scaling APIs, particularly GraphQL, is a strong plus
Experience with data streaming technologies (e.g., Kafka, Flink) for real-time processing is a plus
Experience with Infrastructure as Code tools like Salt or Terraform is a plus
Experience with Linux container technologies, such as Docker and Kubernetes, is a plus
What You'll Be Doing
Develop and enhance our customer-facing APIs, focusing on performance, reliability, and an intuitive user experience
Design, build, and maintain our near real-time alerting platform, from data processing and anomaly detection to reliable notification delivery
Optimize the performance of complex analytical queries that power our APIs and dashboards, working closely with the database platform team
Create intuitive and powerful tools that allow customers to explore their data and configure meaningful alerts based on logs and metrics
Scale our API and alerting infrastructure to support a growing number of internal and external use cases
Collaborate with front-end engineers and product managers to define API contracts and deliver a seamless data experience for our users
Ensure the operational health of our APIs and alerting systems by developing comprehensive monitoring and participating in an on-call rotation (with the flexibility to be on-call outside of standard working hours as needed)
Perks and Benefits
Build scalable and performant data platforms using cutting-edge technologies
Work with a world-class team of engineers
Help build a better Internet for everyone