Distributed Systems Engineer - Data Platform - Logs and Audit Logs
Requirements
3+ years of experience working in software development covering distributed systems and data pipelines
Strong programming skills (Go is preferable), with a deep understanding of software development best practices for building resilient, high-throughput systems
Hands-on experience with modern observability stacks, including Prometheus and Grafana, and a strong understanding of handling high-cardinality metrics at scale
Strong knowledge of SQL, including experience with query optimization
A solid foundation in computer science, including algorithms, data structures, distributed systems, and concurrency
Strong analytical and problem-solving skills, with a willingness to debug, troubleshoot, and learn about complex problems at high scale
Ability to work collaboratively in a team environment and communicate effectively with other teams across Cloudflare
What You'll Be Doing
Design, build, and operate a robust logging platform, ensuring reliable logging and secure data transfer to a wide array of customer destinations and third-party integrations
Develop and maintain high-performance data connectors and integrations for our log-shipping products, focusing on usability, scalability, and data integrity
Create and manage systems for handling comprehensive audit logs, ensuring they are delivered securely and adhere to strict compliance and performance standards
Scale and optimize the data delivery pipeline to handle massive data volumes with low latency, identifying and removing bottlenecks in data processing and routing
Work closely with Product and other engineering teams to define requirements for a new logging platform and integrations
Maintain the operational health of our log delivery platform through comprehensive monitoring and participation in an on-call rotation
Collaborate on the architectural evolution of our data egress platform, researching and implementing new technologies to improve efficiency and reliability
Nice to Haves
Experience with data streaming technologies (e.g., Kafka, Flink) is a strong plus
Experience with various logging platforms or SIEMs (e.g., Splunk, Datadog, Sumo Logic) and storage destinations (e.g., S3, R2, GCS) is a plus
Experience with Infrastructure as Code tools like Salt or Terraform is a plus
Experience with Linux container technologies, such as Docker and Kubernetes, is a plus
Perks and Benefits
If you're passionate about building scalable and performant data platforms using cutting-edge technologies and want to work with a world-class team of engineers, then we want to hear from you!