Experience with Apache Kafka, Apache Flink, or similar streaming technologies
Experience designing and developing large-scale software applications using Object-Oriented Design and JVM languages (Java, Scala)
Basic knowledge of data semantics, discovery processes, and data governance
Proficiency in designing and implementing automated data pipelines
Proficiency in programming languages such as Java, Scala, or Python
Bachelor's or Master's degree in Computer Science or a related field
What you'll be doing:
Data Architecture Support: Assist in designing and enhancing data architecture to support real-time and batch processing, focusing on scalability and fault tolerance using technologies like Apache Kafka and Apache Flink
Data Semantics and Discovery: Contribute to implementing systems for effective data semantics management, ensuring data is accurately categorized and easily discoverable
Pipeline Automation: Support the development and maintenance of automated data pipelines to ensure efficient data flow and processing
Data Consumption Optimization: Help create strategies for materialized view generation and data subsumption to optimize data architecture performance
Cross-functional Collaboration: Collaborate with data scientists, business analysts, and other engineering teams to define and refine data solution requirements
Innovation and Research: Stay current on industry developments in data engineering and propose new technologies or methodologies to enhance data infrastructure