We are seeking a highly skilled Real-Time Data Streaming Engineer with strong expertise in designing and building low-latency, event-driven data pipelines. You will be responsible for architecting, implementing, and optimizing streaming platforms using Apache Kafka, Apache Flink, and Spark Streaming. The ideal candidate has hands-on experience with large-scale, high-throughput systems and a track record of ensuring data reliability, scalability, and performance for enterprise-grade solutions.
Details:
Location: Remote in EU
Employment Type: Full-Time, B2B Contract
Start Date: ASAP
Language Requirements: Fluent English
Key Responsibilities
- Design, develop, and maintain real-time data streaming pipelines using Kafka, Flink, and Spark Streaming.
- Architect event-driven and microservices-based solutions for real-time analytics and processing.
- Implement data ingestion, transformation, and enrichment workflows across distributed systems.
- Optimize performance and scalability of streaming jobs for high-throughput, low-latency environments.
- Ensure data quality, governance, and fault tolerance within the streaming infrastructure.
- Integrate streaming solutions with data warehouses, data lakes, and cloud platforms (AWS, Azure, GCP).
- Collaborate with data engineers, data scientists, and application teams to deliver business-critical real-time insights.
- Monitor, troubleshoot, and improve the reliability and resilience of streaming systems.
- Participate in system design, code reviews, and best practices development.
Requirements
- 5+ years of experience in data engineering, including 3+ years focused on real-time streaming.
- Strong expertise in Apache Kafka (producers, consumers, Kafka Connect, Kafka Streams, Schema Registry).
- Hands-on experience with Apache Flink or Spark Streaming for real-time data processing.
- Solid understanding of event-driven architectures, pub/sub systems, and distributed computing.
- Strong programming skills in Java, Scala, or Python.
- Proficiency in SQL and experience with relational and NoSQL databases (PostgreSQL, Cassandra, MongoDB, etc.).
- Familiarity with cloud-native streaming services (Amazon Kinesis, Azure Event Hubs, Google Cloud Pub/Sub).
- Knowledge of CI/CD, containerization and orchestration (Docker, Kubernetes), and monitoring tools (Prometheus, Grafana, ELK stack).
- Strong problem-solving and debugging skills, with experience in large-scale production environments.
Nice to Have
- Knowledge of data lakehouse architectures (Delta Lake, Iceberg, Hudi).
- Experience with machine learning pipelines on streaming data.
- Familiarity with message brokers (RabbitMQ, Pulsar, ActiveMQ).
- Background in industries like fintech, telecom, IoT, or e-commerce where real-time data is critical.
- Contributions to open-source streaming projects.