Senior Data Engineer

Role Overview: We are seeking a Senior Data Engineer to join our team and play a central role in developing and maintaining our recommendation system. In this role, you will design and implement the data infrastructure that powers the system, ensuring that data flows reliably and that high-quality information is available to drive personalized content recommendations.

Key Responsibilities:

Design and implement a scalable, fault-tolerant, and highly available data pipeline to capture, process, and store user engagement data in real time.
Develop efficient data storage solutions, including the collisionless embedding table, to effectively represent and retrieve user data and content metadata.
Optimize data processing and transformation workflows to enable the continuous training and adaptation of the recommendation model.
Ensure the reliability, performance, and scalability of the data infrastructure to handle the growing volume and velocity of user interactions.
Collaborate with the machine learning engineering team to understand their data requirements and provide the necessary data products to support the development and deployment of the recommendation system.
Implement robust data monitoring, alerting, and troubleshooting mechanisms to maintain the overall health and reliability of the data ecosystem.
Continuously explore and evaluate new data technologies, tools, and techniques to enhance the efficiency and capabilities of the data infrastructure.
Document data pipelines, processes, and best practices to maintain system transparency and enable cross-team knowledge sharing.

Required Qualifications:

Bachelor's or Master's degree in Computer Science, Data Engineering, or a related technical field
3+ years of experience in designing and implementing large-scale data pipelines and data infrastructure
Proficiency in Python and familiarity with data processing and streaming technologies such as Apache Spark, Apache Flink, or Apache Kafka
Strong understanding of data modeling, data warehousing, and distributed data storage solutions (e.g., Hadoop, Hive, Cassandra)
Experience with real-time data processing and stream processing architectures
Solid understanding of data engineering best practices, including data quality, data security, and data governance
Ability to work collaboratively in a cross-functional team environment and communicate technical concepts to stakeholders

Desired Skills: