Responsibilities

- Technical Leadership: Provide technical direction and mentorship to a team of data engineers, ensuring best practices in coding, architecture, and data operations.
- End-to-End Ownership: Architect, implement, and optimize end-to-end data pipelines that process and transform large-scale datasets efficiently and reliably.
- Orchestration and Automation: Design scalable workflows using orchestration tools such as Apache Airflow, ensuring high availability and fault tolerance.
- Data Warehouse and Lake Optimization: Lead the implementation and optimization of Snowflake and data lake technologies like Apache Iceberg for storage, query performance, and scalability.
- Real-Time and Batch Processing: Build robust systems leveraging Kafka, SQS, or similar messaging technologies for real-time and batch data processing.
- Cross-Functional Collaboration: Work closely with Data Science, Product, and Engineering teams to define data requirements and deliver actionable insights.
- Data Governance and Security: Establish and enforce data governance frameworks, ensuring compliance with regulatory standards and maintaining data integrity.
- Scalability and Performance: Develop strategies to optimize performance for systems processing terabytes of data daily while ensuring scalability.
- Team Building: Foster a collaborative team environment, driving skill development, career growth, and continuous learning within the team.
- Innovation and Continuous Improvement: Stay ahead of industry trends to evaluate and incorporate new tools, technologies, and methodologies into the organization.

Qualifications

Required Skills:

- 8+ years of experience in data engineering with a proven track record of leading data projects or teams.
- Strong programming skills in Python, with expertise in building and optimizing ETL pipelines.
- Extensive experience with Snowflake or equivalent data warehouses for designing schemas, optimizing queries, and managing large datasets.
- Expertise in orchestration tools like Apache Airflow, with experience in building and managing complex workflows.
- Deep understanding of message queues such as Kafka, AWS SQS, or similar technologies for real-time data ingestion and processing.
- Demonstrated ability to architect and implement scalable data solutions handling terabytes of data.
- Hands-on experience with Apache Iceberg for managing and optimizing data lakes.
- Proficiency in containerization and orchestration tools like Docker and Kubernetes for deploying and managing distributed systems.
- Strong understanding of CI/CD pipelines, including version control, deployment strategies, and automated testing.
- Proven experience working in an Agile development environment and managing cross-functional team interactions.
- Strong background in data modeling, data governance, and ensuring compliance with data security standards.
- Experience working with cloud platforms like AWS, Azure, or GCP.

Preferred Skills:

- Proficiency in stream processing frameworks such as Apache Flink for real-time analytics.
- Familiarity with programming languages like Scala or Java for additional engineering tasks.
- Exposure to integrating data pipelines with machine learning workflows.
- Strong analytical skills to evaluate new technologies and tools for scalability and performance.