Location: Chennai (Work from Office)
Experience Level: 8-10 years
Tier: T2

We are seeking a highly skilled and experienced Senior Data Engineer to lead the design and development of scalable, secure, and high-performance data pipelines hosted on a cloud platform. The ideal candidate will have deep expertise in Databricks, Data Fabric, MDM, Informatica, and Unity Catalog, and a strong foundation in data modeling, software engineering, and DevOps practices. This role is critical to building a next-generation healthcare data platform that will power advanced analytics, operational efficiency, and business innovation.

Key Responsibilities

Data Pipeline Design & Development
- Translate business requirements into actionable technical specifications, defining application components, enhancement needs, data models, and integration workflows.
- Design, develop, and optimize end-to-end data pipelines using Databricks and related cloud-native tools.
- Create and maintain detailed technical design documentation and provide accurate estimations for storage, compute resources, cost efficiency, and operational readiness.
- Implement reusable and scalable ingestion, transformation, and orchestration patterns for structured and unstructured data sources.
- Ensure pipelines meet functional and non-functional requirements such as latency, throughput, fault tolerance, and scalability.

Cloud Platform & Architecture
- Build and deploy data solutions on Microsoft Azure and Azure Fabric, leveraging Data Lake and Unity Catalog.
- Integrate pipelines with Data Fabric and Master Data Management (MDM) platforms for consistent and governed data delivery.
- Follow best practices in cloud security, encryption, access controls, and identity management.

Data Modeling & Metadata Management
- Design robust and extensible data models supporting analytics, AI/ML, and operational reporting.
- Ensure metadata is cataloged, documented, and accessible through Unity Catalog and MDM frameworks.
- Collaborate with data architects and analysts to ensure alignment with business requirements.

DevOps & CI/CD Automation
- Adopt DevOps best practices for data pipelines, including automated testing, deployment, monitoring, and rollback strategies.
- Work closely with platform engineers to manage infrastructure-as-code, containerization, and CI/CD pipelines.
- Ensure compliance with enterprise SDLC, security, and data governance policies.

Collaboration & Continuous Improvement
- Partner with data analysts and product teams to understand data needs and translate them into technical solutions.
- Continuously evaluate and integrate new tools, frameworks, and patterns to improve pipeline performance and maintainability.

Key Skills & Technologies

Required:
- Databricks (Delta Lake, Spark, Unity Catalog)
- Azure Data Platform (Data Factory, Data Lake, Azure Functions, Azure Fabric)
- Unity Catalog for metadata and data governance
- Strong programming skills in Python and SQL
- Experience with data modeling, data warehousing, and star/snowflake schema design
- Proficiency in DevOps tools (Git, Azure DevOps, Jenkins, Terraform, Docker)

Preferred:
- Experience with healthcare or regulated-industry data environments
- Familiarity with data security standards (e.g., HIPAA, GDPR)
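For context on the pipeline work described above, the following is a minimal, illustrative PySpark sketch of the kind of ingestion-and-transformation step this role would own on Databricks with Delta Lake and Unity Catalog. The storage path, table name, and columns are hypothetical placeholders, not details of the actual platform.

```python
# Illustrative sketch only: a minimal raw -> cleansed ingestion pattern.
# Paths, table names, and columns are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("claims_ingest").getOrCreate()

# Ingest raw data from the lake (location and format are assumptions).
raw = (
    spark.read.format("json")
    .load("abfss://datalake@account.dfs.core.windows.net/raw/claims/")
)

# Basic cleansing/conforming step.
cleansed = (
    raw.dropDuplicates(["claim_id"])
    .filter(F.col("claim_id").isNotNull())
    .withColumn("ingested_at", F.current_timestamp())
)

# Persist as a governed Delta table (Unity Catalog three-part name assumed).
(
    cleansed.write.format("delta")
    .mode("overwrite")
    .saveAsTable("main.healthcare.silver_claims")
)
```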
Additional Requirements
- 10-15 years of experience in Databricks and exposure to Data/AI platforms; expert in PySpark and Data Factory.
- Develop efficient Extract, Load, and Transform (ELT/ETL) processes to facilitate seamless data integration, transformation, and loading from various sources into the data platform using Azure and Databricks, covering both inbound and outbound data processes.
- Conduct and support unit and system testing, SIT, and UAT.
- Support platform deployment and post-go-live support.
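As a rough illustration of the unit-testing responsibility above, here is a hedged sketch of how a single PySpark transformation might be unit tested locally. The conform_claims function, its column names, and the expected values are hypothetical examples, not part of the role's actual codebase.

```python
# Illustrative sketch only: unit testing a hypothetical PySpark transform.
from pyspark.sql import SparkSession, functions as F

def conform_claims(df):
    """Hypothetical transform: drop null keys and normalize status casing."""
    return (
        df.filter(F.col("claim_id").isNotNull())
        .withColumn("status", F.upper(F.col("status")))
    )

def test_conform_claims():
    # Local Spark session is sufficient for a unit-level test.
    spark = SparkSession.builder.master("local[1]").appName("test").getOrCreate()
    df = spark.createDataFrame(
        [("c1", "open"), (None, "closed")], ["claim_id", "status"]
    )
    result = conform_claims(df).collect()
    # The null-keyed row is dropped and the remaining status is upper-cased.
    assert len(result) == 1 and result[0]["status"] == "OPEN"

if __name__ == "__main__":
    test_conform_claims()
    print("conform_claims test passed")
```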