Airflow (MWAA) Orchestration Engineer
Job Description for Airflow MWAA Orchestration Engineer in Chennai, TN:
Role Summary
We are seeking an Airflow MWAA Orchestration Engineer to build and support orchestration for an Unstructured Data Cache platform. The pipeline includes BA Insight for content ingestion, PostgreSQL as the raw cache, Fivetran for data movement, BIGID for NER and classification enrichment, and Snowflake as the enriched cache. Observability spans Splunk and CloudWatch for platform metrics and ACCELDATA for data quality and data observability.
Key Responsibilities:
Develop and maintain Airflow MWAA DAGs orchestrating BA Insight to PostgreSQL to Fivetran to BIGID to Snowflake workflows
Trigger, monitor, and automate BA Insight ingestion cycles and extraction health checks
Manage raw cache pipelines into PostgreSQL, including metadata normalization and document lifecycle handling
Orchestrate and monitor Fivetran syncs; implement retry and failure handling and connector-level alerting
Integrate Airflow with BIGID APIs to run NER and classification jobs and harvest enrichment outputs
Build Snowflake processing steps for the staging, enriched, and consumption zones using COPY and MERGE patterns
Implement data reconciliation (counts, checksums), schema drift detection, and idempotent loads
Use Splunk and CloudWatch for infrastructure and server-level observability and MWAA health
Integrate ACCELDATA for freshness, volume, data quality, and anomaly alerts; automate remediation workflows
Apply IAM least privilege, secrets management, and secure data handling practices for PII/PHI enrichment data
Produce runbooks and documentation, and support handover to downstream consumers
Required Skills:
2 to 5+ years of hands-on Airflow 2.x experience (MWAA preferred)
Strong Python for building DAGs, operators, and API integrations
Strong AWS basics: IAM, S3, KMS, Secrets Manager, VPC networking
Good to have:
Experience with PostgreSQL: queries, performance tuning, metadata handling
Experience with Snowflake: COPY INTO, MERGE, stages, warehouse tuning
Practical experience with Fivetran connectors and APIs for monitoring and orchestration
Experience integrating with BIGID for NER and classification workflows
Solid understanding of unstructured content workflows: PDF, DOCX, emails, HTML, OCR
Working knowledge of Splunk, CloudWatch, and ACCELDATA for observability