Big Data Lead Engineer
- Position: Big Data Engineer
- Primary skills: Big Data, PySpark, AWS
- Location: HYD
- Create a trigger-based automation framework for data migration
- Identify the roles/access needed for data migration from the federated bucket to the managed bucket, and build APIs for the same
- Integrate the CDMS framework with the Lake and Data Bridge APIs
- Migrate data from S3 Managed to on-prem Hadoop
- Build jobs for daily and bulk loads
- Provide test support for AVRO to validate lake features
- Provide test support for compression types such as LZO and .ENC to validate lake features
- Ab Initio integration: build a feature to create an operation trigger for the ABI pipeline
- Move to the new data center: SQL Server migration
- Carlstadt to Ashburn (DR switchover)
- Develop and maintain data platforms using Python.
- Work with AWS and Big Data, design and implement data pipelines, and ensure data quality and integrity.
- Collaborate with cross-functional teams to understand data requirements and design solutions that meet business needs.
- Implement and manage agents for monitoring, logging, and automation within AWS environments.
- Handle the migration of PySpark workloads to AWS.
- (Secondary) The resource must have hands-on development experience with various Ab Initio components such as Rollup, Scan, Join, Partition by Key, Partition by Round Robin, Gather, Merge, Interleave, Lookup, etc.
- Must have experience with SQL database programming, SQL performance tuning, and relational model analysis.
- Good knowledge of developing UNIX scripts and Oracle SQL/PL-SQL.
- Leverage internal tools and SDKs, utilize AWS services such as S3, Athena, and Glue, and integrate with our internal Archival Service Platform for efficient data purging.
- Lead the integration efforts with the internal Archival Service Platform for seamless data purging and lifecycle management.
- Collaborate with the data engineering team to continuously improve data integration pipelines, ensuring adaptability to evolving business needs.
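One common way to realize the trigger-based migration framework described above is event-driven: an S3 object-created notification on the federated bucket triggers a copy into the managed bucket. A minimal sketch of the planning step (the bucket names, the `migrated/` prefix, and the `plan_copy` helper are illustrative assumptions, not the actual framework):

```python
# Sketch of an event-driven copy planner for a trigger-based migration
# framework. The event shape follows the standard S3 notification format;
# bucket names and key prefixes here are illustrative assumptions.

def plan_copy(event, managed_bucket, prefix="migrated/"):
    """Turn an S3 object-created event into copy instructions
    (source bucket/key -> destination bucket/key)."""
    plans = []
    for record in event.get("Records", []):
        src_bucket = record["s3"]["bucket"]["name"]
        src_key = record["s3"]["object"]["key"]
        plans.append({
            "CopySource": {"Bucket": src_bucket, "Key": src_key},
            "Bucket": managed_bucket,
            "Key": prefix + src_key,
        })
    return plans

# In a real Lambda handler, each plan would feed boto3, e.g.:
# s3.meta.client.copy(plan["CopySource"], plan["Bucket"], plan["Key"])
```

Keeping the planning logic pure (no AWS calls) makes the trigger handler easy to unit-test before wiring in IAM roles and boto3.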
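The daily-versus-bulk distinction in the load jobs above usually comes down to which partitions a run picks up: a daily job loads only the previous day's partition, while a bulk job backfills a date range. A small illustrative helper (the date-string partition naming and the `backfill_days` default are assumptions):

```python
from datetime import date, timedelta

def partitions_to_load(run_date, mode, backfill_days=30):
    """Return ISO date strings for the partitions a run should load.

    mode="daily": only the previous day's partition.
    mode="bulk":  the last `backfill_days` partitions.
    """
    if mode == "daily":
        days = 1
    elif mode == "bulk":
        days = backfill_days
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Walk backwards from the day before run_date.
    return [(run_date - timedelta(days=i)).isoformat()
            for i in range(1, days + 1)]
```

The same job code can then serve both schedules, with only the partition list differing between the daily trigger and a bulk backfill run.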
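For the lake-feature test support around AVRO and compression types (LZO, .ENC), a test harness typically needs to recognize the codec from the object key before choosing a reader. A minimal illustrative mapping (the extension table and codec names are assumptions):

```python
import os

# Illustrative extension-to-codec table for lake-feature test support;
# the set of codecs and their labels are assumptions.
CODEC_BY_EXT = {
    ".lzo": "lzo",
    ".enc": "encrypted",
    ".gz": "gzip",
    ".snappy": "snappy",
}

def detect_codec(key):
    """Return the compression/encoding codec implied by a file key,
    or "none" for plain files (e.g. a bare .avro)."""
    ext = os.path.splitext(key)[1].lower()
    return CODEC_BY_EXT.get(ext, "none")
```

Centralizing this lookup lets the same test fixtures cover plain AVRO, LZO-compressed, and encrypted (.ENC) inputs without per-format branching in each test.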
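The SQL performance-tuning expectation is easiest to demonstrate with a query plan before and after adding an index. A self-contained illustration using SQLite as a stand-in (plan syntax and tuning tools differ on SQL Server or Oracle; the table and index here are invented for the example):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, total REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(i, i % 100, float(i)) for i in range(1000)])

def plan(sql):
    # EXPLAIN QUERY PLAN rows are (id, parent, notused, detail);
    # join the detail strings for easy inspection.
    return " ".join(row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT total FROM orders WHERE customer_id = 42"
before = plan(query)   # full table scan of orders
con.execute("CREATE INDEX idx_orders_customer ON orders(customer_id)")
after = plan(query)    # index search on idx_orders_customer
```

Comparing the plan text before and after the index shows the scan turning into an index search, which is the core loop of most relational tuning work.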
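On AWS, data purging of the kind the Archival Service Platform integration calls for is often expressed as an S3 lifecycle rule that expires objects under a prefix after a retention period. A hedged sketch (the prefix, retention period, and `purge_rule` helper are assumptions, not the platform's actual policy):

```python
# Illustrative builder for an S3 lifecycle rule that expires archived
# objects; the prefix and retention period are assumptions.

def purge_rule(prefix, days):
    """Build one lifecycle rule dict that expires objects under `prefix`
    after `days` days (shape matches put_bucket_lifecycle_configuration)."""
    return {
        "ID": f"purge-{prefix.strip('/').replace('/', '-')}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Expiration": {"Days": days},
    }

lifecycle = {"Rules": [purge_rule("archive/events/", 90)]}
# boto3 usage (not executed here):
# s3.put_bucket_lifecycle_configuration(Bucket="managed-bkt",
#                                       LifecycleConfiguration=lifecycle)
```

Generating the rule dict in code keeps retention policy reviewable and testable before it is applied to a bucket.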
- Education: Any degree or equivalent
- Experience: 8 years