Lead Software Engineer

Requisition #: CREQ213662
Thanks for your interest in the Databricks + PySpark position. Unfortunately, this position has been closed.

Detailed Job Description for Databricks + PySpark Developer:

·       Data Pipeline Development: Design, implement, and maintain scalable and efficient data pipelines using PySpark and Databricks for ETL processing of large volumes of data.

·       Cloud Integration: Develop solutions leveraging Databricks on cloud platforms (AWS/Azure/GCP) to process and analyze data in a distributed computing environment.

·       Data Modeling: Build robust data models, ensuring high-quality data integration and consistency across multiple data sources.

·       Optimization: Optimize PySpark jobs for performance, ensuring the efficient use of resources and cost-effective execution.

·       Collaborative Development: Work closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver actionable insights.

·       Automation & Monitoring: Implement monitoring solutions for data pipeline health, performance, and failure detection.

·       Documentation & Best Practices: Maintain comprehensive documentation of architecture, design, and code. Ensure adherence to best practices for data engineering, version control, and CI/CD processes.

·       Mentorship: Provide guidance to junior data engineers and help with the design and implementation of new features and components.

 

Required Skills & Qualifications:

·       Experience: 6+ years of experience in data engineering or software engineering roles, with a strong focus on PySpark and Databricks.

Technical Skills:

·       Proficient in PySpark for distributed data processing and ETL pipelines.

·       Experience working with Databricks for running Apache Spark workloads in a cloud environment.

·       Solid knowledge of SQL, data wrangling, and data manipulation.

·       Experience with cloud platforms (AWS, Azure, or GCP) and their respective data storage and warehouse services (S3, ADLS, BigQuery, etc.).

·       Familiarity with data lakes, data warehouses, and NoSQL databases (e.g., MongoDB, Cassandra, HBase).

·       Experience with orchestration and transformation tools such as Apache Airflow, Azure Data Factory, or dbt.

·       Familiarity with containerization (Docker, Kubernetes) and DevOps practices.

·       Problem Solving: Strong ability to troubleshoot and debug issues related to distributed computing, performance bottlenecks, and data quality.

·       Version Control: Proficient in Git-based workflows and version control.

·       Communication Skills: Excellent written and verbal communication skills, with the ability to explain complex technical concepts to both technical and non-technical stakeholders.

·       Education: Bachelor's or Master's degree in Computer Science, Engineering, or a related field (or equivalent practical experience).

Similar Listings

Hyderabad, Andhra Pradesh, India

📁 Lead Software Engineer

Requisition #: CREQ251148

Hyderabad, Andhra Pradesh, India

📁 Lead Software Engineer

Requisition #: CREQ248663

Hyderabad, Andhra Pradesh, India

📁 Lead Software Engineer

Requisition #: CREQ251406