Lead Software Engineer
Requisition #: CREQ228587
We are looking for an AI/ML-focused Data Engineer who brings deep expertise in building intelligent data pipelines for unstructured content and is experienced in integrating with modern machine learning ecosystems. The ideal candidate will have hands-on experience in PySpark and Python, with a strong focus on document classification, cleansing, quality metrics, and the ability to work with LLMs, vector databases, and Retrieval-Augmented Generation (RAG) frameworks. Candidates will play a critical role in bridging data engineering and machine learning, enabling the development of AI-first applications across the enterprise.
Key Responsibilities:
Build robust, scalable data processing pipelines for unstructured documents (PDFs, emails, forms, etc.) using PySpark and Python.
Implement document cleansing, classification, and enrichment techniques to prepare high-quality data for AI/ML applications.
Develop and integrate data workflows that feed into LLM-based pipelines and support vector-based retrieval using RAG architectures.
Engineer vector embeddings, document chunking, and metadata tagging for semantic search and question-answering systems.
Collaborate closely with AI architects, AI/Data engineers, and platform teams to design end-to-end AI solutions.
Communicate data readiness, pipeline quality, and model integration strategies clearly to both technical and non-technical stakeholders.
Apply Agile methodologies and CI/CD best practices to deliver continuously evolving AI capabilities.
Required Skills:
5+ years of overall commercial experience, including 2+ years in a relevant role.
Strong proficiency in PySpark and distributed data frameworks.
Solid experience in core Python, including ML/AI libraries (e.g., Transformers, LangChain, Hugging Face, FAISS).
Proven expertise in processing unstructured data and document intelligence (OCR, NLP, classification, tagging).
Familiarity with vector databases (e.g., Redis) and embedding models for RAG pipelines.
Understanding of the LLM lifecycle, including fine-tuning, inference, and prompt engineering.
Experience working in agile environments, collaborating with cross-functional teams.
Excellent communication skills with the ability to interface with both technical and business stakeholders.
