Job Title
AI Data Engineer / Developer
Role Overview
We are looking for an AI Data Engineer / Developer to design, build, and maintain scalable data pipelines and AI-ready data systems that power machine learning and generative AI applications. This role sits at the intersection of data engineering, software development, and AI, and is responsible for keeping data high-quality and reliable as it flows from source to model deployment.
You will work closely with data scientists, ML engineers, and product teams to transform raw data into structured, model-ready datasets and production-grade AI services.
Key Responsibilities
Data Engineering & Pipelines
Design, build, and maintain scalable data pipelines for structured and unstructured data
Ingest data from multiple sources (databases, APIs, streaming platforms, files, sensors)
Ensure data quality, validation, lineage, and versioning for AI/ML workloads
Optimize data storage and retrieval for performance and cost efficiency
AI & Machine Learning Enablement
Prepare, transform, and feature-engineer datasets for ML and AI models
Support training, evaluation, and deployment of ML and LLM-based systems
Build and maintain data pipelines for model retraining and monitoring
Integrate vector databases and embedding pipelines for AI search and RAG systems
Development & Systems
Develop reusable data and AI services using Python and/or other relevant languages
Build APIs and microservices to serve data and AI outputs
Collaborate on CI/CD pipelines for data and ML workflows
Monitor, debug, and improve production data and AI systems
Collaboration & Governance
Work closely with data scientists, ML engineers, and product teams
Implement data governance, security, and compliance best practices
Document architectures, pipelines, and processes clearly
Required Qualifications
Bachelor’s degree in Computer Science, Engineering, Data Science, or a related field (or equivalent experience)
Strong experience with Python and data engineering frameworks
Experience building ETL/ELT pipelines and working with large datasets
Solid understanding of databases (SQL and NoSQL) and data modeling
Familiarity with machine learning workflows and AI concepts
Experience with cloud platforms (AWS, Azure, or Google Cloud Platform)
Preferred / Nice-to-Have Skills
Experience with ML frameworks (PyTorch, TensorFlow, scikit-learn)
Knowledge of LLM ecosystems (OpenAI, Hugging Face, LangChain, etc.)
Experience with vector databases (Pinecone, FAISS, Weaviate, Milvus)
Familiarity with streaming technologies (Kafka, Spark Streaming, Flink)
Experience with MLOps tools and practices
Understanding of data privacy, security, and compliance standards