Data Science-ML Ops Engineer
We are seeking a highly capable Senior ML Engineer / MLOps Engineer with strong experience in building, deploying, and scaling machine learning systems in production. The ideal candidate will have hands-on expertise across the end-to-end ML lifecycle, including data pipelines, model development, deployment, and monitoring, along with a strong foundation in cloud-native architectures. This role requires close collaboration with data scientists and stakeholders to operationalize ML models for business-critical use cases such as personalization, recommendations, and NLP, while ensuring scalability, reliability, and performance in production environments.
Design, develop, and deploy end-to-end ML pipelines covering data ingestion, transformation, feature engineering, model training, evaluation, and production deployment. Deploy and scale ML models on cloud platforms such as AWS (SageMaker, EKS, Lambda) or GCP (Vertex AI, GKE, Cloud Functions), ensuring robust and cost-efficient architectures. Build and maintain CI/CD/CT pipelines using tools like GitHub Actions, Jenkins, or cloud-native services to automate model training, testing, and deployment. Containerize applications using Docker and orchestrate them with Kubernetes, while managing infrastructure through Terraform or CloudFormation.
Implement model lifecycle management practices, including model registries, versioning, and feature stores (e.g., MLflow, Feast), and establish strong observability frameworks using Prometheus and Grafana. Develop monitoring systems to track ML performance metrics, data drift, model drift, and overall model health, ensuring timely retraining and optimization. Build scalable data pipelines using Spark and SQL, orchestrated with tools such as Apache Airflow or AWS Step Functions. Collaborate closely with data scientists to productionize ML models for real-time and batch inference, enable A/B testing where applicable, and ensure smooth delivery of client-facing solutions. Mentor junior engineers and drive adoption of best practices in MLOps and software engineering.
Strong experience in ML Engineering / MLOps with demonstrated delivery of end-to-end ML solutions in production environments. Proficiency in Python and advanced SQL, along with hands-on experience in ML frameworks such as scikit-learn, TensorFlow, and PyTorch. Solid understanding of machine learning algorithms, evaluation techniques, performance metrics, and validation strategies. Hands-on expertise in cloud platforms (AWS or GCP), containerization (Docker), orchestration (Kubernetes), and CI/CD tools (Jenkins, GitHub Actions). Familiarity with MLflow, Feast, Prometheus, Grafana, and modern model monitoring practices, including data and model drift detection.
Strong problem-solving, communication, and stakeholder management skills with the ability to work independently in fast-paced environments. Bachelor's degree in Computer Science, Engineering, or a related field preferred.
Nice-to-have: Experience with real-time ML serving frameworks (KServe, formerly KFServing; Seldon; Ray Serve), A/B testing, and experimentation platforms. Exposure to media, subscription, or recommender systems, along with knowledge of experiment design and causal inference, will be an added advantage.
Required Skills: API Gateway, AWS Lambda, AWS Managed Services, AWS VPC, Adapting to Change, Attention to Consistency, Azure IAM, Cloud DevOps, Docker Containerization, Interpersonal Dynamics with Coworkers, Kubernetes, Linux Scripting, Python Programming Language, Results Orientation, Ruby on Rails, Time Management Skills, Unix Shell Scripting, Working Under Pressure, Written Communication Skills