Required Qualifications
· 3+ years of hands-on experience with AWS services, including EKS, EC2, S3, IAM, CloudWatch, and ECR.
· Strong experience operating and troubleshooting Kubernetes (preferably AWS EKS).
· Proficiency in containerization (Docker) and orchestration concepts.
· Strong programming/scripting experience in Python and Bash.
· Experience building and managing CI/CD pipelines (GitLab or equivalent).
· Familiarity with machine learning workflows, including training, inference, and model monitoring.
· Experience with infrastructure-as-code (Terraform or CloudFormation).
· Experience supporting production platforms, including incident management and root cause analysis.
---
Preferred Qualifications
· Experience managing data analytics platforms and tools (e.g., Domino, SageMaker).
· Experience with ML lifecycle tools such as MLflow.
· Experience supporting GPU-based workloads or distributed training environments.
· Familiarity with enterprise MLOps architectures and patterns (batch, real-time, microservices).
· Understanding of data processing frameworks and feature pipelines.
---
Other Competencies
· Strong analytical, troubleshooting, and problem-solving skills.
· Effective communication and documentation abilities.
· Ability to collaborate across engineering, analytics, and product teams.
· Self-motivated with the ability to drive initiatives independently.
· Ability to work in a complex, regulated enterprise environment.