Job Summary
The Senior DevOps / Platform Engineer will architect, build, and operate large-scale cloud and Kubernetes platforms, with a strong focus on Alibaba Cloud services. This role drives cloud migrations, defines GitOps-based deployment strategies, and ensures highly reliable, scalable, and secure production systems while partnering closely with SRE and Security teams.

Experience
8+ years overall experience in DevOps / Platform Engineering

Key Responsibilities
- Architect and drive large-scale migrations of business-critical services to cloud and Kubernetes-based platforms.
- Define and implement GitOps-first deployment strategies using ArgoCD, with Spinnaker for advanced delivery workflows.
- Design, build, and operate production-grade Kubernetes platforms at scale.
- Establish best practices for CI/CD, deployment automation, and release strategies (blue-green, canary, progressive delivery).
- Design and maintain reusable Helm charts and standardized deployment patterns.
- Develop and maintain Python-based tooling and automation for deployment, operations, and reliability.
- Provide deep Linux systems expertise, including performance tuning, debugging, and incident mitigation.
- Own and support production systems, including on-call participation, incident response, and root cause analysis.
- Partner with SRE and Security teams to embed reliability, scalability, and security into platform design.
- Drive architectural reviews, author design documents, and influence long-term platform and migration roadmaps.
- Mentor engineers and raise the bar for DevOps and platform engineering practices.

Required Skills & Experience
- Strong DevOps engineering experience in large-scale environments.
- Hands-on expertise with Alibaba Cloud (AliCloud) services:
  - ECS
  - OSS
  - RDS
  - VPC
- Strong experience with Kubernetes and container orchestration.
- Expertise in GitOps, ArgoCD, and Spinnaker.
- Experience with Helm, CI/CD pipelines, and deployment automation.
- Proficiency in Python scripting for automation and tooling.
- Strong Linux systems knowledge and troubleshooting skills.
- Experience supporting production environments with high availability requirements.

Competencies
- Cloud & Platform Architecture
- DevOps & Release Engineering
- Kubernetes & Container Platforms
- Automation & Reliability Engineering
- Incident Management & RCA
- Mentoring and Technical Leadership

Preferred Skills
- Experience with AWS in addition to Alibaba Cloud
- SRE practices and observability tooling
- Security best practices in cloud-native platforms
- Experience with large-scale cloud migration programs