Job Summary:
Rubrik, a company focused on data protection and AI operations, is hiring a Senior DevOps Infrastructure Engineer for its Infrastructure team. The role is responsible for the automation, reliability, and scalability of private cloud environments, and for driving best practices in DevOps and platform engineering.
Responsibilities:
• Own the design and development of Infrastructure-as-Code (IaC) solutions for private cloud platforms (OpenStack, OpenShift, OLVM, Nutanix, Huawei, VMware) using Terraform, Ansible, and Python.
• Build and maintain self-service provisioning pipelines that allow engineering teams to spin up compute, network, and storage resources on demand across private cloud environments.
• Drive end-to-end automation of platform lifecycle operations — from cluster bootstrapping and OS imaging to patching, scaling, and decommissioning.
• Design and implement CI/CD pipelines for infrastructure changes, ensuring all platform updates are tested, versioned, and deployed through automated workflows.
• Develop and maintain platform observability stacks — metrics, logging, alerting, and dashboards — to ensure proactive detection and rapid resolution of infrastructure issues.
• Define and enforce SLOs/SLAs for platform services; lead blameless post-mortems and drive systematic elimination of toil through automation.
• Champion GitOps and DevOps best practices across the team — code reviews, version control for all infrastructure, and automation-first problem solving.
• Collaborate closely with Engineering, Security, and Networking teams to design scalable, secure infrastructure architectures that meet evolving R&D requirements.
• Evaluate and integrate new private cloud technologies and tooling, providing technical recommendations and proof-of-concept implementations.
• Serve as a technical escalation point for complex platform issues — conducting root cause analysis and driving permanent fixes rather than workarounds.
• Maintain comprehensive runbooks, architecture diagrams, and platform documentation as living artifacts within version control.
• Mentor team members on DevOps tooling, automation patterns, and platform engineering principles.
Qualifications:
Required:
• Degree in Engineering, Computer Science, or a related field, or equivalent practical experience.
• 6+ years of experience in DevOps, platform engineering, or infrastructure engineering roles.
• 4+ years of hands-on experience with private cloud platforms — OpenStack, OpenShift, OLVM, Nutanix, Huawei, or equivalent.
• Strong proficiency in Infrastructure-as-Code using Terraform and Ansible; Python and/or shell scripting for automation is essential.
• Solid experience building and maintaining CI/CD pipelines (e.g. GitLab CI, Jenkins, GitHub Actions, ArgoCD) for infrastructure delivery.
• Experience with container orchestration — Kubernetes/OpenShift — and understanding of cloud-native deployment patterns.
• Proficiency with observability tooling: Prometheus, Grafana, ELK/EFK stack, or equivalent.
• Solid understanding of Linux systems, networking (DHCP, DNS, BGP/VXLAN overlays), and storage protocols (NFS, iSCSI, FC).
• Strong communication and documentation skills — able to write clear RFCs, runbooks, and architecture proposals.
• Self-driven, collaborative, and comfortable operating with a high degree of ownership in a fast-moving engineering environment.
Preferred:
• Experience with VMware environments (vSphere, vCenter, NSX) is a strong plus.
• Familiarity with GitOps workflows and version-controlled infrastructure management.
• Experience with cloud storage solutions (AWS S3, Glacier) or hybrid cloud architectures is a plus.
• Familiarity with Rubrik backup and data management solutions is a plus.
Company:
Rubrik is a data security platform that delivers cyber resilience, cyber posture, and cyber recovery solutions. Founded in 2014, the company is headquartered in Palo Alto, USA, and has 1,001 to 5,000 employees. It is currently classified as Late Stage.