Rubrik

13 Rubrik DevOps Engineer Jobs Hiring Near You

Staff Software Engineer - Reliability

Palo Alto, CA · On-site

$67 - $89.25/hr

Rubrik is a leading company in data protection and AI operations, seeking a Staff Site Reliability Engineer to ensure the reliability and performance of their enterprise infrastructure services. The ...

Staff Software Engineer - Reliability

Palo Alto, CA · On-site

$67 - $89.25/hr

... DevOps, or Platform engineering role operating hyperscale SaaS products. • Comprehensive, hands ... Rubrik is a data security platform that delivers cyber resilience, cyber posture, and cyber ...

Staff/Sr Information Security Engineer

Palo Alto, CA · On-site

$125K - $170K/yr

... of Rubrik's Security Data infrastructure. • Drive the evolution from SIEM-centric operations ... platforms or developer-facing security tooling. • Experience in container orchestration ...

Senior Manager, Product Design

Palo Alto, CA · On-site

$148K - $196K/yr

Rubrik is a leading company in data protection and AI operations, seeking a Senior Manager, Product ... Partner with VP and Director-level leadership in Product Management and Engineering to define the ...

Rubrik is a leading Security and AI Operations Company focused on data protection and cyber ... Engineering, and Executive Leadership. • Identify and drive improvements across end-to-end GTM ...

Senior Machine Learning Engineer

Palo Alto, CA · On-site

$123K - $168K/yr

Rubrik is a leading company at the intersection of data protection, cyber resilience, and enterprise AI acceleration. They are seeking a Senior Machine Learning Engineer to work on their Semantic AI ...

Staff Software Engineer - Reliability

Rubrik

Palo Alto, CA • On-site

$67 - $89.25/hr

Full-time

Posted 23 days ago


Job description

Job Summary:
Rubrik is a leading company in data protection and AI operations, seeking a Staff Site Reliability Engineer to ensure the reliability and performance of their enterprise infrastructure services. The role involves technical leadership, driving architectural vision, and managing cross-organizational reliability standards for cloud systems.
Responsibilities:
• Formulate and execute the architectural vision for Rubrik's Cloud Platform, optimizing backend infrastructure systems like Kubernetes, MySQL, and cloud-native services for performance, security, and multi-region scale.
• Build, scale, and maintain sophisticated custom internal tools, platform controllers, and automation frameworks in Go or Python to systematically eliminate operational toil.
• Wield engineering-wide influence to create technical consensus among component, platform, and security engineering teams, effectively 'shifting left' to embed structural resilience, capacity guards, and compliance from initial feature designs.
• Define, audit, and enforce robust Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Error Budgets across all critical enterprise platform services, translating telemetry insights into actionable product roadmaps during executive reviews.
• Serve as a primary Incident Commander for high-severity cloud outages, establishing roles, directing mitigation vectors under pressure, and orchestrating comprehensive, blameless post-mortems that drive durable systemic fixes.
• Architect cost-observability tools and attribution frameworks, leading cloud infrastructure capacity forecasting, resource quota optimization, and vendor SLA management.
• Set the technical direction for the Application-SRE team, raising the bar on how the team diagnoses, mitigates, and durably resolves the most complex customer-impacting issues across our platform.
• Champion SRE best practices, mentoring senior and junior individual contributors across the organization, participating in interview frameworks, and actively raising the collective technical bar.
• Participate in on-call rotations.
Qualifications:
Required:
• Must be a US Citizen currently residing on CONUS soil (strict regulatory requirement to enable support for federal and FedRAMP environments when required).
• BS, MS, or PhD in Computer Science, Computer Engineering, or a highly related technical discipline.
• At least 8 years of software engineering and production cloud infrastructure experience, including at least 5 years in a formal SRE, DevOps, or Platform engineering role operating hyperscale SaaS products.
• Comprehensive, hands-on programming expertise in Golang, Python, or Java with a deep grasp of concurrency models, data structures, and test-driven software design patterns.
• Proven proficiency designing, deploying, analyzing, and auditing complex, large-scale distributed systems, database topologies, and high-availability public cloud meshes.
• Authoritative operational command of Unix/Linux operating system environments (process models, file systems, kernels), systems administration, and advanced L4/L7 networking protocols.
• Institutionalize the channel that converts patterns from customer escalations and POCs into prioritized product and reliability feedback, partnering directly with Product, Sales Engineering and Support leadership.
• Track record of partnering directly with Sales, Support, and customers on escalations and POCs, and translating field signals into engineering action.
• Demonstrated history of technical leadership, mapping architectural dependencies, managing multi-team technical projects, and guiding organizations through critical platform shifts with high technical judgment.
Preferred:
• Extensive production experience provisioning, lifecycle-managing, and recovering enterprise-scale Kubernetes (GKE, EKS) deployments and large-scale relational/non-relational databases (MySQL).
• Prior experience building, certifying, or auditing infrastructure environments under compliance structures such as FedRAMP (High/Moderate), SOC 2, ISO 27001, or CJIS.
• Fluency in Infrastructure-as-Code (Terraform, Pulumi) module design, multi-tenant state isolation, and enterprise observability fabrics (Prometheus, Grafana, OpenTelemetry).
Company:
Rubrik is a data security platform that delivers cyber resilience, cyber posture, and cyber recovery solutions. Founded in 2014, the company is headquartered in Palo Alto, CA, USA, and has 1,001-5,000 employees. It is currently a late-stage company.

About Rubrik

Sourced by ZipRecruiter

Rubrik, the Zero Trust Data Security Company™, delivers data security and operational resilience for enterprises. Rubrik's big idea is to provide data security and data protection on a single platform, including Zero Trust Data Protection, Ransomware Investigation, Incident Containment, Sensitive Data Discovery, and Orchestrated Application Recovery. This means your data is ready so you can recover the data you need and avoid paying a ransom. Because when you secure your data, you secure your applications, and you secure your business. We are a leader in data security, have been recognized as a Forbes Cloud 100 Company, have been named a LinkedIn Top 10 Startup, and are proud to have earned Great Place to Work® Certification™. There has never been a more exciting time to join Rubrik, and our future is even brighter. The work you do will help propel our next chapter of growth as you do the best work of your career.

Industry

Internet and IT

Company size

1,001 - 5,000 Employees

Headquarters location

Palo Alto, CA, US

Year founded

2014