Role
We are looking for a Staff Site Reliability Engineer (Federal) to join our team. This is an onsite role based in Crystal City, Virginia, reporting to the Manager, Site Reliability Engineering in the Government Cloud department. You will be part of the team that built the world's largest cloud security platform, specifically focusing on enabling US Government organizations to harness speed and agility through a cloud-first strategy. You will bring your vision and passion to help scale our multitenant architecture and enhance services across classified environments.
What you'll do (Role Expectations)
Manage operational tasks for products in US Government classified environments, including deployments, on-call duties, and incident management
Oversee cloud infrastructure components across AWS, private cloud environments, containers, and VMs
Develop scripts, containerized services, and monitoring mechanisms to automate operations tasks and ensure minimal service disruption
Create operations documentation and implement measures to prevent recurring incidents while contributing to DevOps best practices
Build and enhance Zscaler services within classified environments, ensuring 24x7 coverage including night and holiday shifts
Who You Are (Success Profile)
You thrive in ambiguity. You're comfortable building the path as you walk it. You thrive in a dynamic environment, seeing ambiguity not as a hindrance, but as the raw material to build something meaningful.
You act like an owner. Your passion for the mission fuels your bias for action. You operate with integrity because you genuinely care about the outcome. True ownership involves leveraging dynamic range: the ability to navigate seamlessly between high-level strategy and hands-on execution.
You are a problem-solver. You love running toward challenges because you are laser-focused on finding the solution, knowing that solving the hard problems delivers the biggest impact.
You are a high-trust collaborator. You are ambitious for the team, not just yourself. You embrace our challenge culture by giving and receiving ongoing feedback, knowing that candor delivered with clarity and respect is the truest form of teamwork and the fastest way to earn trust.
You are a learner. You have a true growth mindset and are obsessed with your own development, actively seeking feedback to become a better partner and a stronger teammate. You love what you do and you do it with purpose.
What We're Looking for (Minimum Qualifications)
Foundational understanding of AI/ML technologies and experience leveraging, securing, or positioning AI-driven solutions to optimize outcomes within your functional domain
This position requires an active US Government Secret, Top Secret, or TS/SCI security clearance. Maintenance of this clearance is a condition of continued employment.
5+ years of Site Reliability Engineering experience in both Operations and Engineering environments
Technical proficiency in Linux administration, network troubleshooting, and automation tools like Ansible and Terraform
Strong skills in Python coding and container-based architectures such as AWS ECS or Kubernetes
What Will Make You Stand Out (Preferred Qualifications)
Experience deploying, monitoring, or maintaining AIOps tools and machine learning pipelines within highly secure, air-gapped, or FedRAMP-authorized cloud environments
Experience managing air-gapped environments and FedRAMP High/Moderate authorization levels, along with an Information Assurance Technician (IAT) Level 2 certification
Active Top Secret security clearance