Site Reliability Engineer Manager Jobs (NOW HIRING)

The SRE team is responsible for maintaining the existing systems, supporting our development teams ... You will work alongside software developers, testers, and project managers. *This position is ...

Site Reliability Engineer (SRE)

Downingtown, PA · On-site

$59 - $78.50/hr

The SRE team is responsible for maintaining the existing systems, supporting our development teams ... You will work alongside software developers, testers, and project managers. *This position is ...

Site Reliability Engineer (SRE)

Dallas, TX · On-site

$56.75 - $75.25/hr

Collaborate with Major Incident Management (MIM) teams during critical events * Drive proactive ... SRE / Production Support / IT Operations * Strong scripting/automation expertise ( Python ...

Site Reliability Engineer (SRE)

Plano, TX · On-site

$54.50 - $72.50/hr

Site Reliability Engineer (SRE) Location: Richmond, VA or Plano, TX Work Model: Hybrid - 3 days onsite per week Duration: Long term contract Job Summary: We are seeking an experienced Site ...

Site Reliability Engineer (SRE)

Englewood, CO

$56.25 - $74.75/hr

The SRE team is responsible for maintaining the existing systems, supporting our development teams ... You will work alongside software developers, testers, and project managers. *This position is ...

Site Reliability Engineer (SRE)

Reston, VA

$59.25 - $78.75/hr

The SRE team is responsible for maintaining the existing systems, supporting our development teams ... You will work alongside software developers, testers, and project managers. *This position is ...

SRE

Charlotte, NC · On-site

$55.75 - $74/hr

Role: SRE Location: Charlotte, NC Skills: Grafana, Python, Splunk, Linux, Scripting. Microsoft 360 ... Lead and manage a team of support engineers in resolving incidents, requests, and problems to ...

Site Reliability Engineer Manager information

How much do site reliability engineer manager jobs pay per hour?

As of Jun 15, 2026, the average hourly pay for a site reliability engineer manager in the United States is $63.74, according to ZipRecruiter salary data. Most workers in this role earn between $54.81 and $72.84 per hour, depending on experience, location, and employer.

Will AI replace SRE jobs?

AI is expected to augment Site Reliability Engineer (SRE) roles by automating routine tasks such as monitoring, incident response, and data analysis. However, SREs will continue to be essential for designing systems, managing complex issues, and making strategic decisions that require human judgment and expertise. The role is likely to evolve with AI tools rather than be fully replaced.

What is a Site Reliability Engineer Manager?

A Site Reliability Engineer (SRE) Manager oversees a team of site reliability engineers tasked with maintaining the reliability, scalability, and performance of software systems. Their role combines leadership and technical expertise, focusing on automating operations, managing incidents, and ensuring high availability of services. They work closely with engineering and operations teams to implement best practices in monitoring, incident response, and system design. SRE Managers also mentor their teams, set reliability goals, and help drive a culture of continuous improvement within the organization.

What engineers make $500,000?

Senior-level Site Reliability Engineers (SREs) with extensive experience, advanced skills in cloud infrastructure, automation, and monitoring tools can earn $500,000 or more annually, especially in high-cost-of-living areas or large tech companies. Achieving this level often requires specialized certifications, leadership responsibilities, and a strong track record of system reliability improvements.

How much do SRE managers make in the US?

Site Reliability Engineering (SRE) managers in the US typically earn between $130,000 and $180,000 annually, with senior roles and large tech companies offering higher compensation. Salaries can vary based on experience, location, and company size, and often include bonuses and stock options.

What is the role of site reliability engineer manager?

A Site Reliability Engineer Manager oversees a team responsible for maintaining the availability, performance, and reliability of large-scale systems and services. They coordinate incident response, implement automation, and collaborate with development teams to improve system resilience, often using tools like monitoring and alerting platforms. Strong leadership, technical expertise, and understanding of cloud infrastructure are essential for this role.

What is the difference between Site Reliability Engineer Manager vs Site Reliability Engineer?

Aspect | Site Reliability Engineer (SRE) | Site Reliability Engineer Manager
Responsibilities | Focuses on designing, implementing, and maintaining reliable systems and automation | Oversees SRE teams, manages projects, and aligns reliability goals with business objectives
Required Skills | Strong coding, system design, and troubleshooting skills | Leadership, team management, strategic planning
Certifications | Google Cloud, AWS certifications, Linux, scripting | Same as SRE, plus management certifications (e.g., PMP) often preferred
Work Environment | Technical, hands-on with systems and automation | Managerial, coordinating teams and projects

The main difference is that a Site Reliability Engineer focuses on technical system reliability, while a Site Reliability Engineer Manager oversees teams and strategic initiatives to ensure reliability goals are met across projects.

How does a Site Reliability Engineer Manager typically balance technical leadership with team management responsibilities?

A Site Reliability Engineer Manager often splits their time between overseeing technical projects, such as system reliability improvements and incident response strategies, and managing the growth and well-being of their engineering team. This includes mentoring SREs, facilitating communication between teams, setting priorities, and ensuring that operational goals align with business objectives. Balancing these responsibilities requires strong organizational skills and a proactive approach to both technical challenges and people management. Successful managers regularly engage in hands-on problem-solving while also fostering a collaborative team environment.

What are the key skills and qualifications needed to thrive as a Site Reliability Engineer Manager, and why are they important?

To thrive as a Site Reliability Engineer Manager, you need expertise in systems engineering, incident management, and a strong background in software development or computer science, often supported by a bachelor’s degree or equivalent experience. Familiarity with cloud platforms (like AWS, GCP, or Azure), infrastructure as code tools (such as Terraform), monitoring systems (like Prometheus), and certifications in cloud or DevOps practices are highly valued. Strong leadership, effective communication, and problem-solving abilities help you guide teams and foster collaboration across departments. These skills and qualities ensure the stability, scalability, and reliability of critical systems while enabling teams to respond effectively to complex technical challenges.
Infographic showing various Site Reliability Engineer Manager job openings in the United States as of June 2026, with employment types broken down into 1% Locum Tenens, 95% Full Time, 1% Part Time, and 3% Contract. Highlights an 87% Physical, 5% Hybrid, and 8% Remote job distribution, with an average salary of $132,583 per year, or $63.7 per hour.
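The annual and hourly averages quoted above agree with each other if a full-time year is taken to be 2,080 paid hours (40 hours a week for 52 weeks). That hour count is an assumption, not something the salary data states, and the function names below are illustrative only. A minimal Python sketch of the conversion:

```python
# Assumption: a full-time year is 40 hours/week * 52 weeks = 2,080 paid hours.
HOURS_PER_YEAR = 40 * 52


def annual_to_hourly(annual: float, hours_per_year: int = HOURS_PER_YEAR) -> float:
    """Convert an annual salary to an equivalent hourly rate."""
    return annual / hours_per_year


def hourly_to_annual(hourly: float, hours_per_year: int = HOURS_PER_YEAR) -> float:
    """Convert an hourly rate to an equivalent annual salary."""
    return hourly * hours_per_year


if __name__ == "__main__":
    # Average annual salary from the infographic, expressed per hour.
    print(round(annual_to_hourly(132_583), 2))  # 63.74
    # Average hourly pay from the FAQ, expressed per year.
    print(round(hourly_to_annual(63.74)))       # 132579
```

The small gap between $132,579 and $132,583 comes from rounding the hourly figure to the cent.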

Senior Manager, Site Reliability Engineering

Tubi

San Francisco, CA • On-site, Remote

$67.25 - $89.25/hr

Full-time

Medical, Dental, Vision, Retirement, PTO

Posted 17 days ago


Job description

About the Role:
Site Reliability Engineering (SRE) at Tubi is not a traditional operations team. We are a software engineering organization that applies a developer's mindset and toolkit to the challenges of building and running large-scale, distributed systems. Our mission is to engineer resilience from the ground up, enabling our product teams to innovate rapidly while ensuring our users have a stellar experience. We own the availability, latency, performance, and capacity of our platform, and we achieve our goals through a culture of data-driven decision-making, blameless learning, and relentless automation.
We are seeking an experienced and visionary Senior SRE Manager to lead and grow our newly built Site Reliability Engineering team. You are more than a people manager or a tech lead; you are the strategic leader responsible for architecting our reliability roadmap. You will build and mentor a team of talented engineers, foster a culture of blameless learning and continuous improvement, and champion the engineering practices that allow us to balance rapid innovation with rock-solid stability. You will be a key influencer in our engineering leadership, partnering with peers across the organization to ensure reliability is a shared responsibility and a core tenet of our engineering culture.
What You'll Do:
  • Team Leadership & Mentorship:
    • Lead, mentor, and grow a team of Site Reliability Engineers. Foster a culture of innovation and technical excellence where engineers feel empowered to do their best work. Provide personalized coaching, create professional development plans, and guide the careers of senior and emerging talent within the team.
    • Establish equitable, sustainable on-call practices (including global coverage where applicable) that protect focus time and avoid burnout.
    • Define team rituals - runbook reviews, game days, and incident retros - that reinforce quality and learning.
  • Strategic Planning & Vision: Define and drive the multi-year technical strategy and vision for Tubi's observability and automation platforms. Partner with the infra lead to align Tubi's infrastructure and SRE roadmaps. Partner with tech leaders to align the SRE roadmap with business objectives. Champion a data-driven approach to reliability, using Service Level Objectives (SLOs) and error budgets to facilitate productive conversations about risk and feature velocity.
  • Operational Excellence & Incident Management:
    • Own the end-to-end availability, performance, and efficiency of our critical user-facing services. Evolve our incident response practice to reduce Mean Time to Resolution (MTTR) and increase Mean Time Between Failures (MTBF). Champion a rigorous, blameless, and data-driven post-mortem culture to ensure we learn from both successes and failures, driving engineering teams toward systemic fixes and automation that prevent incidents from recurring.
    • Streamline our existing processes and practices, and collaborate with other teams to raise our production release standards.
    • Define and tune a 24×7 on-call rotation for low noise and fast response; act as executive escalation partner during major incidents.
    • Own disaster-recovery strategy (playbooks, failover drills, recovery simulations) and track SLO gaps with time-bound remediations.
  • Financial & Vendor Management: Own the SRE budget, tooling, and headcount. Manage relationships with key third-party vendors for our observability and SRE-related AI platforms. Work with the infra lead and finance team on contract negotiations, and ensure we derive maximum value from our investments.
  • Cross-Functional Collaboration: Act as a key influencer and strategic partner to leaders in Software Engineering, Product Management, and Infra/Sec. Drive the adoption of SRE best practices and principles throughout the organization, ensuring new services are designed for reliability, scalability, and observability from day one.
  • The AI Mandate: Building the Future of Observability with AI. You will not just manage a team that uses AI; you will lead the charge in building an AI-native SRE function. This is a strategic mandate that requires a forward-thinking leader who understands both the potential and the pitfalls of integrating intelligent systems into critical operations. This includes:
    • AIOps Strategy Development: Developing and executing the strategy for integrating AIOps and machine learning into our observability stack. Your goal will be to move the team from a reactive monitoring posture to one of predictive maintenance and automated anomaly detection, fundamentally changing how we ensure reliability.
    • Accelerating Automation with AI: Championing the effective and responsible use of AI-assisted coding tools (e.g., Claude Code, Cursor) within the SRE team. You will set the standards and practices to leverage these tools to accelerate the development of automation, operational tooling, and infrastructure code.
    • Building the Business Case: Building the techno-economic case for new AI tooling, managing vendor relationships, and ensuring the cost-effective and secure implementation of these powerful systems. You must be able to articulate the ROI of these investments in terms of reduced downtime, improved operational efficiency, and faster incident resolution.
    • Fostering Critical AI Literacy: Fostering a culture that can critically evaluate, debug, and learn from the outputs of AI systems. This involves extending our blameless post-mortem philosophy to AI-driven actions and recommendations, ensuring that the team remains in control and understands the "why" behind automated decisions.
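For readers less familiar with the incident metrics named in the list above, the sketch below shows one common way to compute MTTR and MTBF from a set of incident records. It is an illustration under stated assumptions, not a description of Tubi's tooling: the `Incident` record, the sample timestamps, and the 30-day observation window are all hypothetical, and MTBF is taken here as total uptime divided by the number of failures.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Incident:
    """Hypothetical incident record: when impact began and when it was resolved."""
    started: datetime
    resolved: datetime


def total_downtime(incidents: list[Incident]) -> timedelta:
    """Sum of the durations of all incidents."""
    return sum((i.resolved - i.started for i in incidents), timedelta())


def mttr(incidents: list[Incident]) -> timedelta:
    """Mean Time to Resolution: average incident duration."""
    if not incidents:
        raise ValueError("MTTR is undefined without incidents")
    return total_downtime(incidents) / len(incidents)


def mtbf(incidents: list[Incident], window_start: datetime, window_end: datetime) -> timedelta:
    """Mean Time Between Failures: uptime in the window divided by failure count."""
    if not incidents:
        raise ValueError("MTBF is undefined without incidents")
    uptime = (window_end - window_start) - total_downtime(incidents)
    return uptime / len(incidents)


# Made-up sample data: three incidents lasting 30, 60, and 90 minutes.
SAMPLE = [
    Incident(datetime(2026, 6, 3, 10, 0), datetime(2026, 6, 3, 10, 30)),
    Incident(datetime(2026, 6, 12, 2, 0), datetime(2026, 6, 12, 3, 0)),
    Incident(datetime(2026, 6, 25, 18, 0), datetime(2026, 6, 25, 19, 30)),
]

if __name__ == "__main__":
    print("MTTR:", mttr(SAMPLE))
    print("MTBF:", mtbf(SAMPLE, datetime(2026, 6, 1), datetime(2026, 7, 1)))
```

A healthier incident practice shows up as a lower MTTR and a higher MTBF over comparable windows.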

Your Background:
  • 8+ years of experience in a technical field, with at least a year in an engineering leadership position managing SRE, DevOps, or Production Engineering teams.
  • A deep, principled understanding of SRE tenets, including Service Level Indicators (SLIs), SLOs, error budgets, toil reduction, and capacity planning.
  • Exceptional communication, negotiation, and influencing skills, with the ability to articulate complex technical concepts and strategies to both technical and non-technical stakeholders at all levels of the organization.
  • A strong technical background as a hands-on software engineer or site reliability engineer prior to moving into management. Deep knowledge of AWS services (especially networking, IAM, EKS, ALBs/NLBs, Route 53, CloudWatch). Proven experience with Kubernetes in production (EKS preferred), including service exposure, networking, and availability engineering.
  • Hands-on familiarity with modern SRE tools and technologies, including Infrastructure as Code (e.g., Terraform, Ansible), container orchestration (Kubernetes), observability platforms (e.g., Prometheus, Grafana, Datadog, Splunk), incident tooling (e.g., PagerDuty, FireHydrant), deployment-safety tooling (e.g., Argo Rollouts, LaunchDarkly), and observability standards (e.g., OpenTelemetry).
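
As a rough illustration of how the SLI, SLO, and error-budget concepts listed above relate, here is a minimal Python sketch. The request counts, the 99.9% target, and the function names are assumptions chosen for the example; real SLOs are usually evaluated by an observability platform rather than by hand.

```python
from datetime import timedelta


def sli(good_events: int, total_events: int) -> float:
    """Service Level Indicator: fraction of events that met the success criterion."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return good_events / total_events


def error_budget_remaining(good_events: int, total_events: int, slo_target: float) -> float:
    """Fraction of the error budget still unspent (negative when the SLO is breached)."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    allowed_bad = (1 - slo_target) * total_events  # the budget, in bad events
    actual_bad = total_events - good_events
    return 1 - actual_bad / allowed_bad


def allowed_downtime(slo_target: float, window: timedelta) -> timedelta:
    """Time-based view of the same budget: how much downtime the SLO tolerates."""
    return window * (1 - slo_target)


if __name__ == "__main__":
    # Hypothetical month: 1,000,000 requests, 600 of them failed, 99.9% SLO.
    print("SLI:", sli(999_400, 1_000_000))
    print("Budget remaining:", round(error_budget_remaining(999_400, 1_000_000, 0.999), 3))
    print("Allowed downtime in 30 days:", allowed_downtime(0.999, timedelta(days=30)))
```

In this made-up case the 99.9% target permits about 1,000 failed requests, so 600 failures leave roughly 40% of the budget; the remaining budget is the number a team would weigh when deciding how much release risk to take on.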

Pursuant to state and local pay disclosure requirements, the pay range for this role, with final offer amount dependent on education, skills, experience, and location, is listed annually below. This role is also eligible for various benefits, including medical/dental/vision, insurance, a 401(k) plan, paid time off, and other benefits in accordance with applicable plan documents.
High cost labor markets such as but not limited to Los Angeles, New York City, and San Francisco
$227,200-$324,500 USD
Tubi is a division of Fox Corporation, and the FOX Employee Benefits summarized here cover the majority of US employee benefits. The distinctions below outline the differences between the Tubi and FOX benefits:
  • For US-based non-exempt Tubi employees, the FOX Employee Benefits summary accurately captures the Vacation and Sick Time.
  • For all salaried/exempt employees, in lieu of the FOX Vacation policy, Tubi offers a Flexible Time off Policy to manage all personal matters.
  • For all full-time, regular employees, in lieu of FOX Paid Parental Leave, Tubi offers a generous Parental Leave Program, which allows parents twelve (12) weeks of paid bonding leave within the first year of birth, adoption, surrogacy, or foster placement of a child in addition to applicable government leave program(s) and FOX's short-term disability policy. This time is 100% paid through a combination of any applicable state, city, and federal leaves and wage-replacement programs in addition to contributions made by Tubi.
  • For all full-time, regular employees, Tubi offers a monthly wellness reimbursement.
About Tubi:
Boldly built for every fandom, Tubi is a free streaming service that entertains over 100 million monthly active users. Tubi offers the world's largest collection of Hollywood movies and TV shows, thousands of creator-led stories and hundreds of Tubi Originals made for the most passionate fans. Headquartered in San Francisco and founded in 2014, Tubi is part of Tubi Media Group, a division of Fox Corporation.
We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, gender identity, disability, protected veteran status, or any other characteristic protected by law. We will consider for employment qualified applicants with criminal histories consistent with applicable law.