SRE Manager Jobs (NOW HIRING)

About the Role As a Site Reliability Engineer (SRE) at Mercor, you'll own production reliability across our most critical systems, partnering directly with infrastructure leadership. You'll play a ...

Key Responsibilities Observability Engineering • Design, scale, optimize, and manage Prometheus ... Site Reliability Engineering • Apply and evolve an SRE Maturity Model to help teams mature ...

Site Reliability Engineer

Manhattan, NY · On-site

$63 - $83.50/hr

Site Reliability Engineer Location: NYC, NY (Hybrid) Duration: Contract Candidates with strong ... Proficient in Python and scripting for automation and system management, with a proven track record ...

Site Reliability Engineer

Chicago, IL · On-site

$58.75 - $78/hr

W (flexible on other 2 days) Site Reliability Engineer - Northern Trust, Goals Driven Wealth Management We are searching for a candidate who has extensive experience in Site Reliability Engineering ...

$50.75 - $67.50/hr

Site Reliability Engineer Duration: 12+ months Visa: PR-only Canadian's Location: Remote Skills: SRE, Kubernetes, APIs, WAF, databases, API Proxy (Gloo, APIGEE), Kafka, and Cloud (AWS/Azure/Google ...

Site Reliability Engineer

Frederick, MD · Hybrid

$56.75 - $75.25/hr

With experts in biomedical science, software engineering, and program management, we focus on ... Transportation Reimbursement Account (TRN) The Site Reliability Engineer role centers on ...

Site Reliability Engineer

Houston, TX

$54.50 - $72.25/hr

Site Reliability Engineer (SRE) Role Overview Looking for a Site Reliability Engineer (SRE) to ... Manage Dynatrace monitoring, including RUM and synthetic monitoring * Configure Adobe Analytics for ...

Site Reliability Engineer (SRE)

Omaha, NE · On-site

$54.50 - $72.50/hr

Site Reliability Engineer (SRE) Location: Omaha, NE / Dallas, TX Job Type: Full Time Job Summary ... Highly skilled in managing production failures, conducting root cause analysis, and driving ...

$57.75 - $76.75/hr

Site Reliability Engineer (SRE) Department: Technology Location: Manila Reporting To: Head of Infra ... System Monitoring & Incident Management * Build and maintain monitoring, alerting, and logging ...

Site Reliability Engineer

Beaverton, OR · On-site

$59.25 - $78.75/hr

Overview As a Site Reliability Engineer, you'll help drive Concora Credit's Mission to enable ... SLO, Monitoring, and Service Health Management: • Develop and maintain service level objectives ...

Site Reliability Engineer (SRE)

San Diego, CA · On-site

$60.50 - $80.50/hr

Common technologies you'll manage include: Kubernetes (EKS), Elasticsearch, Redis, RDS, ELB, and ... managing SRE teams and supporting mission critical applications 3+ years of Hybrid Cloud ...

SRE Manager information

Salary scale: $62K · $117.5K (average) · $168.5K

How much do SRE Manager jobs pay per year?

As of May 30, 2026, the average yearly pay for an SRE Manager in the United States is $117,488, according to ZipRecruiter salary data. Most workers in this role earn between $94,500 and $140,000 per year, depending on experience, location, and employer.

What are the key skills and qualifications needed to thrive as an SRE Manager, and why are they important?

To thrive as an SRE Manager, you need a deep understanding of site reliability engineering principles, strong experience with systems architecture, and a background in computer science or a related field. Familiarity with tools such as Kubernetes, Prometheus, cloud platforms, and CI/CD pipelines, as well as certifications like AWS Certified Solutions Architect, are commonly expected. Leadership, effective communication, and problem-solving skills are crucial for driving team performance and collaborating across departments. These skills ensure high system reliability, efficient incident management, and a culture of continuous improvement within technical organizations.

What are some common challenges SRE Managers face when leading Site Reliability Engineering teams?

SRE Managers often struggle to balance reliability with rapid development and to ensure their teams have the right mix of software engineering and operations skills. They must also foster a culture of continuous improvement while managing on-call rotations and incident response without causing burnout. Additionally, collaborating with development and product teams to set realistic service level objectives (SLOs) and drive adoption of SRE best practices requires strong communication and negotiation skills.

What are SRE Managers?

SRE Managers are leaders responsible for overseeing Site Reliability Engineering (SRE) teams. They ensure the reliability, scalability, and performance of software systems by guiding engineers in implementing best practices, automation, and monitoring processes. SRE Managers collaborate closely with development and operations teams to balance feature development with system stability. Their role also includes mentoring SREs, managing incident response, and driving improvements in system reliability and operational efficiency.

What is the difference between an SRE Manager and a DevOps Engineer?

Aspect | SRE Manager | DevOps Engineer
Credentials | Typically requires a Bachelor's/Master's in CS or related field, with certifications like AWS, Google Cloud, or Kubernetes | Similar credentials, often with cloud certifications and scripting skills
Work Environment | Leads teams, manages incident response, and oversees reliability strategies | Focuses on automation, CI/CD pipelines, and infrastructure deployment
Industry Usage | Common in large tech companies, financial services, and cloud providers | Widely used across startups, tech firms, and enterprises adopting DevOps practices

The SRE Manager and DevOps Engineer roles share overlapping skills in cloud computing, automation, and infrastructure management. The SRE Manager oversees reliability and team coordination, while the DevOps Engineer focuses on implementing automation tools and deployment pipelines. Both roles are crucial for modern IT operations, but the SRE Manager typically carries broader leadership responsibility, whereas the DevOps Engineer is more hands-on with technical implementation.

Infographic showing SRE Manager job openings in the United States as of May 2026, with employment types broken down into 67% Full Time and 33% Contract. Highlights a 67% In-person and 33% Hybrid job distribution, with an average salary of $117,488 per year, or $56.50 per hour.

Senior Manager, Site Reliability Engineering

Catalyst Brands

Dallas, TX • On-site

$103.50K - $172.50K/yr

Full-time

Posted 6 days ago


Company rating: 7.3 out of 10

Based on 18 frontline employees who took The Breakroom Quiz


Job description

Overview
Senior Manager, Site Reliability Engineering
The Site Reliability Engineering Manager is responsible for overseeing the daily operations and delivery of the Site Reliability Engineering teams. This role plays a key part in driving team productivity and ensuring the ongoing health, performance, resilience, and stability of Catalyst's eCommerce and CRM platforms.
In addition to managing operational aspects, the SRE Sr. Manager actively contributes to the technical direction of the team. This includes shaping the automation strategy, guiding telemetry and observability practices, leading solution delivery, and managing incidents and problems affecting platform reliability.
This is a hybrid leadership role that combines technical expertise with people management. The SRE Manager also contributes to both short- and long-term planning initiatives, spanning systems architecture, team development, and organizational strategy.
What You Will Do:
Team Leadership & Project Management
• Provide both technical and people leadership to Site Reliability Engineering (SRE) teams through regular one-on-one meetings, team syncs, and performance reviews.
• Manage project execution by organizing cross-functional teams, assigning responsibilities, and tracking progress against defined schedules and milestones.
• Assist in budgeting, workforce planning, hiring, and third-party contract negotiations to support team growth and operational goals.
Platform Reliability & Operational Excellence
• Drive continuous improvements in platform reliability, stability, and performance by overseeing the deployment of fully automated telemetry, observability, and AI-driven monitoring solutions.
• Lead the development and enhancement of intelligent alerting and automated incident response systems to improve service restoration speed and issue detection.
• Collaborate with administrators and platform engineers on implementation decisions to ensure highly reliable infrastructure, systems, and integrations.
• Document all changes in accordance with change control policies and documentation standards; identify risks and recommend corrective actions when necessary.
Incident & Problem Management
• Provide advanced Incident Management and Problem Management support by analyzing telemetry data and system logs to identify, remediate, and prevent reliability issues.
• Participate in on-call escalation support rotations in alignment with the 24/7/365 support model.
• Act as the Escalation Manager/Critical Incident Manager during major incidents, guiding teams through structured and effective service recovery.
• Communicate timely updates and incident reports to senior leadership during and after critical events.
Stakeholder Collaboration & Support
• Lead conversations and provide business and engineering support for both internal stakeholders and external customers.
What You Will Need:
Experience & Leadership
• 10+ years of experience in global organizations, with a proven ability to communicate effectively across all levels, from executives to individual contributors.
• 5+ years of hands-on Site Reliability Engineering (SRE) experience, including platform automation, telemetry, observability, and self-healing systems.
• Demonstrated leadership and collaboration in high-availability, mission-critical digital environments.
• Strong support knowledge and understanding of retail eCommerce flows across web and mobile technologies.
• Work with software engineers across scrum teams and performance engineering to ensure systems are meeting reliability and performance standards.
• Hands-on experience with debugging, optimizing code and automation.
• Identify opportunities to adopt innovative technologies and continuous improvement - Automation, Shift left, Self-Heal.
Platform & Application Support
• Extensive experience supporting and administering digital retail and eCommerce platforms with one of the Cloud providers (AWS/Azure/Google Cloud).
• Demonstrated experience in application design, software development, testing, and production support of Java/J2EE-based eCommerce applications.
• Practical experience monitoring and maintaining streaming platform technologies such as Apache Kafka.
• Deep understanding of cloud-native architectures and platform operations.
Monitoring, Telemetry & Observability
• Proficient with modern monitoring, logging, and telemetry tools including:
o New Relic, Splunk, ELK, Datadog, Dynatrace, Catchpoint, and AWS CloudWatch
• Hands-on experience designing and implementing automated health checks, observability pipelines, and self-healing solutions.
Automation & Infrastructure as Code (IaC)
• Strong experience with automation tools and frameworks, such as:
o Jenkins, Chef, Ansible, Terraform.
• Expertise in scripting languages used for platform automation and diagnostics:
o PowerShell, Python, Ruby, AWK, SED, etc.
Cloud, Networking & Systems Knowledge
• Advanced experience with public cloud platforms:
o Microsoft Azure and Amazon Web Services (AWS).
• Solid understanding of networking fundamentals:
o TCP/IP, DNS, DHCP, WINS.
• Advanced experience with Content Delivery Networks (CDNs) such as Akamai and Cloudflare.
Tooling & Operational Practices
• Experience using ITSM and collaboration platforms:
o Jira, BMC Remedy, ServiceNow.
• Strong understanding of IT operations frameworks (e.g., ITIL, MOF).
Education & Certifications
• Bachelor's degree in computer science or related technical field.
• Relevant technical certifications are a plus, including:
o Azure/AWS, Microsoft and ITIL.
Pay Range
USD $103,500.00 - USD $172,500.00 /Yr.
