Site Reliability Engineer Manager Jobs in Raleigh, NC

Site Reliability Engineer

Morrisville, NC

$53.25 - $70.75/hr

Site Reliability Engineer. The Company: Varonis (Nasdaq: VRNS) secures AI and the data that powers it ... Backed by 24x7x365 managed detection and response, Varonis gives thousands of organizations ...

Site Reliability Engineer

Raleigh, NC · On-site +1

$55.50 - $73.75/hr

The Site Reliability Engineer Role Join our dynamic team at Qlik as a Site Reliability Engineer ... Proficiency with incident management best practices and confidently drive an incident in a critical ...

Site Reliability Engineer

Raleigh, NC · On-site

$120K - $150K/yr

As a Site Reliability Engineer (SRE) at Litera, you will play a key role in ensuring our SaaS ... Experience using configuration management tools (Terraform, Puppet, Ansible) * Experience working ...

SRE Engineer

Raleigh, NC · On-site

$55.50 - $73.75/hr

Monday-Friday, 9am-5pm. The Site Reliability Engineer will be an active contributor responsible for ... management zones, and business application mapping. Create dashboards and reports for both ...

Site Reliability Engineer II

Raleigh, NC

$55.50 - $73.75/hr

Kastle Systems is the leader in managed security, with a track record of introducing innovative ... Site Reliability Engineer II The SRE II sits at the intersection of software engineering and ...

SRE Engineer - PxE Talent

Raleigh, NC · On-site

$55.50 - $73.75/hr

As a SRE Engineer you will actively engage in your engineering craft, taking a hands-on approach to ... Ability to manage and prioritize multiple tasks in a fast-paced and dynamic environment * Strong ...

ServiceNow SRE Engineering Manager

Raleigh, NC

$55.50 - $73.75/hr

As a Manager, ServiceNow SRE Engineer, you will actively engage in your engineering craft, taking a hands-on approach to multiple high-visibility projects. Your expertise will be pivotal in ...

Manager, SRE Engineer - PxE ERM

Raleigh, NC · On-site

$55.50 - $73.75/hr

As a Manager, SRE Engineer, you will actively engage in your engineering craft, taking a hands-on approach to multiple high-visibility projects. Your expertise will be pivotal in delivering ...

Site Reliability Engineer Manager information

Raleigh, NC hourly pay scale (chart): $10, $61, $89

How much do site reliability engineer manager jobs pay per hour?

As of Jun 15, 2026, the average hourly pay for a site reliability engineer manager in Raleigh, NC is $61.96, according to ZipRecruiter salary data. Most workers in this role earn between $53.27 and $70.82 per hour, depending on experience, location, and employer.
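
To compare these hourly figures with the annual salaries quoted in some listings, a rough conversion can help. The sketch below assumes a full-time year of 2,080 paid hours (40 hours × 52 weeks); that figure is an assumption for illustration, not something the salary data states, and the function names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Assumption (not stated on this page): a full-time year is 40 hours x 52 weeks.
HOURS_PER_YEAR = Decimal(40 * 52)  # 2,080 hours
CENT = Decimal("0.01")


def annual_from_hourly(hourly: Decimal) -> Decimal:
    """Convert an hourly rate to an annual figure, rounded to the cent."""
    return (hourly * HOURS_PER_YEAR).quantize(CENT, rounding=ROUND_HALF_UP)


def hourly_from_annual(annual: Decimal) -> Decimal:
    """Convert an annual figure to an hourly rate, rounded to the cent."""
    return (annual / HOURS_PER_YEAR).quantize(CENT, rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    # Hourly figures quoted in the answer above.
    rates = [
        ("low", Decimal("53.27")),
        ("average", Decimal("61.96")),
        ("high", Decimal("70.82")),
    ]
    for label, rate in rates:
        print(f"{label}: ${rate}/hr is about ${annual_from_hourly(rate):,}/yr")
```

Under this assumption the $61.96 average works out to about $128,877 per year, which is close to the $128,882 annual average quoted elsewhere on this page. Contract roles, overtime, and unpaid leave would change the hours figure, so treat the result as a ballpark only.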

Will AI replace SRE jobs?

AI is expected to augment Site Reliability Engineer (SRE) roles by automating routine tasks such as monitoring, incident response, and data analysis. However, SREs will continue to be essential for designing systems, managing complex issues, and making strategic decisions that require human judgment and expertise. The role is likely to evolve with AI tools rather than be fully replaced.

What is a Site Reliability Engineer Manager?

A Site Reliability Engineer (SRE) Manager oversees a team of site reliability engineers tasked with maintaining the reliability, scalability, and performance of software systems. Their role combines leadership and technical expertise, focusing on automating operations, managing incidents, and ensuring high availability of services. They work closely with engineering and operations teams to implement best practices in monitoring, incident response, and system design. SRE Managers also mentor their teams, set reliability goals, and help drive a culture of continuous improvement within the organization.

What engineers make $500,000?

Senior-level Site Reliability Engineers (SREs) with extensive experience and advanced skills in cloud infrastructure, automation, and monitoring tools can earn $500,000 or more annually, especially in high-cost-of-living areas or at large tech companies. Reaching this level often requires specialized certifications, leadership responsibilities, and a strong track record of system reliability improvements.

How much do SRE managers make in the US?

Site Reliability Engineering (SRE) managers in the US typically earn between $130,000 and $180,000 annually, with senior roles and large tech companies offering higher compensation. Salaries can vary based on experience, location, and company size, and often include bonuses and stock options.

What is the role of a Site Reliability Engineer Manager?

A Site Reliability Engineer Manager oversees a team responsible for maintaining the availability, performance, and reliability of large-scale systems and services. They coordinate incident response, implement automation, and collaborate with development teams to improve system resilience, often using tools like monitoring and alerting platforms. Strong leadership, technical expertise, and understanding of cloud infrastructure are essential for this role.

What is the difference between a Site Reliability Engineer Manager and a Site Reliability Engineer?

Aspect | Site Reliability Engineer (SRE) | Site Reliability Engineer Manager
Responsibilities | Focuses on designing, implementing, and maintaining reliable systems and automation | Oversees SRE teams, manages projects, and aligns reliability goals with business objectives
Required Skills | Strong coding, system design, and troubleshooting skills | Leadership, team management, strategic planning
Certifications | Google Cloud, AWS certifications, Linux, scripting | Same as SRE, plus management certifications (e.g., PMP) often preferred
Work Environment | Technical, hands-on with systems and automation | Managerial, coordinating teams and projects

The main difference is that a Site Reliability Engineer focuses on technical system reliability, while a Site Reliability Engineer Manager oversees teams and strategic initiatives to ensure reliability goals are met across projects.

How does a Site Reliability Engineer Manager typically balance technical leadership with team management responsibilities?

A Site Reliability Engineer Manager often splits their time between overseeing technical projects, such as system reliability improvements and incident response strategies, and managing the growth and well-being of their engineering team. This includes mentoring SREs, facilitating communication between teams, setting priorities, and ensuring that operational goals align with business objectives. Balancing these responsibilities requires strong organizational skills and a proactive approach to both technical challenges and people management. Successful managers regularly engage in hands-on problem-solving while also fostering a collaborative team environment.

What are the key skills and qualifications needed to thrive as a Site Reliability Engineer Manager, and why are they important?

To thrive as a Site Reliability Engineer Manager, you need expertise in systems engineering, incident management, and a strong background in software development or computer science, often supported by a bachelor’s degree or equivalent experience. Familiarity with cloud platforms (like AWS, GCP, or Azure), infrastructure as code tools (such as Terraform), monitoring systems (like Prometheus), and certifications in cloud or DevOps practices are highly valued. Strong leadership, effective communication, and problem-solving abilities help you guide teams and foster collaboration across departments. These skills and qualities ensure the stability, scalability, and reliability of critical systems while enabling teams to respond effectively to complex technical challenges.
Infographic showing Site Reliability Engineer Manager job openings in Raleigh, NC as of June 2026, with employment types broken down into 75% Full Time and 25% Contract. It highlights an 87% In-person and 13% Remote job distribution, with an average salary of $128,882 per year, or $62 per hour.

Principal Site Reliability Engineer

Fidelity Investments

Durham, NC • On-site

$55 - $73.25/hr

Full-time

Posted 6 days ago


Fidelity Investments rating

Company rating: 8.7 out of 10

Based on 264 frontline employees who took The Breakroom Quiz

14th of 138 rated financial services


Job description

Position Description:
Combines Operational excellence with Development experience to deliver services at high scale, high availability with resilience. Builds reliability into the ecosystem by applying best practices in Resiliency Engineering, Automation, Observability and Chaos Testing. Streamlines and accelerates software delivery cycle by using DevOps practices and toolchain. Integrates Site Reliability Engineering (SRE) practices (Observability and Chaos) with DevOps processes and delivery pipelines to stop bad code from reaching production. Ensures business-critical enterprise systems are continuously available to internal and external customers. Implements technical standardization and process refinements within the engineering organization and for Site Reliability Engineers. Collaborates with production support teams to define and implement processes for the identification, collection, and analysis of incident data. Brings together technical, procedural, and financial data to reduce toil and increase efficiency.
Primary Responsibilities:
  • Develops Chaos Testing capabilities using multiple Chaos Tools (AWS Fault Injection Service (FIS), Chaos Mesh, and Chaosd) and Chaos Toolkit.
  • Develops and enhances organization's internal Chaos Framework to streamline Chaos Executions and reporting.
  • Provides specialized technical expertise in the adoption of Chaos Engineering by application teams.
  • Chaos tests and observes business-critical applications to understand the weaknesses and increase application resiliency.
  • Activates Observability for the critical applications with recommended Service Level Indicators and Service Level Objectives for Latency, Availability, Error Rate, etc.
  • Utilizes modern monitoring tools (Datadog, Splunk, Catchpoint, etc.) to reduce mean time to detect an issue and improve response times.
  • Creates CI/CD pipelines with security and quality checks with Application Lifecycle management toolchain. Helps in integrating Chaos and Observability with CI/CD pipelines.
  • Automates repetitive activities using scripting languages (Python, Groovy etc.).
  • Implements and supports solutions based on cloud platforms AWS/Azure and container orchestration Kubernetes.
  • Onboards and evaluates new cloud services that help enhance the resiliency of the cloud ecosystem. Serves as a liaison for vendor engagement.
  • Participates in incident management, problem management and incident postmortems.
  • Takes part in peer code reviews providing qualitative feedback.
  • Builds processes and capabilities to adapt and respond to risks and disruptions while maintaining business operations and data recovery with minimal disruption.
  • Coaches peer SREs and application teams on SRE and DevOps.
  • Implements Agile methodologies in the team's project completion using incremental and iterative steps.

Education and Experience:
Bachelor's degree in Computer Science, Engineering, Information Technology, Information Systems, or a closely related field (or foreign education equivalent) and five (5) years of experience as a Principal Site Reliability Engineer (or closely related occupation) implementing resilient container and cloud-based applications and infrastructure solutions, using DevOps or SRE practices, in a financial services environment.
Or, alternatively, Master's degree (or foreign education equivalent) in Computer Science, Engineering, Information Technology, Information Systems, or a closely related field (or foreign education equivalent) and three (3) years of experience as a Principal Site Reliability Engineer (or closely related occupation) implementing resilient container and cloud-based applications and infrastructure solutions, using DevOps or SRE practices, in a financial services environment.
Skills and Knowledge:
Candidate must also possess:
  • Demonstrated Expertise ("DE") improving application resiliency by implementing chaos engineering to build the system's capability to withstand turbulent conditions in production, using Chaos Mesh, Chaosd, Azure Chaos Studio, AWS FIS, or Gremlin; and driving automation to implement scalable approaches for the planning, design, execution, and reporting of chaos testing using Jenkins pipelines, standard frameworks, data visualization, and dashboards.
  • DE implementing advanced observability practices and techniques in production and pre-production environments, at scale using Datadog, Splunk, or Catchpoint; tracking the error budget, proactively identifying issues, minimizing Mean Time to Repair (MTTR); and balancing customer expectations by implementing Service-Level Indicators (SLIs) and Service-Level Objectives (SLOs) using logs, traces, monitors and synthetic tests.
  • DE migrating and maintaining cloud applications and creating cloud solutions using Amazon Web Services (AWS) or Azure cloud services; implementing infrastructure as code for cloud; onboarding new AWS or Azure services with required reviews and security controls in non-production and production environments; and researching the evolving cloud ecosystem to adopt machine learning based tools (AWS DevOps Guru) to boost AIOps abilities.
  • DE implementing CI/CD pipelines in both production and non-production environments using Application Lifecycle Management (ALM) tools (JIRA, GitHub, Jenkins, SonarQube, Artifactory, or uDeploy) to enable faster code delivery, enhanced software quality, reliability, and security; and developing products, and core and common capabilities for the organization to reduce toil and drive standardization, using containerization and orchestration technologies (Docker or Kubernetes), Infrastructure as Code (IaC) tools, scripting languages (Python or Groovy), and engineering best practices.
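
The observability requirement above mentions tracking an error budget against Service-Level Objectives. As a rough illustration of that idea (not Fidelity's tooling or any vendor's API), the sketch below computes availability and remaining error budget from request counts. The class name, fields, and the request-based SLI are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorBudget:
    """Request-based error budget for one SLO window (hypothetical model)."""

    slo_target: float     # e.g. 0.999 means 99.9% of requests should succeed
    total_requests: int   # requests observed in the window
    failed_requests: int  # requests that violated the SLI in the window

    def __post_init__(self) -> None:
        if not 0.0 < self.slo_target < 1.0:
            raise ValueError("slo_target must be strictly between 0 and 1")
        if self.total_requests <= 0:
            raise ValueError("total_requests must be positive")
        if not 0 <= self.failed_requests <= self.total_requests:
            raise ValueError("failed_requests must be within [0, total_requests]")

    @property
    def availability(self) -> float:
        """Measured SLI: fraction of requests that succeeded."""
        return 1.0 - self.failed_requests / self.total_requests

    @property
    def allowed_failures(self) -> float:
        """Number of failures the SLO tolerates in this window."""
        return (1.0 - self.slo_target) * self.total_requests

    @property
    def budget_remaining(self) -> float:
        """Fraction of the error budget left; negative means overspent."""
        return 1.0 - self.failed_requests / self.allowed_failures


if __name__ == "__main__":
    window = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=250)
    print(f"availability: {window.availability:.5f}")
    print(f"budget remaining: {window.budget_remaining:.0%}")
```

With a 99.9% target over one million requests, about 1,000 failures are tolerated, so 250 failures leave roughly three quarters of the budget. In practice the counts would come from a monitoring platform such as those named in the posting, and a team might gate risky releases or chaos experiments on the remaining budget.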

Certifications:
Category: Information Technology
Please be advised that Fidelity's business is governed by the provisions of the Securities Exchange Act of 1934, the Investment Advisers Act of 1940, the Investment Company Act of 1940, ERISA, numerous state laws governing securities, investment and retirement-related financial activities and the rules and regulations of numerous self-regulatory organizations, including FINRA, among others. Those laws and regulations may restrict Fidelity from hiring and/or associating with individuals with certain Criminal Histories.
