
Systems Reliability Engineer Jobs (NOW HIRING)

Site Reliability Engineer (SRE)

Atlanta, GA · On-site

$54.75 - $72.75/hr

Required: • Passionate about building reliable, scalable systems using modern, AI-enabled ... E principles in a production environment • Strong background in Linux, networking, and system ...

New

Systems Engineer - SRE Enablement

Memphis, TN · On-site

$55.50 - $73.75/hr

AutoZone's Site Reliability Engineering (SRE) team is seeking a Systems Engineer with a focus on SRE Enablement. This position is responsible for promoting reliability and operational excellence ...

Site Reliability Engineer

Manhattan, NY · On-site

$63 - $83.50/hr

Site Reliability Engineer Location: NYC, NY (Hybrid) Duration: Contract Candidates with strong ... Proficient in Python and scripting for automation and system management, with a proven track record ...

You'll architect the systems and strategies that allow SimSpace to deliver software seamlessly ... What will you be doing as a Staff SRE at SimSpace? * Technical Strategy & Architecture: Design and ...

Senior SRE Engineer

New York, NY · On-site

$62.25 - $82.75/hr

Lead and manage the SRE team to uphold system reliability, availability, and performance standards. * Design, implement, and optimize scalable infrastructure and automation solutions to support ...

Reliability Engineer

Gurabo, PR

$98.80K - $124.40K/yr

Acquires and analyzes data from connected systems and continually improves maintenance program ... Reliability Engineering * Equipment preventative maintenance (PM) task creation and management for ...

GCP Site Reliability Engineer Interview Mode: candidates local to Parsippany, NJ who can attend an ... System Reliability: Ensure the reliability and uptime of critical services and infrastructure.

Reliability Engineer

Paulsboro, NJ · On-site

$100.20K - $126.10K/yr

Title: Reliability Engineer - Bulk Manufacturing Location: Paulsboro, NJ Type: Direct Schedule ... Experience in bulk pharmaceutical manufacturing and related support systems * Strong preventive and ...

Reliability Engineer

Boston, MA · On-site

$111.40K - $140.10K/yr

Acquires and analyzes data from connected systems and continually improves maintenance program ... Reliability Engineering * Equipment preventative maintenance (PM) task creation and management for ...

Site Reliability Engineer (SRE)

Decatur, TX · On-site

$129K - $160K/yr

Join Our Team as a Site Reliability Engineer (SRE)! About Us At Energy Worldnet, Inc. (EWN), we ... If you're driven by operational excellence, system reliability, and continuous improvement, we'd ...


Systems Reliability Engineer information


$61K

$118K

$141K

How much do systems reliability engineer jobs pay per year?

As of May 30, 2026, the average yearly pay for a systems reliability engineer in the United States is $117,973, according to ZipRecruiter salary data. Most workers in this role earn between $102,500 and $129,000 per year, depending on experience, location, and employer.

What are the key skills and qualifications needed to thrive as a Systems Reliability Engineer, and why are they important?

To thrive as a Systems Reliability Engineer, you need expertise in infrastructure management, automation, and software engineering, often supported by a degree in computer science or a related field. Familiarity with tools like Kubernetes, Docker, CI/CD pipelines, monitoring systems (e.g., Prometheus, Grafana), and relevant cloud certifications (AWS, GCP, or Azure) is typically required. Strong problem-solving abilities, communication skills, and a proactive mindset help you prevent and resolve incidents efficiently. These skills ensure systems remain robust, scalable, and highly available, which is critical for maintaining business continuity and user trust.

How does a Systems Reliability Engineer typically collaborate with development and operations teams to ensure system stability?

Systems Reliability Engineers (SREs) work closely with both development and operations teams to bridge the gap between software engineering and IT operations. They participate in design reviews to ensure reliability is built into new features, coordinate with developers to automate deployments, and work with operations to monitor system health and respond to incidents. By fostering a culture of shared responsibility for uptime and performance, SREs help streamline troubleshooting and drive improvements across the organization. Regular communication and joint post-incident reviews are key practices in this collaborative environment.

What are Systems Reliability Engineers?

Systems Reliability Engineers (SREs) are IT professionals responsible for ensuring the reliability, availability, and performance of software systems and infrastructure. They combine software engineering and systems administration skills to automate processes, monitor system health, respond to incidents, and improve system resilience. SREs work closely with development and operations teams to optimize deployment pipelines, manage outages, and implement best practices for scalability and reliability. Their goal is to minimize downtime and ensure a seamless user experience.

What is the difference between a Systems Reliability Engineer and a DevOps Engineer?

Aspect | Systems Reliability Engineer | DevOps Engineer
Primary Focus | Ensuring system reliability, availability, and performance | Automating deployment, integration, and continuous delivery
Skills & Certifications | SRE certifications, Linux, scripting, monitoring tools | CI/CD tools, cloud platforms, scripting, automation
Work Environment | Operations, infrastructure, and reliability teams | Development and operations collaboration
Industry Usage | Tech, finance, e-commerce | Tech, startups, cloud services

While both roles focus on improving system performance, Systems Reliability Engineers primarily concentrate on maintaining system uptime and reliability, whereas DevOps Engineers focus on streamlining development and deployment processes. Both roles often collaborate but serve different core functions within an organization.

More about Systems Reliability Engineer jobs
Infographic showing various Systems Reliability Engineer job openings in the United States as of May 2026, with employment types broken down into 79% Full Time, 16% Part Time, and 5% Contract. Highlights an 80% Physical, 12% Hybrid, and 8% Remote job distribution, with an average salary of $117,973 per year, or $56.7 per hour.

$54.75 - $72.75/hr

Other

Posted yesterday


Job description

What You'll Bring to the Team:

We are seeking a Site Reliability Engineer (SRE) to join one of our Scrum teams and help ensure the reliability, scalability, and performance of the Florence platform. AI-driven tooling and automation are a cornerstone of how we build, operate, and scale our systems.

In this role, you will work closely with product engineers while actively leveraging AI to improve observability, incident response, automation, and overall platform reliability. Coding assignments in this role will require working with AI-assisted development workflows as a core part of how solutions are designed and delivered.

You Will:
  • Be an embedded member of a Scrum team, participating in planning, refinement, reviews, and retrospectives
  • Use AI-powered tools to enhance system reliability, operational efficiency, and developer productivity
  • Design, build, and operate reliable, scalable cloud infrastructure supporting platform and product services
  • Apply AI-assisted analysis to monitoring, alerting, and observability data to detect, predict, and prevent incidents
  • Define and maintain SLOs, SLIs, and error budgets to guide reliability decisions
  • Collaborate with software engineers to embed reliability and AI-driven automation into the software development lifecycle
  • Lead and participate in incident response, root cause analysis, and postmortems, leveraging AI insights where appropriate
  • Automate operational tasks and reduce toil through AI-enabled and traditional automation approaches
  • Contribute to disaster recovery planning, testing, and operational readiness
  • Produce and maintain documentation such as runbooks, operational guides, and system diagrams
  • Contribute code as a secondary responsibility, with coding assignments focused on building reliability tooling, automation, and integrations using AI-assisted development practices
An Ideal Candidate Is / Has:
  • Passionate about building reliable, scalable systems using modern, AI-enabled approaches
  • Strong understanding of cloud-native and distributed system architectures
  • Experience applying SRE principles in a production environment
  • Hands-on experience with cloud platforms (AWS preferred)
  • Experience using AI-assisted tools for coding, debugging, automation, or operational analysis
  • Strong background in Linux, networking, and system operations
  • Experience with infrastructure-as-code and automation tools (e.g., Terraform, CI/CD pipelines)
  • Familiarity with modern observability practices (metrics, logs, tracing), including AI-enhanced analysis
  • Comfortable working as part of an agile, cross-functional Scrum team
  • Strong problem-solving, communication, and collaboration skills
  • 4+ years of experience in SRE, DevOps, or similar roles
  • Experience supporting production systems at scale