Systems Reliability Engineer Jobs (NOW HIRING)

Evaluation Reliability SRE

Cupertino, CA · On-site

$70.25 - $93.50/hr

Within ERE, Core SRE owns the production backbone: resource management, session orchestration ... We sit at the intersection of distributed systems, ML evaluation infrastructure, and operational ...

Eng II, Reliability Eng

Roanoke Rapids, NC · On-site

$91K - $115K/yr

Experience using Computerized Maintenance Management Systems (CMMS) * Proficient in Microsoft Word, Excel, and PowerPoint Keywords: reliability engineer, mechanical reliability, senior reliability ...

Applies SRE principles and practices, including monitoring, alerting, incident management, and root cause analysis, to improve system reliability and reduce operational risk. Supports the definition ...

SRE

Charlotte, NC · On-site

$55.75 - $74/hr

Role: SRE Location: Charlotte, NC Skills: Grafana, Python, Splunk, Linux, Scripting. Microsoft 360 ... This role will focus on ensuring the stability, performance, and efficiency of the systems while ...

Reliability Engineer

Deer Park, TX

$91K - $115K/yr

Ensure accurate documentation of maintenance and inspection data within CMMS systems to enable data ... Certified Reliability Engineer (CRE) Work Environment At Lubrizol, we are committed to fostering a ...

Reliability Engineer

Deer Park, TX · On-site

$91K - $115K/yr

Ensure accurate documentation of maintenance and inspection data within CMMS systems to enable data ... Certified Reliability Engineer (CRE) Work Environment At Lubrizol, we are committed to fostering a ...

Reliability Engineer

Vandalia, OH · On-site

$112K - $149K/yr

The Reliability Engineer is responsible for driving product and system reliability across the lifecycle of Unison hardware and services. In this role, you will apply reliability engineering tools and ...

Lead all reliability activities of government mobility products, antenna systems, and crypto products. * Build and maintain a strong working relationship with Viasat Government business, engineering ...

Reliability Engineer

Liverpool, NY · On-site

$98K - $123K/yr

WHO WE ARE At Lockheed Martin Rotary and Mission Systems, we are driven by innovation and integrity ... WHAT WE'RE DOING An Engineer working in the Reliability and Maintainability Engineering group will ...

Systems Engineer - SRE Enablement

Memphis, TN · On-site

$55.50 - $73.75/hr

AutoZone's Site Reliability Engineering (SRE) team is seeking a Systems Engineer with a focus on SRE Enablement. This position is responsible for promoting reliability and operational excellence ...

SRE Engineer

Arlington, VA · On-site

$65.75 - $87.25/hr

The SRE Engineer will improve the reliability, availability, performance, and operational resilience of mission-critical systems for a federal enterprise program. Responsibilities : • Define ...

Site Reliability Engineer

Irondale, AL · On-site

$48.25 - $64/hr

Site Reliability Engineer (SRE) Hybrid Opportunity | Enterprise Cloud ... This is a high-impact role focused on improving system reliability, scalability, automation, and ...

Systems Reliability Engineer information

Salary figures shown: $61K, $118K, $141K

How much do systems reliability engineer jobs pay per year?

As of Jun 21, 2026, the average yearly pay for a systems reliability engineer in the United States is $117,973.00, according to ZipRecruiter salary data. Most workers in this role earn between $102,500.00 and $129,000.00 per year, depending on experience, location, and employer.
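The listings on this page mix hourly and yearly pay, and the page's infographic quotes the same average as both $117,973 per year and $56.7 per hour. The sketch below shows one way to move between the two. It assumes a 2,080-hour work year (40 hours a week for 52 weeks), which is a common convention and not something the page states; the function names are illustrative.

```python
# Assumption: a full-time year is 40 hours/week x 52 weeks = 2,080 hours.
HOURS_PER_YEAR = 40 * 52


def annual_to_hourly(annual_pay: float) -> float:
    """Convert yearly pay to an hourly rate under the 2,080-hour assumption."""
    return annual_pay / HOURS_PER_YEAR


def hourly_to_annual(hourly_rate: float) -> float:
    """Convert an hourly rate to yearly pay under the 2,080-hour assumption."""
    return hourly_rate * HOURS_PER_YEAR


# The average yearly pay quoted on this page.
print(f"${annual_to_hourly(117_973):.2f} per hour")  # $56.72 per hour
```

Under that assumption the result rounds to the $56.7 per hour shown in the infographic. Contract roles rarely bill a full 2,080 hours, so annualized hourly listings are best read as an upper estimate.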

What are the key skills and qualifications needed to thrive as a Systems Reliability Engineer, and why are they important?

To thrive as a Systems Reliability Engineer, you need expertise in infrastructure management, automation, and software engineering, often supported by a degree in computer science or a related field. Familiarity with tools like Kubernetes, Docker, CI/CD pipelines, monitoring systems (e.g., Prometheus, Grafana), and relevant cloud certifications (AWS, GCP, or Azure) is typically required. Strong problem-solving abilities, communication skills, and a proactive mindset help you prevent and resolve incidents efficiently. These skills ensure systems remain robust, scalable, and highly available, which is critical for maintaining business continuity and user trust.

What are Systems Reliability Engineers?

Systems Reliability Engineers (SREs) are IT professionals responsible for ensuring the reliability, availability, and performance of software systems and infrastructure. They combine software engineering and systems administration skills to automate processes, monitor system health, respond to incidents, and improve system resilience. SREs work closely with development and operations teams to optimize deployment pipelines, manage outages, and implement best practices for scalability and reliability. Their goal is to minimize downtime and ensure a seamless user experience.
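To make "minimize downtime" concrete, reliability teams often translate an availability target into the amount of downtime it allows over a fixed window. The sketch below is a minimal illustration of that arithmetic. The 99.9% target and the 30-day window are assumed example values, not figures from any listing on this page, and `allowed_downtime` is a hypothetical helper name.

```python
from datetime import timedelta


def allowed_downtime(availability_target: float, window: timedelta) -> timedelta:
    """Return the downtime permitted in `window` at the given availability target."""
    if not 0.0 < availability_target <= 1.0:
        raise ValueError("availability_target must be in (0, 1]")
    # The unavailable fraction of the window is the downtime allowance.
    return window * (1.0 - availability_target)


# Illustrative values only: a 99.9% target measured over 30 days.
budget = allowed_downtime(0.999, timedelta(days=30))
print(f"{budget.total_seconds() / 60:.1f} minutes")  # 43.2 minutes
```

Each extra "nine" in the target cuts the allowance by a factor of ten, which is why tighter targets tend to require automated failover and recovery instead of manual response.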

How does a Systems Reliability Engineer typically collaborate with development and operations teams to ensure system stability?

Systems Reliability Engineers (SREs) work closely with both development and operations teams to bridge the gap between software engineering and IT operations. They participate in design reviews to ensure reliability is built into new features, coordinate with developers to automate deployments, and work with operations to monitor system health and respond to incidents. By fostering a culture of shared responsibility for uptime and performance, SREs help streamline troubleshooting and drive improvements across the organization. Regular communication and joint post-incident reviews are key practices in this collaborative environment.

What is the difference between a Systems Reliability Engineer and a DevOps Engineer?

Aspect | Systems Reliability Engineer | DevOps Engineer
Primary Focus | Ensuring system reliability, availability, and performance | Automating deployment, integration, and continuous delivery
Skills & Certifications | SRE certifications, Linux, scripting, monitoring tools | CI/CD tools, cloud platforms, scripting, automation
Work Environment | Operations, infrastructure, and reliability teams | Development and operations collaboration
Industry Usage | Tech, finance, e-commerce | Tech, startups, cloud services

While both roles focus on improving system performance, Systems Reliability Engineers primarily concentrate on maintaining system uptime and reliability, whereas DevOps Engineers focus on streamlining development and deployment processes. Both roles often collaborate but serve different core functions within an organization.

Infographic showing various Systems Reliability Engineer job openings in the United States as of June 2026, with employment types broken down into 3% As Needed, 50% Full Time, 31% Part Time, 15% Contract, and 1% Nights. Highlights an 87% Physical, 5% Hybrid, and 8% Remote job distribution, with an average salary of $117,973 per year, or $56.7 per hour.
Member of Technical Staff, AI Reliability & Monitoring Engineering Lead

Postman

San Francisco, CA • On-site

$256K - $276K/yr

Other

Posted 14 days ago


Job description

The Opportunity

Postman is seeking an experienced AI Systems Reliability Engineer to help define, build, and maintain the infrastructure and processes that ensure the reliability, scalability, and performance of Postman's AI-powered API and agentic systems in production. This role focuses on monitoring, availability, incident response, and automation to support AI services and tools trusted by millions of developers globally.

What You'll Do
  • Develop and manage reliability metrics (SLOs) for AI-driven API services and agentic AI platform features

  • Implement comprehensive observability and monitoring systems for real-time performance and fault detection

  • Design and drive automated failover, recovery, and incident response strategies for high-availability AI infrastructure

  • Optimize resource utilization, particularly GPU/accelerator efficiency, ensuring cost-effective AI system operation

  • Collaborate closely with engineering, platform, and product teams to align reliability efforts with broader organizational goals

  • Lead efforts to build internal tooling and automation focused on AI system stability and operational excellence

  • Drive continuous improvement in deployment practices, monitoring approaches, and incident management processes

About You
  • Have a strong background in AI reliability engineering, SRE, or DevOps for distributed systems

  • Understand the unique challenges of maintaining large-scale AI systems and integrating AI-specific metrics into reliability frameworks

  • Are experienced with cloud platforms, monitoring tools, and incident response automation

  • Are comfortable collaborating across teams to influence best practices for AI system reliability and operational health

  • Thrive in dynamic, fast-paced environments focusing on delivering reliable, safe AI-powered services

Bonus Skills and Experiences

  • Hands-on experience with AI/ML infrastructure, including GPU/xPU optimization and scaling

  • Familiarity with API platform operations and large-scale distributed services

  • Prior experience building or operating observability tools tailored for AI and agentic systems

  • Contribution to open-source projects or reliability engineering thought leadership

The reasonably estimated base salary for this role ranges from $256,000 to $276,000, plus a competitive equity package. Actual compensation is based on the candidate's skills, qualifications, and experience.