
Remote Reliability Engineer Jobs in Chicago, IL (NOW HIRING)

Senior Site Reliability Engineer

Chicago, IL · On-site +1

$58.75 - $78/hr

Open to remote work in the US. The preferred work location is Chicago, IL. What You Might Work On As a Senior SRE, you may be responsible for a subset of the following, depending on team placement ...

Site Reliability Engineer

Chicago, IL · On-site +1

$100K - $120K/yr

Overview The Site Reliability Engineer is a key force behind improving Origami's time to resolution ... All full-time positions are hybrid, with many eligible to be completely remote * Fully Paid by ...

Principal AI Engineer - Agent Ops / SRE

Chicago, IL · On-site +1

$58.75 - $78/hr

The Hartford's applied AI COE Team is seeking a Principal AI Engineer - Agent Ops/SRE. The AI-COE ... This role can have a Hybrid or Remote work arrangement. Candidates who live near one of our office ...

Senior Site Reliability Engineer

Mundelein, IL · Remote

$58.25 - $77.50/hr

We're hiring a Senior Site Reliability Engineer to help build and operate that infrastructure. This ... This role is remote-friendly across Canada and the US Pacific Northwest. We may use artificial ...

Site Reliability Engineering Manager II

Chicago, IL · On-site +1

$58.75 - $78/hr

At Flywire, the SRE team is responsible for the lifecycle of production systems. Our team is ... Our engineering team is distributed across 3 continents and 4 different countries so remote work is ...

Site Reliability Engineer IRC294570

Chicago, IL · On-site +1

$130K - $140K/yr

Develop self-healing capabilities and platform automation using Logic Apps and Python. Requirements: 7+ years of experience in SRE, platform engineering, or cloud infrastructure engineering in ...

Position is remote for candidates located within the NYC, DC-Baltimore, and Minneapolis metro areas. Essential Job Duties Platform Reliability (SRE) * Own availability, latency, and performance ...

New


Remote Reliability Engineer information

Chicago, IL salary range (chart values): $62.8K, $121.5K, $145.3K

How much do remote reliability engineer jobs pay per year?

As of Jun 17, 2026, the average yearly pay for a remote reliability engineer in Chicago, IL is $121,529.00, according to ZipRecruiter salary data. Most workers in this role earn between $105,600.00 and $132,900.00 per year, depending on experience, location, and employer.

What is the difference between Remote Reliability Engineer vs Remote Site Reliability Engineer?

Aspect | Remote Reliability Engineer | Remote Site Reliability Engineer
Credentials | Typically requires certifications like AWS Certified Solutions Architect, Linux Foundation certifications | Similar credentials, often with additional focus on site-specific tools and monitoring
Work Environment | Primarily remote, focusing on cloud infrastructure and system reliability | Remote with some on-site responsibilities, focusing on infrastructure and operational stability
Industry Usage | Used across tech, cloud providers, SaaS companies | Common in data centers, cloud providers, and large enterprise IT
Search & Comparison Intent | Often compared due to overlapping roles in system reliability and cloud infrastructure | Compared for on-site vs remote operational responsibilities

The main difference is that Remote Reliability Engineers focus on cloud and system reliability remotely, while Remote Site Reliability Engineers may have some on-site duties related to infrastructure. Both roles require similar skills and certifications but differ in their work environment and specific responsibilities.

What are the key skills and qualifications needed to thrive as a Remote Reliability Engineer, and why are they important?

To thrive as a Remote Reliability Engineer, you need a strong background in systems engineering, software development, and infrastructure management, often supported by a degree in computer science or a related field. Proficiency with cloud platforms (such as AWS, Azure, or GCP), monitoring tools (like Prometheus, Grafana), and relevant certifications (e.g., AWS Certified DevOps Engineer) is highly valuable. Excellent problem-solving, communication, and collaboration skills are crucial for working effectively across distributed teams and responding to incidents. These abilities ensure system reliability, quick incident resolution, and seamless remote teamwork, which are vital for maintaining high service uptime and user satisfaction.

How do Remote Reliability Engineers typically collaborate with on-site teams to address urgent technical issues?

Remote Reliability Engineers often utilize a combination of video conferencing, instant messaging, and collaborative monitoring tools to stay closely connected with on-site teams. When urgent technical issues arise, they participate in real-time troubleshooting sessions, analyze system logs remotely, and may guide on-site staff through step-by-step resolution procedures. Building strong communication channels and regular check-ins are essential to ensure swift and effective collaboration, even across different time zones. This structure allows Remote Reliability Engineers to contribute significantly to system uptime while working from a distance.

What is a Remote Reliability Engineer?

A Remote Reliability Engineer is a professional who works from a remote location to ensure that systems, applications, or infrastructure are reliable, available, and performing well. Their responsibilities typically include monitoring system health, diagnosing issues, implementing preventative measures, and collaborating with teams to improve system reliability. They often use tools for automation, incident response, and performance monitoring, all while working offsite. This role is critical in minimizing downtime and ensuring a smooth user experience, especially for companies with complex technical environments. Remote Reliability Engineers must have strong problem-solving skills and be proficient in cloud technologies, automation, and incident management.

Senior Site Reliability Engineer

SDI International

Chicago, IL • On-site, Remote

$58.75 - $78/hr

Other

Posted 17 days ago


Job description

No H1 or C2C. Must be Permanent Resident or US Citizen


Senior Site Reliability Engineer


Description and Requirements

About Our Team

We are building Quantum, a next-generation hybrid AI platform that spans Windows, Android, and cloud. As part of this vision, we are expanding the reliability engineering organization that powers cross-device Personal AI.

We are looking for Senior Site Reliability Engineers (SREs) to help us build and evolve the foundational reliability, observability, and operations capabilities that ensure fast, safe, and dependable for millions of users.

This role may support one of several teams within the SRE organization (e.g., Observability, Operations, or Service Reliability), depending on your strengths and interests.

We operate with the speed, ownership, and creative latitude of a startup, yet are supported by scale, resources, and technical depth. We are building new systems, new tooling, and new operational models from the ground up, and we are doing so with clarity, intention, and high engineering standards.




Location: Open to remote work in the US. The preferred work location is Chicago, IL.


What You Might Work On

As a Senior SRE, you may be responsible for a subset of the following, depending on team placement and skill alignment:

Reliability & Performance Engineering

  • Improving the availability, scalability, and performance of distributed systems across device, edge, and cloud.
  • Defining or refining SLIs, SLOs, and error budgets for critical services.
  • Leading initiatives to remove single points of failure, improve resilience, and reduce operational risk.

Operational Excellence

  • Participating in on-call rotations and contributing to incident response, triage, and post-incident reviews.
  • Developing automation, runbooks, and self-healing systems to reduce alert noise and MTTR.
  • Enhancing operational readiness and supporting incident prevention programs.

Observability & Insight

  • Designing or improving observability systems using OpenTelemetry, Grafana, and modern signal pipelines.
  • Building dashboards, analytics, and alerting that illuminate system health and AI service behavior.
  • Ensuring telemetry is reliable, actionable, and tied to real-world outcomes.

Deployments & Change Safety

  • Improving reliability of CI/CD workflows, including phased rollouts, canaries, shadow testing, and safe rollback mechanisms.
  • Contributing to the evolution of deployment tooling for device+edge+cloud hybrid systems.

Systems Design & Collaboration

  • Influencing architectural decisions by injecting reliability, observability, and operational considerations early in design.
  • Collaborating with AI/ML engineers, platform engineers, firmware teams, and product partners to deliver robust, dependable user experiences.


Basic Qualifications

  • 10+ years of experience in Site Reliability Engineering, Production Engineering, DevOps, or large-scale distributed systems operations
  • Bachelor's Degree in Computer Science, Engineering, or a related technical discipline
  • Strong experience running production distributed systems at scale
  • Proficiency in at least one modern programming language (e.g., Python, Go, Java, C++)
  • Strong understanding of Linux systems, networking fundamentals, and system performance tuning
  • Experience with monitoring/observability (metrics, logs, tracing)
  • Hands-on experience with cloud environments (Azure, AWS, or GCP)
  • Experience in incident management, on-call rotations, and postmortem processes


Preferred Qualifications

  • Deep experience with Azure cloud services
  • Experience with OpenTelemetry for end-to-end instrumentation
  • Strong familiarity with Grafana, Prometheus, Loki, Tempo, or similar tools
  • Experience supporting AI/ML systems, model serving, or data-intensive workloads
  • Background with hybrid architectures (device + edge + cloud)
  • Experience improving deployment reliability and progressive delivery systems
  • Passion for automation, reliability engineering, and reducing operational friction


What Success Looks Like

  • Systems become more observable, reliable, and predictable.
  • Incidents are resolved quickly, and follow-up improvements prevent recurrence.
  • Alerting becomes more accurate, actionable, and trusted.
  • Deployments become safer and more consistent.
  • Teams move faster because reliability foundations are strong and intuitive.