Service Reliability Engineer Jobs in Washington (NOW HIRING)

Associate Site Reliability Engineer

Millersville, MD · On-site +1

$55.50 - $73.50/hr

Participate in incident response activities, service restoration efforts, and post-incident ... engineering, observability, automation, and reliability practices through hands-on work and ...

Respond to and resolve system outages, impairments, and service disruptions while coordinating with ... Expert knowledge of site reliability engineering practices, system monitoring, incident management ...

Senior Site Reliability Engineer

Millersville, MD · On-site +1

$55.50 - $73.50/hr

Define, track, and report service level indicators (SLIs), service level objectives (SLOs), and error budgets to guide engineering decisions and service improvements. Automation, CI/CD ...

... Services contract. Responsibilities : • Monitor and maintain system reliability, availability ... Required : • Expert knowledge of site reliability engineering practices, system monitoring ...

Ardent is seeking a Reliability Engineer to join our team. This is an onsite role in Ashburn, VA ... Hands-on experience with Amazon Web Services (AWS) and cloud-based monitoring tools. * Ability to ...

Site Reliability Engineer (SRE)

Vienna, VA · On-site

$57.25 - $76/hr

The AWS Site Reliability Engineer (SRE) is responsible for the operational health, availability ... You will define and track Service Level Objectives (SLOs) to balance reliability with innovation as ...

Site Reliability Engineer

Washington, DC · On-site

$64.25 - $85.50/hr

... Service Level Objectives (SLOs) and Service Level Agreements (SLAs). Qualifications : Required : • Bachelor's degree in Computer Science, Engineering, or a related technical discipline. • 5 or ...

SRE Engineer

Arlington, VA · On-site

$65.75 - $87.25/hr

Responsibilities : • Define, implement, and maintain site reliability engineering practices for mission-critical applications and shared services, with emphasis on uptime, resiliency ...

Site Reliability Engineer (SRE) (TS)

Washington, DC · On-site

$64.50 - $85.75/hr

Koniag Management Solutions, LLC (KMS), a Koniag Government Services (KGS) company, is hiring a Site Reliability Engineer (SRE). Position requires an active Top Secret/SCI clearance with ability to ...

Service Reliability Engineer information

Washington salary chart values: $69.1K, $133.6K, $159.7K

How much do service reliability engineer jobs pay per year?

As of Jun 21, 2026, the average yearly pay for a service reliability engineer in Washington is $133,616.00, according to ZipRecruiter salary data. Most workers in this role earn between $116,100.00 and $146,100.00 per year, depending on experience, location, and employer.

What are Service Reliability Engineers?

Service Reliability Engineers (SREs) are IT professionals who apply software engineering principles to infrastructure and operations problems. Their main goal is to ensure that services are reliable, scalable, and highly available by automating processes, monitoring system performance, and responding to incidents. SREs work closely with development and operations teams to design, build, and maintain robust systems, often using code to manage infrastructure. They also focus on improving system reliability through monitoring, incident response, and post-incident analysis.

How does a Service Reliability Engineer typically collaborate with development and operations teams to improve service uptime?

Service Reliability Engineers (SREs) work closely with both development and operations teams to ensure systems are highly available and resilient. They often participate in incident response, conduct post-incident reviews, and help implement automation to reduce manual intervention. Regular collaboration includes reviewing application changes, contributing to infrastructure design, and sharing best practices for monitoring and alerting. This cross-functional teamwork helps to quickly identify potential issues and proactively enhance system reliability.

Will AI replace SRE jobs?

AI is expected to augment the work of Service Reliability Engineers (SREs) by automating routine tasks such as monitoring, incident response, and data analysis. However, SREs will continue to be essential for designing, managing, and improving complex systems that require human judgment and expertise. The role is likely to evolve with increased use of AI tools rather than be fully replaced.

What engineers make $500,000?

Senior engineers in fields such as software, data engineering, and cloud infrastructure can earn $500,000 or more annually, especially with experience, specialized skills, and stock options. Roles in high-demand industries like technology and finance often offer compensation at this level for top-tier professionals.

What are the key skills and qualifications needed to thrive as a Service Reliability Engineer, and why are they important?

To thrive as a Service Reliability Engineer, you need a solid background in systems administration, networking, coding (often in Python or Go), and experience with cloud infrastructure, typically supported by a degree in computer science or a related field. Familiarity with monitoring tools (like Prometheus), CI/CD pipelines, automation frameworks, and certifications such as AWS Certified DevOps Engineer are highly valued. Strong problem-solving abilities, collaboration, and effective communication skills help you proactively address issues and work well within cross-functional teams. These skills ensure system reliability, quick incident recovery, and the seamless delivery of high-availability services.

What is the difference between a Service Reliability Engineer and a Site Reliability Engineer?

Aspect | Service Reliability Engineer | Site Reliability Engineer
Credentials | Typically requires experience in software engineering, cloud platforms, and monitoring tools | Similar credentials, often with a focus on software development and systems engineering
Work Environment | Works closely with development and operations teams to ensure service reliability | Works on maintaining and improving system reliability, often in cloud or data center environments
Industry Usage | Common in tech companies focusing on service uptime and customer experience | Widely used in tech, especially in cloud and large-scale infrastructure companies

Both roles focus on ensuring system reliability, often requiring similar skills and certifications. The main difference lies in terminology preference and specific organizational focus, but they generally perform comparable functions in maintaining high service availability.

What engineers make $300,000 a year?

Senior-level engineers in fields such as software engineering, data engineering, and site reliability engineering can earn $300,000 or more annually, especially with extensive experience, specialized skills, and working in high-demand industries or companies. Compensation often includes base salary, bonuses, and stock options, particularly in technology firms or startups with significant growth potential.

What does a service reliability engineer do?

A Service Reliability Engineer (SRE) is responsible for ensuring the availability, performance, and reliability of software services. They monitor systems, automate incident response, implement best practices for system stability, and often use tools like monitoring dashboards and automation scripts to prevent outages and improve service quality.
Infographic: Service Reliability Engineer job openings in Washington as of June 2026. Employment type: 100% full-time. Work arrangement: 100% in-person. Average salary: $133,616 per year, or $64.2 per hour.

$64.25 - $85.50/hr

Other

Posted 12 days ago


Job description

Job Title: Site Reliability Engineer (SRE)

Location: Washington, DC (Onsite)

Clearance: TS/SCI

Position Overview

Seeking a highly motivated Site Reliability Engineer (SRE) to support mission-critical enterprise applications and infrastructure in a high-availability environment. The SRE will be responsible for ensuring system reliability, performance, scalability, and operational efficiency through proactive monitoring, automation, and rapid incident response.

This role bridges development and operations, partnering closely with engineering teams to ensure new capabilities are delivered without compromising production stability. The ideal candidate brings strong Linux expertise, automation skills, and hands-on experience with cloud-native and containerized environments.

Key Responsibilities

Monitoring & Performance

· Monitor system health, availability, and performance using enterprise observability tools

· Analyze metrics and logs to proactively detect and remediate issues

· Tune alerting to reduce noise and prioritize mission impact

Incident Management & Reliability

· Respond to and resolve production incidents across distributed environments

· Perform root cause analysis and lead post-incident reviews

· Implement corrective and preventive actions to improve resilience

· Participate in on-call rotation for outages, upgrades, and urgent activities

Automation & DevOps Enablement

· Automate repetitive operational tasks to improve efficiency and reduce human error

· Support CI/CD pipelines and automated deployment workflows

· Develop scripts and tooling to improve reliability and repeatability

Platform & Infrastructure Support

· Maintain Linux/Unix systems and containerized workloads

· Support Kubernetes/Docker environments and microservices architectures

· Assist with configuration management and environment standardization

· Ensure secure and compliant system configurations

Collaboration & Continuous Improvement

· Partner with development teams to improve service reliability and performance

· Support backlog refinement and reliability engineering initiatives

· Document runbooks, procedures, and knowledge articles

· Contribute to continuous service improvement efforts

Required Qualifications

Education & Experience

· Bachelor’s degree in Computer Science, Engineering, or related technical field

· Minimum 5 years of relevant technical experience

· At least 3 years of systems programming or SRE/DevOps experience

Technical Skills

· Strong proficiency in Python, Bash, or similar scripting languages

· Hands-on experience with Linux/Unix administration

· Experience with Kubernetes and Docker

· Familiarity with cloud platforms (AWS, Azure, or Google Cloud)

· Experience with monitoring and logging tools (e.g., Grafana, Kibana, Prometheus, ELK)

· Working knowledge of CI/CD tools (e.g., GitLab, Jenkins, ArgoCD)

· Understanding of microservices architecture and DevOps practices

· Experience with Git-based workflows

Infrastructure & Networking

· Knowledge of networking fundamentals, load balancers, and firewalls

· Experience with identity and access management (IAM, SSH, VPN, security groups)

· Experience deploying to on-premises or data center environments

Professional Skills

· Strong analytical and troubleshooting abilities

· Excellent time management and ability to work independently

· Effective written and verbal communication skills

· Experience using Jira and Confluence in an Agile environment

Preferred Qualifications

· Experience defining or working with SLIs, SLOs, and error budgets

· Familiarity with Helm and Kubernetes deployment pipelines

· Experience supporting high-availability or mission-critical systems

· Knowledge of security best practices and compliance frameworks
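
For candidates less familiar with the SLI, SLO, and error budget terms listed under Preferred Qualifications, the relationship between them can be sketched in a few lines of Python. This is only an illustration and not part of the posting: the `ErrorBudget` class, its field names, and the 99.9% target are all hypothetical, and it assumes a simple request-based SLI (successful requests divided by total requests).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorBudget:
    """Request-based error budget for one SLO window (illustrative only)."""

    slo_target: float     # assumed target, e.g. 0.999 = 99.9% of requests succeed
    total_requests: int   # requests observed in the window
    failed_requests: int  # requests that did not meet the SLI criteria

    @property
    def sli(self) -> float:
        """Observed success ratio; treated as 1.0 when there was no traffic."""
        if self.total_requests == 0:
            return 1.0
        return 1 - self.failed_requests / self.total_requests

    @property
    def allowed_failures(self) -> float:
        """Number of failures the SLO tolerates for this much traffic."""
        return (1 - self.slo_target) * self.total_requests

    @property
    def remaining_fraction(self) -> float:
        """Share of the budget left; negative once the budget is overspent."""
        if self.allowed_failures == 0:
            # No budget exists (100% target or no traffic), so none remains.
            return 0.0
        return 1 - self.failed_requests / self.allowed_failures


if __name__ == "__main__":
    # Hypothetical window: one million requests, 250 of them failed.
    budget = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=250)
    print(f"SLI: {budget.sli:.5f}")
    print(f"Allowed failures: {budget.allowed_failures:.0f}")
    print(f"Budget remaining: {budget.remaining_fraction:.0%}")
```

In this sketch, a remaining fraction near zero or below it is the signal a team might use to slow feature releases in favor of reliability work. Real setups usually compute these values from monitoring data over a rolling window.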