Remote Reliability Engineer Jobs in Virginia (NOW HIRING)

Site Reliability Engineer

McLean, VA · Remote

$57.50 - $76.50/hr

Site Reliability Engineer Job number: 880 This is a remote position. Ad Hoc is a technology company that empowers organizations to deliver scalable, impactful digital services. Using modern, agile ...

Senior Site Reliability Engineer

McLean, VA · Remote

$57.50 - $76.50/hr

Senior Site Reliability Engineer Job number: 884 This is a remote position. Ad Hoc is a technology company that empowers organizations to deliver scalable, impactful digital services. Using modern ...

SRE - Linux

Reston, VA · On-site +1

$135K - $183K/yr

The SRE is part of a highly skilled engineering and infrastructure team responsible for the design, administration, security, and operation of Verisign's application delivery and application security ...

DevOps/MLOps Engineer

Ashburn, VA · On-site +1

$54 - $74/hr

Remote Work: Niyam understands the value of flexibility. We offer remote work. * Career Growth ... Monitor system performance, availability, and reliability using centralized logging, metrics, and ...

Remote Reliability Engineer information

What is the difference between Remote Reliability Engineer vs Remote Site Reliability Engineer?

Aspect | Remote Reliability Engineer | Remote Site Reliability Engineer
Credentials | Typically requires certifications like AWS Certified Solutions Architect, Linux Foundation certifications | Similar credentials, often with additional focus on site-specific tools and monitoring
Work Environment | Primarily remote, focusing on cloud infrastructure and system reliability | Remote with some on-site responsibilities, focusing on infrastructure and operational stability
Industry Usage | Used across tech, cloud providers, SaaS companies | Common in data centers, cloud providers, and large enterprise IT
Search & Comparison Intent | Often compared due to overlapping roles in system reliability and cloud infrastructure | Compared for on-site vs remote operational responsibilities

The main difference is that Remote Reliability Engineers focus on cloud and system reliability remotely, while Remote Site Reliability Engineers may have some on-site duties related to infrastructure. Both roles require similar skills and certifications but differ in their work environment and specific responsibilities.

What are the key skills and qualifications needed to thrive as a Remote Reliability Engineer, and why are they important?

To thrive as a Remote Reliability Engineer, you need a strong background in systems engineering, software development, and infrastructure management, often supported by a degree in computer science or a related field. Proficiency with cloud platforms (such as AWS, Azure, or GCP), monitoring tools (like Prometheus, Grafana), and relevant certifications (e.g., AWS Certified DevOps Engineer) is highly valuable. Excellent problem-solving, communication, and collaboration skills are crucial for working effectively across distributed teams and responding to incidents. These abilities ensure system reliability, quick incident resolution, and seamless remote teamwork, which are vital for maintaining high service uptime and user satisfaction.

How do Remote Reliability Engineers typically collaborate with on-site teams to address urgent technical issues?

Remote Reliability Engineers often use a combination of video conferencing, instant messaging, and collaborative monitoring tools to stay closely connected with on-site teams. When urgent technical issues arise, they participate in real-time troubleshooting sessions, analyze system logs remotely, and may guide on-site staff through step-by-step resolution procedures. Strong communication channels and regular check-ins are essential for swift and effective collaboration, even across different time zones. This structure allows Remote Reliability Engineers to contribute significantly to system uptime while working from a distance.

What is a Remote Reliability Engineer?

A Remote Reliability Engineer is a professional who works from a remote location to ensure that systems, applications, or infrastructure are reliable, available, and performing well. Their responsibilities typically include monitoring system health, diagnosing issues, implementing preventative measures, and collaborating with teams to improve system reliability. They often use tools for automation, incident response, and performance monitoring, all while working offsite. This role is critical in minimizing downtime and ensuring a smooth user experience, especially for companies with complex technical environments. Remote Reliability Engineers must have strong problem-solving skills and be proficient in cloud technologies, automation, and incident management.

Site Reliability Engineer

Ad Hoc

McLean, VA • Remote

$57.50 - $76.50/hr

Full-time

Medical, Dental, Vision, Retirement, PTO

Posted 3 days ago


Job description

Site Reliability Engineer

Job number: 880

This is a remote position.

Ad Hoc is a technology company that empowers organizations to deliver scalable, impactful digital services. Using modern, agile methods, our team creates products that meet people’s needs and transform their experience of government.

Work on things that matter

Our collaborations have shaped some of the defining moments in public-sector service delivery. We’ve helped build products that connect Veterans to tailored services, help millions access affordable health care, and support important programs like Head Start. As we work with agencies to deliver critical services, we’re also changing how the government approaches technology.

Built for a remote life

Our culture, communications, and tools are built for remote work, enabling us to bring together top talent nationwide. At Ad Hoc, remote life empowers our teams to design work environments that fit their lives and that foster flexibility and collaboration to achieve positive outcomes for our customers.

Committed to high expectations and a welcoming culture

Ad Hoc values acceptance, accountability, and humility. We aren’t heroes. We learn from our mistakes and improve the process for the next time. We build small, inclusive teams to collaborate closely with our partners to solve the right problems and deliver software that works.

The Veterans Affairs business unit helps transform the VA into a modern digital services organization where Veteran outcomes are at the center of every effort. We partner with the VA to design and deliver seamless user experiences for Veterans, their families and caregivers, and VA employees. By applying better practices in service design, product management, and technology, we enable the VA to increase the use, quality, and reliability of services and decrease the time Veterans spend waiting for outcomes.

Primary Responsibilities:

As a Site Reliability Engineer, you will help ensure the availability, performance, and reliability of a large federal enterprise cloud platform that operates around the clock. With the support and guidance of senior engineers, you will help meet scope, schedule, and delivery requirements while improving the platform's reliability practices. Primary expectations of a Site Reliability Engineer include:

  • Monitoring platform health and supporting service level objectives (SLOs), service level indicators (SLIs), and error budgets
  • Building and maintaining observability tooling, including metrics, logging, alerting, and dashboards
  • Participating in on-call rotations and incident response, helping restore service and reduce time to recovery
  • Contributing to blameless postmortems and driving follow-up actions
  • Automating repetitive operational tasks to reduce toil
  • Supporting capacity planning and performance tuning across cloud infrastructure (AWS) and Kubernetes (Amazon EKS)
  • Implementing reliability improvements as infrastructure as code (Terraform)
  • Working with government partners and application teams to meet security, SLA, and performance requirements
  • Supporting recruiting efforts by evaluating exercises and assisting with interviews


Basic Qualifications:

  • Bachelor's degree and 5+ years of experience; relevant experience may be substituted for education
  • Experience with monitoring and observability tooling and on-call operations
  • Proficient with at least one infrastructure-as-code tool (Terraform preferred)
  • Background in key DevOps concepts: containerization, networking, and cloud infrastructure
  • Must be able to obtain and maintain a U.S. Public Trust / suitability determination


Preferred Qualifications:

  • Prior experience with the Department of Veterans Affairs
  • Experience with Kubernetes (Amazon EKS) and AWS in production
  • Familiarity with SLO-based reliability practices and error budgets
  • Relevant certifications (e.g., AWS, Certified Kubernetes Administrator)

To learn more about working at Ad Hoc, please visit: https://adhocteam.us/join

Benefits:

  • Company-subsidized health, dental, and vision insurance
  • Flexible PTO
  • 401K with employer match
  • Paid parental leave after one year of service
  • Employee Assistance Program

Ad Hoc LLC is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, national origin, ancestry, sex, sexual orientation, gender identity or expression, religion, age, pregnancy, disability, work-related injury, covered veteran status, political ideology, marital status, or any other factor that the law protects from employment discrimination.

We value the unique skills gained through military service and encourage veterans and transitioning service members to apply.

In support of various state and city equal pay transparency laws, Ad Hoc job descriptions feature the starting range we reasonably expect to pay to candidates who would join our team with little to no need for training on the responsibilities we've outlined above. Actual compensation is influenced by a wide range of factors including but not limited to skill set, level of experience, and responsibility. The range of starting pay for this role is $125,000-$135,000. Our recruiters will be happy to answer any questions you may have, and we look forward to learning more about your salary requirements.

job reference:

https://adhoc.team/