Site Reliability Engineer Remote Jobs in Washington

Site Reliability Engineer

Columbia, MD · On-site +1

$55.50 - $73.75/hr

Hybrid Columbia MD 3 times per week OR Remote (as applicable to role) Work Authorization ... Job Overview Cogent People Inc. is seeking a Site Reliability to support system reliability ...

Site Reliability Engineer Remote information

What are the key skills and qualifications needed to thrive in the Site Reliability Engineer Remote position, and why are they important?

To thrive as a remote Site Reliability Engineer, you need expertise in systems administration, cloud infrastructure, automation, coding (often in Python or Go), and a solid grasp of networking fundamentals, usually demonstrated by a degree in computer science or equivalent experience. Familiarity with tools such as Docker, Kubernetes, AWS/GCP/Azure, and monitoring platforms like Prometheus is highly valued, as are certifications like AWS Certified SysOps Administrator. Excellent problem-solving, communication, and collaboration skills are essential, especially when troubleshooting incidents and sharing information across distributed teams. These abilities ensure reliable, scalable services and smooth coordination in a remote work environment.

What is a Site Reliability Engineer Remote job?

A Site Reliability Engineer (SRE) in a remote role is responsible for ensuring the reliability, performance, and scalability of software systems while working from a remote location. They bridge the gap between development and operations by implementing automation, monitoring, and incident response strategies. Remote SREs collaborate with distributed teams to improve infrastructure, troubleshoot issues, and optimize system performance. Strong communication skills, proficiency in cloud technologies, and expertise in software development are essential for success in this role.

What are some common challenges faced by Site Reliability Engineers working remotely, and how are they addressed?

Site Reliability Engineers working remotely may encounter challenges like coordinating across multiple time zones, maintaining clear communication during urgent incidents, and managing complex systems without direct on-site access. These are often addressed by leveraging collaborative tools (like Slack, Zoom, and incident management platforms), implementing well-documented processes, and participating in regular team syncs or on-call rotations. Remote SREs also benefit from automation and observability practices that provide in-depth systems insights without needing physical presence. Many organizations support their success through robust onboarding, continuous training, and establishing clear lines of communication for rapid response scenarios. This blend of technical and teamwork strategies helps remote SREs maintain service reliability and stay connected with their colleagues.

Infographic showing Site Reliability Engineer Remote job openings in Washington as of June 2026, with employment types broken down into 74% Full Time and 26% Contract. Highlights a 100% Remote job distribution.
Senior Site Reliability Engineer

Lean Solutions Group

Washington, DC • Remote

$64.50 - $85.75/hr

Full-time

This job post has expired today. Applications are no longer accepted.


Job description

Company Overview:
Global Technology Services is a rapidly expanding organization situated in Medellín, Colombia. We pride ourselves on possessing one of the most influential networks within software development and IT services for the entertainment, financial, and logistics sectors. Our corporate projections offer a multitude of opportunities for professionals to elevate their careers and experience substantial growth. Joining our team means engaging with expansive engineering teams across Latin America, the Philippines, and the United States, contributing to cutting-edge developments in multiple industries.

Position Title: Senior Site Reliability Engineer
Location: Remote-US

What you will be doing:
We are seeking a highly experienced Senior Site Reliability Engineer to own and evolve the reliability, security, observability, and operational maturity of our cloud platform. This is not a traditional SRE role. We are looking for an engineer who operates with an AI-native mindset and uses AI as a core operational force multiplier across infrastructure, incident response, automation, compliance, and operational excellence.
Required Skills & Experience
To excel in this role, you should possess:
AI-Native SRE Operations (Hard Requirement)
  • Expert use of AI tools and agentic workflows to automate infrastructure and SRE tasks.
  • Hands-on experience using AI for Terraform development, incident triage, log analysis, runbook creation, postmortems, operational automation, CI/CD pipeline generation, and reducing repetitive operational work.
  • Strong understanding of AI capabilities, limitations, and necessary validation processes.
  • Ability to clearly articulate AI workflows, tooling choices, operational safeguards, and production outcomes.
Cloud Infrastructure & AWS (Hard Requirement)
  • 10+ years managing production infrastructure for SaaS platforms, including 5+ years of senior AWS ownership.
  • Deep expertise with AWS services such as ECS, VPC, IAM, RDS, S3, CloudFront, Route53, Lambda, API Gateway, CloudWatch, Secrets Manager, and related security and governance services.
  • Advanced Terraform experience managing multi-account, multi-workspace infrastructure.
  • Strong understanding of provider versioning, state management, drift detection and remediation, dependency management, and infrastructure blast radius analysis.
  • Proven experience resolving production infrastructure drift safely
  • Significant experience leading production incidents as the accountable owner
  • Ability to operate calmly and effectively during high-severity outages
  • Proven experience authoring detailed postmortems and operational remediation plans
  • Strong understanding of operational risk management and production recovery procedures
Observability & Monitoring
  • Proven experience leading production incidents, driving root-cause analysis, and creating remediation plans.
  • Strong background in observability, monitoring, logging, distributed tracing, and alerting using tools such as Grafana.
  • Experience owning CI/CD pipelines, deployment strategies, infrastructure automation, and operational workflows.
Systems, Security & Compliance
  • Strong Linux administration, containerization (Docker), networking, and scripting skills.
  • Experience with security best practices, identity management (SAML, OIDC, SCIM), and compliance frameworks such as SOC 2, ISO 27001, HIPAA, or PCI.
  • Comfortable working directly with auditors and maintaining compliance controls.
Nice to Have:
  • Experience supporting Spring Boot or JVM-based systems in production
  • Experience with runtime security or EDR tooling such as Falco
  • Experience automating joiner/mover/leaver identity workflows using SCIM and IdP tooling
  • AWS certifications, including AWS Solutions Architect Professional, AWS DevOps Engineer Professional, and AWS Security Specialty
  • Ability to read and debug Kotlin or Java backend services from an SRE perspective
Soft Skills:
  • Excellent verbal and written communication, able to convey ideas clearly.
  • Highly autonomous and proactive, taking ownership of tasks.
  • Adaptable to fast-paced, dynamic work environments.
  • Responsive and reliable across channels, including email and Slack, consistently delivering results.
  • Able to add immediate value to the client, contributing effectively from the first week.
  • React/NodeJS/Backstage developer experience
  • MuleSoft API Management experience
Why you will love GTS:
  • Join a powerful tech workforce and help us change the world through technology
  • Professional development opportunities with international customers
  • Collaborative work environment
  • Career path and mentorship programs that will lead to new levels.
Join Lean Tech and contribute to shaping the data landscape within a dynamic and growing organization. Your skills will be honed, and your contributions will play a vital role in our continued success. Lean Tech is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.