Remote Contract Reliability Engineer Jobs in Colorado

As a Software Engineer II on the Site Reliability Engineering team within the Platform Engineering ... remote APIs. * Experience developing and operating production, customer-facing systems in AWS or ...

Site Reliability Engineer II

Denver, CO · On-site +1

$98K - $138K/yr

The Site Reliability Engineer II will be responsible for supporting, enhancing, and maintaining ... Wellness initiatives #BI-Remote DYN365, Inc d/b/a Restaurant365 is an equal opportunity employer ...

DevOps/SRE Engineer

Denver, CO · On-site +1

$130K - $155K/yr

DevOps/SRE Engineer Department: Engineering Employment Type: Full Time Location ... Denver, Colorado (Remote) Compensation: $130,000 - $155,000 / year Description We're looking for a ...

This is a fully remote contract role with flexible hours - real, high-impact engineering work on ... Improve reliability, performance, and safety across existing Python codebases * Collaborate with ...

Contracts Manager

Denver, CO · On-site +1

$99K - $135K/yr

... engineering services for the space industry. Headquartered in Durango, Colorado, with expanding ... Agile Space Industries is seeking an experienced remote Contracts Manager to oversee the full ...

Delivery Engineer | Splunk (W2PE)

Denver, CO · Remote

$84K - $113K/yr

This is a remote, contract position and eligible candidates must be located in the United States. EDUCATION / KNOWLEDGE / CERTIFICATIONS * BA/BS in computer science, computer engineering, finance or ...

This is a fully remote, flexible contract role for experienced engineers who want to work on real ... Improve reliability, performance, and safety across existing Python codebases * Collaborate with ...

Remote Contract Reliability Engineer information

What are Remote Contract Reliability Engineers?

Remote Contract Reliability Engineers are professionals who work remotely, usually on a contract basis, to ensure that systems, equipment, or software operate reliably and efficiently. Their main focus is on analyzing data, troubleshooting issues, and implementing improvements to enhance the dependability and performance of products or processes. They collaborate with teams virtually to identify potential failures, recommend solutions, and help organizations minimize downtime and maintenance costs. Their work spans various industries such as manufacturing, technology, and energy, and typically involves using specialized tools and methodologies to predict and prevent problems before they occur.

What are the key skills and qualifications needed to thrive as a Remote Contract Reliability Engineer, and why are they important?

To thrive as a Remote Contract Reliability Engineer, you need a solid background in reliability engineering, failure analysis, and maintenance planning, often supported by a degree in engineering and relevant industry experience. Familiarity with reliability analysis software (like ReliaSoft), asset management systems, and certifications such as Certified Reliability Engineer (CRE) are typically required. Strong problem-solving, communication, and self-motivation skills are essential for effectively collaborating and delivering results in a remote, contract-based environment. These skills ensure the engineer can optimize system reliability, reduce downtime, and meet client expectations efficiently from a remote location.

How does a Remote Contract Reliability Engineer typically collaborate with on-site teams and stakeholders?

As a Remote Contract Reliability Engineer, you will frequently collaborate with on-site teams through virtual meetings, project management platforms, and real-time data sharing tools. Strong communication skills are essential, as you'll provide recommendations, troubleshoot issues, and review maintenance or performance data remotely. You may also participate in regular status updates and coordinate with cross-functional teams—such as operations, maintenance, and safety—to implement reliability improvements and ensure asset uptime. Successful remote collaboration often relies on proactive communication and clear documentation of your analyses and recommendations.

What is the difference between a Remote Contract Reliability Engineer and a Remote Contract Maintenance Technician?

Aspect | Remote Contract Reliability Engineer | Remote Contract Maintenance Technician
Credentials | Engineering degree, certifications like Six Sigma or Reliability Engineering | Technical diploma or certifications in maintenance or HVAC
Work Environment | Designing reliability strategies, analyzing data remotely, consulting | Performing repairs, inspections, and preventive maintenance remotely or on-site
Employer & Industry Usage | Manufacturing, energy, aerospace industries | Manufacturing plants, facilities management, industrial sectors

The Remote Contract Reliability Engineer focuses on analyzing systems, improving reliability, and providing remote consulting, while the Remote Contract Maintenance Technician handles hands-on repairs and maintenance tasks. Both roles may work remotely or on-site, but their core responsibilities and required skills differ significantly.

Infographic showing various Remote Contract Reliability Engineer job openings in Colorado as of June 2026, with employment types broken down into 42% Full Time, 17% Part Time, and 41% Contract. Highlights a 100% Remote job distribution.

Senior Site Reliability Engineer (Remote USA)

TechInsights

Denver, CO • On-site, Remote

$149K - $157K/yr

Full-time

Medical, Dental, Vision, Retirement, PTO

Posted 2 days ago


Job description

OUR STORY
TechInsights is the information Platform for the semiconductor industry.
Regarded as the most trusted source of actionable, in-depth intelligence related to semiconductor innovation and surrounding markets, TechInsights' content informs decision makers and professionals whose success depends on accurate knowledge of the semiconductor industry: past, present, or future.
Over 650 companies and 150,000 users access the TechInsights Platform, the world's largest vertically integrated collection of unmatched reverse engineering, teardown, and market analysis in the semiconductor industry. This collection includes detailed circuit analysis, imagery, semiconductor process flows, device teardowns, illustrations, costing and pricing information, forecasts, market analysis, and expert commentary. TechInsights' customers include the most successful technology companies who rely on TechInsights' analysis to make informed business, design, and product decisions faster and with greater confidence. For more information, visit www.techinsights.com.
WHY WORK WITH US
  • Company-sponsored training and development opportunities
  • Comprehensive benefits package (health, dental, vision, wellness, 401K Matching, annual fitness reimbursement)
  • Flexible vacation policy
  • Community involvement opportunities through charitable alliances: https://www.techinsights.com/community-involvement
  • Wellness resources and support
  • Inclusive environment that prioritizes diversity, equity, and accessibility
  • High-growth company driven by high performance
  • Expected salary range: $149,100 - $157,800 USD

THE OPPORTUNITY:
TechInsights is building the reliability and AI operations foundation for its next chapter - an AI-first intelligence platform that runs the most demanding semiconductor intelligence workflows in the world. We're looking for a Senior Site Reliability Engineer who wants to own that foundation.
This is a senior individual contributor role at the technical leadership tier of our Site Reliability Engineering team. You'll own strategic reliability initiatives end-to-end: setting technical direction, defining SLOs and error budgets across our production platform, designing reliability patterns for the AI agent pipelines that power our platform's AI-first capabilities, and enabling our development and AI Engineering teams to build and ship with confidence.
What sets this role apart is its scope. You're not just keeping the lights on - you're building the observability, Internal Developer Platform (IDP), and service catalog that a fast-scaling AI platform needs from day one. You'll be the reliability voice in architectural decisions, the engineer who closes the loop between agent failure modes and platform resilience, and the mentor who builds the team's capability rather than their own indispensability.
If you have deep SRE experience and want to apply it to AI workloads - agent loop observability, blast radius management, LLM infrastructure reliability - this is the role where that expertise becomes a differentiator.
This is a remote role for candidates based in the United States.
WHAT YOU'LL DO
Platform Reliability & AI Operations
  • Own SLOs, SLIs, and error budgets for all production services; drive error budget discipline across engineering
  • Design reliability patterns for AI agent pipelines: LLM observability, tool-use tracking, failure detection, and graceful degradation
  • Architect for blast radius containment - agent failures must have bounded customer impact through isolation, circuit breaking, and rapid recovery
  • Mature our Canada Central/West active-active architecture toward 24-hour RTO with full regional failover
  • Lead incident response and post-incident reviews that produce durable fixes; maintain DR procedures through regular testing
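
For readers unfamiliar with the SLO and error budget terms used in the list above, the arithmetic can be sketched as follows. This is a minimal illustration, not TechInsights' tooling: the function names, the 30-day window, and the 99.9% target are assumptions chosen for the example.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a window.

    slo_target is a fraction such as 0.999 (hypothetical example target).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be a fraction between 0 and 1")
    window_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * window_minutes


def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    return (budget - downtime_minutes) / budget


if __name__ == "__main__":
    # A 99.9% target over 30 days allows roughly 43.2 minutes of downtime.
    print(round(error_budget_minutes(0.999), 1))
    # After 10.8 minutes of downtime, about 75% of the budget remains.
    print(round(budget_remaining(0.999, 10.8), 2))
```

A team practicing error budget discipline would typically slow risky releases as the remaining fraction approaches zero; the exact policy is a team decision and is not specified in this posting.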
Developer & AI Engineering Enablement
  • Serve as the primary reliability liaison to Software and AI Engineering, translating requirements into actionable standards
  • Partner with AI Engineering on compute provisioning, model serving, inference latency, and workload isolation
  • Own CI/CD pipeline strategy (Bitbucket Pipelines, GitHub Actions) - set standards, optimize deployment frequency, and ensure teams can ship confidently
  • Drive IDP adoption and enable teams on SRE practices: on-call readiness, SLO definition, runbook development, and self-service tooling
  • Represent reliability in architectural discussions; surface risk before it's committed to design
Observability, IDP & Service Catalog
  • Own the service catalog - a living inventory of all services, AI agents, dependencies, ownership, and SLOs
  • Operate Datadog as the single pane of glass for service health, infrastructure, and agentic pipeline telemetry
  • Extend observability to AI workloads: LLM latency, token consumption, agent completion rates, and pipeline throughput
  • Build golden path templates in Backstage and/or Atlassian Compass so teams ship reliably without routine SRE involvement
  • Apply AIOps in Datadog to automate anomaly detection, incident triage, and remediation recommendations
FinOps, IaC & Continuous Improvement
  • Own infrastructure as code via Terraform and GitOps; enforce IaC policy in partnership with Trust Assurance
  • Own FinOps visibility into AWS cost segments; model cloud cost impact as AI/ML workloads scale
  • Formally mentor junior and intermediate SRE engineers, with accountability for their technical growth and career progression
  • Build AI-assisted automation to progressively reduce toil and scale the team's operational capacity

WHAT YOU'LL BRING
Technical Requirements
  • Bachelor's degree in Computer Science, Engineering, or equivalent combination of education and experience
  • 6-8 years of progressive experience in site reliability engineering, platform engineering, or DevOps, with demonstrated technical leadership at the senior individual contributor level
  • Deep expertise in AWS (EKS, Lambda, CloudWatch, AWS Config) and multi-region architecture patterns
  • Proficiency with Terraform and GitOps; experience with policy-as-code (Sentinel, OPA/Rego, or equivalent)
  • Hands-on Datadog experience at operational depth: dashboards, SLO tracking, alerting, log management, distributed tracing
  • Strong containerization expertise: Docker, Kubernetes (EKS preferred)
  • Proficiency in Python and/or Bash; experience building operational tooling; solid understanding of Java and Spring Boot microservice architecture sufficient to make reliability and deployment decisions for EKS-hosted services
  • Deep expertise in CI/CD pipeline design and optimization using Bitbucket Pipelines and GitHub Actions
  • Familiarity with IDP tooling (Backstage, Atlassian Compass, or equivalent) strongly preferred
  • Experience with AI/ML workload infrastructure, LLM API integration, or agentic system operations considered a strong asset
Professional Skills
  • Leads and owns strategic reliability initiatives end-to-end with a high degree of autonomy; accountable for outcomes, not just tasks
  • Sets technical direction and influences team and department strategy
  • Solves complex, ambiguous reliability problems through systematic analysis and first-principles thinking
  • Formally mentors junior and intermediate engineers; builds team capability through coaching and knowledge transfer
  • Communicates technical reliability concepts clearly to engineering, product, and leadership audiences
  • Approaches operational work with an AI-first posture: builds automation and intelligent tooling as the default
Preferred Qualifications
  • Experience designing reliability architecture for agentic AI systems: agent loop observability, blast radius isolation, graceful degradation for LLM-dependent services
  • AWS certifications: Solutions Architect Professional, DevOps Engineer Professional, or equivalent
  • FinOps Certified Practitioner or demonstrated cloud cost management experience at scale
  • IDP implementation or developer experience program leadership
  • Experience in semiconductor, SaaS, or data-intensive platform environments
  • Experience operating in environments with export-controlled or regulated data
  • Knowledge of BCP/DR program management and formal recovery testing

As part of the recruitment process for this position, you will be required to submit your latest citizenship and/or permanent residency information. This information will be used to comply with U.S. Export Control Laws and Regulations.
WORKING ARRANGEMENT
This is a remote position for candidates based in the United States. Occasional travel may be required.
Technology knows no bounds, and neither does TechInsights. Bringing together talented humans from different perspectives, backgrounds and abilities is something we take seriously. We're committed to building an inclusive environment that welcomes you to be your authentic self and allows us to push past the boundaries together.
TechInsights is committed to meeting the needs of people with disabilities. Accommodations are available on request for candidates taking part in all aspects of the selection process.
AI technology may be used to assist in the screening and assessment of applications for this position. Our recruiters are involved at every stage, and all hiring decisions are made by People and hiring teams.
As part of any recruitment process, TechInsights collects and processes personal data relating to job applicants. We are committed to being transparent about how we collect and use that data and to meeting our data protection obligations. Our Privacy policy can be referenced here: https://www.techinsights.com/privacy-policy