Manager Hardware Reliability Engineer Jobs in Washington

Site Reliability Engineer (SRE)

Vienna, VA · On-site

$57.25 - $76/hr

The AWS Site Reliability Engineer (SRE) is responsible for the operational health, availability ... CloudWatch, performance tuning in cloud environments, IaC tools, Databricks management and ...

Site Reliability Engineer (SRE) (TS)

Washington, DC · On-site

$64.50 - $85.75/hr

Koniag Management Solutions, LLC (KMS), a Koniag Government Services (KGS) company, is hiring a Site Reliability Engineer (SRE). Position requires an active Top Secret/SCI clearance with ability to ...

You will help ensure MC&FP systems are reliable, scalable, resilient, and efficiently managed ... Expert knowledge of site reliability engineering practices, system monitoring, incident management ...

Required : • Expert knowledge of site reliability engineering practices, system monitoring, incident management, automation, performance tuning, and operational resilience. • Strong understanding ...

Manage incident response, root cause analysis, and post-mortem processes for the AI platform ... , DevOps, or production operations. * Extensive experience with cloud-native infrastructure ...

Manager Hardware Reliability Engineer information

What are some typical challenges faced by a Manager Hardware Reliability Engineer, and how can they be addressed?

A Manager Hardware Reliability Engineer often faces challenges such as ensuring consistent product quality across multiple development cycles, managing cross-functional teams, and balancing strict deadlines with thorough reliability testing. Addressing these challenges involves implementing robust reliability test plans early in the design process, fostering open communication among engineering, manufacturing, and quality assurance teams, and staying updated with industry-standard reliability methodologies. Effective managers also mentor their teams and leverage data-driven approaches to anticipate and mitigate potential hardware failures before products reach customers.

What does a Manager Hardware Reliability Engineer do?

A Manager Hardware Reliability Engineer oversees teams responsible for ensuring the reliability and durability of hardware products throughout their lifecycle. They design and implement testing protocols, analyze failure data, and develop strategies to improve product performance and reduce failure rates. This role involves collaborating with design, manufacturing, and quality teams to identify potential risks and implement solutions. Additionally, they lead root cause analysis for hardware issues, ensure compliance with industry standards, and mentor junior engineers. Their ultimate goal is to deliver high-quality, reliable hardware that meets customer expectations.

What engineers make $300,000 a year?

Senior hardware reliability engineers, especially those with extensive experience, specialized skills, and certifications, can earn $300,000 or more annually. High compensation often depends on factors such as industry, company size, location, and expertise in areas like failure analysis, testing, and reliability modeling.

What is the difference between Manager Hardware Reliability Engineer vs Hardware Reliability Engineer?

Aspect | Manager Hardware Reliability Engineer | Hardware Reliability Engineer
Responsibilities | Oversees reliability strategies, manages teams, and coordinates reliability projects | Performs reliability testing, analysis, and troubleshooting of hardware components
Required Credentials | Bachelor's or Master's in Engineering, certifications like Six Sigma or Reliability Engineering | Bachelor's in Electrical, Mechanical, or related engineering; certifications optional
Work Environment | Leadership roles in R&D, manufacturing, or product teams | Hands-on testing labs, design teams, or manufacturing facilities
Industry Usage | Commonly found in electronics, aerospace, automotive sectors | Used across similar industries for hardware development and testing

The main difference is that the Manager Hardware Reliability Engineer focuses on leading teams and strategic reliability planning, while the Hardware Reliability Engineer is involved in technical testing and analysis. Both roles require engineering backgrounds, but the manager position emphasizes leadership and coordination.

What engineer makes $500,000 a year?

A senior hardware reliability engineer or specialized engineering manager with extensive experience and advanced certifications can earn $500,000 or more annually, especially in high-demand industries like technology or aerospace. Such roles often require deep technical expertise, leadership skills, and sometimes stock options or bonuses as part of compensation packages.

What is the highest salary of SRE?

Top salaries for a Site Reliability Engineer (SRE) can reach $150,000 to $200,000 or more annually in high-demand tech regions, especially for those with extensive experience and advanced skills in cloud platforms, automation, and monitoring tools. Senior SREs or those in leadership roles may earn even more once bonuses and stock options are included.

What are the key skills and qualifications needed to thrive as a Manager Hardware Reliability Engineer, and why are they important?

To thrive as a Manager Hardware Reliability Engineer, you need expertise in hardware design, failure analysis, and reliability testing, and typically a degree in electrical or mechanical engineering. Familiarity with statistical analysis software (e.g., Minitab) and reliability prediction systems (e.g., ReliaSoft) is highly valuable, as are quality management certifications such as Six Sigma. Strong leadership, problem-solving, and communication skills distinguish top performers in leading teams and collaborating across departments. These capabilities are crucial for ensuring product quality, minimizing failures, and driving continuous improvement in hardware development.

How much do SRE managers make in the US?

SRE (Site Reliability Engineering) managers in the US typically earn between $130,000 and $180,000 annually, depending on experience, location, and company size. They often oversee teams responsible for system reliability, incident response, and automation, requiring strong technical and leadership skills.
Infographic showing Manager Hardware Reliability Engineer job openings in Washington as of June 2026, with employment types broken down into 86% Full Time, 13% Part Time, and 1% Contract. It also shows a job distribution of 93% Physical, 2% Hybrid, and 5% Remote.
Site Reliability Engineer (SRE) (TS)

Koniag, Inc.

Washington, DC · On-site

$158K - $178K/yr

Full-time

Medical, Dental, Vision, Retirement, PTO

Posted 8 days ago


Job description

Koniag Management Solutions, LLC (KMS), a Koniag Government Services (KGS) company, is hiring a Site Reliability Engineer (SRE). The position requires an active Top Secret/SCI clearance and the ability to meet additional security requirements. Please do not apply if you do not hold the required Top Secret clearance.
We offer competitive compensation and an extraordinary benefits package including health, dental and vision insurance, 401K with company matching, flexible spending accounts, paid holidays, three weeks paid time off, and more.
Position Summary:
We are seeking an experienced Site Reliability Engineer (SRE) to blend software engineering and systems administration practices to ensure the reliability, availability, and performance of mission-critical applications. This role focuses on automation, observability, and incident response while upholding strict Service Level Objectives (SLOs). The SRE will help build resilient systems that scale, automate manual processes, manage fleet-wide configurations, and ensure robust system monitoring. The selected candidate will support operations at Joint Base Anacostia-Bolling and must maintain an active TS/SCI clearance.
Key Responsibilities:
  • Ensure application reliability, performance, and availability through automation, monitoring, and systems engineering.
  • Develop infrastructure-as-code (IaC) solutions using Terraform, Ansible, and Desired State Configuration (DSC).
  • Build and manage containerized workloads using Kubernetes, Rancher, Docker, Helm, and related ecosystem tools.
  • Support service mesh and networking constructs such as Cilium, load balancing, ingress management, and distributed storage.
  • Engineer and maintain storage and object systems including Rook, Ceph, MinIO, and S3-compatible platforms.
  • Implement and maintain comprehensive observability platforms (metrics, logging, tracing) to support SLO monitoring and incident response.
  • Lead and participate in incident response activities, postmortem analysis, and reliability engineering improvements.
  • Develop automations, scripts, and tools using Python, PowerShell, and shell scripting.
  • Support CI/CD pipelines and cloud-native deployment methodologies.
  • Collaborate with development and operations teams to embed SRE practices into the application lifecycle.

Required Technical Certifications (at least two):
  • Security+
  • Cloud Associate (such as AWS Solutions Architect Associate, Azure AZ-104, or Google Cloud Associate Cloud Engineer)
  • Terraform Associate
  • Cloud Professional/Architect (such as AWS Solutions Architect Professional or Azure Architect Expert)
  • CKA (Certified Kubernetes Administrator)

Preferred Certifications (Plus):
  • CKA (if not used to meet required cert)
  • RHCSA
  • AWS DevOps Engineer or AZ-400
  • CCSP
  • Advanced observability certifications (Datadog, New Relic, Dynatrace, etc.)
  • Formal incident management or SRE-focused training

Required Technical Knowledge:
Strong understanding of the following technologies:
  • Kubernetes, Rancher, Helm, Docker
  • Cilium, Rook, Ceph, MinIO, S3, Portworx
  • Load balancing, ingress, and service networking
  • Ansible, Terraform, Desired State Configuration
  • Python, PowerShell, and scripting/automation
  • Distributed systems, cloud computing, and microservices architecture
  • Monitoring/observability practices and tools
  • Incident response frameworks and SLO-based operations

Preferred Experience:
  • Building scalable, fault-tolerant cloud-native systems across hybrid or multi-cloud environments.
  • Developing or supporting enterprise CI/CD pipelines.
  • Managing complex Kubernetes clusters across on-prem and cloud platforms.
  • Implementing enterprise observability stacks (e.g., Prometheus, Loki, Grafana, ELK, OpenTelemetry).
  • Supporting large-scale infrastructure within DoD or Intelligence Community environments.

Requirements:
TS/SCI security clearance required; candidates without one will not be considered.
Our Equal Employment Opportunity Policy:
The company is an equal opportunity employer. The company shall not discriminate against any employee or applicant because of race, color, religion, creed, ethnicity, sex, sexual orientation, gender or gender identity (except where gender is a bona fide occupational qualification), national origin or ancestry, age, disability, citizenship, military/veteran status, marital status, genetic information or any other characteristic protected by applicable federal, state, or local law. We are committed to equal employment opportunity in all decisions related to employment, promotion, wages, benefits, and all other privileges, terms, and conditions of employment.
The company is dedicated to seeking all qualified applicants. If you require an accommodation to navigate or to apply to a position on our website, please contact Heaven Wood via e-mail at accommodations@koniag-gs.com or by calling 703-488-9377 to request accommodations.
About our Company:
Koniag Government Services (KGS) is an Alaska Native Owned corporation supporting the values and traditions of our native communities through an agile employee and corporate culture that delivers Enterprise Solutions, Professional Services and Operational Management to Federal Government Agencies. As a wholly owned subsidiary of Koniag, we apply our proven commercial solutions to a deep knowledge of Defense and Civilian missions to provide forward leaning technical, professional, and operational solutions. KGS enables successful mission outcomes for our customers through solution-oriented business partnerships and a commitment to exceptional service delivery. We ensure long-term success with a continuous improvement approach while balancing the collective interests of our customers, employees, and native communities. For more information, please visit www.koniag-gs.com.
Equal Opportunity Employer/Veterans/Disabled. Shareholder Preference in accordance with Public Law 88-352

About Koniag

Sourced by ZipRecruiter

Industry

Investment management and consulting services

Company size

501 - 1,000 Employees

Headquarters location

Kodiak, AK, US

Year founded

1972
