Reliability Engineer Jobs in Tennessee (NOW HIRING)

Site Reliability Engineer II

Nashville, TN

$55 - $73.25/hr

Site Reliability Engineer II The SRE II sits at the intersection of software engineering and platform operations. You will own the reliability, scalability, and operational hygiene of Kastle's core ...

SRE Engineer

Maryville, TN · On-site

$46.75 - $62.25/hr

Monday-Friday, 9am-5pm. The Site Reliability Engineer will be an active contributor responsible for configuring Dynatrace as the main business monitoring platform and SolarWinds as the primary ...

SRE Engineer - PxE Talent

Nashville, TN · On-site

$55 - $73.25/hr

As an SRE Engineer you will actively engage in your engineering craft, taking a hands-on approach to multiple high-visibility projects. Your expertise will be pivotal in delivering solutions that ...

SRE Engineer - PxE Talent

Memphis, TN

$55.25 - $73.50/hr

As an SRE Engineer you will actively engage in your engineering craft, taking a hands-on approach to multiple high-visibility projects. Your expertise will be pivotal in delivering solutions that ...

SRE Engineer - PxE Talent

Hermitage, TN

$50 - $66.50/hr

As an SRE Engineer you will actively engage in your engineering craft, taking a hands-on approach to multiple high-visibility projects. Your expertise will be pivotal in delivering solutions that ...

SRE Engineer

Memphis, TN · On-site

$50.25 - $66.75/hr

Monday-Friday, 9am-5pm. The Site Reliability Engineer will be an active contributor responsible for configuring Dynatrace as the main business monitoring platform and SolarWinds as the primary ...

SRE Engineer - PxE Talent

Nashville, TN · On-site

$55 - $73.25/hr

Deloitte is a leading professional services firm, and they are seeking an SRE Engineer to engage in high-visibility projects that deliver customer value. The role involves implementing monitoring ...

SRE Engineer

Memphis, TN · On-site

$50.25 - $66.75/hr

Monday-Friday, 9am-5pm. The Site Reliability Engineer will be an active contributor responsible for configuring Dynatrace as the main business monitoring platform and SolarWinds as the primary ...

Boiler Reliability Engineer

Afton, TN · Remote

$90K - $113K/yr

We are seeking a hands-on Boiler Reliability Engineer to support the performance, reliability, and lifecycle management of critical boiler systems across Reworld ...

Senior Site Reliability Engineer

Nashville, TN · Remote

$55 - $73.25/hr

Bring the SRE mindset: automate toil, prefer boring/stable systems, and relentlessly improve What We're Looking For * 5+ years in SRE, DevOps, or Infrastructure Engineering * Strong Kubernetes ...

Maintenance Reliability Engineer Draslovka is seeking a Maintenance Reliability Engineer to join our site asset care team. The maintenance reliability engineer will lead maintenance activities to ...

As a Manager, ServiceNow SRE Engineer, you will actively engage in your engineering craft, taking a hands-on approach to multiple high-visibility projects. Your expertise will be pivotal in ...

Reliability Engineer information

Tennessee salary chart values: $55.4K, $107.1K, $128K

How much do reliability engineer jobs pay per year?

As of Jun 11, 2026, the average yearly pay for a reliability engineer in Tennessee is $107,074, according to ZipRecruiter salary data. Most workers in this role earn between $93,000 and $117,100 per year, depending on experience, location, and employer.
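Several listings above quote hourly rates, while this average is annual. To compare the two, a rough conversion can be sketched in Python. It assumes a full-time schedule of 40 hours a week for 52 weeks (2,080 hours), which the listings do not state, and the function name `hourly_to_annual` is illustrative only.

```python
HOURS_PER_YEAR = 40 * 52  # assumed full-time schedule: 2,080 hours


def hourly_to_annual(rate_per_hour: float, hours_per_year: int = HOURS_PER_YEAR) -> float:
    """Convert an hourly rate to an approximate gross annual figure."""
    return rate_per_hour * hours_per_year


# Hourly range quoted by several Nashville listings on this page.
low, high = 55.00, 73.25
print(f"${hourly_to_annual(low):,.0f} - ${hourly_to_annual(high):,.0f} per year")
```

Under that assumption, the $55 - $73.25/hr range works out to roughly $114,400 - $152,360 per year before taxes and benefits.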

What are some typical challenges Reliability Engineers face when implementing preventive maintenance strategies?

Reliability Engineers often encounter challenges such as balancing preventive maintenance schedules with production demands, ensuring buy-in from operations teams, and accurately predicting equipment failures. They must analyze large sets of historical data to identify trends and root causes, which can be complex in facilities with diverse machinery. Collaboration with maintenance, operations, and engineering teams is essential to develop effective strategies that minimize downtime while optimizing resources.

What are the key skills and qualifications needed to thrive as a Reliability Engineer, and why are they important?

To thrive as a Reliability Engineer, you need a solid background in engineering principles, failure analysis, and reliability modeling, typically with a degree in engineering or a related field. Familiarity with tools such as FMEA, Root Cause Analysis (RCA), reliability-centered maintenance (RCM) software, and certifications like Certified Reliability Engineer (CRE) are highly valued. Strong problem-solving abilities, attention to detail, and effective communication are crucial soft skills in this role. These skills ensure systems are dependable, downtime is minimized, and organizational performance and safety are optimized.

What is the difference between a Reliability Engineer and a Maintenance Engineer?

Aspect | Reliability Engineer | Maintenance Engineer
Credentials | Typically requires engineering degree, certifications in reliability or asset management | Often requires engineering or technical diploma, certifications in maintenance or equipment repair
Work Environment | Focuses on analysis, design, and improvement of systems for reliability | Hands-on maintenance, repair, and troubleshooting of equipment
Industry Usage | Common in manufacturing, energy, aerospace, and industrial sectors | Prevalent in manufacturing, facilities, and industrial plants

Reliability Engineers focus on designing and improving systems to prevent failures, using data analysis and modeling. Maintenance Engineers perform hands-on repairs and upkeep of equipment to ensure operational continuity. While both roles aim to optimize equipment performance, Reliability Engineers work proactively on system reliability, whereas Maintenance Engineers handle reactive and scheduled maintenance tasks.

What engineers make $500,000?

Senior-level engineers in specialized fields such as petroleum, aerospace, or software engineering can earn $500,000 or more annually, often through a combination of base salary, bonuses, and stock options. High compensation typically requires extensive experience, advanced skills, and working in high-demand industries or leadership roles.

What Does a Reliability Engineer Do?

As a reliability engineer, your duties are to test and evaluate the manufacturing of products and components and ensure that the procedures are efficient and do not lead to abnormally high maintenance or operational costs. Your other responsibilities are to find solutions to product reliability risks. You may manage risk in a supply chain, develop loss prevention strategies, and track the entire lifecycle of product development, from building prototypes to moving a product into full-scale production. You analyze information from department heads and recommend strategies to reduce risk and ensure that the product works reliably.

Is 47 too old to become an engineer?

Reliability engineering is a field that values experience and skills, and age is not a barrier to entering the profession. Many engineers successfully start or transition into the field later in life by gaining relevant certifications, technical knowledge, and practical experience. Continuous learning and adapting to new tools and technologies are important regardless of age.

What are Reliability Engineers?

Reliability Engineers are professionals responsible for ensuring that systems, equipment, or processes function consistently and efficiently over time. They analyze data, identify potential points of failure, and develop maintenance strategies to improve system reliability and minimize downtime. Their work spans various industries, including manufacturing, energy, and technology, and often involves collaborating with design, operations, and maintenance teams. By implementing reliability-centered maintenance and predictive analysis, they help organizations save costs and increase safety.

What is the role of a reliability engineer?

A reliability engineer is responsible for ensuring that systems, equipment, or products perform consistently and dependably over time. They analyze failure data, develop maintenance strategies, and implement improvements to enhance reliability, often using tools like FMEA and root cause analysis. The role typically requires technical skills, problem-solving abilities, and knowledge of industry standards.

Site Reliability Engineer II

Kastle Systems

Nashville, TN

$55 - $73.25/hr

Other

Medical, Dental, Vision, Retirement

Posted 8 days ago


Kastle Systems rating

Company rating: 9.2 out of 10

Based on 6 frontline employees who took The Breakroom Quiz

3rd of 100 rated security


Job description

Overview

Join the leader in providing smarter solutions for a safer world.

The property technology space is growing rapidly, and Kastle Systems is leading the way. Kastle Systems is the leader in managed security, with a track record of introducing innovative technologies to serve over 460M square feet of real estate globally. Clients span the commercial and multifamily real estate, education, and construction industries and the customers they serve. Delivering a world-class customer experience drives everything we do, and Kastle’s mission is to be our customers’ best service provider and to ensure that their security is the most effective, efficient, and convenient. Kastle's integrated security solution, including access control, video, and remote video monitoring, significantly reduces costs and improves the critically important 24x7 performance for building owners, developers, and tenants.

Site Reliability Engineer II

The SRE II sits at the intersection of software engineering and platform operations. You will own the reliability, scalability, and operational hygiene of Kastle’s core infrastructure – engineering away toil, hardening deployment pipelines, and partnering with product engineering teams to make new services production-ready from day one.

This is a mid-level individual contributor role. You are expected to execute technical work independently, drive reliability improvements end-to-end, and participate meaningfully in architecture discussions. You will carry on-call responsibilities as part of a shared rotation with a well-defined escalation model and a strong blameless post-incident review culture.

The team is in the middle of a meaningful platform evolution: formalizing multi-tier release pipelines (Dev → QA → Integration → UAT → Prod) with ArgoCD-based approval gates, building out SLI/SLO frameworks, and migrating toward full GitOps. You will be a hands-on contributor to all of it.

Key Responsibilities:
Release Engineering & GitOps
  • Own and evolve the multi-stage deployment pipeline using ArgoCD, including approval gates, promotion policies, and rollback mechanisms.
  • Maintain trunk-based branching discipline and enforce release governance standards across the engineering organization.
  • Manage feature flag lifecycle – from creation and gradual rollout to deprecation – in coordination with product and QA teams.
  • Build and maintain CI/CD pipelines that enable safe, frequent, and auditable deployments.
Infrastructure as Code & Cloud Operations
  • Provision and manage Azure infrastructure using Terraform or OpenTofu, maintaining drift-free state aligned with GitOps principles.
  • Own Kubernetes cluster operations including workload scheduling, resource optimization, RBAC, network policy, and cost governance.
  • Identify and act on infrastructure cost optimization opportunities (compute rightsizing, storage tier selection, idle resource elimination).
  • Support Crossplane or similar operator patterns for Kubernetes-native infrastructure management where applicable.
Reliability & Observability
  • Define, instrument, and enforce SLIs and SLOs in partnership with product engineering teams.
  • Build and maintain observability infrastructure – metrics, logs, and distributed traces – using Prometheus, Grafana, OpenTelemetry, or equivalent tooling.
  • Conduct proactive capacity planning and performance tuning across multi-tenant, distributed environments.
  • Establish and maintain runbooks, dashboards, and alerting policies that reduce cognitive overhead during incidents.
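As a rough illustration of what defining and enforcing an SLO can involve, the Python sketch below computes an availability SLI and the remaining error budget from request counts. The 99.9% target, the request counts, and the function names are hypothetical and are not taken from the posting.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """Fraction of requests served successfully (the SLI)."""
    if total_requests == 0:
        return 1.0  # no traffic: treat as meeting the objective
    return good_requests / total_requests


def error_budget_remaining(good_requests: int, total_requests: int, slo_target: float) -> float:
    """Share of the error budget still unspent (1.0 = untouched, below 0 = SLO breached)."""
    allowed_failures = (1.0 - slo_target) * total_requests
    actual_failures = total_requests - good_requests
    if allowed_failures == 0:
        return 1.0 if actual_failures == 0 else float("-inf")
    return 1.0 - actual_failures / allowed_failures


# Hypothetical numbers: 1,000,000 requests, 400 failures, 99.9% target.
sli = availability_sli(999_600, 1_000_000)
budget = error_budget_remaining(999_600, 1_000_000, slo_target=0.999)
print(f"SLI = {sli:.4%}, error budget remaining = {budget:.0%}")
```

In practice these counts would come from a metrics backend such as Prometheus rather than literals, and an alerting policy would fire as the remaining budget approaches zero.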
Incident Management
  • Participate in shared on-call rotation covering core platform and infrastructure services; on-call load is balanced across the team with structured handoff practices.
  • Lead mitigation of live production incidents with a focus on minimizing MTTR and clear stakeholder communication under pressure.
  • Facilitate blameless post-incident reviews and drive preventative engineering to closure – not just documentation.
Engineering Partnership
  • Embed with product engineering teams during design and architecture phases to establish reliability, scalability, and security requirements before code is written.
  • Maintain clear, comprehensive documentation for infrastructure architecture, operational procedures, and onboarding guides.
  • Push back constructively when proposed designs compromise reliability or operability, proposing alternatives rather than just raising concerns.

In addition to a great work environment, we provide excellent benefits (Medical/Dental/Vision, 401K, Tuition/Training Assistance, BrightHorizons Lifestyle Assistance, Wellness Program, etc.) and we're proud to be a Certified Great Place to Work! For more information about what it's like to work with us, please visit Kastle Careers.


Requirements
  • Experience: 4–6 years in an SRE, Platform Engineering, or Infrastructure Engineering role, with demonstrated ownership of production systems.
  • Cloud – Azure: Hands-on experience managing production infrastructure in Azure: AKS, Azure Container Registry, Azure Monitor, Cosmos DB, Key Vault, Azure Front Door, or equivalent services. AWS/GCP backgrounds considered with clear willingness to operate in Azure.
  • Kubernetes: Deep operational experience with Kubernetes in production: resource management, network policies, RBAC, HPA/VPA, persistent volumes, and debugging live workload issues.
  • GitOps & Release Tooling: Experience with ArgoCD, Flux, or equivalent GitOps deployment tools. Familiarity with multi-stage progressive delivery and approval gate patterns is a strong plus.
  • Infrastructure as Code: Proven track record with Terraform, OpenTofu, or Pulumi in a production GitOps context – not just writing HCL, but maintaining drift-free state and managing state backends safely.
  • Observability: Hands-on configuration of Prometheus, Grafana, OpenTelemetry, and/or ELK/OpenSearch. Ability to go from symptom to instrumentation to dashboard without hand-holding.
  • Programming & Scripting: Proficiency in Python or Go for automation and tooling; strong Bash scripting. Ability to read and reason about application code when debugging production issues. Proficiency in C# and SQL for reviewing deliverables and participating in triage.
  • Linux & Networking: Solid understanding of Linux internals, TCP/IP, DNS, TLS, and HTTP semantics. Comfortable debugging at the network and OS layer.

Qualifications
  • Experience with Crossplane or other Kubernetes-native infrastructure operators.
  • Familiarity with feature flag platforms (LaunchDarkly, Flagsmith, or similar) and gradual rollout strategies.
  • Background in IoT, physical security, access control, or other latency-sensitive, event-driven domains.
  • Comfort with async collaboration across distributed time zones (US + India team structure).
  • Experience with AI-assisted development tooling and an appetite to incorporate it into engineering workflows.
  • Knowledge of CMMC 2.0, SOC 2, or FedRAMP compliance postures as they apply to infrastructure and access control.

Equal Opportunity Statement

At Kastle, we believe that diversity makes us stronger - at work and in the world. Kastle Systems International, LLC is an Equal Opportunity / Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, age, protected veteran status, marital status, pregnancy or any other basis protected by applicable federal or state laws.
