Reliability Engineer Manager Jobs in Georgia (NOW HIRING)

Site Reliability Engineer II

Atlanta, GA · On-site

$54.75 - $72.75/hr

ABOUT THIS POSITION We are looking for a talented and driven Site Reliability Engineer (SRE) to support our engineering team, which manages the infrastructure and services that power our Waystar ...

Senior Site Reliability Engineer

Atlanta, GA · On-site

$54.75 - $72.75/hr

Who We Are QGenda is redefining healthcare workforce management everywhere care is delivered. We're ... About Your Role As a Senior Site Reliability Engineer, you will work with our Infrastructure and ...

$156K - $288K/yr

Take a leading role in incident management, including coordinating response efforts ... Reliability Engineering or similar DevOps roles focused on system reliability and incident ...

Staff Site Reliability Engineer

Atlanta, GA · Remote

$54.75 - $72.75/hr

As a Lead SRE, you'll be a technical and operational leader for reliability across Develocity. You ... Build and maintain comprehensive observability for all managed services, including logging, metrics ...

Senior Site Reliability Engineer

Atlanta, GA

$54.75 - $72.75/hr

Design, implement, and manage scalable systems that ensure high availability, fault tolerance, and ... Actively contribute to fostering an SRE culture within the organization by promoting observability ...

Site Reliability Engineer II

Atlanta, GA · On-site

$54.75 - $72.75/hr

Waystar is a healthcare technology company that simplifies healthcare payments and enhances revenue cycle management. They are seeking a Site Reliability Engineer II to support their engineering team ...

Manager, SRE Engineer - PxE ERM

Atlanta, GA · On-site

$54.75 - $72.75/hr

As a Manager, SRE Engineer, you will actively engage in your engineering craft, taking a hands-on approach to multiple high-visibility projects. Your expertise will be pivotal in delivering ...

ServiceNow SRE Engineering Manager

Atlanta, GA · On-site

$54.75 - $72.75/hr

As a Manager, ServiceNow SRE Engineer, you will actively engage in your engineering craft, taking a hands-on approach to multiple high-visibility projects. Your expertise will be pivotal in ...

Join us as a Site Reliability Engineer, Cyber Security Engineering to support and improve our ... Build, implement, iterate over CI/CD pipelines * Assist with the Management, Development, Design ...

Site Reliability Engineer (AWS)

Atlanta, GA · Hybrid

$54.75 - $72.75/hr

This position is under our CTO org to support SRE functions for innovation and growth for the ... Manage deployment pipelines and configuration management for consistent and reliable app ...

AWS Site Reliability Engineer

Atlanta, GA · On-site

$54.75 - $72.75/hr

Partner with business and technical product owners to set SLOs / SLIs / error budgets to manage ... Site Reliability Engineering principles of change management, monitoring, emergency response ...

Sr. Site Reliability Engineer

Atlanta, GA · On-site

$54.75 - $72.75/hr

Extensive/Strong AWS experience: experience in designing, deploying, and managing scalable/reliable ... Drive the adoption of SRE best practices and ensure adherence to reliability and performance ...

Site Reliability Engineer

Atlanta, GA · On-site

$54.75 - $72.75/hr

Position: SRE. Duration: 6 to 12 Months. Location: Atlanta or St. Louis - Day Onsite Job ... QA, Product Management, and Production Ops teams to make sure Product Releases on-time with ...

Define and manage SLOs, SLIs, and error budgets * Build and improve CI/CD pipelines and operational ... Mentor other engineers and help set SRE standards and best practices Required Qualifications * 5+ ...

Reliability Engineer Manager information

Georgia salary details: $52.5K, $111.4K, $133.7K

How much do reliability engineer manager jobs pay per year?

As of Jun 23, 2026, the average yearly pay for a reliability engineer manager in Georgia is $111,361, according to ZipRecruiter salary data. Most workers in this role earn between $97,400 and $121,900 per year, depending on experience, location, and employer.

How much do SRE managers make in the US?

Reliability Engineer Managers, often called SRE Managers, typically earn between $120,000 and $180,000 annually in the US, depending on experience, location, and company size. They oversee teams responsible for system reliability, incident response, and automation, often requiring skills in cloud platforms, monitoring tools, and leadership. Compensation may also include bonuses and stock options.

What does a Reliability Engineer Manager do?

A Reliability Engineer Manager oversees teams responsible for improving the reliability and performance of systems, machinery, or processes within an organization. They develop maintenance strategies, lead root cause analyses of failures, and implement best practices to minimize downtime and costs. Additionally, they collaborate with other departments to ensure that reliability goals align with business objectives and compliance standards. Their role is crucial in industries such as manufacturing, energy, and technology, where system uptime and safety are critical.

What engineering jobs pay $500,000?

Senior engineering roles such as Reliability Engineer Managers, Petroleum Engineers, and Software Engineering Directors can reach or exceed $500,000 annually, especially with experience, bonuses, and stock options. These positions often require advanced skills, leadership, and industry expertise, typically found in high-demand sectors like energy, technology, and aerospace.

What is the highest salary of SRE?

The highest salary for a Site Reliability Engineer (SRE) can exceed $200,000 annually in high-demand markets, especially for those with extensive experience, advanced skills in automation and cloud platforms, and leadership responsibilities. Senior SREs or SRE Managers often earn higher compensation, including bonuses and stock options, reflecting their expertise and strategic impact on system reliability.

What are some common challenges Reliability Engineer Managers face when balancing long-term reliability improvements with immediate operational demands?

Reliability Engineer Managers often need to prioritize urgent maintenance issues while also driving long-term reliability initiatives. Balancing these competing demands can be challenging, as immediate equipment failures may require quick fixes that temporarily interrupt ongoing improvement projects. Effective managers work closely with operations, maintenance, and engineering teams to communicate priorities, allocate resources, and implement sustainable solutions that address root causes rather than just symptoms. This role typically involves using data-driven decision-making and fostering a culture of proactive maintenance and continuous improvement.

What are the key skills and qualifications needed to thrive as a Reliability Engineer Manager, and why are they important?

To thrive as a Reliability Engineer Manager, you need a strong background in engineering principles, reliability analysis, and maintenance strategies, typically supported by a degree in engineering and experience in reliability roles. Familiarity with reliability-centered maintenance (RCM), failure mode and effects analysis (FMEA), and asset management software such as SAP or Maximo is common, along with certifications like Certified Reliability Engineer (CRE). Leadership, problem-solving, and effective communication are vital soft skills for managing teams and driving cross-functional initiatives. These competencies are crucial for minimizing downtime, optimizing equipment performance, and ensuring long-term operational efficiency.

What is the difference between Reliability Engineer Manager vs Reliability Engineer?

| Aspect | Reliability Engineer | Reliability Engineer Manager |
| --- | --- | --- |
| Required Credentials | Bachelor's in Engineering or related field; certifications like CRC, CRE | Same as Reliability Engineer, plus leadership experience |
| Work Environment | Design, analyze, and improve system reliability; often in teams | Oversees Reliability Engineers; manages projects and teams |
| Employer & Industry Usage | Manufacturing, aerospace, energy, automotive | Same industries, with added managerial responsibilities |
| Common Search & Comparison | Focuses on technical skills and hands-on reliability tasks | Focuses on leadership, team management, and strategic planning |

The main difference between a Reliability Engineer and a Reliability Engineer Manager lies in their responsibilities. The Reliability Engineer focuses on technical analysis and system improvements, while the Reliability Engineer Manager oversees teams, manages projects, and develops strategies to enhance reliability across the organization.

What is a reliability engineering manager?

A reliability engineering manager oversees teams responsible for ensuring the dependability and performance of equipment, systems, or products. They develop maintenance strategies, analyze failure data, and implement improvements to enhance system uptime, often using tools like FMEA and reliability modeling. Strong leadership, technical expertise, and knowledge of industry standards are essential for this role.
Infographic showing various Reliability Engineer Manager job openings in Georgia as of June 2026, with employment types broken down into 94% Full Time, 4% Part Time, and 2% Contract. Highlights an 87% Physical, 5% Hybrid, and 8% Remote job distribution, with an average salary of $111,361 per year, or $53.5 per hour.
Site Reliability Engineer (SRE) - AI Platform & Cloud


Morgan Stanley

Alpharetta, GA • On-site

$55.75 - $74/hr

Full-time

Posted 3 days ago


Morgan Stanley rating: 8.3 out of 10

Based on 147 frontline employees who took The Breakroom Quiz

39th of 138 rated financial services


Job description

In the Technology division, we leverage innovation to build the connections and capabilities that power our Firm, enabling our clients and colleagues to redefine markets and shape the future of our communities.
This is a Software Engineering position at Director level, which is part of the job family responsible for developing and maintaining software solutions that support business needs.
Since 1935, Morgan Stanley has been known as a global leader in financial services, always evolving and innovating to better serve our clients and our communities in more than 40 countries around the world.
Our mission is to develop a firmwide Artificial Intelligence (AI) Development Platform that aligns with the firm's Technology principles and drives efficiency and consistency, controls, security and strong governance and promotes innovation, enabling teams to build applications that leverage AI capabilities and accelerate the adoption of AI across our businesses.
This role is for an experienced and driven Site Reliability Engineer (SRE) to join our AI Platform team to help support, scale and harden the infrastructure that powers our AI/ML systems. You will collaborate closely with infrastructure engineering, cloud engineering, data engineering, and security teams to ensure availability, reliability, performance, and security of production AI workloads (training, inference, data pipelines) in a regulated, high-stakes financial environment.
As an SRE on the AI platform, you will bring deep operations, automation, and systems engineering skills to enable our models and pipelines to run reliably at scale, while balancing cost, security, and compliance constraints.
The ideal candidate will have strong hands-on experience supporting software platforms on any combination of the following: Kubernetes, cloud (AWS, Azure, and/or Google), API-based development, REST frameworks, data engineering, and large-scale API gateway environments. Knowledge of AI/ML and hands-on experience implementing solutions using Generative AI are also preferred. The candidate will have great communication skills, a team-based mentality, and a strong passion for using AI to increase productivity and to help generate new ideas for product and technical improvements.
What you'll do in the role:
  • Operate, monitor, and maintain the infrastructure supporting GenAI applications (training, inference, feature store, data ingestion, model serving)
  • Design and build automation for core platform capabilities, reducing manual toil
  • Develop and maintain infrastructure-as-code (IaC) for provisioning and managing compute, storage, network, GPU clusters, Kubernetes / container orchestration, etc.
  • Establish, monitor, and enforce SLOs/SLIs/SLAs, error budgets, alerting, and dashboards
  • Lead incident response, root cause analysis (RCA), postmortems, and systemic remediation
  • Perform capacity planning, scaling strategies, workload scheduling, and resource forecasting
  • Optimize cost vs. performance tradeoffs in large-scale compute environments
  • Harden systems for security, compliance, auditability, and data governance
  • Collaborate across teams (cloud engineers, data engineers, infrastructure, security) to ensure safe deployment, rollout, rollback, and integration of new systems
  • Define disaster recovery (DR) strategies, backup/restore practices, fault tolerance mechanisms
  • Maintain runbooks, operational playbooks, documentation, and training materials
  • Participate in on-call rotations and respond to production incidents 24/7 as needed
  • Continuously evaluate and integrate new tools, frameworks, or technologies to enhance platform reliability

What you'll bring to the role:
  • Bachelor's or Master's degree in Computer Science or related field, or equivalent job experience
  • 5 years of production experience in SRE / Infrastructure / ops for large-scale systems
  • Strong programming/scripting skills (Python, Go, Java, or equivalent)
  • Deep experience with containerization (Docker), orchestration (Kubernetes, etc.)
  • Infrastructure-as-code (Terraform, Helm, CloudFormation, Ansible, etc.)
  • Familiarity with GPU / AI compute clusters, high-performance data storage, and distributed architectures
  • Experience with monitoring / observability / logging / alerting tools (Prometheus, Grafana, ELK / EFK, Datadog, etc.)
  • Networking & systems engineering knowledge (TCP/IP, DNS, routing, load balancing, distributed storage)
  • Solid experience in capacity planning, performance tuning, scaling, and incident response
  • Demonstrated ability to lead RCAs, deploy fixes, and drive reliability improvements
  • Experience in regulated environments (financial services, compliance, audit, security) is a strong plus
  • Excellent communication, documentation, and cross-team collaboration skills
  • Proven track record of reducing operational toil via automation

Nice to have
  • Understanding of SRE techniques.
  • Proficiency with OpenTelemetry and observability tools including Grafana, Loki, Prometheus, and Cortex.
  • Good knowledge of microservice-based architecture and industry standards for both public and private cloud.
  • Knowledge of data pipeline technologies (Kafka, Spark, Flink, etc.)
  • Good knowledge of various DB engines (SQL, Redis, Kafka, Snowflake, etc.) for cloud app storage.
  • Experience working with Generative AI development, embeddings, and fine-tuning of Generative AI models.
  • Experience in high-performance computing (HPC), distributed GPU cluster scheduling (e.g. Slurm, Kubernetes GPU scheduling)
  • Understanding of ModelOps / MLOps / LLMOps.
  • Experience with chaos engineering, canary deployments, blue/green rollouts

We have a track record of innovation and a passion for unlocking new opportunities. We help our clients raise, manage and allocate capital by offering a wide range of investment banking, securities, wealth management and asset management services.
All that we do at Morgan Stanley is driven by our five core values: do the right thing, put clients first, lead with exceptional ideas, commit to diversity and inclusion, and give back. These aren't just beliefs, they guide the decisions we make every day, ensuring we do what's best for our clients, communities and more than 80,000 employees around the world. And at the core of our success are the people who drive it - relentless collaborators and creative thinkers who are fueled by diverse thinking and experiences.
Wherever you are in our 1,200 global offices, you'll have the opportunity to work alongside the best and the brightest in an environment where you are empowered to achieve your full potential. We are proud to support our employees and their families at every point along their work-life journey, offering some of the most attractive and comprehensive employee benefits and perks in the industry.
At Morgan Stanley Alpharetta, we support the Firm's global business and functions from Wealth Management and Institutional Securities to Technology and Operations, Finance and Human Resources. With the 2020 acquisition of E-TRADE, Morgan Stanley Alpharetta grew significantly and has expanded its role in our Wealth Management business, helping deliver a premier experience for the digitally inclined investor and trader. Learn more about our work and culture in Morgan Stanley Alpharetta.
Morgan Stanley's goal is to build and maintain a workforce that is diverse in experience and background but uniform in reflecting our standards of integrity and excellence. Consequently, our recruiting efforts reflect our desire to attract and retain the best and brightest from all talent pools. We want to be the first choice for prospective employees.
It is the policy of the Firm to ensure equal employment opportunity without discrimination or harassment on the basis of race, color, religion, creed, age, sex, sex stereotype, gender, gender identity or expression, transgender, sexual orientation, national origin, citizenship, disability, marital and civil partnership/union status, pregnancy, veteran or military service status, genetic information, or any other characteristic protected by law.
Morgan Stanley is an equal opportunity employer committed to diversifying its workforce (M/F/Disability/Vet).
WHAT YOU CAN EXPECT FROM MORGAN STANLEY:
At Morgan Stanley, we raise, manage and allocate capital for our clients - helping them reach their goals. We do it in a way that's differentiated - and we've done that for 90 years. Our values - putting clients first, doing the right thing, leading with exceptional ideas, committing to diversity and inclusion, and giving back - aren't just beliefs, they guide the decisions we make every day to do what's best for our clients, communities and more than 80,000 employees in 1,200 offices across 42 countries. At Morgan Stanley, you'll find an opportunity to work alongside the best and the brightest, in an environment where you are supported and empowered. Our teams are relentless collaborators and creative thinkers, fueled by their diverse backgrounds and experiences. We are proud to support our employees and their families at every point along their work-life journey, offering some of the most attractive and comprehensive employee benefits and perks in the industry. There's also ample opportunity to move about the business for those who show passion and grit in their work.
To learn more about our offices across the globe, please copy and paste https://www.morganstanley.com/about-us/global-offices into your browser.
Morgan Stanley is an equal opportunity employer committed to building and maintaining a workforce that is diverse in experience and background. Our recruiting efforts reflect our strong commitment to a culture of inclusion, where individuals are hired, developed, and advanced based on their skills and talents.
Our workforce reflects a broad cross-section of the global communities in which we operate, bringing a variety of backgrounds, talents, perspectives, and experiences.
For more information, please visit: https://www.morganstanley.com/people-opportunities/eeo.
