AI Reliability Engineer Jobs in Atlanta, GA (NOW HIRING)

R2 Technologies Corporation

AI Reliability Engineer (AI SRE) - Q126

Alpharetta, GA · On-site

$55.75 - $74/hr

AI Reliability Engineer (AI SRE) Company: R2 Technologies Location: Alpharetta, GA (Hybrid / Remote Options Available) Employment Type: Full-Time / Contractual About R2 Technologies: R2 Technologies ...

Morgan Stanley

Site Reliability Engineer (SRE) - AI Platform & Cloud

Alpharetta, GA

$55.75 - $74/hr

As an SRE on the AI platform, you will bring deep operations, automation, and systems engineering skills to enable our models and pipelines to run reliably at scale, while balancing cost, security ...

Morgan Stanley

Site Reliability Engineer (SRE) - AI Platform & Cloud

Alpharetta, GA · On-site

$55.75 - $74/hr

Coreforce

PRINCIPAL SITE RELIABILITY ENGINEER (SRE)

Atlanta, GA · On-site

$180K - $210K/yr

Automation & AI-Enabled Efficiency * Bachelor's Degree in Computer Science or Engineering. * 5+ years of experience in Site Reliability Engineering, DevOps, or Platform Engineering * Strong ...

Coreforce

PRINCIPAL SITE RELIABILITY ENGINEER (SRE)

Atlanta, GA · Hybrid

$54.75 - $72.75/hr

FAVARH

Senior Site Reliability Engineer (SRE)

Atlanta, GA

$99.09K - $123.86K/yr

This role is ideal for someone who thinks like a developer, understands AI infrastructure, and is passionate about reliability, observability, and operational excellence. Key Responsibilities

Voya Financial, Inc.

Senior Site Reliability Engineer (SRE)

Atlanta, GA · On-site

$99.09K - $123.86K/yr

This role is ideal for someone who thinks like a developer, understands AI infrastructure, and is passionate about reliability, observability, and operational excellence. Key Responsibilities

anduril

Senior Reliability Engineer

Atlanta, GA

Senior Reliability Engineer Atlanta, Georgia, United States Anduril Industries is a defense ... Anduril's family of systems is powered by Lattice OS, an AI-powered operating system that turns ...

New

Central Business Solutions, Inc

Senior Databricks AI Platform SRE

Alpharetta, GA · On-site

$55.75 - $74/hr

Senior Databricks AI Platform SRE Description: We are looking for a Senior Databricks AI Platform SRE to join our Platform SRE team. This role will be critical in designing, building, and optimizing ...

Anduril Industries

Senior Reliability Engineer

Atlanta, GA · On-site

Anduril's family of systems is powered by Lattice OS, an AI-powered operating system that turns ... ABOUT THE TEAM The Reliability Engineering team partners across Anduril's engineering ...

Anduril Industries

Senior Reliability Engineer

Atlanta, GA

Anduril's family of systems is powered by Lattice OS, an AI-powered operating system that turns ... ABOUT THE TEAM The Reliability Engineering team partners across Anduril's engineering ...

Central Business Solutions

Senior Databricks AI Platform SRE

Alpharetta, GA · On-site

$55.75 - $74/hr

Gradle Technologies

Staff Site Reliability Engineer

Atlanta, GA · Remote

$54.75 - $72.75/hr

Who We Are AI is changing how software gets built. Code production is becoming a commodity. The ... As a Lead SRE, you'll be a technical and operational leader for reliability across Develocity. You ...

Patientco

Site Reliability Engineer II

Atlanta, GA

$54.75 - $72.75/hr

As an SRE Specialist, you'll work closely with engineering, product, and data teams to ensure our ... Active use of artificial intelligence (AI) tools and techniques to enhance performance, drive ...

Crew Career Center

Senior SRE

Atlanta, GA

$54.75 - $72.75/hr

Crew Career Center

Site Reliability Engineer II

Atlanta, GA

$54.75 - $72.75/hr

Patientco

Senior SRE

Atlanta, GA · On-site

$54.75 - $72.75/hr

Waystar

Senior SRE

Atlanta, GA · On-site

$54.75 - $72.75/hr

R2 Technologies Corporation

Full Stack SRE (Kubernetes & Observability) - Q125

Alpharetta, GA · On-site

$55.75 - $74/hr

... Intelligence (AI), Machine Learning (ML), software development, project management, SAP, and ... Full Stack SRE (Kubernetes & Observability) Location: Alpharetta, GA (willing to travel to client ...

Exatech Inc

SRE Architect [AIOps & Dynatrace] - Atlanta, GA [hybrid]

Atlanta, GA · Hybrid

$54.75 - $72.75/hr

AI in SRE * Partner with application/domain teams to strengthen their SRE maturity and operational readiness. * Write automation, scripts, and REST APIs to integrate with external systems and ...

AI Reliability Engineer information

Atlanta, GA salary chart values: $58.7K, $113.4K, $135.6K

How much do AI Reliability Engineer jobs pay per year?

As of May 28, 2026, the average yearly pay for an AI Reliability Engineer in Atlanta, GA is $113,449, according to ZipRecruiter salary data. Most workers in this role earn between $98,600 and $124,100 per year, depending on experience, location, and employer.

What are the key skills and qualifications needed to thrive as an AI Reliability Engineer, and why are they important?

To thrive as an AI Reliability Engineer, you need a solid background in computer science or engineering, expertise in AI/ML concepts, and experience with software testing and reliability methodologies. Familiarity with tools like TensorFlow, PyTorch, CI/CD pipelines, and reliability testing frameworks, along with certifications in cloud platforms (e.g., AWS Certified Machine Learning), is highly valuable. Analytical thinking, problem-solving abilities, and strong collaboration skills set top performers apart in this role. These skills ensure robust, dependable AI systems that meet performance standards and maintain trust in critical applications.

What are some common challenges AI Reliability Engineers face when ensuring model robustness in production environments?

AI Reliability Engineers often encounter challenges such as monitoring AI model performance for drift or unexpected behavior, managing data quality issues, and implementing automated alerting systems for anomalies. In production, it's crucial to ensure that AI models operate consistently and remain reliable under varying conditions and data inputs. Collaborating closely with data scientists, software engineers, and DevOps teams is essential to address these challenges and to continuously improve model reliability and uptime.
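As a rough illustration of the drift monitoring and automated alerting mentioned above, here is a minimal sketch that scores how far a current sample of a numeric feature has moved from a baseline sample using the Population Stability Index (PSI). The function names, the 10-bin layout, and the 0.2 alert threshold are assumptions chosen for illustration. They are not taken from any listing on this page, and a real system would likely use a dedicated monitoring tool.

```python
import math


def psi(baseline, current, bins=10):
    """Population Stability Index between two numeric samples.

    Bin edges come from the baseline sample (illustrative choice).
    """
    lo, hi = min(baseline), max(baseline)
    # Fall back to a width of 1.0 if every baseline value is identical.
    width = (hi - lo) / bins or 1.0

    def fractions(values):
        counts = [0] * bins
        for v in values:
            # Clamp out-of-range values into the first or last bin.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    b, c = fractions(baseline), fractions(current)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))


# Assumed rule-of-thumb threshold; tune it for the feature being watched.
ALERT_THRESHOLD = 0.2


def should_alert(baseline, current):
    """Return True when the drift score exceeds the assumed threshold."""
    return psi(baseline, current) > ALERT_THRESHOLD
```

In this sketch, comparing a sample against itself yields a score of zero, while a sample concentrated in one end of the baseline range scores well above the threshold and would trigger an alert.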

What are AI Reliability Engineers?

AI Reliability Engineers are professionals responsible for ensuring that artificial intelligence systems function reliably, safely, and effectively over time. They work on monitoring AI models in production, identifying and mitigating potential failures, and improving the robustness of AI systems. Their tasks often include testing, validation, performance monitoring, and implementing best practices for maintaining AI infrastructure. By focusing on reliability, they help organizations deploy AI solutions that are dependable and trustworthy in real-world environments.

What is a $900,000 AI job?

A $900,000 AI job typically refers to highly senior roles such as AI executives, chief AI officers, or lead AI engineers at top technology companies, often involving advanced expertise in machine learning, deep learning, and AI strategy. These positions usually require extensive experience, specialized skills, and may include performance-based bonuses or stock options that contribute to the high total compensation.

What is the difference between an AI Reliability Engineer and a Data Scientist?

Aspect	AI Reliability Engineer	Data Scientist
Required Credentials	Bachelor's or master's in CS, engineering, or related; certifications in AI/ML	Bachelor's or master's in CS, statistics, or related; certifications in data analysis or ML
Work Environment	Tech companies, AI-focused teams, engineering departments	Research labs, tech firms, analytics teams
Employer & Industry Usage	AI product development, machine learning systems, reliability testing	Data analysis, predictive modeling, business insights

While both roles involve AI and ML, AI Reliability Engineers focus on ensuring AI system robustness and uptime, whereas Data Scientists analyze data to generate insights and models. The roles often collaborate but serve different primary functions within AI projects.

AI Reliability Engineer jobs near you

AI Reliability Engineer (AI SRE) - Q126

R2 Technologies Corporation

Alpharetta, GA • On-site

$55.75 - $74/hr

Full-time

Posted 20 days ago

Job description

Overview:
Job Title: AI Reliability Engineer (AI SRE)
Company: R2 Technologies
Location: Alpharetta, GA (Hybrid / Remote Options Available)
Employment Type: Full-Time / Contractual
About R2 Technologies: R2 Technologies is a Certified Minority Business Enterprise (MBE) headquartered in Alpharetta, GA. With over two decades of experience across global markets, we have built a reputation as a trusted partner for IT staffing excellence and cutting-edge digital product innovation. We are driven by innovation and operate on a simple philosophy: "We deliver what we promise, and we promise only what we can deliver." Beyond providing top-tier IT talent, R2 builds cutting-edge proprietary solutions like SmartEnt, an Enterprise AI & IoT Intelligence Platform utilizing advanced NLP and AI technologies. By partnering closely with our clients, we deliver technology-driven outcomes that are realistic, measurable, and impactful.
Job Summary: As enterprise AI shifts from prototypes to mission-critical production systems, we need engineers who can guarantee stability. R2 Technologies is seeking an AI Reliability Engineer to merge traditional Site Reliability Engineering (SRE) with LLM operations. You will be the guardian of our production AI, responsible for monitoring foundation models for performance drift, optimizing token usage and GPU costs, and ensuring high-availability inference for our SmartEnt platform.
Key Responsibilities:
* Deploy, scale, and manage LLM inference servers (e.g., vLLM, Ray Serve, NVIDIA Triton) on Kubernetes across multi-cloud environments.
* Implement comprehensive observability, logging, and tracing for complex agentic workflows using platforms like LangSmith, MLflow, or Weights & Biases (Weave).
* Monitor production models for data drift, hallucination rates, and latency spikes, implementing automated rollback or model-routing strategies when necessary.
* Optimize cloud infrastructure to balance GPU utilization, inference speed, and token cost (FinOps for AI).
* Automate infrastructure provisioning (IaC) and CI/CD pipelines specifically tailored for machine learning models and fine-tuned adapters.
* Actively utilize AI-assisted coding tools (GitHub Copilot, Cursor) to automate infrastructure management and incident response scripting.

Qualifications:
* Up to 3 years of hands-on experience in SRE, DevOps, MLOps, or Cloud Infrastructure.
* Strong proficiency in containerization and orchestration (Docker, Kubernetes, Helm).
* Experience configuring and scaling GPU-backed workloads in cloud environments (AWS, Azure, or GCP).
* Familiarity with LLM observability tools and trace-level debugging of AI applications.
* Proven experience or strong familiarity working alongside AI coding assistants to enhance productivity.
* Scripting skills in Python and Bash, with a strong focus on system reliability, automation, and cost-optimization.

Skills:
Reliability Engineering, Kubernetes
