AI Reliability Engineer Jobs in Santa Rosa, CA (NOW HIRING)

Block

Senior Site Reliability Engineer

Bodega Bay, CA · On-site

$67.75 - $90/hr

Join us. The Role As a member of the SRE team, you will proactively and reactively improve the ... You will leverage and continuously improve AI-driven tooling and automation to enhance ...

Diamond Foundry

R&D Test Engineer, Thermal/Mechanical/Reliability

Bodega Bay, CA · On-site

$125K - $155K/yr

We are seeking an on-site R&D Test Engineer to join our AI Chips team. This role will work closely with the thermal/mechanical/reliability lead to execute thermal testing, coordinate reliability ...

Diamond Foundry

R&D Test Engineer, Thermal/Mechanical/Reliability

Bodega Bay, CA

$105K - $140K/yr

Fabrion

DevOps Engineer (Founding Team)

Bodega Bay, CA · On-site

$62.50 - $85.75/hr

About the Role We're building an AI-native, multi-tenant enterprise platform for complex domains in ... Obsessed with reliability, latency, uptime, and repeatability * Security-aware and compliance ...

Blue Origin

Principal Engineer - Product Integrity

Bodega Bay, CA

As a Principal Engineer for Product Integrity at Blue Origin, you will define the strategic ... Track record of establishing industry-leading reliability data pipelines, analytics, and AI/ML ...

hireVouch

Founding Engineer - AI Agents

Bodega Bay, CA

Own hard reliability and latency problems in messy real-world systems * Help turn experimental AI ... Owning engineering relationships & feedback loops with customers * Product features that span ...

WEX

Senior Backend Engineer - AI Platform

Bodega Bay, CA · On-site +1

$145K - $191K/yr

... AI tools engineered for advanced reasoning, strategic planning, and the orchestration of intricate ... reliability. * Conduct objective and comparative analyses of competing technologies to advise the ...

eNett

Senior Backend Engineer - AI Platform

Bodega Bay, CA · On-site +1

$145K - $191K/yr

JazzX AI

Staff Software Engineer

Bodega Bay, CA

About JazzX AI: Vision: Enterprises operating on institutional intelligence--governed, self ... You'll be responsible for ensuring the platform's scalability, reliability, security and ...

WEX

Staff Backend Engineer (AI & Agentic Systems)

Bodega Bay, CA · On-site +1

eNett

Staff Backend Engineer (AI & Agentic Systems)

Bodega Bay, CA · On-site +1

HCLTech

Solution Principal

Santa Rosa, CA · On-site

... E B) Deals Management Experience: * Large Deals Management experience * Exposure to Tech vertical * Multi-service line deals management experience * Experience in Pricing Models C) AI experience:

HCLTech

Solution Principal

Sonoma, CA · On-site

eAI

Head of AI & Data

Bodega Bay, CA

$135K - $163K/yr

Build AI agent systems using LLMs and structured financial data * Define model architecture ... reliability, and explainability Data Infrastructure & Engineering * Build and scale data pipelines ...

Block

Principal Engineer, AI Systems

Bodega Bay, CA · On-site

... reliability, safety, and performance at scale * Design agent orchestration systems including ... engineering, retrieval-augmented generation, and model optimization * A track record of taking AI ...

Duckcreek

Vice President, Software Engineering - Agentic AI and Data Engineering Platform

Bodega Bay, CA · On-site

The Vice President, Software Engineering - Agentic AI and Data Engineering Platform is a senior ... reliability, security, and performance. A key responsibility for the role is accelerating the ...

Outline Systems

Vice President, Software Engineering - Agentic AI and Data Engineering Platform

Bodega Bay, CA · On-site

eAI

Head of AI & Data (Remote)

Bodega Bay, CA · Remote

Agility Robotics

Director of AI

Bodega Bay, CA · On-site +1

$257K - $402K/yr

Deliver production-ready AI models that meet performance, latency, and reliability requirements for ... D.) in Computer Science, Robotics, Electrical Engineering, or a related field. * Minimum of 8 years ...

Apple

Sensing Hardware - Mechanical Modeling and Simulation (FEA) Engineering Director

Bodega Bay, CA

$311K - $496K/yr

... and reliability of Apple's most advanced products. As the director of this team, you will guide a ... Applied AI/ML for Simulation: Guide the application of AI/ML to enhance simulation workflows, using ...

Showing results 1-20

AI Reliability Engineer information

Santa Rosa, CA salary details (chart values): $66.7K, $129K, $154.2K

How much do AI Reliability Engineer jobs pay per year?

As of Jul 6, 2026, the average yearly pay for an AI Reliability Engineer in Santa Rosa, CA is $128,983.00, according to ZipRecruiter salary data. Most workers in this role earn between $112,100.00 and $141,000.00 per year, depending on experience, location, and employer.

What are the key skills and qualifications needed to thrive as an AI Reliability Engineer, and why are they important?

To thrive as an AI Reliability Engineer, you need a solid background in computer science or engineering, expertise in AI/ML concepts, and experience with software testing and reliability methodologies. Familiarity with tools like TensorFlow, PyTorch, CI/CD pipelines, and reliability testing frameworks, along with certifications in cloud platforms (e.g., AWS Certified Machine Learning), is highly valuable. Analytical thinking, problem-solving abilities, and strong collaboration skills set top performers apart in this role. These skills ensure robust, dependable AI systems that meet performance standards and maintain trust in critical applications.

What is the difference between an AI Reliability Engineer and a Data Scientist?

Aspect	AI Reliability Engineer	Data Scientist
Required Credentials	Bachelor's or master's in CS, engineering, or related; certifications in AI/ML	Bachelor's or master's in CS, statistics, or related; certifications in data analysis or ML
Work Environment	Tech companies, AI-focused teams, engineering departments	Research labs, tech firms, analytics teams
Employer & Industry Usage	AI product development, machine learning systems, reliability testing	Data analysis, predictive modeling, business insights

While both roles involve AI and ML, AI Reliability Engineers focus on ensuring AI system robustness and uptime, whereas Data Scientists analyze data to generate insights and build models. The roles often collaborate but serve different primary functions within AI projects.

What are AI Reliability Engineers?

AI Reliability Engineers are professionals responsible for ensuring that artificial intelligence systems function reliably, safely, and effectively over time. They work on monitoring AI models in production, identifying and mitigating potential failures, and improving the robustness of AI systems. Their tasks often include testing, validation, performance monitoring, and implementing best practices for maintaining AI infrastructure. By focusing on reliability, they help organizations deploy AI solutions that are dependable and trustworthy in real-world environments.

What are some common challenges AI Reliability Engineers face when ensuring model robustness in production environments?

AI Reliability Engineers often encounter challenges such as monitoring AI model performance for drift or unexpected behavior, managing data quality issues, and implementing automated alerting systems for anomalies. In production, it's crucial to ensure that AI models operate consistently and remain reliable under varying conditions and data inputs. Collaborating closely with data scientists, software engineers, and DevOps teams is essential to address these challenges and to continuously improve model reliability and uptime.
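As a rough illustration of the drift monitoring and automated alerting mentioned above, the sketch below computes a Population Stability Index (PSI) between a baseline sample and a recent production sample of one numeric feature. PSI is only one of several possible drift measures and is swapped in here for illustration. The function names `psi` and `drift_alert`, the 10-bin layout, and the 0.2 threshold (a common rule of thumb, not a standard) are all assumptions, not something the listing specifies.

```python
import math
from typing import Sequence, Tuple


def psi(baseline: Sequence[float], current: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(baseline), max(baseline)
    # Equal-width bins over the baseline range; fall back to 1.0 if it is constant.
    width = (hi - lo) / bins or 1.0

    def fractions(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            # Clamp out-of-range values into the first or last bin.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor keeps log() defined when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    p, q = fractions(baseline), fractions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def drift_alert(baseline: Sequence[float], current: Sequence[float],
                threshold: float = 0.2) -> Tuple[float, bool]:
    """Return (score, fired). The threshold is a hypothetical default."""
    score = psi(baseline, current)
    return score, score > threshold


if __name__ == "__main__":
    training = [i / 100 for i in range(1000)]   # stand-in for training data
    shifted = [x + 5 for x in training]         # simulated shifted production data
    print(drift_alert(training, training))
    print(drift_alert(training, shifted))
```

In practice the boolean would feed whatever alerting system a team already uses; the bin count and threshold generally need tuning per feature.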

What are popular job titles related to AI Reliability Engineer jobs in Santa Rosa, CA?

For AI Reliability Engineer jobs in Santa Rosa, CA, the most frequently searched job titles are:

Site Reliability Engineer Remote

What job categories do people searching AI Reliability Engineer jobs in Santa Rosa, CA look for?

The top searched job categories for AI Reliability Engineer jobs in Santa Rosa, CA are:

Mechanical Integration Engineer

Ai Reliability Engineer jobs near you

Senior Site Reliability Engineer

Block

Bodega Bay, CA • On-site

$67.75 - $90/hr

Other

Medical, Dental, Vision, Life, Retirement, PTO

Posted 27 days ago

Block rating

7.9

Based on 16 frontline employees who took The Breakroom Quiz

9th of 20 rated payment service providers

Job description

Block builds simple, powerful tools that make progress towards an economy that's truly open to all.

Each of our brands unlocks different aspects of the economy for more people. Square makes commerce and financial services accessible to sellers. Cash App is the easy way to spend, send, and store money. Afterpay is transforming the way customers manage their spending over time. TIDAL is a music platform that empowers artists to thrive as entrepreneurs. Bitkey is a simple self-custody wallet built for bitcoin. Proto is a suite of bitcoin mining products and services. Together, we're helping build a financial system that is open to everyone. Join us.

The Role

As a member of the SRE team, you will proactively and reactively improve the reliability of Block's platform and critical infrastructure. You are metrics-driven, systems-oriented, and focused on building distributed platforms that enable safe, scalable product development.

You will leverage and continuously improve AI-driven tooling and automation to enhance observability, accelerate incident detection and response, and reduce operational toil. This includes applying AI to incident analysis, alert tuning, and operational workflows.

You will participate in primary platform oncall (12 hours per day, one week every few weeks, depending on team size), supporting Block's most critical (Tier 0) services. In this role, you will lead incident command, coordinate mitigation, and drive effective escalation during high-severity events.

You Will

Build and extend platforms to improve system reliability
Work on team goals that encompass reliability for the entire company
Standardize reliability tools across multiple platforms and organizations
Triage, coordinate, and lead stabilization of sev 0-1 incidents
Serve as primary oncall, maintaining structured escalation paths and exercising leadership escalation
Drive platform-wide reliability improvements, shared operational tooling, and deploy-safety patterns
Use AI-driven systems to improve signal detection, reduce noise, and accelerate root cause analysis
Design and implement safe deployment patterns (progressive delivery, automated rollback, guardrails)

You Have

Drive to root cause systems with many moving parts and take the necessary steps to fix them
Demonstrated technical initiative and leadership on previous projects, especially those with a backend/platform focus
Familiarity with AI-driven tooling for observability, incident analysis, or automation
A mindset that naturally reaches for AI to accelerate problem-solving and reduce toil
Experience running production oncall for high-availability systems
Strong incident management skills - structured triage, mitigation under pressure, blameless postmortems
Fluency with CI/CD pipelines, progressive rollout strategies, and rollback automation
Monitoring & observability expertise - building/tuning alerts for uptime, error rates, latency regression, and resource exhaustion
Ability to create and maintain evidence-based maturity assessments using trailing 90-day data windows.
Comfort with vendor/dependency management - maintaining validated escalation contacts reachable within 5 minutes.
Boundless curiosity, autonomy, and a strong sense of accountability
A strong desire to perform and grow as an engineer
5+ years of software development experience

Technologies We Use and Teach

Kotlin, Modern Java (11+)
HTTP, JSON, gRPC, and Protocol Buffers
MySQL / Vitess / DynamoDB
Event driven architectures
DataDog
LaunchDarkly
Terraform, Kubernetes, Istio/Envoy
Amazon Web Services

This program shifts Block from reactive incident handling to repeatable, system-wide reliability gains - fewer customer-visible incidents, faster response, higher product velocity, and lower burnout across the organization.

We're working to build a more inclusive economy where our customers have equal access to opportunity, and we strive to live by these same values in building our workplace. Block is a proud equal opportunity employer. We work hard to evaluate all employees and job applicants consistently, based solely on the core competencies required of the role at hand, and without regard to any legally protected class. We believe in being fair, and are committed to an inclusive interview experience, including providing reasonable accommodations to disabled applicants throughout the recruitment process. We encourage applicants to share any needed accommodations with their recruiter, who will treat these requests as confidentially as possible. Want to learn more about what we're doing to build an inclusive workplace? Check out our Inclusion & Diversity page.

Full-time employee benefits include the following:

Healthcare coverage (Medical, Vision and Dental insurance)
Health Savings Account and Flexible Spending Account
Retirement Plans including company match
Employee Stock Purchase Program
Wellness programs, including access to mental health, 1:1 financial planners, and a monthly wellness allowance
Paid parental and caregiving leave
Paid time off (including 12 paid holidays)
Paid sick leave (for non-exempt employees, 1 hour per 26 hours worked, up to a maximum of 80 hours per calendar year to the extent legally permissible; for exempt employees, covered by our Flexible Time Off policy)
Learning and Development resources
Paid Life insurance, AD&D, and disability benefits
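The sick-leave accrual rate in the list above can be worked through with a small calculation. This is a minimal sketch that assumes the 80-hour figure caps accrual within a calendar year (the posting does not say whether the cap applies to accrual or to usage); the function name `accrued_sick_hours` is hypothetical.

```python
def accrued_sick_hours(hours_worked: float,
                       hours_per_accrued_hour: float = 26.0,
                       annual_cap: float = 80.0) -> float:
    """Sick hours accrued in one calendar year for a non-exempt employee.

    Assumes 1 hour accrues per 26 hours worked and that the 80-hour
    maximum limits accrual (an interpretation, not stated in the posting).
    """
    return min(hours_worked / hours_per_accrued_hour, annual_cap)


if __name__ == "__main__":
    # A 40-hour week for 52 weeks is 2,080 hours worked.
    print(accrued_sick_hours(40 * 52))
    # Half a year at the same schedule.
    print(accrued_sick_hours(40 * 26))
```

Under these assumptions, a full-time year of 2,080 hours accrues 2,080 / 26 = 80 hours, which is exactly the stated maximum.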

These benefits are further detailed in Block's policies. This role is also eligible to participate in Block's equity plan subject to the terms of the applicable plans and policies, and may be eligible for a sign-on bonus. Sales roles may be eligible to participate in a commission plan subject to the terms of the applicable plans and policies. Pay and benefits are subject to change at any time, consistent with the terms of any applicable compensation or benefit plans.

Block takes a market-based approach to pay, and pay may vary depending on your location. U.S. locations are categorized into one of four zones based on a cost of labor index for that geographic area. The successful candidate's starting pay will be determined based on job-related skills, experience, qualifications, work location, and market conditions. These ranges may be modified in the future.

Zone A: USD $189,000 - USD $283,600

Zone B: USD $179,600 - USD $269,400

Zone C: USD $170,100 - USD $255,100

Zone D: USD $160,700 - USD $241,100

What Block employees say

Pay

Only some people get paid breaks

Most people get paid when they’re sick

The job rarely spills into unpaid time

Hours and flexibility

Less than 4 weeks’ notice of work schedule

Most people don’t worry about their hours

Most people can’t choose their shifts

Workplace

Most people feel treated with respect

Most people get breaks without interruption

Most people are stressed out
