AI Reliability Engineer Jobs in Santa Rosa, CA (NOW HIRING)

Senior Site Reliability Engineer

Bodega Bay, CA · On-site

$67.75 - $90/hr

Join us. The Role As a member of the SRE team, you will proactively and reactively improve the ... You will leverage and continuously improve AI-driven tooling and automation to enhance ...

DevOps Engineer (Founding Team)

Bodega Bay, CA · On-site

$62.50 - $85.75/hr

About the Role We're building an AI-native, multi-tenant enterprise platform for complex domains in ... Obsessed with reliability, latency, uptime, and repeatability * Security-aware and compliance ...

Own hard reliability and latency problems in messy real-world systems * Help turn experimental AI ... Owning engineering relationships & feedback loops with customers * Product features that span ...

About JazzX AI: Vision: Enterprises operating on institutional intelligence--governed, self ... You'll be responsible for ensuring the platform's scalability, reliability, security and ...

Head of AI & Data

Bodega Bay, CA

$135K - $163K/yr

Build AI agent systems using LLMs and structured financial data * Define model architecture ... reliability, and explainability Data Infrastructure & Engineering * Build and scale data pipelines ...

Data Engineer

Santa Rosa, CA · On-site

$150K - $175K/yr

Data Engineer Location: San Francisco, CA or New York, NY Work Model: Onsite Compensation: $150,000 ... AI-enabled applications * Improve data quality, consistency, and reliability across business ...

Data Engineer

Sonoma, CA · On-site

$150K - $175K/yr

Data Engineer Location: San Francisco, CA or New York, NY Work Model: Onsite Compensation: $150,000 ... AI-enabled applications * Improve data quality, consistency, and reliability across business ...

You will partner with product, engineering, design, and GTM to turn priorities into shipped ... Experience shipping AI products, agent workflows, or tools that require reliability and evaluation

Director of AI

Bodega Bay, CA · On-site +1

$257K - $402K/yr

Deliver production-ready AI models that meet performance, latency, and reliability requirements for ... D.) in Computer Science, Robotics, Electrical Engineering, or a related field. * Minimum of 8 years ...

AI Reliability Engineer information

Salary chart for Santa Rosa, CA (values shown: $66.7K, $129K, $154.2K)

How much do AI Reliability Engineer jobs pay per year?

As of Jun 11, 2026, the average yearly pay for an AI Reliability Engineer in Santa Rosa, CA is $128,983, according to ZipRecruiter salary data. Most workers in this role earn between $112,100 and $141,000 per year, depending on experience, location, and employer.

What are the key skills and qualifications needed to thrive as an AI Reliability Engineer, and why are they important?

To thrive as an AI Reliability Engineer, you need a solid background in computer science or engineering, expertise in AI/ML concepts, and experience with software testing and reliability methodologies. Familiarity with tools like TensorFlow, PyTorch, CI/CD pipelines, and reliability testing frameworks, along with certifications in cloud platforms (e.g., AWS Certified Machine Learning), is highly valuable. Analytical thinking, problem-solving abilities, and strong collaboration skills set top performers apart in this role. These skills ensure robust, dependable AI systems that meet performance standards and maintain trust in critical applications.

What is the difference between an AI Reliability Engineer and a Data Scientist?

Aspect | AI Reliability Engineer | Data Scientist
Required Credentials | Bachelor's or master's in CS, engineering, or related; certifications in AI/ML | Bachelor's or master's in CS, statistics, or related; certifications in data analysis or ML
Work Environment | Tech companies, AI-focused teams, engineering departments | Research labs, tech firms, analytics teams
Employer & Industry Usage | AI product development, machine learning systems, reliability testing | Data analysis, predictive modeling, business insights

While both roles involve AI and ML, AI Reliability Engineers focus on ensuring AI system robustness and uptime, whereas Data Scientists analyze data to generate insights and models. The roles often collaborate but serve different primary functions within AI projects.

What are AI Reliability Engineers?

AI Reliability Engineers are professionals responsible for ensuring that artificial intelligence systems function reliably, safely, and effectively over time. They work on monitoring AI models in production, identifying and mitigating potential failures, and improving the robustness of AI systems. Their tasks often include testing, validation, performance monitoring, and implementing best practices for maintaining AI infrastructure. By focusing on reliability, they help organizations deploy AI solutions that are dependable and trustworthy in real-world environments.

What are some common challenges AI Reliability Engineers face when ensuring model robustness in production environments?

AI Reliability Engineers often encounter challenges such as monitoring AI model performance for drift or unexpected behavior, managing data quality issues, and implementing automated alerting systems for anomalies. In production, it's crucial to ensure that AI models operate consistently and remain reliable under varying conditions and data inputs. Collaborating closely with data scientists, software engineers, and DevOps teams is essential to address these challenges and to continuously improve model reliability and uptime.

Infographic: AI Reliability Engineer job openings in Santa Rosa, CA as of June 2026. Employment types: 78% full-time, 22% contract. Work model: 90% in-person, 10% remote. Average salary: $128,983 per year, or $62 per hour.

Senior Site Reliability Engineer

Block

Bodega Bay, CA • On-site

$67.75 - $90/hr

Other

Medical, Dental, Vision, Life, Retirement, PTO

Posted yesterday


Block rating: 7.9 out of 10

Based on 16 frontline employees who took The Breakroom Quiz

9th of 17 rated payment service providers


Job description

Block builds simple, powerful tools that make progress towards an economy that's truly open to all.

Each of our brands unlocks different aspects of the economy for more people. Square makes commerce and financial services accessible to sellers. Cash App is the easy way to spend, send, and store money. Afterpay is transforming the way customers manage their spending over time. TIDAL is a music platform that empowers artists to thrive as entrepreneurs. Bitkey is a simple self-custody wallet built for bitcoin. Proto is a suite of bitcoin mining products and services. Together, we're helping build a financial system that is open to everyone. Join us.

The Role

As a member of the SRE team, you will proactively and reactively improve the reliability of Block's platform and critical infrastructure. You are metrics-driven, systems-oriented, and focused on building distributed platforms that enable safe, scalable product development.

You will leverage and continuously improve AI-driven tooling and automation to enhance observability, accelerate incident detection and response, and reduce operational toil. This includes applying AI to incident analysis, alert tuning, and operational workflows.

You will participate in primary platform oncall (12 hours per day, one week every few weeks, depending on team size), supporting Block's most critical (Tier 0) services. In this role, you will lead incident command, coordinate mitigation, and drive effective escalation during high-severity events.

You Will

  • Build and extend platforms to improve system reliability
  • Work on team goals that encompass reliability for the entire company
  • Standardize reliability tools across multiple platforms and organizations
  • Triage, coordinate, and lead stabilization of sev 0-1 incidents
  • Serve as primary oncall, maintaining structured escalation paths and exercising leadership escalation
  • Drive platform-wide reliability improvements, shared operational tooling, and deploy-safety patterns
  • Use AI-driven systems to improve signal detection, reduce noise, and accelerate root cause analysis
  • Design and implement safe deployment patterns (progressive delivery, automated rollback, guardrails)
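
As one illustration of the safe deployment patterns named in the last bullet, here is a minimal Python sketch of progressive delivery with automated rollback. Everything in it is hypothetical: the stage percentages, the 1% error-rate guardrail, and the `get_error_rate` callback are assumptions made for illustration and do not describe Block's actual tooling.

```python
from typing import Callable, Sequence, Tuple


def progressive_rollout(
    get_error_rate: Callable[[int], float],
    stages: Sequence[int] = (1, 10, 50, 100),
    max_error_rate: float = 0.01,
) -> Tuple[str, int]:
    """Walk traffic through rollout stages, rolling back if the guardrail trips.

    get_error_rate(percent) is a hypothetical hook that returns the error
    rate observed while `percent` of traffic runs the new version.
    Returns (outcome, percent of traffic left on the new version).
    """
    if not stages:
        raise ValueError("stages must not be empty")
    for percent in stages:
        observed = get_error_rate(percent)
        if observed > max_error_rate:
            # Guardrail tripped: automated rollback moves all traffic
            # back to the previous version.
            return ("rolled_back", 0)
    return ("completed", stages[-1])


# A healthy release reaches 100%; one that degrades at 10% is rolled back.
healthy = progressive_rollout(lambda percent: 0.001)
degraded = progressive_rollout(lambda percent: 0.05 if percent >= 10 else 0.001)
```

In a real system the callback would typically query a monitoring backend and each stage would wait for a soak period before advancing; both are left out here to keep the sketch short.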

You Have

  • The drive to root-cause problems in systems with many moving parts and take the necessary steps to fix them
  • Demonstrated technical initiative and leadership on previous projects, especially those with a backend/platform focus
  • Familiarity with AI-driven tooling for observability, incident analysis, or automation
  • A mindset that naturally reaches for AI to accelerate problem-solving and reduce toil
  • Experience running production oncall for high-availability systems
  • Strong incident management skills - structured triage, mitigation under pressure, blameless postmortems
  • Fluency with CI/CD pipelines, progressive rollout strategies, and rollback automation
  • Monitoring & observability expertise - building/tuning alerts for uptime, error rates, latency regression, and resource exhaustion
  • Ability to create and maintain evidence-based maturity assessments using trailing 90-day data windows
  • Comfort with vendor/dependency management - maintaining validated escalation contacts reachable within 5 minutes
  • Boundless curiosity, autonomy, and a strong sense of accountability
  • A strong desire to perform and grow as an engineer
  • 5+ years of software development experience

Technologies We Use and Teach

  • Kotlin, Modern Java (11+)
  • HTTP, JSON, gRPC, and Protocol Buffers
  • MySQL / Vitess / DynamoDB
  • Event driven architectures
  • DataDog
  • LaunchDarkly
  • Terraform, Kubernetes, Istio/Envoy
  • Amazon Web Services

This program shifts Block from reactive incident handling to repeatable, system-wide reliability gains - fewer customer-visible incidents, faster response, higher product velocity, and lower burnout across the organization.

We're working to build a more inclusive economy where our customers have equal access to opportunity, and we strive to live by these same values in building our workplace. Block is a proud equal opportunity employer. We work hard to evaluate all employees and job applicants consistently, based solely on the core competencies required of the role at hand, and without regard to any legally protected class. We believe in being fair, and are committed to an inclusive interview experience, including providing reasonable accommodations to disabled applicants throughout the recruitment process. We encourage applicants to share any needed accommodations with their recruiter, who will treat these requests as confidentially as possible. Want to learn more about what we're doing to build an inclusive workplace? Check out our Inclusion & Diversity page.

Full-time employee benefits include the following:

  • Healthcare coverage (Medical, Vision and Dental insurance)
  • Health Savings Account and Flexible Spending Account
  • Retirement Plans including company match
  • Employee Stock Purchase Program
  • Wellness programs, including access to mental health, 1:1 financial planners, and a monthly wellness allowance
  • Paid parental and caregiving leave
  • Paid time off (including 12 paid holidays)
  • Paid sick leave: for non-exempt employees, 1 hour per 26 hours worked (max 80 hours per calendar year, to the extent legally permissible); for exempt employees, covered by our Flexible Time Off policy
  • Learning and Development resources
  • Paid Life insurance, AD&D, and disability benefits
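
The sick leave accrual rule in the list above (1 hour per 26 hours worked, capped at 80 hours per calendar year for non-exempt employees) can be worked through with a short Python sketch. It assumes accrual is pro-rated for partial blocks of 26 hours and ignores carry-over and any jurisdiction-specific rules; neither detail is stated in the posting, and the function name is hypothetical.

```python
def accrued_sick_leave_hours(
    hours_worked: float,
    hours_per_accrued_hour: float = 26.0,  # from the posting: 1 hour per 26 worked
    annual_cap: float = 80.0,              # from the posting: max 80 hours per year
) -> float:
    """Estimate sick leave accrued in one calendar year (non-exempt employees).

    Assumption: accrual is pro-rated, so fractions of an hour can accrue.
    """
    if hours_worked < 0:
        raise ValueError("hours_worked must be non-negative")
    return min(hours_worked / hours_per_accrued_hour, annual_cap)


half_year = accrued_sick_leave_hours(1040)   # 1040 / 26 = 40.0 hours
full_year = accrued_sick_leave_hours(2080)   # 2080 / 26 = 80.0 hours, exactly at the cap
overtime = accrued_sick_leave_hours(2600)    # 100 hours uncapped, limited to 80.0
```

Under these assumptions, a 40-hour week for 52 weeks (2,080 hours) reaches the 80-hour cap exactly.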

These benefits are further detailed in Block's policies. This role is also eligible to participate in Block's equity plan subject to the terms of the applicable plans and policies, and may be eligible for a sign-on bonus. Sales roles may be eligible to participate in a commission plan subject to the terms of the applicable plans and policies. Pay and benefits are subject to change at any time, consistent with the terms of any applicable compensation or benefit plans.

Block takes a market-based approach to pay, and pay may vary depending on your location. U.S. locations are categorized into one of four zones based on a cost of labor index for that geographic area. The successful candidate's starting pay will be determined based on job-related skills, experience, qualifications, work location, and market conditions. These ranges may be modified in the future.

Zone A: USD $189,000 - USD $283,600

Zone B: USD $179,600 - USD $269,400

Zone C: USD $170,100 - USD $255,100

Zone D: USD $160,700 - USD $241,100

