ML Inference Jobs in California (NOW HIRING)

Software Engineer - GenAI inference

... ML inference internals: attention, MLPs, recurrent modules, quantization, sparse operations, etc ... • Hands-on experience with CUDA, GPU programming, and key libraries (cuBLAS, cuDNN, NCCL, etc ...

Databricks

San Francisco, CA · On-site

Apple

Senior Full-Stack Engineer - Web Platforms for ML Inference

Cupertino, CA

$150K - $277K/yr

Discover and configure inference services Interact with ML pipelines and workflows Monitor usage, health, and operational signals Establish best practices around testing, maintainability ...

Advanced Micro Devices, Inc

Senior Product Manager - ROCm & AI/ML Inference Software

Santa Clara, CA · On-site

$179K/yr

... inference requirements and translates market signals into actionable product strategy. Open-Source Community Engagement * Serve as AMD's active presence in the open-source AI/ML community: monitor ...

Databricks

Staff Software Engineer - GenAI inference

San Francisco, CA

Deep understanding of ML inference internals: attention, MLPs, recurrent modules, quantization, sparse operations, etc. * Hands-on experience with CUDA, GPU programming, and key libraries (cuBLAS ...

Databricks

Staff Software Engineer - GenAI inference

San Francisco, CA · On-site

... of ML inference internals: attention, MLPs, recurrent modules, quantization, sparse operations, etc. • Hands-on experience with CUDA, GPU programming, and key libraries (cuBLAS, cuDNN, NCCL, etc ...

Amazon

Software Development Engineer, AI/ML, AWS Neuron, Model Inference

Cupertino, CA

The Inference Enablement and Acceleration team is at the forefront of running a wide range of models and supporting novel architecture alongside maximizing their performance for AWS's custom ML ...

Amazon

Software Development Engineer, AI/ML, AWS Neuron, Model Inference

Cupertino, CA

Amazon

Software Development Engineer AI/ML, Inference Serving, AWS Neuron

Cupertino, CA · On-site

$128K - $177K/yr

Amazon

Senior Software Development Engineer, AI/ML, AWS Neuron, Model Inference

Cupertino, CA · On-site

$128K - $177K/yr

Amazon

Software Development Engineer, AI/ML, AWS Neuron, Model Inference

Cupertino, CA

Advanced Micro Devices, Inc

Senior Product Manager - ROCm & AI/ML Inference Software

Santa Clara, CA · On-site

$149K - $197K/yr

New

Apple

ML Framework (MetalLM) Engineer, Graphics, Game and ML

Cupertino, CA

$150K - $277K/yr

Apple's Server ML Frameworks team in GPU, Graphics and Machine Learning works on enabling Apple Intelligence through high-performance, distributed inference of GenAI applications (such as LLMs) on ...

Amazon

Senior Software Development Engineer, AI/ML, AWS Neuron, Model Inference

Cupertino, CA

$128K - $177K/yr

Figure

Staff AI Inference and Acceleration Engineer

San Jose, CA · On-site

They are seeking a Staff AI Inference & Acceleration Engineer to own the on-board inference ... ML team to define model architecture constraints that are hardware-friendly from the outset. • ...

Cerebras Systems

Software Engineer, Inference Platform

Sunnyvale, CA · On-site

Our team primarily owns the orchestration layer that runs inference on our datacenter clusters which glues together the cloud components to the ML components. We are often the first team to face ...

Amazon

Senior Software Development Engineer, AI/ML, AWS Neuron, Model Inference

Cupertino, CA

$128K - $177K/yr

ML Inference information

What is a $900,000 AI job?

A $900,000 AI job typically refers to high-level roles in artificial intelligence, such as senior machine learning engineers or AI research directors, often involving advanced skills in deep learning, data modeling, and programming with tools like Python and TensorFlow. These positions usually require extensive experience, specialized knowledge, and may include leadership responsibilities or strategic decision-making.

What is ML inference?

ML inference refers to the process of using a trained machine learning model to make predictions or decisions based on new data. After a model has been trained on historical data, inference is the phase where that model is deployed and used in real-world applications, such as recognizing speech, detecting objects in images, or recommending products. The focus in ML inference is on speed, efficiency, and scalability to ensure quick predictions, often in real time. This process is critical for practical applications like mobile apps, web services, and embedded systems. Optimizing inference involves reducing latency, memory usage, and computational requirements.
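As a rough illustration of the answer above, the sketch below applies an already-trained model to new inputs. The weights, bias, and function names are hypothetical placeholders standing in for parameters a real training run would produce; nothing here comes from any specific employer's stack.

```python
import math

# Hypothetical parameters, standing in for values learned during training.
WEIGHTS = [0.8, -0.4]
BIAS = 0.1


def predict(features):
    """Return the positive-class probability for one new input."""
    score = sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS
    return 1.0 / (1.0 + math.exp(-score))  # logistic (sigmoid) function


def predict_batch(rows):
    """Run inference over several inputs; the weights are never updated."""
    return [predict(row) for row in rows]


if __name__ == "__main__":
    new_data = [[1.0, 2.0], [3.0, 0.5]]
    print(predict_batch(new_data))
```

The point of the sketch is the separation of phases: training fixes the parameters once, and inference only reads them, which is why inference work tends to focus on speed and memory rather than on learning.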

What is the difference between ML Inference and Data Scientist roles?

Aspect	ML Inference	Data Scientist
Required Credentials	Knowledge of machine learning models, programming skills	Degree in data science, statistics, or related fields
Work Environment	Deploying models in production, real-time data processing	Data analysis, model development, research
Industry Usage	AI product deployment, software companies	Research institutions, tech firms, consulting

ML Inference focuses on deploying trained models to make predictions on new data, often in real-time. Data Scientists develop and analyze models, working primarily in research and development. While both roles require understanding of machine learning, ML Inference emphasizes deployment and operationalization, whereas Data Scientists focus on model creation and analysis.

What engineer makes $500,000 a year?

Senior machine learning engineers with extensive experience, advanced skills in deep learning, and expertise in deploying large-scale models can earn salaries approaching or exceeding $500,000 annually, especially in high-cost-of-living areas or top tech companies. Compensation often includes base salary, bonuses, and stock options, reflecting their specialized knowledge and impact on product development.

Which 3 jobs will survive AI?

Jobs involving ML inference, such as data scientists, machine learning engineers, and AI system architects, are likely to persist as they require specialized expertise in developing, deploying, and maintaining AI models. These roles demand critical thinking, domain knowledge, and skills in programming and data analysis that are less easily automated. Continuous learning and staying updated with AI tools and frameworks are essential for these professions to remain relevant.

What are some common challenges faced by ML Inference Engineers when deploying models to production?

ML Inference Engineers often encounter challenges such as optimizing model latency and throughput to meet production requirements, ensuring compatibility with diverse hardware environments, and managing model versioning and updates without disrupting service. Additionally, balancing resource utilization and inference accuracy while monitoring real-time performance metrics is crucial. Collaboration with data scientists, DevOps, and software engineers is typically essential to streamline deployment and maintain robust, scalable inference pipelines.
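To make the latency and throughput trade-off mentioned above concrete, here is a minimal timing sketch. It assumes a non-empty input list, and `toy_model` is a hypothetical stand-in for a real model call; production teams would normally rely on dedicated profiling and tracing tools rather than a loop like this.

```python
import statistics
import time


def measure(predict_fn, inputs, batch_size):
    """Time predict_fn over inputs in batches; report latency and throughput."""
    latencies = []
    start = time.perf_counter()
    for i in range(0, len(inputs), batch_size):
        batch = inputs[i:i + batch_size]
        t0 = time.perf_counter()
        predict_fn(batch)
        latencies.append(time.perf_counter() - t0)  # per-batch latency
    total = time.perf_counter() - start
    return {
        "batches": len(latencies),
        "median_latency_s": statistics.median(latencies),
        "max_latency_s": max(latencies),
        "throughput_items_per_s": len(inputs) / total if total > 0 else float("inf"),
    }


def toy_model(batch):
    # Hypothetical stand-in for a real model; doubles each input value.
    return [x * 2 for x in batch]


if __name__ == "__main__":
    print(measure(toy_model, list(range(1000)), batch_size=32))
```

Larger batches usually raise throughput but also raise the latency of each individual request, which is the balance the answer above refers to.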

Will MLE be replaced by AI?

Machine Learning Engineers (MLEs) design, develop, and optimize AI models and systems. While AI automation tools can assist with certain tasks, MLEs are essential for building, tuning, and maintaining complex models, making complete replacement unlikely in the near term. Their expertise in data handling, model deployment, and system integration remains critical in AI development environments.

What are the key skills and qualifications needed to thrive in ML Inference, and why are they important?

To thrive in ML Inference, you need a solid background in machine learning principles, programming (Python or C++), and experience with deploying models at scale, often supported by a degree in computer science or a related field. Familiarity with frameworks and tools such as TensorFlow, PyTorch, ONNX, and cloud platforms like AWS SageMaker or Google AI Platform is typically required. Strong problem-solving skills, attention to detail, and effective communication are crucial soft skills for collaborating with multidisciplinary teams and optimizing model performance. These skills ensure efficient, scalable, and reliable deployment of machine learning solutions in real-world applications.

Infographic showing various ML Inference job openings in California as of June 2026, with employment types broken down into 95% Full Time, 4% Part Time, and 1% Temporary. Highlights an 83% Physical, 4% Hybrid, and 13% Remote job distribution.

Software Engineer - GenAI inference

Databricks

San Francisco, CA • On-site

Full-time

This job post has expired today. Applications are no longer accepted.

Job description

Job Summary:
Databricks is the data and AI company that empowers organizations to unify and democratize data, analytics, and AI. They are seeking a Software Engineer for GenAI inference to design, develop, and optimize the inference engine powering their Foundation Model API, working at the intersection of research and production.
Responsibilities:
• Contribute to the design and implementation of the inference engine, and collaborate on a model-serving stack optimized for large-scale LLM inference
• Collaborate with researchers to bring new model architectures or features (sparsity, activation compression, mixture-of-experts) into the engine
• Optimize for latency, throughput, memory efficiency, and hardware utilization across GPUs and accelerators
• Build and maintain instrumentation, profiling, and tracing tooling to uncover bottlenecks and guide optimizations
• Develop and enhance scalable routing, batching, scheduling, memory management, and dynamic loading mechanisms for inference workloads
• Support reliability, reproducibility, and fault tolerance in the inference pipelines, including A/B launches, rollback, and model versioning
• Integrate with federated, distributed inference infrastructure – orchestrate across nodes, balance load, handle communication overhead
• Collaborate cross-functionally: with platform engineers, cloud infrastructure, and security/compliance teams
• Document and share learnings, contributing to internal best practices and open-source efforts when possible
Qualifications:
Required:
• BS/MS/PhD in Computer Science, or a related field
• Strong software engineering background (3+ years or equivalent) in performance-critical systems
• Solid understanding of ML inference internals: attention, MLPs, recurrent modules, quantization, sparse operations, etc.
• Hands-on experience with CUDA, GPU programming, and key libraries (cuBLAS, cuDNN, NCCL, etc.)
• Comfortable designing and operating distributed systems, including RPC frameworks, queuing, RPC batching, sharding, and memory partitioning
• Demonstrated ability to uncover and solve performance bottlenecks across layers (kernel, memory, networking, scheduler)
• Experience building instrumentation, tracing, and profiling tools for ML models
• Ability to work closely with ML researchers and translate novel model ideas into production systems
• Ownership mindset and eagerness to dive deep into complex system challenges
Preferred:
• Bonus: published research or open-source contributions in ML systems, inference optimization, or model serving
Company:
Databricks is a data and AI platform that unifies data engineering, analytics, and machine learning on a lakehouse architecture. Founded in 2013, the company is headquartered in San Francisco, USA, with a team of 5,001-10,000 employees. The company is currently late stage.

About Databricks

Sourced by ZipRecruiter

Industry

Software development

Company size

5,001 - 10,000 Employees

Headquarters location

San Francisco, CA, US

Year founded

2013

Website

databricks.com
