Deep Learning Quantization Jobs in California (NOW HIRING)

Amazon

Software Dev Engineer, Machine Learning Compilers

Sunnyvale, CA · On-site

... deep learning workloads to heterogeneous device backends. You will also partner up with peer science teams to innovate on model quantization and compression techniques for efficient execution on ...

quadric, Inc

Deep Learning Compiler Engineer (New Grad)

Burlingame, CA · On-site

$120K - $160K/yr

The Role As a new-grad Deep Learning Compiler Engineer, you will work on CGC, Quadric's neural ... Familiarity with neural network quantization, fixed-point arithmetic, or numerical analysis for ML.

Amazon

Software Dev Engineer, Machine Learning Compilers

Sunnyvale, CA

Apptronik

Senior Perception Learning Engineer

Sunnyvale, CA

$190K - $235K/yr

Strong classical computer vision skills (geometry-based methods, feature extraction) complementing deep learning approaches. * Expertise in model acceleration, quantization, or compression (TensorRT ...

Roblox

Principal Machine Learning Engineer, Content Safety

San Mateo, CA · On-site

You will feel a deep sense of responsibility in proactively protecting our community thoughtfully ... learning, quantization, LoRA, distillation). * Drive End-to-End Product Development: You will not ...

Apple

Machine Learning Engineer - Model Inference

Cupertino, CA · On-site

In this role on our ML Platform Team, you will leverage advanced deep learning and large language ... Familiarity with inference optimization techniques, including quantization, pruning, knowledge ...

Altos Labs

Scientist /Senior Scientist, Multimodal & Relational Machine Learning Foundation Models

San Diego, CA · On-site

$97K - $132K/yr

... deep research, and the mathematical underpinnings of set-invariant and graph-structured ... via quantization, distillation, or memory-efficient attention mechanisms. Company : Altos Labs ...

Deeproute.ai

Research Scientist, Reinforcement Learning

Fremont, CA · On-site

Proficiency in deep learning frameworks such as PyTorch * Experience with distributed training frameworks (Ray, Horovod, etc.) * Knowledge of model optimization (quantization, pruning) and CUDA is a ...

Deeproute.ai

Research Scientist, Reinforcement Learning

Fremont, CA · On-site

Apptronik

Senior Perception Learning Engineer

Sunnyvale, CA · On-site

$122K - $168K/yr

XPENG

Machine Learning Engineer - LLM, AI & Robotics

Santa Clara, CA · On-site

Work on efficient LLMs (e.g. small LLMs, weight sharing, model quantization, etc.) that can be ... Deep understanding of language model modern architectures, 'under the hood' LLM training knowledge ...

Deeproute.ai

Research Scientist, Reinforcement Learning

Fremont, CA

XPENG

Staff Machine Learning Engineer

Santa Clara, CA

Proficiency in Python and deep learning frameworks such as PyTorch or TensorFlow. * Solid ... Experience with ONNX, TensorRT, model quantization, C++ inference pipelines, CUDA, or edge ...

XPENG

Machine Learning Engineer - LLM, AI & Robotics

Santa Clara, CA · On-site

XPENG

Staff Machine Learning Engineer

Santa Clara, CA · On-site

Proficiency in Python and deep learning frameworks such as PyTorch or TensorFlow. * Solid ... Experience with ONNX, TensorRT, model quantization, C++ inference pipelines, CUDA, or edge ...

Apptronik

Senior Perception Learning Engineer

Sunnyvale, CA · On-site

$122K - $167K/yr

... deep learning approaches. • Expertise in model acceleration, quantization, or compression (TensorRT, ONNX Runtime). • Familiarity with real-time frameworks and middleware such as ROS 2, GStreamer ...

Qualcomm

Modem Machine Learning Engineer

San Diego, CA · On-site

You will place a strong emphasis on modern deep learning architectures, building scalable MLOps ... Exposure to on-device ML deployment, quantization, and neural network optimization tools.

XPENG

Staff Machine Learning Engineer

Santa Clara, CA · On-site

... quantization / inference acceleration. • Work with deployment and platform teams to validate ... and deep learning frameworks such as PyTorch or TensorFlow. • Solid understanding of object ...

Amazon

Software Dev Engineer, Machine Learning Compilers

Sunnyvale, CA · On-site

... deep learning workloads to heterogeneous device backends. You will also partner up with peer science teams to innovate on model quantization and compression techniques for efficient execution on ...

Apple

Machine Learning Engineer - Model Inference

Cupertino, CA

$150K - $277K/yr

In this role on our ML Platform Team, you will leverage advanced deep learning and large language ... Develop techniques such as dynamic batching, caching, quantization, pruning, model compilation, and ...

Showing results 1-20

Deep Learning Quantization information

What are the key skills and qualifications needed to thrive as a Deep Learning Quantization Engineer, and why are they important?

To excel as a Deep Learning Quantization Engineer, you need a strong background in machine learning, applied mathematics, and computer science, usually supported by an advanced degree in a related field. Familiarity with deep learning frameworks (such as TensorFlow or PyTorch), quantization toolkits, and hardware acceleration platforms is crucial. Analytical thinking, problem-solving, and clear technical communication are standout soft skills in this role. These abilities are essential for efficiently optimizing models for deployment on resource-constrained hardware while maintaining accuracy and performance.

What is the difference between a Deep Learning Quantization Engineer and a Machine Learning Engineer?

Aspect	Deep Learning Quantization	Machine Learning Engineer
Required Credentials	Advanced degrees in AI, Computer Science, or related fields; knowledge of neural networks	Bachelor's or Master's in CS, Data Science, or related fields; programming skills
Work Environment	Research labs, AI development teams, hardware optimization settings	Software development teams, data-driven projects, product-focused environments
Industry Usage	AI hardware optimization, model deployment, edge computing	Model development, data analysis, software solutions across industries

Deep Learning Quantization focuses on reducing model size and improving inference speed through techniques like weight and activation quantization, often in hardware or embedded systems. Machine Learning Engineers develop, implement, and optimize machine learning models for various applications. While both roles require knowledge of AI and programming, Deep Learning Quantization is more specialized in model optimization techniques, whereas Machine Learning Engineers work broadly on model development and deployment.

What is deep learning quantization?

Deep learning quantization is the process of reducing the precision of the numbers used to represent a neural network's parameters, activations, or both. By converting the typically used 32-bit floating-point values to lower bit-width formats such as 16-bit floating point or 8-bit integers, quantization significantly reduces the memory footprint and computational requirements of deep learning models. This technique helps deploy models efficiently on edge devices and mobile hardware while maintaining acceptable accuracy levels. Quantization is widely used in model optimization for faster inference and lower power consumption.
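As a rough illustration of the idea, the sketch below maps a handful of floating-point values to 8-bit integer codes and back using an affine (scale plus zero-point) scheme. This is a minimal sketch in plain Python, not the method of any particular framework or toolkit; the function names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical, chosen only for this example.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float, int]:
    """Map floats to int8 codes with an affine (scale + zero-point) scheme."""
    # The represented range is widened to include 0.0 so that zero maps exactly.
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(-128 - lo / scale)
    codes = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return codes, scale, zero_point


def dequantize_int8(codes: List[int], scale: float, zero_point: int) -> List[float]:
    """Recover approximate floats from int8 codes."""
    return [(c - zero_point) * scale for c in codes]


# Hypothetical weights, used only to illustrate the round trip.
weights = [-1.2, -0.4, 0.0, 0.35, 0.9, 2.1]
codes, scale, zero_point = quantize_int8(weights)
restored = dequantize_int8(codes, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, zero_point, max_error)
```

Each value now occupies one byte instead of four, at the cost of a rounding error of roughly half the scale per value. Production toolkits typically add calibration data and per-channel scales on top of this basic mapping.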

What are some common challenges faced when implementing deep learning quantization in production environments?

One of the main challenges in implementing deep learning quantization is balancing model accuracy with computational efficiency, as quantization can sometimes lead to a drop in model performance. Additionally, ensuring hardware compatibility and optimizing for different devices (such as CPUs, GPUs, or edge devices) can require extensive testing and tuning. Collaboration with data scientists, software engineers, and hardware specialists is often essential to successfully deploy quantized models at scale. Staying updated with the latest quantization techniques and frameworks is also important for overcoming these challenges.
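The accuracy versus efficiency trade-off mentioned above can be made concrete by quantizing the same values at several bit widths and comparing the reconstruction error. The sketch below assumes symmetric per-tensor quantization and uses mean squared error as a stand-in for accuracy loss; the helper names and the synthetic weights are hypothetical, and real model accuracy would need to be measured on a validation set.

```python
from typing import List


def fake_quantize(values: List[float], bits: int) -> List[float]:
    """Symmetric per-tensor quantization followed by dequantization."""
    qmax = 2 ** (bits - 1) - 1  # e.g. 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / qmax
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


def mean_squared_error(a: List[float], b: List[float]) -> float:
    """Average squared difference between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Synthetic weights spread over roughly [-0.8, 0.85], for illustration only.
weights = [0.013 * i - 0.8 for i in range(128)]
errors = {
    bits: mean_squared_error(weights, fake_quantize(weights, bits))
    for bits in (8, 4, 2)
}
for bits, err in errors.items():
    print(f"{bits}-bit MSE: {err:.6f}")
```

The error grows as the bit width shrinks, which is why lower-precision deployments usually need extra testing and tuning on the target hardware.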

Deep Learning Quantization jobs near you

Infographic showing Deep Learning Quantization job openings in California as of July 2026, with employment types broken down into 74% Full Time, 23% Part Time, 1% Temporary, and 2% Contract. Highlights a 72% Physical, 2% Hybrid, and 26% Remote job distribution.

Software Dev Engineer, Machine Learning Compilers

Amazon

Sunnyvale, CA • On-site

Full-time

Re-posted 3 days ago

Amazon rating

7.4

Based on 6,965 frontline employees who took The Breakroom Quiz

6th of 39 rated national retailers

Job description

Amazon Devices is an inventive research and development company that designs and engineers high-profile consumer products like the Kindle family, Fire Tablets, Fire TV, Health & Wellness devices, Amazon Echo, and Astro. We are building the next generation of edge AI capabilities through our advanced compression platform, compiler and custom neural accelerator silicon. Come join us to accelerate deep learning networks on edge processors and beyond.

We are looking for a talented and passionate software engineer to join an exciting technology creation team at Amazon. You will have an enormous opportunity to make a large impact on the design, architecture, and implementation of deep learning technologies embedded into consumer products used every day by people you know. The position provides a unique opportunity to contribute from the hardware design stage through pre- and post-silicon development to productization on consumer devices.
In this role you will work alongside partner science teams to develop the compiler infrastructure and lower deep learning workloads to heterogeneous device backends. You will also partner with peer science teams to innovate on model quantization and compression techniques for efficient execution on hardware.
Key job responsibilities
Design and develop the software stack for the deep learning accelerator.
Develop compiler passes for graph ingestion, optimization, and partitioning.
Develop backend code generation capabilities across heterogeneous platforms.
Profile, analyze, and optimize system-level performance, developing new tooling where necessary.
Participate in design reviews, API development, and documentation.
Collaborate with hardware, software, applied science, and product teams to onboard more user experiences onto the deep learning accelerator.
Mentor and provide guidance to junior engineers.
A day in the life
You join a small team building the compiler that brings large AI models to a new generation of custom silicon. The chip has a fraction of the memory of a phone, and the compiler is what makes language models run on it at all. The team is small enough that each engineer owns a meaningful piece of the system end to end. There is no layer between you and the problem.
The morning starts with results from an overnight run. A piece of the compiler you own just produced its tightest result yet on a real model. You ship the change for hardware validation.
You spend the afternoon directing AI agents through the codebase, reviewing their changes, and steering the design.
Before lunch, you load your compiled model onto the chip and run it through a demo app you wrote yourself, watching tokens stream out of silicon you helped make work. Later, you meet with the research team. They depend on your component. You sketch a cleaner interface together.
About the team
We sit at the intersection of AI models and custom silicon, and our work decides what is possible at the edge.
Engineers here bring deep experience across compilers and program analysis, optimization algorithms, computer architecture, machine learning systems, and the practical craft of getting large software to run reliably under tight constraints. People have shipped production code generators, tuned schedulers for novel hardware, and worked at every layer from the model down to the bare metal.
Because the team is small, you work alongside that experience daily, not at a distance. You partner directly with researchers shaping the models, hardware engineers shaping the silicon, and firmware engineers shaping the runtime. You learn how each layer constrains and unlocks the others, and you see your decisions land end to end.
This is a place to build technical depth quickly and own work that matters from day one.

About Amazon

Sourced by ZipRecruiter

Amazon.com, Inc., commonly known as Amazon, is an American multinational technology company. It was founded by Jeff Bezos in 1994 and initially started as an online marketplace for books. Since then, Amazon has expanded its operations and become one of the largest e-commerce companies in the world. Amazon's primary business is its online retail platform, where customers can purchase a vast array of products, including electronics, clothing, books, home goods, and much more. The company offers a convenient and user-friendly shopping experience, with features such as fast shipping, customer reviews, and personalized recommendations. In addition to its e-commerce platform, Amazon has diversified its business into various other areas. One of its notable ventures is Amazon Web Services (AWS), a comprehensive cloud computing platform that provides services such as storage, compute power, and database management to individuals and businesses. AWS has become a leader in the cloud computing industry, powering many websites and applications worldwide. Amazon has also developed its own consumer electronics, including the popular Amazon Kindle e-reader, Fire tablets, Fire TV streaming devices, and the Alexa-powered Echo smart speakers. The Alexa voice assistant, integrated into these devices, allows users to interact with their devices using voice commands, perform tasks, and access information. Furthermore, Amazon has expanded into media and entertainment. It operates Prime Video, a streaming service that offers a wide range of movies, TV shows, and original content. Amazon Music provides a platform for streaming and purchasing digital music, while Audible offers audiobooks and other audio content. The company's commitment to customer satisfaction and convenience is demonstrated by its membership program, Amazon Prime. Prime members receive various benefits, including free two-day shipping, access to streaming services, exclusive deals, and more.

Industry

IT services, book publishers, retail, real estate, and computer and electronic product manufacturing

Company size

10,000+ Employees

Headquarters location

Seattle, WA, US

Website

amazon.com

What Amazon employees say

Pay

Most people get paid breaks

Most people don’t get paid when they’re sick

The job rarely spills into unpaid time

Benefits

Sick days use up paid time off

Only some part-timers can get health insurance

Most part-timers get paid time off

Hours and flexibility

Less than 4 weeks notice of work schedule

Some people worry about their hours

Only some people can choose their shifts

Workplace

Most people feel treated with respect

Most people get breaks without interruption

Some people are stressed out
