Sr. Computer Vision Engineer (Deep Learning)
$180K - $240K/yr
Sr. Computer Vision Engineer (Deep Learning) Mountain View, CA Harbinger is an American commercial ... Proficiency in model optimization techniques such as quantization, pruning, and knowledge ...
Senior Deep Learning Software Engineer, TensorRT Performance
Santa Clara, CA · On-site
$143.90K - $189.70K/yr
... like quantization, scheduling, memory management, and distributed inference to set the gold ... Scale performance of deep learning models across different architectures and types of NVIDIA ...
OR · Hybrid
$122.40K - $161.30K/yr
... like quantization, scheduling, memory management, and distributed inference to set the gold ... Scale performance of deep learning models across different architectures and types of NVIDIA ...
Senior Deep Learning Compiler Verification Engineer
$122.70K - $168.50K/yr
Work with deep learning compiler and architecture teams to analyze and validate sophisticated ... DL model internals depth: experience with quantization, operator fusion, mixed-precision, or graph ...
Senior Deep Learning Compiler Verification Engineer
$103.60K - $142.20K/yr
Work with deep learning compiler and architecture teams to analyze and validate sophisticated ... DL model internals depth: experience with quantization, operator fusion, mixed-precision, or graph ...
Staff Machine Learning Engineer - Autonomous Driving Model Quantization & Deployment
Santa Clara, CA · On-site
Deep Learning Expertise: Strong familiarity with PyTorch and deep knowledge of inference engines like TensorRT, ONNX Runtime, or TVM. * Quantization Depth: Hands-on experience with INT8/FP8/INT4 ...
Senior Deep Learning Software Engineer, TensorRT Performance
Santa Clara, CA · Hybrid
$143.90K - $189.70K/yr
... like quantization, scheduling, memory management, and distributed inference to set the gold ... Scale performance of deep learning models across different architectures and types of NVIDIA ...
Senior Deep Learning Compiler Verification Engineer
$117K - $160.70K/yr
Work with deep learning compiler and architecture teams to analyze and validate sophisticated ... DL model internals depth: experience with quantization, operator fusion, mixed-precision, or graph ...
Senior Deep Learning Compiler Verification Engineer
Santa Clara, CA · On-site
$122.70K - $168.50K/yr
Work with deep learning compiler and architecture teams to analyze and validate sophisticated ... DL model internals depth: experience with quantization, operator fusion, mixed-precision, or graph ...
We are looking for a world-class Principal Deep Learning Engineer to join our Autonomous Driving ... Experience with model optimization, quantization, and deployment on embedded platforms (especially ...
$104.40K - $143.40K/yr
Work with deep learning compiler and architecture teams to analyze and validate sophisticated ... DL model internals depth: experience with quantization, operator fusion, mixed-precision, or graph ...
Sr. Computer Vision Engineer (Deep Learning)
Mountain View, CA · On-site
$180K - $240K/yr
We are seeking a highly skilled Senior Deep Learning Engineer to drive the development and ... Proficiency in model optimization techniques such as quantization, pruning, and knowledge ...
Sr. Computer Vision Engineer (Deep Learning)
$180K - $240K/yr
We are seeking a highly skilled Senior Deep Learning Engineer to drive the development and ... Proficiency in model optimization techniques such as quantization, pruning, and knowledge ...
$139.90K/yr
... like quantization, scheduling, memory management, and distributed inference to set the gold ... Scale performance of deep learning models across different architectures and types of NVIDIA ...
Deep Learning Quantization information
| Salary range (per year) | Share of jobs |
|---|---|
| $11K - $22.7K | 27% |
| $22.7K - $34.5K | 0% |
| $34.5K - $46.2K | 0% |
| $46.2K - $57.9K | 0% |
| $57.9K - $69.6K | 0% |
| $69.6K - $81.4K | 25% |
| $81.4K - $93.1K | 18% |
| $93.1K - $104.8K | 7% |
| $104.8K - $116.5K | 2% |
| $116.5K - $128.3K | 0% |
| $128.3K - $140K | 21% |
$21.8K is the 25th percentile. Wages below this are outliers.
The median wage is $80.4K / yr.
$101.5K is the 75th percentile. Wages above this are outliers.
Markers shown on the salary scale: $11K, $83.9K, $140K.
How much do deep learning quantization jobs pay per year?
What are the key skills and qualifications needed to thrive as a Deep Learning Quantization Engineer, and why are they important?
What are some common challenges faced when implementing deep learning quantization in production environments?
What is deep learning quantization?
What is the difference between a Deep Learning Quantization role and a Machine Learning Engineer role?
| Aspect | Deep Learning Quantization | Machine Learning Engineer |
|---|---|---|
| Required Credentials | Advanced degrees in AI, Computer Science, or related fields; knowledge of neural networks | Bachelor's or Master's in CS, Data Science, or related fields; programming skills |
| Work Environment | Research labs, AI development teams, hardware optimization settings | Software development teams, data-driven projects, product-focused environments |
| Industry Usage | AI hardware optimization, model deployment, edge computing | Model development, data analysis, software solutions across industries |
Deep Learning Quantization focuses on reducing model size and improving inference speed through techniques like weight and activation quantization, often in hardware or embedded systems. Machine Learning Engineers develop, implement, and optimize machine learning models for various applications. While both roles require knowledge of AI and programming, Deep Learning Quantization is more specialized in model optimization techniques, whereas Machine Learning Engineers work broadly on model development and deployment.
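As a rough illustration of the weight quantization mentioned above, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. The function names (`quantize_int8`, `dequantize`) and the sample weights are hypothetical and chosen only for this example; production toolchains typically add calibration data, per-channel scales, and activation quantization on top of this basic idea.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats onto integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 if every value is zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the INT8 codes."""
    return [x * scale for x in quantized]


# Hypothetical FP32 weights, used only for illustration.
weights = [0.42, -1.27, 0.05, 0.9, -0.33]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding error per weight is bounded by half a quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 codes:", q)
print("scale:", scale)
print("max round-trip error:", max_err)
```

Each weight is stored in one byte instead of the four bytes of an FP32 value, which is where the reduction in model size comes from. The cost is the rounding error, which this scheme bounds at half of `scale` per weight.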

Staff Machine Learning Engineer - Autonomous Driving Model Quantization & Deployment
XPENG · Santa Clara, CA
Other
Posted 18 days ago
Job description
The Mission: The challenge of Vision-Language-Action (VLA) models and Foundation Models isn't just their intelligence; it's their real-time execution at the edge. We are seeking a high-caliber Staff Machine Learning Engineer to bridge the gap between massive research models and production-ready L4 autonomous driving systems. You will lead the effort to optimize and deploy our VLA models onto vehicle-grade compute platforms for our global fleet.
Key Responsibilities:
- Lead Optimization Strategy: Own the end-to-end quantization and optimization roadmap for large-scale multimodal models (Transformers, VLMs).
- Model Compression: Apply and innovate in PTQ (Post-Training Quantization), QAT (Quantization-Aware Training), and pruning techniques to fit VLA models into strict memory and power envelopes.
- Hardware-Software Co-design: Collaborate directly with model researchers to ensure architectures are "deployment-friendly" and with platform teams to influence future hardware requirements.
- Production Excellence: Develop and maintain robust, safety-critical deployment stacks in Modern C++, ensuring 24/7 stability and deterministic performance on the road.
Qualifications:
- Proven Track Record: 5-8 years of experience in model deployment, quantization, or high-performance computing (HPC).
- Core Technical Skills: Mastery of Modern C++ and deep experience with CUDA or other hardware acceleration libraries.
- Deep Learning Expertise: Strong familiarity with PyTorch and deep knowledge of inference engines like TensorRT, ONNX Runtime, or TVM.
- Quantization Depth: Hands-on experience with INT8/FP8/INT4 quantization and knowledge of the unique challenges in quantizing Large Language Models (LLMs) or Transformers.
- Platform Knowledge: Solid understanding of computer architecture (Cache, Memory Bandwidth, SIMD) and experience with embedded/edge compute constraints.
- Systems Thinking: Ability to debug complex performance bottlenecks across the entire software stack.
Preferred Qualifications:
- Experience with VLA/VLM or other Foundation Model deployment.
- Background in autonomous driving, robotics, or real-time safety-critical systems.
- Contributions to open-source inference or compiler projects.
What We Offer:
- A fun, supportive and engaging environment
- Infrastructure and computational resources to support your ML model development and research.
- Opportunity to work on cutting-edge technologies with the top talent in the field.
- Opportunity to make a significant impact on the transportation revolution by advancing autonomous driving
- Competitive compensation package
- Snacks, lunches, dinners, and fun activities
The base salary range for this full-time position is $215,280-$364,320, in addition to bonus, equity and benefits. Our salary ranges are determined by role, level, and location. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position across all US locations. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training.
We are an Equal Opportunity Employer. It is our policy to provide equal employment opportunities to all qualified persons without regard to race, age, color, sex, sexual orientation, religion, national origin, disability, veteran status or marital status or any other prescribed category set forth in federal or state regulations.