Deep Learning Quantization Jobs in Massachusetts

The scope for GPU usage ranges from traditional computer vision and deep learning architectures to ... Hands-on work with ML model optimization (post-training quantization, layer pruning, etc) or hand ...

Deep Learning Quantization information

What are the key skills and qualifications needed to thrive as a Deep Learning Quantization Engineer, and why are they important?

To excel as a Deep Learning Quantization Engineer, you need a strong background in machine learning, applied mathematics, and computer science, usually supported by an advanced degree in a related field. Familiarity with deep learning frameworks (such as TensorFlow or PyTorch), quantization toolkits, and hardware acceleration platforms is crucial. Analytical thinking, problem-solving, and clear technical communication are standout soft skills in this role. These abilities are essential for efficiently optimizing models for deployment on resource-constrained hardware while maintaining accuracy and performance.

What are some common challenges faced when implementing deep learning quantization in production environments?

One of the main challenges in implementing deep learning quantization is balancing model accuracy with computational efficiency, as quantization can sometimes lead to a drop in model performance. Additionally, ensuring hardware compatibility and optimizing for different devices (such as CPUs, GPUs, or edge devices) can require extensive testing and tuning. Collaboration with data scientists, software engineers, and hardware specialists is often essential to successfully deploy quantized models at scale. Staying updated with the latest quantization techniques and frameworks is also important for overcoming these challenges.

What is deep learning quantization?

Deep learning quantization is the process of reducing the precision of the numbers used to represent a neural network's parameters, activations, or both. By converting the typical 32-bit floating-point values to lower bit-width formats such as 16-bit floating point or 8-bit integers, quantization reduces the memory footprint and computational requirements of deep learning models. This helps deploy models efficiently on edge devices and mobile hardware while maintaining acceptable accuracy. Quantization is widely used in model optimization for faster inference and lower power consumption.
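As a rough illustration of the idea, the sketch below maps a handful of floating-point weights to 8-bit integer codes with a single shared scale, then maps them back to see how much precision is lost. It assumes symmetric per-tensor quantization, which is only one of several common schemes; the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical and not taken from any particular toolkit.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: floats -> integer codes in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the signed 8-bit range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.003, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Worst-case rounding error; for in-range values it is at most half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

Each code fits in one byte instead of the four bytes of a 32-bit float, which is where the roughly 4x reduction in weight storage comes from. The small weight 0.003 rounds to zero here, a simple example of the accuracy cost that quantization trades for that saving.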

What is the difference between a Deep Learning Quantization Engineer and a Machine Learning Engineer?

| Aspect | Deep Learning Quantization | Machine Learning Engineer |
| --- | --- | --- |
| Required Credentials | Advanced degrees in AI, Computer Science, or related fields; knowledge of neural networks | Bachelor's or Master's in CS, Data Science, or related fields; programming skills |
| Work Environment | Research labs, AI development teams, hardware optimization settings | Software development teams, data-driven projects, product-focused environments |
| Industry Usage | AI hardware optimization, model deployment, edge computing | Model development, data analysis, software solutions across industries |

Deep Learning Quantization focuses on reducing model size and improving inference speed through techniques like weight and activation quantization, often in hardware or embedded systems. Machine Learning Engineers develop, implement, and optimize machine learning models for various applications. While both roles require knowledge of AI and programming, Deep Learning Quantization is more specialized in model optimization techniques, whereas Machine Learning Engineers work broadly on model development and deployment.

Senior Software Engineer - C++ GPU Performance

Zoox

Boston, MA • On-site

$217K - $307K/yr

Full-time

Medical, Life, PTO

Posted 13 days ago


Job description

Zoox is building the world's most advanced self-driving hardware and software solution. The efficiency demands of such a system require expert fine-tuning of both the compute hardware architecture and the algorithms and middleware that run on it to achieve maximum throughput at optimal power levels.
 
The Software Performance team's mission is to analyze, optimize and provide guidance to the software and hardware teams in order to meet the required specifications.   
 
As a GPU performance software engineer within the Software Performance team, you will instrument, monitor, analyze and optimize GPU-based algorithms that are performance-critical for our solution. The scope for GPU usage ranges from traditional computer vision and deep learning architectures to complex geometric reasoning and multi-agent decision making. Your work will strongly influence design decisions of future compute platforms & resource allocation.
In this role, you will:
  • Build real-time instrumentation for performance monitoring (CPU, GPU, latency, memory) and develop offline benchmarking frameworks, tools, and scripts to evaluate & analyze performance at scale in CI/vehicle, and establish budgets for next-gen architectures.
  • Analyze performance metrics to identify GPU hotspots and root causes, and propose and co-implement actionable solutions with component teams.
  • Support teams on bringing serial algorithms to the GPU to maximize compute utilization and improve overall latency.
  • Work as part of the Core team to design a middleware framework that promotes efficient, performant code by default by maximizing CPU and GPU utilization.
Qualifications
  • BS in computer science or related field and 7+ years of experience.
  • Strong knowledge of CUDA as applied to recent GPU microarchitectures (e.g., Ampere, Blackwell) and experience debugging/optimizing GPU kernels using tools like Nsight.
  • Strong knowledge of C++ and experience in large code bases, comfortable in Linux development environments.
  • Experience in development, debugging, and profiling of complex multiprocess systems (e.g., robotic systems, game engines).
Bonus Qualifications
  • Experience with GPU kernel development in a real-time environment, including PTX-level programming, CPU SIMD instructions (e.g., AVX intrinsics), and custom CUDA layers with frameworks like TensorRT & XLA.
  • Hands-on work with ML model optimization (post-training quantization, layer pruning, etc.) or hand-tuning GPU kernels (in OpenGL, CUDA, ROCm, or similar).
  • Proficiency with SQL, Databricks, Looker, or other business intelligence tools.
Base Salary Range: $217,000 - $307,000 a year
 
There are three major components to compensation for this position: salary, Amazon Restricted Stock Units (RSUs), and Zoox Stock Appreciation Rights. A sign-on bonus may be offered as part of the compensation package. The listed range applies only to the base salary. Compensation will vary based on geographic location and level. Leveling, as well as positioning within a level, is determined by a range of factors, including, but not limited to, a candidate's relevant years of experience, domain knowledge, and interview performance. The salary range listed in this posting is representative of the range of levels Zoox is considering for this position.
 
Zoox also offers a comprehensive package of benefits, including paid time off (e.g. sick leave, vacation, bereavement), unpaid time off, Zoox Stock Appreciation Rights, Amazon RSUs, health insurance, long-term care insurance, long-term and short-term disability insurance, and life insurance.
About Zoox
Zoox is developing the first ground-up, fully autonomous vehicle fleet and the supporting ecosystem required to bring this technology to market. Sitting at the intersection of robotics, machine learning, and design, Zoox aims to provide the next generation of mobility-as-a-service in urban environments. We're looking for top talent that shares our passion and wants to be part of a fast-moving and highly execution-oriented team.

Accommodations
If you need an accommodation to participate in the application or interview process please reach out to [email protected] or your assigned recruiter.

A Final Note:
You do not need to match every listed expectation to apply for this position. Here at Zoox, we know that diverse perspectives foster the innovation we need to be successful, and we are committed to building a team that encompasses a variety of backgrounds, experiences, and skills.
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.