Deep Learning Quantization Jobs in Chicago, IL (NOW HIRING)

Senior Machine Learning Engineer (LLMs)

Chicago, IL · On-site

$126.20K - $166.40K/yr

Deep understanding of transformers, attention, and training dynamics * Strong Python plus PyTorch ... Inference optimization (quantization, speculative decoding, vLLM, Triton) * Experience shipping LLM ...

Senior Machine Learning Engineer (LLMs)

Chicago, IL

$126.20K - $166.40K/yr

Deep understanding of transformers, attention, and training dynamics * Strong Python plus PyTorch ... Inference optimization (quantization, speculative decoding, vLLM, Triton) * Experience shipping LLM ...

Senior ML Engineer

Chicago, IL · On-site +1

$107.60K - $147.80K/yr

Advanced Python and deep learning proficiency (PyTorch, HuggingFace Transformers, spaCy ... models via quantization, batching, and throughput tuning * Proficiency with inference ...

Deep Learning Quantization information

Salary chart for Chicago, IL (markers shown: $11.3K, $86.4K, $144.2K)

How much do deep learning quantization jobs pay per year?

As of May 31, 2026, the average yearly pay for deep learning quantization in Chicago, IL is $86,414.00, according to ZipRecruiter salary data. Most workers in this role earn between $74,200.00 and $143,200.00 per year, depending on experience, location, and employer.

What are the key skills and qualifications needed to thrive as a Deep Learning Quantization Engineer, and why are they important?

To excel as a Deep Learning Quantization Engineer, you need a strong background in machine learning, applied mathematics, and computer science, usually supported by an advanced degree in a related field. Familiarity with deep learning frameworks (such as TensorFlow or PyTorch), quantization toolkits, and hardware acceleration platforms is crucial. Analytical thinking, problem-solving, and clear technical communication are standout soft skills in this role. These abilities are essential for efficiently optimizing models for deployment on resource-constrained hardware while maintaining accuracy and performance.

What are some common challenges faced when implementing deep learning quantization in production environments?

One of the main challenges in implementing deep learning quantization is balancing model accuracy against computational efficiency, because lower-precision weights and activations can reduce accuracy. Ensuring hardware compatibility and optimizing for different targets (such as CPUs, GPUs, or edge devices) can also require extensive testing and tuning. Deploying quantized models at scale often depends on collaboration among data scientists, software engineers, and hardware specialists. Staying current with quantization techniques and frameworks helps in overcoming these challenges.
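
To make the accuracy versus efficiency trade-off concrete, here is a minimal sketch in plain Python. It is illustrative only: the weights are randomly generated rather than taken from a real model, the helper names `fake_quantize` and `mean_squared_error` are hypothetical, and symmetric per-tensor rounding is just one of several quantization schemes.

```python
import random

def fake_quantize(values, bits):
    """Round floats to a signed `bits`-bit integer grid, then map back to floats."""
    qmax = 2 ** (bits - 1) - 1                      # e.g. 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0  # one scale for the whole tensor
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]

def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

random.seed(0)
# Assumption: synthetic Gaussian "weights" stand in for a trained layer.
weights = [random.gauss(0.0, 1.0) for _ in range(2000)]

errors = {
    bits: mean_squared_error(weights, fake_quantize(weights, bits))
    for bits in (8, 4, 2)
}
for bits, err in errors.items():
    print(f"{bits}-bit: {32 // bits}x smaller than FP32, MSE = {err:.6f}")
```

Each halving of the bit width shrinks storage further but raises the reconstruction error, which is why production teams typically validate accuracy on the target hardware before committing to a precision.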

What is deep learning quantization?

Deep learning quantization is the process of reducing the precision of the numbers used to represent a neural network's parameters, activations, or both. By converting the typically used 32-bit floating-point values to lower-precision formats such as 16-bit floating point or 8-bit integers, quantization reduces the memory footprint and computational requirements of deep learning models; an 8-bit weight, for example, takes a quarter of the storage of a 32-bit one. This technique helps deploy models efficiently on edge devices and mobile hardware while maintaining acceptable accuracy levels. Quantization is widely used in model optimization for faster inference and lower power consumption.
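
As a rough illustration of the definition above, the following sketch quantizes a list of 32-bit-style floats to 8-bit integer codes and back. It assumes symmetric per-tensor quantization with a single scale factor; the function names `quantize_int8` and `dequantize` are hypothetical and the weights are synthetic, so treat it as a teaching example rather than how any particular framework does it.

```python
import random

def quantize_int8(values):
    """Symmetric per-tensor quantization: floats -> int8 codes plus one scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Map int8 codes back to approximate float values."""
    return [c * scale for c in codes]

random.seed(0)
# Assumption: synthetic weights stand in for a real layer's parameters.
weights = [random.gauss(0.0, 0.5) for _ in range(1000)]

codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

max_error = max(abs(w - r) for w, r in zip(weights, restored))
fp32_bytes = 4 * len(weights)      # 4 bytes per 32-bit float
int8_bytes = 1 * len(codes) + 4    # 1 byte per code, plus one 32-bit scale

print(f"storage: {fp32_bytes} bytes -> {int8_bytes} bytes")
print(f"largest rounding error: {max_error:.6f} (at most scale / 2 = {scale / 2:.6f})")
```

The rounding error per weight is bounded by half the scale, so tensors with a few large outliers lose more precision than tightly clustered ones; that is one reason per-channel scales are often preferred in practice.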

What is the difference between Deep Learning Quantization and Machine Learning Engineer roles?

Aspect | Deep Learning Quantization | Machine Learning Engineer
Required Credentials | Advanced degrees in AI, Computer Science, or related fields; knowledge of neural networks | Bachelor's or Master's in CS, Data Science, or related fields; programming skills
Work Environment | Research labs, AI development teams, hardware optimization settings | Software development teams, data-driven projects, product-focused environments
Industry Usage | AI hardware optimization, model deployment, edge computing | Model development, data analysis, software solutions across industries

Deep Learning Quantization focuses on reducing model size and improving inference speed through techniques like weight and activation quantization, often in hardware or embedded systems. Machine Learning Engineers develop, implement, and optimize machine learning models for various applications. While both roles require knowledge of AI and programming, Deep Learning Quantization is more specialized in model optimization techniques, whereas Machine Learning Engineers work broadly on model development and deployment.

AI & Machine Learning Engineer

Lightspeed

Northbrook, IL • On-site

Full-time

Posted 3 days ago


Job description

Job Summary:
LightSpeed Build Technologies is revolutionizing the construction industry through AI-powered robotics. As an AI & Machine Learning Engineer, you will design, build, and deploy intelligent systems for construction robots, focusing on machine learning models for computer vision, predictive analytics, and process optimization.
Responsibilities:
• Design, train, and deploy ML models for robotic control, quality prediction, and process optimization
• Develop reinforcement learning and imitation learning systems for robot task planning
• Build predictive maintenance models using sensor data to anticipate equipment failures
• Implement anomaly detection for real-time quality monitoring during automated assembly
• Optimize model inference for edge deployment on GPU-accelerated hardware in production
• Develop deep learning pipelines for object detection, segmentation, and pose estimation
• Build real-time vision systems for robotic guidance, workpiece tracking, and dimensional verification
• Implement 3D point cloud processing for construction material recognition
• Design and train models for visual quality inspection using depth cameras and industrial imaging
• Build ML data pipelines from sensor acquisition through model training and deployment
• Establish data labeling, versioning, and management workflows for training datasets
• Implement model monitoring, A/B testing, and continuous improvement in production
• Design experiment tracking and reproducibility infrastructure (MLflow, Weights & Biases)
• Integrate ML models with ROS2-based robot control for real-time inference
• Optimize models for NVIDIA Jetson, industrial PCs, and edge computing platforms
• Collaborate with robotics engineers on sensor selection, placement, and calibration
• Support scaling ML systems across multiple production cells and sites
Qualifications:
Required:
• 4+ years of hands-on ML engineering experience building and deploying production models
• Deep proficiency with PyTorch or TensorFlow for model development and training
• Strong computer vision experience: object detection, segmentation, depth estimation, or 3D vision
• Understanding of reinforcement learning, imitation learning, or robot learning approaches
• Experience optimizing ML models for edge deployment (TensorRT, ONNX, quantization)
• Strong Python with experience in C++ for performance-critical components
• Experience with ML infrastructure: data pipelines, experiment tracking, model serving
• Proficiency with Linux, Docker, Git, and CI/CD workflows
• Understanding of real-time system constraints for ML inference in production
Preferred:
• MS or PhD in Machine Learning, Computer Science, Robotics, or related field
• Experience with robotics simulation: MuJoCo, Isaac Sim, or similar
• Background in manufacturing, industrial automation, or construction technology
• Experience with ROS/ROS2 integration for ML-powered robotics
• Published research or patents in computer vision, robot learning, or related ML
• Experience with NVIDIA ecosystem: CUDA, cuDNN, TensorRT, Jetson platforms
Company:
BUILDING TOMORROW'S HOMES, FASTER AND SMARTER
The Lightspeed Integrated Walls, Floors and Roof Systems are built with advanced software and AI-driven industrial robots, allowing us to seamlessly craft the walls, floors, and roofs, integrating the framing, MEPs, insulation, and drywall in a single, efficient manufacturing line. The company has a team of 11-50 employees and is currently Early Stage.