Deep Learning Quantization Jobs in Secaucus, NJ (NOW HIRING)

Optimize model inference for production environments using quantization, pruning, and hardware ... Expertise in Python and deep learning frameworks (PyTorch, TensorFlow, Hugging Face). * Hands-on ...

Computer Vision/ML Engineer

Manhattan, NY

$122.90K - $145K/yr

The position We are looking for our lead deep learning engineer to spearhead the development of our ... Optimize models for embedded deployment using quantization, pruning, TensorRT, and NVIDIA Triton

Computer Vision/ML Engineer

New York, NY · On-site

$122K - $143.90K/yr

The position We are looking for our lead deep learning engineer to spearhead the development of our ... Optimize models for embedded deployment using quantization, pruning, TensorRT, and NVIDIA Triton

Computer Vision/ML Engineer

Brooklyn, NY

$117.80K - $138.90K/yr

The Position We are looking for our lead deep learning engineer to spearhead the development of our ... Optimize models for embedded deployment using quantization, pruning, TensorRT, and NVIDIA Triton

Computer Vision/ML Engineer

New York, NY · On-site

$122K - $143.90K/yr

The position We are looking for our lead deep learning engineer to spearhead the development of our ... Optimize models for embedded deployment using quantization, pruning, TensorRT, and NVIDIA Triton

Implement techniques such as distillation, quantization, and pruning to aggressively accelerate ... Strong experience in deep learning systems and infrastructure * Expertise in PyTorch, CUDA, Triton ...

... quantization, compression, and resource-efficient AI, to drive performance improvements and ... Research experience in machine learning, deep learning, natural language processing, and/or ...

... quantization, compression, and resource-efficient AI, to drive performance improvements and ... Research experience in deep learning, reinforcement learning, natural language processing, computer ...

Deep Learning Quantization information

Salary chart values for Secaucus, NJ: $11.2K, $85.3K, $142.3K

How much do deep learning quantization jobs pay per year?

As of Jun 1, 2026, the average yearly pay for deep learning quantization in Secaucus, NJ is $85,285.00, according to ZipRecruiter salary data. Most workers in this role earn between $73,200.00 and $141,300.00 per year, depending on experience, location, and employer.

What are the key skills and qualifications needed to thrive as a Deep Learning Quantization Engineer, and why are they important?

To excel as a Deep Learning Quantization Engineer, you need a strong background in machine learning, applied mathematics, and computer science, usually supported by an advanced degree in a related field. Familiarity with deep learning frameworks (such as TensorFlow or PyTorch), quantization toolkits, and hardware acceleration platforms is crucial. Analytical thinking, problem-solving, and clear technical communication are standout soft skills in this role. These abilities are essential for efficiently optimizing models for deployment on resource-constrained hardware while maintaining accuracy and performance.

What are some common challenges faced when implementing deep learning quantization in production environments?

One of the main challenges in implementing deep learning quantization is balancing model accuracy with computational efficiency, as quantization can sometimes lead to a drop in model performance. Additionally, ensuring hardware compatibility and optimizing for different devices (such as CPUs, GPUs, or edge devices) can require extensive testing and tuning. Collaboration with data scientists, software engineers, and hardware specialists is often essential to successfully deploy quantized models at scale. Staying updated with the latest quantization techniques and frameworks is also important for overcoming these challenges.
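The accuracy-versus-efficiency trade-off described above can be illustrated with a small experiment. The following is a minimal sketch, not a production workflow: it assumes symmetric per-tensor quantization, random stand-in weights and inputs, and hypothetical helper names (`fake_quantize`, `relative_output_error`) chosen only for this example. It measures how far a layer's output drifts as the bit width shrinks.

```python
import numpy as np

def fake_quantize(w: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric per-tensor quantization followed by dequantization."""
    qmax = 2 ** (bits - 1) - 1            # e.g. 127 for 8 bits, 7 for 4 bits
    scale = np.abs(w).max() / qmax        # one scale for the whole tensor
    if scale == 0:
        return w.copy()                   # all-zero tensor: nothing to quantize
    # Round to the integer grid, clamp to the representable range, map back to floats.
    return np.clip(np.round(w / scale), -qmax, qmax) * scale

def relative_output_error(bits: int, seed: int = 0) -> float:
    """Relative error of a matrix product when the weights are quantized."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=(64, 64))         # hypothetical layer weights
    x = rng.normal(size=(64, 16))         # hypothetical input batch
    ref = w @ x                           # full-precision output
    approx = fake_quantize(w, bits) @ x   # output with quantized weights
    return float(np.linalg.norm(approx - ref) / np.linalg.norm(ref))

# Fewer bits mean a coarser grid and a larger output error.
errors = {bits: relative_output_error(bits) for bits in (8, 4, 2)}
for bits, err in errors.items():
    print(f"{bits}-bit weights: relative output error {err:.4f}")
```

On a real model the same comparison would be run against a validation metric on the target hardware rather than on random matrices.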

What is deep learning quantization?

Deep learning quantization is the process of reducing the precision of the numbers used to represent a neural network's parameters, activations, or both. By converting the commonly used 32-bit floating-point values to lower bit-width formats, such as 16-bit floating point or 8-bit integers, quantization reduces the memory footprint and computational requirements of deep learning models. This helps deploy models efficiently on edge devices and mobile hardware while maintaining acceptable accuracy. Quantization is widely used in model optimization for faster inference and lower power consumption.
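As a rough illustration of the conversion described above, the sketch below maps a float32 array to int8 with an affine (scale and zero-point) scheme and back again. It is one common formulation, written from scratch in NumPy under stated assumptions: the function names `quantize_int8` and `dequantize` are hypothetical, and the weights are random stand-ins, not taken from any specific toolkit or model.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Affine (asymmetric) quantization of a float32 array to int8."""
    x_min, x_max = float(x.min()), float(x.max())
    # Widen the range to include zero so that 0.0 stays representable.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    # 255 steps span the int8 range [-128, 127]; fall back to 1.0 for all-zero input.
    scale = (x_max - x_min) / 255.0 or 1.0
    zero_point = int(round(-128 - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Map int8 codes back to approximate float32 values."""
    return (q.astype(np.float32) - zero_point) * scale

rng = np.random.default_rng(0)
weights = rng.normal(size=1000).astype(np.float32)   # stand-in for model weights

q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)

print("bytes before:", weights.nbytes, "bytes after:", q.nbytes)
print("max abs error:", float(np.max(np.abs(restored - weights))))
```

The int8 array takes a quarter of the memory of the float32 original, and the reconstruction error stays on the order of one quantization step (`scale`).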

What is the difference between a Deep Learning Quantization role and a Machine Learning Engineer role?

Aspect | Deep Learning Quantization | Machine Learning Engineer
Required Credentials | Advanced degrees in AI, Computer Science, or related fields; knowledge of neural networks | Bachelor's or Master's in CS, Data Science, or related fields; programming skills
Work Environment | Research labs, AI development teams, hardware optimization settings | Software development teams, data-driven projects, product-focused environments
Industry Usage | AI hardware optimization, model deployment, edge computing | Model development, data analysis, software solutions across industries

Deep Learning Quantization focuses on reducing model size and improving inference speed through techniques like weight and activation quantization, often in hardware or embedded systems. Machine Learning Engineers develop, implement, and optimize machine learning models for various applications. While both roles require knowledge of AI and programming, Deep Learning Quantization is more specialized in model optimization techniques, whereas Machine Learning Engineers work broadly on model development and deployment.

Computer Vision/ML Engineer (New York)

Norbert Health

Manhattan, NY • On-site

$122.90K - $145K/yr

Full-time

This job post has expired today. Applications are no longer accepted.


Job description

The company

Norbert is building autonomous robots that deliver healthcare.

Our AI sensing platform mounts on mobile robots and does the work of a care team member: rounding on patients, capturing vitals without contact (FDA-cleared for pulse and respiratory rate, more in the pipeline), running assessments, documenting to the EMR, and escalating when something's wrong. Autonomously.

We're not building demos. We're deployed in real facilities today, monitoring hundreds of patients daily. We're solving one of healthcare's hardest problems: a global nursing shortage that will hit 40% by 2030.

We're a small, international team backed by top-tier VCs, with offices in Brooklyn and Paris. We ship things that matter.

The position

We are looking for our lead deep learning engineer to spearhead the development of our groundbreaking sensing technology.

What you will do:
  • Design, fine-tune, and deploy computer vision models (YOLO, InsightFace, MediaPipe, facial landmark detection, object tracking, pose estimation) for real-time inference on the edge
  • Optimize models for embedded deployment using quantization, pruning, TensorRT, and NVIDIA Triton
  • Build and maintain MLOps pipelines for model training, validation, and performance monitoring
  • Develop video processing pipelines that integrate with both classical signal processing and ML-based vital sign extraction
  • Establish engineering best practices and help reduce technical debt as we scale
  • Contribute to the architecture and implementation of the computer vision stack from research to production
What we look for:
  • Master's or PhD degree in Machine learning / Computer vision
  • Strong fundamentals: data structures, CV algorithms, and systems programming
  • Strong C++ skills - this is critical for our edge deployment pipeline
  • Solid Python proficiency for ML experimentation and tooling
  • Ability to work independently, solve complex problems, and drive projects to completion
  • 5+ years of experience deploying computer vision models to production, ideally on resource-constrained devices
  • Experience with PyTorch and model optimization for edge AI
  • Proven ability to take models from research to production on embedded hardware

Nice to haves:

  • Experience with NVIDIA Jetson platform, TensorRT, or Triton Inference Server
  • MLOps experience (experiment tracking, model versioning, performance monitoring)
  • Experience with sensor fusion (RGB, IR, depth cameras)
  • Background in medical devices, regulated environments, or healthcare applications
  • Experience working in fast-moving early-stage environments
What we offer:
  • Real impact: your code provides care for patients today
  • High autonomy and technical ownership - you'll shape our computer vision architecture
  • Work at the intersection of cutting-edge AI, edge computing, and healthcare
  • A talented, excellent, diverse and international team
  • Cutting-edge stack: embedded AI, robotics, LLMs, multimodal sensing
  • Talented, international team tackling meaningful problems in remote patient monitoring
  • Competitive salary and equity
  • Transparent, mission-driven culture focused on continuous learning