Deep Learning Quantization Jobs in New York (NOW HIRING)

JP Morgan Chase

Lead Machine Learning Engineer-MLOps

Manhattan, NY

$112K - $148K/yr

Implement quantization techniques and deploy large language models (LLMs) to maximize efficiency ... Deep knowledge and passion for data science fundamentals, training and deploying models

NAVA Software Solutions

Machine Learning Operations Engineer - Remote

Jersey City, NJ · On-site +1

$76K - $102K/yr

Apply techniques like quantization, distillation, and pruning to optimize LLM models for efficient ... Deep understanding of LLM architectures (e.g., Transformers), training techniques, and inference ...

Paramount

Senior Machine Learning Engineer, Content Engineering

Manhattan, NY · Hybrid

$115K - $157K/yr

Tune vector quantization strategies (PQ, SQ, Binary Quantization) to reduce memory footprint and ... Deep knowledge of vector embedding generation, storage and retrieval, with preference for hands-on ...

Paramount

Senior Machine Learning Engineer, Content Engineering

Manhattan, NY · On-site

$115K - $157K/yr

Point72

Research Engineer, Knowledge Graph Intelligence

New York, NY · On-site

$175K - $250K/yr

Prior experience in the domains of LLMs, foundation models, or large-scale deep learning systems, with a complete understanding of modern training, fine-tuning, quantization, and model evaluation.

Point72

Research Engineer, Knowledge Graph Intelligence

New York, NY

$175K - $250K/yr

Mirage

ML Engineer, Generative Video

New York, NY · On-site

$175K - $275K/yr

Implement techniques such as distillation, quantization, and pruning to aggressively accelerate ... Strong experience in deep learning systems and infrastructure * Expertise in PyTorch, CUDA, Triton ...

Mirage

ML Engineer, Generative Video

New York, NY · On-site

$175K - $275K/yr

Meta

AI Research Scientist, CoreML - Monetization AI

New York, NY

$154K/yr

... quantization, compression, and resource-efficient AI, to drive performance improvements and ... Research experience in natural language processing, large language modeling, deep learning ...

Meta

Research Engineer, Monetization AI

New York, NY

$183K/yr

... quantization, compression, and resource-efficient AI, to drive performance improvements and ... Research experience in machine learning, deep learning, natural language processing, and/or ...

Baseten

Software Engineer, Model Performance Systems

New York, NY · On-site

$160K - $200K/yr

You are excited to learn about (or already play with) quantization, speculative decoding ... Deep Learning (Literally): You will gain world-class expertise in GPU orchestration and LLM ...

Baseten

Software Engineer - Model Performance

New York, NY · On-site

$180K - $360K/yr

Deep dive into underlying codebases of TensorRT, PyTorch, TensorRT-LLM, vllm, sglang, CUDA, and ... Familiarity with LLM optimization techniques (e.g., quantization, speculative decoding, continuous ...

NYU Langone Health

Sr. AI Engineer

Manhattan, NY · Remote

$97K - $140K/yr

Optimize inference performance and cost efficiency through techniques such as model quantization ... learning, and deep learning 5. Experience with AI platforms like PyTorch or TensorFlow 6. ...

Goldman Sachs, Inc.

Compliance, New York, Vice President, AI/ML Engineer

New York, NY

$196K - $253K/yr

Strong foundation in machine learning algorithms, including deep learning architectures (e.g ... quantization, pruning, and knowledge distillation. * Experience with model interpretability ...

NYULMC

Sr. AI Engineer

New York, NY · On-site

$114K - $157K/yr

Goldman Sachs

Compliance, New York, Vice President, AI/ML Engineer

New York, NY · On-site

$196K - $253K/yr

JPMorgan Chase & Co.

AI Agents Applied Research/Engineering Lead - Vice President

Manhattan, NY · On-site

$164K - $260K/yr

... quantization to meet production constraints such as latency, memory, and cost. * Apply ... Strong foundation in ML, deep learning, statistical modeling, and experimental design. * Experience ...

NYU Langone Health

Sr. AI Engineer

Manhattan, NY · Remote

$114K - $157K/yr

JPMorgan Chase & Co

AI Agents Applied Research/Engineering Lead - Vice President

Manhattan, NY

JP Morgan Chase

AI Agents Applied Research/Engineering Lead - Vice President

Manhattan, NY

Deep Learning Quantization information

What are the key skills and qualifications needed to thrive as a Deep Learning Quantization Engineer, and why are they important?

To excel as a Deep Learning Quantization Engineer, you need a strong background in machine learning, applied mathematics, and computer science, usually supported by an advanced degree in a related field. Familiarity with deep learning frameworks (such as TensorFlow or PyTorch), quantization toolkits, and hardware acceleration platforms is crucial. Analytical thinking, problem-solving, and clear technical communication are standout soft skills in this role. These abilities are essential for efficiently optimizing models for deployment on resource-constrained hardware while maintaining accuracy and performance.

What is the difference between a Deep Learning Quantization role and a Machine Learning Engineer role?

Aspect	Deep Learning Quantization	Machine Learning Engineer
Required Credentials	Advanced degrees in AI, Computer Science, or related fields; knowledge of neural networks	Bachelor's or Master's in CS, Data Science, or related fields; programming skills
Work Environment	Research labs, AI development teams, hardware optimization settings	Software development teams, data-driven projects, product-focused environments
Industry Usage	AI hardware optimization, model deployment, edge computing	Model development, data analysis, software solutions across industries

Deep Learning Quantization focuses on reducing model size and improving inference speed through techniques like weight and activation quantization, often in hardware or embedded systems. Machine Learning Engineers develop, implement, and optimize machine learning models for various applications. While both roles require knowledge of AI and programming, Deep Learning Quantization is more specialized in model optimization techniques, whereas Machine Learning Engineers work broadly on model development and deployment.

What is deep learning quantization?

Deep learning quantization is the process of reducing the precision of the numbers used to represent a neural network's parameters, activations, or both. By converting the typically used 32-bit floating-point values to lower bit-width formats such as 16-bit floating point or 8-bit integers, quantization reduces the memory footprint and computational requirements of deep learning models; an 8-bit value, for example, takes a quarter of the storage of a 32-bit one. This technique helps deploy models efficiently on edge devices and mobile hardware while maintaining acceptable accuracy levels. Quantization is widely used in model optimization for faster inference and lower power consumption.
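
As a rough illustration of the idea described above, the following pure-Python sketch maps a handful of floating-point values to 8-bit integers and back using a scale and a zero point (often called affine or asymmetric quantization). The function names, the sample weights, and the choice of an unsigned 8-bit range are assumptions made for this example; real toolkits add per-channel scales, calibration data, and hardware-specific kernels.

```python
# Minimal sketch of 8-bit affine quantization. All names are illustrative
# and are not taken from any particular library.

def quantize(values, num_bits=8):
    """Map floats to unsigned integers in [0, 2**num_bits - 1]."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Fall back to 1.0 when all inputs are zero, to avoid dividing by zero.
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    # Round to the nearest integer step, then clamp into the valid range.
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate floats from the quantized integers."""
    return [(x - zero_point) * scale for x in q]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]          # hypothetical sample values
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

# Storage per value: 4 bytes for 32-bit floats, 1 byte for 8-bit integers.
bytes_fp32 = len(weights) * 4
bytes_int8 = len(weights) * 1

print("quantized:", q)
print("max round-trip error:", max_error)
print("bytes (fp32 vs int8):", bytes_fp32, "vs", bytes_int8)
```

The round-trip error stays within one quantization step (`scale`), which is the precision given up in exchange for the fourfold reduction in storage.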

What are some common challenges faced when implementing deep learning quantization in production environments?

One of the main challenges in implementing deep learning quantization is balancing model accuracy with computational efficiency, as quantization can sometimes lead to a drop in model performance. Additionally, ensuring hardware compatibility and optimizing for different devices (such as CPUs, GPUs, or edge devices) can require extensive testing and tuning. Collaboration with data scientists, software engineers, and hardware specialists is often essential to successfully deploy quantized models at scale. Staying updated with the latest quantization techniques and frameworks is also important for overcoming these challenges.
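
The accuracy trade-off mentioned above can be made concrete with a small experiment. The sketch below applies symmetric quantization at several bit widths to randomly generated values and reports the mean round-trip error. The helper name `roundtrip_error`, the Gaussian sample data, and the chosen bit widths are assumptions for illustration only; the error on synthetic values is a proxy and does not by itself predict the accuracy drop of a real model.

```python
import random


def roundtrip_error(values, num_bits):
    """Mean absolute error after symmetric quantization to num_bits and back."""
    qmax = 2 ** (num_bits - 1) - 1               # e.g. 127 for 8 bits
    scale = max(abs(v) for v in values) / qmax   # assumes at least one nonzero value
    total = 0.0
    for v in values:
        # Round to the nearest step and clamp to the signed range.
        q = max(-qmax, min(qmax, round(v / scale)))
        total += abs(v - q * scale)
    return total / len(values)


random.seed(0)  # fixed seed so repeated runs produce the same sample
weights = [random.gauss(0.0, 1.0) for _ in range(1000)]  # hypothetical weights

errors_by_bits = {bits: roundtrip_error(weights, bits) for bits in (8, 4, 2)}
for bits, err in errors_by_bits.items():
    print(f"{bits}-bit mean abs error: {err:.4f}")
```

Fewer bits give larger errors, which is why lower-precision deployments typically need per-device testing and tuning before release.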

Deep Learning Quantization jobs near you

Infographic showing various Deep Learning Quantization job openings in New York as of July 2026, with employment types broken down into 74% Full Time, 24% Part Time, and 2% Contract. Highlights a 76% Physical, 2% Hybrid, and 22% Remote job distribution.

Lead Machine Learning Engineer-MLOps

JP Morgan Chase

Manhattan, NY

$112K - $148K/yr

Full-time

Medical, Retirement

Re-posted 2 days ago

JPMorgan Chase & Co. rating

8.0

Based on 491 frontline employees who took The Breakroom Quiz

58th of 149 rated banks

Job description

We are looking for a Senior MLOps engineer to work closely with Data Scientists to build and deploy ML models on a modern MLOps stack.

As Lead Machine Learning Engineer on the Recommendation Engine team, you'll build and maintain pipelines for distributed model training on large compute clusters, batch/real-time model serving, hyperparameter tuning at scale, model monitoring, production validation and other activities vital for model development, testing and deployment in a well-managed, controlled environment.

Our product, Personalization and Insights, builds and supports high throughput, low latency applications which leverage state of the art machine learning architectures, and which are deployed in AWS. These applications power personalized experiences across Chase Consumer & Community Banking channels, to help weave a user experience that includes traditional banking services with other services in the Travel, Merchant Offer Shopping, and Dining spaces.

Job responsibilities

Build, deploy, and maintain robust pipelines for distributed training on GPU-enabled clusters to support scalable machine learning workflows.

Develop and manage pipelines for high-throughput, real-time inference as well as batch inference, ensuring optimal performance and reliability.

Implement quantization techniques and deploy large language models (LLMs) to maximize efficiency and resource utilization.

Oversee the management and optimization of vector databases to support advanced AI and machine learning applications.

Establish and maintain comprehensive monitoring and observability pipelines to ensure system health, performance, and rapid issue resolution.

Collaborate with cross-functional teams to integrate new technologies and continuously improve existing infrastructure.

Partner with product, architecture, and other engineering teams to define scalable and performant technical solutions.

Required qualifications, capabilities, and skills

BS in Computer Science or a related Engineering field with 6+ years of experience, or MS in Computer Science or a related Engineering field with 4+ years of experience.

Solid knowledge and extensive experience in Python and in cloud computing, preferably AWS
Understanding of quantization techniques such as PTQ (post-training quantization), AWQ (activation-aware weight quantization), etc., used to quantize LLMs to accelerate inference on specific GPU architectures
Experience in systems engineering fundamentals: caching, CUDA, autoscaling, high throughput, low latency, x-region resilient applications
Deep knowledge and passion for data science fundamentals, training and deploying models

Experience in monitoring and observability tools to monitor model input/output and features stats

Operational experience with big data/ML tools such as Ray, DuckDB, and Spark, and with training/inference systems such as Ray and vLLM/SGLang

Solid grounding in engineering fundamentals and analytical mindset

Preferred qualifications, capabilities, and skills

Experience with recommendation and personalization systems is a plus.
CUDA experience is a big plus

Solid fundamentals and experience in containers (Docker ecosystem), container orchestration systems (Kubernetes, ECS), and DAG orchestration (Airflow, Kubeflow, etc.)

Good knowledge of Databases

JPMorganChase, one of the oldest financial institutions, offers innovative financial solutions to millions of consumers, small businesses and many of the world's most prominent corporate, institutional and government clients under the J.P. Morgan and Chase brands. Our history spans over 200 years and today we are a leader in investment banking, consumer and small business banking, commercial banking, financial transaction processing and asset management.

We offer a competitive total rewards package including base salary determined based on the role, experience, skill set and location. Those in eligible roles may receive commission-based pay and/or discretionary incentive compensation, paid in the form of cash and/or forfeitable equity, awarded in recognition of individual achievements and contributions. We also offer a range of benefits and programs to meet employee needs, based on eligibility. These benefits include comprehensive health care coverage, on-site health and wellness centers, a retirement savings plan, backup childcare, tuition reimbursement, mental health support, financial coaching and more. Additional details about total compensation and benefits will be provided during the hiring process.

We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. We also make reasonable accommodations for applicants' and employees' religious practices and beliefs, as well as mental health or physical disability needs. Visit our FAQs for more information about requesting an accommodation.

JPMorgan Chase & Co. is an Equal Opportunity Employer, including Disability/Veterans

J.P. Morgan's Commercial & Investment Bank is a global leader across banking, markets, securities services and payments. Corporations, governments and institutions throughout the world entrust us with their business in more than 100 countries. The Commercial & Investment Bank provides strategic advice, raises capital, manages risk and extends liquidity in markets around the world.

About JPMorgan Chase & Co

Sourced by ZipRecruiter

Industry

Finance and insurance and banking and credit intermediation

Company size

10,000+ Employees

Headquarters location

New York, NY, US

Website

jpmorganchase.com

What JPMorgan Chase & Co. employees say

Pay

Only some people get paid breaks

Most people get paid when they’re sick

The job rarely spills into unpaid time

Benefits

Sick days don’t use up paid time off

Most people say they can afford the health insurance

Most people get paid time off

Hours and flexibility

Less than 4 weeks notice of work schedule

Most people don’t worry about their hours

Only some people can choose their shifts

Workplace

Most people feel treated with respect

Most people get breaks without interruption

Most people are stressed out
