Machine Learning Engineer
New York, NY · Hybrid
$145K - $180K/yr
Demonstrated ownership of deep-learning inference optimization in production (quantization, distillation, compilation, kernel/profile-level performance work) for transformer NLP and/or CV models.
Machine Learning Engineer
Manhattan, NY · Hybrid
$145K - $180K/yr
Demonstrated ownership of deep-learning inference optimization in production (quantization, distillation, compilation, kernel/profile-level performance work) for transformer NLP and/or CV models.
Machine Learning Engineer
Manhattan, NY · On-site
$145K - $180K/yr
Demonstrated ownership of deep-learning inference optimization in production (quantization, distillation, compilation, kernel/profile-level performance work) for transformer NLP and/or CV models.
AI / Machine Learning Engineer
Iselin, NJ · On-site
Strong understanding of deep learning architectures for image and text recognition. * Familiarity ... Preferred Qualifications * Experience with model quantization and optimization for mobile ...
Member of Technical Staff (AI Inference Engineer)
New York, NY · On-site
$220K - $485K/yr
INT8/FP8/FP4 quantization, mixed-precision serving. * Profiling and debugging tools: Nsight Compute ... Familiarity with at least one deep learning framework (PyTorch, JAX, TensorFlow). * Understanding ...
Serve at scale via distillation, quantization, streaming/progressive render, and multi-GPU ... Deep expertise in generative models for video. Onsite NYC $200-400k base + equity
Computer Vision/ML Engineer
Brooklyn, NY · On-site
$117K - $138K/yr
The position We are looking for our lead deep learning engineer to spearhead the development of our ... Optimize models for embedded deployment using quantization, pruning, TensorRT, and NVIDIA Triton
Computer Vision/ML Engineer
New York, NY · On-site
$122K - $143K/yr
The position We are looking for our lead deep learning engineer to spearhead the development of our ... Optimize models for embedded deployment using quantization, pruning, TensorRT, and NVIDIA Triton
AI Solutions Architect
New York, NY · Remote
Optimize model inference for production environments using quantization, pruning, and hardware ... Expertise in Python and deep learning frameworks (PyTorch, TensorFlow, Hugging Face). * Hands-on ...
Lead Machine Learning Engineer-MLOps
Manhattan, NY · On-site
$112K - $148K/yr
Implement quantization techniques and deploy large language models (LLMs) to maximize efficiency ... Deep knowledge and passion for data science fundamentals, training and deploying models
Machine Learning Operations Engineer - Remote
Jersey City, NJ · On-site +1
$76K - $102K/yr
Apply techniques like quantization, distillation, and pruning to optimize LLM models for efficient ... Deep understanding of LLM architectures (e.g., Transformers), training techniques, and inference ...
... quantization, compression, and resource-efficient AI, to drive performance improvements and ... Research experience in natural language processing, large language modeling, deep learning ...
Research Engineer, Video Generation
New York, NY · On-site
$175K - $275K/yr
Implement techniques such as distillation, quantization, and pruning to aggressively accelerate ... Strong experience in deep learning systems and infrastructure * Expertise in PyTorch, CUDA, Triton ...
Deep Learning Quantization information
What is the difference between a Deep Learning Quantization role and a Machine Learning Engineer role?
| Aspect | Deep Learning Quantization | Machine Learning Engineer |
|---|---|---|
| Required Credentials | Advanced degrees in AI, Computer Science, or related fields; knowledge of neural networks | Bachelor's or Master's in CS, Data Science, or related fields; programming skills |
| Work Environment | Research labs, AI development teams, hardware optimization settings | Software development teams, data-driven projects, product-focused environments |
| Industry Usage | AI hardware optimization, model deployment, edge computing | Model development, data analysis, software solutions across industries |
Deep Learning Quantization focuses on reducing model size and improving inference speed through techniques like weight and activation quantization, often in hardware or embedded systems. Machine Learning Engineers develop, implement, and optimize machine learning models for various applications. While both roles require knowledge of AI and programming, Deep Learning Quantization is more specialized in model optimization techniques, whereas Machine Learning Engineers work broadly on model development and deployment.
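As a rough illustration of the weight quantization mentioned above, the sketch below maps floating-point weights to 8-bit integer codes with a single shared scale (symmetric, per-tensor quantization) and then maps them back. The function names `quantize_int8` and `dequantize` and the sample weights are hypothetical, chosen only for this example; production toolchains typically add calibration, per-channel scales, and activation quantization.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to integer codes in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # Map the largest magnitude to 127; use 1.0 for an all-zero tensor
    # so that the division below is always defined.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the int8 range used here.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights, for illustration only.
weights = [0.52, -1.27, 0.03, 0.9]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# The round-trip error per weight is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

Each weight is stored in one byte instead of four, which is where the model-size reduction comes from; the price is a rounding error of at most half the scale per weight.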

Other
Medical, Retirement
Posted 3 days ago
JPMorgan Chase & Co. rating
8.1
Based on 468 frontline employees who took The Breakroom Quiz
46th of 141 rated banks
Job description
Be an integral part of an agile team that's constantly pushing the envelope to enhance, build, and deliver top-notch technology products.
As a Lead Software Engineer at JPMorgan Chase within the Commercial and Investment Banking team, you will play a pivotal role in an agile team, enhancing and delivering secure, stable, and scalable technology products. As a core technical contributor, you will drive critical technology solutions across multiple technical areas, supporting the firm's business objectives.
Job Responsibilities
- Lead the design, development, and troubleshooting of software solutions, applying innovative approaches to complex technical challenges.
- Write secure, high-quality production code and maintain algorithms integrated with firm systems.
- Produce architecture and design artifacts for advanced applications, ensuring compliance with design constraints.
- Analyze and visualize large, diverse data sets to improve software applications and systems.
- Identify hidden problems and patterns in data, and use those insights to improve code quality and system architecture.
- Collaborate with software engineering communities to explore and adopt emerging technologies.
- Guide system design and architecture discussions, focusing on reliability and scalability.
- Optimize deep learning models for production inference, including quantization and batching.
- Deploy and manage GPU workloads in Kubernetes environments.
- Build scalable, low-latency systems using web services and APIs.
- Partner with product and program management teams to deliver business-driven solutions.
Required qualifications, capabilities, and skills
- Formal training or certification in software engineering concepts and 5+ years of applied experience.
- Professional software development experience, with emphasis on ML systems.
- Strong proficiency in Python and experience with ML frameworks (TensorFlow, PyTorch, or similar).
- Experience with cloud technologies (Docker, Kubernetes, EKS) and public clouds (AWS, GCP).
- Hands-on experience with ML model serving frameworks (TorchServe, TensorFlow Serving, Triton Inference Server).
- Experience deploying and managing GPU workloads in Kubernetes.
- Familiarity with scalable, low-latency systems based on web services and APIs.
- Experience with NoSQL databases (Cassandra or equivalent) for high-throughput data access.
- Understanding of GPU resource management and cost optimization.
- Experience with modern microservices architecture.
- Ability to lead the design of large-scale systems and evaluate tradeoffs.
Preferred qualifications, capabilities, and skills
- MS/PhD in Computer Science, Machine Learning, or a related field.
- Proficiency in Java, Python, Scala, or C++.
- Experience with graph neural networks and graph processing frameworks (DGL, PyTorch Geometric, NetworkX).
- Knowledge of GPU programming (CUDA) and performance optimization.
- Experience with model monitoring, A/B testing, and ML observability tools.
- Familiarity with MLOps tools and practices (MLflow, Kubeflow, SageMaker).
- Experience serving large-scale models and optimizing for performance.
FEDERAL DEPOSIT INSURANCE ACT:
This position is subject to Section 19 of the Federal Deposit Insurance Act. As such, an employment offer for this position is contingent on JPMorgan Chase's review of criminal conviction history, including pretrial diversions or program entries.
About Us
JPMorganChase, one of the oldest financial institutions, offers innovative financial solutions to millions of consumers, small businesses and many of the world's most prominent corporate, institutional and government clients under the J.P. Morgan and Chase brands. Our history spans over 200 years and today we are a leader in investment banking, consumer and small business banking, commercial banking, financial transaction processing and asset management.
We offer a competitive total rewards package including base salary determined based on the role, experience, skill set and location. Those in eligible roles may receive commission-based pay and/or discretionary incentive compensation, paid in the form of cash and/or forfeitable equity, awarded in recognition of individual achievements and contributions. We also offer a range of benefits and programs to meet employee needs, based on eligibility. These benefits include comprehensive health care coverage, on-site health and wellness centers, a retirement savings plan, backup childcare, tuition reimbursement, mental health support, financial coaching and more. Additional details about total compensation and benefits will be provided during the hiring process.
We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. We also make reasonable accommodations for applicants' and employees' religious practices and beliefs, as well as mental health or physical disability needs. Visit our FAQs for more information about requesting an accommodation.
JPMorgan Chase & Co. is an Equal Opportunity Employer, including Disability/Veterans
About the Team
J.P. Morgan's Commercial & Investment Bank is a global leader across banking, markets, securities services and payments. Corporations, governments and institutions throughout the world entrust us with their business in more than 100 countries. The Commercial & Investment Bank provides strategic advice, raises capital, manages risk and extends liquidity in markets around the world.
About JPMorgan Chase & Co
Sourced by ZipRecruiter
Industry
Finance and insurance and banking and credit intermediation
Company size
10,000+ Employees
Headquarters location
New York, NY, US