
ML Inference Jobs (NOW HIRING)

Lead ML Inference Engineer, Advertising

Austin, TX · On-site

$101.60K - $133.80K/yr

Stay at the forefront of advancements in inference frameworks, ML hardware acceleration, and distributed systems, and incorporate innovations where and when they are impactful. We're excited if you ...

ML Inference information

Salary chart: $37.5K · $122.7K (average) · $196.5K

How much do ML inference jobs pay per year?

As of May 29, 2026, the average yearly pay for ML inference jobs in the United States is $122,738, according to ZipRecruiter salary data. Most workers in this role earn between $98,500 and $136,000 per year, depending on experience, location, and employer.

What are the key skills and qualifications needed to thrive in ML Inference, and why are they important?

To thrive in ML Inference, you need a solid background in machine learning principles, programming (Python or C++), and experience with deploying models at scale, often supported by a degree in computer science or a related field. Familiarity with frameworks and tools such as TensorFlow, PyTorch, ONNX, and cloud platforms like AWS SageMaker or Google AI Platform is typically required. Strong problem-solving skills, attention to detail, and effective communication are crucial soft skills for collaborating with multidisciplinary teams and optimizing model performance. These skills ensure efficient, scalable, and reliable deployment of machine learning solutions in real-world applications.

What are some common challenges faced by ML Inference Engineers when deploying models to production?

ML Inference Engineers often encounter challenges such as optimizing model latency and throughput to meet production requirements, ensuring compatibility with diverse hardware environments, and managing model versioning and updates without disrupting service. Additionally, balancing resource utilization and inference accuracy while monitoring real-time performance metrics is crucial. Collaboration with data scientists, DevOps, and software engineers is typically essential to streamline deployment and maintain robust, scalable inference pipelines.
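As a rough illustration of the latency and throughput monitoring mentioned above, the sketch below summarizes a batch of request latencies. The sample values, the nearest-rank percentile method, and the helper names `percentile` and `throughput` are assumptions chosen for the example, not part of any particular serving framework.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]

def throughput(num_requests, elapsed_seconds):
    """Requests served per second over a measurement window."""
    return num_requests / elapsed_seconds

# Hypothetical per-request latencies in milliseconds (illustrative data only).
latencies_ms = [12.0, 15.0, 11.0, 40.0, 13.0, 14.0, 12.5, 90.0, 13.5, 12.2]

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
print(f"p50={p50} ms, p95={p95} ms, throughput={throughput(len(latencies_ms), 2.0)} req/s")
```

Tail percentiles such as p95 are usually tracked alongside the median because a few slow requests can dominate user-facing latency even when the typical request is fast.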

What is ML inference?

ML inference refers to the process of using a trained machine learning model to make predictions or decisions based on new data. After a model has been trained on historical data, inference is the phase where that model is deployed and used in real-world applications, such as recognizing speech, detecting objects in images, or recommending products. The focus in ML inference is on speed, efficiency, and scalability to ensure quick predictions, often in real time. This process is critical for practical applications like mobile apps, web services, and embedded systems. Optimizing inference involves reducing latency, memory usage, and computational requirements.
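To make the idea concrete, here is a minimal sketch of inference with latency measurement. It assumes a tiny logistic-regression model whose weights were already learned elsewhere; the weight values and the function names `predict` and `timed_predict` are hypothetical and stand in for a real trained model and serving stack.

```python
import math
import time

# Hypothetical parameters standing in for a model that was trained elsewhere.
WEIGHTS = [0.8, -0.4]
BIAS = 0.1

def predict(features):
    """Run one inference: a logistic-regression forward pass over fixed weights."""
    z = sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))

def timed_predict(features):
    """Return the prediction and the wall-clock latency in milliseconds."""
    start = time.perf_counter()
    score = predict(features)
    latency_ms = (time.perf_counter() - start) * 1000.0
    return score, latency_ms

if __name__ == "__main__":
    score, latency_ms = timed_predict([1.0, 2.0])
    print(f"score={score:.3f} latency={latency_ms:.4f} ms")
```

No training happens here: the weights stay fixed and only the forward pass runs on new input, which is what distinguishes inference from training.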

What is the difference between ML Inference and Data Scientist roles?

Aspect | ML Inference | Data Scientist
Required Credentials | Knowledge of machine learning models, programming skills | Degree in data science, statistics, or related fields
Work Environment | Deploying models in production, real-time data processing | Data analysis, model development, research
Industry Usage | AI product deployment, software companies | Research institutions, tech firms, consulting

ML Inference focuses on deploying trained models to make predictions on new data, often in real-time. Data Scientists develop and analyze models, working primarily in research and development. While both roles require understanding of machine learning, ML Inference emphasizes deployment and operationalization, whereas Data Scientists focus on model creation and analysis.

More about ML Inference jobs
Infographic: ML Inference job openings in the United States as of May 2026. Employment types: 25% internship, 75% full-time. Work arrangement: 25% in-person, 75% remote. Average salary: $122,738 per year, or $59 per hour.
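
The hourly figure is consistent with dividing the annual average by a standard full-time year. The sketch below assumes 40 hours per week for 52 weeks (2,080 hours); that convention is an assumption for illustration, since the page does not state how the hourly rate was derived.

```python
# Assumption: a full-time year of 40 hours/week for 52 weeks.
HOURS_PER_YEAR = 40 * 52

def hourly_rate(annual_salary, hours_per_year=HOURS_PER_YEAR):
    """Convert an annual salary to an hourly rate."""
    return annual_salary / hours_per_year

print(round(hourly_rate(122_738)))  # 122,738 / 2,080 is about 59.01, which rounds to 59
```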

Senior ML Infrastructure Engineer, Inference Platform

General Motors

Sunnyvale, CA • On-site, Remote

$155.42K - $205.90K/yr

Full-time

Medical, Dental, Vision, Life, Retirement, PTO

Posted 26 days ago


Company rating: 8.1 out of 10

Based on 301 frontline employees who took The Breakroom Quiz

5th of 44 rated automakers


Job description

About the Team:

The ML Inference Platform is part of the AV ML Infrastructure organization. Our team owns the cloud-agnostic, reliable, and cost-efficient platform that powers GM's AI efforts. We're proud to serve teams developing autonomous vehicles (L3/L4/L5), as well as other groups building AI-driven products for GM and its customers. We enable rapid innovation and feature development by optimizing for high-priority, ML-centric use cases. Our platform supports the serving of state-of-the-art (SOTA) machine learning models for experimental, online and bulk inference, with a focus on performance, availability, concurrency, and scalability. We're committed to maximizing GPU utilization across platforms (B200, H100, A100, and more) while maintaining reliability and cost efficiency.

About the Role:

We are seeking a Senior ML Infrastructure Engineer to help build and scale robust platforms for ML inference workflows. In this role, you'll work closely with ML engineers and researchers to ensure efficient model serving and inference in production, for workflows such as data mining, labeling, model distillation, evaluations, simulations, and more. This is a high-impact opportunity to influence the future of AI infrastructure at GM. You will play a key role in shaping the architecture, roadmap, and user experience of a robust ML inference service supporting real-time, batch, and experimental inference needs. The ideal candidate brings experience in designing distributed systems for ML, strong problem-solving skills, and a product mindset focused on platform usability and reliability.

What you'll be doing:

  • Design and implement core platform backend software components.

  • Collaborate with ML engineers and researchers to understand critical workflows, translate them into platform requirements, and deliver incremental value.

  • Lead technical decision-making on model serving strategies, orchestration, caching, model versioning, and auto-scaling mechanisms for highly optimized use of accelerators.

  • Drive the development of monitoring, observability, and metrics to ensure reliability, performance, and resource optimization of inference services.

  • Proactively research and integrate state-of-the-art model serving frameworks, hardware accelerators, and distributed computing techniques.

  • Lead technical initiatives across GM's ML ecosystem.

  • Raise the engineering bar through technical leadership, establishing best practices.

  • Contribute to open source projects; represent GM in relevant communities.

Minimum Requirements

  • 5+ years of industry experience, with a focus on machine learning systems or high-performance backend services.

  • Expertise in Python, C++, or other relevant programming languages.

  • Expertise in ML inference and model serving frameworks (Triton, Ray Serve, vLLM, etc.).

  • Strong communication skills and a proven ability to drive cross-functional initiatives.

  • Ability to thrive in a dynamic, multi-tasking environment with ever-evolving priorities.

Preferred Qualifications

  • Deep expertise building zero-to-one ML infrastructure platforms.

  • Experience working with or designing interfaces, APIs, and clients for ML workflows.

  • Experience with the Ray framework and/or vLLM.

  • Experience with distributed systems and large-scale data processing.

  • Familiarity with telemetry and other feedback loops to inform product improvements.

  • Familiarity with hardware acceleration (GPUs) and optimizations for inference workloads.

Compensation: The compensation information is a good faith estimate only. It is based on what a successful applicant might be paid in accordance with applicable state laws. The compensation may not be representative for positions located outside of New York, Colorado, California, or Washington.

  • The salary range for this role is $155,420 to $205,900. The actual base salary a successful candidate will be offered within this range will vary based on factors relevant to the position.

  • Bonus Potential: An incentive pay program offers payouts based on company performance, job level, and individual performance.

  • Benefits: GM offers a variety of health and wellbeing benefit programs. Benefit options include medical, dental, vision, Health Savings Account, Flexible Spending Accounts, retirement savings plan, sickness and accident benefits, life insurance, paid vacation & holidays, tuition assistance programs, employee assistance program, GM vehicle discounts, and more.

Relocation: This job may be eligible for relocation benefits.

Remote/Hybrid: This role is based remotely, but if you live within a 50-mile radius of Mountain View, you are expected to report to that location three times a week, at minimum.

About GM

Our vision is a world with Zero Crashes, Zero Emissions and Zero Congestion and we embrace the responsibility to lead the change that will make our world better, safer and more equitable for all.

Why Join Us

We believe we all must make a choice every day - individually and collectively - to drive meaningful change through our words, our deeds and our culture. Every day, we want every employee to feel they belong to one General Motors team.

Benefits Overview

From day one, we're looking out for your well-being, at work and at home, so you can focus on realizing your ambitions. Learn how GM supports a rewarding career that rewards you personally by visiting Total Rewards resources.

Non-Discrimination and Equal Employment Opportunities (U.S.)

General Motors is committed to being a workplace that is not only free of unlawful discrimination, but one that genuinely fosters inclusion and belonging. We strongly believe that providing an inclusive workplace creates an environment in which our employees can thrive and develop better products for our customers.

All employment decisions are made on a non-discriminatory basis without regard to sex, race, color, national origin, citizenship status, religion, age, disability, pregnancy or maternity status, sexual orientation, gender identity, status as a veteran or protected veteran, or any other similarly protected status in accordance with federal, state and local laws.

We encourage interested candidates to review the key responsibilities and qualifications for each role and apply for any positions that match their skills and capabilities. Applicants in the recruitment process may be required, where applicable, to successfully complete a role-related assessment(s) and/or a pre-employment screening prior to beginning employment. To learn more, visit How we Hire.

Accommodations

General Motors offers opportunities to all job seekers including individuals with disabilities. If you need a reasonable accommodation to assist with your job search or application for employment, email us or call us at 1-800-865-7580. In your email, please include a description of the specific accommodation you are requesting as well as the job title and requisition number of the position for which you are applying.


About General Motors

Sourced by ZipRecruiter

General Motors is a company with global scale and capabilities, headquartered in Detroit, Michigan, with employees around the world. The company employs over 165,000 people, serves six continents, operates across 22 time zones, and has a diverse workforce speaking 75 languages. GM’s vision is to drive the world forward by pioneering innovations that move and connect people to what matters. The company is working towards an all-electric future with its new Ultium Platform and is pushing transportation options beyond our wildest imaginations with autonomous vehicles. GM is also committed to becoming the most inclusive company in the world.

Industry: Transportation equipment manufacturing

Company size: 10,000+ Employees

Headquarters location: Detroit, MI, US

Year founded: 1908