Preferred : • Internship or project that deployed a microservice or ML inference demo. • Coursework/research with PyTorch or TensorFlow; simple CUDA projects a plus. • Familiarity with Grafana ...
Software Development Engineer - AI/ML, Amazon Neuron, Multimodal Inference
Seattle, WA · On-site
$159K/yr
The Inference Enablement and Acceleration team is at the forefront of running a wide range of models and supporting novel architecture alongside maximizing their performance for AWS's custom ML ...
ML Software Engineer
$171K - $258K/yr
Our team builds ML-inference applications and services on Apple Silicon in the datacenter, specifically focusing in recent years on generative AI as part of the Private Cloud Compute component of ...
Preferred Qualifications: Experience with ML inference hardware acceleration (DSPs, NPUs, ASICs). Familiarity with diverse neural network architectures and training methodologies for efficient edge ...
Software Engineer, Inference AI/ML
Bellevue, WA · On-site
$92K - $135K/yr
Internship or project that deployed a microservice or ML inference demo. * Coursework/research with PyTorch or TensorFlow; simple CUDA projects a plus. * Familiarity with Grafana/Prometheus ...
Senior AI/ML Engineer
Seattle, WA · On-site
$118K - $163K/yr
Implement scalable model serving architectures for real-time and batch inference * Developing ... Partner with AI/ML scientists to productionize models while meeting accuracy, performance ...
Software Engineer II- AI/ML, AWS Neuron
Seattle, WA · On-site
$111K - $151K/yr
This comprehensive toolkit includes an ML compiler, runtime, and application framework that seamlessly integrates with popular ML frameworks like PyTorch and JAX enabling unparalleled ML inference ...
Senior AI/ML Engineer
$119K - $163K/yr
Implement scalable model serving architectures for real-time and batch inference * Developing ... Partner with AI/ML scientists to productionize models while meeting accuracy, performance ...
Senior AI/ML Engineer
Seattle, WA · On-site
$118K - $163K/yr
You will be the technical authority for ML engineering challenges from setting up model training and fine-tuning to architectures and system design for serving AI/ML inference solutions in production.
Software Engineer II- AI/ML, AWS Neuron
$111K - $151K/yr
This comprehensive toolkit includes an ML compiler, runtime, and application framework that seamlessly integrates with popular ML frameworks like PyTorch and JAX enabling unparalleled ML inference ...
This role develops, enables and performance tunes building blocks for all key ML model families, including Llama3, GPT OSS, Qwen3, DeepSeek and beyond. The Neuron Inference Technology team works side ...
Responsibilities : • Build, profile and optimize our training and inference framework • Collaborate with ML teams to accelerate their research and development and enable them to develop the next ...
The Trainium chip delivers industry-leading ML inference and training performance at the lowest cost in the cloud. This is enabled by edge software stack, the AWS Neuron Software Development Kit ...
ML inference salary information (Kent, WA)

| Salary range (per year) | Share of jobs |
|---|---|
| $42.3K - $58.7K | 2% |
| $58.7K - $75K | 3% |
| $75K - $91.3K | 6% |
| $91.3K - $107.6K | 9% |
| $107.6K - $123.9K | 15% |
| $123.9K - $140.2K | 22% |
| $140.2K - $156.6K | 32% |
| $156.6K - $172.9K | 3% |
| $172.9K - $189.2K | 4% |
| $189.2K - $205.5K | 1% |
| $205.5K - $221.8K | 2% |

$112.8K is the 25th percentile. Wages below this are outliers.
The median wage is $134.8K / yr.
$149.2K is the 75th percentile. Wages above this are outliers.
Chart axis labels: $42.3K, $138.6K, $221.8K.
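The bucket shares listed above can be cross-checked against the stated quartiles. The following is a minimal sketch, assuming wages are spread uniformly inside each bucket (the listing does not say how the percentiles were computed); `BUCKETS` and `estimate_percentile` are illustrative names, not part of the source.

```python
# Salary buckets (in $K per year) and share of jobs (in %), copied from the listing.
BUCKETS = [
    (42.3, 58.7, 2),
    (58.7, 75.0, 3),
    (75.0, 91.3, 6),
    (91.3, 107.6, 9),
    (107.6, 123.9, 15),
    (123.9, 140.2, 22),
    (140.2, 156.6, 32),
    (156.6, 172.9, 3),
    (172.9, 189.2, 4),
    (189.2, 205.5, 1),
    (205.5, 221.8, 2),
]


def estimate_percentile(buckets, q):
    """Estimate the q-th percentile (0-100) by linear interpolation.

    Assumption of this sketch: wages are uniformly distributed within
    each bucket. The listing itself does not state this.
    """
    total = sum(share for _, _, share in buckets)  # 99 here, because of rounding
    target = total * q / 100.0
    cumulative = 0.0
    for low, high, share in buckets:
        if share > 0 and cumulative + share >= target:
            # Fraction of this bucket needed to reach the target share.
            fraction = (target - cumulative) / share
            return low + fraction * (high - low)
        cumulative += share
    return buckets[-1][1]


if __name__ == "__main__":
    for q in (25, 50, 75):
        print(f"p{q}: ${estimate_percentile(BUCKETS, q):.1f}K")
```

Under that uniform-spread assumption, the estimates land within roughly $0.2K of the listed 25th percentile ($112.8K), median ($134.8K), and 75th percentile ($149.2K), so the bucket table and the quoted figures appear mutually consistent.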
How much do ML inference jobs pay per year?
What is ML inference?
What is the difference between ML Inference and Data Scientist?
| Aspect | ML Inference | Data Scientist |
|---|---|---|
| Required Credentials | Knowledge of machine learning models, programming skills | Degree in data science, statistics, or related fields |
| Work Environment | Deploying models in production, real-time data processing | Data analysis, model development, research |
| Industry Usage | AI product deployment, software companies | Research institutions, tech firms, consulting |
ML Inference focuses on deploying trained models to make predictions on new data, often in real-time. Data Scientists develop and analyze models, working primarily in research and development. While both roles require understanding of machine learning, ML Inference emphasizes deployment and operationalization, whereas Data Scientists focus on model creation and analysis.
Full-time
This job post has expired today. Applications are no longer accepted.
Job description
CoreWeave is The Essential Cloud for AI™, delivering a platform of technology and tools for innovators to build and scale AI. The Software Engineer will join the Inference team to implement features and fixes for model-serving services, focusing on improving latency and reliability on the GPU platform.
Responsibilities:
• Implement well-scoped features and fixes in Python/Go/C++ for model-serving services (e.g., Triton, vLLM, TensorRT-LLM, Ray Serve).
• Write tests, code comments, and short design docs; participate in code reviews.
• Add basic metrics and dashboards; assist with alarms and runbooks.
• Follow on-call runbooks and learn incident response in a guided rotation.
• Contribute to performance experiments (e.g., request batching, concurrency, caching) with guidance.
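
For readers unfamiliar with the request batching mentioned in the last bullet, here is a minimal sketch of the idea using only the Python standard library. It is not how Triton, vLLM, TensorRT-LLM, or Ray Serve implement batching; `collect_batch`, `fake_model`, and the default limits are hypothetical names and values chosen for illustration.

```python
import queue
import time


def collect_batch(pending, max_batch_size=8, max_wait_s=0.01):
    """Pull up to max_batch_size requests, waiting at most max_wait_s for extras.

    Blocks until the first request arrives, then keeps adding requests
    until the batch is full or the wait budget is spent.
    """
    batch = [pending.get()]  # block until at least one request exists
    deadline = time.monotonic() + max_wait_s
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(pending.get(timeout=remaining))
        except queue.Empty:
            break  # no more requests arrived within the wait budget
    return batch


def fake_model(batch):
    """Hypothetical stand-in for a model: one call handles the whole batch."""
    return [len(text) for text in batch]


if __name__ == "__main__":
    pending = queue.Queue()
    for prompt in ["a", "bb", "ccc", "dddd", "eeeee"]:
        pending.put(prompt)

    while not pending.empty():
        batch = collect_batch(pending, max_batch_size=2)
        print(batch, "->", fake_model(batch))
```

The two knobs trade latency for throughput: a larger `max_batch_size` or `max_wait_s` gives the model more work per call, while smaller values return results sooner for the first request in each batch.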
Qualifications:
Required:
• BS/MS in CS, EE, or related field, or equivalent practical experience.
• Foundations in data structures, algorithms, and networked services.
• Experience with Python or Go (C++ a plus) and Linux fundamentals; Git/CI basics.
• Exposure to containers and Kubernetes (coursework or projects welcome).
• Curiosity about GPU inference concepts (micro-batching, KV cache, streaming).
Preferred:
• Internship or project that deployed a microservice or ML inference demo.
• Coursework/research with PyTorch or TensorFlow; simple CUDA projects a plus.
• Familiarity with Grafana/Prometheus/OpenTelemetry or similar tooling.
Company:
CoreWeave provides cloud infrastructure services designed to support artificial intelligence and high-performance computing workloads. Founded in 2017, the company is headquartered in Livingston, USA, with a team of 1001-5000 employees. The company is currently Late Stage.
About CoreWeave
Sourced by ZipRecruiter
Industry: IT services
Company size: 51 - 200 Employees
Headquarters location: New York, NY, US
Year founded: 2017