CUDA Machine Learning Performance Engineer Jobs

Nvidia Corporation

Senior HPC Performance Engineer - AI for Science at Scale

Santa Clara, CA · On-site

$143.90K - $189.70K/yr

We are seeking a Sr. HPC Performance engineer to join our team of scientists and engineers ... CUDA or OAI Triton * Fluent in modern machine learning frameworks such as PyTorch, JAX, Warp

Nvidia

Senior HPC Performance Engineer - AI for Science at Scale

Santa Clara, CA

$143.90K - $189.70K/yr

Motional

Machine Learning Systems Engineer

Las Vegas, NV · On-site +1

We are looking for a Machine Learning Systems Engineer to join our ML Acceleration team. In this ... Design and maintain high-performance GPU kernels in Triton or CUDA for state-of-the-art ML ...

Anthropic

Performance Engineer

San Francisco, CA

Running machine learning (ML) algorithms at our scale often requires solving novel systems problems. As a Performance Engineer, you'll be responsible for identifying these problems, and then ...

Susquehanna International Group, LLP

GPU Performance Engineer | Experienced Hire

Philadelphia, PA · On-site

Overview We are looking for a GPU Performance Engineer to build highly optimized CUDA kernels for ... To meet the unique challenges of global markets, Susquehanna applies machine learning and advanced ...

Susquehanna International Group, LLP

GPU Performance Engineer | Experienced Hire

New York, NY · On-site

Nvidia

Senior Deep Learning Kernel Software Performance Architect

Santa Clara, CA

$152.10K - $206.70K/yr

... accelerate machine learning, data analytics and high-performance computing applications. This ... CUDA Compiler teams to identify performance issues. * AI/ML training and inference performance ...

Nvidia

Senior Deep Learning Performance Architect - LPU

OR · Hybrid

Strong mathematical foundation in machine learning and deep learning ... Expert programming skills in C, C++, and/or Python * Familiarity with GPU computing (CUDA or ...

Nvidia Corporation

Senior Deep Learning Kernel Software Performance Architect

Santa Clara, CA · On-site

$152.10K - $206.70K/yr

NVIDIA

Senior Deep Learning Kernel Software Performance Architect

Santa Clara, CA

$150.90K - $205.10K/yr

SynergisticIT

AI and Machine Learning (ML) Performance Engineer

Pittsburgh, PA

$130.80K/yr

Machine Learning And Artificial Intelligence Developer You will be responsible for Machine Learning ... Investigate and optimize models performance. * Solving complex problems with multi-layered data ...

Wayve

Staff ML Performance Engineer (Training Efficiency)

Sunnyvale, CA · On-site

... MS in Machine Learning, Computer Science, Engineering, or a related technical discipline or ... Experience implementing GPU kernels (CUDA, Triton, etc). * Knowledge of computing fundamentals ...

Venturefizz Product Management Community

Machine Learning Systems Engineer

Boston, MA · On-site +1

$144K - $192K/yr

Machine Learning Systems Engineer Boston, MA We are looking for a Machine Learning Systems Engineer ... Design and maintain high-performance GPU kernels in Triton or CUDA for state-of-the-art ML ...

Apple

Machine Learning Video Processing Engineer

Cupertino, CA · On-site

$147.40K - $272.10K/yr

Experience with performance (power and speed) optimization: GPGPU SIMD programming. Knowledge of ... Experience with GPU APIs preferably Metal, CUDA, OpenGL, and/or OpenCL. Excellent written and oral ...

TRM

Machine Learning Infrastructure Engineer

San Francisco, CA · On-site

You will work at the intersection of distributed systems, cloud infrastructure, GPU performance ... CUDA familiarity and experience debugging GPU-related issues is a plus. * Adaptable. Goals can ...

Nvidia

Senior Performance Engineer - Deep Learning

Santa Clara, CA

$122.70K - $168.50K/yr

Our Deep Learning models performance engineering team at NVIDIA is hiring software engineers at all ... Experience in writing GPU kernels using any of - CUDA, OpenAI Triton, CuTeDSL, Pallas, or other ...

Nvidia Corporation

Senior Performance Engineer - Deep Learning

Santa Clara, CA · On-site

$122.70K - $168.50K/yr

Apple

Machine Learning Video Processing Engineer

Cupertino, CA · On-site

$181.10K - $318.40K/yr

SynergisticIT

AI and Machine Learning (ML) Performance Engineer

McLean, VA

$143.60K/yr

Machine Learning And Artificial Intelligence Developer Synergistic IT is a full-service staffing ... Investigate and optimize models performance. * Solving complex problems with multi-layered data ...

Intuit

Staff Machine Learning Engineer

Mountain View, CA · On-site

$197K - $266.50K/yr

Computer science fundamentals: data structures, algorithms, performance complexity, and ... CUDA and cuDNN) * Experience with integrating applications and platforms with cloud technologies (i ...

Showing results 1-20

CUDA Machine Learning Performance Engineer information

$109K

$141K

How much do CUDA machine learning performance engineer jobs pay per year?

As of Jun 3, 2026, the average yearly pay for a CUDA machine learning performance engineer in the United States is $139,529, according to ZipRecruiter salary data. Most workers in this role earn around $140,000 per year, with the exact figure depending on experience, location, and employer.

What are the key skills and qualifications needed to thrive as a CUDA Machine Learning Performance Engineer, and why are they important?

To thrive as a CUDA Machine Learning Performance Engineer, you need strong expertise in parallel programming, GPU architectures, and a solid background in computer science or related fields. Familiarity with CUDA, performance profiling tools (like Nsight), and deep learning frameworks such as TensorFlow or PyTorch is typically required. Analytical thinking, problem-solving, and clear communication are crucial soft skills for diagnosing performance bottlenecks and collaborating with cross-functional teams. These skills ensure optimal machine learning implementations, efficient resource utilization, and advancement of high-performance computing solutions.

What are some common challenges faced by a CUDA Machine Learning Performance Engineer when optimizing ML workloads?

CUDA Machine Learning Performance Engineers often encounter challenges in identifying and resolving performance bottlenecks within GPU-accelerated ML pipelines. Balancing memory usage, maximizing parallelism, and minimizing data transfer between the CPU and GPU are key concerns. Engineers must also keep up with rapid advancements in both hardware and software frameworks, which requires continuous learning and adaptation. They collaborate frequently with data scientists and software engineers, since the job involves translating high-level ML models into efficient, scalable GPU implementations.

What are CUDA Machine Learning Performance Engineers?

CUDA Machine Learning Performance Engineers are specialized professionals who optimize and accelerate machine learning applications using NVIDIA's CUDA platform. They analyze code performance on GPUs, identify bottlenecks, and implement improvements to maximize computational efficiency. Their work often involves collaborating with data scientists and software developers to ensure machine learning algorithms run efficiently on CUDA-enabled hardware. They are proficient in parallel programming, GPU architectures, and performance profiling tools. Their expertise helps organizations achieve faster model training and inference, leading to more effective use of hardware resources.

What is the difference between a CUDA Machine Learning Performance Engineer and a Data Scientist?

Aspect	CUDA Machine Learning Performance Engineer	Data Scientist
Required Credentials	Knowledge of CUDA, GPU programming, machine learning frameworks	Statistics, programming, data analysis skills, often a degree in data science or related fields
Work Environment	Technical teams focused on optimizing ML models for GPU hardware	Data analysis, model development, business insights
Industry Usage	Tech, AI, high-performance computing sectors	Finance, healthcare, marketing, tech

The CUDA Machine Learning Performance Engineer specializes in optimizing machine learning models for GPU hardware using CUDA, focusing on performance and efficiency. In contrast, a Data Scientist primarily develops and analyzes models to extract insights from data. While both roles require a strong understanding of machine learning, the Performance Engineer emphasizes technical optimization, whereas the Data Scientist focuses on data analysis and model interpretation.

Senior HPC Performance Engineer - AI for Science at Scale

Nvidia Corporation

Santa Clara, CA • On-site

$143.90K - $189.70K/yr

Full-time

Posted 23 days ago

Job description

NVIDIA has become the platform upon which every new AI-powered application is built. We are seeking a Sr. HPC Performance engineer to join our team of scientists and engineers passionate about building the next generation of scientific machine learning (ML) frameworks. Starting with digital biology, through high performance computing (HPC) and powerful ML methods, together, we will advance NVIDIA's capacity to accelerate AI for Science and industries that depend on it.
What you'll be doing:

Design and implement computationally performant features for large scale, CUDA-backed ML training frameworks, using low level acceleration and scaling strategies such as kernel design, GPU porting, data structure innovations, distributed learning technologies
Optimize computational performance of a wide range of business-critical ML models via accelerated hardware and software stack, as well as algorithmic improvements
Develop and maintain HPC software stack for atomistic modeling and generative machine learning in digital biology and beyond
Collaborate with multiple HPC, AI infrastructure, and research teams
Drive the testing and maintenance of the algorithms and software modules

What we need to see:

Advanced degree in a quantitative field such as Computer Science, Computational Biophysics, Computational Chemistry, Physics, Mathematics, or equivalent experience
5+ years of relevant experience
Consistent track record in performance engineering as well as in designing, building, packaging, and launching software products, with a focus on acceleration
Deep understanding of parallel programming in C++ and Python; programming experience with CUDA or OAI Triton
Fluent in modern machine learning frameworks such as PyTorch, JAX, Warp
Experience with HPC solutions to research problems for biology or chemistry, including but not limited to atomistic simulations
Recognized for technical leadership contributions, capable of self-direction, and ability to learn from and teach others
You should display strong communication skills, be organized and self-motivated, and play well with others (be an excellent teammate!)

Ways to stand out from the crowd:

Contributions to a major scientific AI for Science codebase, with acceleration features such as new kernels
Familiarity with pioneering language and geometric models used in AI for Science applications in biology and chemistry

With highly competitive salaries and a comprehensive benefits package, NVIDIA is widely considered to be one of the technology industry's most desirable employers. We are an equal opportunity employer and value diversity at our company. We have some of the most forward-thinking, resourceful and talented people in the world working with us and our engineering teams are growing fast in some of the hottest state-of-the-art fields: Digital Biology, Artificial Intelligence, and Autonomous Vehicles. Are you a creative and autonomous engineer with a real passion for machine learning, computational chemistry, data science & parallel computing? If so, we want to hear from you.
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD.
You will also be eligible for equity and benefits.
Applications for this job will be accepted at least until February 21, 2026.
This posting is for an existing vacancy.
NVIDIA uses AI tools in its recruiting processes.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

About Nvidia

Sourced by ZipRecruiter

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation that's fueled by great technology--and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent.

Industry

Computer and electronic product manufacturing

Company size

10,000+ Employees

Headquarters location

Santa Clara, CA, US

Year founded

1993

Website

nvidia.com

