Deep Learning Performance Architect Jobs (NOW HIRING)

Performance Architect

San Jose, CA · On-site

$195K/yr

Performance Architect-SAP S/4HANA (Hybrid) San Jose, CA Long Term Contract Our client is looking for a ... The ideal candidate should combine deep SAP expertise, performance engineering skills, and strong ...

Senior Performance Architect, Nemotron

Redmond, WA · On-site

$187K/yr

We are now looking for a Senior Performance Architect for Nemotron! At NVIDIA, we are redefining ... Experience with deep learning frameworks like PyTorch, TRT-LLM, VLLM, SGLang * A Growth mindset and ...

Senior Performance Architect, Nemotron

Hillsboro, OR · On-site

$181K/yr

We are now looking for a Senior Performance Architect for Nemotron! At NVIDIA, we are redefining ... Experience with deep learning frameworks like PyTorch, TRT-LLM, VLLM, SGLang * A Growth mindset and ...

The NVIDIA GPU Architecture group is looking for world class architects and software developers to ... highest performance in the world for deep learning and parallel processing algorithms. We are ...

Performance Architect

Austin, TX · Hybrid

$165K/yr

As a Performance Architect in Embedded Processor Architecture, Engineering & Solutions (EPAES) team, you will provide deep technical expertise and technical leadership in analyzing and optimizing ...

Systems Performance Architect

Beaverton, OR · On-site

$176K/yr

Systems Performance Architect The people here at Apple don't just create products -- they create ... Passionate about learning new things from deep technical topics to user workflows. * Strong ...

Systems Performance Architect

Beaverton, OR · On-site

$173K/yr

... learning new things from deep technical topics to user workflows. Strong interpersonal skills and ability to work with multi-disciplinary teams. Good communication and presentation skills

Deep Learning Performance Architect information

$156.5K

$168K

How much do deep learning performance architect jobs pay per year?

As of Jun 26, 2026, the average yearly pay for a deep learning performance architect in the United States is $167,842.00, according to ZipRecruiter salary data. Most workers in this role earn between $167,000.00 and $167,000.00 per year, depending on experience, location, and employer.

What is the highest paid type of architect?

Among various architecture roles, enterprise architects and solutions architects tend to have the highest salaries, especially in technology and IT sectors. Deep Learning Performance Architects, as specialized roles in AI and machine learning, also command high compensation, particularly with advanced skills in neural networks, cloud platforms, and performance optimization. Overall, roles requiring specialized technical expertise and strategic responsibilities typically offer the highest pay in architecture fields.

What are the key skills and qualifications needed to thrive as a Deep Learning Performance Architect, and why are they important?

To thrive as a Deep Learning Performance Architect, you need a strong background in computer science, deep learning frameworks, parallel computing, and optimization techniques, typically supported by a relevant degree and experience in AI or high-performance computing. Familiarity with tools such as TensorFlow, PyTorch, CUDA, and profiling or benchmarking systems is essential. Analytical problem-solving, effective communication, and a collaborative mindset help professionals excel in cross-functional teams and resolve complex performance bottlenecks. These skills are vital for optimizing AI workloads, ensuring scalability, and maximizing the efficiency of deep learning models in production environments.

What is a Deep Learning Performance Architect?

A Deep Learning Performance Architect is a specialized professional who designs, analyzes, and optimizes the performance of deep learning systems and models. They work to improve the efficiency, speed, and scalability of machine learning algorithms on various hardware platforms such as GPUs, TPUs, and CPUs. Their role often involves collaborating with software engineers and data scientists to identify bottlenecks and implement solutions that enhance computational capabilities for AI workloads. By doing so, they ensure that deep learning applications run faster and more efficiently, making the best use of available resources.

Is ML a high paying job?

Deep Learning Performance Architects and related machine learning roles are generally well-paid due to the specialized skills required, such as expertise in neural networks, programming, and data analysis. Salaries tend to be higher than average, especially with experience, advanced degrees, and proficiency in tools like TensorFlow or PyTorch.

What is the difference between Deep Learning Performance Architect vs Machine Learning Engineer?

Aspect | Deep Learning Performance Architect | Machine Learning Engineer
Credentials | Advanced degrees in AI, deep learning, or related fields; certifications in deep learning frameworks | Degrees in computer science, data science, or related fields; certifications in machine learning tools
Work Environment | Research labs, AI development teams, performance optimization settings | Data-driven projects, model development, deployment environments
Industry Usage | Tech companies, AI research firms, organizations focusing on deep learning optimization | Tech companies, startups, enterprises applying machine learning solutions

The Deep Learning Performance Architect specializes in optimizing deep learning models for efficiency and scalability, focusing on hardware and software performance. In contrast, Machine Learning Engineers develop, train, and deploy machine learning models across various applications. While both roles require strong technical skills, the Architect emphasizes performance tuning and system optimization, whereas the Engineer focuses on model development and implementation.

What are some common challenges faced by Deep Learning Performance Architects when optimizing large-scale neural network models?

Deep Learning Performance Architects often encounter challenges such as balancing model accuracy with computational efficiency, managing memory constraints on specialized hardware, and optimizing inference or training speed across different platforms. They frequently need to profile and analyze bottlenecks at both the algorithmic and hardware levels, often requiring close collaboration with software engineers and hardware designers. Staying current with rapidly evolving deep learning frameworks and hardware accelerators is also essential to ensure optimal performance and scalability.

How much does a deep learning architect make at Nvidia?

A deep learning architect at Nvidia typically earns between $150,000 and $200,000 annually, depending on experience, location, and level of expertise. Compensation may also include bonuses, stock options, and benefits, reflecting the company's competitive pay structure for specialized AI roles.

What does a deep learning architect do?

A deep learning performance architect designs and optimizes neural network models and infrastructure to improve AI system efficiency and accuracy. They work with frameworks like TensorFlow or PyTorch, analyze model performance, and implement solutions to enhance scalability and speed in machine learning applications.

More about Deep Learning Performance Architect jobs

What job categories do people searching Deep Learning Performance Architect jobs look for?

The top searched job categories for Deep Learning Performance Architect jobs are:

[Infographic: Deep Learning Performance Architect job openings in the United States as of June 2026. Employment type: 88% Full Time, 12% Part Time. Work arrangement: 87% Physical, 2% Hybrid, 11% Remote. Average salary: $167,842 per year, or $80.7 per hour.]

Senior Performance Engineer - Deep Learning

Nvidia Corporation

Santa Clara, CA • On-site

$122K - $168K/yr

Full-time

Posted 25 days ago


Job description

Our Deep Learning models performance engineering team at NVIDIA is hiring software engineers at all experience levels to build and optimize the libraries and tools that enable Deep Learning Researchers and Engineers to design, develop, and deploy efficient AI applications. We are an ambitious and diverse team that builds optimizations directly into mainstream open source Deep Learning frameworks - PyTorch and JAX, which boost the performance at all levels of NVIDIA's AI stack. Our team has a wide collaborative footprint, working not only with multiple teams across NVIDIA but also with the broader open-source community to deliver SOTA Deep Learning performance on the best AI platform in the world!
What you will be doing:
  • Build and support Transformer Engine, the open-source library for accelerating the training of Large Language Models.
  • Collaborate on systems research that improves Deep Learning model performance, such as training using extremely low precision, parallelism methods, etc.
  • Implement, benchmark, and optimize new Deep Learning models such as LLMs straight out of groundbreaking research to scale efficiently on NVIDIA GPUs and systems.
  • Build and contribute to NVIDIA submissions on community benchmarks such as MLPerf.
  • Engage with the open-source community as well as support enterprise customers and partners by delivering the benefits of NVIDIA's latest hardware and software innovations.
  • Influence the design of new hardware generations and core platform software components for NVIDIA hardware and systems.

What we need to see:
  • BS or equivalent experience in Computer Science, Electrical Engineering, or a related field.
  • 3+ years of experience in C++ and Python programming.
  • Strong background, experience, or coursework in parallel systems programming, preferably on GPUs.
  • Knowledge of Computer Architecture, Code Optimization, and/or Operating Systems.
  • Proven experience in developing large software projects.
  • Excellent verbal and written communication skills.

Ways to stand out from the crowd:
  • Experience in PyTorch, JAX, or any other DL framework.
  • Experience with performance analysis, profiling, and code optimization techniques, especially with multi-GPU or multi-node systems.
  • Knowledge of modern LLM architectures, attention mechanisms, and/or low-level DL libraries such as cuBLAS, cuDNN, and cuSOLVER.
  • Experience in writing GPU kernels using any of - CUDA, OpenAI Triton, CuTeDSL, Pallas, or other similar libraries.
  • Any past contributions to the open source community and/or experience working with multidisciplinary teams also showcase readiness for the team's responsibilities.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 152,000 USD - 241,500 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4.
You will also be eligible for equity and benefits.
Applications for this job will be accepted at least until March 8, 2026.
This posting is for an existing vacancy.
NVIDIA uses AI tools in its recruiting processes.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

About Nvidia

Sourced by ZipRecruiter

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation that's fueled by great technology--and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent.

Industry

Computer and electronic product manufacturing

Company size

10,000+ Employees

Headquarters location

Santa Clara, CA, US

Year founded

1993