GPU Performance Engineer Jobs in Raleigh, NC (NOW HIRING)

... Engineering practice, you will design and drive deployment of fully integrated architectures for GPU-accelerated AI factories and high-performance computing infrastructure in close partnership with ...

Senior Software Engineer, AI Inference

Raleigh, NC · On-site +1

$133.65K - $220.68K/yr

... performance across a growing matrix of models and hardware. You will be building and shipping a ... Manage and scale multi-cloud GPU infrastructure using Terraform and Ansible, including both bare ...

Senior Software Engineer, Agentic AI

Durham, NC

$118.40K - $156.10K/yr

... high-performance data pipelines, RAG systems, vector databases, and GPU-optimized training and ... Solid understanding of asynchronous programming, callbacks, request lifecycles, and event-driven ...

Senior VLSI CAD Software Engineer

Durham, NC · On-site

$118.40K - $156.10K/yr

An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can ... Design, development, review, test, and support of high-capacity and high-performance chip design ...

Senior AI Systems Engineer

Raleigh, NC · On-site +1

$92.40K - $126.40K/yr

Maintain observability across AI systems through logging, metrics, performance monitoring, alerting ... Experience with GPU-based systems or running AI models in HPC environments. * Experience writing ...

AI/ML Engineer

Raleigh, NC · Remote

$140K - $220K/yr

Design, develop, and deploy AI/ML models and pipelines that meet mission and performance objectives ... Exposure to GPU-based or edge inference environments. * Bachelor's or Master's degree in Computer ...

AI/ML Engineer

Durham, NC · Remote

$140K - $220K/yr

Design, develop, and deploy AI/ML models and pipelines that meet mission and performance objectives ... Exposure to GPU-based or edge inference environments. * Bachelor's or Master's degree in Computer ...

AI/ML Engineer

Raleigh, NC · Remote

$140K - $220K/yr

Design, develop, and deploy AI/ML models and pipelines that meet mission and performance objectives ... Exposure to GPU-based or edge inference environments. * Bachelor's or Master's degree in Computer ...

AI/ML Engineer

Durham, NC · Remote

$140K - $220K/yr

Design, develop, and deploy AI/ML models and pipelines that meet mission and performance objectives ... Exposure to GPU-based or edge inference environments. * Bachelor's or Master's degree in Computer ...

AI/ML Engineer

Durham, NC · Remote

$140K - $220K/yr

Design, develop, and deploy AI/ML models and pipelines that meet mission and performance objectives ... Exposure to GPU-based or edge inference environments. * Bachelor's or Master's degree in Computer ...

AI/ML Engineer

Raleigh, NC · Remote

$140K - $220K/yr

Design, develop, and deploy AI/ML models and pipelines that meet mission and performance objectives ... Exposure to GPU-based or edge inference environments. * Bachelor's or Master's degree in Computer ...

AI/ML Engineer

Durham, NC · On-site +1

$140K - $220K/yr

Design, develop, and deploy AI/ML models and pipelines that meet mission and performance objectives ... Exposure to GPU-based or edge inference environments. * Bachelor's or Master's degree in Computer ...


GPU Performance Engineer information

How much do GPU Performance Engineer jobs pay per hour?

As of Jun 1, 2026, the average hourly pay for a GPU Performance Engineer in Raleigh, NC is $58.43, according to ZipRecruiter salary data. Most workers in this role earn between $47.88 and $66.11 per hour, depending on experience, location, and employer.

What are the key skills and qualifications needed to thrive as a GPU Performance Engineer, and why are they important?

To thrive as a GPU Performance Engineer, you need a strong background in computer architecture, programming (C/C++), and a degree in computer science, electrical engineering, or a related field. Proficiency with GPU profiling tools (e.g., NVIDIA Nsight, AMD Radeon GPU Profiler), performance analysis frameworks, and parallel computing libraries like CUDA or OpenCL is typically required. Analytical thinking, problem-solving abilities, and effective communication are crucial soft skills for collaborating with developers and debugging performance bottlenecks. These skills and qualities are essential for optimizing GPU performance, ensuring efficient software-hardware interaction, and delivering high-quality graphics or compute solutions.

What are some common challenges faced by GPU Performance Engineers when optimizing graphics workloads?

GPU Performance Engineers often encounter challenges such as identifying performance bottlenecks within complex graphics pipelines, balancing resource utilization, and achieving optimal frame rates across diverse hardware configurations. They must use specialized profiling tools and collaborate closely with developers, driver engineers, and QA teams to address issues like memory bandwidth limitations or shader inefficiencies. Staying updated with rapidly evolving GPU architectures and optimizing for both current and next-generation hardware are also key aspects of the role.

What is a GPU Performance Engineer?

A GPU Performance Engineer is a specialist who analyzes, optimizes, and improves the performance of graphics processing units (GPUs). They work on identifying bottlenecks, optimizing code, and ensuring that GPU hardware and software deliver maximum efficiency and speed. Their role may involve working with drivers, firmware, and applications to enhance graphics and compute workloads. This job is essential in industries like gaming, AI, and high-performance computing where GPU efficiency directly impacts user experience and system performance.

What is the difference between a GPU Performance Engineer and a GPU Hardware Engineer?

Aspect | GPU Performance Engineer | GPU Hardware Engineer
Primary Focus | Optimizing GPU performance, benchmarking, and tuning software | Designing, developing, and testing GPU hardware components
Required Skills | Programming, performance analysis, GPU architecture knowledge | Hardware design, circuit analysis, FPGA/ASIC experience
Work Environment | Software development teams, labs for testing performance | Hardware labs, manufacturing facilities, R&D centers
Common Certifications | None specific, often requires computer engineering or related degrees | Electrical engineering, VLSI design certifications

The GPU Performance Engineer primarily focuses on optimizing and testing GPU software performance, while the GPU Hardware Engineer designs and develops the physical GPU components. Both roles require a strong background in computer engineering, but they differ in their core responsibilities and work environments.

HPC AI Solution Architect (S2S)

Deloitte

Raleigh, NC • On-site

Other

Posted 11 days ago


Deloitte rating

Company rating: 8.1 out of 10

Based on 86 frontline employees who took The Breakroom Quiz

59th of 138 rated financial services


Job description

Lead Cloud HPC-AI Infrastructure Architect (S2S)

As a Lead Cloud Integrated Infra Engineer on the Silicon2Service team in Deloitte's AI & Engineering practice, you will design and drive deployment of fully integrated architectures for GPU-accelerated AI factories and high-performance computing infrastructure in close partnership with Deloitte AI specialists and our ecosystem partners. You will shape end-to-end solutions, from discovery and reference architecture mapping through sizing and implementation. You will partner with Sales Executives, AI application specialists, delivery engineering, and managed services to help clients achieve measurable outcomes from private AI assets. You will lead technical solution strategy for pursuits and active opportunities and translate complex client needs into clear, complete solutions and delivery requirements.

Recruiting for this role ends on 6/26/2026.


Work you'll do
As a Lead Cloud Integrated Infra Engineer on the Silicon2Service team, you will be responsible for:

  • Leading architecture for pursuits and active opportunities, including discovery, requirements, constraints, and target-state design
  • Creatively defining reference architectures for on-premises, cloud, and hybrid GPU platforms across compute, network, storage, security, software and operations
  • Driving architecture trade-offs and decisions across performance, scalability, reliability, locality, total cost of ownership, time-to-value, and risk
  • Owning the technical solution strategy in proposals and RFPs, including architecture narrative, assumptions, dependencies, sizing guidance, and delivery approach
  • Facilitating client workshops and technical reviews and translating engineering detail into executive-ready communications
  • Architecting complex, innovative technology solutions with a focus on business outcomes, cost of quality, and long-term scalability and sustainability.
  • Engaging with C-Suite client leadership during sales and delivery, including leading technical pre-sales discussions, shaping proposals, and supporting the closing of new business opportunities
  • Supporting go-to-market strategies, including participation in industry events, conferences, and client briefings

The Team

The Silicon to Service team at Deloitte delivers end-to-end AI factories and advanced technology services that help organizations build, deploy, and operate large-scale, private AI and data platforms. We enable the next phase of enterprise AI adoption through private AI economics with cloud-like ease of use. Join this unique opportunity to work on innovative AI platforms and emerging technologies in the rapidly evolving AI market while solving complex enterprise problems for some of the world's largest organizations.


Qualifications

Required:

  • 10+ years of experience in infrastructure architecture or engineering for large-scale platforms including design, implementation, operations, and optimization.
  • 4+ years designing or delivering GPU-accelerated platforms for AI, ML, or high-performance computing
  • 3+ years Linux system administration in production environments
  • 3+ years designing or operating distributed compute clusters for AI/HPC in hybrid cloud setups, including multi-GPU topologies, partitioning, scheduler integration, and scalability for edge-to-cloud workloads.
  • 2+ years with high-performance networking or storage for AI/HPC
  • 2+ years building containerized platforms using Kubernetes or Red Hat OpenShift, including GPU operators/drivers, CUDA container runtime, and cluster lifecycle automation
  • 2+ years automating infrastructure as code (IaC) with tools like Terraform and Ansible
  • At least 2 end-to-end deployments of reference architectures in the cloud or on-prem, including variants with security controls, network segmentation, operational runbooks, and validation testing
  • Experience in pre-sales or sales engineering, including discovery, solution demonstrations, and proposal/RFP contributions
  • Ability to travel 50%, on average, based on the work you do and the clients and industries/sectors you serve.
  • Limited immigration sponsorship may be available.

Preferred:

  • 2+ years implementing AI/HPC cluster scheduling (Slurm and Kubernetes), including multi-tenant queues, quotas, and GPU-aware policies
  • 2+ years supporting generative AI infrastructure patterns, including multi-node distributed training
  • Experience with AI agents and frameworks
  • Experience with high-throughput storage for AI/HPC
  • Experience executing NVIDIA co-sell motions with OEMs (Dell, HPC, Lenovo), CSPs (AWS, Azure, Google Cloud), or independent software vendors (Run:ai, OpenShift, Weights & Biases)

The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Deloitte, it is not typical for an individual to be hired at or near the top of the range for their role and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is $141,200 to $278,300.

You may also be eligible to participate in a discretionary annual incentive program, subject to the rules governing the program, whereby an award, if any, depends on various factors, including, without limitation, individual and organizational performance.

Education: Bachelor's Degree
Employment Type:
