GPU Programming Jobs in Raleigh, NC (NOW HIRING)

Senior Developer Technology Engineer - AI

Durham, NC · Hybrid

$52.75 - $69.50/hr

... programming, e.g., CUDA, OpenACC, OpenMP, MPI, pthreads, etc. * Hands on experience doing low-level performance optimizations. * In-depth expertise with CPU and GPU architecture fundamentals. * Good ...

EDA Workflow Optimization Engineer

Durham, NC · Hybrid

$107.70K - $127.10K/yr

Hands-on experience running GPU-based workloads in a batch computing environment and a deep understanding of distributed system principles. * Strong programming and debugging skills with C/C ...

... Engineering practice, you will design and drive deployment of fully integrated architectures for GPU-accelerated AI factories and high-performance computing infrastructure in close partnership with ...

Develop innovative HW, GPU and system designs to extend the state of the art performance and efficiency * You are expected to understand the design and implementation, develop power metrics and drive ...

Knowledge of computer architecture (GPU/FPGA/distributed computing), operating systems, networking ... S. in Computer Science, Applied Mathematics, Computer Engineering or Electrical Engineering ...

NVIDIA is seeking an outstanding Senior ASIC Verification Engineer to verify the design and implementation of the world's leading SoCs and GPUs. This position offers the opportunity to have a real ...

GPU Programming information

Raleigh, NC salary chart values: $32.1K, $63.2K, $92.8K

How much do GPU programming jobs pay per year?

As of May 31, 2026, the average yearly pay for GPU programming in Raleigh, NC is $63,160, according to ZipRecruiter salary data. Most workers in this role earn between $49,100 and $77,800 per year, depending on experience, location, and employer.

What is a GPU Programming job?

A GPU Programming job involves writing and optimizing code to run on Graphics Processing Units (GPUs) for parallel computing tasks. This role is commonly found in fields like machine learning, scientific computing, gaming, and data analytics. GPU programmers use languages and APIs such as CUDA, OpenCL, or Vulkan to accelerate computations and improve performance. They work closely with software engineers and data scientists to optimize algorithms for high-performance applications.

What are the key skills and qualifications needed to thrive in the GPU Programming position, and why are they important?

To excel in GPU Programming, you need a strong background in parallel computing concepts and mathematics, plus proficiency in languages and APIs such as CUDA, OpenCL, or DirectX/OpenGL, often supported by a degree in computer science, engineering, or a related field. Familiarity with NVIDIA and AMD GPU development tools and performance profilers is valuable, as is training such as NVIDIA's Deep Learning Institute courses. Teamwork, effective communication, and strong problem-solving abilities are essential soft skills in this field. These competencies enable efficient development, optimization, and integration of high-performance GPU code in real-world applications.

What types of projects or applications do GPU Programmers commonly work on?

GPU Programmers are often involved in developing or optimizing software for high-performance applications such as machine learning, scientific simulations, real-time rendering in gaming and visualization, and video/image processing tools. Their daily work may include collaborating with software engineers, data scientists, and hardware teams to create efficient, scalable parallel algorithms that leverage GPU capabilities. The role frequently requires problem-solving to maximize computational efficiency and troubleshooting complex performance bottlenecks. By working across multidisciplinary teams, GPU Programmers help deliver robust solutions for data-intensive problems in areas like healthcare, finance, automotive technology, and entertainment.
What are popular job titles related to GPU Programming jobs in Raleigh, NC?

For GPU Programming jobs in Raleigh, NC, the most frequently searched job titles are:

[Infographic: GPU Programming job openings in Raleigh, NC as of May 2026. Employment types: 1% Internship, 96% Full Time, 1% Part Time, 1% Temporary, 1% Nights. Work location: 34% Physical, 2% Hybrid, 64% Remote. Average salary: $63,160 per year, or $30.4 per hour.]
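
The yearly and hourly averages quoted above can be related with a quick conversion. The sketch below assumes a full-time year of 2,080 hours (40 hours per week for 52 weeks); the page does not say how the hourly figure was derived, so that convention and the function names are illustrative assumptions only.

```python
# Assumption: a full-time year is 40 hours/week * 52 weeks = 2,080 hours.
# The source page does not state which convention it uses.
HOURS_PER_YEAR = 40 * 52


def annual_to_hourly(annual_salary: float, hours_per_year: int = HOURS_PER_YEAR) -> float:
    """Convert a yearly salary to an hourly rate."""
    return annual_salary / hours_per_year


def hourly_to_annual(hourly_rate: float, hours_per_year: int = HOURS_PER_YEAR) -> float:
    """Convert an hourly rate to a yearly salary."""
    return hourly_rate * hours_per_year


if __name__ == "__main__":
    # Average yearly pay quoted on this page.
    print(f"${annual_to_hourly(63_160):.1f} per hour")
    # Hourly range from the first listing, expressed as yearly pay.
    print(f"${hourly_to_annual(52.75):,.0f} - ${hourly_to_annual(69.50):,.0f} per year")
```

Under this assumption the $63,160 average works out to roughly the $30.4 per hour shown in the infographic, and the same helper makes hourly listings comparable with salaried ones.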

Senior Staff Engineer - AI Data Path

DataDirect Networks

Raleigh, NC

$103K - $140K/yr

Full-time

Posted 25 days ago


Job description

Overview

This is an incredible opportunity to be part of a company that has been at the forefront of AI and high-performance data storage innovation for over two decades. DataDirect Networks (DDN) is a global market leader renowned for powering many of the world's most demanding AI data centers, in industries ranging from life sciences and healthcare to financial services, autonomous cars, government, academia, research, and manufacturing.

"DDN's A3I solutions are transforming the landscape of AI infrastructure." - IDC

"The real differentiator is DDN. I never hesitate to recommend DDN. DDN is the de facto name for AI Storage in high performance environments" - Marc Hamilton, VP, Solutions Architecture & Engineering | NVIDIA

DDN is the global leader in AI and multi-cloud data management at scale. Our cutting-edge data intelligence platform is designed to accelerate AI workloads, enabling organizations to extract maximum value from their data. With a proven track record of performance, reliability, and scalability, DDN empowers businesses to tackle the most challenging AI and data-intensive workloads with confidence. 

Our success is driven by our unwavering commitment to innovation, customer-centricity, and a team of passionate professionals who bring their expertise and dedication to every project. This is a chance to make a significant impact at a company that is shaping the future of AI and data management. 

Our commitment to innovation, customer success, and market leadership makes this an exciting and rewarding role for a driven professional looking to make a lasting impact in the world of AI and data storage. 

Job Description

DDN is seeking a highly experienced Senior Staff Engineer specializing in AI Data Path & Storage to lead hands-on development and integration of advanced storage systems with next-generation AI inference pipelines. This role involves coding, prototyping, and rapidly iterating on solutions in close collaboration with architects to design and deliver high-performance data movement architectures. You will leverage NVIDIA's NIXL (NVIDIA Inference Xfer Library) alongside the Infinia Data Intelligence Platform to enable ultra-low-latency, high-throughput data movement across GPU, memory, and distributed storage layers, including workloads involving KV cache management and vector database retrieval. The ideal candidate brings deep expertise in distributed storage, GPU data paths, and large-scale system optimization, with a proven track record of building and shipping production-grade AI infrastructure.

Key Responsibilities

  • Lead the design and implementation of high-performance data movement pipelines using NVIDIA NIXL across GPU, CPU, and storage tiers.
  • Architect and drive integration of DDN Infinia with GPU-accelerated inference platforms for large-scale, real-time AI workloads.
  • Own end-to-end optimization of I/O paths between GPU memory and storage using technologies such as NVIDIA GPUDirect Storage, RDMA, and NVMe-over-Fabrics.
  • Define and implement multi-tier storage architectures (NVMe, SSD, object storage) optimized for inference latency, throughput, and scalability.
  • Lead development of advanced KV cache management strategies, including offloading, prefetching, and persistence across distributed storage layers.
  • Partner with AI/ML engineering teams to optimize inference performance in frameworks such as PyTorch and TensorFlow.
  • Establish benchmarking frameworks and lead performance tuning efforts for storage and data movement in production inference environments.
  • Diagnose and resolve complex system bottlenecks across storage, networking, and GPU subsystems.
  • Influence architecture decisions for distributed inference systems, ensuring scalability, resilience, and efficient data locality.
  • Drive engineering excellence through best practices in observability, performance monitoring, automation, and reliability engineering.
  • Mentor junior engineers and provide technical leadership across cross-functional teams.

Required Qualifications

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • 12+ years of experience in storage systems, distributed systems, or performance engineering.
  • Proven track record of architecting and delivering large-scale, high-performance infrastructure systems.
  • Deep expertise in distributed storage architectures (object storage, scalable file systems, or cloud-native storage platforms).
  • Strong understanding of Linux I/O stack, filesystem internals, and storage protocols.
  • Extensive hands-on experience with NVMe, SSD optimization, and high-performance storage environments.
  • Strong experience with RDMA, InfiniBand, or other high-speed data transfer technologies.
  • Solid understanding of GPU computing concepts and CPU-GPU data movement patterns.
  • Proficiency in Python and/or C/C++, with advanced debugging, profiling, and performance tuning skills.
  • Demonstrated ability to optimize latency-sensitive, high-throughput production systems.

Preferred Skills

  • Hands-on experience with NVIDIA NIXL or similar data movement frameworks.
  • Experience with GPU-aware storage pipelines and GPUDirect Storage.
  • Strong understanding of AI inference systems, LLM serving architectures, and KV cache optimization.
  • Experience with Retrieval-Augmented Generation (RAG) pipelines and open vector search ecosystems.
  • Background in high-performance computing (HPC) or hyperscale distributed environments.
  • Expertise in caching strategies, memory tiering, and data locality optimization.
  • Experience designing disaggregated compute and storage architectures.

What You'll Work On

  • Leading the evolution of storage systems into GPU-native data layers for AI inference
  • Building next-generation distributed AI infrastructure using NIXL and Infinia
  • Driving performance breakthroughs in real-time LLM inference at scale
  • Designing storage architectures for large-scale AI datasets and retrieval systems

Salary Range for this role: $215,000 - $265,000

DDN

Join our dynamic and driven team, where engineering excellence is at the heart of everything we do. We seek individuals who love to challenge themselves and are fueled by curiosity. Here, you'll have the opportunity to work across various areas of the company, thanks to our flat organizational structure that encourages hands-on involvement and direct contributions to our mission. Leadership is earned by those who take initiative and consistently deliver outstanding results, both in their work ethic and deliverables, making strong prioritization skills essential. Additionally, we value strong communication skills in all our engineers and researchers, as they are crucial for the success of our teams and the company as a whole.

Interview Process: After submitting your application, one of our recruiters will review your resume. If your application passes this stage, you will be invited to a 30-minute interview during which a member of our team will ask some basic questions. If you clear the interview, you will enter the main process, which can consist of up to four interviews in total:

  • Coding assessment: Often in a language of your choice.
  • Systems design: Translate high-level requirements into a scalable, fault-tolerant service (depending on role).
  • Real-time problem-solving: Demonstrate practical skills in a live problem-solving session.
  • Meet and greet with the wider team.

Our goal is to finish the main process in 2-3 weeks at most.

DataDirect Networks (DDN) is an Equal Opportunity/Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity, gender expression, transgender, sex stereotyping, sexual orientation, national origin, disability, protected Veteran Status, or any other characteristic protected by applicable federal, state, or local law.

Employment Type: Full-time