Full Time Gpu Programming Jobs (NOW HIRING)

Dahlgren, Virginia Status: Full Time Clearance: Secret Salary: $120,000-$140,000 SEG is an industry ... GPU programming Experience with version-control software (Git, Subversion, Mercurial) and ...

Software Engineer - Midlevel

Dahlgren, VA · On-site

$120K - $140K/yr

Dahlgren, Virginia Status: Full Time Clearance: Secret Salary: $120,000-$140,000 SEG is an industry ... Knowledge or experience in multi-threading (POSIX, OpenMP, MPI), SSE or GPU programming

Software Engineer - Midlevel

Dahlgren, VA · On-site

$120K - $140K/yr

Dahlgren, Virginia Status: Full Time Clearance: Secret Salary: $120,000-$140,000 SEG is an industry ... GPU programming Experience with version-control software (Git, Subversion, Mercurial) and ...

Experience with OpenCV, ffmpeg, GPU programming * Experience with the raw image camera pipeline and ... This is a full-time onsite role in Dallas, TX, and we will ask you to relocate if you're not in the ...

Columbia, Maryland Job Status: Full Time Clearance: Secret Salary: $150,000 to $190,000 SEG, an ... Knowledge or experience in multi-threading (POSIX, OpenMP, MPI), SSE or GPU programming

Senior Software Engineer

Dahlgren, VA · On-site

$150K - $190K/yr

Dahlgren, VA Job Status: Full Time Clearance: Secret Salary: $150,000 to $190,000 SEG, an Astrion ... Knowledge or experience in multi-threading (POSIX, OpenMP, MPI), SSE or GPU programming

Full Time Gpu Programming information

Salary chart (values shown): $33K, $65K, $95.5K

How much do full-time GPU programming jobs pay per year?

As of Jun 19, 2026, the average yearly pay for full-time GPU programming jobs in the United States is $64,974, according to ZipRecruiter salary data. Most workers in this role earn between $50,500 and $80,000 per year, depending on experience, location, and employer.

What engineers make $500,000?

Senior engineers in specialized fields such as GPU programming, software engineering, or machine learning can earn $500,000 or more annually, especially with experience, advanced skills, and in high-demand industries like tech or finance. Compensation often includes base salary, bonuses, and stock options, particularly at large tech companies or startups with significant growth potential.

What are some typical challenges faced by professionals in full-time GPU programming roles, and how can they be addressed?

Full-time GPU programmers often encounter challenges such as optimizing code for parallel execution, managing memory efficiently, and debugging complex kernels. These tasks require a deep understanding of GPU architecture and tools like CUDA or OpenCL. Collaborating closely with other developers, data scientists, and hardware engineers is essential to identify performance bottlenecks and ensure that solutions meet project requirements. Staying updated with the latest advancements and regularly profiling code can help address these challenges effectively.

Are full stack devs still in demand?

Full stack developers remain in high demand due to their versatility in handling both front-end and back-end development, especially in tech companies and startups. Skills in frameworks like React, Node.js, and cloud platforms enhance employability, and continuous learning is important to stay competitive.

What is the salary of CUDA programmer?

The salary of a CUDA programmer varies based on experience, location, and industry, but typically ranges from $80,000 to $130,000 annually for full-time roles. Skilled programmers with expertise in GPU parallel computing and CUDA tools are in high demand, often commanding higher salaries.

What are the key skills and qualifications needed to thrive as a Full Time GPU Programmer, and why are they important?

To thrive as a Full Time GPU Programmer, you need strong programming skills in languages like C++ and Python, a deep understanding of parallel computing, and often a degree in computer science or a related field. Familiarity with GPU programming frameworks such as CUDA or OpenCL, and experience with performance profiling tools, are typically required. Problem-solving ability, attention to detail, and effective teamwork distinguish top performers in this role. These skills ensure efficient development of high-performance, scalable applications that leverage GPU capabilities for computational tasks.

Will AI replace coders?

Full Time GPU Programmers develop software that leverages graphics processing units, often requiring specialized coding skills. While AI tools can assist with coding tasks, human expertise remains essential for complex problem-solving, system design, and optimization, making complete replacement unlikely by 2040.

What is the difference between Full Time Gpu Programming vs Gpu Software Engineer?

| Aspect | Full Time Gpu Programming | Gpu Software Engineer |
| --- | --- | --- |
| Required Credentials | Bachelor's or higher in Computer Science, Engineering, or related field; experience with GPU programming languages | Similar credentials; often requires experience with CUDA, OpenCL, or Vulkan |
| Work Environment | Primarily focused on developing and optimizing GPU code, often in research or high-performance computing settings | Designing, developing, and maintaining GPU-based software applications in industry or research |
| Employer & Industry Usage | Tech companies, research labs, gaming, and scientific computing | Tech firms, gaming, AI, and scientific industries |

Both roles involve GPU programming skills, but Full Time Gpu Programming emphasizes writing and optimizing GPU code, while Gpu Software Engineer focuses on developing complete GPU-accelerated software solutions. The roles often overlap but differ in scope and application.

What is a Full Time GPU Programmer?

A Full Time GPU Programmer is a software developer who specializes in writing and optimizing code to run on Graphics Processing Units (GPUs). These professionals use languages like CUDA or OpenCL to leverage the parallel processing capabilities of GPUs for tasks such as scientific computing, machine learning, graphics rendering, and more. They work on designing, implementing, and debugging GPU-accelerated algorithms to improve performance and efficiency in various applications. Full Time GPU Programmers often collaborate with other engineers and researchers to integrate GPU solutions into larger software systems.
Infographic: Full Time GPU Programming job openings in the United States as of June 2026. Employment types: 2% As Needed, 9% Full Time, 83% Part Time, 1% Temporary, 4% Contract, and 1% Nights. Work location: 95% Physical, 1% Hybrid, and 4% Remote. Average salary: $64,974 per year, or $31.2 per hour.
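
The hourly figure quoted alongside the average salary can be cross-checked against the annual figure. The sketch below assumes a 2,080-hour work year (40 hours a week for 52 weeks); that convention is an assumption made here for illustration, not something the page states, and the function name is hypothetical.

```python
# Cross-check of the page's annual and hourly pay figures.
# Assumption (not stated on the page): a full-time year is 40 h/week * 52 weeks.
HOURS_PER_WEEK = 40
WEEKS_PER_YEAR = 52


def hourly_from_annual(annual_pay: float,
                       hours_per_week: int = HOURS_PER_WEEK,
                       weeks_per_year: int = WEEKS_PER_YEAR) -> float:
    """Convert annual pay to an hourly rate under the assumed work year."""
    return annual_pay / (hours_per_week * weeks_per_year)


average_annual = 64_974  # average yearly pay quoted on the page
hourly = hourly_from_annual(average_annual)
print(f"${hourly:.2f} per hour")  # rounds to the page's $31.2 at one decimal
```

Under that assumption the two figures agree to one decimal place. A different work-year convention (for example, fewer paid weeks) would give a slightly different hourly rate.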

LLM Inference Frameworks and Optimization Engineer

Together AI

San Francisco, CA • On-site

$160K - $230K/yr

Full-time

Medical

Posted 29 days ago


Job description

About the Role
At Together.ai, we are building state-of-the-art infrastructure to enable efficient and scalable inference for large language models (LLMs). Our mission is to optimize inference frameworks, algorithms, and infrastructure, pushing the boundaries of performance, scalability, and cost-efficiency.
We are seeking an Inference Frameworks and Optimization Engineer to design, develop, and optimize distributed inference engines that support multimodal and language models at scale. This role will focus on low-latency, high-throughput inference, GPU/accelerator optimizations, and software-hardware co-design, ensuring efficient large-scale deployment of LLMs and vision models.
This role offers a unique opportunity to shape the future of LLM inference infrastructure, ensuring scalable, high-performance AI deployment across a diverse range of applications. If you're passionate about pushing the boundaries of AI inference, we'd love to hear from you!
Responsibilities
Inference Framework Development and Optimization
  • Design and develop a fault-tolerant, high-concurrency distributed inference engine for text, image, and multimodal generation models.
  • Implement and optimize distributed inference strategies, including Mixture of Experts (MoE) parallelism, tensor parallelism, and pipeline parallelism, for high-performance serving.
  • Apply CUDA graph optimizations, TensorRT/TRT-LLM graph optimizations, PyTorch-based compilation (torch.compile), and speculative decoding to enhance efficiency and scalability.
Software-Hardware Co-Design and AI Infrastructure
  • Collaborate with hardware teams on performance bottleneck analysis, and co-optimize inference performance for GPUs, TPUs, or custom accelerators.
  • Work closely with AI researchers and infrastructure engineers to develop efficient model execution plans and optimize E2E model serving pipelines.
Requirements
Must-Have:
  • Experience:
    • 3+ years of experience in deep learning inference frameworks, distributed systems, or high-performance computing.
  • Technical Skills:
    • Familiarity with at least one LLM inference framework (e.g., TensorRT-LLM, vLLM, SGLang, TGI (Text Generation Inference)).
    • Background knowledge and experience in at least one of the following: GPU programming (CUDA/Triton/TensorRT), compilers, model quantization, or GPU cluster scheduling.
    • Deep understanding of KV cache systems like Mooncake, PagedAttention, or custom in-house variants.
  • Programming:
    • Proficient in Python and C++/CUDA for high-performance deep learning inference.
  • Optimization Techniques:
    • Deep understanding of Transformer architectures and LLM/VLM/Diffusion model optimization.
    • Knowledge of inference optimization techniques, such as workload scheduling, CUDA graphs, compilation, and efficient kernels.
  • Soft Skills:
    • Strong analytical problem-solving skills with a performance-driven mindset.
    • Excellent collaboration and communication skills across teams.

Nice-to-Have:
  • Experience in developing software systems for large-scale data center networks with RDMA/RoCE.
  • Familiarity with distributed filesystems (e.g., 3FS, HDFS, Ceph).
  • Familiarity with open-source distributed scheduling/orchestration frameworks, such as Kubernetes (K8s).
  • Contributions to open-source deep learning inference projects.

About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.
Compensation
We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.
Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.
Please see our privacy policy at https://www.together.ai/privacy