Lengthy professional software development experience in performance-critical environments. Extensive hands-on experience in GPU programming (HIP/CUDA) and optimizing deep learning kernels and ...
Senior Software Engineer - Radar Control & Signal Processing
Moorestown, NJ · On-site
$120K - $159K/yr
Who you are • Professional experience proactively understanding problems and constraints, identifying solutions, and implementing robust, efficient C++ or CUDA code. • Strong advocate for test ...
Senior Staff Software Development Engineer- GPU/AI/ML
Santa Clara, CA · Hybrid
$144K - $190K/yr
Substantial professional experience in software development within performance-critical environments. * Extensive HIP/CUDA experience optimizing deep learning and OSS LLM inference/training kernels ...
Senior Staff Software Development Engineer- GPU/AI/ML
Santa Clara, CA · On-site
$178K/yr
Substantial professional experience in software development within performance-critical environments. * Extensive HIP/CUDA experience optimizing deep learning and OSS LLM inference/training kernels ...
Senior Software Engineer-RF/EW
$120K - $159K/yr
... CUDA to optimize performance for applications Why Join Us - Competitive salary & benefits package - Opportunities for career advancement & professional development - Access to cutting-edge ...
Sr. Software Development Engineer, Frontier AI & Robotics
Seattle, WA · On-site
$139K - $183K/yr
... TensorRT, CUDA, and other NVIDIA tools - Collaborate closely with scientists to influence model ... professional software development experience - 5+ years of programming with at least one software ...
Transition algorithms from platforms such as MATLAB or OpenCL to CUDA. * Analyze and enhance image ... Ideally 1-3 years of experience in software development experience (professional experience ...
Develop custom TensorRT plugins and CUDA kernels for performance-critical components. * Integrate ... What You Have: * 5+ years of professional experience developing and deploying deep learning models ...
Support GPU-enabled workloads and CUDA-based processing * Guide users on efficient cluster ... Professional Development - Paid training, Certifications, and Enrichment ABOUT PHOENIX OPERATIONS ...
Support GPU-enabled workloads and CUDA-based processing * Develop and maintain automation scripts ... Professional Development - Paid training, Certifications, and Enrichment ABOUT PHOENIX OPERATIONS ...
Staff AI Software Engineer, Edge Model Optimization & Deployment
Seattle, WA · On-site
$70K - $300K/yr
Develop custom TensorRT plugins and CUDA kernels for performance-critical components. * Integrate ... What You Have: * 5+ years of professional experience developing and deploying deep learning models ...
Camera Modelling Engineer
San Diego, CA · On-site
$87K - $116K/yr
Scripting for process automation • ML algorithm development for Image Processing • Familiarity with using CUDA GPU library • Analyzing image quality issues using C/C++ models and perform ...
Software Engineer role
Brookfield, WI · Hybrid
Transition algorithms from platforms such as MATLAB or OpenCL to CUDA. * Analyze and enhance image ... Ideally 1-3 years of experience in software development experience (professional experience ...
Senior Machine Learning Engineer, Runtime and Serving
Mountain View, CA · On-site +1
$213K - $263K/yr
JAX, XLA, Triton, and CUDA), to . You will be pleasantly challenged with deploying Waymo ML models ... S. or M.S. in CS, EE, Deep Learning or a related field * 5+ years of professional software ...
Professional CUDA information
What is the difference between a Professional CUDA role and a CUDA Developer?
| Aspect | Professional CUDA | CUDA Developer |
|---|---|---|
| Required Credentials | Typically requires a degree in Computer Science or a related field, with certifications in CUDA programming | Often requires similar degrees and certifications, focusing on CUDA expertise |
| Work Environment | Works in research labs, tech companies, or industries utilizing GPU computing | Works in software development teams, research, or hardware optimization projects |
| Industry Usage | Used across high-performance computing, AI, and scientific research sectors | Commonly employed in software development, gaming, and simulation industries |
Both roles involve CUDA programming, but a Professional CUDA role typically emphasizes advanced GPU computing skills in research or industry applications, while a CUDA Developer focuses on software development and optimization using CUDA technology. The roles often overlap, but the Professional CUDA role may have a broader scope in high-performance computing projects.
Staff Software Development Engineer- GPU, LLM, AI
Advanced Micro Devices, Inc
Santa Clara, CA • On-site
Full-time
This job post expired 1 day ago. Applications are no longer accepted.
Advanced Micro Devices rating
8.4
Based on 7 frontline employees who took The Breakroom Quiz
23rd of 139 rated electronics manufacturers
Job description
WHAT YOU DO AT AMD CHANGES EVERYTHING
At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.
THE ROLE:
AMD is looking for an influential software engineer who is passionate about improving the performance of key applications and benchmarks. You will be a member of a core team of incredibly talented industry specialists and will work with the very latest hardware and software technology.
THE PERSON:
As a Senior Staff Software Developer, you will be at the heart of AMD's AI strategy, tackling one of the most exciting challenges in the industry: training and running AI to make AI itself more efficient on GPUs on the fly, which can dramatically alter the trajectory of AI progress. This is a high-impact, hands-on role where your work will directly define the software that powers the future of AI.
KEY RESPONSIBILITIES:
- Architect and Drive the AI Software Stack: You will establish best practices and optimize performance from the lowest-level GPU kernels to large-scale distributed systems, shaping the foundational software for AMD hardware. By leveraging cutting-edge Large Language Models (LLMs) and agent-based technologies, you will accelerate the development and performance enhancement of the AMD ROCm ecosystem, ensuring it remains at the forefront of AI innovation.
- Accelerate Foundational Models: Your work will directly accelerate cutting-edge applications like foundation models (LLMs) and autonomous AI agents, ensuring AMD is the platform of choice for the most demanding workloads.
- Innovate Across Hardware and Software: You will contribute to the entire co-design lifecycle, from influencing future GPU architectures to developing groundbreaking software for new accelerators and collaborating with the broader AI community.
Success in this role requires a deep passion for software engineering, strong technical ownership to see complex problems through to resolution, and the ability to influence technical direction across teams. As a senior engineer, you will also be expected to mentor others and effectively communicate your ideas to shape the future of AI at AMD.
To excel in this role, we seek a candidate with exceptional technical expertise, who can bridge deep proficiency in high-performance C++ software engineering and low-level GPU programming with a robust understanding of Large Language Models (LLMs) and AI systems. The ideal candidate can bridge kernel engineering with AI post-training (RL) experience. A great candidate is deep in one and light on the other.
Kernel engineering means demonstrating mastery in designing complex, scalable systems using modern C++, coupled with a fundamental grasp of GPU architectures (HIP/CUDA), memory hierarchies, and kernel optimization to maximize hardware performance. This expertise should be evidenced by significant hands-on experience in large-scale C++/HIP/CUDA projects, such as contributing to the ROCm ecosystem (e.g., rocBLAS, hipDNN, Composable Kernel, AITemplate), CUDA libraries (e.g., cuBLAS, cuDNN, CUTLASS, Thrust, CUB, NCCL), or the C++/HIP/CUDA core of ML frameworks like PyTorch, TensorFlow, or JAX.
AI post-training is equally critical, and requires deep understanding of LLMs, including but not limited to transformer architectures, attention mechanisms, and the full model lifecycle, with hands-on experience in advanced model alignment and post-training techniques like Supervised Fine-Tuning (SFT) and Reinforcement Learning (e.g., RLHF, GRPO). Candidates must also stay at the forefront of LLM advancements, showing familiarity with cutting-edge trends such as Mixture-of-Experts (MoE) architectures, inference optimizations (e.g., quantization, speculative decoding), and modern application patterns like Agentic AI systems (e.g. AlphaEvolve for code/kernel generation).
Experience and interest in code generation and/or self-improving LLMs is a plus.
PREFERRED EXPERIENCE:
This is a senior role that requires a unique blend of expertise across software engineering, GPU computing, and artificial intelligence. The ideal candidate will possess:
- Lengthy professional software development experience in performance-critical environments.
- Extensive hands-on experience in GPU programming (HIP/CUDA) and optimizing deep learning kernels and operators.
- A fundamental understanding of GPU architecture and memory hierarchy, used to diagnose and resolve complex performance bottlenecks.
- Expert-level proficiency in modern C++ and object-oriented design.
- Deep experience using GPU profiling and performance analysis tools (e.g., AMD ROCm Profiler, NVIDIA Nsight) to diagnose and resolve complex bottlenecks in distributed, multi-GPU systems.
- Deep knowledge of transformer architectures, attention mechanisms, and modern AI systems (Generative AI, Agentic AI).
- Hands-on experience optimizing the post-training and inference pipelines of Large Language Models (LLMs).
- Strong technical ownership, communication, and problem-solving skills with a track record of delivering complex technical solutions.
- Plus: Experience or deep expertise with the AMD ROCm/HIP ecosystem.
ACADEMIC CREDENTIALS:
Bachelor's degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent.
Master's degree preferred, PhD is a plus.
Relevant publications in AI/ML, GPU computing, or system optimization are highly valued.
This role is not eligible for visa sponsorship.
Benefits offered are described: AMD benefits at a glance.
AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.
AMD may use Artificial Intelligence to help screen, assess or select applicants for this position. AMD’s “Responsible AI Policy” is available here.
This posting is for an existing vacancy.
Education: UNAVAILABLE
Employment Type: FULL_TIME
About Advanced Micro Devices
Sourced by ZipRecruiter
Industry
Computer and electronic product manufacturing
Company size
5,001 - 10,000 Employees
Headquarters location
Sunnyvale, CA, US
Year founded
1969