AMD Machine Learning Jobs in California (NOW HIRING)

Backed by multi-million-dollar funding and direct sponsorship from AMD with hands-on support from ... Maintain and evolve a large-scale library of modern machine learning models, including but not ...

Backed by multi-million-dollar funding and direct sponsorship from AMD with hands-on support from ... What you'll do * Lead research in advanced machine learning areas such as LLMs, generative AI ...

LLM Training Engineer

San Francisco, CA · On-site

$155K - $220K/yr

Backed by multi-million-dollar funding and direct sponsorship from AMD with hands-on support from ... MS or PhD in Computer Science, Machine Learning, AI, Mathematics, or related field Benefits include

AMD Machine Learning information

How much do AMD machine learning jobs pay per hour?

As of Jun 10, 2026, the average hourly pay for AMD machine learning jobs in California is $22.52, according to ZipRecruiter salary data. Most workers in this role earn between $19.47 and $25.14 per hour, depending on experience, location, and employer.

What are some common challenges faced by machine learning engineers at AMD, and how can applicants prepare to address them?

Machine learning engineers at AMD often work on optimizing algorithms for high-performance hardware, which requires balancing model accuracy with computational efficiency. A common challenge is adapting complex models to run efficiently on specialized AMD hardware such as GPUs and accelerators. Applicants can prepare by gaining hands-on experience with parallel computing, understanding low-level hardware optimization, and becoming familiar with AMD’s development tools and frameworks. Collaboration with hardware engineers and software teams is also frequent, so strong communication and teamwork skills are essential.

What are AMD machine learning engineers?

AMD machine learning engineers are professionals who develop, optimize, and deploy machine learning models using AMD hardware and software platforms. They work on designing algorithms and frameworks that leverage AMD GPUs and accelerators for tasks such as data analysis, computer vision, and artificial intelligence. Their responsibilities often include collaborating with software developers, data scientists, and hardware engineers to ensure machine learning solutions run efficiently on AMD architecture.

What is the difference between AMD Machine Learning and Data Scientist roles?

| Aspect | AMD Machine Learning | Data Scientist |
| --- | --- | --- |
| Required Credentials | Bachelor's or higher in CS, ML, or related fields; certifications like AWS, Azure | Bachelor's or higher in CS, Statistics, or related fields; certifications like SAS, Python |
| Work Environment | Tech companies, R&D labs, AI startups | Business analytics, finance, healthcare, tech firms |
| Industry Usage | Developing ML models, algorithms, AI solutions | Data analysis, insights, predictive modeling |
| Common Search/Comparison | Yes | Yes |

While both roles involve working with data and algorithms, AMD Machine Learning roles focus on developing machine learning models and AI solutions, often requiring specialized technical skills. Data Scientists analyze data to generate insights and support decision-making, with a broader scope that includes statistical analysis. Both roles are vital in tech-driven industries but differ in their primary focus and skill sets.

Which 3 jobs will survive AI?

For AMD Machine Learning professionals, roles such as data scientist, machine learning engineer, and AI researcher are likely to persist because they require complex problem-solving, domain expertise, and ongoing innovation that AI tools currently cannot fully replicate. These jobs involve designing, developing, and maintaining AI systems, often requiring advanced programming skills, knowledge of algorithms, and critical thinking. Continuous learning and staying updated with new AI techniques are essential for long-term career resilience in this field.

What are the key skills and qualifications needed to thrive as an AMD Machine Learning Engineer, and why are they important?

To thrive as an AMD Machine Learning Engineer, you need strong programming skills in Python or C++, a deep understanding of machine learning algorithms, and typically a degree in computer science or a related field. Familiarity with AMD hardware, GPU compute software stacks such as ROCm, and frameworks such as TensorFlow or PyTorch is essential. Analytical thinking, problem-solving, and effective collaboration are vital soft skills for excelling in this multidisciplinary environment. These skills ensure robust ML model development and optimization that leverage AMD hardware capabilities for high-performance computing applications.
Principal Software Quality Engineer - GPU & Machine Learning

Advanced Micro Devices, Inc

San Jose, CA

$158K - $212K/yr

Full-time

Posted 28 days ago


Advanced Micro Devices rating

Company rating: 8.4 out of 10

Based on 7 frontline employees who took The Breakroom Quiz

23rd of 139 rated electronics manufacturers


Job description


WHAT YOU DO AT AMD CHANGES EVERYTHING 

At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond.  Together, we advance your career.  



About the Role

We are seeking a Principal Software Quality Engineer to serve as the senior technical leader for ROCm software validation across compute workloads and server-class systems. In this individual-contributor leadership role, you will define how AMD proves ROCm is ready to ship, from unit and component testing, through full-stack workload validation, to multi-node system-level qualification on AMD Instinct™ GPU platforms. You will set the technical direction for validation strategy, build and evolve the test infrastructure that gates every ROCm release, and personally drive the hardest debugging, characterization, and qualification problems. Your work directly determines the quality bar experienced by hyperscalers, OEMs, sovereign-AI customers, and the open-source community running ROCm in production.

What You Will Do 

  • Own the end-to-end validation architecture for ROCm (unit, integration, framework, workload, performance, stress, stability, scale-out, and system-level test layers) across multiple GPU generations and server platforms.

  • Define release-qualification gates and exit criteria for ROCm software releases (functional coverage, performance regressions, stability hours, scale targets, RAS criteria) and drive the org to meet them.

  • Lead system-level testing for server nodes: multi-GPU topologies, PCIe/Infinity Fabric/xGMI, BMC/IPMI, thermal/power, firmware interactions, and multi-node fabric (Ethernet/InfiniBand/UALink) bring-up and validation.

  • Drive compute workload validation and characterization: LLM training and inference (PyTorch, vLLM, Triton, JAX), recommender systems, scientific HPC kernels, and MLPerf-class benchmarks, establishing reproducible methodology, baselines, and regression tracking.

  • Architect the test infrastructure: distributed test runners, GitHub Actions / Jenkins / internal CI fleets, hardware lab orchestration, result data lakes, flaky-test detection, bisection automation, and self-service developer pre-submit pipelines.

  • Champion modern, agile quality engineering: shift-left testing, test pyramids, contract testing between layers, hermetic test environments, deterministic reproducers, and continuous validation in trunk.

  • Set the bar for GitHub-based quality workflows: PR gating policy, required checks, code-coverage standards, bug-bash and triage cadences, and disciplined issue management across ROCm/* repositories and partner upstream projects.

  • Lead complex escalation debug: partner with development, hardware, firmware, and customer-facing teams to root-cause the hardest multi-day, multi-node, multi-component failures and convert findings into durable test coverage.

  • Influence the roadmap: work with product management, silicon, platform, and software architecture to ensure validation readiness for next-generation Instinct GPUs and server platforms before tape-in milestones and silicon arrival.

  • Mentor and elevate Senior and Staff validation engineers, SDETs, and SQA leads; raise the technical bar through design review, code review, and written guidance.

  • Represent ROCm validation externally: strategic customer engagements, OEM qualification programs, and open-source community quality initiatives.

Minimum Qualifications 

  • Strong software engineering experience with a strong validation, SDET, or quality-engineering focus, including 5+ years in a senior IC role (Staff/Principal/PMTS or equivalent) leading validation of complex systems software.

  • BS/MS/PhD in Computer Science, Computer Engineering, or related discipline (or equivalent demonstrated experience).

  • Expert-level Python for test automation and infrastructure; strong C++ for debugging and extending production code paths under test.

  • Deep, demonstrable validation experience in at least two of the following domains:

      • GPU compute software stacks (ROCm, CUDA, oneAPI, SYCL)

      • Deep-learning frameworks and inference engines (PyTorch, TensorFlow, JAX, Triton, vLLM)

      • HPC / parallel runtimes and communication libraries (MPI, RCCL/NCCL, UCX, Libfabric)

      • Linux kernel, GPU drivers, or accelerator firmware

      • Distributed systems and large-scale cluster software

  • System-level validation for server-class compute nodes (multi-GPU, multi-node, fabric-attached environments), including stress/stability, soak, fault-injection, and RAS testing.

  • Proven, hands-on experience working efficiently in an agentic AI engineering environment: daily, production use of LLM-based coding agents (e.g., Cursor, Claude Code, Copilot Workspace, Codex-class agents) and orchestration frameworks for real engineering work, with demonstrable productivity, quality, or coverage gains attributable to those workflows. Comfort designing prompts, tool/MCP integrations, evaluation harnesses, and guardrails for autonomous and semi-autonomous agents.

  • Hands-on experience defining and shipping release qualification programs for software consumed by hyperscalers, OEMs, or other Tier-1 customers.

  • Mastery of GitHub at scale for quality engineering: PR gating, GitHub Actions, self-hosted runners, required status checks, release tagging, and open-source contribution and triage norms.

  • Strong command of modern, agile software development practices (trunk-based development, CI/CD, shift-left testing, observability, feature flags, and incremental delivery) applied specifically to validation organizations.

  • Excellent written and verbal communication: able to author crisp test plans, qualification reports, RFCs, and post-mortems, and to influence development teams without authority.

Preferred Qualifications 

  • Direct contributions to validation, CI, or test infrastructure for ROCm, PyTorch, LLVM, Triton, vLLM, or comparable upstream open-source projects.

  • Demonstrated leadership in agentic-AI adoption: built or rolled out agent-based workflows across an engineering team (e.g., autonomous test generation, AI-driven log/triage pipelines, multi-agent debug systems, MCP server design, retrieval-augmented engineering knowledge bases) with measurable outcomes.

  • Experience operating or validating large GPU clusters (256+ GPUs): fabric bring-up, cluster health monitoring, and fleet-level diagnostics.

  • Familiarity with Training/Inference/HPC industry-standard benchmark methodologies and submissions.

  • Background in performance validation: roofline analysis, profiler tooling (rocprof, Omniperf, Nsight-class), and regression detection.

  • Experience with fault injection, RAS, telemetry, and long-haul stability programs for accelerator platforms.

  • Familiarity with hardware lab automation: BMC/IPMI/Redfish, PDU control, serial-console capture, automated re-imaging, and topology-aware test scheduling.

  • Prior experience standing up validation for pre-silicon / emulation / first-silicon bring-up of accelerators.

Why This Role 

ROCm powers AI and HPC workloads on AMD Instinct GPUs at the largest scale in the industry. The quality of every ROCm release is felt across millions of GPUs in production, and the validation organization is what stands between "code complete" and "customer ready." As Principal MTS for ROCm Validation, you will define that bar, build the systems that enforce it, and personally lead the toughest qualification problems on AMD's most strategic platforms.

#LI-TC1

#Hybrid

AMD is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. 



Benefits offered are described:  AMD benefits at a glance.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law.   We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.

AMD may use Artificial Intelligence to help screen, assess or select applicants for this position.  AMD’s “Responsible AI Policy” is available here.

 

This posting is for an existing vacancy.
