AMD Machine Learning Jobs (NOW HIRING)

Manufacturing Engineer

Austin, TX · Hybrid

$72K - $93K/yr

The role is critical to advancing AMD's assembly capabilities, particularly for our expanding portfolio of Machine Learning and Artificial Intelligence products. Success in this position requires ...

AMD Machine Learning information

Hourly pay chart (scale marks: $13, $22, $31)

How much do AMD machine learning jobs pay per hour?

As of Jun 25, 2026, the average hourly pay for AMD machine learning jobs in the United States is $22.82, according to ZipRecruiter salary data. Most workers in this role earn between $19.71 and $25.48 per hour, depending on experience, location, and employer.
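
To compare these hourly figures with annual salaries, the rates can be annualized. The sketch below assumes a full-time schedule of 40 hours per week for 52 weeks (2,080 hours per year). That schedule is an assumption, not something the salary data states, and the function name `annualize` is illustrative.

```python
HOURS_PER_YEAR = 40 * 52  # assumed full-time schedule: 2,080 hours


def annualize(hourly_rate: float, hours_per_year: int = HOURS_PER_YEAR) -> float:
    """Convert an hourly rate to an annual figure, rounded to cents."""
    return round(hourly_rate * hours_per_year, 2)


# Hourly figures quoted in the answer above.
for label, rate in [("typical low", 19.71), ("average", 22.82), ("typical high", 25.48)]:
    print(f"{label}: ${annualize(rate):,.2f} per year")
```

Part-time or contract roles would need a different `hours_per_year` value.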

What are some common challenges faced by machine learning engineers at AMD, and how can applicants prepare to address them?

Machine learning engineers at AMD often work on optimizing algorithms for high-performance hardware, which requires balancing model accuracy with computational efficiency. A common challenge is adapting complex models to run efficiently on specialized AMD hardware such as GPUs and accelerators. Applicants can prepare by gaining hands-on experience with parallel computing, understanding low-level hardware optimization, and becoming familiar with AMD’s development tools and frameworks. Collaboration with hardware engineers and software teams is also frequent, so strong communication and teamwork skills are essential.

What are AMD machine learning engineers?

AMD machine learning engineers are professionals who develop, optimize, and deploy machine learning models using AMD hardware and software platforms. They work on designing algorithms and frameworks that leverage AMD GPUs and accelerators for tasks such as data analysis, computer vision, and artificial intelligence. Their responsibilities often include collaborating with software developers, data scientists, and hardware engineers to ensure machine learning solutions run efficiently on AMD architecture.

What is the difference between AMD Machine Learning and Data Scientist?

| Aspect | AMD Machine Learning | Data Scientist |
| --- | --- | --- |
| Required Credentials | Bachelor's or higher in CS, ML, or related fields; certifications like AWS, Azure | Bachelor's or higher in CS, Statistics, or related fields; certifications like SAS, Python |
| Work Environment | Tech companies, R&D labs, AI startups | Business analytics, finance, healthcare, tech firms |
| Industry Usage | Developing ML models, algorithms, AI solutions | Data analysis, insights, predictive modeling |
| Common Search/Comparison | Yes | Yes |

While both roles involve working with data and algorithms, AMD machine learning roles focus on developing machine learning models and AI solutions, often requiring specialized technical skills. Data scientists analyze data to generate insights and support decision-making, with a broader scope that includes statistical analysis. Both roles are vital in tech-driven industries but differ in their primary focus and skill sets.

Can AMD be used for machine learning?

AMD provides hardware such as Ryzen CPUs and Radeon GPUs that can be used for machine learning tasks. AMD's ROCm platform supports popular frameworks like TensorFlow and PyTorch, enabling developers to run machine learning workloads on AMD hardware effectively.
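
As a minimal sketch of what this looks like in practice, the snippet below checks whether an installed PyTorch is a ROCm build and whether it can see an AMD GPU. It assumes PyTorch may or may not be installed and handles both cases. The function name `describe_torch_backend` is illustrative. ROCm builds of PyTorch report a version string in `torch.version.hip` and expose AMD GPUs through the `torch.cuda` namespace.

```python
def describe_torch_backend() -> str:
    """Return a short description of the GPU backend PyTorch was built for."""
    try:
        import torch  # optional dependency; the sketch still runs without it
    except ImportError:
        return "PyTorch is not installed"

    # ROCm builds set torch.version.hip to a version string;
    # on CUDA or CPU-only builds it is None.
    hip_version = getattr(torch.version, "hip", None)
    if hip_version is None:
        return "PyTorch is installed, but this is not a ROCm build"

    # ROCm builds reuse the torch.cuda API for AMD GPUs.
    if torch.cuda.is_available():
        return f"ROCm {hip_version}: {torch.cuda.get_device_name(0)}"
    return f"ROCm {hip_version} build, but no GPU is visible"


print(describe_torch_backend())
```

Which GPUs are supported depends on the ROCm release, so the ROCm compatibility documentation is the place to confirm a specific card.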

What engineers make $500,000?

Senior machine learning engineers and AI specialists with extensive experience, advanced skills in deep learning, and proficiency in tools like TensorFlow or PyTorch can reach salaries of $500,000 or higher, especially in high-cost-of-living areas or top tech companies. Achieving this level often requires a strong educational background, specialized certifications, and a track record of impactful projects.

Is it hard to get hired at AMD?

Getting hired for an AMD Machine Learning role can be competitive due to the company's focus on advanced technology and skilled candidates. Candidates typically need a strong background in machine learning, programming skills in languages like Python or C++, and relevant experience or certifications. The hiring process often involves technical interviews and assessments to evaluate technical proficiency and problem-solving abilities.

Which 3 jobs will survive AI?

AMD machine learning professionals are likely to find that roles involving complex problem-solving, creative thinking, and human interaction will persist despite AI advancements. Jobs such as data scientists, AI specialists, and machine learning engineers will continue to be in demand due to their need for specialized skills, domain knowledge, and ongoing innovation. These roles often require advanced programming, understanding of algorithms, and continuous learning to adapt to evolving technologies.

What are the key skills and qualifications needed to thrive as an AMD Machine Learning Engineer, and why are they important?

To thrive as an AMD Machine Learning Engineer, you need strong programming skills in Python or C++, a deep understanding of machine learning algorithms, and typically a degree in computer science or a related field. Familiarity with AMD hardware, AMD's ROCm GPU software stack, and frameworks such as TensorFlow or PyTorch is essential. Analytical thinking, problem-solving, and effective collaboration are vital soft skills for excelling in this multidisciplinary environment. These skills ensure robust ML model development and optimization that leverage AMD hardware capabilities for high-performance computing applications.
More about AMD Machine Learning jobs
Infographic showing various AMD Machine Learning job openings in the United States as of June 2026, with employment types broken down into 100% Part Time. Highlights a 65% Physical, 13% Hybrid, and 22% Remote job distribution, with an average salary of $47,468 per year, or $22.8 per hour.
Principal Software Quality Engineer - GPU & Machine Learning

Advanced Micro Devices, Inc

San Jose, CA • On-site

$184K/yr

Full-time

Posted 13 days ago


Advanced Micro Devices rating

Company rating: 8.4 out of 10

Based on 7 frontline employees who took The Breakroom Quiz

22nd of 139 rated electronics manufacturers


Job description

WHAT YOU DO AT AMD CHANGES EVERYTHING
At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming, and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.
THE ROLE:
We are seeking a Principal Software Quality Engineer to serve as the senior technical leader for ROCm software validation across compute workloads and server-class systems. In this individual-contributor leadership role, you will define how AMD proves ROCm is ready to ship - from unit and component testing, through full-stack workload validation, to multi-node system-level qualification on AMD Instinct™ GPU platforms.
THE PERSON:
You will set the technical direction for validation strategy, build and evolve the test infrastructure that gates every ROCm release, and personally drive the hardest debugging, characterization, and qualification problems. Your work directly determines the quality bar experienced by hyperscalers, OEMs, sovereign-AI customers, and the open-source community running ROCm in production.
KEY RESPONSIBILITIES:
  • Own the end-to-end validation architecture for ROCm - unit, integration, framework, workload, performance, stress, stability, scale-out, and system-level test layers - across multiple GPU generations and server platforms.

  • Define release-qualification gates and exit criteria for ROCm software releases (functional coverage, performance regressions, stability hours, scale targets, RAS criteria) and drive the org to meet them.

  • Lead system-level testing for server nodes - multi-GPU topologies, PCIe/Infinity Fabric/xGMI, BMC/IPMI, thermal/power, firmware interactions, and multi-node fabric (Ethernet/InfiniBand/UALink) bring-up and validation.

  • Drive compute workload validation and characterization - LLM training and inference (PyTorch, vLLM, Triton, JAX), recommender systems, scientific HPC kernels, MLPerf-class benchmarks - establishing reproducible methodology, baselines, and regression tracking.

  • Architect the test infrastructure - distributed test runners, GitHub Actions / Jenkins / internal CI fleets, hardware lab orchestration, result data lakes, flaky-test detection, bisection automation, and self-service developer pre-submit pipelines.

  • Champion modern, agile quality engineering - shift-left testing, test pyramids, contract testing between layers, hermetic test environments, deterministic reproducers, and continuous validation in trunk.

  • Set the bar for GitHub-based quality workflows - PR gating policy, required checks, code-coverage standards, bug-bash and triage cadences, and disciplined issue management across ROCm/* repositories and partner upstream projects.

  • Lead complex escalation debug - partner with development, hardware, firmware, and customer-facing teams to root-cause the hardest multi-day, multi-node, multi-component failures and convert findings into durable test coverage.

  • Influence the roadmap - work with product management, silicon, platform, and software architecture to ensure validation readiness for next-generation Instinct GPUs and server platforms before tape-in milestones and silicon arrival.

  • Mentor and elevate Senior and Staff validation engineers, SDETs, and SQA leads; raise the technical bar through design review, code review, and written guidance.

  • Represent ROCm validation externally - strategic customer engagements, OEM qualification programs, and open-source community quality initiatives.

PREFERRED EXPERIENCE:
  • Extensive software engineering experience with a strong validation, SDET, or quality-engineering focus, including 5+ years in a senior IC role (Staff/Principal/PMTS or equivalent) leading validation of complex systems software.

  • Expert-level Python for test automation and infrastructure; strong C++ for debugging and extending production code paths under test.

  • Deep, demonstrable validation experience in at least two of the following domains:

      • GPU compute software stacks (ROCm, CUDA, oneAPI, SYCL)

      • Deep-learning frameworks and inference engines (PyTorch, TensorFlow, JAX, Triton, vLLM)

      • HPC / parallel runtimes and communication libraries (MPI, RCCL/NCCL, UCX, Libfabric)

      • Linux kernel, GPU drivers, or accelerator firmware

      • Distributed systems and large-scale cluster software

  • System-level validation for server-class compute nodes - multi-GPU, multi-node, fabric-attached environments - including stress/stability, soak, fault-injection, and RAS testing.

  • Proven, hands-on experience working efficiently in an agentic AI engineering environment - daily, production use of LLM-based coding agents (e.g., Cursor, Claude Code, Copilot Workspace, Codex-class agents) and orchestration frameworks for real engineering work, with demonstrable productivity, quality, or coverage gains attributable to those workflows. Comfort designing prompts, tool/MCP integrations, evaluation harnesses, and guardrails for autonomous and semi-autonomous agents.

  • Hands-on experience defining and shipping release qualification programs for software consumed by hyperscalers, OEMs, or other Tier-1 customers.

  • Mastery of GitHub at scale for quality engineering - PR gating, GitHub Actions, self-hosted runners, required status checks, release tagging, and open-source contribution and triage norms.

  • Strong command of modern, agile software development practices - trunk-based development, CI/CD, shift-left testing, observability, feature flags, and incremental delivery - applied specifically to validation organizations.

  • Excellent written and verbal communication - able to author crisp test plans, qualification reports, RFCs, and post-mortems, and to influence development teams without authority.

  • Direct contributions to validation, CI, or test infrastructure for ROCm, PyTorch, LLVM, Triton, vLLM, or comparable upstream open-source projects.

  • Demonstrated leadership in agentic-AI adoption - built or rolled out agent-based workflows across an engineering team (e.g., autonomous test generation, AI-driven log/triage pipelines, multi-agent debug systems, MCP server design, retrieval-augmented engineering knowledge bases) with measurable outcomes.

  • Experience operating or validating large GPU clusters (256+ GPUs) - fabric bring-up, cluster health monitoring, and fleet-level diagnostics.

  • Familiarity with Training/Inference/HPC industry-standard benchmark methodologies and submissions.

  • Background in performance validation: roofline analysis, profiler tooling (rocprof, Omniperf, Nsight-class), and regression detection.

  • Experience with fault injection, RAS, telemetry, and long-haul stability programs for accelerator platforms.

  • Familiarity with hardware lab automation: BMC/IPMI/Redfish, PDU control, serial-console capture, automated re-imaging, and topology-aware test scheduling.

  • Prior experience standing up validation for pre-silicon / emulation / first-silicon bring-up of accelerators.

ACADEMIC CREDENTIALS:
  • BS/MS/PhD in Computer Science, Computer Engineering, or related discipline (or equivalent demonstrated experience).

LOCATION: San Jose, California
Benefits offered are described in AMD benefits at a glance.
AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.
AMD may use Artificial Intelligence to help screen, assess or select applicants for this position. AMD's "Responsible AI Policy" is available here.
This posting is for an existing vacancy.