Job Summary:
AMD is a leading company in the AI and computing industry, dedicated to innovation and collaboration. They are seeking a highly motivated AI Performance Software Engineer to optimize software for next-generation GPU computational accelerators, working with sophisticated clients to enhance AI applications.
Responsibilities:
• Enable software for world-class datacenters and supercomputers
• Optimize the software ecosystem for next-generation GPU computational accelerators
• Work with a team of Software Engineers to enable DL models, libraries, and applications for Instinct GPUs in both on-prem and Cloud environments
• Analyze and optimize the performance of AI software
• Understand hardware bottlenecks and tune software to achieve close to roofline performance
Qualifications:
Required:
• Minimum 4 years of experience required.
• Strong programming skills in C++ and Python
• Strong development experience in at least one major DL framework, such as PyTorch or TensorFlow, for inference, fine-tuning, and/or training
• MS with years of related experience or PhD with years of related experience in Computer Science or Computer Engineering or related equivalent.
• Experience developing software and system-level performance optimizations; a solid understanding of GPU architecture is a plus
• Experience with open-source software development including collaboration with community maintainers and submitting contributions is a plus
• Publications in reputable peer-reviewed ML conferences or journals are a plus
• Excellent analytical and problem-solving skills for root-causing and addressing performance issues.
• Ability to work independently and as part of a team.
• Willingness to learn skills, tools, and methods to advance the quality, consistency, and timeliness of AMD software products.
Preferred:
• Expertise in profiling tools across the AI software stack (PyTorch Profiler, ROCm profiler, VTune, Nsight)
• Experience in implementing and optimizing parallel methods on GPU accelerators (NCCL/RCCL, OpenMP, MPI)
• Performance analysis skills for both CPU and GPU
• Experience with Singularity, Docker, and/or Kubernetes.
• Experience providing clear and timely communication about project status and other key aspects of the project to the leadership team.
Company:
Advanced Micro Devices is a semiconductor company that designs and develops graphics units, processors, and media solutions. Founded in 1969, the company is headquartered in Santa Clara, USA, and has more than 10,000 employees. It is currently a late-stage company.