Advanced Micro Devices rating
8.4
Based on 7 frontline employees who took The Breakroom Quiz
24th of 139 rated electronics manufacturers
Job description
AMD is a company focused on building innovative products that enhance computing experiences across various domains including AI and data centers. They are seeking a Principal GenAI Inference Optimization Engineer to improve the performance and efficiency of generative AI inference workloads on AMD GPU platforms, optimizing latency and throughput for large-scale models.
Responsibilities:
• Optimize performance of GenAI inference workloads on AMD GPU platforms across single-node and distributed environments.
• Improve latency, throughput, and cost efficiency for LLM and multimodal model serving in production.
• Analyze and resolve bottlenecks across compute, memory, and communication (e.g., kernel efficiency, KV-cache usage, memory bandwidth, scheduling).
• Contribute to cross-stack optimizations spanning kernels, runtimes, communication libraries, and inference/serving frameworks (e.g., vLLM, SGLang, Triton, or similar systems).
• Implement and evaluate inference optimization techniques such as batching strategies, quantization, prefix caching, and speculative decoding.
• Support development and optimization of scalable serving systems, including request scheduling and resource utilization.
• Develop and use profiling, benchmarking, and performance analysis tools for inference workloads.
• Collaborate with hardware, compiler, and framework teams to improve overall system performance.
• Contribute to internal tools and, where applicable, open-source projects for inference optimization on AMD platforms.
• Document best practices and contribute to performance guidelines for GenAI deployment.
Qualifications:
Required:
• Strong technical contributor with expertise in GenAI inference optimization, GPU performance, and large-scale serving systems.
• Solid understanding of GPU architecture, memory systems, and communication patterns.
• Ability to improve inference efficiency.
• Comfortable working across multiple layers—from kernels and runtimes to frameworks and serving systems.
• Ability to independently drive optimization efforts while collaborating with cross-functional teams.
• B.S., M.S. or Ph.D. in Computer Science, Computer Engineering, or a related field preferred, or equivalent industry experience.
Preferred:
• Strong understanding of GPU architecture and performance fundamentals (compute, memory hierarchy, interconnects such as PCIe/Infinity Fabric/RDMA).
• Experience with GenAI inference optimization techniques (e.g., quantization, KV-cache optimization, batching).
• Hands-on experience with inference/serving frameworks such as vLLM, SGLang, Triton, TensorRT-LLM, or similar.
• Experience working on LLM or multimodal inference workloads.
• Familiarity with distributed systems and serving architectures.
• Experience with ML frameworks (PyTorch, JAX, or TensorFlow), especially for inference.
• Proficiency in Python and at least one systems language (C++/CUDA/HIP).
• Experience with profiling, debugging, and performance tuning tools.
• Ability to work collaboratively across teams and deliver impactful optimizations.
Company:
Advanced Micro Devices is a semiconductor company that designs and develops graphics units, processors, and media solutions. Founded in 1969, the company is headquartered in Santa Clara, USA, and has more than 10,000 employees. It is a late-stage company.
About Advanced Micro Devices
Sourced by ZipRecruiter
Industry: Computer and electronic product manufacturing
Company size: 5,001 - 10,000 Employees
Headquarters location: Sunnyvale, CA, US
Year founded: 1969