Job Summary:
NVIDIA is seeking extraordinary architects to develop processor and system architectures that accelerate machine learning, data analytics, and high-performance computing applications. The Senior Kernel Performance Architect for Deep Learning Software will craft GPU-accelerated system architectures, prototype high-performance software, and collaborate with compiler, AI/ML performance, and hardware architecture teams to optimize deep learning performance.
Responsibilities:
• Craft GPU-accelerated system architectures that push the boundaries of deep learning performance.
• Prototype high-performance software for deep learning and data analytics workloads.
• Analyze, visualize, and optimize software performance using analytical models, simulators, and test suites.
• Collaborate closely across NVIDIA teams, including:
  • CUDA Compiler teams to identify performance issues.
  • AI/ML training and inference performance teams to identify and optimize critical deep learning layers.
  • Hardware architecture performance teams to define expectations for emerging deep learning hardware features.
Qualifications:
Required:
• A Master's or PhD in Computer Science, Electrical Engineering, or Computer Engineering, or equivalent experience.
• 5+ years of relevant industry or research experience.
• A strong foundation in machine learning and deep learning fundamentals to complement your expertise in computer architecture.
• A strong background in high-performance kernels (such as CUTLASS), with work experience in math library performance analysis and profiling to identify performance bottlenecks.
• Fluency in programming languages such as Python, C, and C++.
• Experience and familiarity with GPU computing and parallel programming models.
• Firsthand work experience with analytical performance modeling, profiling, and analysis.
Company:
NVIDIA is a computing platform company operating at the intersection of graphics, HPC, and AI. Founded in 1993, the company is headquartered in Santa Clara, California, USA, and has more than 10,000 employees. The company is currently late stage.