Job Summary:
NVIDIA has been transforming computer graphics and accelerated computing for more than 25 years and is seeking a Senior Performance Architect to build its next generation of profiling infrastructure. The role involves measuring, analyzing, and optimizing interactions between design graphs and high-throughput kernels on the GPU.
Responsibilities:
• Architecting and maintaining custom profiling frameworks that provide a unified view of execution across CPU (multi-core/multi-socket) and GPU (multi-node/NVLink) environments.
• Conducting deep-dive benchmarking of EDA applications to characterize memory access patterns, cache hit rates, and instruction-level parallelism.
• Using GPU profilers to detect GPU-side inefficiencies such as warp divergence, sub-optimal occupancy, and PCIe/NVLink bottlenecks.
• Developing tools to monitor and attribute high-watermark memory usage in multi-terabyte EDA builds, finding opportunities for data structure compression or smarter memory pooling.
• Developing predictive models to guide hardware procurement and cloud instance selection based on built gate-count and algorithmic complexity.
Qualifications:
Required:
• A grasp of the CUDA programming model and experience employing GPU profiling tools like NVIDIA Nsight Systems/Compute to address PCIe bottlenecks and kernel stalls.
• Extensive knowledge of profiling tools such as perf, eBPF, VTune, or Valgrind, along with insight into their internal mechanisms.
• A passion for meticulous benchmarking and the ability to distill sophisticated performance data into actionable engineering roadmaps.
• Experience with distributed compute environments (Slurm, LSF, or Kubernetes).
• A BS, MS, or PhD in Computer Science, Electrical Engineering, or a related field (or equivalent experience), with 8+ years of relevant experience and at least 5 years of systems-level performance analysis.
Company:
NVIDIA is a computing platform company operating at the intersection of graphics, HPC, and AI. Founded in 1993, the company is headquartered in Santa Clara, USA, and has more than 10,000 employees. The company is currently late stage.