Job Summary:
NVIDIA has been a leader in computer graphics and accelerated computing for over 25 years and is now leveraging AI for the next computing era. The company is seeking a Senior Software Engineer to join the cuEquivariance team. The role involves building and optimizing GPU-accelerated geometric ML primitives and collaborating with research teams to deliver production-quality software for scientific applications.
Responsibilities:
• Build, implement, and optimize CUDA kernels for equivariant neural network primitives — tensor products, segmented polynomials, and triangle-based operations — targeting peak performance across NVIDIA GPU generations.
• Own the end-to-end delivery of GPU-accelerated geometric ML primitives, from implementation to validated, production-quality software that external frameworks depend on.
• Build and maintain the interfaces for PyTorch and JAX that expose cuEquivariance primitives to application developers and researchers.
• Drive CI/CD infrastructure for multi-GPU kernel builds, automated correctness testing, and performance regression tracking.
• Collaborate with Applied Science and research teams to evaluate new equivariant architectures and translate prototypes into production kernels.
• Engage directly with third-party framework developers and partners to align on interfaces and ensure delivered software integrates cleanly into production pipelines.
Qualifications:
Required:
• 6+ years of software engineering experience with a strong background in CUDA and GPU programming.
• Deep proficiency in C++ and Python; experience building and shipping production libraries used by external developers.
• Solid foundation in GPU computing: memory hierarchy, warp-level execution, occupancy, and performance profiling methodology.
• Experience building or contributing to production scientific software libraries, ML frameworks, or developer-facing GPU APIs.
• Familiarity with concepts in geometric machine learning (equivariance, group representations, irreducible representations, or tensor products) sufficient to work effectively in the domain.
• BS/MS in Computer Science, Physics, Applied Mathematics, or a related field, or equivalent experience.
Preferred:
• Contributions to, or extensive use of, a major equivariant neural network framework: e3nn, MACE, NequIP, SE(3)-Transformers, or similar.
• Hands-on experience with Triton kernel development or other GPU kernel authoring tools alongside CUDA.
• Experience with mixed-precision or tensor-core-aware algorithm design for scientific or ML workloads.
• PhD or equivalent experience in computational chemistry, biophysics, physics, or computer science with a focus on geometric deep learning or HPC.
• Contributions to open-source geometric ML or GPU computing projects.
Company:
NVIDIA is a computing platform company operating at the intersection of graphics, HPC, and AI. Founded in 1993, the company is headquartered in Santa Clara, USA, and has more than 10,000 employees. It is currently classified as a late-stage company.