Job Summary:
NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing and Visualization. The role involves gathering use cases and requirements, translating those into software roadmaps, and executing those roadmaps across internal teams and external partners.
Responsibilities:
• Gathering use cases and requirements, translating those into software roadmaps, and executing those roadmaps across internal NVIDIA teams and external partners.
• Reporting project status, risks, help needed, and roadmap pivots to internal and external executives via status reports and in-person meetings.
• Brokering technical discussions between highly technical subject matter experts.
• Leveraging AI tools and workflows to quickly iterate on designs, prototypes, documentation, tests, and code.
• Architecting distributed, robust, and scalable GoLang and Rust system software, deployed to monitor and manage large datacenters.
Qualifications:
Required:
• BS or higher in Computer Science or equivalent experience.
• 15+ years of meaningful industry experience, with a strong background in scalable system software development.
• Experience with APIs and interface design.
• Experience with AI tools and development workflows.
• Outstanding written and verbal communication and interpersonal skills.
• Business-level English.
• Strong motivation and commitment to learn new skills.
• Ability to manage time in a fast-paced, heavily multitasked environment.
• Development experience with Rust, Python, and/or GoLang.
• Development experience with distributed systems and concurrent applications, especially in a Kubernetes environment.
• Ability to quickly understand unfamiliar technical domains, identify core problems, and translate ambiguous requirements into actionable engineering plans.
• Skilled at producing clear technical documentation, design docs, and status updates that keep cross-functional partners aligned.
• Track record of identifying process inefficiencies and introducing automation, tooling, or AI-powered workflows that measurably improve team output.
Preferred:
• Development experience in relevant coding languages like GoLang and Rust.
• Experience with SCADA or Data Center power related software.
• Background with containers (e.g. Docker, OCI), orchestration frameworks, and logging/telemetry backends, including Kubernetes monitoring stacks built on tools such as Prometheus, Loki, and Grafana.
• Experience with modern UI development in React and Node.js or similar frameworks.
• Experience developing Kubernetes operators or Helm charts.
• Experience with HPC job schedulers like Slurm or Run:ai.
• Familiarity with Kubernetes internals.
• Exposure to GPU programming with CUDA.
Company:
NVIDIA is a computing platform company operating at the intersection of graphics, HPC, and AI. Founded in 1993, the company is headquartered in Santa Clara, USA, and has more than 10,000 employees. It is currently a late-stage company.