Infrastructure Software Engineer Jobs in California

... infrastructure for HPC solutions. Responsibilities : • Build software that runs large-scale deep ... engineering background, PhD/Masters in EECS, Mathematics, Software Engineering, or Physics. • ...

Software Engineer - Infrastructure

San Francisco, CA · Remote

$203K - $241K/yr

As a Software Engineer on the Infrastructure team, you will be an early contributor to the growing team to support a rapidly expanding fleet of satellites. Speed ...

Infrastructure Software Engineer information

$114K

$177.9K

$203.8K

How much do infrastructure software engineer jobs pay per year?

As of Jun 9, 2026, the average yearly pay for an infrastructure software engineer in California is $177,905.00, according to ZipRecruiter salary data. Most workers in this role earn between $170,700.00 and $202,300.00 per year, depending on experience, location, and employer.

What are the key skills and qualifications needed to thrive in the Infrastructure Software Engineer position, and why are they important?

To thrive as an Infrastructure Software Engineer, you need a deep understanding of computer systems, networking, and cloud infrastructure, plus proficiency in programming languages such as Python, Go, or Java. Experience with infrastructure automation and orchestration tools (such as Terraform, Ansible, or Kubernetes) and cloud platforms (AWS, Azure, GCP) is highly valuable, as are certifications such as AWS Certified Solutions Architect. Effective collaboration, problem-solving abilities, and strong communication skills help you excel in cross-functional teams and fast-paced environments. These skills are critical to building scalable, reliable systems and ensuring seamless deployment and maintenance of company infrastructure.

What does an Infrastructure Software Engineer do?

An Infrastructure Software Engineer designs, builds, and maintains the foundational software systems that support applications, networking, and cloud environments. They focus on scalability, reliability, and performance by developing automation tools, managing CI/CD pipelines, and optimizing infrastructure. Their work ensures systems run efficiently, securely, and with minimal downtime.

What are the typical day-to-day responsibilities of an Infrastructure Software Engineer?

As an Infrastructure Software Engineer, your typical day involves designing, implementing, and maintaining automation workflows for infrastructure deployment, monitoring system performance, and responding to incidents or outages. You'll work closely with development, DevOps, and IT teams to ensure infrastructure scalability, reliability, and security. Regular tasks may include writing scripts, managing cloud resources, evaluating new tools, and participating in on-call rotations. Collaboration and proactive problem-solving are central to meeting both technical and business needs efficiently. This role offers a dynamic blend of hands-on technical work and teamwork in evolving environments.

Infographic showing various Infrastructure Software Engineer job openings in California as of May 2026, with employment types broken down into 92% Full Time, 6% Part Time, and 2% Contract. Highlights an 85% Physical, 5% Hybrid, and 10% Remote job distribution, with an average salary of $177,905 per year, or $85.5 per hour.
Senior DGX Cloud AI Infrastructure Software Engineer

Nvidia

Santa Clara, CA • On-site

$127K - $173K/yr

Full-time

Posted 6 days ago


Job description

Joining NVIDIA's DGX Cloud AI Efficiency Team means contributing to the infrastructure that powers our innovative AI research. This team focuses on developing tools for optimizing the efficiency and resiliency of AI workloads: pre-training, post-training, and inference. Our objective is to deliver a stable, scalable environment for AI researchers, providing them with the necessary resources and scale to foster innovation. We are seeking an AI infrastructure software engineer to join our team. You'll be instrumental in designing, building, and maintaining AI infrastructure that enables large-scale AI training and inferencing. The responsibilities include implementing software and systems engineering practices to ensure high efficiency and availability of AI systems.

As a senior DGX Cloud AI Infrastructure software engineer at NVIDIA, you will have the opportunity to work on innovative technologies that power the future of AI and data science and be part of a dynamic, diverse, and supportive team that values learning and growth. The role provides the autonomy to work on meaningful projects with the support and mentorship needed to succeed, and contributes to a culture of blameless postmortems, iterative improvement, and risk-taking. If you are seeking an exciting and rewarding career that makes a difference, we invite you to apply now!

What you'll be doing:

  • Develop infrastructure software and tools for large-scale pre-training, post-training, and inference.

  • Develop and optimize tools and libraries to improve infrastructure efficiency and resiliency.

  • Co-design and implement APIs for integration with NVIDIA's resiliency stacks.

  • Enhance infrastructure and products underpinning NVIDIA's AI platforms.

  • Define meaningful and actionable reliability metrics to track and improve system and service reliability.

  • Apply strong problem-solving, root cause analysis, and optimization skills.

  • Root-cause, analyze, and triage failures from the application level to the hardware level.

What we need to see:

  • 8+ years of experience in developing software infrastructure for large-scale AI systems.

  • Bachelor's degree or higher in Computer Science or a related technical field (or equivalent experience).

  • Strong debugging skills and experience in analyzing and triaging AI applications from the application level to the hardware level.

  • Experience with observability platforms for monitoring and logging (e.g., ELK, Prometheus, Loki).

  • Proven track record in building and scaling large-scale distributed systems.

  • Experience with AI training and inferencing infrastructure services.

  • Proficiency in programming languages such as Python and C/C++, as well as scripting languages.

  • Experience in quality software engineering practices, including test development, defensive programming, version control, and CI.

  • Excellent communication and collaboration skills, along with a commitment to a culture of diversity, intellectual curiosity, problem solving, and openness.

Ways to stand out from the crowd:

  • Background in working with large-scale clusters.

  • Experience in defining and building an observability and telemetry software stack.

  • Experience with the RDMA software stack (NCCL, IB verbs, UCX, libfabric).

  • Experience with root cause analysis of failures at datacenter scale.

  • Good understanding of the internals of DL frameworks such as PyTorch, TensorFlow, JAX, and Ray.

NVIDIA leads the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing, and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions, from artificial intelligence to autonomous cars. NVIDIA is looking for exceptional people like you to help us accelerate the next wave of artificial intelligence.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until April 6, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

About Nvidia

Sourced by ZipRecruiter

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation that's fueled by great technology--and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent.

Industry

Computer and electronic product manufacturing

Company size

10,000+ Employees

Headquarters location

Santa Clara, CA, US

Year founded

1993