Senior Infrastructure Software Engineer Jobs (NOW HIRING)

Infrastructure Software Engineer

San Jose, CA · On-site

$150K - $250K/yr

We build this infrastructure as software - and we engineer it with the same best practices we apply to our products. We use the same rigor, design discipline, and quality standards and testing as we ...

Infrastructure Software Engineer

Manhattan, NY · On-site

$190.70K - $226K/yr

Areas of work include Machine Learning Engineers, Infrastructure Engineer, Product SWE Frontend and Backend, Mobile Software Engineers (iOS and Android), Engineering Manager, Data Engineer, Software ...

Infrastructure Software Engineer

Seattle, WA · On-site

$120K - $200K/yr

As a Software Infrastructure Engineer, you'll design, build, and maintain the internal software systems that power our AI/ML workflows, connect our robots, and streamline developer operations across ...

Infrastructure Software Engineer

$183.60K - $248.40K/yr

Role Description As an Infrastructure Engineer, your role will be crucial in shaping and ... professional software development experience * Proven track record constructing and managing ...

Infrastructure Software Engineer

Cupertino, CA · On-site

$147.40K - $272.10K/yr

... software developer. Deep understanding of multi-threading concepts and design of highly concurrent ... Understanding of base internet infrastructure services including DNS, DHCP, LDAP, server ...

OR · On-site

$108.40K - $147.40K/yr

As a senior DGX Cloud AI Infrastructure software engineer at NVIDIA, you will have the opportunity to work on innovative technologies that power the future of AI and data science and be part of a ...

Senior Infrastructure Engineer

Boston, MA · Remote

$170K - $220K/yr

Career Renew is recruiting for one of its clients a Senior Infrastructure Engineer - this is a ... This role brings a Software Engineering mindset to infrastructure: building reusable abstractions ...

Senior Infrastructure Engineer

Miami, FL · Remote

$170K - $220K/yr

Career Renew is recruiting for one of its clients a Senior Infrastructure Engineer - this is a ... This role brings a Software Engineering mindset to infrastructure: building reusable abstractions ...

Infrastructure Software Engineer

Cupertino, CA · On-site

$213.40K - $252.90K/yr

... for software delivered as a service to improve reuse, efficiency, and simplicity. This engineer ... Understanding of base internet infrastructure services including DNS, DHCP, LDAP, server ...

Senior Infrastructure Engineer

San Jose, CA

$127.20K - $172.90K/yr

Senior Infrastructure Engineer We are seeking a Senior Infrastructure Engineer with a strong focus ... Required Skills: * 6+ years of experience in software development, automation, or infrastructure ...

Senior Infrastructure Engineer

New York, NY · Remote

$170K - $220K/yr

Career Renew is recruiting for one of its clients a Senior Infrastructure Engineer - this is a ... This role brings a Software Engineering mindset to infrastructure: building reusable abstractions ...


Senior Infrastructure Software Engineer information

[Salary chart: $22.5K, $127K, $175.5K per year]

How much do senior infrastructure software engineer jobs pay per year?

As of Jun 4, 2026, the average yearly pay for a senior infrastructure software engineer in the United States is $126,969, according to ZipRecruiter salary data. Most workers in this role earn between $108,500 and $147,500 per year, depending on experience, location, and employer.

What are the key skills and qualifications needed to thrive as a Senior Infrastructure Software Engineer, and why are they important?

To thrive as a Senior Infrastructure Software Engineer, you need advanced expertise in software engineering, systems architecture, and cloud infrastructure, often backed by a degree in computer science or a related field. Familiarity with tools such as Docker, Kubernetes, Terraform, AWS or Azure, and proficiency in languages like Python or Go are typically required, along with relevant certifications such as AWS Certified Solutions Architect. Strong problem-solving abilities, effective communication, and leadership skills set top candidates apart in this role. These skills and qualities ensure reliable, scalable, and secure infrastructure solutions that support business objectives and team productivity.

How does a Senior Infrastructure Software Engineer typically collaborate with other teams within an organization?

A Senior Infrastructure Software Engineer often works closely with development, operations, and security teams to ensure that systems are robust, scalable, and secure. Collaboration may involve designing deployment pipelines with developers, troubleshooting infrastructure issues with operations, and implementing compliance requirements with security professionals. Effective communication and a proactive approach to cross-team coordination are essential, as these engineers frequently serve as technical bridges between various groups to align infrastructure solutions with business goals.

What does a Senior Infrastructure Software Engineer do?

A Senior Infrastructure Software Engineer designs, builds, and maintains the foundational systems and tools that support an organization's software applications. They focus on scalability, reliability, and performance of infrastructure components such as servers, networks, and cloud services. Their responsibilities often include automating deployment, monitoring system health, and ensuring high availability. Additionally, they collaborate closely with development and operations teams to create efficient workflows and troubleshoot complex technical issues.

What is the difference between a Senior Infrastructure Software Engineer and an Infrastructure Software Engineer?

Aspect | Senior Infrastructure Software Engineer | Infrastructure Software Engineer
Required Credentials | Bachelor's or higher in CS or related field; experience with cloud platforms, scripting, and networking | Bachelor's in CS or related; foundational knowledge of infrastructure tools and scripting
Work Environment | Designing, developing, and maintaining complex infrastructure systems; leading projects | Supporting infrastructure components; implementing updates and troubleshooting
Employer & Industry Usage | Tech companies, cloud providers, large enterprises | Similar industries, often as part of infrastructure or DevOps teams

The Senior Infrastructure Software Engineer typically has more experience, takes on leadership roles, and handles complex infrastructure projects. In contrast, the Infrastructure Software Engineer focuses on supporting and maintaining existing systems. Both roles require strong technical skills, but the senior position involves more strategic planning and oversight.


Infrastructure Software Engineer

Etched

San Jose, CA • On-site

$150K - $250K/yr

Full-time

Medical, Dental, Vision

Posted 19 days ago


Job description

About Etched

Etched is building the world's first AI inference system purpose-built for transformers - delivering over 10x higher performance and dramatically lower cost and latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep & parallel chain-of-thought reasoning agents. Backed by hundreds of millions from top-tier investors and staffed by leading engineers, Etched is redefining the infrastructure layer for the fastest growing industry in history.

Job Summary

Building cutting-edge model-specific ASICs requires crafting custom infrastructure and toolchains to support ultra-fast, reliable, and scalable development across the stack - from simulation to silicon. We build this infrastructure as software - and we engineer it with the same best practices we apply to our products. We apply the same rigor, design discipline, quality standards, and testing that we apply to our ASIC, software, and platform.

You will lead the development and adoption of next-generation infrastructure tooling, enabling Etched ASIC, Software, and Platform engineers to iterate faster, build more reliably, and push the boundaries of AI performance. This includes building and scaling our hybrid high-performance compute (HPC) cluster, optimized for massively parallel CI, EDA workflows, Emulation, and hardware-aware job execution.

You'll also architect and implement a state-of-the-art observability stack with LLM integration and a strong emphasis on streaming health and performance telemetry, log aggregation, distributed tracing, insight generation, synthetic testing, and smart alerting - across CI pipelines, simulation clusters, and service endpoints.

This role demands a strong software engineering mindset, quality instincts, and deep understanding of systems. It's not just about writing scripts - it's about writing code that builds and manages infrastructure with precision, repeatability, and intent.

Key responsibilities

  • Design and build the orchestration layers that drive our hybrid high-performance clusters - enabling simulation, synthesis, and continuous integration of AI ASICs at unprecedented scale.

  • Develop and maintain a fully programmable infrastructure control plane to ensure reproducibility, auditability, and rapid iteration across the entire stack.

  • Create tools and abstractions that empower engineers to harness massive parallelism without worrying about the underlying complexity.

  • Prototype and execute workload orchestration and migration strategies between on-premise and cloud environments, balancing performance, storage availability and replication, uptime, and cost across heterogeneous hardware and compute backends.

  • Implement real-time telemetry and tracing systems that surface insights from millions of metrics, enabling proactive debugging and system optimization.

  • Build a full observability stack that includes dashboards, alerting, automated responses, and a synthetic testing framework that tests infrastructure performance and reliability across application and data flows, so we catch issues before they impact development and productivity workflows.

Representative projects

  • Design and deploy a fully automated, scalable hybrid HPC cluster, combining bare-metal servers and switches with cloud instances, provisioned through MaaS and orchestrated via SLURM and Kubernetes, optimized for mixed EDA workloads and parallel CI pipelines.

  • Develop a real-time observability system for ASIC toolchain jobs and distributed builds, integrating Prometheus, Grafana, and VictoriaMetrics with streaming telemetry, tracing, and alerting to detect performance regressions before they hit silicon.

  • Architect and implement a programmable infrastructure-as-code control plane, using Terraform, Ansible, and Puppet, to version, audit, and redeploy every layer of Etched's development stack with deterministic reproducibility.

  • Create a zero-downtime interactive development environment that provisions and connects Jupyter and VS Code sessions to GPUs and high-memory nodes via a secure zero-trust network, abstracting away cluster state and machine failures.

  • Prototype and evaluate dynamic workload migration strategies between on-premise and cloud environments to optimize for latency, reliability, and cost across simulation and synthesis pipelines.

  • Design a synthetic testing and fault injection framework to validate the behavior of infrastructure under high load, degraded hardware, and intermittent network partitions - before they happen in production.

You may be a good fit if you

  • Are a systems-minded software engineer who loves building foundational platforms, working close to the metal and cloud, solving high-leverage problems at scale.

  • Are a deeply technical engineer who treats infrastructure as a software problem - prioritizing clean abstractions, version control, small change lists, easy rollbacks, testing, and long-term maintainability over ad hoc configuration.

  • Have strong programming skills in languages such as Python, Go, Rust, and C++, and are comfortable building production-grade tooling.

  • Possess expert-level knowledge of Linux, virtualization, containerization, and CI/CD pipelines, with a deep understanding of how to debug, optimize, and scale complex systems.

  • Are familiar with Infrastructure as Code tools like OpenTofu, Ansible, or Puppet, and enjoy designing declarative, reproducible infrastructure systems.

  • Understand and use PromQL and other telemetry/query languages, have used LLMs to extract insight from real-time metrics, and know how to architect and tune observability stacks.

  • Have a track record of debugging and resolving difficult hardware-software integration problems across bare-metal systems, networks, and distributed workloads.

  • Can lead and mentor technical teams, guiding design decisions and helping others develop sound engineering instincts.

  • Have 8+ years of experience in infrastructure engineering, systems programming, or backend software development - ideally in environments where performance, scale, or hardware interaction mattered.

  • Are driven by curiosity, take initiative, and have an innate sense of ownership - you thrive in uncharted territory, design for edge cases, and love making systems more powerful, reliable, and elegant.

Strong candidates may also have

  • Familiarity with the Bazel build system

  • Deep understanding of ASIC development flows, especially those involving Synopsys, Cadence, and Verilator, including how EDA tools interact with infrastructure for simulation, synthesis, and verification.

  • Hands-on experience architecting systems with AWS, GCP, or Azure, including hybrid on-prem/cloud deployments, workload migration strategies, and cloud-native orchestration tooling.

  • Experience monitoring, provisioning, and debugging bare-metal servers, network hardware, and high-performance storage systems in rack-scale environments.

  • Comfort with profiling and optimizing compute environments for single-threaded latency, memory-bound workloads, or I/O throughput, especially in the context of simulation or CI performance.

  • Proficiency building or operating telemetry systems at scale using Prometheus, Grafana, Loki, VictoriaMetrics, and tools for distributed tracing, log aggregation, and real-time alerting across heterogeneous mediums (SMS, email, push alerts, etc.).

Benefits

  • Medical, dental, and vision packages with generous premium coverage

    • $500 per month credit for waiving medical benefits

  • Housing subsidy of $2k per month for those living within walking distance of the office

  • Relocation support for those moving to San Jose (Santana Row)

  • Various wellness benefits covering fitness, mental health, and more

  • Daily lunch + dinner in our office

  • Unlimited compute budget subject to ROI justification

How we're different

Etched believes in the Bitter Lesson. We think most of the progress in the AI field has come from using more FLOPs to train and run models, and the best way to get more FLOPs is to build model-specific hardware. Larger and larger training runs encourage companies to consolidate around fewer model architectures, which creates a market for single-model ASICs.

We are a fully in-person team in San Jose (Santana Row), and greatly value engineering skills. We do not have boundaries between engineering and research, and we expect all of our technical staff to contribute to both as needed.

Compensation Range: $150K - $250K