Infrastructure Operations Jobs (NOW HIRING)

Infrastructure Operations Engineer

New York, NY · Hybrid

$117K - $154K/yr

We are seeking an experienced and proactive Infrastructure Engineer to join our Platform Operations Team. This role is pivotal in managing and evolving our enterprise platforms, ensuring robust ...

Responsibilities & Qualifications: We are seeking an Infrastructure Operations Manager to join our team supporting the Transportation Team. REQUIRED QUALIFICATIONS Education * Bachelor's Degree in ...

Infrastructure Operations information

Salary chart markers: $80.5K, $154K, $198K

How much do infrastructure operations jobs pay per year?

As of Jun 9, 2026, the average yearly pay for infrastructure operations in the United States is $154,028.00, according to ZipRecruiter salary data. Most workers in this role earn between $113,000.00 and $197,000.00 per year, depending on experience, location, and employer.

What are the typical daily responsibilities of someone working in Infrastructure Operations?

Professionals in Infrastructure Operations are usually responsible for monitoring system performance, managing server uptime, maintaining networks, and rapidly responding to incidents or outages. They often perform routine tasks such as patch management, system backups, and troubleshooting technical issues while also participating in system upgrades and infrastructure projects. The role typically involves close collaboration with IT support, cybersecurity, and development teams to ensure all systems are secure and meet organizational needs. This dynamic work environment requires staying updated on emerging technologies and being proactive in identifying and resolving potential issues before they impact end users.

What is an Infrastructure Operations job?

An Infrastructure Operations job involves managing and maintaining an organization's IT infrastructure, including servers, networks, cloud services, and data centers. Professionals in this role ensure system reliability, optimize performance, and troubleshoot issues to support business continuity. They also implement security measures, monitor system health, and collaborate with other IT teams to improve operational efficiency.

What are the key skills and qualifications needed to thrive in the Infrastructure Operations position, and why are they important?

To thrive in Infrastructure Operations, you need expertise in systems administration, network management, troubleshooting, and a solid understanding of IT infrastructure concepts, often backed by a degree in computer science or a related field. Familiarity with cloud platforms (AWS, Azure, Google Cloud), scripting languages (PowerShell, Bash), and certifications such as CompTIA Network+, Microsoft Certified: Azure Administrator, or AWS Certified SysOps Administrator are highly valuable. Strong analytical thinking, effective communication, and the ability to work well under pressure are important soft skills in this role. These abilities ensure that critical infrastructure remains reliable, secure, and efficient, which is vital for uninterrupted business operations.

Infographic showing various Infrastructure Operations job openings in the United States as of June 2026, with employment types broken down into 88% Full Time, 11% Part Time, and 1% Contract. Highlights an 85% Physical, 5% Hybrid, and 10% Remote job distribution, with an average salary of $154,028 per year, or $74.1 per hour.
Infrastructure Operations Engineer

Lightning AI

New York, NY • On-site, Remote

$160K - $200K/yr

Full-time

Medical, Dental, Vision, Retirement, PTO

Posted 19 days ago


Job description

Who We Are
Lightning AI is the company behind PyTorch Lightning. Founded in 2019, we build an end-to-end platform for developing, training, and deploying AI systems, designed to take ideas from research to production with less friction.
Through our merger with Voltage Park, a neocloud and AI Factory, Lightning AI combines developer-first software with cost-efficient, large-scale compute. Teams get the tools they need for experimentation, training, and production inference, with security, observability, and control built in.
We serve solo researchers, startups, and large enterprises. Lightning AI operates globally with offices in New York City, San Francisco, Seattle, and London, and is backed by Coatue, Index Ventures, Bain Capital Ventures, and Firstminute.
What We're Looking For
Lightning AI is seeking an experienced Infrastructure Operations Engineer to help scale and operate our next-generation AI infrastructure platform. Our InfraOps team sits at the center of reliability, automation, and operational scale for GPU infrastructure. This team owns break/fix operations, incident response, customer provisioning, observability, and the automation systems that keep complex infrastructure running efficiently.
In this role, you'll work hands-on with large-scale GPU environments, Linux systems, bare metal infrastructure, provisioning workflows, and platform reliability. You'll partner closely with Infrastructure Engineering, Network Operations, and Software Platform teams to troubleshoot issues, improve operational efficiency, and build automation that reduces manual toil over time.
We're flexible on location for this team. This role can work hybrid out of one of our US-based hubs (Seattle, NYC, or SF) or fully remote within the U.S., with occasional company and team offsites. We are not able to provide visa sponsorship for this position at this time.
What You'll Do
  • At the direction of the Manager of Infrastructure Operations, design, build, and roll out new platforms and patterns to minimize incidents and enable customer-facing and internal features.
  • Deploy updates and improvements to support both Voltage Park's internal and end-customer use cases.
  • Collaborate with colleagues in Infrastructure Engineering, Network Operations, Customer Success, and Software and Platform Development Teams.
  • Participate in the on-call rotation, which is evenly distributed across all team members in a primary/secondary pattern: you serve as primary, then move to the secondary position.

What You Will Need
Required Qualifications
  • 8+ years working with Linux as a server / hosting platform; extra points for Ubuntu experience.
  • 5+ years of experience with AWS.
  • 2+ years of experience with Kubernetes and strong container fundamentals.
  • 2+ years of experience with Terraform and Ansible.
  • 2+ years with network-attached storage management (via NFS, Ceph, or other protocols). Extra points for experience with VAST storage systems.
  • Experience with monitoring systems (Prometheus, ELK stack).
  • Familiarity with the GitOps workflow.
  • Software development experience using Python, Go, Bash, or other languages for automation and for connecting systems and APIs.
  • Deep networking fundamentals; extra points for experience with datacenter-level networks, 400Gb Ethernet, and InfiniBand.
  • Experience building and delivering complex systems.
  • Effective at navigating tradeoffs between design, risk, cost, and outcomes.
  • Comfortable with navigating ambiguity.
  • Strong written and oral communication.
Nice-to-Haves
  • Experience with bare metal hardware troubleshooting and provisioning; extra points for working with Dell hardware.
  • Experience with GPU servers, either on bare metal or under virtualization.
  • Deep experience with network switches, routers, and firewalls, particularly SONiC switches, Palo Alto firewalls, and Juniper Networks equipment.
  • Experience with VAST storage systems.

Compensation
We are committed to offering competitive compensation that reflects the value each team member brings to our mission. Final offers are based on factors such as experience, skills, geographic location, and role expectations. In addition to base salary, our total rewards package for eligible roles includes a discretionary bonus, a meaningful equity component, and comprehensive benefits.
The anticipated annual base salary range for this role is:
$160,000-$200,000 USD
Benefits and Perks
We offer a comprehensive and competitive benefits package designed to support our employees' health, well-being, and long-term success. Benefits may vary by location, team, and role.
Benefits include:
  • Comprehensive medical, dental and vision coverage (U.S.); Private medical and dental insurance (U.K.)
  • Retirement and financial wellness support (U.S.); Pension contribution (U.K.)
  • Generous paid time off, plus holidays
  • Paid parental leave
  • Professional development support
  • Wellness and work-from-home stipends
  • Flexible work environment

At Lightning AI, we are committed to fostering an inclusive and diverse workplace. We believe that diverse teams drive innovation and create better products. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity, national origin, age, disability, veteran status, or any other protected characteristic. We are dedicated to building a culture where everyone can thrive and contribute to their fullest potential.