OpenAI

60 OpenAI Software Reliability Engineer Jobs Hiring Near You

About the Team OpenAI's B2B Engineering team brings our most capable technology to the world ... Care deeply about reliability, safety, security, and performance in production environments * Have ...

Software Engineer, Habitat (Online Data)

Seattle, WA · On-site

$130.30K - $156.50K/yr

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose ... reliability, and developer experience. Responsibilities: • Design and build core abstractions ...

Monetization plays a critical role in enabling OpenAI to continue pushing the boundaries of AI ... Architect distributed systems and services with strong reliability, usability, privacy, and ...

Improve the reliability, observability and auditability of money movement at OpenAI. You might thrive in this role if you: * Have 5+ years of professional experience in software engineering, ideally ...

Software Engineer, Platform Systems

San Francisco, CA · On-site

$203.80K - $241.50K/yr

About the Role As a Software Engineer, Platform Systems, you will design and build distributed ... Improve observability, reliability, and performance across OpenAI's training platform * Debug and ...

OpenAI Jobs Information

What are the key skills and qualifications needed to thrive as a Software Reliability Engineer, and why are they important?

To thrive as a Software Reliability Engineer, you need a strong background in software development, system architecture, and incident response, often supported by a degree in computer science or a related field. Familiarity with monitoring tools (like Prometheus), cloud platforms (AWS, GCP), and automation frameworks is highly valuable, as are certifications such as AWS Certified DevOps Engineer. Strong problem-solving, collaboration, and communication skills help you coordinate with cross-functional teams, especially in high-pressure situations. These abilities are crucial for maintaining system uptime, quickly resolving outages, and ensuring the overall reliability of critical software services.

How does a Software Reliability Engineer typically interact with development and operations teams to improve system stability?

Software Reliability Engineers (SREs) work closely with both development and operations teams to ensure that systems are reliable, scalable, and maintainable. They often participate in design reviews, provide input on architectural decisions, and help define service-level objectives. SREs also collaborate with developers to automate deployment processes and create monitoring solutions, and they partner with operations staff to manage incident response and root cause analysis. This collaborative environment enables them to proactively identify potential issues and drive cross-functional improvements.

What are Software Reliability Engineers?

Software Reliability Engineers (SREs) are IT professionals who focus on ensuring that software systems are reliable, scalable, and highly available. They work at the intersection of software development and IT operations, often automating processes, monitoring system performance, and responding to incidents. SREs use engineering principles to solve operational problems, aiming to reduce downtime and improve user experience. Their responsibilities can include building tools, managing infrastructure, and collaborating with development teams to implement best practices for reliability.

What is the difference between Software Reliability Engineer vs Software Test Engineer?

Aspect | Software Reliability Engineer | Software Test Engineer
Primary Focus | Ensuring software reliability, stability, and performance over time | Designing and executing tests to identify bugs and verify functionality
Skills & Certifications | Knowledge of reliability engineering, scripting, monitoring tools | Testing methodologies, automation tools, scripting
Work Environment | Collaborates with development and operations teams, often in DevOps | Works primarily in QA/testing teams, often in dedicated testing phases
Industry Usage | Common in software companies focusing on product stability | Widely used in software development and QA departments

The main difference is that Software Reliability Engineers focus on maintaining long-term software stability and performance, while Software Test Engineers concentrate on identifying bugs through testing. Both roles require technical skills and often collaborate, but their core objectives differ: reliability versus defect detection.

What other companies are hiring for Software Reliability Engineer jobs?
Infographic showing Software Reliability Engineer job openings at OpenAI in the United States as of May 2026, with employment types broken down into 100% Full Time. Highlights a 68% Physical, 25% Hybrid, and 7% Remote job distribution.
Principal Software Engineer, B2B Engineering

OpenAI

Remote

$385K - $490K/yr

Full-time

Posted 8 days ago


Job description

About the Team
OpenAI's B2B Engineering team brings our most capable technology to the world through our developer platform and enterprise products. We build the backend systems, APIs, and infrastructure that power how developers and organizations use OpenAI in production.
Our work spans distributed systems, data infrastructure, platform services, and enterprise-grade capabilities like security, compliance, authentication, and reliability. We partner closely with product, research, design, infrastructure, and forward-deployed teams to turn cutting-edge AI capabilities into scalable, dependable products.
About the Role
We're looking for a Principal Software Engineer to design and scale the systems that power our developer and enterprise-facing products. You'll lead the architecture of backend services and platform capabilities that bring new AI functionality into production safely, reliably, and at global scale.
This role spans a broad technical surface area, including distributed systems, APIs, databases, data pipelines, and secure enterprise infrastructure. You'll help shape both the technical foundation and the product experience of our platform, with a high bar for performance, safety, reliability, and API design.
In this role, you will:
  • Design, build, and scale the backend services, APIs, and infrastructure that power OpenAI's developer and enterprise products
  • Lead the architecture of distributed systems, databases, and data pipelines that support large-scale, high-reliability production workloads
  • Own major platform capabilities end-to-end, from early technical strategy and design through implementation, launch, and long-term operation
  • Shape the design of our APIs with care and intentionality, treating API interfaces as core product surfaces and driving a high-quality developer experience
  • Build secure, reliable, and compliant systems that meet the needs of both enterprise and developer use cases
  • Partner closely with product, research, design, infrastructure, and forward-deployed engineering teams to bring new capabilities into production
  • Drive technical direction across complex problem spaces, making sound architectural tradeoffs that balance speed, quality, and long-term maintainability
  • Improve engineering velocity by building internal tooling, platform abstractions, and systems that increase leverage across the broader organization
  • Raise the bar for engineering quality, system design, operational excellence, and technical decision-making across teams
  • Help identify and solve ambiguous, high-impact technical problems that cut across multiple systems and stakeholders
You might thrive in this role if you:
  • Have significant experience building, scaling, and evolving production backend systems in fast-moving environments
  • Bring deep expertise in software engineering fundamentals, distributed systems, and API design
  • Are proficient in one or more backend languages such as Python, Go, Rust, or TypeScript
  • Have a track record of leading complex technical initiatives and driving architecture across teams or critical product areas
  • Care deeply about reliability, safety, security, and performance in production environments
  • Have strong product instincts and a high bar for developer experience and interface design
  • Are comfortable working in ambiguous, fast-moving environments and can create clarity where little exists
  • Own problems end-to-end and are eager to learn whatever is needed to solve them
  • Build thoughtfully, move with urgency, and collaborate effectively across disciplines
  • Influence technical direction through strong judgment, clear communication, and consistently high-quality execution
  • Have experience as a founder or early engineer at a startup, or have built products and platforms from scratch

About OpenAI
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.
We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.
For additional information, please see OpenAI's Affirmative Action and Equal Employment Opportunity Policy Statement.
Background checks for applicants will be administered in accordance with applicable law, and qualified applicants with arrest or conviction records will be considered for employment consistent with those laws, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act, for US-based candidates. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.
To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance.
We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.
OpenAI Global Applicant Privacy Policy
At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.