Production Stability Engineer Jobs (NOW HIRING)

Production Support Engineer III

Atlanta, GA · On-site +1

$40.50 - $52.75/hr

Implement automation strategies to improve production stability and minimize downtime. * Maintain ... Actively mentor junior support engineers, fostering technical growth within the team. Requirements

Software Engineer Sr

Pittsburgh, PA

$118K - $156K/yr

This role is focused on production stability, operational excellence, and risk management for mission-critical systems. This position is ideal for an experienced engineer who thrives in production ...

Production Support Engineer III

Atlanta, GA · On-site

$40.50 - $52.75/hr

Implement automation strategies to improve production stability and minimize downtime. * Maintain ... Actively mentor junior support engineers, fostering technical growth within the team. Requirements

Industrial Engineer

Charlotte, NC · On-site

$65K - $88K/yr

This role focuses on supplier capability development, production stability, capacity optimization ... Collaborate with procurement, engineering, quality, and operations teams in a matrix environment

Level 2 production support

Irving, TX · On-site

$40.25 - $52.50/hr

... Engineer role is a business-aligned, production support position focused on enabling stability, reliability, and observability of consumer-facing technology within a financial services environment.

Level 2 production support

Charlotte, NC · On-site

$41 - $53.50/hr

... Engineer role is a business-aligned, production support position focused on enabling stability, reliability, and observability of consumer-facing technology within a financial services environment.

... engineering, laboratory and manufacturing sectors. This job is located in South Miami Responsibilities * Prepares Stability Reports for FDA Annual Reports and Annual Product Reviews. * Prepares ...

Job#: 3029048 COTS Implementation Engineer Location: Falls Church, Virginia (Remote) Employment ... Prior experience in federal or other highly regulated environments where production stability and ...

Production Stability Engineer information

Salary chart: $50.5K · $131.7K (average) · $144K

How much do production stability engineer jobs pay per year?

As of Jun 4, 2026, the average yearly pay for a production stability engineer in the United States is $131,667.00, according to ZipRecruiter salary data. Most workers in this role earn around $143,000.00 per year, depending on experience, location, and employer.

What are the key skills and qualifications needed to thrive as a Production Stability Engineer, and why are they important?

To thrive as a Production Stability Engineer, you need strong analytical skills, expertise in incident management, and a background in computer science or IT. Familiarity with monitoring tools (such as Splunk, Datadog, or Prometheus) and automation frameworks, along with ITIL certification, is commonly required. Effective communication, problem-solving, and the ability to remain calm under pressure help you collaborate with teams and resolve critical issues quickly. These skills ensure system reliability, minimize downtime, and support seamless business operations.

What are some typical challenges Production Stability Engineers face, and how can they proactively address them?

Production Stability Engineers often encounter challenges such as diagnosing complex, intermittent incidents and balancing quick response with long-term solutions. To address these, they collaborate closely with development and operations teams to identify root causes, implement monitoring tools, and automate repetitive recovery tasks. Regular post-incident reviews and clear communication channels are critical for continuous improvement and preventing recurring issues. Building strong cross-functional relationships also helps streamline response efforts and fosters a culture of reliability.

What are Production Stability Engineers?

Production Stability Engineers are IT professionals responsible for ensuring the reliability, availability, and overall health of software systems in production environments. They monitor system performance, troubleshoot and resolve incidents, and implement preventative measures to avoid future disruptions. Their goal is to minimize downtime and maintain seamless user experiences by collaborating with development, operations, and support teams. Production Stability Engineers also analyze root causes of issues and help improve system resilience through automation and best practices.

What is the difference between Production Stability Engineer vs Site Reliability Engineer?

Aspect | Production Stability Engineer | Site Reliability Engineer
Primary Focus | Ensuring system stability and uptime in production environments | Building and maintaining scalable, reliable systems with a focus on automation
Skills & Certifications | Linux, scripting, monitoring tools, certifications like AWS or Google Cloud | DevOps, cloud platforms, automation, similar certifications
Work Environment | Operations teams, production environments, monitoring dashboards | Development and operations teams, cloud infrastructure

Both roles focus on system reliability, but Production Stability Engineers primarily concentrate on maintaining uptime and stability, while Site Reliability Engineers emphasize automation and scalable system design. They often collaborate but have distinct core responsibilities within the same industry.

Infographic showing Production Stability Engineer job openings in the United States as of May 2026, with employment types broken down into 90% Full Time and 10% Contract. Highlights a 90% In-person and 10% Remote job distribution, with an average salary of $131,667 per year, or $63.3 per hour.

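The yearly and hourly figures above can be related with a short calculation. This is a minimal sketch that assumes a 2,080-hour work year (40 hours × 52 weeks); the listing does not state which convention was used, and the function names are illustrative only.

```python
# Assumption: a full-time year is 40 hours/week * 52 weeks = 2,080 hours.
# The listing does not state the convention behind its own figures.
HOURS_PER_YEAR = 40 * 52


def annual_to_hourly(annual_pay: float) -> float:
    """Convert a yearly salary to an hourly rate under the assumed schedule."""
    return annual_pay / HOURS_PER_YEAR


def hourly_to_annual(hourly_rate: float) -> float:
    """Convert an hourly rate to a yearly salary under the assumed schedule."""
    return hourly_rate * HOURS_PER_YEAR


if __name__ == "__main__":
    # Average salary quoted in the listing.
    print(round(annual_to_hourly(131_667), 1))
    # Hourly range quoted for the Irving, TX contract role.
    print(hourly_to_annual(40.25), hourly_to_annual(52.50))
```

Under that assumption, the $40.25 - $52.50/hr contract range corresponds to roughly $83.7K - $109.2K per year for a full-time schedule.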
Support Engineer (Observability & Production Support)

Strategic Staffing Solutions

Irving, TX • Hybrid

$40.25 - $52.50/hr

Other

Posted 29 days ago


Job description

Strategic Staffing Solutions is currently looking for a Support Engineer for a W2 contract opportunity!

Support Engineer (Production Support and Observability)

Location: Charlotte, NC/Irving, TX

Work Schedule: Hybrid 3 days onsite / 2 days remote (8-hour onsite workday required)
Duration: 12-month contract with strong potential for extension and full-time conversion
Start Date: ASAP
Interview Process: 30-minute Microsoft Teams interview

Required Qualifications


  • Experience with Splunk (log analysis, querying, diagnostics)
  • Experience with AppDynamics or similar APM tools
  • Exposure to ThousandEyes or similar monitoring tools
  • Ability to query and analyze data within observability tools
  • Experience with ServiceNow for incident management
  • Strong understanding of runbook-driven support models
  • Ability to assess system signals and take appropriate action based on defined procedures
  • Basic hands-on technical skills (e.g., Linux commands, server restarts, troubleshooting steps)
  • Strong communication skills and ability to work across cross-functional teams

Job Description

We are hiring a Support Engineer to join the Consumer Technology Client Line Team, supporting a large-scale portfolio of 900+ consumer-facing applications (915 total apps).

This role acts as a technical bridge between platform teams and development teams, ensuring production stability, observability, and rapid incident response across a complex, high-impact environment.

You'll operate in a Mission Control-style support model, where understanding system signals, business impact, and the right corrective actions is critical to minimizing disruption for customers.


What You'll Do


  • Provide Level 2 production support across a large portfolio of consumer technology applications
  • Support and monitor 900+ applications, ensuring stability and performance
  • Use tools like Splunk, AppDynamics, and ThousandEyes to investigate issues and query system data
  • Analyze alerts, logs, and signals to determine root cause and customer/business impact
  • Act as a tech bridge between platform and development teams, ensuring proper coordination and escalation
  • Partner closely with platform teams to resolve incidents requiring deeper technical intervention
  • Support Mission Control operations, working with L1 monitoring teams and driving escalations
  • Follow and execute runbook-driven procedures to diagnose and resolve production issues
  • Understand system signals and determine:

    • What the issue means
    • What impact it has
    • What corrective action is required

  • Perform basic technical remediation steps such as:

    • Rebooting servers
    • Running Linux commands
    • Executing predefined recovery steps

  • Work with ServiceNow for incident management and Jira for tracking work and requirements
  • Help gather and document issue details for platform partners to support faster resolution
  • Assist in onboarding applications into observability platforms (alerts, dashboards, monitoring standards)
  • Communicate clearly with stakeholders, translating technical issues into business impact
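
The runbook-driven triage described above (what the issue means, what impact it has, what corrective action is required) can be pictured as a simple lookup. This is only an illustrative sketch: the signal names, runbook entries, and the `triage` function are hypothetical and are not taken from the employer's actual runbooks or tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunbookEntry:
    meaning: str  # what the issue means
    impact: str   # what impact it has
    action: str   # what corrective action is required


# Hypothetical runbook: signal names and contents are made up for illustration.
RUNBOOK = {
    "http_5xx_spike": RunbookEntry(
        meaning="Application is returning server errors",
        impact="Customers may be unable to complete requests",
        action="Execute the predefined recovery steps, then escalate if errors persist",
    ),
    "disk_usage_high": RunbookEntry(
        meaning="Server disk is close to full",
        impact="Risk of application failure if the disk fills",
        action="Run the documented Linux cleanup commands",
    ),
}

# Fallback for signals with no documented procedure.
UNKNOWN = RunbookEntry(
    meaning="Unrecognized signal",
    impact="Unknown",
    action="Document issue details and escalate to the platform team",
)


def triage(signal: str) -> RunbookEntry:
    """Return the runbook entry for a signal, or the escalation fallback."""
    return RUNBOOK.get(signal, UNKNOWN)


if __name__ == "__main__":
    entry = triage("disk_usage_high")
    print(entry.meaning, "|", entry.impact, "|", entry.action)
```

The fallback entry mirrors the posting's emphasis on gathering details and escalating to platform partners when an issue falls outside defined procedures.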

Preferred Qualifications


  • Exposure to tools like Grafana and Prometheus
  • Understanding of automation concepts in support environments
  • Basic database knowledge (e.g., Oracle, SQL Server, MongoDB)
  • Familiarity with cloud and distributed systems
  • Experience supporting large-scale, high-volume application portfolios

Key Traits for Success


  • Strong ownership and accountability in a high-visibility support role
  • Ability to quickly interpret system signals and take decisive action
  • Comfortable working in a runbook-driven organization
  • Ability to connect technical behavior to business impact
  • Thrives in a fast-paced, mission-critical environment
  • Strong collaboration skills across platform, engineering, and business teams

Why This Role?




  • Work on a massive, enterprise-scale application portfolio
  • High visibility role with direct impact on customer experience
  • Opportunity to grow into full-time employment
  • Exposure to modern observability and monitoring tools



Beware of scams. S3 never asks for money during its onboarding process.