Chaos Engineering Jobs (NOW HIRING)

EPAM Systems is a leading global provider of digital platform engineering and development services. The Chaos Test Engineer will design and manage chaos engineering tests, enhance resilience ...

Join a high-impact engineering team building resilience frameworks across cloud-native platforms. You will design, execute, and evolve chaos experiments that safeguard platform reliability and drive ...

SRE Lead/Architect

Atlanta, GA · On-site

$54.75 - $72.75/hr

Mandatory skills are observability, resiliency, chaos engineering, strong Python, and Dynatrace. As an SRE Architect, you will be a pivotal technical leader responsible for designing, building, and ...

Participate in chaos engineering initiatives to validate system resiliency and fault tolerance. Provide detailed performance analysis reports and tuning recommendations. Ensure applications meet ...

Chaos Engineering information

Salary chart values: $46.5K · $146.9K (average) · $174K

How much do chaos engineering jobs pay per year?

As of May 29, 2026, the average yearly pay for chaos engineering jobs in the United States is $146,868, according to ZipRecruiter salary data. Most workers in this role earn between $116,500 and $173,000 per year, depending on experience, location, and employer.

What is a Chaos Engineering job?

A Chaos Engineering job involves proactively identifying weaknesses in complex systems by intentionally injecting failures and observing how they respond. Professionals in this role design and execute controlled experiments to improve system resilience, ensuring that services remain reliable under unexpected conditions. They work closely with development, operations, and security teams to enhance fault tolerance and incident response strategies.

What are the key skills and qualifications needed to thrive in the Chaos Engineering position, and why are they important?

To thrive in Chaos Engineering, a strong background in software engineering, distributed systems, and reliability testing is essential, often supported by a degree in computer science or a related field. Familiarity with chaos engineering tools like Gremlin or Chaos Monkey and experience with cloud platforms, container orchestration, and monitoring systems are highly valued. Excellent problem-solving abilities, communication skills, and a mindset oriented toward experimentation help engineers collaborate effectively and analyze complex failure modes. These skills are crucial for proactively identifying system weaknesses and ensuring the resilience of large-scale technology infrastructures.

What are some typical challenges a Chaos Engineer faces, and how do they overcome them?

Chaos Engineers often face the challenge of designing effective experiments that simulate real-world failures without disrupting production systems. Balancing the need to discover vulnerabilities with maintaining uptime requires careful planning, communication, and coordination with development and operations teams. They address these challenges by thoroughly testing in controlled environments, documenting procedures, and establishing clear rollback strategies. Continuous learning and cross-functional collaboration are also key to staying ahead of new complexities in evolving systems.
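
As a rough illustration of the workflow described above (confirm the system is healthy, inject a fault in a controlled environment, observe, and always roll back), here is a minimal Python sketch. The `ChaosExperiment` class, the in-memory `replicas` dictionary, and the two-replica health threshold are all hypothetical; they stand in for a real system and do not come from any specific chaos engineering tool.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ChaosExperiment:
    """Hypothetical experiment: steady-state check, fault injection, rollback."""
    name: str
    steady_state: Callable[[], bool]  # returns True while the system is healthy
    inject: Callable[[], None]        # introduces the fault
    rollback: Callable[[], None]      # undoes the fault
    log: List[str] = field(default_factory=list)

    def run(self) -> bool:
        # Refuse to start if the system is already unhealthy.
        if not self.steady_state():
            self.log.append("aborted: steady state not met before injection")
            return False
        self.inject()
        self.log.append("fault injected")
        try:
            # Re-check the steady-state hypothesis while the fault is active.
            healthy = self.steady_state()
            self.log.append("hypothesis held" if healthy else "weakness found")
            return healthy
        finally:
            # Rollback always runs, even if the health check raises.
            self.rollback()
            self.log.append("rolled back")


# Toy stand-in for a service with three replicas (assumption for illustration).
replicas = {"a": True, "b": True, "c": True}


def healthy_count() -> int:
    return sum(replicas.values())


exp = ChaosExperiment(
    name="kill-one-replica",
    steady_state=lambda: healthy_count() >= 2,  # assumed health threshold
    inject=lambda: replicas.update(a=False),
    rollback=lambda: replicas.update(a=True),
)
result = exp.run()
print(result, exp.log)
```

Placing the rollback in a `finally` clause reflects the "clear rollback strategies" point: the fault is removed whether the hypothesis holds, fails, or the check itself errors out.
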
Chaos Test Engineer

Full-time

Posted 23 days ago


Job description

Job Summary:
EPAM Systems is a leading global provider of digital platform engineering and development services. The Chaos Test Engineer will design and manage chaos engineering tests, enhance resilience frameworks, and integrate AI-driven capabilities into testing pipelines across cloud-native platforms.
Responsibilities:
• Design and manage chaos engineering tests using Azure Chaos Studio, and analyze platform architecture to identify failure domains and strengthen system resilience
• Maintain and enhance existing LitmusChaos test suites across Kubernetes environments, ensuring consistent coverage and accuracy across all platforms
• Build comprehensive testing suites by integrating LitmusSDK, Azure Management SDK, Chaos SDK and Kubernetes SDK to automate and scale chaos experiments
• Lead HA/DR testing initiatives across all environments, operating independently to validate high availability and disaster recovery readiness
• Establish and standardize chaos engineering frameworks across AKS and EKS platforms to enable scalable and repeatable resilience practices organization-wide
• Integrate AI-driven capabilities into the chaos engineering pipeline to enable touchless experiment creation, automated execution and continuous validation
Qualifications:
Required:
• Hands-on experience with Kubernetes orchestration platforms including AKS or EKS, with deep understanding of container-based infrastructure and cloud-native architecture
• Proficiency in chaos engineering tools including LitmusChaos and Azure Chaos Studio, with demonstrated experience building and maintaining structured test suites
• Experience with Istio service mesh for traffic management, observability and resilience configuration within microservices environments
• Practical experience with LitmusSDK, Azure Management SDK, Chaos SDK and Kubernetes SDK
• Proven ability to conduct HA/DR testing and work autonomously with minimal oversight across complex multi-environment cloud platforms
Company:
EPAM leverages its core engineering expertise as a leading global product development and digital platform engineering services company. Founded in 1993, the company is headquartered in Newtown, USA, and has more than 10,000 employees. The company is currently late stage.