Chaos Monkey Jobs (NOW HIRING)

... Chaos Monkey FMEA Scalability Availability High Availability JProfiler GCViewer IBM Thread dump Analyser Dynatrace AppDynamics Apache Jmeter Loadrunner - Hands-on experience in Java / J2EE one of web ...

Java Developer

Plano, TX · On-site

$48.75 - $63/hr

... chaos monkey type frameworks - Excellent knowledge on at least one tool in each of the following category 3. Shift :9:00 AM to 7:00 PM EST 4. Roles & Responsibilities :- Work with application ...

Java Developer

Plano, TX · On-site

$48.75 - $63/hr

Experience in implementing resiliency design patterns using Hystrix Service Mesh or similar frameworks and validation using chaos monkey type frameworks * Excellent knowledge on at least one tool in ...

... Chaos Monkey, Gremlin),Performance Testing -Emerging Tools (K6,Gatling),Performance Testing -Execution (Baseline, Load, Endurance, Stress, Volume, Network, DR, Failover, Spike, Saas based/COTS ...

Lead DevOps Engineer

Chicago, IL · Hybrid

$54.50 - $74.50/hr

Chaos engineering principles and tooling (e.g., Chaos Monkey, Gremlin, LitmusChaos) * PagerDuty, OpsGenie, or other incident management platforms * Microservices, distributed systems, and event ...

Site Reliability Engineer

Plano, TX · On-site

$54.50 - $72.50/hr

Have proven experience implementing and maintaining SLO/SLA frameworks for business-critical services, chaos engineering (Gremlin, Chaos Monkey). * Is comfortable working with both traditional ...

Senior Software Engineer - SRE

Atlanta, GA

$54.75 - $72.75/hr

Experience with chaos engineering tools (Gremlin, Chaos Monkey) * Background in product-facing services with high traffic scale * Understand how to use incident management platforms. This includes ...

Lead Associate Principal DevOps

Chicago, IL · On-site

$54.25 - $74.50/hr

Familiarity with chaos engineering principles and tooling such as Chaos Monkey, Gremlin, or LitmusChaos. * Fluency with data formats and structures including JSON, Protobuf, and Avro. * Experience ...

Chaos Monkey information

What does Chaos Monkey do?

Chaos Monkey is a tool used by site reliability engineers and DevOps teams to intentionally disable instances or simulate failures in cloud infrastructure and services. Its purpose is to test system resilience and ensure that applications can recover quickly from unexpected outages. Roles that work with it often require familiarity with cloud platforms and scripting.

What is the difference between Chaos Monkey and a Site Reliability Engineer?

Aspect | Chaos Monkey | Site Reliability Engineer
Primary Role | Disrupts systems intentionally to test resilience | Ensures system reliability and performance
Skills & Certifications | Knowledge of cloud environments, scripting, testing | Systems engineering, scripting, monitoring tools
Work Environment | DevOps, cloud platforms, testing environments | Operations, development teams, production systems
Industry Usage | Tech companies, cloud providers, DevOps teams | Tech, finance, e-commerce, any large-scale online service

While Chaos Monkey focuses on intentionally disrupting systems to test resilience, Site Reliability Engineers (SREs) work to maintain and improve system reliability. SREs often use tools like Chaos Monkey as part of their reliability practices, but their role encompasses broader responsibilities including monitoring, incident response, and automation.

What jobs pay $2,000 a day?

Jobs that involve Chaos Monkey, a resilience-testing tool used in IT operations, typically do not pay $2,000 a day. Roles that can reach this level include specialized consulting, executive positions, and freelance cybersecurity work requiring advanced skills and certifications, often on short-term projects or in high-stakes environments. Such roles usually require extensive experience and technical expertise, and are often filled on a freelance or contract basis.

What jobs will be eliminated in the next 10 years?

Chaos engineering work, which involves testing system resilience by intentionally causing failures, is unlikely to be eliminated because it is essential for improving system robustness. However, automation and AI tools may reduce the need for manual testing roles, shifting responsibilities toward overseeing automated processes and analyzing results.

What are the key skills and qualifications needed to thrive as a Chaos Monkey engineer, and why are they important?

To thrive as a Chaos Monkey engineer (or in chaos engineering roles), you need a strong background in computer science, systems engineering, and distributed systems, often with experience in DevOps or site reliability engineering. Familiarity with chaos engineering tools like Netflix's Chaos Monkey, Gremlin, or Chaos Toolkit, as well as cloud platforms and monitoring systems, is typically required. Problem-solving, analytical thinking, and effective communication are crucial soft skills for designing and explaining controlled failure experiments. These skills and qualities are important to ensure system resilience, minimize downtime, and foster a proactive approach to reliability in complex technology environments.

Does Netflix use Chaos Monkey?

Chaos Monkey is a tool developed by Netflix to test the resilience of its cloud infrastructure by intentionally disabling systems to identify weaknesses. As part of Netflix's Simian Army, Chaos Monkey is used internally to improve system reliability and fault tolerance. It is not a job role but a technical tool employed by engineers working on cloud-based systems.

How does a Chaos Monkey engineer typically collaborate with development and operations teams to improve system resiliency?

A Chaos Monkey engineer works closely with both development and operations teams to design and execute controlled experiments that intentionally disrupt systems, helping to identify weaknesses and improve fault tolerance. They often coordinate with DevOps to schedule tests during non-peak hours and communicate findings so teams can implement necessary fixes. Close collaboration ensures that disruptions caused by chaos engineering are safe, measurable, and lead to actionable improvements, fostering a culture of reliability and shared responsibility across the organization.
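
The scheduling and safety checks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `in_window` and `should_abort`, the window times, and the error-rate threshold are all hypothetical and do not come from Chaos Monkey or any other tool.

```python
from datetime import time


def in_window(now, start, end):
    """Return True if `now` is inside [start, end).

    Handles windows that cross midnight (for example 22:00 to 02:00).
    """
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def should_abort(error_rates, threshold):
    """Return True if any observed error rate exceeds the agreed threshold."""
    return any(rate > threshold for rate in error_rates)


if __name__ == "__main__":
    # Hypothetical non-peak window agreed with the operations team.
    window_start, window_end = time(1, 0), time(4, 0)
    print(in_window(time(2, 30), window_start, window_end))
    # Hypothetical error rates sampled while a failure is being injected.
    print(should_abort([0.01, 0.02, 0.08], threshold=0.05))
```

A real experiment would read the clock and the error rates from monitoring systems. The point of the sketch is that both the time window and the abort condition are agreed on before any failure is injected.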

What is a Chaos Monkey?

A Chaos Monkey is a tool or role in software engineering designed to test the resilience and reliability of systems by intentionally introducing failures into production environments. The term originated from Netflix's engineering team, who created the Chaos Monkey tool to randomly disable their own production instances and ensure their services could recover without user impact. The primary purpose is to identify weaknesses and encourage the development of robust, fault-tolerant systems. Chaos engineering, the broader practice, helps organizations build confidence in their infrastructure's ability to withstand unexpected disruptions.
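
As a rough illustration of the idea of randomly disabling instances, the following Python sketch simulates one round of random termination against in-memory lists. It is not Netflix's implementation and calls no cloud API. The service names, instance ids, and the functions `run_chaos_round` and `is_available` are hypothetical.

```python
import random


def run_chaos_round(groups, rng, probability=1.0):
    """Remove at most one randomly chosen instance from each group (simulated).

    `groups` maps a hypothetical service name to a list of instance ids.
    Returns the (service, instance) pairs that were removed.
    """
    terminated = []
    for service, instances in groups.items():
        # Skip this group some of the time when probability < 1.0.
        if rng.random() >= probability:
            continue
        if instances:
            victim = rng.choice(instances)
            instances.remove(victim)
            terminated.append((service, victim))
    return terminated


def is_available(groups):
    """A service counts as available while it has at least one instance left."""
    return all(len(instances) > 0 for instances in groups.values())


if __name__ == "__main__":
    groups = {"api": ["i-1", "i-2", "i-3"], "db": ["i-4"]}
    removed = run_chaos_round(groups, random.Random(0))
    print(removed)
    # "db" had a single instance, so it is now empty: a single point of failure.
    print(is_available(groups))
```

In this toy model, a service with only one instance fails the availability check after a single round. That is the kind of weakness such experiments are meant to surface.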

Infographic: Chaos Monkey job openings in the United States as of June 2026. Employment types: 95% Full Time, 2% Part Time, 2% Contract, 1% Locum Tenens. Work location: 98% Physical (on-site), 1% Hybrid, 1% Remote.

R2 Architect

Contractor

Posted 16 days ago


Job description

1. Job Title: Cognizant is looking for R2 Architect role

2. Job Summary:
- Hands-on experience in Java / J2EE, one web server (Apache Tomcat or IBM HTTP Server), one application server (Tomcat/WebSphere), and Oracle database
- Knowledge w.r.t. queuing models used, thread pools, request servicing process, etc.
- Experience in Linux (RHEL) operating system performance monitoring parameters and their interpretation, commands used for monitoring
- Knowledge of Web Services SOA ESB (DataPower) RESTFul

3. Shift: 9 am to 6 pm CST

4. Roles & Responsibilities:
Tech skills: Java J2EE Architecture Web Architecture App Arch & Design Java Design Patterns UI Design Patterns Java Threading Multi-threading Webservices(SOAP and RESTful) AJAX Javascript Oracle Query Tuning Apache HTTP Server Tomcat IBM WebSphere Application Server Linux Unix Virtualization Performance Analyzer Performance Validation Resilience Chaos Monkey FMEA Scalability Availability High Availability JProfiler GCViewer IBM Thread dump Analyser Dynatrace AppDynamics Apache Jmeter Loadrunner
- Hands-on experience in Java / J2EE, one web server (Apache Tomcat or IBM HTTP Server), one application server (Tomcat/WebSphere), and Oracle database
- Knowledge of AJAX / JavaScript execution and performance analysis
- Knowledge of application design patterns, typical J2EE application architectures
- Knowledge/Experience in Chaos Monkey Simulation
- Proficiency in Java runtimes, Core Java, Garbage collection, JVM parameters tuning
- Experience in performance tuning on Application Servers (Tomcat/WAS) and JavaScript
- Experience in SOLR search engine; OMS - Sterling will be desired
- Experience in troubleshooting Performance / Scalability / Availability issues
- Thread dump, heap dump generation & analysis
- Knowledge on database architecture
- Experience in NFR gathering / validating
- Experience in Performance Test Design
- Proficiency in the following tools (Must):
  - Dynatrace or AppDynamics
  - JProfiler or JProbe

5. Demand requires Travel?: No

6. Certification(s) Required: No
Hours : 8:00am to 5:00pm