Director Chaos Engineering Jobs (NOW HIRING)

... Chaos Engineering and AIOps Job Designation Remote: Employee is not required to be in or near an ... in a Director-level role managing high-scale SaaS platforms * Experience building tools and ...


Director Chaos Engineering information

$73K

$194.7K

$254K

How much do director chaos engineering jobs pay per year?

As of Jun 11, 2026, the average yearly pay for a director of chaos engineering in the United States is $194,709, according to ZipRecruiter salary data. Most workers in this role earn between $141,500 and $253,000 per year, depending on experience, location, and employer.

What are some common challenges faced by a Director of Chaos Engineering when implementing chaos experiments at scale?

A Director of Chaos Engineering often encounters challenges such as gaining buy-in from stakeholders who may be unfamiliar or skeptical about deliberately introducing failures. Coordinating chaos experiments across multiple teams and complex distributed systems requires careful planning to ensure safety and minimize unintended disruptions. Balancing the need for rigorous experimentation with ongoing business priorities and service reliability can also be difficult. Building a culture that values resilience and learning from failure is essential for long-term success in this role.

What are the key skills and qualifications needed to thrive as a Director of Chaos Engineering, and why are they important?

To thrive as a Director of Chaos Engineering, you need deep expertise in distributed systems, site reliability engineering, and a strong background in computer science or a related field. Familiarity with chaos engineering platforms (such as Gremlin or Chaos Monkey), cloud infrastructure (AWS, Azure, GCP), and relevant certifications like AWS Certified Solutions Architect are typically required. Exceptional leadership, problem-solving, and communication skills help drive cross-team collaboration and foster a culture of resilience. These skills and qualities are essential to proactively identify system weaknesses, minimize downtime, and ensure business continuity in complex technical environments.

What is a Director of Chaos Engineering?

A Director of Chaos Engineering is a senior technology leader responsible for overseeing the development and implementation of chaos engineering practices within an organization. Their primary goal is to improve system resilience by proactively testing how systems respond to failures and unexpected disruptions. This role involves designing and leading experiments that intentionally introduce faults to identify vulnerabilities, guiding teams in building more robust systems, and fostering a culture of reliability. They often collaborate with engineering, operations, and security teams to ensure best practices are followed across the organization.

What is the difference between a Director of Chaos Engineering and a Site Reliability Engineer?

| Aspect | Director Chaos Engineering | Site Reliability Engineer |
| --- | --- | --- |
| Credentials | Advanced certifications in chaos engineering, cloud platforms, and leadership | Certifications in SRE, cloud, and DevOps tools |
| Work Environment | Leadership role overseeing chaos testing strategies across teams | Operational role managing system reliability and automation |
| Industry Usage | Used in organizations focusing on resilience and fault tolerance | Common in tech companies maintaining scalable, reliable systems |
| Search & Comparison | Often compared for strategic impact and leadership scope | Compared for technical expertise and system management |

A Director of Chaos Engineering focuses on leading chaos testing initiatives and shaping resilience strategies, while a Site Reliability Engineer handles the day-to-day reliability and automation of systems. Both roles are vital to system robustness but differ in scope and responsibilities.

More about Director Chaos Engineering jobs
Infographic showing Director Chaos Engineering job openings in the United States as of June 2026. Employment types are 80% full-time and 20% part-time. Job distribution is 80% in-person, 7% hybrid, and 13% remote. The average salary is $194,709 per year, or $93.60 per hour.
Director of Engineering Artificial Intelligence Foundry

Insight Global

Atlanta, GA

Full-time

Posted 7 days ago


Job description

Overview

As Director of Engineering, you will lead the design, development, and engineering of enterprise-grade agentic AI solutions and frameworks for Evergreen.AI. This role requires a proven leader who can scale engineering teams, define technical strategy, and ensure operational excellence for production systems, not just PoCs and pilots. You will leverage your experience in technical architecture, global delivery leadership, and AI enablement to build secure, resilient, and compliant solutions for Fortune 500 clients.

In addition, you will serve as a highly client-facing leader, engaging directly with executive stakeholders to understand business needs, communicate technical concepts clearly, and build trusted advisory relationships. You will foster a collaborative, solution-oriented culture, demonstrating strong communication skills and a growth mindset to drive innovation and continuous improvement across teams.


Responsibilities
  • Engineering Leadership: Build and lead high-performing engineering teams across regions; establish career frameworks, mentorship programs, and succession planning.
  • Platform & Framework Ownership: Define Evergreen.AI’s agentic AI architecture, including multi-agent orchestration, LLM knowledge management, and enterprise integration patterns.
  • Delivery Excellence: Drive production readiness, including runbooks, observability, SLAs, and resiliency patterns for multi-region deployments.
  • Technical Strategy: Partner with Product and Architecture to align roadmaps with business outcomes; evaluate emerging technologies for scalability and compliance.
  • Operational Governance: Implement secure SDLC, CI/CD, LLMOps/MLOps, and DevSecOps practices; ensure adherence to SOC 2, ISO 27001, HIPAA, and GDPR standards.
  • Own end-to-end ML lifecycle including data ingestion, preprocessing, model training, serving, and evaluation; ensure reproducibility, traceability, and versioning of models and experiments; implement production-grade MLOps practices (CI/CD for ML, automated validation, monitoring, rollback strategies).
  • Client Engagement: Support executive briefings, architecture reviews, and technical pre-sales; act as a trusted advisor for enterprise AI adoption.
  • Innovation & Enablement: Champion responsible AI principles; contribute to reusable accelerators, reference architectures, and delivery templates.
  • Team Collaboration & Communication: Foster a culture of teamwork and open communication, supporting and empowering colleagues across engineering, data science, product, and business functions. Build consensus, resolve conflicts constructively, and celebrate team achievements.
  • Solution Orientation: Approach challenges with creativity and resilience, focusing on outcomes and continuous improvement. Proactively identify obstacles, develop actionable plans, and drive execution to deliver measurable business value for clients and the organization.
  • Growth Mindset: Embrace learning, innovation, and personal development. Stay current with emerging technologies, encourage experimentation, and foster an environment where feedback is welcomed and used for improvement.

Qualifications
  • 12+ years in software engineering, with 5+ years leading multi-team engineering organizations delivering enterprise-grade AI solutions.
  • Proven experience in technical architecture and global delivery leadership for Fortune 1000 clients.
  • Expertise in agentic AI/ML systems, orchestration frameworks (LangChain, Semantic Kernel), and LLMOps/MLOps platforms (MLflow, Kubeflow, Azure ML).
  • Strong knowledge of data and knowledge management for LLMs, including retrieval pipelines and vector databases (Pinecone, Weaviate, Milvus).
  • Hands-on experience with cloud platforms (Azure preferred), container orchestration (Kubernetes), and event-driven architectures (Kafka/Event Hub).
  • Familiarity with observability tools (Prometheus, Grafana, ELK) and resiliency patterns (circuit breakers, chaos engineering).
  • Strong proficiency with Python and ML frameworks.
  • Exceptional leadership, cross-collaboration, communication, and stakeholder management skills.
  • Advanced degree in Computer Science.