
Director Observability Jobs (NOW HIRING)

Direct experience with Observability, DevOps, or cloud-native infrastructure use cases. You understand the problems SREs, platform engineers, and developers face day-to-day. * Curiosity about the ...

OR

$176K - $237K/yr

... Observability - both internal and customer-facing. This is an opportunity to join a mission ... Ability to work in a self-directed manner in a fast-paced environment. * Excellent collaboration ...

Own core observability infrastructure, including distributed logging, time series, and trace ... have a direct, adverse and negative relationship with the following job duties, potentially ...

Site Reliability Engineer, Observability

New York, NY · On-site

$62.25 - $82.75/hr

You will be responsible for the observability, releasability, and security foundations that keep ... Consultative mindset with the ability to influence and guide teams without direct authority ...

Site Reliability Engineer, Observability

Chicago, IL · On-site

$58.75 - $78/hr

You will be responsible for the observability, releasability, and security foundations that keep ... Consultative mindset with the ability to influence and guide teams without direct authority ...


Director Observability information

What is the difference between a Director of Observability and a Site Reliability Engineer?

| Aspect | Director Observability | Site Reliability Engineer |
| --- | --- | --- |
| Primary Focus | Oversees observability strategies, tools, and teams to ensure system visibility and performance | Builds and maintains reliable systems, automates deployment, and manages incident response |
| Credentials | Typically requires advanced knowledge of monitoring, cloud platforms, and leadership experience | Often has software engineering background, with skills in scripting, automation, and systems engineering |
| Work Environment | Leads teams in tech companies, focusing on monitoring and analytics tools | Works closely with development and operations teams to ensure system reliability |

While both roles focus on system performance and reliability, the Director of Observability primarily manages observability strategies and teams, whereas the Site Reliability Engineer is hands-on, building and maintaining reliable systems. The roles complement each other in ensuring optimal system performance and uptime.

What are the key skills and qualifications needed to thrive as a Director of Observability, and why are they important?

To thrive as a Director of Observability, you need deep expertise in monitoring, logging, and distributed systems, typically backed by a degree in computer science or a related field and extensive experience in IT or DevOps leadership roles. Proficiency with observability tools such as Prometheus, Grafana, Datadog, Splunk, and APM solutions, along with knowledge of cloud platforms and relevant certifications, is essential. Strong leadership, strategic thinking, and communication skills help drive cross-functional initiatives and foster a culture of reliability. These skills and qualities are crucial for ensuring system health, rapid incident response, and alignment between technical teams and organizational objectives.

How does a Director of Observability typically collaborate with engineering and operations teams to drive organizational goals?

A Director of Observability works closely with engineering and operations teams to ensure that systems are monitored effectively and issues are identified and resolved quickly. This collaboration often involves developing unified monitoring strategies, aligning observability tools and processes, and facilitating incident response post-mortems. The Director also leads cross-functional meetings to establish best practices, set key performance indicators (KPIs), and ensure observability is integrated into the software development lifecycle. By acting as a bridge between technical teams, they help foster a culture of transparency, reliability, and continuous improvement.

What does a Director of Observability do?

A Director of Observability leads the strategy and implementation of monitoring, logging, and tracing systems to ensure the health and performance of technical infrastructure. They work with engineering and operations teams to develop best practices, select appropriate tools, and set standards for observability across the organization. Their goal is to provide visibility into system behavior, quickly identify and resolve incidents, and support continuous improvement in system reliability and performance.
[Infographic: Director Observability job openings in the United States as of June 2026. Employment type: 100% full-time. Work arrangement: 100% in-person.]

Senior Principal Solutions Architect (Observability)

DTCC

Jersey City, NJ • Hybrid

Other

Medical, Life, Retirement, PTO

Posted 8 days ago


Job description

Are you ready to make an impact at DTCC?
Do you want to work on innovative projects, collaborate with a dynamic and supportive team, and receive investment in your professional development? At DTCC, we are at the forefront of innovation in the financial markets. We are committed to helping our employees grow and succeed. We believe that you have the skills and drive to make a real impact. We foster a thriving internal community and are committed to creating a workplace that looks like the world that we serve.
The Information Technology group delivers secure, reliable technology solutions that enable DTCC to be the trusted infrastructure of the global capital markets. The team delivers high-quality information through activities that include development of essential, building infrastructure capabilities to meet client needs and implementing data standards and governance.
Pay and Benefits:
  • Competitive compensation, including base pay and annual incentive
  • Comprehensive health and life insurance and well-being benefits, based on location
  • Pension / Retirement benefits
  • Paid Time Off and Personal/Family Care, and other leaves of absence when needed to support your physical, financial, and emotional well-being.
  • DTCC offers a flexible/hybrid model of 3 days onsite and 2 days remote (onsite Tuesdays, Wednesdays and a third day unique to each team or employee).

The Impact you will have in this role:
Being a key member of the Reliability Architecture organization, the Director of Enterprise Observability Architecture provides strategic leadership for enterprise-wide observability initiatives, ensuring DTCC platforms and applications operate with regulatory-grade visibility, resilience, and operational continuity. This role defines the north-star observability vision, influences architectural direction, embeds telemetry and resilience into modernization programs, and partners with senior stakeholders across engineering, infrastructure, SRE, security, risk, and business operations. Aligned to DTCC's mission of delivering secure and reliable market infrastructure, this role ensures observability capabilities (metrics, logs, traces, events, dashboards, data health, and automated remediation) are consistently designed, adopted, and governed across the enterprise.
Your Primary Responsibilities:
  • Shape and champion DTCC's enterprise observability strategy, ensuring alignment with operational resilience, business continuity, and regulatory expectations
  • Define multi-year roadmaps for observability modernization, including OpenTelemetry adoption, enhanced signal correlation, and AIOps-enablement
  • Establish enterprise-wide architectural standards, patterns, and controls for telemetry, monitoring, alerting, visualization, and retention
  • Drive platform-engineering approaches that deliver observability as a scalable, self-service capability for application and infrastructure teams
  • Ensure all critical production services are instrumented for real-time visibility that connects technical health to business impact
  • Influence senior leadership through clear communication of observability risks, maturity, and strategic investment options
  • Integrate data observability into analytics ecosystems to support regulatory reporting, risk analytics, and client-impact transparency
  • Guide engineering teams in embedding observability throughout the SDLC, including NFR testing, architecture reviews, and operational readiness
  • Lead the design of event-correlation and alerting frameworks that reduce noise, accelerate incident triage, and enable automated remediation
  • Define enterprise dashboards that provide 360-degree visibility into service reliability, transaction flows, and business-processing health
  • Maintain an enterprise observability architecture covering metrics, logs, traces, events, RUM, data pipelines, and telemetry governance
  • Author policies, standards, and procedures for monitoring, alerting, logging, visualization, and retention
  • Partner with platform, cloud, and infrastructure engineering to integrate observability into modernization and cloud-adoption strategies
  • Present architecture strategies and program health to senior technology and business leaders
  • Lead enterprise assessments, failure-mode analysis, chaos engineering practices, and post-incident improvement cycles
  • Translate telemetry insights into business-level narratives that inform risk, resilience, and operational decision-making

**NOTE: The Primary Responsibilities of this role are not limited to the details above**
Qualifications
  • Minimum 10 years of related experience
  • Bachelor's degree in a technical field (preferred) or equivalent experience

Talents Needed for Success:
  • Strategic mindset with the ability to translate business outcomes into technical architectures
  • Deep knowledge of observability patterns, resiliency engineering, and automated recovery pipelines
  • Expertise in hybrid-cloud and public cloud architectural design
  • Strong understanding of financial services regulatory expectations for operational resilience
  • Proficiency in Java, Linux, SQL, and scripting for prototyping and validation
  • Exceptional communication and stakeholder management skills
  • Familiarity with resilience and continuity frameworks (e.g., ISO 22301, NIST SP 800-34) and operational risk management

The salary range is indicative for roles at the same level within DTCC across all US locations. Actual salary is determined based on the role, location, individual experience, skills, and other considerations. We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.
About Us
With over 50 years of experience, DTCC is the premier post-trade market infrastructure for the global financial services industry. From 20 locations around the world, DTCC, through its subsidiaries, automates, centralizes, and standardizes the processing of financial transactions, mitigating risk, increasing transparency, enhancing performance and driving efficiency for thousands of broker/dealers, custodian banks and asset managers. Industry owned and governed, the firm innovates purposefully, simplifying the complexities of clearing, settlement, asset servicing, transaction processing, trade reporting and data services across asset classes, bringing enhanced resilience and soundness to existing financial markets while advancing the digital asset ecosystem. In 2024, DTCC's subsidiaries processed securities transactions valued at U.S. $3.7 quadrillion and its depository subsidiary provided custody and asset servicing for securities issues from over 150 countries and territories valued at U.S. $99 trillion. DTCC's Global Trade Repository service, through locally registered, licensed, or approved trade repositories, processes more than 25 billion messages annually. To learn more, connect with us on LinkedIn, X, YouTube, Facebook and Instagram.
DTCC proudly supports Flexible Work Arrangements favoring openness and gives people freedom to do their jobs well, by encouraging diverse opinions and emphasizing teamwork. When you join our team, you'll have an opportunity to make meaningful contributions at a company that is recognized as a thought leader in both the financial services and technology industries. A DTCC career is more than a good way to earn a living. It's the chance to make a difference at a company that's truly one of a kind.
About the Team
Enterprise Product & Platform Engineering transforms the way we deliver infrastructure to our business clients. A key construct of EP&PE will be the evolution of the IT Product Manager, who will partner with the Engineering organization, the Business Aligned Service Delivery organization, the DevSecOps organization as well as our operational support teams to ensure that this organization provides high quality, commercially attractive and timely solutions to support our business strategy.