With as much data under management as the hyperscalers, we're the preferred data partner for the ... In this role, you will lead the evolution of Cloudera's Observability product by designing and ...
Conducive Consulting About Conducive Consulting Conducive Consulting delivers enterprise observability and performance engineering solutions that empower organizations to proactively manage ...
... management Strong understanding of runbook-driven support models Ability to assess system signals ... observability, and rapid incident response across a complex, high-impact environment You'll operate ...
Senior Software Engineer - Observability
Austin, TX · On-site
$121K - $160K/yr
In this role, you will design and develop the observability platforms, reliability tooling, and AI ... management, and post-incident review. Strong software design skills - you write clean, well-tested ...
This role focuses on designing and implementing automated monitoring solutions, managing observability tools (Dynatrace, ThousandEyes, Evolven), and supporting 24/7 production environments in both on ...
Senior Tech Lead, Network Observability & Automation Enablement
Dallas, TX · On-site
$102K - $135K/yr
... time management skills and a consistent record of setting and meeting results-oriented goals! We are seeking a Senior Tech Lead with deep experience in network engineering, observability, and ...
Staff Observability Platform Engineer (SRE)
$51.75 - $68.75/hr
Manage error budgets effectively, collaborating with development teams to balance reliability and ... Monitoring & Observability: Design and implement comprehensive monitoring solutions to provide real ...
Staff Observability Platform Engineer (SRE)
$54.75 - $72.75/hr
Manage error budgets effectively, collaborating with development teams to balance reliability and ... Monitoring & Observability: Design and implement comprehensive monitoring solutions to provide real ...
Senior Tech Lead, Network Observability & Automation Enablement
Dallas, TX · On-site
$103K - $135K/yr
... time management skills and a consistent record of setting and meeting results-oriented goals! We are seeking a Senior Tech Lead with deep experience in network engineering, observability, and ...
... Observability Platform. This role involves architecting high-performance ingestion systems ... management, and audit logging. Looking for someone who has built governance controls into a ...
You will be the technical anchor for our Observability stack, driving the transition to a modern ... management, and audit logging. Looking for someone who has built governance controls into a ...
Senior Site Reliability Engineer - Observability & Monitoring
$54.75 - $72.75/hr
Define and implement monitoring and observability coverage for the Event Management platform. * Establish standards for metrics, logs, traces, events, synthetic checks, and platform telemetry.
Lead Observability Engineer (Grafana Cloud)
Coppell, TX · Hybrid
$95K - $125K/yr
Incident Management : Responding to and managing incidents, performing root cause analysis, and ... Participate in user training to increase awareness of observability solutions * Ensuring incident ...
SolarWinds is seeking a strategic and data-driven Senior Manager of SEO to drive organic search performance across our Monitoring and Observability Business Unit, and the SolarWinds website as a ...
Lead Observability Engineer (Grafana Cloud)
Coppell, TX · On-site
$95K - $125K/yr
Incident Management : Responding to and managing incidents, performing root cause analysis, and ... Participate in user training to increase awareness of observability solutions * Ensuring incident ...
Future Openings - SRE Support Engineer - Observability
Austin, TX · On-site +1
$56.50 - $75/hr
SRE Support Engineer - Observability While this position is not currently open, we are interviewing ... Manage Slack threads and tickets (roughly 50/50) * Handle a broad range of customer support: simple ...
Observability Manager information
What is the difference between an Observability Manager and a Site Reliability Engineer?
| Aspect | Observability Manager | Site Reliability Engineer |
|---|---|---|
| Credentials | Typically requires experience in monitoring, logging, and cloud tools; certifications like AWS, Google Cloud, or Kubernetes are common | Requires strong background in systems engineering, scripting, and cloud platforms; certifications like AWS, GCP, or Linux are often preferred |
| Work Environment | Focuses on overseeing observability tools, data analysis, and team coordination in tech environments | Hands-on role involving system automation, incident response, and infrastructure reliability |
| Industry Usage | Used across tech companies to improve system visibility and performance | Common in DevOps and SRE teams to ensure system reliability and uptime |
The Observability Manager primarily oversees monitoring and logging strategies, ensuring system visibility, while the Site Reliability Engineer is more hands-on, focusing on automating infrastructure and maintaining system reliability. Both roles require technical expertise and often collaborate closely but differ in scope and daily responsibilities.
Principal Engineer - Observability Telemetry Client Infrastructure
Cloudera, Inc. • Austin, TX • On-site, Remote
Full-time
PTO
Posted 16 days ago
Job description
Engineering
Seniority Level:
Director
Job Description:
At Cloudera, we empower people to transform complex data into clear and actionable insights. With as much data under management as the hyperscalers, we're the preferred data partner for the top companies in almost every industry. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world's largest enterprises.
Cloudera is seeking a Principal Engineer to serve as the primary architect and visionary for our Observability Telemetry client interactions framework as we build a multi-tenant, high-throughput telemetry fabric to support the world's largest data estates. In this role, you will lead the evolution of Cloudera's Observability product by designing and building a "self-service" OpenTelemetry-based ecosystem that allows internal clients to integrate telemetry data seamlessly for multiple downstream consumers.
You will architect the self-service interfaces that allow thousands of distributed components to emit high-cardinality logs, metrics, and traces that are automatically correlated, context-aware, and ready for downstream analysis in ClickHouse and other massive-scale engines. You will be Cloudera's voice in the CNCF/OTel community, influencing the direction of open-source observability to meet the needs of hybrid-cloud data platforms.
As a technical leader, you will be responsible for defining the semantic conventions that enable log events, metrics, spans, and traces from diverse, multi-language clients to be correlated into a unified, actionable view for customers, Support, and Cloudera product engineering. This data is the foundation for troubleshooting, forecasting, workload analysis, financial governance, and other administrative functions. You will also work closely with open source products on integration and can influence the direction of OTel integration for these components in the open source ecosystem.
This position is a high-visibility role requiring a blend of deep systems architecture, hands-on implementation, and cross-organizational influence to ensure our telemetry infrastructure scales with the world's most complex data workloads.
This role is not eligible for immigration sponsorship.
As an Observability Telemetry Principal Engineer you will:
- Architect and drive the implementation of automated "on-ramps" for observability clients that handle the complexity of multi-cloud, hybrid environments without sacrificing performance, ensuring teams can integrate their services with minimal friction.
- Establish and enforce the semantic conventions needed to ensure telemetry data carries the appropriate context for easy correlation across the entire Cloudera stack.
- Develop and support high-performance interfaces and SDKs for clients across various languages (Java, Go, Python, etc.) to contribute high-fidelity signals.
- Build the logic to stitch together disparate signals into a unified trace, enabling deep-dive workload analysis and financial governance across massive distributed systems.
- Work alongside engineering teams to turn architectural blueprints into production reality, conducting deep-dive code reviews and resolving complex systemic bottlenecks.
- Serve as the "go-to" expert for observability, resolving technical disagreements and making high-stakes decisions on the future of our telemetry platform.
We are excited if you have: (Required Experience)
- 10+ years of experience (or equivalent advanced degree + experience) designing and maintaining large-scale distributed systems and observability platforms.
- A proven track record of designing and shipping complex, critical features that serve as foundational infrastructure for other engineering teams.
- Deep, hands-on experience with the OpenTelemetry Collector architecture, custom processors, and the challenges of high-cardinality data.
- Experience with high-volume OLAP engines (e.g., ClickHouse, StarRocks) and an understanding of how to structure telemetry data for sub-second queries at large scale.
- Excellent communication and collaboration skills and the ability to build relationships across the company to drive adoption of new standards and remove technical roadblocks.
- The ability to map business requirements to technical roadmaps, ensuring our observability tools support Cloudera's long-term strategic goals.
- Experience coaching senior and staff-level engineers, acting as a "force multiplier" for a technical organization.
- BSc/MSc in a related field or equivalent experience.
You might have:
- Significant contributions to major observability or data projects (e.g., CNCF or Apache projects). Bonus points if you're already a CNCF OTel maintainer.
- Deep experience with Kubernetes-native observability and managing telemetry at scale in hybrid-cloud environments.
- Experience representing technical initiatives at industry conferences or internal company-wide summits.
- Experience using machine learning or advanced analytics to derive "AIOps" insights from raw telemetry data.
Why this role matters:
At Cloudera, our customers manage some of the largest and most complex data estates in the world. Without robust, correlated observability, managing these environments is an impossible task. This role is the linchpin of our visibility strategy.
By building a self-service, standardized telemetry framework, you are not just helping one team; you are empowering every developer at Cloudera and every one of our customers to understand their data life cycle. Your work ensures that when a performance bottleneck occurs or a system fails, the path to resolution is visible, traceable, and immediate. You are building the "nervous system" of the Cloudera Data Platform.
What you can expect from us:
- Generous PTO Policy
- Support for work-life balance with Unplugged Days
- Flexible WFH Policy
- Mental & Physical Wellness programs
- Phone and Internet Reimbursement program
- Access to Continued Career Development
- Comprehensive Benefits and Competitive Packages
- Paid Volunteer Time
- Employee Resource Groups
EEO/VEVRAA
About Cloudera
Sourced by ZipRecruiter
Industry
Software development
Company size
1,001 - 5,000 Employees
Headquarters location
Santa Clara, CA, US
Year founded
2008