Observability SRE Jobs (NOW HIRING)

Site Reliability Engineer

Dallas, TX · Remote

$35 - $40/hr

We are seeking a highly skilled Site Reliability Engineer (SRE) with strong observability expertise, proven communication skills, and the ability to drive reliability maturity across multi-team ...

Observability SRE information

[Salary chart: hourly pay scale marked at $10, $63, and $91]

How much do Observability SRE jobs pay per hour?

As of Jun 7, 2026, the average hourly pay for an Observability SRE in the United States is $63.74, according to ZipRecruiter salary data. Most workers in this role earn between $54.81 and $72.84 per hour, depending on experience, location, and employer.

What is the difference between an Observability SRE and a Site Reliability Engineer?

Aspect | Observability SRE | Site Reliability Engineer
Primary Focus | Monitoring, logging, and tracing to ensure system observability | System reliability, automation, and infrastructure management
Skills & Certifications | Monitoring tools, scripting, cloud platforms, observability frameworks | Linux, scripting, cloud services, automation tools
Work Environment | Collaborates with SRE, DevOps, and development teams on observability practices | Builds and maintains scalable, reliable systems in production

While both roles focus on system stability, an Observability SRE specializes in monitoring and diagnostics, whereas a Site Reliability Engineer focuses on overall system reliability and automation. The two often work together to ensure robust, observable, and reliable systems.

What is an Observability SRE?

An Observability SRE (Site Reliability Engineer) is a specialist focused on ensuring that systems and applications are transparent, measurable, and reliable. Their main responsibility is to implement and maintain tools for monitoring, logging, and tracing, providing insights into system performance and health. Observability SREs help teams quickly detect, diagnose, and resolve issues by making system behavior visible and understandable. They play a critical role in uptime, incident response, and performance optimization, bridging the gap between software development and IT operations.

What are some typical challenges faced by Observability SREs when implementing monitoring solutions across diverse systems?

Observability SREs often encounter challenges when integrating monitoring tools across varied technology stacks and legacy systems. Ensuring consistent data collection, standardizing metrics, and maintaining visibility in complex, distributed environments can be difficult. Collaborating with development and operations teams to define meaningful alerts and dashboards requires strong communication and a deep understanding of both infrastructure and application behaviors. Staying up-to-date with evolving tools and best practices is also essential to address emerging observability needs.
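
As a rough illustration of what "standardizing metrics" can involve, the sketch below maps metric names from different sources onto one naming scheme. The alias table, the metric names, and the snake_case convention are all hypothetical choices made for this example; real stacks define their own conventions.

```python
import re

# Hypothetical aliases: two source-specific names for the same measurement,
# both mapped to one shared name. These names are invented for illustration.
ALIASES = {
    "HttpRequests": "http_requests_total",
    "http.server.request.count": "http_requests_total",
}


def to_snake_case(name: str) -> str:
    """Convert dotted, dashed, or camelCase names to lowercase snake_case."""
    # Replace runs of dots, dashes, or whitespace with a single underscore.
    name = re.sub(r"[.\-\s]+", "_", name)
    # Insert an underscore at each lowercase/digit-to-uppercase boundary.
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    return name.lower()


def standardize(name: str) -> str:
    """Return the shared name if one is registered, else a snake_case form."""
    return ALIASES.get(name, to_snake_case(name))


if __name__ == "__main__":
    for raw in ["HttpRequests", "http.server.request.count", "cpuUsage.percent"]:
        print(raw, "->", standardize(raw))
```

A lookup table handles names that mean the same thing but look unrelated, while the fallback rule only fixes formatting. Differences in units or labels would still need separate handling.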

What are the key skills and qualifications needed to thrive as an Observability SRE, and why are they important?

To thrive as an Observability SRE, you need a solid background in systems engineering, monitoring best practices, and expertise in observability concepts, often supported by a degree in computer science or a related field. Familiarity with tools like Prometheus, Grafana, the ELK stack, and cloud monitoring platforms, as well as scripting languages such as Python or Bash, is typically required. Strong problem-solving, collaboration, and communication skills help SREs respond to incidents and work across teams effectively. These skills ensure system reliability, rapid issue detection, and continuous service improvement in complex technical environments.
More about Observability SRE jobs
Infographic showing Observability SRE job openings in the United States as of May 2026. Employment types: 96% Full Time and 4% Contract. Work arrangement: 77% Physical, 7% Hybrid, and 16% Remote. Average salary: $132,583 per year, or $63.7 per hour.
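
The annual and hourly figures above appear to be related by a standard full-time work year. The short sketch below checks that arithmetic; the 2,080-hour year (40 hours × 52 weeks) is an assumption, since the source does not state which conversion it uses.

```python
# Assumption: a full-time year of 40 hours/week * 52 weeks = 2,080 paid hours.
HOURS_PER_YEAR = 40 * 52


def annual_to_hourly(annual_salary: float) -> float:
    """Convert an annual salary to an hourly rate under the assumption above."""
    return annual_salary / HOURS_PER_YEAR


def hourly_to_annual(hourly_rate: float) -> float:
    """Convert an hourly rate to an annual salary under the assumption above."""
    return hourly_rate * HOURS_PER_YEAR


if __name__ == "__main__":
    # Average annual salary quoted in the infographic.
    print(round(annual_to_hourly(132_583), 2))  # 63.74
    # Hourly range from the Dallas listing ($35 - $40/hr).
    print(hourly_to_annual(35), hourly_to_annual(40))  # 72800 83200
```

Under this assumption, $132,583 per year works out to about $63.74 per hour, which matches the average quoted earlier on the page.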
Senior Site Reliability Engineer - Observability

Okta

Bellevue, WA • On-site

$64 - $85/hr

Full-time

Posted 10 days ago


Job description

Job Summary:
Okta is a company focused on securing identities in the era of AI. They are seeking a highly technical Senior Observability Site Reliability Engineer to own and evolve their Splunk ecosystem and deliver a comprehensive, scalable Observability Platform.
Responsibilities:
• Design, build, and maintain scalable observability infrastructure using tools like Terraform.
• Optimize the collection, processing, and storage of log data to ensure high reliability and low latency of our Splunk services.
• Participate in on-call rotations and lead post-incident reviews to drive systemic improvements and "observability-driven development."
• Eliminate "toil" by automating the deployment and scaling of observability agents and collectors.
Qualifications:
Required:
• 5+ years of experience scaling and managing Splunk Cloud (1000+ SVCs), including Workload Management (WLM) and HEC optimization.
• Expertise in creating intuitive, actionable Splunk dashboards that correlate data across multiple sources.
• 3+ years of experience in an SRE, DevOps, or Systems Engineering role with a focus on high-availability systems.
• Strong coding skills in SPL and Go for building internal tools and automating workflows.
• Deep understanding of Linux internals, networking (TCP/IP, DNS, Load Balancing), and container orchestration (Kubernetes/EKS).
• A data-driven approach to debugging complex, cross-service performance bottlenecks.
• This position requires the ability to access federal environments and/or have access to protected federal data.
• The successful candidate must be able to submit documentation establishing U.S. Person status (e.g., a U.S. Citizen, National, Lawful Permanent Resident, Refugee, or Asylee; see 22 CFR 120.15) upon hire.
• This person must attend in-person onboarding in our San Francisco office during the first week of employment.
Preferred:
• Hands-on experience with OpenTelemetry (OTel), Vector, or similar frameworks for instrumenting applications.
• Experience implementing a Splunk charge-back app for usage reporting.
• Experience managing native observability tools within AWS or GCP.
Company:
Okta is a management platform that secures critical resources from cloud to ground for workforce and customers. Founded in 2009, the company is headquartered in San Francisco, USA, with a team of 5001-10000 employees. The company is currently Late Stage.