Observability Engineer Jobs in Seattle, WA (NOW HIRING)

Senior AI and HPC Observability Engineer

Seattle, WA · On-site

$139K - $183K/yr

We are looking for a strong AI & HPC Observability Engineer to build and scale next-generation Observability and Telemetry platforms. You will design and develop high-throughput, reliable telemetry ...

Our engineers maintain and automate observability tooling for our entire platform, including metrics, logs, and traces. We are looking for engineers passionate about monitoring, observing, measuring ...

Infrastructure Engineer (Observability)

Seattle, WA · On-site +1

$122K - $160K/yr

What We're Looking For Lightning AI is seeking an Observability Infrastructure Engineer to join our Infrastructure Engineering team. In this role, you will own and evolve observability systems across ...

Senior Engineer, Hybrid Cloud Fabric

Seattle, WA · Hybrid

$118K - $163K/yr

Senior Engineer - Multi-Cloud Service Mesh Position Summary: Become a key player in GEICO's tech ... Develop comprehensive monitoring and observability dashboards to provide deep insights into service ...

Staff Engineer - Multi-Cloud Service Mesh Position Summary: Become a key player in GEICO's tech ... Develop comprehensive monitoring and observability dashboards to provide deep insights into service ...

Senior Staff Engineer, Hybrid Cloud Fabric

Seattle, WA · Hybrid

$118K - $163K/yr

Develop comprehensive monitoring and observability dashboards to provide deep insights into service ... Guide and mentor engineers on service mesh principles and best practices, fostering knowledge ...

Observability Engineer information

What are the typical responsibilities of an Observability Engineer on a daily basis?

As an Observability Engineer, your daily responsibilities usually include designing and maintaining monitoring and alerting systems, analyzing system logs, and collaborating with development teams to improve service reliability. You'll work regularly with a suite of observability tools to ensure that infrastructure and applications are performing optimally, quickly responding to any incidents or anomalies detected. Additionally, you help establish best practices for instrumentation and metrics collection, often leading initiatives to enhance visibility into complex, distributed systems. The work is a blend of proactive system design and reactive problem-solving, requiring both technical expertise and strong teamwork skills.

What engineers make $500,000 a year?

Senior-level engineers in specialized fields such as software engineering, data engineering, or DevOps can earn $500,000 or more annually, especially with experience, advanced skills, and working at large tech companies or in high-demand industries. Compensation often includes base salary, bonuses, and stock options, and reaching this level typically requires extensive expertise and leadership responsibilities.

What is the salary of an observability engineer?

The salary of an observability engineer typically ranges from $90,000 to $150,000 annually, depending on experience, location, and company size. Senior roles or those with specialized skills in monitoring tools like Prometheus, Grafana, or Elasticsearch may command higher salaries.

What does an observability engineer do?

An observability engineer designs, implements, and maintains systems that monitor and analyze software performance, reliability, and security. They work with tools like monitoring dashboards, log management, and alerting systems to ensure applications run smoothly and issues are quickly identified. Strong skills in scripting and cloud environments, along with an understanding of system architecture, are essential for this role.

What does an Observability Engineer do?

An Observability Engineer is responsible for designing, implementing, and maintaining monitoring, logging, and tracing systems to ensure the health, performance, and reliability of applications and infrastructure. They work with tools like Prometheus, Grafana, OpenTelemetry, and ELK to collect and analyze telemetry data. Their goal is to provide visibility into system behavior, detect and diagnose issues quickly, and improve overall system observability. They collaborate with developers, SREs, and operational teams to create automated and scalable observability solutions.

What engineers make $300,000 a year?

Senior-level observability engineers, especially those with expertise in cloud infrastructure, monitoring tools, and automation, can earn $300,000 or more annually. High compensation is often associated with extensive experience, specialized skills, and working in large tech companies or in leadership roles.

What are the key skills and qualifications needed to thrive in the Observability Engineer position, and why are they important?

To thrive as an Observability Engineer, you need a solid understanding of monitoring, logging, and tracing systems, as well as expertise in programming, cloud infrastructure, and incident response. Proficiency in tools like Prometheus, Grafana, and the ELK stack, along with familiarity with cloud platforms such as AWS, Azure, or GCP, is commonly required, and certifications like AWS Certified DevOps Engineer can be advantageous. Strong analytical thinking, collaborative skills, and effective communication are essential soft skills for diagnosing issues and working across development and operations teams. These competencies are vital for proactively maintaining system reliability, ensuring performance, and resolving complications before they impact business operations.

Infographic showing Observability Engineer job openings in Seattle, WA as of June 2026. Employment types: 97% Full Time and 3% Contract. Work arrangement: 78% Physical, 6% Hybrid, and 16% Remote.
Senior AI and HPC Observability Engineer

Nvidia

Seattle, WA • On-site

$139K - $183K/yr

Full-time

Posted 13 hours ago


Job description

NVIDIA is a pioneer in accelerated computing, known for inventing the GPU and driving breakthroughs in gaming, computer graphics, high-performance computing, and artificial intelligence. Our technology powers everything from generative AI to autonomous systems, and we continue to shape the future of computing through innovation and collaboration. Within this mission, our team, Managed AI Superclusters (MARS), builds and scales the infrastructure, platforms, and tools that enable researchers and engineers to develop the next generation of AI/ML systems. By joining us, you'll help design solutions that power some of the world's most advanced computing workloads.

Observability is at the heart of this transformation. We are looking for a strong AI & HPC Observability Engineer to build and scale next-generation Observability and Telemetry platforms. You will design and develop high-throughput, reliable telemetry pipelines and modern data infrastructure. This role requires solid distributed systems fundamentals, production-grade coding, and a passion for operational excellence.

What You Will Be Doing:

  • Design and scale observability platforms handling high-volume metrics, logs, and traces across distributed environments

  • Build high-performance backend services for telemetry ingestion, processing, and routing

  • Develop and extend OpenTelemetry collectors, processors, exporters, and instrumentation libraries

  • Build and optimize metrics pipelines using large-scale time-series storage systems

  • Design and operate real-time and batch telemetry pipelines using streaming and distributed data technologies

  • Improve platform reliability, performance, and cost efficiency through tuning, capacity planning, and system optimization

  • Develop monitoring, alerting, and service reliability frameworks to ensure platform health and performance

  • Collaborate with platform engineering, infrastructure, and site reliability teams to deliver production-grade observability solutions

What We Need to See:

  • Bachelor's degree in Computer Science, Computer Engineering, or a related field, or equivalent experience

  • 5+ years of experience building backend or distributed systems in production environments

  • Strong programming skills in Python, Go, or Java, with experience developing production-quality software

  • Hands-on experience with modern observability architectures, including metrics, logs, and traces

  • Solid experience with PromQL and time-series data systems

  • Experience building or operating distributed data pipelines using technologies such as Kafka, Spark, or Flink

  • Experience working with Kubernetes and cloud-native infrastructure

  • Strong understanding of distributed systems, concurrency, and fault-tolerant system design

  • Strong debugging, performance tuning, and production operations skills

Ways To Stand Out from The Crowd:

  • Proven experience designing and scaling observability platforms for AI, GPU, or HPC environments

  • Hands-on expertise with OpenTelemetry, Prometheus, Kafka, and high-volume distributed telemetry pipelines

  • Strong background in data engineering, time-series data modeling, and real-time performance tuning

  • Experience integrating observability with AI/ML pipelines, GPU workload monitoring, or intelligent alerting

  • Demonstrated use of statistical or machine learning techniques for anomaly detection, correlation, or predictive insights

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 152,000 USD - 241,500 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until March 6, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

About Nvidia

Sourced by ZipRecruiter

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation that's fueled by great technology--and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent.

Industry

Computer and electronic product manufacturing

Company size

10,000+ Employees

Headquarters location

Santa Clara, CA, US

Year founded

1993