OR · On-site
$104K - $143K/yr
... contractual, dispatch, warranty, and compliance-sensitive workflows. * Evaluation and observability ... Communicate clearly with engineers, product managers, architects, security partners, and ...
Improves timeliness, accuracy, contractual conformity, and cost-to-serve while reducing KTLO work ... Reduce KTLO through automation, observability, resilience, and platform simplification. * Build and ...
$150K - $170K/yr
You will contribute to critical technical decisions around system design, observability, and ... This processing is based on legitimate interest and pre-contractual measures under applicable data ...
Falls Church, VA · Hybrid
$164K - $218K/yr
Design tailored solutions leveraging Splunk (SIEM/Observability), Appgate (Zero Trust), Okta ... Rather, salary will be set based on experience, geographic location and possibly contractual ...
Southlake, TX · On-site
$43.99 - $62.50/hr
... including contractual agreements to provide a variety of services including technology. This ... NET plus professional experience some (RESTful services, messaging, security, observability). * 2+ ...
Springfield, VA · On-site
$159K - $216K/yr
Support the development of dashboards using observability solutions using tools such as Grafana and ... Rather, salary will be set based on experience, geographic location and possibly contractual ...
Build and improve long-running transaction workflows with strong attention to observability ... This processing is based on legitimate interest and pre-contractual measures under applicable data ...
Implement and oversee monitoring, logging, and observability solutions for AI services. * Ensure ... Rather, salary will be set based on experience, geographic location and possible contractual ...
Falls Church, VA · On-site +1
$164K - $218K/yr
Design tailored solutions leveraging Splunk (SIEM/Observability), Appgate (Zero Trust), Okta ... Rather, salary will be set based on experience, geographic location and possibly contractual ...
Alpharetta, GA · On-site
$105K - $137K/yr
Alpharetta, GA (Hybrid / Remote Options Available) Employment Type: Full-Time / Contractual About ... Implement comprehensive observability and trace-level logging for multi-step agentic workflows ...
Southlake, TX · On-site
$115K - $152K/yr
... including contractual agreements to provide a variety of services including technology. This ... NET plus professional experience some (RESTful services, messaging, security, observability). * 2+ ...
San Jose, CA · On-site
$207K - $259K/yr
Develop and implement comprehensive observability strategies, including monitoring, logging, and ... ITAR, contractual, and/or regulatory requirements. Please note that this is intended to provide a ...
| Hourly wage range | Share of jobs |
|---|---|
| $19.47 - $25.13 | 2% |
| $25.13 - $30.79 | 5% |
| $30.79 - $36.45 | 3% |
| $36.45 - $42.11 | 0% |
| $42.11 - $47.77 | 6% |
| $47.77 - $53.43 | 9% |
| $53.43 - $59.09 | 15% |
| $59.09 - $64.75 | 35% |
| $64.75 - $70.41 | 9% |
| $70.41 - $76.07 | 8% |
| $76.07 - $81.73 | 8% |
$52.90 / hr is the 25th percentile. Wages below this are outliers.
The median wage is $60.55 / hr.
| Aspect | Contractual Observability Engineer | Site Reliability Engineer |
|---|---|---|
| Primary Focus | Implementing observability tools, monitoring, and logging systems | Ensuring system reliability, scalability, and performance |
| Skills & Certifications | Monitoring tools, scripting, cloud platforms | Systems engineering, automation, incident management |
| Work Environment | DevOps teams, cloud environments, monitoring platforms | Operations teams, production systems, automation tools |
| Industry Usage | Tech companies, SaaS providers, cloud services | Large-scale tech firms, internet services, cloud providers |
While both roles involve technical expertise in cloud and systems, a Contractual Observability Engineer primarily focuses on implementing and maintaining observability tools, whereas a Site Reliability Engineer emphasizes system reliability and performance. Understanding these differences helps organizations assign the right responsibilities and skills to each role.
$104K - $143K/yr
Full-time
Medical, Dental, Vision, Retirement
Posted 17 days ago
Ready to be a Titan?
ServiceTitan runs the businesses behind the trades: jobs, trucks, technicians, equipment, contracts, payments, warranties, compliance obligations, and customer history. That operational context is our advantage. We are building Agent OS to turn that context into safe, observable, production-grade agent work.
Agent OS is the shared runtime, context, memory, action, trust, and evaluation layer behind role-specific AI experiences across Atlas, office, field, voice, mobile, and future product surfaces. This is not a collection of chatbots. It is the platform that lets agents help contractors run their businesses with the right evidence, permissions, approvals, and audit trails.
You will help build the core engineering primitives behind that platform: agent runtime, typed tools, context and memory assembly, trust and approval flows, evaluation infrastructure, and production observability. You are not building one agent for one product surface. You are building the platform that product teams use to build many agents safely.
You will work on a small, senior AI platform team and partner closely with Product, Architecture, Security, Data Platform, Atlas, and domain engineering teams.
What You'll Build
Agent runtime and workflow execution: Build the runtime for role-specific agents, tool use, delegation, pause/resume, durable checkpoints, retries, and failure recovery. Agents must resume safely without losing state or duplicating side effects.
Typed tools and action contracts: Build deterministic controls around non-deterministic reasoning: governed reads, proposed writes, precondition checks, business invariants, scoped permissions, idempotency, audit trails, and rollback.
Context and memory systems: Build tenant-scoped context assembly, retrieval, freshness controls, provenance, transcripts, artifacts, tool results, and replayable evidence. ServiceTitan systems of record stay authoritative; memory provides context and coordination.
Trust and approval infrastructure: Build human-in-the-loop gates, approval thresholds, reversibility, tenant policy enforcement, and audit history for financial, contractual, dispatch, warranty, and compliance-sensitive workflows.
Evaluation and observability: Build offline and online evals, scenario libraries, simulation, trajectory review, regression detection, cost and latency telemetry, and autonomy promotion gates.
Reusable capability platform: Help product teams package prompts, tools, context requirements, policies, evals, rollout controls, ownership, and rollback into governed capabilities for owners, CSRs, dispatchers, technicians, managers, and back-office teams.
Model and inference architecture: Make practical tradeoffs across latency, cost, quality, structured outputs, caching, fallback behavior, provider choice, and model routing behind a shared platform layer.
What You'll Do
Design and implement core Agent OS platform services.
Write production code and review implementation details from other engineers.
Build reliable APIs, workflows, tools, and services for agent execution.
Inspect traces, debug failures, and improve production behavior.
Design evaluation scenarios and regression suites for agent workflows.
Work through real agent failure modes: stale context, wrong tool calls, missing permissions, unsafe actions, poor retrieval, latency spikes, and cost regressions.
Partner with domain teams to turn agent use cases into reusable platform patterns.
Help define platform contracts for tools, actions, approvals, context, memory, evidence, and evaluation.
Contribute to technical direction while staying grounded in what can ship quickly and safely.
Communicate clearly with engineers, product managers, architects, security partners, and engineering leadership.
What You'll Bring
5+ years of production software engineering experience.
Strong hands-on coding ability in Python, Java, C#, or another backend language. Python experience is strongly preferred.
Experience building AI, ML, data, platform, infrastructure, workflow, automation, or developer-platform systems in production.
Practical understanding of modern LLM application architecture: model gateways, prompt and context assembly, retrieval, tool calling, structured outputs, memory, agent workflows, and human approval patterns.
Experience with distributed systems, event-driven systems, async workflows, queues, durable execution, or message-driven architectures.
Strong production-safety instincts for non-deterministic systems: typed contracts, scoped permissions, precondition checks, idempotency, audit trails, rollback, and monitoring.
Experience designing or operating evaluation systems: behavioral evals, regression suites, scenario tests, trajectory review, simulation, online metrics, or production monitoring.
Strong data and context instincts: SQL, unstructured data, vector search, metadata, provenance, freshness, source authority, and privacy boundaries.
Experience with databases, warehouses, or search systems such as PostgreSQL, SQL Server, Snowflake, BigQuery, Elasticsearch, or vector stores.
Experience building services on public cloud infrastructure such as Azure, AWS, or GCP.
Good engineering judgment across APIs, reliability, security, observability, and multi-tenant SaaS constraints.
Bonus points
Experience building or operating agent runtimes, workflow engines, model gateways, ML platforms, evaluation platforms, developer platforms, or internal control planes.
Experience with LangGraph, LangChain, LlamaIndex, Semantic Kernel, OpenAI Agents SDK, Anthropic tooling, or similar frameworks.
Experience with MCP, A2A, tool protocols, agent interoperability, or agent-commerce patterns.
Experience with Kubernetes, Docker, serverless platforms, or cloud-native infrastructure.
Experience with compliance-sensitive workflows, approval-gated automation, audit trails, policy engines, or governed writes to systems of record.
Experience in SaaS, vertical software, fintech, ERP, CRM, marketplace, field service, or other domains where software decisions affect real business operations.
Experience with graph-based data models, knowledge graphs, entity resolution, or cross-domain operational context systems.
Why this role matters
Most AI products fail when the demo becomes a production workflow. The hard problems show up in the platform: context freshness, tool reliability, permissions, evaluation, traceability, rollback, and trust.
That is what this team is building.
At ServiceTitan, agents need to work inside real contractor operations. They need to understand the job, the customer, the technician, the equipment, the agreement, the invoice, the warranty, and the business policy. They need to explain what evidence they used. They need to know when to ask for approval. They need to recover when something fails.
The engineer in this role helps set the technical standard for every AI surface ServiceTitan builds next. This is a high-leverage engineering role for someone who wants to build the platform underneath production agents, not just another agent demo.
Remote Location (US and Canada only). Candidates based in PST are highly preferred.
Be Human With Us:
Being human isn't about checking every box on a list. It's about the experiences we have, people we meet, and the perspectives we share. So, if you have the skills but are hesitant to apply because of your background, apply anyway. We need amazing people like you to help us challenge the conventional and think differently about the problems that we're solving. We're in this together. Come be human, with us.
Use of AI Technology:
We use technology, including automated and AI-assisted tools, to support certain aspects of our recruitment process. These tools are designed to improve efficiency and enhance the candidate experience. AI tools are not used to make hiring decisions; all hiring decisions are made by our hiring teams.
What We Offer:
When you join our team, you're not just accepting a job. You're making a career move. Here's how we'll support you in doing some of the most impactful work of your career:
At ServiceTitan, we celebrate individuality and uniqueness. We believe that the convergence of fresh perspectives and experiences from all walks of life is what makes our product and culture so great. We strongly encourage people from underrepresented groups to apply. We do not discriminate against employees based on race, color, religion, sex, national origin, gender identity or expression, age, disability, pregnancy (including childbirth, breastfeeding, or related medical condition), genetic information, protected military or veteran status, sexual orientation, or any other characteristic protected by applicable federal, state or local laws.
ServiceTitan is committed to fair and equitable compensation for all of our employees. We thoughtfully consider a wide range of factors when determining individual compensation, which may change over time. We comply with all applicable minimum wage laws.
For candidates in the United States, the good faith salary range estimates for this role are:
Zone 1: $179,900 USD - $269,900 USD. Applicable for: CA, CT, DC, MD, MA, NJ, NY, VA, and WA.
Zone 2: $168,200 USD - $252,200 USD. Applicable for: all other US locations.
International: compensation for candidates residing outside the United States will vary by location and will be discussed during the hiring process.
Actual compensation within a range is determined by factors including relevant experience, skill set, qualifications, and performance. In addition to base salary, our total compensation package includes an annual bonus, equity, and a holistic suite of benefits.