Principal Site Reliability Engineer Jobs in Tracy, CA

DevOps Engineer

Walnut Creek, CA · On-site

$60 - $82.25/hr

Build Release Engineer / Site Reliability Engineer will not work. Skill matrix: explain the below skills in brief: Skills - Years of exp - Proficiency (Beginner, Intermediate, or Advanced): Python, AWS ...

Staff Release Engineer

San Ramon, CA · On-site

$174K - $220K/yr

You'll collaborate closely with Infrastructure, SRE, and Engineering teams to architect resilient CI/CD pipelines, implement automated quality gates, and reduce friction across all deployment paths.

Senior DevOps Engineer

San Ramon, CA · On-site +1

$250.30K/yr

Proven application of SRE practices in high-stakes, always-on environments. * Strong experience with observability platforms (Prometheus, Grafana, Loki/ELK/EFK). Preferred Expertise: * Kubernetes

Contribute to incident response, SRE practices, and continuous improvement efforts to lower MTTD and MTTR * Help define and evolve the platform engineering roadmap as Five9 scales its infrastructure ...

... and SRE-aligned operating model. Define Network Engineering Strategy & Architecture * Establish and execute multi-year network and connectivity strategies aligned with Gap Inc.'s technology and ...

Support Engineer - Java

Pleasanton, CA · On-site

$140K - $155K/yr

Must have * 3+ years of experience in technical support, DevOps, SRE, QA, or an R&D-adjacent engineering role. * Strong troubleshooting skills across distributed systems, APIs, microservices, or ...

... SRE, and DevOps teams to identify key metrics and create actionable observability strategies Optimize existing monitoring setups and identify gaps in visibility Integrate Datadog with various tools ...

Kubernetes Engineer

Livermore, CA · On-site

$66.75 - $88.75/hr

As a Kubernetes Engineer, you will be responsible for one of our Computing Platform's most critical ... This position requires part-time on-site presence due to the nature of the work. This position will ...

Principal Site Reliability Engineer information

How much do principal site reliability engineer jobs pay per hour?

As of May 28, 2026, the average hourly pay for a principal site reliability engineer in Tracy, CA is $68.62, according to ZipRecruiter salary data. Most workers in this role earn between $58.99 and $78.41 per hour, depending on experience, location, and employer.

What are the key skills and qualifications needed to thrive as a Principal Site Reliability Engineer, and why are they important?

To thrive as a Principal Site Reliability Engineer, you need deep expertise in systems engineering, cloud infrastructure, and automation, along with strong programming skills, typically supported by a degree in computer science or a related field. Familiarity with tools like Kubernetes, Terraform, Prometheus, and CI/CD platforms is often required, as are certifications such as AWS Certified Solutions Architect or Google Professional Cloud DevOps Engineer. Exceptional problem-solving, leadership, and communication skills help you guide teams and drive reliability initiatives across organizations. Together, these skills keep systems reliable and scalable and foster a culture of continuous improvement and operational excellence.

How does a Principal Site Reliability Engineer typically contribute to setting technical direction and mentoring within an SRE team?

As a Principal Site Reliability Engineer, you play a critical role in shaping the technical vision of the SRE team by establishing best practices for infrastructure reliability, scalability, and incident response. You are often expected to mentor junior and mid-level engineers, guiding them through complex troubleshooting, architectural decisions, and automation strategies. Additionally, you collaborate closely with software engineering, product, and operations teams to ensure that reliability and performance goals align with business needs. This role offers significant influence over technical roadmaps and provides opportunities to lead cross-functional initiatives, making it ideal for those seeking both leadership and hands-on impact.

What are Principal Site Reliability Engineers?

Principal Site Reliability Engineers (SREs) are senior technical experts who lead the design, implementation, and maintenance of reliable, scalable, and highly available systems. They oversee complex infrastructure and work closely with engineering teams to optimize system performance, automate processes, and ensure operational excellence. Principal SREs also mentor other engineers, set technical standards, and drive improvements in incident response, monitoring, and system resilience. Their work is critical in minimizing downtime and ensuring a seamless experience for users.

What is the difference between Principal Site Reliability Engineer vs Site Reliability Engineer?

Aspect | Principal Site Reliability Engineer | Site Reliability Engineer
Credentials | Advanced certifications (e.g., AWS, Google Cloud), extensive experience | Entry to mid-level certifications, relevant experience
Work Environment | Strategic planning, architecture design, mentoring | Operational tasks, automation, monitoring
Employer Usage | Large tech companies, cloud providers, enterprises | Tech firms, startups, cloud services

The Principal Site Reliability Engineer typically holds more advanced certifications and has a strategic, leadership role in designing systems and mentoring teams. In contrast, the Site Reliability Engineer focuses on operational tasks, automation, and maintaining system reliability. Both roles are vital in ensuring system stability but differ in scope and seniority.

Infographic: Principal Site Reliability Engineer job openings in Tracy, CA as of May 2026. Employment types: 65% Full Time, 30% Part Time, 3% Contract, 2% Nights. Work location: 93% Physical, 2% Hybrid, 5% Remote. Average salary: $142,724 per year, or $68.6 per hour.
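
The yearly and hourly averages quoted above are consistent with each other if a full-time year is taken as 2,080 paid hours (40 hours x 52 weeks). That hours figure is an assumption made here for illustration, not something the salary data states, and the function names are hypothetical. A minimal sketch of the conversion:

```python
# Assumption: a full-time year is 40 hours/week for 52 weeks (2,080 hours).
# The source does not state which convention its salary figures use.
HOURS_PER_YEAR = 40 * 52


def hourly_from_annual(annual: float, hours_per_year: int = HOURS_PER_YEAR) -> float:
    """Convert an annual salary to an hourly rate."""
    return annual / hours_per_year


def annual_from_hourly(hourly: float, hours_per_year: int = HOURS_PER_YEAR) -> float:
    """Convert an hourly rate to an annual salary."""
    return hourly * hours_per_year


if __name__ == "__main__":
    # Quoted annual average from the listing page.
    print(round(hourly_from_annual(142_724), 2))  # 68.62
```

Under the same assumption, the quoted $58.99 to $78.41 hourly range can be converted to a yearly range with `annual_from_hourly`.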

Immediate Interviews: SRE Manager in CA

Vertex Elite LLC

San Ramon, CA • On-site

$63.75 - $84.75/hr

Contractor

Posted 20 days ago


Job description

SRE Manager
San Ramon, CA
Long Term Contract

Job Responsibilities

A Site Reliability Engineer acts as a warrior who monitors and protects customer applications and takes charge of operational tasks to ensure the efficient functioning of a system. They are responsible for monitoring, automating, and improving the reliability, performance, and availability of applications.
Working experience as an SRE Lead, or in a techno-functional Site Reliability Engineer (SRE) role, at a customer work location in the e-com/retail domain is mandatory.
Be the Litmus7 face at the customer site, collaborating with Litmus7 leadership.
Must have a working knowledge of Production Application Support.
Working experience interacting with the offshore team (India) that provides 24x7 coverage, helping and guiding them during India night coverage.
Should know how to gather and communicate SRE requirements, both technical and non-technical, from the customer.
Working experience gathering requirements on application health, services to monitor, and service levels to set.
Must have good Level 1, Level 2, and Level 3 support experience on eCommerce platforms such as Shopify, Blue Yonder, or other e-com solutions/platforms.
Hands-on experience in monitoring, logging, alerting, dashboarding, and report generation in monitoring tools such as AppDynamics, Splunk, Dynatrace, Datadog, CloudWatch, ELK, Prometheus, or New Relic. The customer for this engagement uses New Relic and PagerDuty, so expertise in these is good to have.
Must have knowledge of SRE principles such as logs, metrics, availability metrics, uptime, ticket tracking, and e-com services, and of the ITIL framework, specifically alerts, incident and change management, CAB, production deployments, risk and mitigation plans, SLA, and SLI.
Should be able to lead P1 calls, brief the customer on the P1, and proactively bring leads and customers into P1 calls until RCA.
Knowledge of working with Postman.
Should know how to build and execute SOPs and runbooks and how to work with ITSM platforms (JIRA/ServiceNow/BMC Remedy).
Must know how to work with the Dev team and cross-functional teams across time zones.
Should be able to generate WSR/MSR by extracting tickets from ITSM platforms and present them to customers and L7 leaders.
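
For candidates unfamiliar with the availability, uptime, and SLI terms listed above, the following is a minimal sketch of how an availability SLI and a downtime allowance are commonly computed. The 99.9% target, the 30-day window, and all function names are illustrative assumptions, not requirements stated in this posting.

```python
def availability(good_minutes: float, total_minutes: float) -> float:
    """Availability SLI: fraction of the window in which the service was up."""
    return good_minutes / total_minutes


def allowed_downtime_minutes(target: float, window_minutes: float) -> float:
    """Downtime permitted in the window before the target is missed."""
    return (1.0 - target) * window_minutes


def meets_target(good_minutes: float, total_minutes: float, target: float) -> bool:
    """True when the measured availability is at or above the target."""
    return availability(good_minutes, total_minutes) >= target


if __name__ == "__main__":
    window = 30 * 24 * 60  # assumed 30-day window, in minutes
    target = 0.999         # assumed service level target (99.9%)
    print(f"Allowed downtime: {allowed_downtime_minutes(target, window):.1f} minutes")
    print("Target met with 43 minutes down:", meets_target(window - 43, window, target))
```

With these assumed numbers, the allowance works out to roughly 43 minutes of downtime per 30 days, which is why 43 minutes down still meets the target and 44 does not.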

Non-Technical Requirement

Ability to clearly communicate and understand a technical idea/concept.
Ability to work in a professional environment while interacting with peers and stakeholders, collaborating with offshore teams.
Excellent written and verbal communication skills.
Motivated, goal-driven, influential, innovative, curious, open-minded, collaborative, and fun to work with.
Capability to work with people in different time zones.
Ability to operate in a fast-paced, evolving environment, prioritize tasks appropriately, and keep abreast of the latest technology.
Collaborate with the cloud architecture, infrastructure, project management, technology services, and management teams.
Create and maintain detailed documentation.
A "can-do" attitude.