Service Reliability Engineer
$55 - $73.25/hr
As a Service Reliability Engineer, you won't just be supporting systems; you'll be ensuring the services that connect artists and fans around the globe are always on. Job Functions: Key ...
Aliso Viejo, CA · On-site
$172.10K - $258.10K/yr
Senior Service Reliability Engineer Location: Aliso Viejo | Hybrid As a part of Sony Computer Entertainment, the Gaming, Developer and Future Technology Group (GDFT) is leading the cloud gaming ...
Redmond, WA · On-site
$63.75 - $84.75/hr
We are seeking a Principal Service Reliability Engineer (SRE) to lead the reliability strategy for mission-critical, large-scale distributed systems. This role operates at a system and organizational ...
Aliso Viejo, CA · Hybrid
$172.10K - $258.10K/yr
Senior Service Reliability Engineer Location: Aliso Viejo | Hybrid As a part of Sony Computer Entertainment, the Gaming, Developer and Future Technology Group (GDFT) is leading the cloud gaming ...
Primary - Service Reliability Engineer Front line technical service reliability operators accountable for handling critical customer issues coming in via support phone line and HUB. Responsible for ...
$58.25 - $77.50/hr
Rootshell Enterprise Technologies Inc. is a recognized provider of professional IT consulting services in the US. We are actively seeking an SRE with FedRAMP for one of our clients. Please share your ...
Do you have a passion for ensuring the reliability, scalability, and performance of critical services? Are you a highly motivated and expert engineer with a strong understanding of Site Reliability ...
Atlanta, GA · On-site
$98.60K - $124.10K/yr
... engineering solutions, improved maintenance strategies, preventative maintenance optimization, and other reliability techniques. • Provides technical service to operations and manufacturing ...
$98.60K - $124.10K/yr
... engineering solutions, improved maintenance strategies, preventative maintenance optimization, and other reliability techniques. · Provides technical service to operations and manufacturing ...
$98.60K - $124.10K/yr
... engineering solutions, improved maintenance strategies, preventative maintenance optimization, and other reliability techniques. Provides technical service to operations and manufacturing personnel ...
$92.30K - $116.20K/yr
... engineering solutions, improved maintenance strategies, preventative maintenance optimization, and other reliability techniques. · Provides technical service to operations and manufacturing ...
Austin, TX · On-site
$56.50 - $75/hr
Do you have a passion for ensuring the reliability, scalability, and performance of critical services? Are you a highly motivated and expert engineer with a strong understanding of Site Reliability ...
Raritan, NJ · On-site
$58.25 - $77.50/hr
This is a hands-on, non-manager role focused on improving service reliability through observability, incident response, automation, and engineering excellence. This role partners closely with Product ...
$58.25 - $77.50/hr
This is a hands-on, non-manager role focused on improving service reliability through observability, incident response, automation, and engineering excellence. This role partners closely with Product ...
$58.25 - $77.50/hr
The Site Reliability Engineer (SRE) / Subject Matter Expert (SME) - Computer Systems Engineer ... This role focuses on improving service availability, monitoring, incident response, automation, and ...
$57 - $75.75/hr
Support adoption of SRE principles, including service level indicators (SLIs), service level objectives (SLOs), error budgets, and operational performance measurements. * Drive continual service ...
Manhattan, NY · On-site
$63 - $83.50/hr
... service reliability meets business needs
Atlanta, GA · On-site
$54.75 - $72.75/hr
Collaborate with software engineering teams to define SLAs/SLOs and improve service reliability. * Implement and maintain security best practices across environments (e.g., secrets management, IAM ...
Saint Louis, MO · On-site
$55.50 - $73.50/hr
Role: SRE (Mainframe + Java + Oracle + Account Management Services) Location: St Louis, MO (3 days onsite/week) Duration: 12 Months Key Responsibilities: Site Reliability Engineering * Ensure ...
| Salary range (per year) | Share of jobs |
|---|---|
| $61K - $68.3K | 0% |
| $68.3K - $75.5K | 2% |
| $75.5K - $82.8K | 3% |
| $82.8K - $90.1K | 8% |
| $90.1K - $97.4K | 7% |
| $97.4K - $104.6K | 5% |
| $104.6K - $111.9K | 4% |
| $111.9K - $119.2K | 3% |
| $119.2K - $126.5K | 2% |
| $126.5K - $133.7K | 63% |
| $133.7K - $141K | 2% |

$104.6K is the 25th percentile. Wages below this are outliers.
The median wage is $128.2K / yr.
Chart axis markers: $61K, $118K, $141K.
| Aspect | Service Reliability Engineer | Site Reliability Engineer |
|---|---|---|
| Credentials | Typically requires experience in software engineering, cloud platforms, and monitoring tools | Similar credentials, often with a focus on software development and systems engineering |
| Work Environment | Works closely with development and operations teams to ensure service reliability | Works on maintaining and improving system reliability, often in cloud or data center environments |
| Industry Usage | Common in tech companies focusing on service uptime and customer experience | Widely used in tech, especially in cloud and large-scale infrastructure companies |
Both roles focus on ensuring system reliability, often requiring similar skills and certifications. The main difference lies in terminology preference and specific organizational focus, but they generally perform comparable functions in maintaining high service availability.

$55 - $73.25/hr
Other
Medical, Dental, Vision, Retirement, PTO
Posted 19 days ago
We are UMG, the Universal Music Group. We are the world's leading music company. In everything we do, we are committed to artistry, innovation and entrepreneurship. We own and operate a broad array of businesses engaged in recorded music, music publishing, merchandising, and audiovisual content in more than 60 countries. We identify and develop recording artists and songwriters, and we produce, distribute and promote the most critically acclaimed and commercially successful music to delight and entertain fans around the world.
As a key member of our Global Technical Operations team, you will be responsible for the reliability, scalability, and performance of the critical systems that power a global enterprise. By blending a software engineering mindset with operational expertise, you will engineer solutions that improve system reliability, automate complex processes, and reduce manual toil. You will be an essential partner to our development, infrastructure, and security teams, driving a culture of resilience and continuous improvement across the organization.
As a Service Reliability Engineer, you won't just be supporting systems; you'll be ensuring the services that connect artists and fans around the globe are always on.
Job Functions:
Key Responsibilities
System Reliability & Performance
Design, build, and maintain the availability, scalability, and performance of critical services.
Develop and maintain robust monitoring, alerting, and observability systems (e.g., using AWS CloudWatch, Dynatrace) to ensure rapid issue detection and resolution.
Monitor infrastructure capacity and performance, providing analysis and suggestions for service delivery improvement.
Automation & Efficiency
Drive the automation of repetitive operational tasks, including infrastructure provisioning, deployments, and scaling.
Create and maintain scripts and custom code to support and enhance our operational toolset.
Support and optimize CI/CD pipelines to improve deployment speed and reliability.
Incident Management & Collaboration
Participate in an on-call rotation to troubleshoot and mitigate production incidents.
Lead post-incident reviews and root cause analyses to implement lasting solutions.
Partner with engineering and IT stakeholders to embed SRE best practices (SLOs, error budgets) into the design and development lifecycle.
Job Requirements:
Required Experience & Skills:
A strong background in systems administration (Linux/Windows) in a large-scale environment.
Proficiency in at least one programming language (e.g., Python, Go, Java).
Hands-on experience with a major cloud platform (AWS, GCP, or Azure), with a high preference for AWS.
Solid understanding of networking, containers (Docker, Kubernetes), and Infrastructure as Code (e.g., Terraform, Ansible).
Experience with modern monitoring and observability tools (e.g., Prometheus, Grafana, Datadog, Splunk, Dynatrace).
Proven analytical and problem-solving abilities with experience in a high-pressure environment.
Excellent communication skills and the ability to foster a collaborative team environment.
Preferred Experience & Skills:
Bachelor's degree in an IT-related field.
Experience managing large-scale, distributed systems for a global organization.
Familiarity with IT governance standards like ITIL.
Direct experience with ServiceNow for IT service management.
Knowledge of chaos engineering, resilience testing, and advanced capacity planning.
Perks Playlist:
Join an entrepreneurial, global organization where authenticity, boldness, creativity, connection, drive, and insight aren't just values; they're how we work every day. Here are some of the ways we support you along the way (and just a few of the benefits we offer):
Comprehensive medical, dental, and vision coverage
Including 100% coverage for out-patient in-network mental health services
Fertility coverage for eligible medical plan participants
Wellbeing reimbursements for fitness classes, spa treatments, meal services, travel, and so much more (up to $720/year)
Student Loan Repayment Assistance and Tuition Reimbursement
401(k) with 100% immediate vesting on the first 5% of your contributions, plus an additional UMG contribution
A variety of ways to prioritize much-needed time away from work including:
Flexible Paid Time Off (PTO) for exempt employees
3 weeks of PTO for non-exempt employees
2-week paid Winter Break
10 Company Holidays (including Juneteenth and Wellbeing Day)
Summer Fridays (between Memorial Day and Labor Day)
Generous paid parental leave for every type of parent
Check out our full overview of benefits on the Perks Playlist page of the career site.
Disclaimer: This job description only provides an overview of job responsibilities that are subject to change.
Universal Music Group is an Equal Opportunity Employer
We are an E-Verify employer in Alabama, Arizona, Georgia, Mississippi, North Carolina, South Carolina, Tennessee, and Utah.
Please note, UMG is not enrolled in E-Verify in California and New York, and cannot support employment of candidates whose employer must enroll in E-Verify, for example candidates on STEM-OPT.
For more information, please click on the following links.
E-Verify Participation Poster: English / Spanish
E-Verify Right to Work Poster: English / Spanish
Salary Range:
$122,305 - $159,835
The actual base salary offered depends on a variety of factors, which may include, as applicable, the qualifications of the individual applicant for the position, years of relevant experience, specific and unique skills, level of education attained, certifications or other professional licenses held, and the location in which the applicant lives and/or from which they will be performing the job. All candidates are encouraged to apply.