Job Title: Site Reliability Engineer
Location-Type: Onsite in Las Vegas, NV
Start Date: ASAP
Duration: 6-Month Contract to Hire (will convert to FTE)
Compensation Range: $70-85/hr W2
Benefits: Eligible for Health, Dental, Vision, 401K
Must be authorized to work in the U.S. This position is not eligible for sponsorship.
The ideal Site Reliability Engineer for this role owns the availability, performance, and operational integrity of enterprise infrastructure across physical, virtual, and cloud environments. They bring production-level experience with Nutanix, AWS, and Azure, deep expertise in storage architectures, and a strong foundation in both Linux and Windows administration. Proficiency in Ansible, Python, PowerShell, and Bash is expected. This individual is experienced in supporting large-scale, mission-critical infrastructure, comfortable in a 24/7 on-call environment, and accountable for incidents from detection through post-mortem.
What will you do?
- Operate and maintain Nutanix, AWS, and Azure environments, ensuring systems are optimized for performance and resource utilization.
- Administer Rubrik backup and disaster recovery platforms.
- Deploy and maintain NAS, SAN, and LUN storage infrastructure.
- Manage AVD environments using Nerdio.
- Maintain and support Genetec CCTV backend server infrastructure.
- Own the server infrastructure layer across all enterprise applications, partnering with application owners to troubleshoot and resolve issues that intersect with the underlying infrastructure.
- Identify and resolve errors and manage capacity across physical, virtual, and cloud environments.
- Evaluate and manage incoming server and storage change requests.
- Provide Level 2 and Level 3 support for operations and change management across Nutanix, NAS, SAN, and Rubrik platforms.
- Maintain accurate infrastructure documentation including diagrams, equipment lists, and change records.
- Apply firmware upgrades, software updates, and OS patching across all infrastructure systems on a scheduled and ad hoc basis.
- Create and maintain runbooks, process documentation, outage reports, and status updates.
- Perform daily monitoring, troubleshooting, and fault analysis across server and storage systems, including hands-on hardware repair.
- Generate and respond to incident tickets, monitor interfaces, and manage escalations via ServiceNow.
- Deploy and maintain infrastructure monitoring and reporting tools with SolarWinds as the primary platform.
- Collaborate with network infrastructure and cross-functional teams to ensure seamless integration of server and storage systems.
- Support studio and live sports operations, providing infrastructure guidance and assistance to teams without dedicated technical resources.
- Leverage Ansible and scripting in Python, PowerShell, and Bash to deploy, patch, and maintain server infrastructure.
- Participate in a team-based 24/7 on-call rotation for critical infrastructure support including scheduled patching.
Minimum Requirements
- 5 to 7 years of enterprise infrastructure operations experience.
- 5 or more years of experience with high performance compute, storage, and data center platforms.
- 3 or more years of vendor management experience.
- Bachelor's degree in engineering, computer science, or computer engineering preferred; equivalent experience considered.
- Production experience with Nutanix, AWS, and Azure.
- Rubrik backup and disaster recovery platform experience.
- Storage expertise across NAS, SAN, and LUN architectures.
- Strong Linux and Windows server administration.
- Experience with AVD and Nerdio for virtual desktop infrastructure.
- Proficiency in Ansible and scripting with Python, PowerShell, and Bash.
- SolarWinds proficiency preferred; a drive to learn it is expected, but it is not a hard requirement.
- Integration experience with ServiceNow for incident management and ticketing.
- Solid networking fundamentals including routing, switching, firewalls, and load balancing.
- Previous experience supporting diverse business units including creative, production, or live events environments.
- Experience in enterprise-scale environments with an emphasis on fault tolerance and operational excellence.
Additional Requirements