Engineer Sr Lead, Site Reliability (Snowflake Focused)
Location: Remote in US
Duration: 6+ Months
Salary: $Open - Please submit at lowest best rate.
Job description:
Position Overview
The Senior Lead Site Reliability Engineer ensures the reliability, performance, and resilience of mission-critical systems with a strong focus on Snowflake-based data platforms. This role blends software engineering, cloud operations, and reliability engineering to optimize Snowflake workloads, reduce operational risk, and improve service availability across distributed environments.
Responsibilities
- Lead SRE practices for systems leveraging Snowflake at enterprise scale, defining standards for reliability, performance optimization, and operational automation.
- Architect and implement highly available, fault-tolerant infrastructure supporting Snowflake pipelines, compute clusters, and data workloads.
- Develop observability frameworks (monitoring, alerting, logging) tailored to Snowflake query performance, cost, and warehouse health.
- Partner with data engineering, development, and architecture teams to embed reliability into Snowflake schema design, workload management, job orchestration, and CI/CD deployments.
- Drive incident management for Snowflake-related issues, performing root-cause analysis and implementing durable, systemic remediation.
- Reduce operational toil through engineering automation, improving stability and responsiveness of Snowflake workloads and dependent services.
- Mentor site reliability engineers and guide teams in adopting reliability, automation, and Snowflake operational best practices.
Qualifications
- Extensive SRE or production engineering experience supporting large-scale, cloud-based systems.
- Strong Snowflake operational experience including workload monitoring, warehouse sizing, scaling strategies, performance tuning, and cost governance.
- Expertise in automation, observability tooling, cloud engineering (Azure preferred), and distributed systems.
- Experience leading reliability initiatives and supporting complex data and application ecosystems.