Job Summary:
Google, a leading technology company, is seeking a Senior Site Reliability Engineer. The role focuses on building and running large-scale distributed systems while ensuring the reliability and performance of Google Cloud's services.
Responsibilities:
• Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
• Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews.
• Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
• Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
• Practice sustainable incident response and blameless postmortems.
Qualifications:
Required:
• Bachelor’s degree in Computer Science, a related field, or equivalent practical experience.
• 5 years of experience with software development in one or more programming languages.
• 3 years of experience in designing, analyzing, and troubleshooting large-scale distributed systems.
• 2 years of experience leading projects and providing technical leadership.
• 2 years of experience building and architecting production-quality machine learning (ML) systems.
Preferred:
• Master's degree in Computer Science or Engineering.
• Experience with AI algorithms.
• Knowledge of containerization and container orchestration technologies such as Google Kubernetes Engine (GKE).
Company:
Google specializes in internet-related services and products, including search, advertising, and software. It is a subsidiary of Alphabet. Founded in 1998, the company is headquartered in Mountain View, California, USA, and has more than 10,000 employees. The company is currently classified as late stage.