Job Summary:
Avaya is an enterprise software leader that helps organizations forge unbreakable connections. The company is seeking a Site Reliability Engineer (SRE) to drive stability, reliability, and performance across its Azure- and GCP-based platforms, with a focus on operational excellence and collaboration with DevOps and Security teams.
Responsibilities:
• Serve as a key member of the 24×7 on-call rotation, responding to and managing incidents across production and pre-production environments.
• Lead incident bridges, coordinate root cause analysis (RCA), and ensure post-incident reviews drive systemic improvements.
• Maintain clear communication with cross-functional teams and leadership during major incidents.
• Build, tune, and maintain observability dashboards (Azure Monitor, GCP Operations Suite, Prometheus, Grafana, Datadog, Log Analytics).
• Perform deep-dive troubleshooting of application and service-level issues using distributed tracing and log analysis (Grafana, Datadog) to pinpoint root causes beyond infrastructure.
• Define SLOs, SLIs, and error budgets to proactively identify and mitigate reliability risks before customer impact.
• Integrate AI-Ops tools for anomaly detection, predictive alerting, and automated incident correlation.
• Continuously enhance alert quality, reduce false positives, and automate runbooks for faster recovery.
• Analyze trends to prevent recurring issues and support teams in resilience engineering.
Qualifications:
Required:
• 5+ years in Site Reliability, DevOps, Cloud Operations, or customer support roles.
• Demonstrated experience in application-level troubleshooting by analyzing logs and traces to identify bugs, performance bottlenecks, and error conditions.
• Expertise in Azure and GCP cloud operations and distributed system reliability.
• Understanding of Terraform, Ansible, and CI/CD pipelines (Jenkins, GitHub Actions).
• Experience with observability and AI-Ops tools (Azure Monitor, GCP Operations Suite, Grafana, Prometheus, Datadog, etc.).
• Solid grasp of incident management frameworks (P1–P3 handling, RCA, PIRs, on-call rotations).
• Excellent analytical, troubleshooting, and communication skills.
Company:
Welcome to CPA Associates International Inc., a global association of independent accountancy practices. The company is headquartered in Glen Rock, USA, with a team of 5,001-10,000 employees. The company is currently late stage.