Job Details:
Job Title: Site Reliability Engineer (SRE)
Duration: Long-Term Contract
Location: Alpharetta, GA (Hybrid)
Job Description:
Position Summary:
- We are seeking a skilled Site Reliability Engineer (SRE) to join our team and help design, build, and maintain scalable, reliable, and secure cloud-native infrastructure.
- You will collaborate closely with development and operations teams to ensure system reliability, performance, and efficiency.
- The ideal candidate is passionate about automation, observability, and infrastructure-as-code, and thrives in a fast-paced, collaborative environment.
Key Responsibilities:
- Design, implement, and manage cloud infrastructure on Microsoft Azure using Terraform and Terragrunt.
- Maintain and optimize Kubernetes clusters on Azure Kubernetes Service (AKS).
- Build and manage CI/CD pipelines using GitHub Actions/Workflows and ArgoCD for GitOps-based deployments.
- Implement monitoring, alerting, and observability solutions using Grafana (including Prometheus/Loki/Tempo where applicable).
- Automate operational tasks to reduce manual effort and improve efficiency.
- Participate in on-call rotations, incident response, and post-incident reviews (post-mortems).
- Collaborate with development teams to improve application performance, scalability, and reliability.
- Implement and advocate SRE best practices, including SLIs, SLOs, and error budgets.
- Continuously improve system performance, cost optimization, and security posture.
Required Skills & Qualifications:
- 3+ years of experience in a Site Reliability Engineer, DevOps, or Cloud Infrastructure role.
- Strong hands-on experience with Microsoft Azure cloud services.
- Proficiency in Terraform and Terragrunt for Infrastructure-as-Code.
- Strong experience with Kubernetes, preferably AKS.
- Experience with CI/CD tools, especially GitHub Actions/Workflows and ArgoCD.
- Solid understanding of observability tools such as Grafana (experience with Prometheus, Loki, Tempo is a plus).
- Strong programming/scripting experience (Java preferred, or other relevant backend experience).