Role: Site Reliability Engineering (SRE)
Location: Los Angeles, CA
Remote position
Full-time position
JD - Site Reliability Engineer
- Experience in Cloud platforms (AWS, Azure, Google Cloud) and hybrid environments.
- Proficiency in container technologies (Docker, containerd, Podman).
- Strong knowledge of Linux administration and networking concepts.
- Experience with Infrastructure as Code (IaC) tools like Terraform, Ansible, Helm, or Pulumi.
- Monitoring and logging expertise using Prometheus, Grafana, ELK, Datadog, or Splunk.
- Hands-on experience with CI/CD pipelines and DevOps tools (Jenkins, GitHub Actions, GitLab CI, ArgoCD).
- Proficiency in scripting/programming (Python, Bash, Go) for automation.
- Strong troubleshooting and incident management skills.
- We are seeking a highly skilled Site Reliability Engineer (SRE) to manage, optimize, and ensure the reliability of infrastructure.
- The ideal candidate will have deep expertise in ELK, Dynatrace, PagerDuty, PowerShell, container orchestration, cloud infrastructure, and automation, along with a strong focus on reliability, scalability, and performance.
- Good to have: LogicMonitor and Python knowledge.
- Reliability & Performance: Implement best practices to ensure high availability, scalability, and performance of containerized applications.
- Monitoring & Incident Response: Set up monitoring (Prometheus, Grafana, ELK, Dynatrace, PagerDuty, PowerShell, etc.), troubleshoot issues, and lead incident resolution.
- Automation & Infrastructure as Code (IaC): Develop and maintain Terraform, Helm charts, and Kubernetes manifests for automation.
- CI/CD & DevOps Integration: Work with DevOps teams to optimize CI/CD pipelines for Kubernetes deployments (Jenkins, ArgoCD, FluxCD, etc.).
- Security & Compliance: Implement security best practices for containerized workloads, RBAC, network policies, and vulnerability scanning.
- Capacity Planning & Optimization: Analyze resource usage and optimize infrastructure costs and performance.
- Disaster Recovery & Backup: Implement backup and disaster recovery strategies for Kubernetes workloads.
Thanks, and have a nice day
Manikanth
Sarian Solutions, Inc. | Ph: 732-790-2266 x 201 | Fax: 732-696-4242 | manikanth.d@sariansolutions.com
www.sariansolutions.com
Certified Minority Business Enterprise (WMBE)
Follow us: @sariansol | Check our current openings at https://www.sarianinc.com/work-with-us