Job Summary:
Diverse Lynx is a company focused on providing equal employment opportunities and promoting a diverse workforce. It is seeking a Site Reliability Engineer (SRE) to enhance platform reliability, performance, and observability, collaborate with cross-functional teams, and automate operational tasks.
Responsibilities:
• Enhance platform reliability, performance, and observability
• Build dashboards and alerts using APM tools (Splunk, ELK, Grafana, Prometheus, GCL)
• Proactively identify performance bottlenecks and system risks
• Support incident management and root cause analysis
• Collaborate with Engineering, Security, Networking, and Infrastructure teams
• Automate operational tasks using Shell scripting and DevOps tools
• Support CI/CD pipelines and release processes
Qualifications:
Required:
• 8+ years of Software Engineering experience
• 4+ years in Site Reliability Engineering
• Strong experience with APM / monitoring tools (Splunk, ELK, Grafana, Prometheus)
• Experience with distributed systems and with relational and NoSQL databases
• Knowledge of Redis, Memcached, MQ, Kafka
• Hands‐on Shell scripting, Ansible (YAML)
• Experience with CI/CD tools (Git, Jenkins, UCD or similar)
• Experience with Kubernetes / OpenShift, PCF, AWS or Azure
• Tech stack: Java/J2EE, Spring Boot, Python, Kafka, Oracle, MongoDB
Company:
Diverse Lynx is a WBENC- and NMSDC-certified partner, helping organizations turn diversity goals into measurable impact through staffing and contingent workforce solutions. Founded in 2002, the company is headquartered in Princeton, New Jersey, US, and has 1,001-5,000 employees. It is currently a late-stage company.