Job Summary:
Scalence L.L.C. is seeking a highly experienced Senior Site Reliability Engineer (SRE) / Application Reliability Engineer with over 10 years of expertise in incident management, system reliability, and enterprise application support. This role is crucial for ensuring high availability, operational stability, and continuous improvement of critical financial and ERP systems in a 24×7 environment.
Responsibilities:
• Ensure high availability and reliability of enterprise applications in a 24×7 production environment
• Monitor applications, batch jobs, and workflows to maintain operational continuity
• Lead and manage major incidents (P1/P2) and drive resolution to minimize business impact
• Perform root cause analysis (RCA) and implement preventive measures
• Design and maintain monitoring dashboards
• Implement proactive alerting and improve system observability
• Diagnose and resolve application and data-related issues using SQL queries and log analysis
• Support release deployments, change validation, and post-deployment activities
• Collaborate with infrastructure, DBA, and development teams to resolve technical issues
• Create and maintain operational documentation, runbooks, and knowledge base articles
Qualifications:
Required:
• Site Reliability Engineering (SRE) and Application Support
• Incident & Problem Management
• Root Cause Analysis (RCA)
• SLA / SLO Compliance
• Batch Monitoring & Scheduling
• ITIL Framework
• 10 years of experience in Application Support / Reliability Engineering roles
• Strong experience in BFSI or enterprise application environments
• Proven track record in managing production support operations and high-severity incidents
• Applicants must be able to work directly for Artech on W2
Preferred:
• CI/CD Tools: GitHub
• Cloud Platforms: AWS (EC2, S3, VPC)
• Databases: Oracle, SQL Server
• Languages: SQL, SQR, Basic Java
• Ticketing Tools: ServiceNow, Jira
• Operating Systems: UNIX, Linux, Windows
Company:
In today's dynamic and competitive market, success hinges on mastering three key areas: Data Intelligence, Business Resilience, and Digital Experience. The company is headquartered in Morristown, New Jersey, US, with a team of 501-1000 employees. The company is currently Late Stage.