A DAY IN THE LIFE:
The Production Support Engineering team plays a key role at FIS Amount by ensuring production issues are managed efficiently and effectively. You will manage high-priority issues to resolution following industry best practices. You'll troubleshoot, fix, and apply workarounds to resolve technical issues across multiple platforms. Each day, you'll interact with every aspect of our organization to find the best solution for our partner. Management of ticket queues, monitoring for issues and post-release validation are also a large part of this role, all while meeting our partners' SLA requirements.
Team: This role interacts with nearly every group within the organization, including engineering, product, QA, customer success and others.
Salary: $73,000-$80,000 base salary
Similar job titles: Production Support, Production Support Analyst, Incident Manager, Incident Coordinator, IT Major Incident Manager, Application Support Engineer, Support Engineer
WHAT WE'LL TRUST YOU TO DELIVER:
- Technical ability to deep dive into issues by querying tables, analyzing data and problem-solving
- Prioritization and triage of incoming requests/issues
- Drive incident resolution and lead conversations with cross-functional groups. Ask the right questions to help determine impact/priority and the correct route for resolution. Oversee a technical bridge, if required.
- Management of all incidents through the incident management lifecycle
- Documentation of all relevant events and collection of status reports while driving decision-making and resolution
- Ensure stakeholders are updated according to predefined service level agreements
- Completion and ownership of the postmortem with appropriate root cause analysis performed
- Improvement suggestions to capture preventative measures that will avoid recurrences of incidents
- Investigate patterns that indicate larger overall issues, even if we don't have the solution.
- Compilation of metrics on a weekly and monthly basis. Maintain dashboards for service incidents and ad hoc reporting as requested
- Play an active role during critical incidents which may occur outside of normal business hours. Availability on nights, weekends, and holidays as part of an on-call rotation is a must
- Creation of runbooks or standard operating procedures (SOP) so we can all learn from each other and add to our knowledge base
WHAT YOU LIKELY BRING TO THE TABLE:
- Technical and/or engineering background, ideally with experience writing SQL queries
- Experience working with development teams in a fast-paced environment
- Basic knowledge of, or interest in, a programming language such as Java, Python or Ruby
- 2 years of experience coordinating and executing major incidents, with demonstrated capacity to lead under pressure
- Previously collaborated with a wide spectrum of internal and external stakeholders
- Worked in an organization with a complex business environment
- Leadership skills with the ability to make quick decisions
- Familiar with ITSM/ITIL concepts
- You thrive as a self-starter who can lead others during stressful situations
- Familiar with tools such as Confluence and Jira, on-call management software such as PagerDuty, and error monitoring software such as Sentry and Kibana