Must-Have Technical/Functional Skills
• DB2 administration experience, including High Availability Disaster Recovery (HADR); familiarity with Oracle/Postgres and SQL.
• Experience with TWS/IWS integrations and APIs (REST/SOAP), event-based scheduling, and real-time/on-demand workload patterns.
• Experience with Tivoli Dynamic Workload Console (TDWC) and Tivoli Dynamic Workload Broker (TDWB), and with critical path monitoring.
• Experience integrating file transfer solutions (e.g., SFTP/PGP/GPG, managed file transfer platforms) into batch workflows.
• Experience with SAP and other enterprise application integrations via TWS extended agents.
• Experience building dashboards/metrics and integrating with observability platforms (e.g., Grafana/Graphite).
• Experience defining platform standards, leading upgrades/migrations, and coordinating cross-team delivery
(e.g., change windows, cutovers, rollback planning).
• Familiarity with cloud patterns and automation (e.g., infrastructure-as-code concepts, container/VM scheduling considerations)
in support of workload modernization.
• Hands-on experience across ITSM processes (Incident, Problem, Change, Knowledge) in an enterprise environment.
• ServiceNow experience, including incident lifecycle management, documentation standards, and reporting.
• Working knowledge of ITIL concepts and IT service management best practices.
• Artificial Intelligence: ability to navigate AI applications, communicate with them effectively, and recognize when not to use them
because the results do not meet your or the company's expectations.
• Strong analytical and problem-solving skills to investigate issues and drive resolution.
• Ability to manage multiple tasks in a high-volume, high-urgency operations environment.
• Strong written and verbal communication skills, including confident facilitation on conference bridges.
• Able to write and review technical documentation and knowledge articles.
Roles & Responsibilities
• Own the end-to-end architecture for the TWS/IWS platform (components, topology, environments, integrations), including standards, patterns,
and reference implementations.
• Provide technical oversight for additional (3rd-party) job scheduling platforms where used; establish operating standards, integration patterns,
and support processes to ensure consistent controls and reliability.
• Lead enterprise-scale installations, upgrades, and migrations; define cutover/rollback strategies, coordinate change windows,
and ensure readiness across dependent teams.
• Lead assessments of legacy scheduler instances and batch frameworks to identify candidates for retirement, consolidation, or migration;
produce target-state recommendations, sequencing/roadmaps, and risk-based migration plans.
• Define reliability engineering practices for workload automation: availability targets, capacity planning, performance tuning, monitoring/alerting,
and continuous improvement.
• Design and validate high-availability and disaster recovery solutions (including DB2 HADR where applicable); plan and execute regular DR tests
and remediate gaps.
• Establish governance for workload onboarding and job design: scheduling standards, dependency modeling, naming conventions, calendars,
critical path optimization, and SLA/SLO management.
• Architect and productionize automation for platform operations and self-service (e.g., provisioning, reporting, batch controls) using shell/Python/Perl
and enterprise tooling.
• Own security and compliance posture: access model (LDAP/SSO), least-privilege controls, audit evidence, vulnerability remediation,
and secure configuration baselines.
• Manage and develop two teams (e.g., platform engineering and operations): set priorities and operating rhythms, oversee delivery
and support outcomes, coach/mentor team members, and drive performance management in partnership with leadership.
• Be available for major outages and critical events related to job scheduling, including QEND activities up to four (4) times per fiscal year,
providing incident leadership, stakeholder communications, and post-incident follow-up.
• Participate in an on-call rotation and provide after-hours/weekend support as needed to maintain scheduling availability and meet business SLAs.
• Support a global operating model by working flexibly across EMEA and US business hours to provide required coverage and stakeholder overlap.
• Serve as escalation point for complex incidents; lead root-cause analysis and drive problem management to prevent recurrence.
• Mentor and guide engineers; lead technical design reviews, documentation/runbook standards, and knowledge sharing across the organization.
• Develop a deep understanding of other job scheduling teams (e.g., Automate, AS400, and Robot) and assist in supervising these teams within IT Operations.
Salary Range: $110,000-$130,000 per year
TCS Employee Benefits Summary:
Discretionary Annual Incentive.
Comprehensive Medical Coverage: Medical & Health, Dental & Vision, Disability Planning & Insurance, Pet Insurance Plans.
Family Support: Maternal & Parental Leave.
Insurance Options: Auto & Home Insurance, Identity Theft Protection.
Convenience & Professional Growth: Commuter Benefits, Certification & Training Reimbursement.
Time Off: Vacation, Time Off, Sick Leave & Holidays.
Legal & Financial Assistance: Legal Assistance, 401K Plan, Performance Bonus, College Fund, Student Loan Refinancing.