Must Have Technical/Functional Skills
Primary skills: PySpark, Apache Kafka, Hadoop Ecosystem, Hive, Databricks Lakehouse Architecture, Delta Lake, Bronze/Silver/Gold Data Modeling, Big Data ETL Pipeline Development, SQL, Real-time Data Ingestion Frameworks, Data Governance & Cataloging, CI/CD Tools (Git, Jenkins, Bitbucket), Workflow Orchestration, and Cloud & On-Prem Big Data Platforms.
Experience: 10+ years
Roles & Responsibilities
Seeking a Senior Big Data Engineer with 10-13 years of experience specializing in Hadoop, PySpark, Kafka, and Hive, and with strong experience designing data solutions for large-scale financial systems.
In addition, the candidate must possess advanced expertise in Databricks Lakehouse architecture, particularly around Bronze/Silver/Gold layer data modeling, Delta Lake optimizations, and building reliable, scalable pipelines for regulatory, risk, trading, and analytics workloads.
This role focuses on delivering highly performant, well-governed data platforms that support the bank's mission-critical global markets functions.
Key Responsibilities:
Big Data Platform Engineering
• Design, develop, and optimize PySpark-based ETL pipelines running on on-prem Hadoop clusters and cloud environments.
• Build high-volume ingestion frameworks using Kafka for real-time and near-real-time trading and market data.
• Develop, tune, and manage Hadoop ecosystem components: HDFS, YARN, MapReduce, Tez, Oozie/Airflow.
• Build high-performance, optimized Hive data models for regulatory reporting, trade lifecycle, and market risk processing.
Databricks Lakehouse & Delta Framework
• Architect and implement Bronze/Silver/Gold layer modeling patterns within the Databricks Lakehouse.
• Apply Delta Lake best practices including:
o optimized file management
o Z-Ordering
o Delta Change Data Feed (CDF)
o schema evolution & enforcement
o ACID transaction handling
• Build reusable frameworks for ingestion, cleansing, transformation, and consumption of data across Lakehouse layers.
• Enable governance, lineage, and auditability using Unity Catalog or equivalent cataloging tools.
Collaboration, Leadership & Delivery
• Collaborate closely with quants, product owners, architects, risk tech, and business users.
• Participate in agile ceremonies: sprint planning, refinement, design reviews.
• Mentor junior engineers and contribute to building strong engineering practices across tech teams.
Required Skills & Experience
• 10-13 years of hands-on experience in Big Data engineering.
• Expert skills in:
o PySpark: DataFrame optimizations, partitioning, broadcast strategies, distributed computing.
o Kafka: producer/consumer design, schema registry, streaming ETLs.
o Hadoop ecosystem: HDFS, YARN, MapReduce/Tez, Oozie/Airflow.
o Hive: advanced query tuning, Tez optimization, partition/bucket management.
• Extensive hands-on experience with Databricks Lakehouse, including:
o Bronze/Silver/Gold layer modeling
o Delta Lake optimizations
o Data quality frameworks on Lakehouse
o Structured & unstructured data handling
• Experience in Global Markets, Risk, Treasury, Trade Surveillance, or Regulatory Reporting.
• Strong SQL knowledge with experience working on massive datasets (TB/PB scale).
• Experience with CI/CD practices: Git, Jenkins, Bitbucket, build pipelines.
TCS Employee Benefits Summary:
Discretionary Annual Incentive.
Comprehensive Medical Coverage: Medical & Health, Dental & Vision, Disability Planning & Insurance, Pet Insurance Plans.
Family Support: Maternal & Parental Leaves.
Insurance Options: Auto & Home Insurance, Identity Theft Protection.
Convenience & Professional Growth: Commuter Benefits, Certification & Training Reimbursement.
Time Off: Vacation, Time Off, Sick Leave & Holidays.
Legal & Financial Assistance: Legal Assistance, 401K Plan, Performance Bonus, College Fund, Student Loan Refinancing.
Salary Range: $110,000-$125,000 a year