Job Summary:
Scalence L.L.C. is seeking a highly experienced Lead Data Engineer to design, develop, and support enterprise-scale data platforms on AWS. The role involves technical leadership, hands-on development, and stakeholder management while building scalable cloud data solutions.
Responsibilities:
• Design, develop, and maintain scalable data pipelines using AWS cloud services.
• Build robust ETL/ELT workflows using Python, PySpark, AWS Glue, and SQL.
• Develop solutions for processing structured, semi-structured, and large-scale datasets.
• Implement enterprise Data Lake/Lakehouse solutions using Amazon S3.
• Build reusable data ingestion and transformation frameworks.
• Develop and optimize solutions using Amazon S3, AWS Glue, Amazon Athena, AWS Lambda, Amazon Redshift, and Amazon EMR.
• Design secure, scalable, and cost-efficient cloud data architectures.
• Optimize storage, partitioning, compression, and query performance.
• Work with Parquet, ORC, and Avro file formats.
• Design high-performance batch data pipelines.
• Optimize Spark jobs and SQL queries for large datasets.
• Improve pipeline reliability, scalability, and operational efficiency.
• Implement monitoring, logging, and alerting for data workflows.
• Serve as the technical lead and Agile anchor for the data engineering team.
• Lead sprint planning, backlog grooming, estimation, and delivery tracking.
• Collaborate with Product Owners, Scrum Masters, Architects, and business stakeholders.
• Mentor junior engineers and establish engineering best practices.
• Conduct code reviews, design reviews, and technical walkthroughs.
• Provide L2/L3 production support for enterprise data platforms.
• Troubleshoot pipeline failures and performance issues.
• Perform Root Cause Analysis (RCA) and implement preventive solutions.
• Participate in incident management and on-call support.
• Use Amazon CloudWatch and other monitoring tools to ensure platform health.
• Implement data quality validation and reconciliation processes.
• Ensure data integrity, lineage, governance, and compliance.
• Develop monitoring frameworks for data quality and operational metrics.
• Implement CI/CD pipelines for data engineering solutions.
• Use Git, Jenkins, AWS CodePipeline, or similar deployment tools.
• Support Infrastructure as Code using Terraform or CloudFormation.
• Automate deployment, testing, and operational processes.
Qualifications:
Required:
• 8–10+ years of experience in designing, developing, and supporting enterprise-scale data platforms on AWS
• Strong expertise in AWS Data Services
• Strong expertise in Python
• Strong expertise in PySpark
• Strong expertise in SQL
• Strong expertise in ETL/ELT development
• Strong expertise in Data Lake/Lakehouse architectures
• Technical leadership experience
• Hands-on development experience
• Agile delivery ownership experience
• Production support experience
• Stakeholder management experience
• Experience in building scalable, secure, and high-performance cloud data solutions
• Demonstrated experience in each of the areas listed under Responsibilities above, including AWS data pipeline design and optimization, Agile technical leadership, L2/L3 production support, data quality and governance, and CI/CD automation
Company:
In today’s dynamic and competitive market, success hinges on mastering three key areas: Data Intelligence, Business Resilience, and Digital Experience. The company is headquartered in Morristown, New Jersey, US, with a team of 501-1000 employees. It is currently a late-stage company.