Location: Rockville, MD; Tysons Corner, VA; Woodbridge, NJ; or Jersey City, NJ (3 days onsite per week)
Duration: 6 months (long-term extensions)
Notes:
Senior Platform Engineer - Big Data (AWS | EMR | EKS)
- Build and modernize a large-scale AWS big data platform (EMR, S3, Athena, Trino) supporting enterprise analytics
- Help drive platform evolution toward cloud-native, containerized workloads on AWS EKS (Kubernetes)
- Work at the intersection of software engineering, big data, and platform engineering (not ETL-only)
- Design and operate Spark-based data workloads, optimizing performance, reliability, and cost
- Implement CI/CD and Infrastructure as Code (Terraform / CloudFormation) for data platforms
- Ideal for engineers with a strong backend or platform background who've grown into big data
Job Description:
Overview
We are seeking a Senior Platform Engineer with deep Big Data experience to help design, operate, and modernize a large-scale data platform on AWS. This role goes beyond traditional ETL or pipeline development: it focuses on building and evolving the underlying data platform that supports analytics, reporting, and future AI/ML use cases.
The current environment is built primarily on AWS EMR and S3, with a strong query layer using Athena and Trino. The team is actively modernizing the platform and evaluating AWS EKS (Kubernetes) as part of a shift toward more cloud-native, containerized data workloads.
This role is ideal for an engineer with a software or platform engineering background who moved into big data, rather than a pure ETL developer.
Key Responsibilities
- Design, build, and operate scalable big data platforms on AWS, with S3 as the core data lake.
- Develop and optimize Spark-based workloads on EMR, including performance tuning and cost optimization.
- Support and enhance federated query engines such as Athena and Trino for large-scale analytics.
- Contribute to the modernization of the data platform, including evaluation and adoption of Kubernetes/EKS for data services and workloads.
- Build and operate data services and platform components using containerized deployments (Docker + EKS).
- Implement and maintain Infrastructure as Code using Terraform and/or CloudFormation.
- Design and support CI/CD pipelines for data and platform workloads.
- Partner with data engineers, analytics teams, and stakeholders to ensure the platform is reliable, performant, and extensible.
- Monitor and troubleshoot platform issues across clusters, pipelines, and query engines using CloudWatch and related tooling.
- Continuously evaluate new technologies and propose improvements to the overall data architecture.
Required Qualifications
- 8+ years of experience in Big Data, Platform Engineering, or Data Engineering roles.
- Strong hands-on experience with AWS, including:
  - EMR
  - S3
  - Athena
  - AWS Glue / Glue Data Catalog
- Solid experience with Spark (PySpark or Scala) and distributed data processing.
- Strong SQL skills, particularly with large datasets (Athena, Trino, Presto, etc.).
- Experience with Docker and containerized applications.
- Working knowledge of Kubernetes, with exposure to AWS EKS strongly preferred.
- Experience implementing CI/CD pipelines (Jenkins, GitHub Actions, or similar).
- Infrastructure as Code experience using Terraform and/or CloudFormation.
- Strong scripting and programming skills (Python preferred).
- Ability to think at a platform and architecture level, not just task execution.
Nice to Have
- Experience running Spark on Kubernetes (EKS).
- Trino/Presto performance tuning experience.
- Experience preparing data platforms for AI/ML workloads.
- Observability tooling experience (CloudWatch, Grafana, Prometheus).
- Background as a software engineer before moving into big data.