Role: Python Data Engineer / API Developer
Location: Salt Lake City (onsite from day 1)
Duration: Long term
Key skills: Python, PySpark, GCP, API development
Role Overview
We are seeking a highly skilled Python Data Engineer / API Developer with strong hands-on experience in PySpark, cloud-based data engineering on GCP, and API development. The ideal candidate has expertise in building scalable data pipelines, working with distributed clusters, and developing secure APIs for enterprise-grade applications.
Key Responsibilities
- Design, develop, and maintain scalable data pipelines for batch and/or real-time processing.
- Build and optimize PySpark applications running on distributed clusters.
- Develop secure and scalable Python-based APIs.
- Work with cloud-native GCP services including BigQuery, Cloud Composer (DAG orchestration), and Cloud Storage buckets.
- Implement data quality checks, validations, and monitoring frameworks within pipelines.
- Collaborate with cross-functional teams including data analysts, BI teams, and platform engineers.
- Apply performance, reliability, and security best practices across solutions.
Required Skills & Qualifications
- Strong hands-on experience with PySpark and distributed cluster computing.
- Proven experience in building Python APIs with a focus on security and scalability.
- Strong knowledge of API frameworks such as FastAPI or Flask.
- Hands-on experience with Google Cloud Platform (GCP) services:
  - BigQuery
  - Cloud Composer
  - DAG orchestration
  - Cloud Storage buckets
- Experience in building robust batch and/or real-time data pipelines.
- Strong understanding of data quality frameworks and practices.
Preferred Skills
- Experience with BI and reporting tools such as:
- Familiarity with CI/CD pipelines and DevOps practices is a plus.
- Exposure to data governance and monitoring tools is a plus.