Job Summary:
Diverse Lynx, a technology solutions company, is seeking a PySpark/Python Data Engineer to design and maintain ETL/ELT pipelines. The role involves developing optimized data processing components and ensuring data quality and performance across pipelines.
Responsibilities:
• Design, develop, and maintain ETL/ELT pipelines using PySpark
• Write optimized and scalable PySpark transformations using DataFrames and Spark SQL
• Develop reusable and efficient Python-based data processing components
• Ensure data quality, integrity, and performance across pipelines
• Perform debugging, performance tuning, and optimization of PySpark jobs
• Collaborate with cross-functional teams (Data Analysts, Architects, DevOps)
• Contribute to CI/CD pipelines and deployment workflows for data applications
• Monitor and troubleshoot data workloads in production environments
Qualifications:
Required:
• Strong hands-on experience in PySpark (Spark SQL, DataFrame API)
• Advanced proficiency in Python (data processing, performance tuning, modular coding)
• Solid understanding of ETL design patterns and data pipeline architecture
• Good working knowledge of SQL for data transformation and analysis
• Experience with data processing in distributed environments
• 3–8 years of experience in Data Engineering / PySpark development
• Proven hands-on project experience in PySpark + Python
Preferred:
• Experience with cloud platforms (AWS preferred: S3, Glue, EMR, or equivalent services)
• Familiarity with workflow orchestration tools such as Airflow or similar schedulers
• Exposure to data warehousing concepts (e.g., Snowflake or similar platforms)
• Knowledge of code versioning (Git) and CI/CD practices
Company:
Diverse Lynx is a WBENC- and NMSDC-certified partner, helping organizations turn diversity goals into measurable impact through staffing and contingent workforce solutions. Founded in 2002, the company is headquartered in Princeton, New Jersey, US, with a team of 1001-5000 employees. It is currently a late-stage company.