Position: AWS Data Engineer
Location: Richmond, VA (Onsite)
Duration: Long term contract
Note: Ex-Capital One candidates preferred, especially those with at least 1 year of experience there.
Job Responsibilities:
- Develop and optimize scalable data pipelines using Python and PySpark, with a primary focus on leveraging AWS services like Glue, EMR, and Step Functions.
- Design, implement, and maintain data processing systems using AWS Glue for data ingestion, transformation, and orchestration.
- Utilize EMR clusters for distributed data processing and analytics, ensuring efficient resource utilization and performance.
- Implement serverless workflows using AWS Step Functions to orchestrate complex data processing tasks.
- Collaborate closely with cross-functional teams to understand data requirements and translate them into scalable and efficient solutions.
- Ensure data quality, governance, and security standards are adhered to throughout the data pipeline.
- Troubleshoot and debug data-related issues in production environments, leveraging AWS monitoring and logging tools.
- Stay abreast of the latest advancements in big data technologies, AWS services, and best practices in data engineering.
Job Requirements:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- Demonstrated expertise in developing data pipelines with Python and PySpark, with a strong emphasis on utilizing AWS Glue, EMR, and Step Functions.
- Proficiency in SQL for data manipulation and querying.
- Experience with distributed computing frameworks like Hadoop and Spark.
- Strong understanding of cloud computing principles and experience with AWS services.
- Excellent problem-solving and communication skills, with the ability to work collaboratively in a team environment.
- Familiarity with Agile development methodologies is advantageous.