Job Summary:
IBM Consulting Client Innovation Centers (CICs) are environments where technologists build real solutions for clients. The Associate Data Engineer role is entry-level, focusing on supporting the development and maintenance of data pipelines and platforms while collaborating with experienced practitioners.
Responsibilities:
• Support the development and maintenance of data pipelines used for analytics, reporting, and machine learning
• Assist with extracting, transforming, and loading (ETL/ELT) data from multiple sources into data platforms
• Contribute to data cleansing, validation, and transformation activities using Python and SQL
• Help prepare datasets for downstream consumption by analytics and data science teams
• Support batch and, where applicable, near-real-time data processing workflows under guidance
• Collaborate with data engineers, data scientists, and other team members in Agile delivery environments
• Build data engineering skills through training, mentorship, and hands-on delivery experience
• Work with functional and technical team members to help integrate data solutions into client business environments
Qualifications:
Required:
• Strong foundation in computer science fundamentals, including data structures and algorithms
• Strong analytical and problem-solving skills with attention to data quality and reliability
• Comfortable working onsite in a collaborative, team-based environment
• Ability to work effectively in a technology-driven consulting environment where tools, platforms, and client needs evolve over time
• Ability to approach complex tasks using structured, logical thinking
• Ability to learn new systems and technologies quickly and apply them in a delivery setting
• Proficiency in Python (preferred) or another programming language used for data processing
• Hands-on experience using data manipulation tools such as pandas, NumPy, and SQL, gained through coursework, labs, projects, or internships
• Ability to write clear, maintainable code for data transformation and processing tasks
• Understanding of ETL/ELT concepts and how data moves from source systems to consumption layers
• Familiarity with relational databases and SQL for querying and data manipulation
• Basic understanding of data modeling concepts such as schemas, normalization, or dimensional models
• Exposure to cloud-based data or analytics platforms (e.g., AWS, Azure, or Google Cloud) through coursework, labs, or projects
• Familiarity with core cloud data services such as object storage, databases, or analytics services
• Ability to translate business or functional requirements into technical solutions, with guidance from senior team members
• Strong willingness to learn, accept feedback, and continuously improve
• Familiarity with generative AI concepts, including basic modeling approaches, responsible use, and ethical considerations, gained through coursework, projects, or self-study
Preferred:
• Master's Degree
• Exposure to distributed data processing tools such as Apache Spark or PySpark
• Familiarity with modern data warehouse technologies (e.g., Snowflake, Redshift, BigQuery)
• Exposure to streaming or event-based data concepts
• Familiarity with version control tools such as Git
• Basic awareness of how data engineering supports machine learning workflows
Company:
IBM provides technology and consulting services, including software, infrastructure systems, and cloud-based solutions. Founded in 1911, the company is headquartered in Armonk, New York, USA, and has more than 10,000 employees. It is currently a late-stage company.