Data Preprocessing Jobs in Chicago, IL (NOW HIRING)

Data Scientist II, Molecular Biology

Chicago, IL · On-site +1

$150K - $175K/yr

Reporting to the Senior Director, Data Science, you will execute the analytical strategy for our ... Advanced proficiency in preprocessing, analyzing, and cataloging high-throughput molecular biology ...

... Preprocessing and handling large datasets - Developing and deploying models in PyTorch, TensorFlow, or JAX - Staying current on industry trends and tools - Mentoring junior team members with coaching ...

Develop data ingestion and preprocessing pipelines using BigQuery, Dataform, and Pub/Sub. * Apply prompt engineering and parameter tuning to improve generative model accuracy. * Implement RAG ...

Data Pipeline Management: Build and manage scalable data pipelines for preprocessing, feature engineering, and data flow to train and evaluate ML models. * Model Training and Deployment: Train, test ...

Manage data acquisition, preprocessing, and feature engineering for structured and unstructured data sources Your Skills and Experience: * PhD or Master's in Engineering, Math, Statistics, Computer ...

Quantitative Researcher

Chicago, IL · On-site

$200K - $400K/yr

The firm has a flat, collaborative and open environment, where Quantitative Researchers carry out full-stack research from preprocessing data and feature engineering through to strategy ...

Senior AI Developer

Mettawa, IL · On-site

$62.50 - $82.50/hr

Data Pipeline Development: Set up scalable data pipelines for data ingestion, embedding generation, preprocessing, and continuous model training/retraining. * Technical Leadership & Collaboration:

AI Developer

Mettawa, IL · On-site

$80/hr

Data Pipeline Development: Set up scalable data pipelines for data ingestion, embedding generation, preprocessing, and continuous model training/retraining. * Technical Leadership & Collaboration:

Oversee data acquisition, preprocessing, and feature engineering for structured and unstructured data sources * Mentor junior researchers and contribute to a culture of research excellence and ...

Senior AI & ML Engineer

Chicago, IL · Hybrid

$72K - $141K/yr

Solid experience managing diverse data sources, including preprocessing, cleansing, and verifying the integrity of data to develop data marts for data science use cases and machine learning ...

Machine Learning Researcher

Chicago, IL · On-site

$250K - $300K/yr

Manage data acquisition, preprocessing, and feature engineering for structured and unstructured data sources Your Skills and Experience: * PhD or Master's in Engineering, Math, Statistics, Computer ...

Machine Learning Research Lead

Chicago, IL · On-site

$250K - $300K/yr

Oversee data acquisition, preprocessing, and feature engineering for structured and unstructured data sources * Mentor junior researchers and contribute to a culture of research excellence and ...

Senior AI & ML Engineer

Chicago, IL · On-site

$72K - $141K/yr

Solid experience managing diverse data sources, including preprocessing, cleansing, and verifying the integrity of data to develop data marts for data science use cases and machine learning ...

Data Preprocessing information

Chicago, IL salary details: $47.4K (low), $170K (average), $250.8K (high)

How much do data preprocessing jobs pay per year?

As of Jun 29, 2026, the average yearly pay for data preprocessing in Chicago, IL is $169,993.00, according to ZipRecruiter salary data. Most workers in this role earn between $137,500.00 and $175,100.00 per year, depending on experience, location, and employer.

What is the highest paying job in data?

In data-related fields, roles such as Data Science Director, Machine Learning Engineer, and Chief Data Officer tend to have the highest salaries, often exceeding six figures annually. These positions typically require advanced skills in data analysis, programming, and leadership, along with extensive experience and relevant certifications.

What is data preprocessing?

Data preprocessing is the process of cleaning, transforming, and organizing raw data into a usable format for analysis or machine learning. It involves steps such as handling missing values, removing duplicates, normalizing or scaling data, and encoding categorical variables. Proper data preprocessing helps improve the quality and performance of predictive models by ensuring the data is accurate, consistent, and suitable for analysis.
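
The steps named above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a production pipeline: the records, the field names (`id`, `age`, `city`), and the choice of mean imputation and min-max scaling are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical raw records; field names and values are illustrative only.
raw = [
    {"id": 1, "age": 34, "city": "Chicago"},
    {"id": 2, "age": None, "city": "Mettawa"},
    {"id": 2, "age": None, "city": "Mettawa"},  # exact duplicate of the row above
    {"id": 3, "age": 50, "city": "Chicago"},
]

def preprocess(rows):
    # 1. Remove exact duplicates while preserving the original order.
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))

    # 2. Handle missing values: fill missing ages with the mean observed age.
    observed = [r["age"] for r in unique if r["age"] is not None]
    fill = mean(observed)
    for r in unique:
        if r["age"] is None:
            r["age"] = fill

    # 3. Scale age into [0, 1] with min-max normalization.
    lo = min(r["age"] for r in unique)
    hi = max(r["age"] for r in unique)
    span = (hi - lo) or 1  # guard against division by zero
    for r in unique:
        r["age_scaled"] = (r["age"] - lo) / span

    # 4. One-hot encode the categorical "city" column.
    cities = sorted({r["city"] for r in unique})
    for r in unique:
        for c in cities:
            r[f"city_{c}"] = 1 if r["city"] == c else 0
    return unique

clean = preprocess(raw)
```

In practice this kind of work is usually done with libraries such as pandas; the plain-Python version is used here only to keep each step visible.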

What are the key skills and qualifications needed to thrive as a Data Preprocessing Specialist, and why are they important?

To thrive as a Data Preprocessing Specialist, you need a strong background in statistics, data cleaning, and data transformation, often supported by a degree in computer science, data science, or a related field. Proficiency with tools such as Python (pandas, NumPy), SQL, and data visualization platforms is typically essential, along with familiarity with data management systems. Attention to detail, problem-solving abilities, and effective communication are standout soft skills in this position. These skills are crucial for ensuring high-quality, reliable datasets that underpin accurate data analysis and machine learning outcomes.

Is 40 too late for data science?

Data preprocessing is a key step in data science, and individuals can enter the field at any age. Many data scientists start later in life, and acquiring skills in programming, statistics, and tools like Python or R can facilitate entry regardless of age.

What do you do in data preprocessing?

Data preprocessing involves cleaning and transforming raw data to prepare it for analysis or modeling. This includes tasks such as handling missing values, removing duplicates, normalizing data, and encoding categorical variables, often using tools like Python or R. It is a crucial step to ensure data quality and improve model performance.

What is the difference between Data Preprocessing vs Data Analysis?

Aspect | Data Preprocessing | Data Analysis
Primary Focus | Cleaning, transforming, and preparing raw data for analysis | Interpreting data to extract insights and support decision-making
Skills Required | Data cleaning, scripting, understanding of data formats | Statistical analysis, data visualization, critical thinking
Work Environment | Data engineering teams, data science projects | Business intelligence, research, data science teams
Tools Used | Python, R, SQL, ETL tools | Excel, Tableau, R, Python, statistical software

While data preprocessing involves preparing raw data for analysis by cleaning and transforming it, data analysis focuses on interpreting the prepared data to uncover trends and insights. Both roles are essential in the data pipeline but serve different purposes in the data lifecycle.

Will AI replace data analysts?

AI is transforming data analysis by automating routine tasks such as data cleaning and basic reporting, but data analysts are still essential for interpreting complex insights, making strategic decisions, and applying domain knowledge. The role is evolving to include skills in machine learning tools and programming languages like Python or R, but human expertise remains critical for nuanced analysis and contextual understanding.

What are some common challenges faced in a Data Preprocessing role, and how can they be effectively managed?

Professionals in Data Preprocessing often encounter challenges such as handling incomplete or inconsistent data, managing large datasets, and ensuring data quality before analysis. Addressing these issues typically involves using specialized tools to automate data cleaning, establishing clear data validation rules, and collaborating closely with data engineers and analysts. Staying updated with best practices and leveraging scripting languages like Python or R can also streamline the preprocessing workflow, making it easier to deliver reliable and accurate datasets for downstream analysis.
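
One way to picture "clear data validation rules" is as a small table of per-field checks applied before any analysis. The sketch below is an assumption-laden example: the `RULES` mapping, the field names, and the age bounds are hypothetical and would differ for any real dataset.

```python
# Hypothetical validation rules: each maps a field name to a predicate.
RULES = {
    "id": lambda v: isinstance(v, int) and v > 0,
    "age": lambda v: isinstance(v, (int, float)) and 0 <= v <= 120,
    "city": lambda v: isinstance(v, str) and v.strip() != "",
}

def validate(row, rules=RULES):
    """Return the names of fields that are missing or fail their rule."""
    return [name for name, ok in rules.items()
            if name not in row or not ok(row[name])]

def split_valid(rows, rules=RULES):
    """Partition rows into (valid, rejected) so bad records can be reviewed."""
    valid, rejected = [], []
    for row in rows:
        errors = validate(row, rules)
        if errors:
            rejected.append((row, errors))
        else:
            valid.append(row)
    return valid, rejected

rows = [
    {"id": 1, "age": 34, "city": "Chicago"},
    {"id": 2, "age": -5, "city": "Chicago"},  # age out of range
    {"id": 3, "city": ""},                    # age missing, city empty
]
valid, rejected = split_valid(rows)
```

Keeping rejected rows alongside the reasons they failed, rather than silently dropping them, makes it easier to discuss recurring data problems with the engineers and analysts who own the source.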

Data Scientist II, Molecular Biology

EVOZYNE INC

Chicago, IL • On-site, Remote

$150K - $175K/yr

Full-time

Posted 12 days ago


Job description

Evozyne is an AI-native biotech company building a new way to design therapeutic proteins. Our generative AI platform was purpose-built to create entirely novel proteins that expand what’s possible beyond traditional drug discovery. We are applying this platform to develop transformative therapies for serious diseases with significant unmet need, working at the intersection of AI, biology, and protein engineering to solve complex scientific problems that conventional approaches cannot easily address.

Reporting to the Senior Director, Data Science, you will execute the analytical strategy for our drug discovery programs, encompassing experimental design, data synthesis, and featurization. You will partner closely with experimental scientists to understand assay design, wrangle multi-assay datasets, build decision-grade plots and summaries, and translate results for audiences from bench scientists to leadership. You’ll incorporate the latest advances in biological assay developments and database infrastructure to streamline program analytical processes, and your work will directly support experimental decision-making and generate high-quality datasets for model training (GenAI) for the design of novel synthetic biomolecules.

Location: Hybrid preferred (3 days per week in Chicago office); Open to remote (US-based)

What You’ll Do

  • Analyze, synthesize, and catalog experimental data across various data modalities to provide insights and optimization approaches.
  • Collaborate extensively with experimental scientists - asking questions, reflecting on objectives, and agreeing on success criteria before executing.
  • Own the development of reproducible pipelines to synthesize high-throughput experimental results into features amenable for training deep learning models.
  • Draw upon your experience in programming to maintain and update the company’s data processing and ingestion software infrastructure.
  • Deliver analyses and decision-grade visualizations that directly inform next-step experiments, assay optimization, or go/no-go decisions.

Who You Are
You thrive in an early-stage start-up environment where you can leverage your agility and expertise to deliver high-quality results. You are galvanized by designing novel therapeutics in a rapidly evolving field with a cross-functional team of experts spanning biological disciplines. You are naturally curious and excel when working collaboratively to solve tough problems.

Required Skills + Experience

  • A PhD in a relevant scientific or technical discipline with 0-2+ years relevant postdoctoral or industry experience, or a Master's degree with 4+ years of experience.
  • 2+ years of experience working in a cross-functional, collaborative scientific environment, such as in an academic lab, pharmaceutical company, and/or biotech.
  • Extensive experience working in an experimental scientific discipline, designing and executing experiments.
  • Advanced proficiency in preprocessing, analyzing, and cataloging high-throughput molecular biology or biochemistry datasets in Python.
  • Familiarity with database organization and management.
  • Expertise in ML and deep learning implementation is preferred, ideally in PyTorch or TensorFlow.

Additional Information
Compensation: $150,000 - $175,000

Individual compensation within this range is determined by a combination of factors, including, but not limited to, level, years of relevant job-related experience, and internal equity. This is what we believe in good faith is the range of possible base salary for this role at the time of this posting. We may ultimately pay more or less than the posted range. This range may be modified in the future.

Relocation assistance is not available for this position.