Biology Data Engineer Jobs (NOW HIRING)

Data Engineer

Durham, NC · On-site

$110K - $132K/yr

... complex biological and chemical interactions, predicts precision outcomes, and enables next ... Position Summary The Data Engineer sits within the Data & Analytics organization and supports the ...

You will design and manage our ETL pipeline of diverse biological data, with an eye for both ... You will work closely with experimental scientists, device engineers, and software and operations ...

Data Engineer

Madison, WI · On-site

$50K/yr

Data Engineer I Job Summary: About Us: The Institute on Aging is a research unit whose mission is ... Known as MIDUS (Midlife in the US), the study examines influences of emotion, personality, biology ...

Staff Computational Biologist

Lexington, MA · On-site +1

$195K - $230K/yr

... biology or data-driven discovery; minimum 1 year's experience in software engineering and data engineering teams * Proficiency working with programming languages Python and R, modern development ...

Senior / Staff Data Engineer

New York, NY · On-site +1

$241K - $338K/yr

The data that trains biological frontier models comes in dozens of modalities (sequences, images ... As a senior / staff data engineer at Biohub, you'll be designing systems that ingest data from ...

Sr. Data Engineer

Milford, MA

$125K - $150K/yr

As a Sr Data Engineer, you will be responsible for implementing AI, process automation, data ... and biology. We collaborate with customers around the world to advance the release of effective ...

Sr. Data Engineer

Milford, MA · On-site

$125K - $150K/yr

As a Sr Data Engineer, you will be responsible for implementing AI, process automation, data ... and biology. We collaborate with customers around the world to advance the release of effective ...

Biology Data Engineer information

Salary scale: $44.5K, $129.7K (average), $177.5K

How much do biology data engineer jobs pay per year?

As of Jun 20, 2026, the average yearly pay for a biology data engineer in the United States is $129,716, according to ZipRecruiter salary data. Most workers in this role earn between $114,500 and $137,500 per year, depending on experience, location, and employer.

How do Biology Data Engineers typically collaborate with biologists and other researchers on data projects?

Biology Data Engineers frequently work closely with biologists, bioinformaticians, and research scientists to understand the specific data requirements and biological context of projects. This often involves translating experimental needs into data pipelines, helping researchers manage large datasets, and ensuring data integrity and accessibility. Regular meetings, joint problem-solving sessions, and iterative feedback are common, enabling seamless integration of computational solutions with biological research. Strong communication skills and a willingness to learn domain-specific concepts are essential for success in this collaborative environment.

What is the difference between Biology Data Engineer vs Bioinformatics Data Scientist?

Aspect | Biology Data Engineer | Bioinformatics Data Scientist
Required Credentials | Bachelor's or Master's in Biology, Data Science, or related fields; experience with data engineering tools | Bachelor's or Master's in Bioinformatics, Computer Science, or related fields; strong programming skills
Work Environment | Data pipelines, database management, cloud platforms in research or biotech companies | Data analysis, algorithm development, research in healthcare or biotech sectors
Employer & Industry Usage | Biotech firms, research institutions, pharmaceutical companies | Research labs, healthcare organizations, biotech firms

The main difference between a Biology Data Engineer and a Bioinformatics Data Scientist lies in their focus areas. Biology Data Engineers primarily build and maintain data infrastructure for biological data, while Bioinformatics Data Scientists analyze and interpret biological data to derive insights. Both roles require strong technical skills and are vital in biotech and research industries, but they serve different functions within data management and analysis workflows.

What is a Biology Data Engineer?

A Biology Data Engineer is a professional who designs, builds, and maintains data systems specifically for biological and life sciences research. They work with large and complex biological datasets, ensuring that data is efficiently collected, stored, and accessible for analysis. Their responsibilities often include creating data pipelines, integrating data from various sources like genomics, proteomics, or clinical studies, and ensuring data quality and security. Biology Data Engineers collaborate closely with bioinformaticians, researchers, and software developers to support scientific discovery. Their work is essential for enabling advanced analytics, such as machine learning, in biological research.

What are the key skills and qualifications needed to thrive as a Biology Data Engineer, and why are they important?

To thrive as a Biology Data Engineer, you need a strong background in biology and computational data analysis, often supported by a degree in bioinformatics, computational biology, or computer science. Familiarity with programming languages (such as Python or R), biological databases, and data management platforms is typically required, as well as experience with cloud computing and big data tools. Strong problem-solving, collaboration, and communication skills are essential for translating complex biological data into actionable insights. These skills ensure the effective integration, analysis, and interpretation of large-scale biological datasets critical for research and innovation.

Infographic: Biology Data Engineer job openings in the United States as of June 2026. Employment types are 89% full-time, 1% part-time, and 10% contract; 87% of jobs are physical (on-site), 5% hybrid, and 8% remote. The average salary is $129,716 per year, or $62.4 per hour.

Data Engineer, Scientific Data Ingestion

Mithrl

San Francisco, CA • On-site

$150K - $200K/yr

Full-time

Medical, Dental, Vision, Retirement

Posted 25 days ago


Job description

ABOUT MITHRL
We envision a world where novel drugs and therapies reach patients in months, not years, accelerating breakthroughs that save lives.
Mithrl is building the world's first commercially available AI Co-Scientist, a discovery engine that empowers life science teams to go from messy biological data to novel insights in minutes. Scientists ask questions in natural language, and Mithrl answers with real analysis, novel targets, and patent-ready reports.
Our traction speaks for itself:
  • 12X year-over-year revenue growth
  • Trusted by leading biotechs and big pharma across three continents
  • Driving real breakthroughs from target discovery to patient outcomes.

WHAT YOU WILL DO
  • Build and own an AI-powered ingestion & normalization pipeline to import data from a wide variety of sources - unprocessed Excel/CSV uploads, lab and instrument exports, as well as processed data from internal pipelines.
  • Develop robust schema mapping, coercion, and conversion logic (think: units normalization, metadata standardization, variable-name harmonization, vendor-instrument quirks, plate-reader formats, reference-genome or annotation updates, batch-effect correction, etc.).
  • Use LLM-driven and classical data-engineering tools to structure "semi-structured" or messy tabular data - extracting metadata, inferring column roles/types, cleaning free-text headers, fixing inconsistencies, and preparing final clean datasets.
  • Ensure all transformations that should only happen once (normalization, coercion, batch-correction) execute during ingestion - so downstream analytics / the AI "Co-Scientist" always works with clean, canonical data.
  • Build validation, verification, and quality-control layers to catch ambiguous, inconsistent, or corrupt data before it enters the platform.
  • Collaborate with product teams, data science / bioinformatics colleagues, and infrastructure engineers to define and enforce data standards, and ensure pipeline outputs integrate cleanly into downstream analysis and storage systems.
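
To make the header harmonization, unit normalization, and quality-control duties listed above more concrete, here is a minimal Python sketch of what a single ingestion step might look like. It is an illustration only, not Mithrl's pipeline: the alias table, the unit table, the choice of micromolar as the canonical unit, and the function names (`clean_header`, `to_micromolar`, `normalize_row`) are all hypothetical, and it uses only the standard library rather than Pandas or Polars.

```python
import re

# Hypothetical alias table: maps cleaned header variants to canonical names.
HEADER_ALIASES = {
    "conc": "concentration",
    "sampleid": "sample_id",
}

# Hypothetical unit table: factor converting each unit to micromolar (uM).
# Only ASCII unit spellings are handled in this sketch.
TO_MICROMOLAR = {"nm": 1e-3, "um": 1.0, "mm": 1e3, "m": 1e6}


def clean_header(raw: str) -> str:
    """Drop a parenthesised unit hint, snake_case the rest, apply aliases."""
    text = re.sub(r"\(.*?\)", "", raw).strip().lower()
    text = re.sub(r"[^a-z0-9]+", "_", text).strip("_")
    return HEADER_ALIASES.get(text, text)


def to_micromolar(value: str) -> float:
    """Parse strings such as '250 nM' or '0.5mM' and return the value in uM."""
    match = re.fullmatch(r"\s*([-+]?\d*\.?\d+)\s*([A-Za-z]+)\s*", value)
    if match is None:
        raise ValueError(f"unparseable concentration: {value!r}")
    number, unit = match.groups()
    unit = unit.lower()
    if unit not in TO_MICROMOLAR:
        raise ValueError(f"unknown unit: {unit!r}")
    return float(number) * TO_MICROMOLAR[unit]


def normalize_row(raw_row: dict) -> tuple[dict, list[str]]:
    """Return (clean_row, problems); a non-empty problems list means reject."""
    row = {clean_header(key): value.strip() for key, value in raw_row.items()}
    problems = []
    if not row.get("sample_id"):
        problems.append("missing sample_id")
    try:
        row["concentration_um"] = to_micromolar(row.pop("concentration", ""))
    except ValueError as err:
        problems.append(str(err))
    return row, problems


# Usage with invented sample data:
row, problems = normalize_row({"Sample ID": "S1 ", "Conc. (nM)": "250 nM"})
```

Returning a list of problems instead of raising on the first one is one possible design choice: it lets a caller quarantine a bad row with all of its issues recorded, so ambiguous data is stopped at ingestion rather than surfacing downstream.
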
WHAT YOU BRING
Must-have
  • 5+ years of experience in data engineering / data wrangling with real-world tabular or semi-structured data.
  • Strong fluency in Python, and data processing tools (Pandas, Polars, PyArrow, or similar).
  • Extensive experience dealing with messy Excel / CSV / spreadsheet-style data - inconsistent headers, multiple sheets, mixed formats, free-text fields - and normalizing it into clean structures.
  • Comfort designing and maintaining robust ETL/ELT pipelines, ideally for scientific or lab-derived data.
  • Ability to combine classical data engineering with LLM-powered data normalization / metadata extraction / cleaning.
  • Strong desire and ability to own the ingestion & normalization layer end-to-end - from raw upload → final clean dataset - with an eye for maintainability, reproducibility, and scalability.
  • Good communication skills; able to collaborate across teams (product, bioinformatics, infra) and translate real-world messy data problems into robust engineering solutions.
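
As one illustration of the "mixed formats" problem named in the list above, the following Python sketch infers a single type per column from a messy CSV export. It is a simplified, classical (non-LLM) approach under stated assumptions: the set of missing-value tokens, the single accepted date format, the widening rule, and the names `classify` and `infer_column_types` are hypothetical, and the sample data is invented.

```python
import csv
import io
from datetime import datetime

# Tokens treated as missing values; an assumption, real exports vary by vendor.
MISSING = {"", "na", "n/a", "nan", "null", "-"}


def classify(value: str) -> str:
    """Classify one cell as 'missing', 'int', 'float', 'date', or 'text'."""
    text = value.strip()
    if text.lower() in MISSING:
        return "missing"
    for label, parser in (("int", int), ("float", float)):
        try:
            parser(text)
            return label
        except ValueError:
            pass
    try:
        # Only ISO dates (YYYY-MM-DD) are recognised in this sketch.
        datetime.strptime(text, "%Y-%m-%d")
        return "date"
    except ValueError:
        return "text"


def infer_column_types(csv_text: str) -> dict:
    """Infer one type per column; int+float widens to float, other mixes to text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = {name: set() for name in reader.fieldnames or []}
    for row in reader:
        for name in seen:
            seen[name].add(classify(row.get(name) or ""))
    result = {}
    for name, kinds in seen.items():
        kinds.discard("missing")
        if not kinds:
            result[name] = "empty"
        elif len(kinds) == 1:
            result[name] = next(iter(kinds))
        elif kinds == {"int", "float"}:
            result[name] = "float"
        else:
            result[name] = "text"
    return result


# Invented plate-reader-style sample with missing and mixed-format cells.
SAMPLE = (
    "well,od600,read_date,notes\n"
    "A1,0.42,2024-01-05,ok\n"
    "A2,1,2024-01-06,\n"
    "A3,NA,n/a,re-read\n"
)
column_types = infer_column_types(SAMPLE)
```

Falling back to "text" whenever a column mixes incompatible kinds is deliberately conservative; a production pipeline would more likely flag such columns for review than coerce them silently.
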

Nice-to-have
  • Familiarity with scientific data types and "modalities" (e.g. plate-readers, genomics metadata, time-series, batch-info, instrumentation outputs).
  • Experience with workflow orchestration tools (e.g. Nextflow, Prefect, Airflow, Dagster), or building pipeline abstractions.
  • Experience with cloud infrastructure and data storage (AWS S3, data lakes/warehouses, database schemas) to support multi-tenant ingestion.
  • Past exposure to LLM-based data transformation or cleansing agents - building or integrating tools that clean or structure messy data automatically.
  • Any background in computational biology / lab-data / bioinformatics is a bonus - though not required.

WHAT YOU WILL LOVE AT MITHRL
  • Mission-driven impact: you'll be the gatekeeper of data quality - ensuring that all scientific data entering Mithrl becomes clean, consistent, and analysis-ready. You'll have outsized influence over the reliability and trustworthiness of our entire data + AI stack.
  • High ownership & autonomy: this role is yours to shape. You decide how ingestion works, define the standards, build the pipelines. You'll work closely with our product, data science, and infrastructure teams - shaping how data is ingested, stored, and exposed to end users or AI agents.
  • Team: Join a tight-knit, talent-dense team of engineers, scientists, and builders
  • Culture: We value consistency, clarity, and hard work. We solve hard problems through focused daily execution
  • Speed: We ship fast (2x/week) and improve continuously based on real user feedback
  • Location: Beautiful SF office with a high-energy, in-person culture
  • Benefits: Comprehensive PPO health coverage through Anthem (medical, dental, and vision) + 401(k) with top-tier plans

We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you're interested in this work. We think AI systems like the ones we're building have enormous social and ethical implications. We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team.