Job Summary:
Montefiore Medical Center is nationally recognized for clinical excellence and is seeking a Data Engineer for Research Informatics. This position is responsible for designing, building, and maintaining scalable data pipelines and research-ready datasets to support clinical and translational research across the organization.
Responsibilities:
• Design, build, and automate robust, scalable data pipelines for ingesting, transforming, and loading both structured and unstructured data from diverse internal and external sources, including EHR systems (e.g., Epic Clarity/Caboodle), clinical research systems, registries, and third-party research datasets
• Design, develop, and support data models and schemas to enable integration of clinical and research data, supporting use cases such as cohort identification, longitudinal patient tracking, and research analytics
• Implement CI/CD pipelines and best practices for data engineering assets, ensuring reproducibility, version control, and reliable deployment of research data pipelines and datasets
• Design and implement rules-driven data quality frameworks that enhance observability, transparency, and auditability of research data, supporting compliance with regulatory and research standards (e.g., HIPAA, IRB protocols), and enabling rapid identification and remediation of data issues
• Develop and maintain research-ready datasets that support clinical trials, observational studies, population health research, and real-world evidence generation
• Integrate and harmonize data across heterogeneous sources, including structured EHR data, unstructured clinical notes, imaging metadata, genomics, and patient-reported outcomes, to enable advanced analytics and data science workflows
• Collaborate with product managers, clinical researchers, biostatisticians, data scientists, BI developers, and application teams to understand research and clinical questions, and translate them into scalable, high-quality data solutions
• Support secure data access, governance, and de-identification processes to enable compliant use of data for research purposes while protecting patient privacy
• Contribute to the development and enhancement of enterprise research data platforms and data products that accelerate scientific discovery and improve patient outcomes
Qualifications:
Required:
• 5+ years of experience in data engineering, with a focus on building and supporting scalable data pipelines and modern data architectures
• Strong proficiency in SQL and experience with Snowflake or similar cloud data platforms
• Experience with ELT/ETL tools such as dbt, Matillion, or similar frameworks; proficiency in Python for data transformation and pipeline development
• Experience with AWS (S3, compute services), Git, and DevOps workflows
• Understanding of data privacy, security, and regulatory considerations in a research environment (HIPAA, IRB, data use agreements)
• Bachelor’s or Master’s degree in Computer Science, Engineering, Biomedical Informatics, or a related field
Preferred:
• Experience working with healthcare and/or research data, including EHR systems (Epic Clarity, Caboodle preferred), clinical registries, or real-world data sources
• Familiarity with research data standards and models (e.g., OMOP, FHIR, CDISC)
Company:
Montefiore Einstein Technology has begun a cultural transition from a provider of IT services and support to an integral component of Montefiore Medical Center/Systems and its affiliates located in the Bronx, Westchester, and Rockland Counties, with a focus on customer service and on meeting Montefiore Health System's Mission, Vision, and Values. The company is headquartered in Yonkers, NY, US, with a team of 501-1000 employees. The company is currently late stage.