Senior Data Engineer
New York, NY
Are you an experienced Data Engineer who designs and implements innovative solutions that leverage modern tools and technologies?
Do you share our passion for enabling positive change within healthcare and helping patients with chronic conditions like diabetes?
If so, you could be a perfect fit for our team of like-minded professionals who share a common mission, a passion for helping others, and a desire to build a great company. Come work with us to build our next-generation healthcare platform!
Cecelia Health is a high-growth, venture-backed healthcare company based in New York City. We partner with pharmaceutical and device companies, payers, and ACOs to deliver personalized, technology-enabled coaching that improves treatment adherence and health outcomes for people living with diabetes and other chronic conditions. Cecelia Health is a high-energy, results-oriented workplace that believes our success, as well as the success of our customers and patients, relies primarily on a fantastic team with the passion, drive, and skills to change the face of chronic condition management.
We are hiring a Senior Data Engineer in New York City. This role will report to our Chief Technology Officer and join a strong technology team that is continuously improving the solutions that allow clinicians to efficiently serve a growing volume of patients.
WHO YOU ARE
You have data in your DNA and at least 5 years of data engineering experience applying analytic skills to integrate data into business operations. You are an expert at using modern technology stacks to build intelligent ETL and data processing pipelines. You’ve worked in a variety of data wrangling roles and have strong knowledge of data manipulation using advanced SQL and other tools. You have a solid understanding of engineering best practices, as well as familiarity with scaling DevOps and engineering initiatives. You want to help develop software platforms that have a real-world impact on people with chronic diseases like diabetes.
WHAT YOU’LL DO
- Define and lead data lifecycle strategy across data acquisition, data ingestion, data cleansing, normalization and linkage
- Apply various techniques to produce solutions to large-scale optimization problems, including data pre-processing, indexing, blocking, field and record comparison and classification
- Improve data sharing, increase data reuse and streamline data management efforts
- Build best practices to support tracking chain of custody of data so it can be easily traced back to the source for accuracy and consistency
- Design, develop, and document data models and ETL processes that feed various databases, including a data warehouse/lake and analytics/data science components
- Support and enhance existing applications and automation processes on a daily basis using a combination of SSIS, C#, .NET Core, Python, PowerShell, SQL Server, and Snowflake databases
- Ensure production service levels, performance quality, and resolution of data load failures
- Manage multiple projects independently
- Evaluate and recommend tools, technologies and processes to ensure the highest quality product platform
WHAT YOU’LL NEED
- Bachelor’s degree in information science, computer science, engineering or a similar area, with 5+ years of related experience, or comparable real-world development experience
- Master Data Management experience including data consolidation, linkage, federation and dissemination
- Strong knowledge of Python for Data Engineering
- Advanced SQL experience (Nested Queries, Complex Joins, Analytic Functions, Time Series)
- Experience working in an Agile/DevOps environment with continuous integration, continuous deployment, and application lifecycle management
- Strong communication skills
- Data integration, application development and secure information management in healthcare, life sciences, or clinical research
- Experience with scalable, enterprise-level development on cloud (AWS, Azure, GCP) infrastructure
- Experience with a cloud-based data warehouse such as Snowflake or Redshift
- Familiarity with analytics and business intelligence tools (Tableau, Power BI, Cognos)
- Experience with real-time processing and big data tools (Hive, Spark, Hadoop, HDFS, Kafka, Lucene, Glue, Airflow, etc.)