Job Summary:
Cedar is a leading healthcare technology company focused on improving the patient billing experience. The company is seeking a Senior Data Engineer to design and manage scalable data pipelines, improve data quality, and collaborate with teams across the business to ensure accurate and efficient data delivery.
Responsibilities:
• Design, build, and own scalable ELT/ETL pipelines that power core use cases including client billing, financial reporting, product analytics and data services for downstream teams (Finance, Data Science, Commercial Analytics, Product).
• Modernize legacy data flows by migrating SQL- and Liquibase-based transformations into dbt, with robust testing, documentation and data contracts.
• Improve reliability and observability of our data platform by implementing best practices in testing, monitoring, alerting and runbook-driven operations for pipelines orchestrated via Airflow or similar tools.
• Model data for usability and performance in Snowflake and other systems, applying dimensional and domain-driven design patterns where appropriate (e.g., for analytics core models and financial engineering services).
• Partner closely with product, finance, analytics and integrations teams to understand requirements, define interfaces, and ensure data is accurate, well-documented, and delivered in the right form and cadence for consumers.
• Contribute to Cedar’s data platform vision by helping decouple data infrastructure from data services, establishing standards for governance, metadata, and access, and piloting tools like OpenMetadata and data quality frameworks.
• Provide technical mentorship to other engineers, raising the standard of our data engineering practices in areas like code quality, reviews, architecture, and operational excellence.
• Balance short-term delivery with long-term architecture, making pragmatic trade-offs while moving us toward a clear “North Star” data platform that supports emerging use cases like AI/ML, personalization and experimentation.
Qualifications:
Required:
• 5+ years of hands-on data engineering (or closely related software engineering) experience, including ownership of production data pipelines and systems at scale.
• Strong SQL and Python proficiency, with experience building data transformations, utilities and tooling (e.g., dbt models, Airflow DAGs, internal libraries).
• Deep experience with modern data stack tools, including several of: Snowflake (or similar cloud data warehouse), dbt, Airflow/Dagster (or similar orchestrator).
• Proven track record designing and operating reliable pipelines, including testing strategies (unit/integration/dbt tests), monitoring, alerting, and incident/root-cause analysis for data issues.
• Experience with data modeling and schema design for analytics, reporting and operational use cases (e.g., dimensional models, entity-centric designs, or medallion-style architectures).
• Familiarity with cloud platforms, ideally AWS (e.g., use of S3, IAM, containerized workloads, or related infrastructure supporting data workloads).
• Strong collaboration and communication skills, with the ability to translate ambiguous business problems into clear technical requirements and to work effectively with partners across engineering, product and business teams.
• High ownership and bias to action in complex, evolving environments: comfortable operating with partial information, making trade-offs explicit, and driving work to completion.
Preferred:
• Experience with metadata and data governance tools, such as OpenMetadata, DataHub or similar catalogs, and implementing data contracts or quality frameworks (e.g., Great Expectations, dbt tests).
• Exposure to streaming and event-driven data pipelines (e.g., Kafka, CDC tools) and integrating those into warehouse-centric architectures.
• Prior experience in healthcare, fintech, or other highly regulated domains, particularly with standards like HL7 or FHIR, or with complex billing/financial data flows.
• Familiarity with analytics and visualization tools (e.g., Looker, Hex) and enabling self-serve analytics through well-designed semantic layers and models.
• Experience helping define team-level standards, patterns, and roadmaps for data engineering or platform teams.
Company:
Cedar is a patient payment and engagement platform for hospitals, health systems, and medical groups that elevates the patient experience. Founded in 2016, the company is headquartered in New York, USA, with a team of 201-500 employees. The company is currently at the growth stage.