We are seeking a highly skilled Data Engineer with strong expertise in Python, Google Cloud Platform (GCP), and AI-enabled data solutions. The ideal candidate will build scalable data pipelines and support advanced analytics initiatives within the healthcare domain, with a focus on Medicare Part D data and patient safety outcomes. The ideal profile reflects a high level of technical proficiency, problem-solving ability, and domain understanding, particularly in building robust data solutions, working with complex healthcare datasets, and collaborating effectively across data science, clinical, and business teams.
- Technical Skills
- Strong hands-on programming experience in Python/PySpark (mandatory)
- Expertise in:
- SQL (advanced querying, performance tuning)
- Data modeling (star/snowflake schemas)
- Hands-on experience with GCP data services (BigQuery, Dataproc)
- Experience with distributed processing frameworks (e.g., Apache Spark)
- Familiarity with CI/CD pipelines and DevOps practices
- AI & Machine Learning
- Experience supporting ML/AI workflows and pipelines
- Healthcare Domain Knowledge (Strongly Preferred)
- Experience working with healthcare datasets (claims, EHR, clinical data)
- Familiarity with Medicare/Medicaid data structures and reporting
- Understanding of value-based care and quality measures
- Patient Safety Knowledge (Preferred)
- Knowledge of patient safety frameworks and indicators
- Experience supporting:
- Quality reporting programs (e.g., CMS measures)
- Clinical risk and compliance analytics
- Data Engineering & Pipeline Development
- Design, develop, and maintain scalable ETL/ELT pipelines for structured and unstructured healthcare data.
- Build robust data ingestion frameworks from multiple sources (medical claims, Rx claims, membership, etc.).
- Ensure data quality, integrity, and governance across all pipelines.
- Optimize data workflows for performance, reliability, and cost efficiency on GCP.
- AI & Advanced Analytics Enablement
- Collaborate with data scientists to operationalize AI/ML models in production environments.
- Develop feature pipelines and data transformations for machine learning use cases.
- Support use cases such as:
- Patient risk scoring
- Quality and safety analytics
- Cloud & GCP Engineering
- Build and manage data infrastructure using GCP services such as:
- BigQuery
- Cloud Composer (workflow orchestration)
- Cloud Storage
- Dataproc / Spark
- Implement data lake and data warehouse architectures on GCP.
- Ensure compliance with HIPAA and healthcare data security standards.
- Healthcare & Medicare Data Management
- Work with Medicare datasets, including:
- Claims data (Part D)
- Provider and beneficiary data
- Enable analytics for quality measures, patient outcomes, and regulatory reporting.
- Patient Safety & Compliance
- Develop solutions to monitor and improve patient safety indicators (PSIs) and care quality.
- Build data models supporting:
- Adverse event detection
- Medication safety
- Clinical quality measures
- Ensure compliance with healthcare regulations and data privacy standards.