Data Pipeline Engineer
Location: Remote (US / India / Global - based on project needs)
Employment Type: Contract / Full-Time
Industry: Healthcare, Life Sciences, Data & AI
Experience: 5-10+ years (flexible based on strength)
About BigRio
BigRio is a trusted data, analytics, AI, and cloud consulting partner specializing in healthcare and life sciences. We help organizations modernize data platforms, build scalable pipelines, and unlock insights through cloud-native and AI-driven solutions. Our teams work on high-impact, real-world problems involving healthcare data, compliance, and advanced analytics.
Role Overview
We are seeking a Data Pipeline Engineer to design, build, and maintain scalable, reliable, and secure data pipelines supporting analytics, reporting, and AI/ML initiatives. This role focuses on ingesting, transforming, and delivering high-quality data across cloud platforms, with a strong emphasis on healthcare and regulated data environments.
Key Responsibilities
- Design, develop, and maintain end-to-end data pipelines (batch and streaming) for structured and unstructured data
- Build robust ETL / ELT workflows to ingest data from multiple sources including APIs, databases, files, and third-party systems
- Implement data transformations, validations, and quality checks to ensure accuracy and reliability
- Optimize pipeline performance, scalability, and cost efficiency
- Work closely with data analysts, BI engineers, data scientists, and product teams to support downstream analytics and AI use cases
- Ensure data pipelines comply with security, privacy, and HIPAA requirements where applicable
- Monitor pipelines, troubleshoot failures, and implement alerting and recovery mechanisms
- Contribute to data architecture decisions, documentation, and best practices
Required Qualifications
- 5+ years of experience building and supporting data pipelines in production environments
- Strong experience with SQL and data modeling concepts
- Hands-on experience with ETL/ELT frameworks and orchestration tools
- Experience working with cloud platforms (Azure, AWS, or GCP)
- Proficiency with data processing tools such as Azure Data Factory, Databricks, Spark, Airflow, or similar
- Experience integrating data from APIs, flat files, relational databases, and cloud storage
- Strong understanding of data quality, lineage, and pipeline reliability
- Excellent problem-solving and communication skills
Nice to Have
- Healthcare domain experience (provider, payer, clinical, claims, PHI data)
- Experience with streaming data (Kafka, Event Hub, Kinesis, etc.)
- Exposure to Snowflake, BigQuery, Redshift, or other cloud data warehouses
- Familiarity with Python or Scala for data processing
- Experience supporting BI tools (Power BI, Tableau, Looker)
- Knowledge of CI/CD, DevOps, and Infrastructure-as-Code