Azure Databricks Developer
Location: Louisville, KY / Chicago, IL / Dallas, TX / Arlington, VA
Full-time Only
Must-Have Technical/Functional Skills
• Design, develop, and maintain cloud-native data engineering solutions using Azure Databricks.
• Build and manage PySpark notebooks to process large-scale structured and semi-structured datasets.
• Design, create, and maintain Delta Lake tables, ensuring data reliability, ACID transactions, and schema enforcement.
• Develop scalable data workflows and pipelines using Databricks notebooks and orchestration patterns.
• Optimize performance of Spark jobs, including tuning partitions, memory usage, caching strategies, and query execution.
• Work extensively with PySpark and Spark SQL, choosing the appropriate approach based on use case and performance needs.
• Support cloud data migration initiatives, migrating data pipelines from on-premises or legacy platforms to Azure Databricks.
• Integrate Databricks with upstream and downstream systems (e.g., data sources, storage layers, reporting tools).
• Ensure data pipelines are robust, reusable, and maintainable, following enterprise data engineering best practices.
• Implement error handling, logging, monitoring, and recovery strategies for production-grade data pipelines.
• Collaborate with data architects, analysts, and downstream consumers to understand data requirements.
• Perform debugging and root-cause analysis of data quality issues, performance problems, and pipeline failures.
• Support testing, validation, and reconciliation of data during development, migration, and production phases.
• Follow security, governance, and compliance standards applicable to cloud data platforms.
• Actively participate in Agile/Scrum delivery, owning data engineering stories from development through deployment.
• Maintain documentation for notebooks, workflows, data models, and migration approaches.
Roles & Responsibilities
• Develop and maintain data engineering solutions using Azure Databricks and PySpark.
• Create, enhance, and optimize Databricks notebooks for data ingestion, transformation, and aggregation.
• Design and manage Delta Lake tables and pipelines supporting analytics and reporting use cases.
• Support cloud data migrations, including data validation and performance benchmarking.
• Optimize Spark jobs for performance, scalability, and cost efficiency.
• Collaborate with platform, DevOps, and data governance teams to ensure environment stability.
• Perform data pipeline testing and validation, ensuring correctness and completeness.
• Troubleshoot and resolve issues related to Spark jobs, Delta tables, and workflow execution.
• Participate in code reviews and enforce data engineering best practices.
• Support production deployments and post-deployment stabilization.
• Provide inputs to data architecture and platform improvement initiatives.
• Mentor junior data engineers when required.