Please strictly adhere to the following resume naming convention:
ALL CAPS, NO SPACES BETWEEN UNDERSCORES
PTN_US_GBAMSREQID_CandidateBeelineID
e.g. PTN_US_9999999_SKIPJOHNSON0413
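The convention above can be applied mechanically. The following is a minimal sketch, not an official tool: the class name `ResumeName` and the method `build` are hypothetical, and it assumes the requisition ID and candidate Beeline ID are supplied as plain strings.

```java
import java.util.Locale;

class ResumeName {
    // Hypothetical helper: joins the fixed PTN_US prefix, the GBaMS ReqID,
    // and the candidate Beeline ID with underscores, then removes all
    // whitespace and upper-cases the result.
    static String build(String reqId, String beelineId) {
        String raw = "PTN_US_" + reqId + "_" + beelineId;
        return raw.replaceAll("\\s+", "").toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Prints PTN_US_9999999_SKIPJOHNSON0413
        System.out.println(build("9999999", "SkipJohnson0413"));
    }
}
```

Using `Locale.ROOT` keeps the upper-casing independent of the machine's default locale.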
Bill Rate: market rate-market rate/hr
MSP Owner: Kelly Gosciminski
Location: Boston, MA - onsite
Duration: 6 months
GBaMS ReqID: 10538201
A Databricks Developer with Java expertise designs, builds, and maintains large-scale data processing solutions and AI/ML platforms, primarily using Java and Apache Spark on the Databricks platform. The role involves developing scalable data pipelines, optimizing Spark jobs, and ensuring data governance and security within cloud environments like AWS or Azure.
Core Responsibilities
• Data Pipeline Development: Design, develop, and maintain scalable ETL/ELT processes and data pipelines using Java and Spark on Databricks.
• Performance Optimization: Implement and tune Spark jobs to enhance performance, stability, and cost-efficiency in distributed systems, troubleshooting issues like data skew and memory errors.
• Cloud Integration: Work closely with cloud-native services (AWS/Azure IAM, Storage, Networking) and integrate Databricks offerings within the cloud infrastructure.
• Automation & CI/CD: Develop automation capabilities using Java APIs and Infrastructure-as-Code (IaC) tools like Terraform for platform provisioning, orchestration, and monitoring.
• Collaboration & Support: Partner with data scientists, ML engineers, and business teams to gather requirements, define compute needs, and provide support for production environments.
• Governance & Security: Enforce data governance policies, security standards (RBAC, encryption), and compliance requirements using tools like Unity Catalog and Delta Lake.
• Code Quality: Write clean, efficient, high-quality Java code following best practices, including participating in code reviews.
Required Skills & Qualifications
• Programming Languages: Strong proficiency in Java (Java 8 or higher) and experience with Spark fundamentals (DataFrames, SQL, RDDs).
• Big Data Technologies: Hands-on experience with Databricks workspace management, clusters, jobs, Delta Lake, MLflow, and Unity Catalog.
• Cloud Platforms: Deep understanding of a major cloud provider's infrastructure, such as AWS or Azure.
• Tools & Methodologies: Experience with CI/CD pipelines (GitHub Actions, Azure DevOps), Terraform, monitoring tools (Grafana, Prometheus), and Agile methodologies.
• Experience: Typically requires 5+ years of experience in software or data engineering.
• Certifications: A Databricks Certified Professional Data Engineer certification may be required or preferred.
Technical Requirements
• Programming Languages: Java, J2EE
• Cloud Technologies: AWS, Azure
• Frameworks: Struts, Spring, Spring Boot, Microservices, Kafka, Spark
• Databases: Oracle, MySQL, MongoDB, HBase, and DB2
• Web/App Servers: WebLogic, Tomcat, WebSphere
• Web Technologies: ReactJS / Angular
• Build/ETL Tools: Maven, Jenkins, Pentaho, Databricks