Overview

We are seeking a Data & Software Engineer to work with a small team to build complex data flows for a custom application. The successful candidate will have advanced Python programming skills, familiarity with Java, an understanding of data security, privacy, governance, and compliance principles, and a demonstrated history of building production data pipelines and ETL workflows at scale.

Candidate must have experience:
• Building end-to-end data pipelines leveraging Python
• Using orchestration tools to deploy data pipelines, including configuring and updating Spark jobs
• Containerizing and deploying applications in cloud environments like AWS
• Working with MySQL and PostgreSQL, including performance tuning, schema design, and query optimization for complex analytical workloads
• Leveraging industry-standard tools for code control (Git, IaC control, etc.)
• Working with data catalogs, tracking data lineage, and handling a variety of data formats, including geospatial
• Using Bash scripting for automation and data processing tasks
• Integrating AI/ML services and models

What will you do?
• Work with stakeholders to understand data requirements, assess feasibility, and design appropriate solutions with minimal oversight
• Leverage strong problem-solving and debugging skills for data quality issues, pipeline failures, and performance bottlenecks
• Leverage a background in large-scale data migration or platform modernization efforts
• Contribute to data engineering documentation, best practices, and design patterns

Do you have what it takes?
* Active TS/SCI with polygraph required.
* Bachelor's degree in Computer Science, Engineering, Finance, or a related technical field, or equivalent practical experience.
* Minimum of 5 years' experience with:
  • Apache Spark & PySpark
  • Advanced Python skills (including Pandas & NumPy)
  • Docker, Podman
  • AWS S3, Lambda & Step Functions
  • Apache Iceberg, Airflow, etc.
  • SQL (with Trino)
  • NoSQL, DynamoDB
  • Unity Catalog OSS, Apache Polaris
  • Apache Superset
  • Terraform or CloudFormation
  • OpenLineage
  • H3, PostGIS