Hadoop Admin w/Hbase & AWS exp (REMOTE) - 1120
(Multiple states) - Full Time
Remote but may have to travel up to 25% to client sites
Our client is looking for an experienced Hadoop Administrator with quantifiable HBase experience that can become an integral part of huge-scale Hadoop implementation projects including Hadoop cluster deployments, upgrades, DR, HA, migrations, capacity planning and on-going cluster administration and management.
- Manage and maintain production mission critical large-scale Hadoop clusters with hundreds of nodes.
- Configure, tune, manage and operate large-scale HBase deployments.
- Migrate Hadoop clusters from on-premise to cloud environments.
- Deploy Hadoop clusters in the cloud on AWS infrastructure using both EC2 and EMR.
- Plan for capacity upgrading or downsizing as and when the need arises.
- Secure and harden Hadoop clusters containing PIIs and other sensitive information
- Be proficient in Linux scripting and also in Hive, Oozie, and HCatalog
- 3+ years of hands-on experience in managing production Hadoop clusters including all infrastructure aspects such cluster provisioning & administration. Experience can be either Cloudera or Hortonworks based.
- 3+ years of experience in monitoring and supporting production 24x7 Hadoop environments.
- 3+ years’ experience as a Linux administrator including scripting, monitoring, deployments, installations, security and operations.
- 3+ years’ experience supporting and maintaining production HBase clusters.
- 3+ years of in-depth and hands-on experience working with the Hadoop ecosystem query engines such as Impala, HIVE, Tez, Presto and experience in data modeling including table partitioning models, deep understanding of Parquet file, SequenceFiles, etc.…
- 1+ years of demonstratable experience in running production Hadoop workloads using public cloud infrastructure in Amazon Web Services, Azure or GCP.
- Strong familiarity and hands-on experience with cloud computing concepts and technologies, such as EC2, EMR, S3, Glue, etc. and the related Hadoop best-practices on deploying Hadoop on such infrastructure.
- Experience in planning Hadoop for production including sizing and pre/post deployment processes.
- Experience in data replication across Hadoop clusters using DistCP or alternative solutions.
- Experience with Amazon EMR – a huge plus.