Job Overview:
Pay Range: $71.16/hr - $82.39/hr
Requirements/Must Have:
- Bachelor's degree in Information Technology, Business, or a related field.
- 5+ years of experience in Data Center projects in an enterprise environment.
- Knowledge of Cisco, Dell, HPE, Supermicro hardware.
- Deep knowledge of Cisco HW, NVIDIA GPU architectures (H100, B200, RTX 6000 Pro), and high-speed interconnects (RoCE v2, InfiniBand).
- Extensive knowledge and experience with Data Center infrastructure.
- Proficiency with asset management and automation tools (Netbox, ServiceNow, Terraform, or OpenTofu).
- Experience in Data Center lifecycle management, DC HW capacity planning, decommissioning, defragmentation, and building complex financial showback models for shared infrastructure.
- Proven expertise in Kubernetes (NKP preferred) and NVIDIA AI Enterprise stacks (GPU Operator, DCGM, Triton, vLLM).
Responsibilities:
- Lead the architectural design and refinement of the client GPU-as-a-Service (GPUaaS) platform, ensuring a seamless experience for internal R&D, QA, and Sales teams.
- Provide technical leadership in key initiatives such as client Validated Designs (NVD) for the AI Factory, incorporating NVIDIA MGX/HGX architectures and high-density Cisco nodes (e.g., UCS 845A).
- Architect the Management Cluster control plane (NKP, Prism Central, NuDeploy) to ensure it is decoupled from GPU compute nodes for maximum efficiency.
- Implement policy-driven placement of workloads across on-prem and cloud-burst environments.
- Design a solution for a centralized Data Center Asset Inventory system, ensuring real-time visibility into all hardware assets, including CPUs, GPUs, Virtual Machines, and networking.
- Develop a comprehensive Hardware Lifecycle Management strategy, including procurement forecasting, 'rack and stack' operationalization, and decommissioning of legacy systems (G3/G4/G5).
- Lead 'Tiger Team' initiatives to navigate supply chain constraints, ensuring critical release milestones are not delayed by hardware shortages.
- Enforce strict Security Standards for Data Center HW Provisioning.
- Implement network segmentation for all critical applications.
- Ensure all infrastructure meets SOC 2 and ISO 27001 compliance objectives while maintaining low-latency performance.
- Provide required architecture and designs during the project intake process. Review all demands and guide the teams toward the right architecture before they become approved projects.
- Partner with the security team and provide guidelines for upcoming projects.
- Participate in and lead special projects as an architect.
Nice to Have:
- Experience managing (as an architect) massive-scale data center environments (1,000+ nodes).
- Knowledge of client Cloud Infrastructure (NCI), AHV, and Prism Central.
- Strong background in MLOps and automated pipeline integration (Kubeflow/MLflow).
Founded in 2010 and headquartered in the Washington, DC metro area, Cynet Systems Inc. is a leading staffing and recruiting powerhouse. Proudly recognized as a nationally and locally certified diversity firm, Cynet delivers agile, scalable talent solutions across industries. With an active footprint in all 50 U.S. states and Canada, we support thousands of consultants through our expansive, high-performing recruitment engine operating across North America and Asia, ensuring speed, quality, and consistency in every hire.