Job Overview:
Pay Range: $71.16/hr - $82.39/hr
Requirements/Must Have:
- Bachelor’s degree in Information Technology, Business, or a related field.
- 5+ years of experience in Data Center projects in an enterprise environment.
- Knowledge of Cisco, Dell, HPE, Supermicro hardware.
- Deep knowledge of Cisco HW, NVIDIA GPU architectures (H100, B200, RTX 6000 Pro), and high-speed interconnects (RoCE v2, InfiniBand).
- Extensive knowledge and experience with Data Center infrastructure.
- Proficiency with asset management and automation tools (Netbox, ServiceNow, Terraform, or OpenTofu).
- Experience in Data Center lifecycle management, DC HW capacity planning, decommissioning, defragmentation, and building complex financial showback models for shared infrastructure.
- Proven expertise in Kubernetes (NKP preferred) and NVIDIA AI Enterprise stacks (GPU Operator, DCGM, Triton, vLLM).
Responsibilities:
- Lead the architectural design and refinement of the client GPU-as-a-Service (GPUaaS) platform, ensuring a seamless experience for internal R&D, QA, and Sales teams.
- Provide technical leadership in key initiatives such as client Validated Designs (NVD) for the AI Factory, incorporating NVIDIA MGX/HGX architectures and high-density Cisco nodes (e.g., UCS 845A).
- Architect the Management Cluster control plane (NKP, Prism Central, NuDeploy) to ensure it is decoupled from GPU compute nodes for maximum efficiency.
- Implement policy-driven placement of workloads across on-prem and cloud-burst environments.
- Design a solution for a centralized Data Center Asset Inventory system, ensuring real-time visibility into all hardware assets, including CPUs, GPUs, Virtual Machines, and networking.
- Develop a comprehensive Hardware Lifecycle Management strategy, including procurement forecasting, 'rack and stack' operationalization, and decommissioning of legacy systems (G3/G4/G5).
- Lead 'Tiger Team' initiatives to navigate supply chain constraints, ensuring critical release milestones are not delayed by hardware shortages.
- Enforce strict Security Standards for Data Center HW Provisioning.
- Implement network segmentation for all critical applications.
- Ensure all infrastructure meets SOC 2 and ISO 27001 compliance objectives while maintaining low-latency performance.
- Provide the required architecture and designs during the project intake process. Review all demands and guide teams toward the right architecture before they become approved projects.
- Partner with the security team to provide guidelines for upcoming projects.
- Participate in and lead special projects as an architect.
Nice to Have:
- Experience managing (as an architect) massive-scale data center environments (1,000+ nodes).
- Knowledge of client Cloud Infrastructure (NCI), AHV, and Prism Central.
- Strong background in MLOps and automated pipeline integration (Kubeflow/MLflow).
Founded in 2010 and headquartered in the Washington, DC metro area, Cynet Systems Inc. is a leading staffing and recruiting powerhouse. Proudly recognized as a nationally and locally certified diversity firm, Cynet delivers agile, scalable talent solutions across industries. With an active footprint in all 50 U.S. states and Canada, we support thousands of consultants through our expansive, high-performing recruitment engine operating across North America and Asia—ensuring speed, quality, and consistency in every hire.