Job Summary:
Deloitte, a leading professional services firm, is seeking a Lead Cloud Integrated Infra Engineer for its Silicon2Service team. The role involves designing and deploying integrated architectures for GPU-accelerated AI factories and high-performance computing infrastructure, collaborating closely with AI specialists and ecosystem partners to deliver effective solutions for clients.
Responsibilities:
• Leading architecture for pursuits and active opportunities, including discovery, requirements, constraints, and target-state design
• Creatively defining reference architectures for on-premises, cloud, and hybrid GPU platforms across compute, network, storage, security, software, and operations
• Driving architecture trade-offs and decisions across performance, scalability, reliability, locality, total cost of ownership, time-to-value, and risk
• Owning the technical solution strategy in proposals and RFPs, including architecture narrative, assumptions, dependencies, sizing guidance, and delivery approach
• Facilitating client workshops and technical reviews and translating engineering detail into executive-ready communications
• Architecting complex, innovative technology solutions with a focus on business outcomes, cost of quality, and long-term scalability and sustainability
• Engaging with C-Suite client leadership during sales and delivery, including leading technical pre-sales discussions, shaping proposals, and supporting the closing of new business opportunities
• Supporting go-to-market strategies, including participation in industry events, conferences, and client briefings
Qualifications:
Required:
• 10+ years of experience in infrastructure architecture or engineering for large-scale platforms, including design, implementation, operations, and optimization
• 4+ years designing or delivering GPU-accelerated platforms for AI, ML, or high-performance computing
• 3+ years Linux system administration in production environments
• 3+ years designing or operating distributed compute clusters for AI/HPC in hybrid cloud setups, including multi-GPU topologies, partitioning, scheduler integration, and scalability for edge-to-cloud workloads
• 2+ years with high-performance networking or storage for AI/HPC
• 2+ years building containerized platforms using Kubernetes or Red Hat OpenShift, including GPU operators/drivers, CUDA container runtime, and cluster lifecycle automation
• 2+ years automating infrastructure as code (IaC) with tools like Terraform and Ansible
• At least 2 end-to-end deployments of reference architectures in the cloud or on-prem, including variants with security controls, network segmentation, operational runbooks, and validation testing
• Experience in pre-sales or sales engineering, including discovery, solution demonstrations, and proposal/RFP contributions
• Ability to travel 50%, on average, based on the work you do and the clients and industries/sectors you serve.
• Limited immigration sponsorship may be available.
Preferred:
• 2+ years implementing AI/HPC cluster scheduling (Slurm and Kubernetes), including multi-tenant queues, quotas, and GPU-aware policies
• 2+ years supporting generative AI infrastructure patterns, including multi-node distributed training
• Experience with AI agents and frameworks
• Experience with high-throughput storage for AI/HPC
• Experience executing NVIDIA co-sell motions with OEMs (Dell, HPE, Lenovo), CSPs (AWS, Azure, Google Cloud), or independent software vendors (Run:ai, OpenShift, Weights & Biases)
Company:
Deloitte is a business consulting company that offers audit, consulting, financial advisory, and tax services. Founded in 1845, the company is headquartered in London, GBR, with a team of 10,001+ employees. The company is currently Late Stage.