Datacenter Operations Manager Jobs in Michigan (NOW HIRING)

AI Infrastructure Engineer

Ann Arbor, MI · On-site +1

$170K - $210K/yr

... operational practices that will scale with the company * Own end-to-end model serving ... Understanding of container orchestration and cluster management (Kubernetes, Docker) * Experience ...

Datacenter Operations Manager information

What are some common challenges faced by Datacenter Operations Managers, and how can they be addressed?

Datacenter Operations Managers often encounter challenges such as maintaining uptime during equipment failures, managing rapid scaling demands, and ensuring robust security protocols. To address these, managers typically implement detailed incident response plans, invest in staff cross-training, and coordinate closely with IT and security teams. Proactive monitoring, regular maintenance schedules, and clear communication channels are also essential for minimizing downtime and ensuring compliance with industry standards.

What is the difference between a Datacenter Operations Manager and a Data Center Technician?

Aspect | Datacenter Operations Manager | Data Center Technician
Credentials | Typically requires management experience, certifications like Cisco CCNA, CompTIA Server+ | Often requires technical certifications such as CompTIA A+, Network+, or vendor-specific training
Work Environment | Oversees data center operations, manages teams, plans infrastructure upgrades | Performs hardware installation, troubleshooting, and maintenance tasks
Employer & Industry Usage | Used by data center and IT service providers for operational oversight | Commonly employed in data centers, telecom, and enterprise IT facilities for technical support

The Datacenter Operations Manager focuses on overseeing overall data center operations, managing teams, and strategic planning. In contrast, the Data Center Technician handles hands-on technical tasks like hardware setup and troubleshooting. Both roles are essential but differ in scope and responsibilities within the data center environment.

What does a Datacenter Operations Manager do?

A Datacenter Operations Manager is responsible for overseeing the daily operations and maintenance of data center facilities. This includes ensuring the reliability, security, and efficiency of all hardware, software, and network systems within the data center. They manage staff, coordinate with other IT teams, handle incident response, and ensure compliance with industry standards and regulations. Their role is crucial for minimizing downtime, optimizing performance, and supporting the business's IT infrastructure needs.

What are the key skills and qualifications needed to thrive as a Datacenter Operations Manager, and why are they important?

To thrive as a Datacenter Operations Manager, you need a strong background in IT infrastructure, systems administration, and facility management, often supported by a degree in computer science or a related field. Familiarity with data center management tools, monitoring systems, and certifications such as ITIL or Data Center Certified Associate (DCCA) is commonly required. Strong leadership, problem-solving, and communication skills are crucial for managing teams, coordinating with stakeholders, and ensuring operational continuity. These competencies are vital for maintaining uptime, optimizing performance, and safeguarding critical business data and services.
What cities in Michigan are hiring for Datacenter Operations Manager jobs?

Cities in Michigan with the most Datacenter Operations Manager job openings:

AI Infrastructure Engineer

Utilidata

Ann Arbor, MI • On-site, Remote

$170K - $210K/yr

Full-time

Medical, Dental, Vision, Retirement, PTO

Posted 15 days ago


Job description

Utilidata is a fast-growing NVIDIA-backed AI company enabling AI data centers to dynamically orchestrate power and unlock more compute capacity from existing energy infrastructure. For over a decade, we have applied AI to the electric grid, bringing real-time visibility and power-flow control to complex energy infrastructure. Our Karman platform, built on a custom NVIDIA module, brings that same capability to AI data centers, giving operators a way to better use the power already available to them.
The AI Infrastructure Engineer is responsible for designing, building, and owning the end-to-end infrastructure that serves Utilidata's AI and ML models across edge deployments, cloud environments, and data center integrations. They are also responsible for designing, building, and owning the integration of power data with AI inference software. This is Utilidata's first dedicated role of this kind, and will serve as the foundational function for how the company deploys and operates AI capabilities in production. The role requires deep technical expertise in ML model serving, distributed systems, and GPU infrastructure, with a strong emphasis on reliability, performance, and scalability. This position works cross-functionally with product, engineering, and data science teams and is open to fully remote candidates, with periodic travel expected for company retreats and key on-site engagements.
Responsibilities
  • Lead the design and build of Utilidata's AI inference platform, establishing architecture patterns, deployment standards, and operational practices that will scale with the company
  • Own end-to-end model serving infrastructure for Utilidata's AI infrastructure (on-prem and datacenter)
  • Build and maintain fault-tolerant, high-performance systems for serving AI models at scale, with a focus on low latency, reliability, and cost efficiency
  • Collaborate closely with algorithms engineers to integrate AI inference data and configuration with power optimization algorithms
  • Optimize GPU utilization and inference performance across our hardware fleet, including NVIDIA accelerators central to Utilidata's edge AI platform
  • Establish MLOps best practices including CI/CD pipelines for model deployment, monitoring, and rollback across environments
  • Contribute to infrastructure roadmap decisions, including build vs. buy tradeoffs, tooling selection, and platform evolution as the team grows

Minimum Qualifications
  • 5+ years of software engineering experience with a strong focus on AI infrastructure, backend systems, or distributed systems
  • Hands-on experience with AI model serving frameworks (e.g., vLLM, SGLang, Triton, TensorRT, TorchServe, or similar)
  • Understanding of container orchestration and cluster management (Kubernetes, Docker)
  • Experience deploying and operating infrastructure across both datacenter and on-prem environments
  • Strong knowledge of GPU workloads and the tradeoffs that come with them: you understand how inference differs from training, and why it matters
  • Proficiency in Python; C++, CUDA, Go, Rust a plus
  • Excellent communication skills and comfort working cross-functionally in a lean, fast-moving environment
  • Willingness to travel up to 10% of time

Enhanced Qualifications (Nice to Have)
  • Dynamo experience a plus
  • Experience with edge AI deployments or constrained compute environments
  • Familiarity with infrastructure as code (Terraform, Helm)
  • Experience with observability platforms (Datadog, Prometheus, Grafana)
  • Background in energy, utilities, or industrial IoT
  • Contributions to open-source ML infrastructure projects

Salary Range: $170,000 to $210,000 base compensation depending on experience plus stock options. Salary will be commensurate with an individual's skills, training, years of experience, and in line with internal compensation bands.
Location: This position can be performed remotely from anywhere in the United States.
Our Commitments:
Utilidata values the diversity of our team. We provide equal employment opportunities without regard to race, color, religion, creed, sex, gender, sexual orientation, gender identity or expression, national origin, age, physical disability, mental disability, medical condition, pregnancy or childbirth, genetics, genetic information, marital status, or status as a covered veteran or any other basis protected by applicable federal, state and local laws.
We are committed to:
  • Creating a diverse and inclusive workplace that is welcoming, supportive, affirming and respectful
  • Empowering employees to solve problems and work together to make a difference
  • Providing mentorship and growth opportunities as part of a collaborative team
  • A flexible work environment with flexible paid time off
  • Competitive compensation and benefits, including health, dental, vision, and employer-match 401k