Other
Medical, Dental, Vision
Posted 23 days ago
Job description
Hydra Host is a Founders Fund-backed NVIDIA cloud partner building the infrastructure platform that powers AI at scale. We connect AI Factories - high-performance GPU data centers - with the teams that depend on them: research labs training foundation models, enterprises running production inference, and developer platforms demanding scalable compute capacity.
This isn't a traditional support role. We're solving hard problems at the intersection of physical infrastructure and digital platforms. When a GPU cluster goes down, when an AI team is experiencing network degradation, when an SLA on an enterprise contract is breached, you're the person who ensures accountability, drives resolution, and documents what went wrong so it never happens again.
We've scaled fast, and our support systems need to catch up. You'll build them almost from scratch - establishing processes, defining SLAs, implementing tooling, and creating the operational discipline that ensures our infrastructure runs with the same precision as the silicon we deploy.
The AI infrastructure layer is being built right now. Companies are scrambling to secure GPU capacity, deploy clusters, and monetize excess compute. We're at the center of that transformation.
You'll work with world-class customers deploying next-gen AI hardware. You'll help solve real infrastructure challenges - not hypothetical SaaS edge cases. And you'll do it alongside a team that values craftsmanship and moving fast without breaking things (especially GPUs).
What you'll do
- Handle multi-tier support operations. Manage support for demand-side users (developers and ML teams using GPU compute), enterprise clients (companies deploying private infrastructure), and supply-side operators (data centers running Brokkr). Each has different SLAs, escalation paths, and support needs
- Work with engineering to solve hard problems. Partner with infrastructure engineers and platform developers to troubleshoot complex issues. You'll figure out what's broken, pull in the right people, get to resolution, and make sure the same problem doesn't happen again
- Own vendor accountability. When hardware fails, firmware has issues, or deliveries are delayed, you're the quarterback. Track vendor performance, escalate issues, negotiate remediation, and ensure our customers aren't left holding the bag for supplier problems
- Manage SLA compliance and breaches. Define, track, and enforce SLAs across customer tiers and service types. When breaches occur, coordinate incident response, manage customer communication, and drive postmortem processes
- Build support infrastructure. Establish ticketing systems, escalation matrices, on-call rotations, playbooks, and knowledge bases. Create the scaffolding that lets support scale from hundreds of customers to tens of thousands without breaking
- Scale the function. As we grow, hire and mentor a support team. Build the culture and operating principles for support at Hydra Host
- 4+ years in customer support, success, or operations roles at technical B2B companies
- Experience supporting infrastructure, critical, or physical products where uptime and reliability matter
- Track record of building support processes and systems and scaling them through rapid growth
- Comfortable with technical concepts: APIs, server infrastructure, networking basics, cloud platforms
- Stellar communication skills - you can explain complex technical issues clearly to non-technical stakeholders
- Bias toward action and ownership. You see a problem, you fix it
- Experience in AI/ML infrastructure, GPU compute, or data center operations
- Experience implementing systems automation, AI agents, and support bots
- Familiarity with modern support & CRM platforms (Zendesk, Intercom, Pylon, Linear, HubSpot)
- Prior experience managing or mentoring support teams
- Customer obsession - You genuinely care about solving customers' problems, communicate proactively, and never leave customers in the dark
- Principled thinking - You make decisions based on clear principles, not politics or convenience. When there's ambiguity, you fall back on what's right for the customer and the business long-term
- Technical curiosity - You love learning how things work, even when it's outside your immediate domain
- Systems thinking - You don't just solve one-off issues; you identify root causes and build solutions that prevent them from recurring
- Equity ownership - Meaningful stake in what we're building together
- Competitive salary - We pay fairly and transparently
- Healthcare coverage - Medical, dental, vision for you and your dependents
- Fully remote team - Remote-first with hubs in Phoenix, Boulder, and Miami, plus periodic team offsites
- Direct impact - Your work will shape how thousands of GPU clusters get deployed and operated. Early team means your fingerprints are on everything
About Hydra Host
Sourced by ZipRecruiter
Industry
Software development
Company size
11 - 50 Employees
Headquarters location
Miami, FL, US
Year founded
2021