Job Summary:
TensorWave is dedicated to delivering seamless and reliable AI compute at scale through its cloud platform. The company is seeking a DevOps Software Engineer to design and build integrations between internal platform services and infrastructure systems, with a focus on automation and orchestration across platforms.
Responsibilities:
• Design and build integrations between internal platform services (e.g., messaging/pub-sub systems), infrastructure systems (compute, storage, networking), and third-party vendor platforms
• Develop services and tools that enable automation workflows, system coordination and orchestration, and event-driven infrastructure operations
• Write production-quality code in Go, Python, and Rust (where applicable)
• Build APIs, services, and background workers that interact with infrastructure platforms, CI/CD systems, and automation frameworks
• Ensure code is reliable, observable, and maintainable
• Integrate software with automation systems such as Ansible, Terraform, and CI/CD pipelines (GitHub Actions, ArgoCD)
• Enable infrastructure workflows through APIs, event-driven systems, and automation hooks
• Work closely with DevOps engineers (infrastructure and automation) and development teams (application requirements)
• Translate infrastructure capabilities into usable APIs and services
• Help teams integrate their systems into platform workflows
• Build logging, metrics, and tracing into services
• Debug and resolve issues across distributed systems
• Ensure integrations are resilient and handle failure scenarios gracefully
• Identify gaps in platform integration and automation
• Build tooling that reduces manual work and improves system cohesion
• Contribute to standards for internal platform development
Qualifications:
Required:
• 5+ years of experience in software engineering, DevOps development, or platform engineering
• Strong programming experience in Go and/or Python; Rust is a strong plus
• Experience building APIs, services, and system integrations
• Strong understanding of distributed systems concepts and event-driven architectures
• Experience working with Linux systems and infrastructure platforms
Preferred:
• Experience integrating with infrastructure platforms (compute, storage, networking) and Kubernetes environments
• Familiarity with message queues or pub/sub systems and CI/CD systems (GitHub Actions, ArgoCD)
• Experience with automation frameworks such as Ansible
• Experience working with third-party APIs and vendor platforms
• Exposure to infrastructure at scale or cloud service provider (CSP) environments
Company:
TensorWave is an AMD-exclusive cloud platform that leverages AMD Instinct GPUs and ROCm for high-performance AI workloads. Founded in 2023, the company is headquartered in Las Vegas, USA, with a team of 51-200 employees. The company is currently in the growth stage.