Job Summary:
Bandwidth Inc. is a global software company that helps enterprises deliver exceptional experiences through voice, messaging, and emergency services. The company is seeking an Applied AI Engineer to identify and implement AI solutions across internal systems, enhancing operations and infrastructure. This role involves developing AI platforms, automating workflows, and establishing best practices for AI integration within the Corporate IT Engineering team.
Responsibilities:
• Own and extend existing AI platforms and tooling, improving reliability, expanding capabilities, and integrating them more deeply with internal systems.
• Architect and build internal API layers and shared services that allow AI workflows and internal applications to publish, version, and retrieve outputs across the engineering ecosystem.
• Identify and build AI-powered tooling that creates leverage across the Corporate IT Engineering stack, including infrastructure, identity, monitoring, and automation platforms.
• Develop and iterate on proofs of concept that demonstrate how AI can augment or automate internal workflows, from anomaly detection in infrastructure logs to AI-assisted documentation and IT troubleshooting.
• Containerize and orchestrate AI workloads using Docker and Kubernetes, ensuring reliable and reproducible deployments across environments.
• Automate infrastructure provisioning and configuration using Terraform and Ansible, following infrastructure-as-code best practices.
• Establish AI development patterns and best practices across the Corporate IT Engineering organization, helping teams adopt AI capabilities effectively and responsibly.
• Stay current with the evolving AI and MLOps landscape and bring relevant advancements back to the team.
Qualifications:
Required:
• AI & Application Development
• Hands-on experience owning or extending LLM-powered platforms, including RAG pipeline development, prompt engineering, and integrating LLM APIs into production internal systems.
• Expert-level knowledge of AI infrastructure, including model serving, inference optimization, GPU/CPU resource management, and MLOps pipelines.
• Experience designing and building internal API layers or shared platform services that multiple teams and systems publish to and consume from.
• Proficiency in Python and/or TypeScript for building integrations, scripts, and lightweight internal services.
• Experience working with REST APIs and building integrations across a diverse internal tooling ecosystem.
• Strong AWS experience: proficiency in core services (EC2, ECS/EKS, S3, RDS, Lambda, IAM, VPC) and experience architecting and operating production workloads on AWS.
• Deep Docker and Kubernetes expertise: hands-on experience containerizing applications, writing Dockerfiles, managing multi-container deployments, and orchestrating workloads with Kubernetes (EKS or self-managed).
• Deep Terraform and Ansible expertise: experience writing and maintaining Terraform modules for cloud infrastructure, and using Ansible for configuration management and automation.
• Experience with GitHub for version control, pull request workflows, branching strategies, and CI/CD integration.
• Experience with Artifactory for artifact management, including publishing and consuming build artifacts, Docker images, and package registries.
• An experimental mindset: comfortable inheriting imperfect systems, iterating quickly, and improving as you go.
• Ability to evaluate AI capabilities through a business lens, understanding not just what’s possible but what creates real value for internal teams and the organization.
• Strong communication skills and the ability to explain AI concepts and tradeoffs to non-technical stakeholders.
• A collaborative, team-first approach with a genuine curiosity about where AI is headed.
• A Bachelor’s degree in Computer Science, Engineering, or equivalent hands-on experience.
Preferred:
• Experience with agentic frameworks such as LangChain, LangGraph, CrewAI, AutoGen, or similar.
• Familiarity with corporate IT or infrastructure engineering environments, understanding how enterprise platforms around identity, monitoring, and automation operate.
• Background building MCP (Model Context Protocol) servers or tools that extend AI agent capabilities.
• Experience with vector databases (e.g., Pinecone, Weaviate, pgvector) and semantic search.
• Experience building or maintaining internal developer platforms, artifact registries, or shared API services.
Company:
Bandwidth is the universal communications platform that simplifies how businesses deliver integrated global experiences. Founded in 1999, the company is headquartered in Raleigh, North Carolina, USA, and has 1,001-5,000 employees. It is currently a late-stage company.