MTS DevOps Engineer
Location: Austin, TX
Job Type: Full-Time
Department: Engineering / DevOps
About Avtal, Inc.
We are a VC-backed company that grew revenue 35x in the past year. We help third-party debt collection agencies deliver a digital, end-to-end self-service experience for their consumers.
About the Role
We are looking for a skilled and motivated MTS DevOps Engineer with strong experience in AWS, Linux, infrastructure automation, and CI/CD, along with practical experience supporting AI-enabled systems in production. In this role, you will be instrumental in building, maintaining, and scaling our cloud-native infrastructure, improving deployment workflows, and ensuring the reliability, security, performance, and auditability of our systems in a highly regulated environment. You will also help support the infrastructure and operational foundations needed for AI-powered applications, including secure runtime environments, observability, scalable service orchestration, and cost-conscious operations.
Responsibilities
- Build and maintain infrastructure automation tools using Ansible, Terraform, Python, Go, and shell scripting
- Develop and operate secure, scalable infrastructure on AWS (e.g., EC2, S3, RDS, IAM, CloudWatch)
- Maintain and optimize Linux-based systems across development and production environments
- Implement and manage CI/CD pipelines and automated deployment workflows
- Support infrastructure for AI-powered services, including runtime reliability, operational visibility, and secure service configuration
- Help enable LLM API integrations, AI service orchestration, secrets management, and secure runtime environments for AI-enabled applications
- Monitor system health, performance, reliability, security, and AI service observability using modern tooling
- Troubleshoot production issues, perform root cause analysis, and implement durable improvements
- Collaborate with engineering teams to improve infrastructure reliability, scalability, developer productivity, and operational resilience
- Document infrastructure processes, runbooks, and best practices to support knowledge sharing and onboarding
Requirements
- 4+ years of experience in DevOps, SRE, or Infrastructure Engineering
- Proficiency in infrastructure automation and tooling using Ansible, Terraform, Python, Go, and shell scripting
- Deep understanding of Linux system administration, shell scripting, and process management
- Proven experience with AWS services such as EC2, S3, RDS, IAM, and CloudWatch
- Hands-on experience with CI/CD systems and version control (Git)
- Familiarity with infrastructure needs for AI-enabled systems, such as model API integrations, service orchestration, observability, cost monitoring, or secure data handling
- Strong debugging, troubleshooting, and problem-solving skills
- Ability to build and operate systems with attention to reliability, security, and auditability in a highly regulated environment
Nice to Have
- Experience supporting production systems that include LLM-based or other AI-powered capabilities
- Familiarity with AI observability, evaluation support tooling, guardrails, and cost/performance monitoring
- Experience with vector databases, embeddings pipelines, or retrieval infrastructure
- Hands-on experience with infrastructure as code, including Terraform or CloudFormation
- Background in Site Reliability Engineering (SRE) practices
- Familiarity with monitoring and observability tools such as Prometheus, Grafana, and Kibana
- Understanding of secure infrastructure design and cloud compliance best practices
- Experience supporting high-availability production systems in regulated or security-conscious environments