Job Summary:
Blitzy is a Cambridge, MA-based AI software development platform on a mission to revolutionize the software development life cycle. As a DevOps Engineer, you will architect and maintain scalable systems that enable the autonomous delivery of production-ready software for Fortune 500 companies.
Responsibilities:
• Build, manage, and scale Kubernetes clusters supporting AI agent workloads and production application deployments.
• Design and implement robust CI/CD pipelines for both application services and AI-driven workflows.
• Automate infrastructure provisioning, scaling, and operations using Python and Terraform.
• Deploy and maintain applications via Helm charts, ensuring consistency across environments.
• Own the observability stack: alerting, distributed tracing, and monitoring for all production services and APIs.
• Build and maintain infrastructure for AI agent orchestration, enabling reliable and high-throughput agent execution.
• Partner closely with engineering teams to improve developer experience, deployment strategies, and operational tooling.
• Maintain and continuously improve the security, reliability, and cost-efficiency of our cloud environments.
Qualifications:
Required:
• 5–8 years of DevOps or infrastructure engineering experience in production environments.
• Deep expertise in Kubernetes, including deployment, scaling, networking, and troubleshooting.
• Strong Python proficiency for automation, scripting, and tooling.
• Hands-on experience with Helm for application package management.
• Proven track record designing and maintaining CI/CD pipelines.
• Experience with major cloud platforms (AWS, Azure, or GCP).
• Proficiency with Terraform for Infrastructure as Code.
• Strong Linux administration skills and containerization expertise (Docker).
Preferred:
• CKA (Certified Kubernetes Administrator) certification.
• Experience with MLOps tooling such as MLflow, Kubeflow, or similar platforms.
• Background in microservices architecture and service mesh technologies.
• Familiarity with API gateway management and advanced service mesh configurations.
• A bias for automation: if you've done something manually twice, you've already started scripting it.
• Passion for AI infrastructure and excitement about building systems at the frontier of what's technically possible.
Company:
Blitzy is a generative AI software platform that aims to automate custom software development. Founded in 2023, the company is headquartered in Boston, USA, and has 51-200 employees. It is currently in the growth stage.