Shield AI

61 Shield AI DevOps Engineer Jobs Hiring Near You

AI DevOps Engineer

Manhattan, NY · On-site

$58 - $79.50/hr

As the DevOps Engineer for the AI Platform, you build and operate the infrastructure layer that makes AI tooling reliable at scale and that provides the safe, instrumented execution environment AI ...

DevOps Engineer

Miami, FL · On-site

$50.25 - $69/hr

Millennium is expanding its Core DevOps Team and is seeking an experienced AI DevOps Engineer to enhance the AI-centric developer experience. The role involves delivering key SDLC capabilities ...

Senior DevOps Engineer

San Jose, CA · On-site

$152K - $195K/yr

TENEX is an AI-native, automation-first, built-for-scale Managed Detection and Response (MDR ... Shield Capital, DTCP (formerly Deutsche Telekom Capital Partners), Deepwork Capital, and the ...

Senior DevOps Engineer

Sarasota, FL · On-site

$125K - $160K/yr

TENEX is an AI-native, automation-first, built-for-scale Managed Detection and Response (MDR ... Shield Capital, DTCP (formerly Deutsche Telekom Capital Partners), Deepwork Capital, and the ...

Senior DevOps Engineer

Overland Park, KS · On-site

$128K - $165K/yr

TENEX is an AI-native, automation-first, built-for-scale Managed Detection and Response (MDR ... Shield Capital, DTCP (formerly Deutsche Telekom Capital Partners), Deepwork Capital, and the ...

AI DevOps Engineer

Summit Tech Partners

Manhattan, NY • On-site

$58 - $79.50/hr

Other

Posted 4 days ago


Job description

The AI Software Development team is charged with transforming how our engineering organization designs, builds, and ships software. The team drives the adoption of AI-assisted tooling across the full software development lifecycle, deploys autonomous agents that expand engineering capacity, and ensures that every new generation of AI capability is evaluated, integrated, and operationalized with rigor.
As the DevOps Engineer for the AI Platform, you build and operate the infrastructure layer that makes AI tooling reliable at scale and that provides the safe, instrumented execution environment AI agents require to operate in production. You collaborate closely with the InfoSec Engineer - working through the security review and approval process for every new capability introduced to the platform - and you ensure the infrastructure is observable, resilient, and ready to evolve as the platform grows.
KEY RESPONSIBILITIES
  • Design and operate the infrastructure layer for the AI Software Development platform - compute environments, API gateway configurations, token usage governance, and service reliability.
  • Build and maintain sandboxed execution environments for AI agents - isolating agent workloads and enabling safe rollback on task failure.
  • Own CI/CD integration for AI tooling - ensuring IDE assistants, automated review tools, and agent triggers are wired into existing engineering pipelines with minimal friction.
  • Instrument the platform for full observability: latency, token consumption, task throughput, error rates, cost-per-task, and agent utilization metrics.
  • Manage model API integrations - OpenAI, Anthropic, GitHub, and others - including rate limit governance, failover logic, and cost attribution by team and use case.
  • Collaborate with the InfoSec Engineer throughout the security review and approval process for all new platform capabilities, ensuring each integration meets policy requirements before deployment.
  • Evaluate and adopt emerging infrastructure patterns for AI workloads as the platform evolves.

REQUIRED QUALIFICATIONS
  • 5+ years of DevOps, platform engineering, or SRE experience with ownership of production infrastructure.
  • Proficiency with container orchestration (Kubernetes) and infrastructure-as-code (Terraform, Pulumi, or CDK).
  • Experience integrating third-party SaaS APIs into enterprise engineering pipelines - auth, rate limiting, cost governance.
  • Strong observability fundamentals - experience with Datadog, Grafana, OpenTelemetry, or equivalent tooling.
  • Demonstrated experience designing sandboxed or isolated execution environments for automated workloads.
  • Solid security hygiene: secret management, least-privilege IAM, network egress control, and audit logging.
  • Direct experience with prompt engineering and LLM-based developer tools, and practical familiarity with how they are deployed and operated.
  • Familiarity with AI capability benchmarks - including SWE-bench, METR research, and similar frameworks - sufficient to inform infrastructure planning decisions.

NICE TO HAVE
  • Prior experience building infrastructure for LLM-based applications or AI agent workloads.
  • Familiarity with vector databases (Pinecone, Weaviate, pgvector) and embedding pipeline operations.
  • Experience with GPU-backed compute provisioning for on-premises or hybrid inference workloads.
  • Cost attribution and FinOps experience in multi-team API consumption environments.
  • Background in developer experience tooling infrastructure - telemetry pipelines, IDE plugin distribution, or SCM integrations.