Senior Staff+ Software Engineer, Kubernetes Platform
San Francisco, CA · On-site
$405K - $485K/yr
We scale the control plane itself - apiserver, etcd, controllers - so it stays responsive as object counts and node counts grow by orders of magnitude. And we build the core cluster services every ...
Senior XMPP DevOps Engineer
Tempe, AZ · On-site
$126K - $162K/yr
Experience with Nginx, ETCD in production deployment and troubleshooting * Familiar with AWS technology including Elastic Search, Elastic cache, DynamoDB, SQS and S3 * Understand gitops and familiar ...
Stay current with the Go ecosystem and cloud-native tooling (gRPC, buf, etcd, DynamoDB, Prometheus) and apply relevant advances to improve the platform. Requirements: Basic Qualifications * Bachelor ...
Rancher and Kubernetes SME
Princeton, NJ · On-site
$51 - $56/hr
Deep understanding of Kubernetes architecture, control plane, etcd, networking, and storage. * Experience designing and managing high availability Kubernetes environments. * Strong Linux ...
Senior Web DevOps Engineer
Tempe, AZ · On-site
$126K - $162K/yr
Experience with Nginx, ETCD in production deployment and troubleshooting * Experience working with AWS services, such as Dynamodb, RDS, S3, Route53, etc. * Experience with Source Code Management ...
Software Development Engineer (Elastic Kubernetes Service), EKS Scalability & Performance
Seattle, WA · On-site
You will contribute to the control plane architecture for EKS Ultraclusters, defining how the API server, etcd, and associated components scale to support 100,000-node clusters running generative AI ...
Senior Kubernetes Platform Engineer - AI/ML Infrastructure
Durham, NC · On-site
$98K - $133K/yr
Architect, build, and operate large-scale on-prem Kubernetes platforms (OpenShift/Anthos), including control plane and etcd lifecycle management * Define and evolve scalable, multi-tenant platform ...
Design and build core systems including the scheduler, controller manager, API server, and etcd-like state store. * Develop a custom container runtime interface or integrate with existing runtimes (e ...
Work with distributed data stores and messaging systems (MongoDB, Etcd, Elasticsearch, RabbitMQ) * Collaborate with cross-functional teams in an Agile development environment * Support system ...
Java Architect
$69 - $93/hr
... Etcd, Consul, Zookeeper, Curator, Eureka etc preferred. => Experience in working with Docker container, Kubernetes preferred. => Experience utilizing IaaS and PaaS from Amazon AWS or Google Cloud ...
Collaborate on node IAM, pod service accounts, CNI security, and cloud provider integrations • Secure the Kubernetes control plane including API server, etcd, and CNI plugin configurations • ...
Deep expertise in etcd management (backup, restore, recovery, upgrades) * Strong proficiency in Go with experience building Kubernetes controllers, operators, CRDs, and webhooks * Deep understanding ...
Etcd information
Hourly wage distribution for etcd jobs:

| Hourly wage | Share of jobs |
|---|---|
| $18.03 - $25.87 | 5% |
| $25.87 - $33.72 | 5% |
| $33.72 - $41.56 | 5% |
| $41.56 - $49.41 | 20% |
| $49.41 - $57.26 | 24% |
| $57.26 - $65.10 | 5% |
| $65.10 - $72.95 | 15% |
| $72.95 - $80.79 | 2% |
| $80.79 - $88.64 | 12% |
| $88.64 - $96.48 | 2% |
| $96.48 - $104.33 | 3% |

$45.05 is the 25th percentile. Wages below this are outliers.
The median wage is $54.05 / hr.
$70.14 is the 75th percentile. Wages above this are outliers.
How much do etcd jobs pay per hour?
What are the key skills and qualifications needed to thrive as an Etcd Administrator, and why are they important?
What are some common challenges faced by professionals working with etcd in a production environment?
What is an Etcd administrator and what do they do?
What is the difference between Etcd and a Kubernetes Administrator?
| Aspect | Etcd | Kubernetes Administrator |
|---|---|---|
| Primary Role | Distributed key-value store for configuration data and service discovery | Managing, deploying, and maintaining Kubernetes clusters |
| Required Skills | Knowledge of distributed systems, etcd architecture, security, and troubleshooting | Kubernetes architecture, cluster management, networking, and security |
| Work Environment | DevOps, cloud infrastructure, containerized environments | DevOps, cloud platforms, container orchestration |
| Certifications | None specific to Etcd; general cloud and DevOps certifications are relevant | Kubernetes certifications (CKA, CKAD) |
While both are essential in cloud-native environments, Etcd work focuses on maintaining a reliable distributed key-value store, whereas a Kubernetes Administrator manages entire Kubernetes clusters. Understanding Etcd is crucial for Kubernetes Administrators, but their responsibilities extend beyond it to cluster deployment, scaling, and security.

$144K - $190K/yr
Other
Posted 28 days ago
Job description
Anthropic runs some of the largest Kubernetes clusters in the industry. We have fleets of hundreds of thousands of nodes across multiple cloud providers and datacenters to train, research, and serve frontier AI models. The Kubernetes Platform team owns the Kubernetes control plane that makes those clusters work.
We are operating at a scale where the defaults stop working. We own the scheduler and extend it to place topology-sensitive ML workloads across thousands of accelerators at once. We scale the control plane itself - apiserver, etcd, controllers - so it stays responsive as object counts and node counts grow by orders of magnitude. And we build the core cluster services every workload depends on, like service discovery, so they hold up under the same pressure.
We make sure the control plane is fast, correct, and always available. Your work will directly determine whether Anthropic can keep reliably and safely training frontier models as our compute footprint continues to grow.
Key responsibilities
- Own, operate, and extend the Kubernetes scheduler for Anthropic's accelerator fleets, including custom scheduling plugins and policies for gang scheduling, topology awareness, and preemption
- Scale the Kubernetes control plane (apiserver, etcd, controller-manager) to support clusters far beyond typical limits, and find the next bottleneck before it finds us
- Design, build, and operate core cluster services such as service discovery that every workload in the fleet depends on
- Build and maintain custom controllers, operators, and CRDs
- Partner with research, training, and inference to understand workload shapes and turn their requirements into platform capabilities
- Collaborate with cloud providers on required features and escalations
- Participate in on-call, lead incident response, and design processes (postmortems, runbooks, SLOs) that help the team avoid repeating failures
- Significant software engineering experience building and operating production distributed systems
- Proficiency in at least one systems-appropriate language (e.g., Go, Python, Rust, or C++)
- Deep, hands-on Kubernetes experience (well beyond "user of"), such as working on the scheduler, controllers, or apiserver, or operating large multi-tenant clusters
- Demonstrated ability to debug complex issues across the stack, from API behavior down to node and network-level root causes
- A track record of designing for reliability, correctness, and clear failure semantics in systems other engineers depend on
- Strong written and verbal communication; comfort building consensus with internal stakeholders
- Experience with Kubernetes internals or contributions: kube-scheduler / scheduling framework, apiserver, etcd, client-go, controller-runtime, or similar
- Experience building or operating cluster schedulers or batch systems (e.g., Kueue, Volcano, Slurm, or in-house equivalents)
- Background scaling control planes or coordination systems (etcd, ZooKeeper, Consul, or large DNS/service-mesh deployments)
- Familiarity with ML infrastructure: GPUs, TPUs, or Trainium; gang scheduling; topology-aware placement; collective networking such as NCCL
- Experience with GCP and/or AWS, including GKE/EKS internals and Infrastructure as Code
- Low-level systems experience such as Linux kernel tuning, cgroups, or eBPF
- 12+ years of relevant industry experience, including time leading large, ambiguous infrastructure projects
About Anthropic
Sourced by ZipRecruiter
Company size
11 - 50 Employees
Headquarters location
Daly City, CA, US
Year founded
2021