Join to apply for the Site Reliability Engineer - Inference role at Jobright.ai
Jobright is an AI-powered career platform that helps job seekers discover the top opportunities in the US. We are NOT a staffing agency. Jobright does not hire directly for these positions. We connect you with verified openings from employers you can trust.
Job Summary:
Lambda is the #1 GPU Cloud for ML/AI teams, providing tools for building, testing, and deploying AI products at scale. The Site Reliability Engineer - Inference will work on developing a large-scale platform for running AI models and building a high-throughput, low-latency API for distributed systems.
Responsibilities:
• Work on our Inference service, helping us to develop our large-scale platform for running new, cutting-edge models across tens of thousands of GPUs
• Help build a high-throughput, low-latency API and routing system running at geographically-distributed scale
• Shape a highly reliable distributed system, with a focus on reducing operational overhead and on deep observability and capacity management
• Work with the team and our internal ML researchers to adopt and improve new inference engines, models and architectures across a variety of different mediums (such as text, image, video and audio)
• Tackle global networking challenges to deliver the lowest possible latency to our users across all of Lambda’s available capacity
• Help push Lambda forward into the state of the art, and be part of a team that is operating right at the edge of new developments in the industry.
Qualifications:
Required:
• 8 or more years of experience as a software reliability engineer or software engineer working on large-scale, internet-facing production services
• Highly skilled at writing Go and Python
• Experience with bare-metal system installation and administration
• Experience deploying applications and operators on Kubernetes
• Product-focused, balancing operational needs and keeping overheads down with the need to ship features at a rapid pace
• Proven track record of working in an environment with rapid deployment and the ability to stay on top of shifting priorities as the industry rapidly develops
• Willingness to take ownership of projects and help drive them forwards through design, implementation, launch, and maintenance.
Preferred:
• Experience working with machine learning models
• Experience operating large-scale, geographically distributed systems
• Experience developing Kubernetes operators and components
Company:
Lambda provides infrastructure, cloud services, and software for the training and inferencing of AI models. Founded in 2012 and headquartered in San Jose, California, USA, the company has 201-500 employees and is currently late stage. Lambda has a track record of offering H1B sponsorships.
Seniority level: Mid-Senior level
Industries: Software Development
Benefits (inferred from the description for this job):
• Medical insurance
• Vision insurance
• 401(k)