Staff Software Engineer - Backend & AI Infra Remote Full-time Location: Based in US to GMT timezones Compensation: Competitive Compensation Package Our client is a high-growth technology firm. They ...
Research and identify innovative technologies and potential partners within the AI application/infra ecosystem * Contribute to SK AI DC design and infrastructure strategy development based on ...
Platform Engineering Lead (AI Infra)
$120K - $159K/yr
Build AI infrastructure and internal tooling to enable agentic systems: * Quality Scientist Agents - monitor operations end-to-end, surface anomalies, and intervene/escalate when quality or ...
Member of Technical Staff (Software Engineer, Storage Platform)
New York, NY · On-site
$220K - $405K/yr
Partner with AI, infra, and product teams to design data models and access patterns that meet low‑latency, high‑throughput, and cost‑efficiency requirements. * Lead capacity planning ...
XPENG is a leading smart technology company at the forefront of innovation, integrating advanced AI and autonomous driving technologies into its vehicles, including electric vehicles (EVs), electric ...
... AI Infra, and security. • Strong industry knowledge and operational empathy • 5+ yrs of pre-sales engineering or related experience. • Excellent written and verbal communication skills. • ...
Experience with AI infra and MLOps tooling such as K8s, CI/CD, model registry, experiment tracking, observability. ~Communication * Strong written and verbal communication; ability to drive design ...
Platform Engineering Lead (AI Infra)
San Francisco, CA · On-site
$205K - $235K/yr
Build AI infrastructure and internal tooling to enable agentic systems: * Quality Scientist Agents - monitor operations end-to-end, surface anomalies, and intervene/escalate when quality or ...
Experience with machine learning infra (e.g., GPU, Cloud TPU, etc.). About the job Like Google ... The AI and Infrastructure team is redefining what's possible. We empower Google customers with ...
Supercomputing / AI Infra at Krea We build and operate the infrastructure for Krea's research and inference. Distributed training, 1000+ K8s GPU clusters, petabyte scale data pipelines, etc. We build ...
Account Executive
San Francisco, CA · On-site
Background in developer tools, APIs, infra platforms, or AI/ML systems. * Familiarity with presales or SE-style motions (demos, POCs, technical discovery). Why Join Parasail * Join at the ground ...
Fellow, Software Engineering- Infrastructure
Mountain View, CA · Hybrid
$204K - $241K/yr
You will be one of the most senior technical voices in an organization spanning all layers of the Infra stack (Physical Infra, Core Infra, Data Infra, Online Infra, AI Infra, and Reliability Infra ...
Fellow, Software Engineering- Infrastructure
Mountain View, CA · On-site
$204K - $241K/yr
You will be one of the most senior technical voices in an organization spanning all layers of the Infra stack (Physical Infra, Core Infra, Data Infra, Online Infra, AI Infra, and Reliability Infra ...
AI Engineer - LLM Infra
San Francisco, CA · On-site
$126K - $166K/yr
Yutori is reimagining how people interact with the web by building AI agents that can reliably do ... Scale infra for post-training of multimodal LLMs (CPT, SFT, RL, search, reward models) * Scale ...
AI System Engineer
San Francisco, CA · On-site +1
$175K - $250K/yr
You'll partner closely with AI, infra, and product teams to integrate large language, speech, and rendering models into a unified, low-latency system that can power live human-like interaction at ...
Senior Software Developer - AI Infra Compute
$54 - $71.25/hr
OCI (Oracle Cloud Infrastructure) AI Infrastructure is at the forefront of building a cutting-edge, ultra-high-performance GPU platform designed to support AI/ML/HPC workloads. This is your chance to ...
AI Infra Resident (1-Year Program)
$116K - $148K/yr
About The Role RadixArk is launching a full-time, paid, 1-year residency program for aspiring AI infrastructure engineers. You'll rotate across inference, training, kernels, compilers, and cluster ...
AI infra salary information
How much do AI infra jobs pay per year?
Salary range (per year) and share of jobs:
$43K - $51.8K: 1% of jobs
$51.8K - $60.5K: 4% of jobs
$60.5K - $69.3K: 15% of jobs
$69.3K - $78.1K: 19% of jobs
$78.1K - $86.9K: 20% of jobs
$86.9K - $95.6K: 15% of jobs
$95.6K - $104.4K: 9% of jobs
$104.4K - $113.2K: 4% of jobs
$113.2K - $122K: 6% of jobs
$122K - $130.7K: 3% of jobs
$130.7K - $139.5K: 3% of jobs
25th percentile: $71.5K. Wages below this are outliers.
Median wage: $82.7K / yr.
75th percentile: $96.2K. Wages above this are outliers.
Chart axis labels: $43K, $88.7K, $139.5K.
Full-time
Posted 28 days ago
Job description
Remote Full-time
Location: Based in US to GMT timezones
Compensation: Competitive Compensation Package
Our client is a high-growth technology firm. They are seeking a Staff Software Engineer to spearhead two critical domains: the core agent runtime and backend infrastructure powering a high-frequency trading fleet, and the comprehensive migration of model hosting and agent deployment to in-house, proprietary infrastructure.
This is a foundational, high-impact building role. The successful candidate will design and implement the backend services, runtime engines, and deployment systems that enable a fleet of autonomous agents to operate with superior speed, reliability, and intelligence. By moving away from third-party LLM providers and hosted platforms, this role will establish the sovereign infrastructure necessary for the next generation of autonomous financial software.
Key Responsibilities
Agent Runtime & Backend Development
- Plugin Runtime Ownership: Lead the evolution of the per-agent process, migrating from a distributed Go/Python hybrid to a centralized, high-performance Go service utilizing Postgres state and real-time websocket price feeds.
- Rules Engine Engineering: Build a YAML-configurable "Scanner Gateway" to bridge signal production and execution, allowing for complex scoring and filtering without direct code manipulation.
- Advanced Execution Systems: Develop and maintain the RatchetStop Backend, a centralized profit-trailing service capable of sub-second evaluation and websocket-based order execution to protect capital even when agents are offline.
- Data & Connectivity: Manage the Model Context Protocol (MCP) server bridging agents to platform tools, and oversee a high-throughput data pipeline (Redis, Postgres, ClickHouse) for real-time market intelligence ingestion.
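To illustrate the kind of profit-trailing check that a service like the RatchetStop Backend might run on each price tick, here is a minimal sketch in Python. It assumes a long position and a fixed percentage trail below the highest price seen so far. The class name TrailingStop, its fields, and the on_price method are hypothetical and do not come from the posting, which also prefers Go for runtime services.

```python
from dataclasses import dataclass, field


@dataclass
class TrailingStop:
    """Trailing stop for a long position (hypothetical model, not the RatchetStop API)."""

    entry_price: float
    trail_pct: float  # assumed parameter: 0.05 trails 5% below the highest price seen
    high_water: float = field(init=False)

    def __post_init__(self) -> None:
        # Until a higher price arrives, the entry price is the high-water mark.
        self.high_water = self.entry_price

    @property
    def stop_price(self) -> float:
        # The stop follows the high-water mark, so it can rise but never fall.
        return self.high_water * (1.0 - self.trail_pct)

    def on_price(self, price: float) -> bool:
        """Process one price tick; return True when the position should be closed."""
        if price > self.high_water:
            self.high_water = price
        return price <= self.stop_price
```

In this sketch the stop only ratchets upward: a new high raises the stop, while a pullback leaves it unchanged until the price touches it. A real service would also need to handle short positions, order submission, and persistence of the high-water mark, none of which are modeled here.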
Model & Agent Hosting Migration
- Infrastructure Sovereignty: Lead the technical execution of migrating agents from third-party platforms to a custom-built, Senpi-hosted environment featuring isolated workspaces and state persistence.
- Model Serving: Evaluate and implement the transition from external LLM APIs (Anthropic, Google) to self-hosted inference, optimizing for telemetry capture and performance.
- Telemetry & Feedback Loops: Architect systems to capture every agent decision and score, creating a self-reinforcing loop where the fleet learns and improves from collective performance data.
- Deployment Pipelines: Build robust CI/CD pipelines for zero-downtime rollouts, ensuring that updates to scanner logic or runtime patches do not interrupt active market positions.
Infrastructure & Operations
- System Reliability: Design monitoring and alerting frameworks to detect agent failures, state corruption, or authentication expirations before they impact financial performance.
- Cloud Orchestration: Manage AWS/EKS environments using Infrastructure-as-Code (IaC).
- Incident Response: Own the operational health of the fleet, acting as the primary responder for high-stakes trading system incidents.
Interview Process
- Founder / CEO Interview: Introduction to the vision and strategic goals.
- Take-Home Test: A practical assessment of technical design and coding capabilities.
- Technical Interview: A deep dive into systems architecture and engineering expertise.
- Final Interview: Cultural alignment and final technical synthesis.
Requirements
Technical Essentials
- Expert Backend Engineering: Proficiency in writing production-grade code in Go, Python, and Node.js/TypeScript (Go is strongly preferred for runtime services).
- Startup Experience: A proven track record of building complex backend services (APIs, job scheduling, distributed systems) from scratch in a fast-paced environment.
- Real-Time Systems: Deep understanding of low-latency environments, websocket management, and sub-second condition evaluation.
- Database Mastery: Production experience with Postgres, Redis, and at least one analytical database (e.g., ClickHouse, TimescaleDB, or BigQuery).
- Orchestration: Hands-on experience deploying, scaling, and debugging production workloads on Kubernetes (AWS EKS).
- End-to-End Ownership: Demonstrated ability to design, build, deploy, and maintain systems throughout their entire lifecycle.
Preferred Qualifications
- LLM Infrastructure: Experience with model serving and optimizing inference (e.g., vLLM, TGI, or TensorRT-LLM).
- FinTech/Trading: Background in exchange APIs, wallet operations, or on-chain infrastructure where uptime has direct financial consequences.
- Agentic Frameworks: Familiarity with Model Context Protocol (MCP) or orchestrating multi-agent platforms.
Benefits
- Competitive compensation and equity packages.
- The opportunity to build foundational infrastructure in a new category of autonomous software.
- High-autonomy environment with a focus on engineering excellence.
- Collaborative culture working alongside industry-leading founders and engineers.
Due to the high volume of applications we anticipate, we regret that we are unable to provide individual feedback to all candidates. If you do not hear back from us within 4 weeks of your application, please assume that you have not been successful on this occasion. We genuinely appreciate your interest and wish you the best in your job search.
Commitment to Equality and Accessibility:
At MLabs, we are committed to offering equal opportunities to all candidates. We do not discriminate, we keep our job adverts accessible, and we provide information in accessible formats. Our goal is to foster a diverse, inclusive workplace with equal opportunities for all. If you need any reasonable adjustments during any part of the hiring process, or you would like to see the job advert in an accessible format, please let us know at the earliest opportunity by emailing human-resources@mlabs.city.
MLabs Ltd collects and processes the personal information you provide such as your contact details, work history, resume, and other relevant data for recruitment purposes only. This information is managed securely in accordance with MLabs Ltd's Privacy Policy and Information Security Policy, and in compliance with applicable data protection laws. Your data may be shared only with clients and trusted partners where necessary for recruitment purposes. You may request the deletion of your data or withdraw your consent at any time by contacting legal@mlabs.city.