Deepspeed, Huggingface TGI, FSDP) * Experience in projects involving LLMs
... vLLM, DeepSpeed, Megatron-LM • Experience building agentic AI systems (LangChain, LangGraph, CrewAI, etc.) • Knowledge of enterprise AI deployment patterns • Experience with technical ...
... DeepSpeed). • Proficiency in software development for deployable ML systems. • A track record of relevant publications in top international conferences (RSS, NeurIPS, ICML, ICLR, CoRL, ICRA, IROS ...
Research Intern, Agent RL Training
Mountain View, CA · On-site
$35 - $50/hr
Experience with multi-node distributed training (FSDP, DeepSpeed, Megatron-LM) * Proficiency in writing custom GPU kernels with Triton or CUDA * Experience building synthetic data pipelines for agent ...
Machine Learning Engineer - Speech Model Training
San Francisco, CA · On-site
$250K - $300K/yr
Run and optimise distributed training at scale via PyTorch or JAX, FSDP, DeepSpeed, etc * Drive real-time inference performance with vLLM, TensorRT-LLM, or SGLang * Apply RL alignment techniques to ...
Senior Staff AI Data Infrastructure Engineer
Santa Clara, CA · On-site
$124.40K - $169.10K/yr
... DeepSpeed, or Megatron. • Practical experience with Vector Databases, automated labeling toolchains, or data-centric AI workflows. • Knowledge of storage formats optimized for AI (e.g., Parquet ...
... DeepSpeed, or similar large-scale training systems. • Familiarity with large-scale model parallelism strategies (data, tensor, pipeline, or expert parallelism). • Experience optimizing training ...
AI Research Engineer - Scaling
San Carlos, CA · On-site
$180K - $300K/yr
Experience with distributed training frameworks (e.g., TorchTitan, DeepSpeed, FSDP/ZeRO), multi-node debugging, and experiment management * Proven skills in optimizing inference performance using ...
Senior AI Data Infrastructure Engineer
Santa Clara, CA · On-site
$127.40K - $173.20K/yr
... DeepSpeed, or Megatron. • Practical experience with Vector Databases, automated labeling toolchains, or data-centric AI workflows. • Knowledge of storage formats optimized for AI (e.g., Parquet ...
LLM Inference Deployment Engineer
$180K - $240K/yr
Strong expertise in LLM inference frameworks (PyTorch, ONNX Runtime, vLLM, TensorRT-LLM, DeepSpeed). * In-depth knowledge of the Python programming language for model integration and performance ...
Software Engineer Graduate (Inference Infrastructure) - 2026 Start (PhD)
San Jose, CA · On-site
$202.70K - $240.20K/yr
... DeepSpeed, PyTorch) and distributed training/inference platforms. • Excellent communication skills and ability to collaborate across global, cross-functional teams. • Passion for system ...
ML Engineer - Austin, TX
Austin, TX · On-site +1
Familiarity with distributed training (multi-GPU, NCCL, DeepSpeed, or Accelerate). * Prior work in MLOps or packaging ML pipelines for deployment. * Contributions to open-source ML libraries. Why ...
Deepspeed, Huggingface TGI) * Experience in turning applied research results into product components
Senior AI Data Infrastructure Engineer
Santa Clara, CA · On-site
$127.40K - $173.20K/yr
... DeepSpeed, or Megatron. • Practical experience with Vector Databases, automated labeling toolchains, or data-centric AI workflows. • Knowledge of storage formats optimized for AI (e.g., Parquet ...
Staff/Sr. Machine Learning Engineer, Foundation Models - AI, Search & Knowledge Platforms
San Francisco, CA · On-site
Familiarity with Nvidia TensorRT-LLM, vLLM, DeepSpeed, Nvidia Triton Server etc. Experience writing custom CUDA kernels using CUDA or OpenAI Triton. MS in Computer Science, Artificial Intelligence ...
Senior Machine Learning Engineer - SIML, ISE
Cupertino, CA · On-site
$128.90K - $177K/yr
Experience with parallel training libraries such as PyTorch Distributed (torch.distributed), DeepSpeed, or FairScale. Experience building ML models for on-device inference. Publication record at ML ...
DeepSpeed * Megatron-LM 2. Representation Learning & Method Innovation * Design and improve self-supervised and multimodal learning methods for real-world autonomous driving systems * Conduct ...
Preferred : • Experience with distributed training techniques such as DeepSpeed, FSDP, etc. • Experience with the NVIDIA software and hardware stack (CUDA, NCCL). • Experience with PyTorch. • ...
Agent RL Infra Engineer
Santa Clara, CA · On-site
... DeepSpeed, FSDP, HF Accelerate) and ML ops skills covering pipeline automation, job orchestration, and GPU cluster management are important here • Proficiency in Python, Go, Rust, or similar • ...
Deepspeed information
What is the difference between Deepspeed and a Data Scientist?
| Aspect | Deepspeed | Data Scientist |
|---|---|---|
| Required credentials | Knowledge of machine learning frameworks, programming skills in Python, experience with AI model training | Degree in Data Science, Statistics, Computer Science, or related fields; strong analytical skills |
| Work environment | AI research labs, tech companies, cloud computing environments | Business, tech companies, research institutions |
| Industry usage | AI model training, deep learning optimization | Data analysis, predictive modeling, business insights |
Deepspeed focuses on optimizing large-scale AI model training and deep learning performance, while Data Scientists analyze data to generate insights and build predictive models. Both roles require technical skills but serve different purposes within the AI and data ecosystem.
- Remote Audio Machine Learning
- Contractual Computer Vision Deep Learning Engineer
- Machine Learning Platform Engineer
- Artificial Neural Network
- Contractual Artificial Intelligence Machine Learning
- Assistant Llm Developer
- Ai Machine Learning Engineer
- Systemc Modeling Engineer
- Artificial Intelligence Ai Engineer
- Kubeflow

Full-time
Posted 22 days ago
Job description
About Nexusflow.ai
Modern enterprise copilots & agents call for last-mile quality, enterprise-grade robustness and scalable operation costs, beyond simplified programming interfaces for generative AI. Nexusflow tackles this challenge, enabling enterprises to own their workflow copilots & agents stacked on top of powerful yet cost-effective, compact LLMs. We train large language models and build last-mile quality dev tooling for copilots & agents on your enterprise workflows. Our team has built the open-source LLM, NexusRaven-V2, rivaling GPT-4 in function calling with a 100X smaller model size. Our team members are also behind the scenes of Starling, the #1 ranked compact 7B chat model based on human evaluation in Chatbot Arena.
Position: Backend Engineer
Nexusflow is currently adding Backend Engineers to our team. Our Backend Engineers package up our technology in models and last-mile quality tooling. Our Backend Engineers will be the driving force to build our products and solutions, in extensive collaboration with our ML Engineers and Front-end Engineers.
Responsibilities
- API system development for copilot & agent quality tooling
- API system development for copilot serving and integration, with a focus on enterprise-grade requirements in the following areas:
  - Integration with on-prem & cloud compute vendors
  - Integration with software tools required in customer-oriented solutions
  - Distributed system and, optionally, GPU performance optimization
- Wear many hats and collaborate with the whole team on product development, deployment and customer success
Qualifications
- Experience in ML model or ML data pipeline deployment (on-prem or in the cloud)
- Experience in building backends for application or platform API systems
- Working experience in a fast-paced team environment
- Experience in using or contributing to modern compute frameworks for LLMs (e.g. Deepspeed, Huggingface TGI, FSDP)
- Experience in projects involving LLMs
About Nexusflow.ai
Sourced by ZipRecruiter
Industry: Software development
Company size: 11 - 50 Employees
Headquarters location: Daly City, CA, US
Year founded: 2022