DeepSpeed Jobs (NOW HIRING)

Senior AI Data Infrastructure Engineer

Santa Clara, CA · On-site

$127.40K - $173.20K/yr

... DeepSpeed, or Megatron. • Practical experience with Vector Databases, automated labeling toolchains, or data-centric AI workflows. • Knowledge of storage formats optimized for AI (e.g., Parquet ...

Strong expertise in LLM inference frameworks (PyTorch, ONNX Runtime, vLLM, TensorRT-LLM, DeepSpeed). * In-depth knowledge of the Python programming language for model integration and performance ...

Familiarity with distributed training (multi-GPU, NCCL, DeepSpeed, or Accelerate). * Prior work in MLOps or packaging ML pipelines for deployment. * Contributions to open-source ML libraries. Why ...

... DeepSpeed, FSDP, HF Accelerate) and ML ops skills covering pipeline automation, job orchestration, and GPU cluster management are important here • Proficiency in Python, Go, Rust, or similar • ...

DeepSpeed information

What are the key skills and qualifications needed to thrive as a DeepSpeed Engineer, and why are they important?

To thrive as a DeepSpeed Engineer, you need a solid background in machine learning, deep learning frameworks (such as PyTorch), and distributed systems, often supported by a degree in computer science or a related field. Proficiency with DeepSpeed, parallel computing libraries, and cloud platforms, along with familiarity with tools like CUDA and NCCL, is typically expected. Strong problem-solving abilities, collaboration, and adaptability are crucial soft skills for optimizing large-scale AI models and working with cross-functional teams. Mastering these skills ensures efficient development and deployment of high-performance, scalable AI solutions in demanding environments.

What are some common challenges faced by engineers working with DeepSpeed and how can they be addressed?

Engineers working with DeepSpeed often encounter challenges in optimizing large-scale model training, such as managing memory efficiency and tuning distributed training parameters. Configuring gradient accumulation, choosing parallelism strategies, and ensuring compatibility with different hardware setups can be complex. Collaborating closely with data scientists, DevOps, and research teams is essential for addressing these challenges, as is staying current with the latest DeepSpeed releases and documentation. Regular participation in code reviews and knowledge-sharing sessions can also help engineers overcome technical hurdles and continuously improve model performance.
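One tuning issue of this kind is keeping the batch-size settings consistent. DeepSpeed expects the global batch size to equal the per-GPU micro-batch size times the gradient accumulation steps times the number of GPUs. The sketch below builds a configuration dictionary that satisfies this. The helper names (effective_batch_size, build_config) and all numeric values are illustrative assumptions, not recommendations, and the sketch only constructs the dictionary without launching any training.

```python
import json


def effective_batch_size(micro_batch_per_gpu: int, grad_accum_steps: int, world_size: int) -> int:
    """Global batch size implied by the three tunable quantities."""
    return micro_batch_per_gpu * grad_accum_steps * world_size


def build_config(micro_batch_per_gpu: int, grad_accum_steps: int, world_size: int, zero_stage: int = 2) -> dict:
    """Hypothetical helper: returns a DeepSpeed-style config dict with consistent batch settings."""
    return {
        "train_batch_size": effective_batch_size(micro_batch_per_gpu, grad_accum_steps, world_size),
        "train_micro_batch_size_per_gpu": micro_batch_per_gpu,
        "gradient_accumulation_steps": grad_accum_steps,
        "bf16": {"enabled": True},                  # mixed precision (assumes bf16-capable hardware)
        "zero_optimization": {"stage": zero_stage},  # ZeRO partitioning stage
    }


# Illustrative values: 4 samples per GPU, 8 accumulation steps, 16 GPUs.
config = build_config(micro_batch_per_gpu=4, grad_accum_steps=8, world_size=16)
print(json.dumps(config, indent=2))
```

Lowering the micro-batch size and raising the accumulation steps by the same factor keeps the global batch size fixed while reducing per-GPU activation memory, which is a common first step when a run does not fit in memory.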

What is DeepSpeed?

DeepSpeed is an open-source deep learning optimization library developed by Microsoft, designed to enable efficient distributed training of large-scale models. It helps researchers and engineers train models that are too large to fit in the memory of a single GPU by offering features like ZeRO optimization, mixed-precision training, and advanced parallelism techniques. DeepSpeed is widely used in the machine learning community for its scalability and performance improvements, making it easier to train state-of-the-art models on vast datasets. The library integrates with PyTorch and supports training on multiple GPUs and across multiple machines.
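To make the memory point concrete, here is a rough back-of-the-envelope sketch based on the accounting in the ZeRO paper for mixed-precision Adam training: 2 bytes per parameter for 16-bit weights, 2 bytes for 16-bit gradients, and 12 bytes for fp32 optimizer states. The function name is hypothetical, and the estimate covers model states only; it ignores activations, temporary buffers, and fragmentation, so real usage will be higher.

```python
def model_state_bytes_per_gpu(num_params: int, num_gpus: int, zero_stage: int) -> float:
    """Estimate per-GPU bytes for model states under a given ZeRO stage.

    Assumes mixed-precision Adam: 2 bytes (16-bit params) + 2 bytes (16-bit grads)
    + 12 bytes (fp32 params, momentum, variance) per parameter.
    """
    params = 2 * num_params
    grads = 2 * num_params
    optim = 12 * num_params
    if zero_stage >= 1:   # stage 1 partitions optimizer states
        optim /= num_gpus
    if zero_stage >= 2:   # stage 2 also partitions gradients
        grads /= num_gpus
    if zero_stage >= 3:   # stage 3 also partitions parameters
        params /= num_gpus
    return params + grads + optim


# Illustrative case: a 7.5B-parameter model on 64 GPUs.
for stage in range(4):
    gb = model_state_bytes_per_gpu(7_500_000_000, 64, stage) / 1e9
    print(f"ZeRO stage {stage}: {gb:.1f} GB of model states per GPU")
```

Without partitioning, the model states alone in this example need 120 GB per GPU, which is why a model of this size cannot be trained with plain data parallelism on typical accelerators.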

What is the difference between DeepSpeed and Data Scientist roles?

| Aspect | DeepSpeed | Data Scientist |
| --- | --- | --- |
| Required credentials | Knowledge of machine learning frameworks, programming skills in Python, experience with AI model training | Degree in Data Science, Statistics, Computer Science, or related fields; strong analytical skills |
| Work environment | AI research labs, tech companies, cloud computing environments | Business, tech companies, research institutions |
| Industry usage | AI model training, deep learning optimization | Data analysis, predictive modeling, business insights |

DeepSpeed roles focus on optimizing large-scale AI model training and deep learning performance, while Data Scientists analyze data to generate insights and build predictive models. Both roles require technical skills but serve different purposes within the AI and data ecosystem.

More about DeepSpeed jobs
[Infographic: DeepSpeed job openings in the United States as of May 2026. Employment type: 80% full-time, 20% contract. Work location: 80% in-person, 20% remote.]

Full-time

Posted 22 days ago


Job description

About Nexusflow.ai

Modern enterprise copilots & agents call for last-mile quality, enterprise-grade robustness and scalable operation costs, beyond simplified programming interfaces for generative AI. Nexusflow tackles this challenge, enabling enterprises to own their workflow copilots & agents stacked on top of powerful yet cost-effective, compact LLMs. We train large language models and build last-mile quality dev tooling for copilots & agents on your enterprise workflows. Our team has built the open-source LLM, NexusRaven-V2, rivaling GPT-4 in function calling with a 100X smaller model size. Our team members are also behind the scenes of Starling, the #1 ranked compact 7B chat model based on human evaluation in Chatbot Arena.

Position: Backend Engineer

Nexusflow is currently adding Backend Engineers to our team. Our Backend Engineers package up our technology in models and last-mile quality tooling. Our Backend Engineers will be the driving force to build our products and solutions, in extensive collaboration with our ML Engineers and Front-end Engineers.

Responsibilities
  • API system development for copilot & agent quality tooling

  • API system development for copilot serving and integration, with a focus on enterprise-grade requirements in the following areas:

    • Integration with on-prem & cloud compute vendors

    • Integration with software tools required in customer-oriented solutions

  • Distributed systems and, optionally, GPU performance optimization

  • Wear many hats and collaborate with the whole team for product development, deployment and customer success

Required Qualifications
  • Experience in ML model or ML data pipeline deployment (on-prem or in the cloud)

  • Experience in building backends for application or platform API systems

Preferred
  • Working experience in a fast-paced team environment

  • Experience in using or contributing to modern compute frameworks for LLMs (e.g. DeepSpeed, Hugging Face TGI, FSDP)

  • Experience in projects involving LLMs