Diffusion Model Jobs (NOW HIRING)

Multimodal LLM Researcher (MLLM)

Palo Alto, CA · On-site +1

Work on diffusion model distillation and/or develop diffusion-based world models for multimodal applications * Train and finetune autoregressive and diffusion models in LLM, VLM, or Audio LM contexts ...

Multimodal LLM Researcher (MLLM)

Palo Alto, CA · On-site

$185K - $400K/yr

Work on diffusion model distillation and/or develop diffusion-based world models for multimodal applications * Train and finetune autoregressive and diffusion models in LLM, VLM, or Audio LM contexts ...

Oscar Technology

Machine Learning Inference Engineer

San Francisco, CA

$134K - $162K/yr

KV cache optimization model pruning quantization distillation batching strategies memory optimization latent-space conditioning Deploy and scale multimodal architectures including: diffusion models ...

Research Scientist (Generative Modeling)

San Francisco, CA

Contribute hands-on to all stages of model development including data curation, experimentation, evaluation, and deployment. * Continuously explore and integrate cutting-edge research in diffusion ...

Research Scientist (Generative Modeling)

San Francisco, CA · On-site

Contribute hands-on to all stages of model development including data curation, experimentation, evaluation, and deployment. * Continuously explore and integrate cutting-edge research in diffusion ...

Machine Learning Engineer Intern (Computer Vision/Multimodal/Generative AI)

San Francisco, CA · On-site

Work on diffusion model optimization, controllability, and step efficiency. * Design experiments and evaluation frameworks for visual realism and consistency. * Translate research prototypes into ...

Software Engineer 5 - Model Runtime, AI Platform

OR · On-site +1

$466K - $750K/yr

Enable next-generation GenAI workloads - Create infrastructure for multimodal and diffusion models, including distributed training, disaggregated serving, real-time, near-real-time and batch ...

Software Engineer 5 - Model Runtime, AI Platform

$466K - $750K/yr

Enable next-generation GenAI workloads - Create infrastructure for multimodal and diffusion models, including distributed training, disaggregated serving, real-time, near-real-time and batch ...

Staff Research Scientist (Diffusion)

Manhattan, NY · On-site +1

This role sits right at the centre of their core research: pre-training diffusion-based transformer models for high-fidelity audio generation. What you'll work on Designing and training large-scale ...

Research Scientist - Privacy-Preserving Large-Scale Model Training & Architecture Optimization

San Jose, CA · On-site

$156K - $316K/yr

Diffusion & Unified Model Optimization - Optimize Diffusion Transformer training pipelines, including noise schedules, timestep strategies, and memory-efficient attention mechanisms. - Support ...

AI Researcher - Video Generation

San Francisco, CA · On-site

$300K/yr

Foundational diffusion models and world models for high-quality video generation * Real-time AI pipelines that turn ideas into consistent, dynamic video scenes * Multi-agent systems and orchestration ...

Black Forest Labs

Senior Solutions Architect

San Francisco, CA · On-site

$180K - $300K/yr

Have prior experience finetuning diffusion models and working with customization tools like ComfyUI * Bring a proven track record in solutions engineering, particularly on large and complex ...

Black Forest Labs

Member of Technical Staff - Image / Video Generation

Wildorado, TX · On-site

Why This Role You'll train large-scale diffusion models for image and video generation, exploring new approaches while maintaining the rigor that helps us distinguish meaningful progress from ...

World Model / Action Policy Researcher

Manhattan, NY · On-site

... diffusion models, VAEs, and transformers applied to video. • Experience with world models and predictive control - you understand how to train models that simulate dynamics and plan actions in ...

Data Scientist + ML Engineer (Gen AI)

Cupertino, CA · On-site

In this role, you will be responsible for developing, fine-tuning, and applying advanced generative AI models -- including diffusion models, large language models (LLMs), and other state-of-the-art ...

Diffusion Model information

How much do diffusion model jobs pay per hour?

As of Jun 5, 2026, the average hourly pay for diffusion model jobs in the United States is $52.18, according to ZipRecruiter salary data. Most workers in this role earn between $38.46 and $96.15 per hour, depending on experience, location, and employer.
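
The hourly figures above can be related to annual pay with simple arithmetic. The sketch below assumes a full-time schedule of 2,080 paid hours per year (40 hours a week for 52 weeks). That convention is an assumption, not something the page states, and the function name `annual_from_hourly` is hypothetical.

```python
# Assumption: 40 hours/week * 52 weeks/year = 2,080 paid hours per year.
HOURS_PER_YEAR = 40 * 52


def annual_from_hourly(hourly_rate: float, hours_per_year: int = HOURS_PER_YEAR) -> int:
    """Convert an hourly rate to an annual figure, rounded to whole dollars."""
    return round(hourly_rate * hours_per_year)


if __name__ == "__main__":
    # Hourly figures quoted in the salary answer above.
    for label, rate in [("low", 38.46), ("average", 52.18), ("high", 96.15)]:
        print(f"{label}: ${rate:.2f}/hr is about ${annual_from_hourly(rate):,}/yr")
```

Under that assumption, the $52.18 average works out to about $108,534 per year, which agrees with the annual average quoted in the infographic description on this page.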

What are the key skills and qualifications needed to thrive as a Diffusion Model Engineer, and why are they important?

To thrive as a Diffusion Model Engineer, you need a strong background in machine learning, deep learning, mathematics, and programming, usually supported by a degree in computer science or a related field. Familiarity with frameworks like PyTorch or TensorFlow, experience with large-scale data processing, and knowledge of diffusion model architectures are typically required. Creativity, problem-solving, and effective communication are crucial soft skills for collaborating with multidisciplinary teams and advancing research. These skills enable the development and implementation of cutting-edge generative models that drive innovation in AI applications.

What are some common challenges faced by professionals working with diffusion models, and how can these be addressed?

Professionals working with diffusion models often encounter challenges related to computational resource demands, model stability, and data quality. Training large diffusion models can require significant GPU resources and careful tuning to prevent issues like training instability or slow convergence. Collaborating closely with data engineers and domain experts helps ensure high-quality, diverse datasets, which are critical for realistic outputs. Staying up-to-date with the latest research and best practices can also help address these challenges and advance your skills in this rapidly evolving field.

What are diffusion models in machine learning?

Diffusion models are a type of generative model in machine learning that create data, such as images, by simulating a process where noise is gradually removed from a random signal. These models learn to reverse a diffusion process, transforming noisy data into structured outputs that resemble real examples from the training set. They have gained popularity for producing high-quality, realistic images and other media. Diffusion models are used in various applications, including image synthesis, inpainting, and audio generation.
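
The noising and denoising idea described above can be illustrated with a toy sketch. It assumes one common formulation (a DDPM-style variance-preserving process with a linear noise schedule); the page itself does not specify any formulation, and the helper names `make_alpha_bars`, `add_noise`, and `estimate_x0` are hypothetical. In a real model a trained neural network predicts the noise; here the true noise stands in for a perfect prediction.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (an assumed schedule)."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, alpha_bar, rng):
    """Forward process: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
          for x, e in zip(x0, eps)]
    return xt, eps


def estimate_x0(xt, eps_pred, alpha_bar):
    """Invert the forward step given a noise estimate (a trained network would supply eps_pred)."""
    return [(x - math.sqrt(1.0 - alpha_bar) * e) / math.sqrt(alpha_bar)
            for x, e in zip(xt, eps_pred)]


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bars = make_alpha_bars(1000)
    x0 = [1.0, -0.5, 0.25]                      # stand-in for a tiny "image"
    xt, eps = add_noise(x0, alpha_bars[500], rng)
    recovered = estimate_x0(xt, eps, alpha_bars[500])
    print("noisy sample:", xt)
    print("recovered with perfect noise estimate:", recovered)
```

With a perfect noise estimate the original sample is recovered up to floating-point error. The practical difficulty lies in training a network whose noise predictions are good enough at every step.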

What is the difference between a Diffusion Model role and a Data Scientist role?

Aspect	Diffusion Model	Data Scientist
Required Credentials	Typically a background in machine learning, statistics, or computer science	Degree in data science, statistics, computer science, or related fields
Work Environment	Research labs, AI development teams, tech companies	Business, tech firms, consulting, research institutions
Industry Usage	Used in AI image generation, generative modeling	Analyzing data, building predictive models, data visualization

While both roles involve data and algorithms, a Diffusion Model role focuses on developing generative AI models, whereas a Data Scientist analyzes data to inform business decisions. Understanding these differences helps in choosing the right career path or job focus.

Infographic showing various Diffusion Model job openings in the United States as of May 2026, with employment types broken down into 83% Full Time, 13% Part Time, 1% Temporary, and 3% Contract. Highlights a 79% Physical, 1% Hybrid, and 20% Remote job distribution, with an average salary of $108,534 per year, or $52.2 per hour.

Multimodal LLM Researcher (MLLM)

Palo Alto, CA • On-site, Remote

Other

Medical, Retirement

This job post has expired today. Applications are no longer accepted.

Job description

Multimodal LLM Researcher (MLLM)
About the Role
At Pika, we are pioneering next-generation creative infrastructure built around real-time, multimodal generation and intelligent, agentic platforms. We are seeking accomplished Multimodal LLM Researchers (LLM, VLM, and Audio LM) to drive forward our mission to make agentic real-time generative technology accessible, dynamic, and transformative for millions of creators.
As a core member of our research team, you will be integral to designing and building foundational technologies, developing novel approaches for large multimodal language models (LLMs/VLMs/Audio LMs), and orchestrating intelligent agentic systems that power scalable, interactive multimedia experiences. You will collaborate closely with engineering and product teams, shaping the future of real-time creative platforms.
What You'll Do

Lead and contribute to research efforts focused on real-time, multimodal generation (including text, image, video, and audio synthesis) as well as orchestration of agentic platform infrastructure
Design and prototype novel algorithms and architectures for high-fidelity, real-time multimodal synthesis and interactive experiences
Focus on real-time aspects of model inference and synthesis across modalities
Work on diffusion model distillation and/or develop diffusion-based world models for multimodal applications
Train and finetune autoregressive and diffusion models in LLM, VLM, or Audio LM contexts with a focus on real-time performance
Curate specific datasets, especially for video, audio, cross-modal, and sensory-rich data
Collaborate with cross-functional teams to bring research advancements into production-ready technologies
Publish work in top-tier conferences and journals; communicate research results internally and externally
Stay at the cutting edge of real-time multimodal generative AI and agentic orchestration

What We're Looking For

5+ years of relevant experience, including research during graduate studies, in large language models, vision-language models, audio language models, deep learning, or related fields
Demonstrated impact as first author on major publications in top conferences or journals (e.g., NeurIPS, CVPR, ICML, ICCV, SIGGRAPH, Interspeech)
Deep expertise in at least one area: language modeling (LLM), vision-language modeling (VLM), or audio language modeling (Audio LM)
Strong experience with generative models, including autoregressive and diffusion models, and their real-time deployment
Hands-on experience curating, constructing, or augmenting large, high-quality multimodal datasets
Experience developing and deploying real-time systems and/or agentic orchestration infrastructure
Strong programming and prototyping skills (Python, PyTorch, TensorFlow, etc.)
Passion for building creative tools and platforms that empower users
Excellent communication and collaboration skills

What We Offer

Competitive salary and substantial equity in a high-growth startup
Full health benefits + 401k matching and more
Collaborative, mission-driven team environment with major growth opportunities
Flexible on-site/remote hybrid (HQ in Palo Alto, CA)

About Pika
Pika empowers creators by building state-of-the-art agentic and multimedia platforms. Our vision is to break down technical barriers to creativity, making real-time generative and intelligent orchestration accessible to all. Join us and shape the next evolution of creative technology!
If you are a leading researcher excited by real-time multimodal AI and agentic platforms, we want to hear from you.
