Online RLHF Jobs (NOW HIRING)

Applied Scientist II, Alexa International Team

Build novel online & offline evaluation metrics and methodologies for multimodal personal digital assistants. * Fine-tune/post-train LLMs using techniques like SFT, DPO, RLHF, and RLAIF. * Set up ...

Amazon

Bellevue, WA · On-site

BeyondTrust

Staff AI Data Scientist

Design experiments, define success metrics, and run rigorous offline and online evaluations (A/B ... Familiarity with LLM fine-tuning techniques (LoRA, RLHF, instruction tuning) and serving ...

University of Virginia

Research Scientist

Charlottesville, VA · On-site

Experience with fine-tuning techniques (supervised fine-tuning, instruction tuning, RLHF, domain ... This position will not sponsor applicants requiring a visa How to Apply Please apply online through ...

BMC Software, Inc

Principal AI Engineer

Santa Clara, CA · On-site

... CI/CD, online evaluators on production traffic, calibrated LLM-as-a-judge graders, and A/B ... Experience with model customization (SFT, RLHF, DPO/GRPO), eval/observability platforms and ...

Airbnb

Machine Learning Engineer, Community Support Engineering

San Francisco, CA

Hands-on expertise in LLM, including pretraining, fine-tuning (SFT, RLHF, GRPO), prompt engineering ... online application.

BMC Software, Inc.

Principal AI Engineer

Santa Clara, CA · On-site

Wizard

AI Applied Scientist

$225K - $280K/yr

... online * Drive measurable improvements to LLM judge quality (calibration, fine-tuning where ... Direct experience with LLM-based systems: judge models, RAG, prompt engineering, fine-tuning, RLHF ...

TikTok

Senior Software Engineer - Global E-Commerce Search Infrastructure (TikTok Shop)

Seattle, WA · On-site

$202K - $368K/yr

... latency online services. - Proficiency in C++, Go, or Java (C++ preferred); strong systems ... SFT, RL (RLHF / PPO / GRPO), distillation, or reward model design. - Research publications at ...

Netflix

Research Scientist 4 - Machine Learning and Inference Research, LLM Post-Training

Los Gatos, CA

D. in Computer Science or a related field with a specialization in post-training LLMs for downstream tasks, especially using RL (e.g., RLVR, RLHF, offline or online, policy- or value-based), and ...

Lam Research Corporation

Software Engineer Sys 5

Fremont, CA · On-site

$189K - $224K/yr

Understanding of reinforcement learning, including RLHF, RLAIF, offline and online RL, safety-aware reward design, and feedback loops. * Expertise in AI evaluation platforms, synthetic data ...

Amazon

Senior Applied Scientist

Seattle, WA · On-site

At Amazon Selection and Catalog Systems (ASCS), our mission is to power the online buying ... tuning, RLHF, or agentic architectures Amazon is an equal opportunity employer and does not ...

Netflix

Research Scientist 4 - Machine Learning and Inference Research, LLM Post-Training

Los Angeles, CA · On-site

D in Computer Science or a related field with a specialization in post-training LLMs for downstream tasks, especially using RL (e.g., RLVR, RLHF, offline or online, policy- or value-based), and ...

Apple

AIML - Sr Machine Learning Engineer, Data and ML Innovation

Cupertino, CA · On-site +1

$150K - $277K/yr

... online metrics, covering reasoning, tool use, and task success. Design and maintain verifiers ... RLHF, DPO, PPO). Strong software engineering fundamentals: debugging, testing, code reviews, and ...

Netflix

Research Scientist 4 - Machine Learning and Inference Research, LLM Post-Training

Los Angeles, CA

Apple

AIML - Sr Machine Learning Engineer, Data and ML Innovation

Cupertino, CA · On-site +1

$150K - $277K/yr

Netflix

Research Scientist 4 - Machine Learning and Inference Research, LLM Post-Training

New York, NY · On-site

Netflix

Research Scientist 4 - Machine Learning and Inference Research, LLM Post-Training

New York, NY · On-site

Netflix

Research Scientist 4 - Machine Learning and Inference Research, LLM Post-Training

Los Gatos, CA · On-site

Apple

Machine Learning Engineer, Proactive

Cupertino, CA · On-site

$216K - $324K/yr

... techniques (e.g RLHF, Reward model, DPO, PPO, GRPO etc.), Parameter efficient fine-tuning ... online learning and recommendation systems Experience working with machine learning or LLM model ...

Hyundai Capital

Sr. AI Agent / Prompt Engineer

Irvine, CA

... online evaluations for agent behavior (task success, tool success, error recovery). Diagnose ... Experience with multiagent coordination, RLHF/RLAIF, or feedback loops preferred. Knowledge of ...

Showing results 1-20

Online RLHF information

Salary scale: $17.5K · $40.6K (average) · $86K

How much do Online RLHF jobs pay per year?

As of Jul 11, 2026, the average yearly pay for Online RLHF jobs in the United States is $40,596, according to ZipRecruiter salary data. Most workers in this role earn between $25,000 and $43,500 per year, depending on experience, location, and employer.

What are some common challenges faced by Online RLHF (Reinforcement Learning from Human Feedback) specialists when collaborating with cross-functional teams?

Online RLHF specialists often work closely with machine learning engineers, data annotators, and product managers. A common challenge is ensuring that feedback from human annotators is accurately integrated into model training, which requires clear communication and well-defined annotation guidelines. Additionally, balancing the pace of model updates with the need for high-quality human feedback can be demanding. Effective collaboration and regular syncs are essential to maintain alignment and achieve project goals.

What is the difference between Online RLHF vs Online RLHF?

Aspect	Online RLHF	Online RLHF
Credentials	Background in machine learning, reinforcement learning, and data analysis	Background in machine learning, reinforcement learning, and data analysis
Work Environment	Often remote; part-time or contract work is common	Often remote; part-time or contract work is common
Industry Usage	AI model training and evaluation	AI model training and evaluation
Job Focus	Providing human feedback on AI model outputs	Providing human feedback on AI model outputs

Online RLHF and Online RLHF are the same term, so there is no difference in role. Both refer to providing human feedback used to train AI models, often remotely. Any difference is in terminology rather than job function.

What are Online RLHF jobs?

Online RLHF (Reinforcement Learning from Human Feedback) jobs typically involve helping to train AI models by providing human feedback on their outputs. Workers in these roles might review model responses, rate the quality of generated text, or suggest improvements to help the AI learn to produce better results. These jobs are often remote and can be done part-time or as contract work. They play a crucial role in improving the safety, usefulness, and accuracy of AI systems by aligning them more closely with human preferences.

What are the key skills and qualifications needed to thrive as an Online RLHF (Reinforcement Learning from Human Feedback) Specialist, and why are they important?

To thrive as an Online RLHF Specialist, you need a strong background in machine learning, reinforcement learning, and data analysis, typically supported by a degree in computer science or a related field. Familiarity with technical tools like Python, PyTorch or TensorFlow, and experience with human feedback systems or annotation platforms are highly valuable. Strong problem-solving, attention to detail, and the ability to communicate complex concepts clearly are crucial soft skills. These qualifications ensure the effective training and evaluation of AI models, leading to more accurate and reliable machine learning systems.

Infographic showing various Online RLHF job openings in the United States as of July 2026, with employment types broken down into 1% Locum Tenens, 1% As Needed, 61% Full Time, 35% Part Time, 1% Temporary, and 1% Contract. Highlights an 81% Physical, 1% Hybrid, and 18% Remote job distribution, with an average salary of $40,596 per year, or $19.5 per hour.

Applied Scientist II, Alexa International Team

Amazon

Bellevue, WA • On-site

Full-time

Medical, Dental, Vision, Life, Retirement, PTO

Re-posted 26 days ago

Amazon rating

7.4

Based on 6,956 frontline employees who took The Breakroom Quiz

6th of 39 rated national retailers

Job description

Alexa International is looking for a passionate, talented, and inventive Applied Scientist to help build industry-leading technology with Large Language Models (LLMs) and multimodal systems, requiring strong deep learning and generative models knowledge. You will contribute to developing novel solutions and deliver high-quality results that impact Alexa's international products and services.
Key job responsibilities
As an Applied Scientist with the Alexa International team, you will work with talented peers to develop novel algorithms and modeling techniques to advance the state of the art with LLMs. Your work will directly impact our international customers in the form of products and services that make use of digital assistant technology. You will leverage Amazon's heterogeneous data sources, unique and diverse international customer nuances and large-scale computing resources to accelerate advances in text, voice, and vision domains in a multimodal setup. The ideal candidate possesses a solid understanding of machine learning, natural language understanding, modern LLM architectures, LLM evaluation & tooling, and a passion for pushing boundaries in this vast and quickly evolving field. They thrive in fast-paced environments to tackle complex challenges, excel at swiftly delivering impactful solutions while iterating based on user feedback, and collaborate effectively with cross-functional teams.
A day in the life
* Analyze, understand, and model customer behavior and the customer experience based on large-scale data.
* Build novel online & offline evaluation metrics and methodologies for multimodal personal digital assistants.
* Fine-tune/post-train LLMs using techniques like SFT, DPO, RLHF, and RLAIF.
* Set up experimentation frameworks for agile model analysis and A/B testing.
* Collaborate with partner teams on LLM evaluation frameworks and post-training methodologies.
* Contribute to end-to-end delivery of solutions from research to production, including reusable science components.
* Communicate solutions clearly to partners and stakeholders.
* Contribute to the scientific community through publications and community engagement.
BASIC QUALIFICATIONS
- PhD, or Master's degree and 4+ years of CS, CE, ML or related field experience
- Experience in patents or publications at top-tier peer-reviewed conferences or journals
- Experience programming in Java, C++, Python or related language
- Experience in any of the following areas: algorithms and data structures, parsing, numerical optimization, data mining, parallel and distributed computing, high-performance computing
PREFERRED QUALIFICATIONS
- Experience in professional software development
- PhD
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.
The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at https://amazon.jobs/en/benefits.
USA, WA, Bellevue - 142,800.00 - 193,200.00 USD annually

About Amazon

Sourced by ZipRecruiter

Amazon.com, Inc., commonly known as Amazon, is an American multinational technology company. It was founded by Jeff Bezos in 1994 and initially started as an online marketplace for books. Since then, Amazon has expanded its operations and become one of the largest e-commerce companies in the world. Amazon's primary business is its online retail platform, where customers can purchase a vast array of products, including electronics, clothing, books, home goods, and much more. The company offers a convenient and user-friendly shopping experience, with features such as fast shipping, customer reviews, and personalized recommendations. In addition to its e-commerce platform, Amazon has diversified its business into various other areas. One of its notable ventures is Amazon Web Services (AWS), a comprehensive cloud computing platform that provides services such as storage, compute power, and database management to individuals and businesses. AWS has become a leader in the cloud computing industry, powering many websites and applications worldwide. Amazon has also developed its own consumer electronics, including the popular Amazon Kindle e-reader, Fire tablets, Fire TV streaming devices, and the Alexa-powered Echo smart speakers. The Alexa voice assistant, integrated into these devices, allows users to interact with their devices using voice commands, perform tasks, and access information. Furthermore, Amazon has expanded into media and entertainment. It operates Prime Video, a streaming service that offers a wide range of movies, TV shows, and original content. Amazon Music provides a platform for streaming and purchasing digital music, while Audible offers audiobooks and other audio content. The company's commitment to customer satisfaction and convenience is demonstrated by its membership program, Amazon Prime. Prime members receive various benefits, including free two-day shipping, access to streaming services, exclusive deals, and more.

Industry

IT services, book publishers, retail, real estate, and computer and electronic product manufacturing

Company size

10,000+ Employees

Headquarters location

Seattle, WA, US

Website

amazon.com

What Amazon employees say

Pay

Most people get paid breaks

Most people don’t get paid when they’re sick

The job rarely spills into unpaid time

Benefits

Sick days use up paid time off

Only some part-timers can get health insurance

Most part-timers get paid time off

Hours and flexibility

Less than 4 weeks notice of work schedule

Some people worry about their hours

Only some people can choose their shifts

Workplace

Most people feel treated with respect

Most people get breaks without interruption

Some people are stressed out
