Reinforcement Learning With Human Feedback Jobs (NOW HIRING)

Amazon

2026 Fall Applied Science Internship - Natural Language Processing and Speech Technologies - United

Seattle, WA · On-site

$17 - $22.75/hr

You'll dive into production-scale data, exploring innovative approaches to natural language understanding, large language models, reinforcement learning with human feedback, conversational AI, and ...

Remotasks

Threat Intel - AI / LLM Trainer - Make Your Own Hours

The successful candidate will be involved in evaluating AI-generated prompts using reinforcement learning with human feedback to grade and improve AI quality. This role is an exceptional opportunity ...

Cypress HCM

LLM Research Engineer

Mountain View, CA

$90 - $121.86/hr

Knowledge of reinforcement learning and RLHF (Reinforcement Learning with Human Feedback). Compensation: $90 - $121.86 per hour ID#: 36408719

Genmo

Research Scientist (post-training)

San Francisco, CA · On-site

Design and implement supervised fine-tuning and reinforcement learning from human feedback (RLHF ... Collaborate with cross-functional teams to integrate alignment improvements into our production ...

Character.AI

Research Engineer, AI Safety & Alignment

Redwood City, CA · On-site

... reinforcement learning from human feedback (RLHF) and fine-tuning. • Collaborate with engineering and product teams to translate safety research into practical, scalable solutions and best ...

JPMorgan Chase & Co.

AI/LLM Product Director - Executive Director

Palo Alto, CA · On-site

$180K - $285K/yr

Leverage a wide range of cutting-edge technologies such as Semantic Search, Personalized Search, Supervised Fine-Tuning (SFT), Reinforcement Learning with Human Feedback (RLHF), Retrieval-Augmented ...

JPMorgan Chase & Co

AI/LLM Product Director - Executive Director

Palo Alto, CA · On-site

$267K - $280K/yr

JP Morgan Chase

AI/LLM Product Director - Executive Director

Palo Alto, CA · On-site

$267K - $280K/yr

J.P. Morgan

AI/LLM Product Director - Executive Director

New York, NY

$254K - $266K/yr

Centific

Research Intern - Applied Reinforcement Learning

$35 - $45/hr

... human feedback and evaluate alignment improvements - Create an evaluation harness measuring ... with legal requirements.

Institute of Foundation Models

Machine Learning Engineer

Sunnyvale, CA

$150K - $450K/yr

Hands-on experience with LLM algorithms, such as Supervised Fine-Tuning (SFT) and Reinforcement Learning with Human Feedback (RLHF). * Excellent data analysis skills. Visa Sponsorship This position ...

Figure

Reinforcement Learning Engineer - Whole Body Control

San Jose, CA · Hybrid

$200K - $350K/yr

The goal of the company is to ship humanoid robots with human level intelligence. Its robots are ... We are looking for a Reinforcement Learning Engineer to develop, train, deploy, and evaluate ...

Hirekeyz Inc

Applied Research Engineer

San Francisco, CA · On-site

You'll design and implement advanced methods to align human feedback with the training of cutting-edge AI models, including techniques like Reinforcement Learning from Human Feedback (RLHF), Direct ...

Figure

Reinforcement Learning Engineer - Whole Body Control

San Jose, CA · On-site

$200K - $350K/yr

Centific

Applied Reinforcement Learning Engineer

Reward model training, preference learning, human feedback integration • Direct optimization: DPO ... Software engineering beyond research with scalable pipelines and training infrastructure • ...

Persona AI

Reinforcement Learning Engineer, Grasping

Houston, TX · On-site

They are seeking a Reinforcement Learning Engineer to join their Manipulation team, focusing on ... feedback into learned grasp policies. • Experience with contact-rich manipulation and force ...

Character.ai

Research Engineer, AI Safety & Alignment

Redwood City, CA · On-site

$225K - $400K/yr

... like reinforcement learning from human feedback (RLHF) and fine-tuning. * Collaborate with engineering and product teams to translate safety research into practical, scalable solutions and best ...

Dexian

Senior Product Associate

DE · On-site

$122K - $161K/yr

Familiarity with Prompt Engineering for agents/assistants, Supervised Fine-Tuning (SFT), Reinforcement Learning with Human Feedback (RLHF), RAG, and HITL in Agentic Ecosystems. * Knowledge of AI/LLM ...

Institute of Foundation Models

Machine Learning Engineer

Sunnyvale, CA · On-site

$150K - $450K/yr

Showing results 1-20

Reinforcement Learning With Human Feedback information

[Salary chart: hourly pay markers at $26, $40, and $69]

How much do reinforcement learning with human feedback jobs pay per hour?

As of Jul 15, 2026, the average hourly pay for reinforcement learning with human feedback jobs in the United States is $40.70, according to ZipRecruiter salary data. Most workers in this role earn between $29.57 and $52.88 per hour, depending on experience, location, and employer.

What are the key skills and qualifications needed to thrive as a Reinforcement Learning with Human Feedback (RLHF) Engineer, and why are they important?

To excel as a Reinforcement Learning with Human Feedback (RLHF) Engineer, you need a strong background in machine learning, reinforcement learning theory, and statistics, and typically an advanced degree in computer science or a related field. Familiarity with deep learning frameworks (such as TensorFlow or PyTorch) and RL libraries (like Ray RLlib), along with experience in data collection and annotation systems, is essential. Strong problem-solving, communication, and teamwork skills help you collaborate with researchers, data annotators, and other engineers. Together, these skills enable you to design and implement RLHF systems that are robust, scalable, and aligned with human values.

What is the difference between Reinforcement Learning With Human Feedback vs Reinforcement Learning Engineer?

Aspect	Reinforcement Learning With Human Feedback	Reinforcement Learning Engineer
Credentials	Typically requires knowledge of machine learning, AI, and data analysis	Requires similar credentials in machine learning, programming, and AI
Work Environment	Research labs, AI development teams, tech companies	Development teams, research labs, tech firms
Industry Usage	Used in AI training, human-in-the-loop systems, and model refinement	Designing, implementing, and optimizing reinforcement learning algorithms

Reinforcement Learning With Human Feedback roles focus on improving AI models through human input, while Reinforcement Learning Engineers design, implement, and optimize reinforcement learning algorithms. Both roles require strong machine learning skills and often share similar work environments, but their core responsibilities differ in application and focus.

What is Reinforcement Learning with Human Feedback?

Reinforcement Learning with Human Feedback (RLHF) is a machine learning technique where AI agents are trained not only through automated reward signals but also by incorporating feedback from humans. This approach helps align the agent’s behavior with human preferences, values, or safety requirements by allowing humans to guide or correct the learning process. RLHF is commonly used in developing advanced AI systems, such as language models, to ensure their outputs are helpful, safe, and aligned with user expectations. The process often involves human evaluators ranking or scoring the AI's responses, which are then used to fine-tune the model’s behavior.
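The ranking step described above is often modeled with pairwise preferences. As a minimal sketch, assuming a Bradley-Terry preference model (an illustrative choice, not something this page specifies), the following Python fits one scalar reward per response from hypothetical evaluator rankings. The names `pairwise_loss` and `fit_rewards` and the sample `preferences` data are made up for illustration; production systems typically train a neural reward model rather than a table of per-response scores.

```python
import math


def pairwise_loss(r_chosen, r_rejected):
    """Negative log-likelihood that the chosen response beats the rejected
    one under a Bradley-Terry model: -log(sigmoid(r_chosen - r_rejected))."""
    diff = r_chosen - r_rejected
    # log(1 + exp(-diff)) equals -log(sigmoid(diff)).
    return math.log1p(math.exp(-diff))


def fit_rewards(n_responses, preferences, lr=0.5, steps=200):
    """Fit one scalar reward per response from (winner, loser) index pairs
    by gradient ascent on the Bradley-Terry log-likelihood."""
    rewards = [0.0] * n_responses
    for _ in range(steps):
        for winner, loser in preferences:
            # Model's current probability that the human-preferred response wins.
            p_win = 1.0 / (1.0 + math.exp(-(rewards[winner] - rewards[loser])))
            # Gradient of log(p_win) with respect to the winner's reward.
            grad = 1.0 - p_win
            rewards[winner] += lr * grad
            rewards[loser] -= lr * grad
    return rewards


# Hypothetical data: evaluators ranked response 0 above 1, and 1 above 2.
preferences = [(0, 1), (1, 2), (0, 2)]
rewards = fit_rewards(3, preferences)
```

In a full RLHF pipeline, a learned reward of this kind would then serve as the training signal when fine-tuning the model; the sketch covers only the preference-fitting step.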

What are the typical collaborations involved for a Reinforcement Learning with Human Feedback (RLHF) specialist within a machine learning team?

As an RLHF specialist, you often work closely with data scientists, machine learning engineers, and domain experts to design effective feedback mechanisms and reward models. Collaboration with annotation teams or subject matter experts is common, as high-quality human feedback is crucial for training robust RLHF models. You may also partner with product managers and UX researchers to ensure that the models align with user needs and ethical considerations. Regular cross-functional meetings and code reviews help maintain alignment and foster innovation across teams.

Reinforcement Learning With Human Feedback jobs near you

Infographic showing various Reinforcement Learning With Human Feedback job openings in the United States as of July 2026, with employment types broken down into 82% Full Time, 17% Part Time, and 1% Contract. Highlights a 90% Physical, 1% Hybrid, and 9% Remote job distribution, with an average salary of $84,648 per year, or $40.70 per hour.

2026 Fall Applied Science Internship - Natural Language Processing and Speech Technologies - United

Amazon

Seattle, WA • On-site

$17 - $22.75/hr

Full-time

Medical, Retirement

Re-posted 23 hours ago

Amazon rating

7.4

Based on 6,968 frontline employees who took The Breakroom Quiz

6th of 39 rated national retailers

Job description

Shape the Future of Human-Machine Interaction
Are you a master of natural language processing, eager to push the boundaries of conversational AI? Amazon is seeking exceptional graduate students to join our cutting-edge research team, where they will have the opportunity to explore and push the boundaries of natural language processing (NLP), natural language understanding (NLU), and speech recognition technologies.
Imagine waking up each morning, fueled by the excitement of tackling complex research problems that have the potential to reshape the world. You'll dive into production-scale data, exploring innovative approaches to natural language understanding, large language models, reinforcement learning with human feedback, conversational AI, and multimodal learning. Your days will be filled with brainstorming sessions, coding sprints, and lively discussions with brilliant minds from diverse backgrounds.
Throughout your journey, you'll have access to unparalleled resources, including state-of-the-art computing infrastructure, cutting-edge research papers, and mentorship from industry luminaries. This immersive experience will not only sharpen your technical skills but also cultivate your ability to think critically, communicate effectively, and thrive in a fast-paced, innovative environment where bold ideas are celebrated.
Join us at the forefront of applied science, where your contributions will shape the future of AI and propel humanity forward. Seize this extraordinary opportunity to learn, grow, and leave an indelible mark on the world of technology.
Amazon has positions available for Natural Language Processing & Speech Applied Science Internships in, but not limited to, Bellevue, WA; Boston, MA; Cambridge, MA; New York, NY; Santa Clara, CA; Seattle, WA; Sunnyvale, CA.
Key job responsibilities
We are particularly interested in candidates with expertise in: NLP/NLU, LLMs, Reinforcement Learning, Human Feedback/HITL, Deep Learning, Speech Recognition, Conversational AI, Natural Language Modeling, Multimodal Learning.
In this role, you will work alongside global experts to develop and implement novel, scalable algorithms and modeling techniques that advance the state-of-the-art in areas at the intersection of Natural Language Processing and Speech Technologies. You will tackle challenging, groundbreaking research problems on production-scale data, with a focus on natural language processing, speech recognition, text-to-speech (TTS), text recognition, question answering, NLP models (e.g., LSTM, transformer-based models), signal processing, information extraction, conversational modeling, audio processing, speaker detection, large language models, multilingual modeling, and more.
The ideal candidate should possess the ability to work collaboratively with diverse groups and cross-functional teams to solve complex business problems. A successful candidate will be a self-starter, comfortable with ambiguity, with strong attention to detail and the ability to thrive in a fast-paced, ever-changing environment.
A day in the life
- Develop novel, scalable algorithms and modeling techniques that advance the state-of-the-art in natural language processing, speech recognition, text-to-speech, question answering, and conversational modeling.
- Tackle groundbreaking research problems on production-scale data, leveraging techniques such as LSTM, transformer-based models, signal processing, information extraction, audio processing, speaker detection, large language models, and multilingual modeling.
- Collaborate with cross-functional teams to solve complex business problems, leveraging your expertise in NLP/NLU, LLMs, reinforcement learning, human feedback/HITL, deep learning, speech recognition, conversational AI, natural language modeling, and multimodal learning.
- Thrive in a fast-paced, ever-changing environment, embracing ambiguity and demonstrating strong attention to detail.
BASIC QUALIFICATIONS
- Are enrolled in a PhD program
- Can relocate to where the internship is based
- Experience programming in Java, C++, Python or related language
- Experience with one or more of the following: Natural Language Processing/Understanding, Large Language Models, Reinforcement Learning, Human Feedback/HITL, Deep Learning, Speech Recognition, Conversational AI, Natural Language Modeling, Multimodal Learning
- Must be available for full-time (40 hours per week) internship for the whole duration of the internship
PREFERRED QUALIFICATIONS
- Have publications at top-tier peer-reviewed conferences or journals
- Experience in designing experiments and statistical analysis of results
- Experience in building speech recognition, machine translation and natural language processing systems (e.g., commercial speech products or government speech projects)
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.
The starting pay for this position is listed below. Final starting pay will be based on factors including experience, qualifications, and location. Starting Day 1 of employment, Amazon offers EAP, Mental Health Support, Medical Advice Line, 401(k) matching. Learn more about our benefits at https://hiring.amazon.com/why-amazon/benefits.
USA, WA, Seattle - 142,800.00 - 193,200.00 USD annually

About Amazon

Sourced by ZipRecruiter

Amazon.com, Inc., commonly known as Amazon, is an American multinational technology company. It was founded by Jeff Bezos in 1994 and initially started as an online marketplace for books. Since then, Amazon has expanded its operations and become one of the largest e-commerce companies in the world. Amazon's primary business is its online retail platform, where customers can purchase a vast array of products, including electronics, clothing, books, home goods, and much more. The company offers a convenient and user-friendly shopping experience, with features such as fast shipping, customer reviews, and personalized recommendations.

In addition to its e-commerce platform, Amazon has diversified its business into various other areas. One of its notable ventures is Amazon Web Services (AWS), a comprehensive cloud computing platform that provides services such as storage, compute power, and database management to individuals and businesses. AWS has become a leader in the cloud computing industry, powering many websites and applications worldwide.

Amazon has also developed its own consumer electronics, including the popular Amazon Kindle e-reader, Fire tablets, Fire TV streaming devices, and the Alexa-powered Echo smart speakers. The Alexa voice assistant, integrated into these devices, allows users to interact with their devices using voice commands, perform tasks, and access information.

Furthermore, Amazon has expanded into media and entertainment. It operates Prime Video, a streaming service that offers a wide range of movies, TV shows, and original content. Amazon Music provides a platform for streaming and purchasing digital music, while Audible offers audiobooks and other audio content.

The company's commitment to customer satisfaction and convenience is demonstrated by its membership program, Amazon Prime. Prime members receive various benefits, including free two-day shipping, access to streaming services, exclusive deals, and more.

Industry

IT services, book publishers, retail, real estate, and computer and electronic product manufacturing

Company size

10,000+ Employees

Headquarters location

Seattle, WA, US

Website

amazon.com

What Amazon employees say

Pay
- Most people get paid breaks
- Most people don’t get paid when they’re sick
- The job rarely spills into unpaid time

Benefits
- Sick days use up paid time off
- Only some part-timers can get health insurance
- Most part-timers get paid time off

Hours and flexibility
- Less than 4 weeks notice of work schedule
- Some people worry about their hours
- Only some people can choose their shifts

Workplace
- Most people feel treated with respect
- Most people get breaks without interruption
- Some people are stressed out
