Manager RLHF Jobs in New York (NOW HIRING) Jul 26

Scale AI

Tech Lead Manager- MLRE, ML Systems

Manhattan, NY · On-site

They are seeking a Tech Lead Manager for their ML Systems team to build and optimize a training and ... RLHF/RLVR and related algorithms like PPO/GRPO etc. • Strong software engineering skills ...

Two Sigma Investments, LP

Post-Training Research Scientist

New York, NY · On-site

$165K - $300K/yr

RLHF, DPO, RLAIF, or related methods * Deep understanding of distributed training infrastructure: multi-node GPU clusters, training stability, checkpointing * Track record managing large-scale ...

Merican

Developer

New York, NY · On-site

Deep understanding of Data preprocessing, Prompt management, Caching, Validation, Advanced RAG, RLHF, and success measurement. * Thorough understanding of LLMOps, data pipelines and other common ...

J.P. Morgan

AI/LLM Product Director - Executive Director

New York, NY

$254K - $266K/yr

Oversees the product roadmap, vision, development, execution, risk management, and business growth ... Feedback (RLHF), Retrieval-Augmented Generation (RAG), and Agents to enhance user experiences.

HumanSignal

Delivery Lead

New York, NY · Remote

$110K - $140K/yr

... RLHF, annotation, model evaluation) * STEM background or strong technical fluency * Python & React working knowledge * Experience managing distributed contributor workforces at scale * Background in ...

Citigroup, Inc.

Generative AI - Group Manager - Senior Vice President

Jersey City, NJ · On-site

Manage the project lifecycle from ideation and scoping to deployment and post-launch support ... RLHF, multi-task learning). * Model Optimization: Expertise in model compression and quantization ...

Apple

ML Researcher, Apple Foundation Models

New York, NY · On-site

... to manage their own context in long-horizon tasks. This is applied research with direct product ... RLHF, GRPO, PPO, RLVR, reward modeling, RL scaling laws Code generation and coding agents ...

Reflection

Forward Deployed Engineer - LLM Post-training

Manhattan, NY · On-site

Familiarity with SFT, DPO, RLHF, or similar techniques. • Understanding of evaluation methodology ... GPUs, compute management, debugging common training failures. You don't need to be an infra ...
