About the Role
The Nuance Research Fellowship is a 3-month engagement for early-career researchers who want to work at the frontier of Multimodal LLMs, generative modeling, and real-time audiovisual AI. The program is open to current PhD students (on internship, leave, or in their final stretch) and recent graduates from BS, MS, or PhD programs.
As a fellow, you'll own a real research problem inside one of our core workstreams: pretraining, post-training, RL, evaluation, data, multimodal modeling, generative modeling, or inference. Depending on your strengths, this could mean training omni models from scratch, improving real-time audio-video-language reasoning, building evals for full-duplex interaction, or exploring model families such as flow matching and diffusion for controllable, high-fidelity generation.
This is designed as a mutual trial for a long-term role at Nuance, not a short standalone internship. At the end of the three months, we'll decide together whether you convert to a full-time Member of Technical Staff (MTS) role. Fellows who convert step into MTS-level scope and ownership from day one.
What You'll Own
- Own a concrete research problem from framing through experiments, analysis, and integration into the Nuance stack
- Work on frontier Multimodal LLM systems spanning audio, video, language, and real-time interaction
- Explore and adapt modern generative modeling techniques, including flow matching, diffusion, autoregressive modeling, and hybrid approaches where they fit
- Read papers, reproduce key results, and turn promising ideas into production-grade experiments
- Design, instrument, debug, and interpret training and evaluation runs with scientific rigor
- Build evaluation harnesses, benchmarks, and analysis tooling for real-time conversational agents
- Take research-grade prototypes and turn them into systems that ship
- Work closely with senior researchers and engineers across the team, and ramp up on the stack quickly
What We're Looking For
Hard requirements:
- Strong working knowledge of PyTorch and deep learning - you can train a model, debug a training run, and reason about what's happening at the loss level
- At least one first-author paper in the main conference proceedings of a tier 1 venue (NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, ACL, EMNLP, NAACL, ICASSP, Interspeech, MLSys, SIGGRAPH, or a comparable venue), or equivalent evidence of unusually strong research taste and execution
- Genuine interest in joining Nuance full-time after the fellowship; we are looking for long-term partners on this journey
Beyond the hard bar:
- Currently enrolled in, or recently graduated from, a BS, MS, or PhD program in CS, ML, math, physics, EE, or a related field
- Strong programming ability and software engineering instincts
- High agency - when you see something broken or slow, you fix it; when you see an opportunity, you take it before being asked
- A bias toward shipping over polishing, with the judgment to know when each matters
- The appetite to pick up anything and optimize the hell out of it
Bonus Points
- Hands-on experience with Multimodal LLMs, omni models, audio-language models, video-language models, speech generation, or real-time interactive agents
- Research or implementation experience with flow matching, diffusion models, rectified flows, autoregressive generation, neural codecs, or related generative modeling methods
- Multiple tier 1 publications, or a paper that received significant attention (best paper award, broad adoption, high citation impact for its age)
- Olympiad medals or finalist-level results in IMO, IPhO, IOI, IChO, IBO, IMC, or equivalent
- Codeforces grandmaster, ICPC world finals, Putnam fellow, Kaggle grandmaster, or similar
- Open-source contributions to major ML frameworks or research codebases
- A track record of independent projects that made something noticeably faster, smaller, or better
Compensation
$200,000 - $250,000 annualized base salary during the 3-month fellowship (paid as a prorated stipend). Fellows who convert to a full-time Member of Technical Staff role step into a base salary of $250,000 - $350,000 plus meaningful equity.