26WD98297
26WD98297, Research Lead / Principal Scientist & Manager | Post-Training | Alignment | Reinforcement Learning | Autodesk AI Lab: Toronto | Remote
French translation to follow!/Traduction francaise a suivre!
About Autodesk AI Lab
Autodesk AI Lab advances state-of-the-art research across generative AI, multimodal foundation models, reasoning systems, and human-AI collaboration. Our work has direct impact across the industries that shape the physical world. We are an active contributor to the global research community and collaborate closely with leading academic and industry labs.
At Autodesk, we are building a diverse workplace and an inclusive culture to give more people the chance to imagine, design, and make a better world. Autodesk is proud to be an equal opportunity employer and considers all qualified applicants for employment without regard to race, color, religion, age, sex, sexual orientation, gender identity, national origin, disability, veteran status, or any other legally protected characteristic.
Position Overview
Foundation models are reshaping how engineers, architects, and designers work - but turning them into reliable, domain-capable systems is still an open research problem.
Autodesk touches more of the physical world than almost any other software company. The products we build are used to design skyscrapers, manufacture aircraft, and produce films. AI is now central to how those workflows are evolving - and post-training is the layer that makes the difference between a capable model and one that is dependable and robust in our customers' high-precision domains.
As Research Lead for Post-Training & Alignment, you will own Autodesk's research strategy for transforming foundation models into systems that are reliable, aligned, and genuinely useful in complex, domain-specific workflows. This is a deeply technical leadership role - you will shape research direction, drive key architectural decisions, and remain close to the work.
You will lead a growing team of AI scientists while continuing to contribute directly to research: running experiments, developing novel algorithms, and publishing at top-tier venues.
Autodesk's domains - architecture, engineering, construction, manufacturing, media & entertainment - provide a distinctive research environment: rich structured data, long-horizon reasoning tasks, and real-world evaluation grounded in professional workflows. Decades of investment in physics simulation engines, CAD kernels, and computational design tools give us something most labs don't have: high-fidelity, domain-grounded verifiers that can serve as reward signals for post-training. Rather than relying solely on human preference data, we can ground reinforcement learning in the laws of physics and the constraints of real engineering. These are exactly the kinds of challenges - and assets - that set post-training and alignment research here apart.
We publish at NeurIPS, ICML, ICLR, CVPR, and SIGGRAPH. We collaborate with leading academic and industry labs. And we have a direct line from research advances to product impact at scale. This is not a role where research sits behind a wall from engineering - you will see your work matter.
This role reports to the Senior Director of AI Research within Autodesk AI Lab.
Post-Training | Alignment | Reinforcement Learning
Autodesk AI Lab: London | San Francisco | Toronto | Remote (US/CA/EU)
Responsibilities
Research & Technical Leadership
Own post-training strategy for model development - from RLHF and preference optimization to agentic systems and long-horizon reasoning
Develop novel algorithms that improve model reliability, controllability, and alignment
Make principled architectural decisions about when to address challenges at the pre-training, post-training, or system level
Design and run experiments that shape model behavior, robustness, and reasoning quality
Partner with infrastructure teams to build scalable, reproducible post-training workflows
Contribute to publications, patents, and Autodesk's external research visibility
Evaluation & Model Quality
Design evaluation frameworks for long-horizon reasoning, tool use, agentic behavior, safety, and real-world workflow completion
Lead rigorous model analysis and interpretability efforts
Drive human-in-the-loop evaluation with high annotation quality and sound scientific methodology
Establish model readiness criteria and provide go/no-go recommendations for releases
Communicate technical risks, limitations, and trade-offs clearly to leadership
Team & Organizational Leadership
Manage, mentor, and grow a team of AI scientists
Set technical direction and research priorities across post-training and alignment initiatives
Foster a research culture grounded in scientific rigor, reproducibility, and fast iteration
Help recruit world-class talent across ML, RL, alignment, and foundation models
Partner closely with pre-training teams, infrastructure, product organizations, and other stakeholders
Translate research trade-offs into clear, decision-ready guidance for leadership
Minimum Qualifications
We care about research judgment and outcomes, not credential checklists. Strong candidates will typically have:
Deep hands-on expertise in reinforcement learning for foundation models, and fluency with post-training methods (RLHF, RLAIF, DPO, PPO, or adjacent approaches)
Proven experience leading or mentoring technical research teams - whether in an academic lab, AI research organization, or industry setting
Strong intuition for model behavior, alignment challenges, and post-training trade-offs
Experience designing evaluation systems and thinking rigorously about what it means for a model to be ready
Ability to communicate complex technical trade-offs clearly to both technical and non-technical audiences
A PhD or equivalent depth of industry research experience in ML, RL, AI, or a related field
Preferred Qualifications
Experience at a frontier model lab or advanced applied AI organization
A strong publication record at leading ML or AI venues
Background in alignment research, preference learning, or agentic AI
Experience deploying or supporting production AI systems
Familiarity with large-scale training infrastructure and compute trade-offs
The Ideal Candidate
In the first year, success means:
Post-trained models show measurable improvements in reliability, alignment, reasoning quality, and domain usefulness
Evaluation metrics and release criteria are trusted and adopted across teams
The team delivers high-quality research with practical impact - and team members are growing into stronger, more independent researchers
Leadership relies on your judgment for model readiness, technical direction, and risk assessment
Autodesk AI Lab advances its reputation as a serious contributor to frontier AI research
______________________________________________________________________________________________________________
26WD98297, Responsable de recherche / Chercheur principal et gestionnaire | Post-entrainement | Alignement | Apprentissage par renforcement | Autodesk AI Lab : Toronto | Teletravail
A propos d'Autodesk AI Lab
Autodesk AI Lab mene des recherches de pointe dans les domaines de l'IA generative, des modeles de base multimodaux, des systemes de raisonnement et de la collaboration entre l'humain et l'IA. Nos travaux ont un impact direct sur les secteurs qui faconnent le monde physique. Nous contribuons activement a la communaute mondiale de la recherche et collaborons etroitement avec des laboratoires universitaires et industriels de premier plan.
Chez Autodesk, nous construisons un lieu de travail diversifie et une culture inclusive afin de donner a davantage de personnes la chance d'imaginer, de concevoir et de creer un monde meilleur. Autodesk est fier d'etre un employeur garantissant l'egalite des chances et prend en consideration toutes les candidatures qualifiees sans distinction de race, de couleur, de religion, d'age, de sexe, d'orientation sexuelle, d'identite de genre, d'origine nationale, de handicap, de statut d'ancien combattant ou de toute autre caracteristique protegee par la loi.
Presentation du poste
Les modeles de base sont en train de transformer la facon dont les ingenieurs, les architectes et les concepteurs travaillent - mais l'entrainement de modeles de base fiables et adaptes a un domaine specifique reste un probleme de recherche non resolu.
Autodesk touche davantage le monde physique que presque toute autre entreprise de logiciels. Les produits que nous developpons sont utilises pour concevoir des gratte-ciel, fabriquer des avions et produire des films. L'IA est desormais au coeur de l'evolution de ces flux de travail - et le post-entrainement est la couche qui fait la difference entre un modele performant et un modele fiable et robuste dans les domaines de haute precision de nos clients.
En tant que responsable de la recherche en post-entrainement et alignement, vous serez en charge de la strategie de recherche d'Autodesk visant a transformer les modeles de base en systemes fiables, alignes et veritablement utiles dans des flux de travail complexes et specifiques a un domaine. Il s'agit d'un poste de direction hautement technique : vous definirez l'orientation de la recherche, piloterez les decisions architecturales cles et resterez au coeur de l'action.
Vous dirigerez une equipe grandissante de chercheurs en IA tout en continuant a contribuer directement a la recherche : en menant des experiences, en developpant des algorithmes novateurs et en publiant dans des conferences de premier plan.
Les domaines d'Autodesk - architecture, ingenierie, construction, fabrication, medias et divertissement - offrent un environnement de recherche unique : des donnees structurees riches, des taches de raisonnement a long terme et une evaluation en conditions reelles ancree dans des flux de travail professionnels. De maniere unique, des decennies d'investissement dans les moteurs de simulation physique, les noyaux de CAO et les outils de conception computationnelle nous conferent un atout que la plupart des laboratoires n'ont pas : des verificateurs haute fidelite, ancres dans le domaine, pouvant servir de signaux de recompense pour le post-entrainement. Plutot que de nous fier uniquement aux donnees de preferences humaines, nous pouvons ancrer l'apprentissage par renforcement dans les lois de la physique et les contraintes de l'ingenierie reelle. Ce sont precisement ces types de defis - et d'atouts - qui rendent la recherche sur le post-entrainement et l'alignement ici veritablement unique.
Nous publions dans NeurIPS, ICML, ICLR, CVPR et SIGGRAPH. Nous collaborons avec des laboratoires universitaires et industriels de premier plan. Et nous disposons d'un lien direct entre les avancees de la recherche et l'impact des produits a grande echelle. Il ne s'agit pas d'un poste ou la recherche est isolee de l'ingenierie : vous verrez que votre travail a un impact reel.
Ce poste est rattache au directeur principal de la recherche en IA au sein de l'Autodesk AI Lab.
Post-entrainement | Alignement | Apprentissage par renforcement
Autodesk AI Lab : Londres | San Francisco | Toronto | A distance (Etats-Unis/Canada/UE)
Responsabilites
Leadership en matiere de recherche et de technologie
Definir la strategie de post-entrainement pour le developpement de modeles - du RLHF et de l'optimisation des preferences aux systemes agentiques et au raisonnement a long terme
Developper des algorithmes novateurs qui ameliorent la fiabilite, la controlabilite et l'alignement des modeles
Prendre des decisions architecturales fondees sur des principes pour determiner quand relever les defis au niveau du pre-entrainement, du post-entrainement ou du systeme
Concevoir et mener des experiences qui faconnent le comportement, la robustesse et la qualite du raisonnement des modeles
Collaborer avec les equipes d'infrastructure pour mettre en place des workflows de post-entrainement evolutifs et reproductibles
Contribuer aux publications, aux brevets et a la visibilite de la recherche externe d'Autodesk
Evaluation et qualite des modeles
Concevoir des cadres d'evaluation pour le raisonnement a long terme, l'utilisation des outils, le comportement agentique, la securite et l'execution des flux de travail en conditions reelles
Diriger des efforts rigoureux d'analyse et d'interpretabilite des modeles
Mener des evaluations human-in-the-loop avec une annotation de haute qualite et une methodologie scientifique solide
Etablir des criteres de maturite des modeles et fournir des recommandations de lancement ou de suspension pour les versions
Communiquer clairement les risques techniques, les limites et les compromis a la direction
Direction d'equipe et organisationnelle
Gerer, encadrer et developper une equipe de chercheurs en IA
Definir l'orientation technique et les priorites de recherche pour les initiatives de post-entrainement et d'alignement
Favoriser une culture de recherche fondee sur la rigueur scientifique, la reproductibilite et l'iteration rapide
Contribuer au recrutement de talents de classe mondiale dans les domaines du ML, du RL, de l'alignement et des modeles de base
Travailler en etroite collaboration avec les equipes de pre-entrainement, les equipes ...
Autodesk is changing how the world is designed and made. Our technology spans architecture, engineering, construction, product design, manufacturing, media, and entertainment, empowering innovators everywhere to solve challenges big and small. From greener buildings to smarter products to more mesmerizing blockbusters, Autodesk software helps our customers to design and make a better world for all. For more information visit autodesk.com or follow @autodesk.