Job Summary:
XPENG is a leading smart technology company at the forefront of innovation, integrating advanced AI and autonomous driving technologies into its vehicles. We are seeking exceptional Research Engineers / Scientists to design learning systems that allow agents to plan over long horizons and improve through experience.
Responsibilities:
• Develop reinforcement learning methods for LLM-driven agents and decision systems.
• Design policy optimization for long-horizon reasoning and planning.
• Implement learning from human or AI feedback (RLHF / RLAIF).
• Build agent training pipelines on top of our agent infrastructure platform.
• Build evaluation and benchmarking systems for agent capabilities.
• Design learning loops that integrate real-world and simulation data.
• Contribute to AI systems that continuously improve after deployment.
Qualifications:
Required:
• MS or PhD in Computer Science, AI, Machine Learning, Robotics, or a related field.
• Strong background in reinforcement learning or machine learning.
• Experience implementing RL algorithms such as PPO, Actor-Critic, or policy gradient methods.
• Strong programming skills in Python with PyTorch or JAX.
• Experience building ML training systems or infrastructure.
Preferred:
• Experience with RLHF or preference learning.
• Experience with LLM agents or tool-using AI systems.
• Experience with multi-agent systems or long-horizon planning.
• Experience with simulation environments for RL.
• Publications in NeurIPS, ICML, ICLR, ACL, or related venues.
Company:
XPENG is a leading Chinese Smart EV company that designs, develops, manufactures, and markets Smart EVs that appeal to the large and growing base of technology-savvy middle-class consumers. Founded in 2014, the company is headquartered in Guangzhou, China, and has more than 10,000 employees. It is currently a late-stage company.