Lab Summary:
Samsung AI Research Center (SAIC), located in Mountain View, California, is currently recruiting outstanding research interns for the Vision Intelligence lab. Our goal is to perform long-term research with potential for direct impact on future Samsung products reaching hundreds of millions of users worldwide. We are focused on pushing the state of the art in multimodal understanding and generation.
Position Summary:
In this position, the student will collaborate with leading researchers on a project related to the lab's interests in multimodal (image, video, 3D, text) retrieval, question answering, reasoning, generation and editing.
Position Responsibilities:
- Conduct research on multimodal models in one of the above-mentioned areas
- Write maintainable code to implement research ideas
- Work closely with the team to discuss findings and improve the method
- Execute experiments, document their progress, and present insights
- Prepare findings for submission to a major conference or journal
Required Skills:
- Currently pursuing an MS or PhD degree
- Experience with programming in Python, PyTorch, Transformers, etc.
- Previous publications in top-tier conferences and journals in CV/ML/AI such as CVPR, ECCV, ICCV, ICML, NeurIPS, AAAI
- Experience in one or more of the following areas:
  - Experience in VLMs, including model architecture design, pretraining, or post-training (SFT, RLHF, GRPO) techniques
  - Experience in diffusion models and other generative AI approaches for image/video/3D domains
  - Experience in design and development of efficient multimodal models using techniques like knowledge distillation, mixture of experts, etc.
  - Experience in Vision Language Action (VLA) models for agentic systems in robotics, GUI-based task automation, etc.
- Strong oral and written communication skills