Multimodal Learning Jobs in Georgia (NOW HIRING)

... multimodal transport to renewable energy power to climate-positive buildings. Together, we are ... Learning and development supported by evolving tools and technologies, including AI * Best-in-class ...

With a wealth of learning and career development opportunities, a world-class training facility ... multimodal solutions, ensuring seamless integration, quality, scalability, and security within ...

Senior Transportation/Traffic Engineer

Atlanta, GA · Hybrid

$74K - $98K/yr

... and multimodal design goals. * Mentor junior staff in traffic operations, modeling, and safety ... Learning and development supported by evolving tools and technologies, including AI * Best-in-class ...

Experience training, fine-tuning, and evaluating LLMs and multimodal foundation models using advanced techniques such as self-supervised and transfer learning. * Experience designing value ...

Senior Transportation/Traffic Engineer

Atlanta, GA · On-site

$74K - $98K/yr

... and multimodal design goals. * Mentor junior staff in traffic operations, modeling, and safety ... Learning and development supported by evolving tools and technologies, including AI * Best-in-class ...

Bachelor's degree in Computer Science, Data Science, Machine Learning, Applied Mathematics, or ... Familiarity with document intelligence, and multimodal AI capabilities. Soft Skills: * Strong ...

City Bus Operator

Athens, GA · On-site

$19 - $23.77/hr

Transit Multimodal Transportation Center - 775 East Broad Street, Athens, GA Job Type: Full Time ... Opportunities and Career Development via the Government Wide Learning Management System Time

Multimodal Learning information

What is multimodal learning?

Multimodal learning is an area of machine learning that involves integrating and processing information from multiple types of data, such as text, images, audio, and video. The goal is to create models that can understand and make predictions based on more than one data modality, similar to how humans use various senses. This approach is used in applications like speech recognition with visual cues, image captioning, and video analysis. By combining different data types, multimodal learning systems can achieve better accuracy and more robust understanding.
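One simple way to combine data types is late fusion, where each modality gets its own model and only the predicted class probabilities are merged. The sketch below is illustrative only: the function names `late_fusion` and `predict`, the modality names, and the probability values are all hypothetical, and real systems often fuse learned features rather than final probabilities.

```python
def late_fusion(per_modality_probs, weights=None):
    """Return the weighted average of per-modality class probabilities.

    per_modality_probs maps a modality name to a list of class probabilities.
    weights maps a modality name to a non-negative weight (default: equal).
    """
    names = list(per_modality_probs)
    if not names:
        raise ValueError("at least one modality is required")
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_classes = len(per_modality_probs[names[0]])
    fused = [0.0] * n_classes
    for name in names:
        probs = per_modality_probs[name]
        if len(probs) != n_classes:
            raise ValueError("all modalities must score the same classes")
        for i, p in enumerate(probs):
            fused[i] += weights[name] * p / total
    return fused


def predict(fused):
    """Return the index of the class with the highest fused probability."""
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical outputs of two separate models for the same input.
scores = {
    "text": [0.6, 0.4],   # the text model leans toward class 0
    "image": [0.2, 0.8],  # the image model leans toward class 1
}
fused = late_fusion(scores)
print(predict(fused))  # the image model's stronger evidence wins: class 1
```

Passing `weights` lets a more reliable modality count for more, which is one reason a fused model can be more robust than any single-modality model.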

What is the difference between a Multimodal Learning role and a Data Scientist role?

Aspect | Multimodal Learning | Data Scientist
Required Credentials | Advanced degrees in AI, Machine Learning, or Computer Science | Bachelor's or Master's in Data Science, Statistics, or related fields
Work Environment | Research labs, AI development teams, academia | Business, tech companies, analytics teams
Industry Usage | AI research, multimedia applications, robotics | Data analysis, predictive modeling, business insights

Multimodal Learning focuses on developing AI models that process and integrate multiple data types like images, text, and audio. Data Scientists analyze data to extract insights, build models, and support decision-making. While both roles involve data and algorithms, Multimodal Learning is specialized in AI model development for complex data integration, whereas Data Scientists work broadly across data analysis and interpretation.

What are the key skills and qualifications needed to thrive as a Multimodal Learning Specialist, and why are they important?

To excel as a Multimodal Learning Specialist, you need a solid background in machine learning, data science, and computer vision, often supported by an advanced degree in a related field. Familiarity with deep learning frameworks like TensorFlow or PyTorch, experience integrating data from diverse sources (e.g., text, audio, images), and knowledge of relevant algorithms are crucial. Strong problem-solving abilities, creativity, and effective collaboration are standout soft skills for this role. These competencies are vital for developing innovative models that can process and interpret complex, multi-source data to drive impactful AI solutions.

What are some common challenges faced by professionals working in multimodal learning roles, and how can they be addressed?

Professionals in multimodal learning frequently encounter challenges related to integrating and aligning data from multiple sources, such as text, images, audio, or video. Ensuring data quality and consistency across modalities can be complex, and developing models that effectively combine heterogeneous information often requires advanced technical skills and innovative thinking. Collaboration with domain experts and other data scientists is key to overcoming these obstacles, as is staying up to date with the latest research and tools in machine learning. Regular team meetings and cross-disciplinary workshops can help foster a collaborative environment and promote knowledge sharing.
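As one concrete instance of the alignment problem, streams recorded at different rates often have to be matched by timestamp before they can be combined. The following is a minimal sketch under stated assumptions: the function name `align_nearest`, the tolerance value, and the sample timestamps are hypothetical, and production pipelines may need resampling or learned alignment instead.

```python
from bisect import bisect_left


def align_nearest(reference_ts, other_ts, tolerance):
    """Match each reference timestamp to the nearest timestamp in other_ts.

    other_ts must be sorted in ascending order. Returns one entry per
    reference timestamp: the index into other_ts of the nearest sample,
    or None when no sample lies within `tolerance` seconds.
    """
    matches = []
    for t in reference_ts:
        pos = bisect_left(other_ts, t)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [i for i in (pos - 1, pos) if 0 <= i < len(other_ts)]
        best = min(candidates, key=lambda i: abs(other_ts[i] - t), default=None)
        if best is not None and abs(other_ts[best] - t) <= tolerance:
            matches.append(best)
        else:
            matches.append(None)  # a gap in the other modality
    return matches


# Hypothetical timestamps in seconds: video frames and audio chunks.
video_ts = [0.0, 0.5, 1.0, 2.0]
audio_ts = [0.02, 0.48, 1.3]
print(align_nearest(video_ts, audio_ts, tolerance=0.1))
```

Reference samples that come back as `None` mark gaps where one modality is missing, which is where data-quality checks across modalities typically start.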
PhD Research Intern - Foundational AI (Fall 2026, Atlanta)

Dolby Laboratories, Inc.

Atlanta, GA • On-site

$53/hr

Other

Posted 10 days ago


Job description

Join the world leader in innovation and building unique entertainment experiences at Dolby. The Advanced Technology Group (ATG) at Dolby works at imagining, creating, and integrating cutting-edge technologies at Dolby and is central to its innovation. As a Research Intern at ATG, you will imagine new visual, audio, and multimodal experiences that enable content creators to deliver their stories with maximum impact while allowing consumers to enjoy these experiences with unprecedented quality and immersion. 

The Advanced Technology Group (ATG) at Dolby is at the forefront of research and development in audio-visual technologies. Our team explores novel approaches in spatial audio processing, high dynamic range (HDR) imaging, computer vision, machine learning, and perceptual modeling. We collaborate across disciplines to push the boundaries of what is possible in entertainment technology, translating cutting-edge research into innovations that shape the future of media experiences worldwide. 

While the traditional focus of foundation models has centered on standard audio and visual technologies, Dolby is pioneering the application of these powerful AI systems to enhance media experiences. We are seeking a PhD research intern to join our Foundational AI lab to explore how large-scale models can be leveraged for spatial audio, HDR imaging/video, and high pixel depth imaging/video generation and representation learning.

As a Research Intern, you will: 

  • Design and implement novel neural architectures for processing enhanced media formats. 

  • Develop training methodologies for foundation models that can understand and generate high-fidelity audio-visual content. 

  • Research techniques for efficient fine-tuning of large models for Dolby-specific applications. 

  • Collaborate with cross-functional teams to integrate research findings into Dolby's product ecosystem. 

  • Present research findings to internal stakeholders and potentially at academic conferences. 

The role will be based out of our research facility in Atlanta, GA, and offers the opportunity to work with state-of-the-art computing resources and proprietary datasets. 

Requirements 

  • Currently enrolled in a PhD program in Computer Science, Electrical Engineering, Machine Learning, Computational Media, or related fields. 

  • Strong background in imaging/video/audio modeling and understanding. 

  • Demonstrable proficiency in training and fine-tuning large models (diffusion models, transformers, autoregressive models, etc.). 

  • Solid understanding of deep learning fundamentals and experience with frameworks such as PyTorch. 

  • Excellent programming skills in Python. 

  • Ability to work independently and as part of a collaborative research team. 

Desirable Experience 

  • First-authored publication in relevant domains at top conferences such as CVPR, ICCV, NeurIPS, ICLR, ICML, ICASSP, and similar venues. 

  • Experience with scaling up model training across multiple GPUs and hybrid infrastructures.

  • Familiarity with audio processing, computer vision, or multimedia systems. 

  • Knowledge of perceptual quality metrics for audio and visual media. 

  • Prior work with HDR imaging or spatial audio technologies. 

Application Process 

We will review applications on a rolling basis. Qualified candidates should submit a CV, research statement, and relevant publications or project examples. For the best chance to have your resume reviewed and considered, we recommend submitting your application by June 26, 2026.  

Join us in shaping the future of entertainment technology through the power of AI and foundation models. 

Eligibility 

Currently enrolled in a doctoral program. Must be available to work full-time, Monday through Friday, for 12 weeks between September 2026 and December 2026.

The start date for this internship is as follows (please note that this date is not flexible):

  • September 21, 2026  

The Atlanta base hourly range for this internship position is $53/hr and can vary if outside of this location. Our hourly ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific hourly range and perks and benefits for your location during the hiring process.

Dolby will consider qualified applicants with criminal histories in a manner consistent with the requirements of San Francisco Police Code, Article 49, and Administrative Code, Article 12

Equal Employment Opportunity:
Dolby is proud to be an equal opportunity employer. Our success depends on the combined skills and talents of all our employees. We are committed to making employment decisions without regard to race, religious creed, color, age, sex, sexual orientation, gender identity, national origin, religion, marital status, family status, medical condition, disability, military service, pregnancy, childbirth and related medical conditions or any other classification protected by federal, state, and local laws and ordinances.