Multimodal Learning Jobs in Georgia (NOW HIRING)

Required: • 5+ years of hands-on experience training and fine-tuning deep learning models in NLP (or a closely related domain like speech, IR, or multimodal). • 5+ years of experience with ...

Traffic Engineer

Duluth, GA · On-site

$80K - $109K/yr

Conduct multimodal analyses for vehicles, pedestrians, bicyclists and transit. * Prepare technical ... Learning & Development: We provide clear career paths, learning resources and development programs ...

Discovery-serving as a company-wide authority in advanced Data Science, Machine Learning, and ... NLP, generative AI & multimodal ML systems * Computer vision & video intelligence pipelines * Apply ...

Multimodal Learning information

What is multimodal learning?

Multimodal learning is an area of machine learning that involves integrating and processing information from multiple types of data, such as text, images, audio, and video. The goal is to create models that can understand and make predictions based on more than one data modality, similar to how humans use various senses. This approach is used in applications like speech recognition with visual cues, image captioning, and video analysis. By combining different data types, multimodal learning systems can achieve better accuracy and more robust understanding.
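
As a rough illustration of what "combining different data types" can mean in practice, here is a minimal sketch of feature-level fusion, one common way to merge modalities. It assumes each modality has already been turned into a numeric feature vector. The function names and the toy values are hypothetical and chosen only for illustration; real systems would obtain these vectors from trained encoders.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so no modality dominates by magnitude."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse_features(text_vec, image_vec):
    """Feature-level fusion: normalize each modality, then concatenate."""
    return l2_normalize(text_vec) + l2_normalize(image_vec)


# Hypothetical toy features standing in for encoder outputs.
text_features = [3.0, 4.0]
image_features = [0.0, 5.0, 0.0]

fused = fuse_features(text_features, image_features)
print(len(fused), fused)
```

The fused vector would then be passed to a single downstream model. Other strategies, such as combining the predictions of separate per-modality models, are also widely used.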

What is the difference between Multimodal Learning and Data Scientist roles?

Aspect | Multimodal Learning | Data Scientist
Required Credentials | Advanced degrees in AI, Machine Learning, or Computer Science | Bachelor's or Master's in Data Science, Statistics, or related fields
Work Environment | Research labs, AI development teams, academia | Business, tech companies, analytics teams
Industry Usage | AI research, multimedia applications, robotics | Data analysis, predictive modeling, business insights

Multimodal Learning focuses on developing AI models that process and integrate multiple data types like images, text, and audio. Data Scientists analyze data to extract insights, build models, and support decision-making. While both roles involve data and algorithms, Multimodal Learning is specialized in AI model development for complex data integration, whereas Data Scientists work broadly across data analysis and interpretation.

What are the key skills and qualifications needed to thrive as a Multimodal Learning Specialist, and why are they important?

To excel as a Multimodal Learning Specialist, you need a solid background in machine learning, data science, and computer vision, often supported by an advanced degree in a related field. Familiarity with deep learning frameworks like TensorFlow or PyTorch, experience integrating data from diverse sources (e.g., text, audio, images), and knowledge of relevant algorithms are crucial. Strong problem-solving abilities, creativity, and effective collaboration are standout soft skills for this role. These competencies are vital for developing innovative models that can process and interpret complex, multi-source data to drive impactful AI solutions.

What are some common challenges faced by professionals working in multimodal learning roles, and how can they be addressed?

Professionals in multimodal learning frequently encounter challenges related to integrating and aligning data from multiple sources, such as text, images, audio, or video. Ensuring data quality and consistency across modalities can be complex, and developing models that effectively combine heterogeneous information often requires advanced technical skills and innovative thinking. Collaboration with domain experts and other data scientists is key to overcoming these obstacles, as is staying up to date with the latest research and tools in machine learning. Regular team meetings and cross-disciplinary workshops can help foster a collaborative environment and promote knowledge sharing.

What cities in Georgia are hiring for Multimodal Learning jobs?

Cities in Georgia with the most Multimodal Learning job openings:

PhD Research Intern - Data Management & Visualization (Fall 2026, Atlanta)

Dolby

Atlanta, GA • On-site

Full-time, Internship

Posted 17 days ago


Job description

Join the leader in entertainment innovation and help us design the future. The Advanced Technology Group (ATG) is the research division of the company. ATG's mission is to look ahead, deliver insights, and innovate technological solutions that will fuel Dolby's continued growth. As a valued member of the Dolby team, you'll see and hear the results of your work everywhere, from movie theaters to smartphones. We continuously push the boundaries of audio, imaging, and cloud technology to create spectacular entertainment experiences.
As a diverse and dynamic group, our ATG researchers work on cutting-edge projects related to computer science and electrical engineering for audio, video, and cloud technologies, exploring exciting domains such as AI/ML, algorithms, digital signal processing, audio processing, image processing, computer vision, AR/VR, data science & analytics, distributed systems, cloud, edge & mobile computing, computer networking, and IoT.
About the Role
The Data Platform & AI Services research team within Dolby's Advanced Technology Group focuses on advancing our AI and data platforms to enable AI-based innovation and deliver cloud and network-delivered media experiences to power the world's most influential media service providers.
We are looking for a PhD Research Intern in ML Data Platform & Visualization to extend our existing data platform with scalable tooling that helps ML researchers understand, navigate, and extract insight from large-scale multimodal datasets. You will build on a production-grade platform while drawing on and contributing to emerging research in visualization for machine learning, data-centric AI, and foundation model interpretability.
As a Research Intern, you will:
  • Extend our ML data platform to improve dataset management, discoverability, and quality assessment for large-scale, multimodal media datasets (video, image, audio, sensor data)

  • Build scalable visualization tooling that enables ML researchers to explore embedding spaces, surface semantic representations from foundation models, and understand dataset structure at scale

  • Design and implement interactive data exploration interfaces to support ML research workflows and data management, including ingestion, indexing, retrieval, annotation and representation

  • Investigate and apply emerging research in visualization for ML, data-centric AI, and foundation model representations to inform platform design decisions

  • Collaborate directly with AI researchers to translate research workflows into platform requirements, bridging the gap between model development needs and data infrastructure capabilities

  • Present your work to internal stakeholders, with the possibility of contributing to academic publications or conference presentations

The role will be based out of our research facility in Atlanta, GA, and offers the opportunity to work with state-of-the-art computing resources and proprietary datasets.
Requirements
Candidates should meet one or more of the following:
  • Currently enrolled in a PhD program in Computer Science, Human-Computer Interaction, Computational Media, Data Science, Electrical Engineering, or a related field, with interest in data management, data visualization, ML infrastructure, or media data systems

  • Strong background in data management and visualization, including data modeling, indexing, retrieval, annotation and visualization for large-scale or unstructured media data

  • Familiarity with ML workflows and researcher tooling - understanding how ML researchers interact with datasets during training, evaluation, and debugging

  • Solid understanding of deep learning fundamentals and experience with frameworks such as PyTorch

  • Proficiency in Python and experience with visualization libraries

  • Ability to work independently and as part of a collaborative, cross-disciplinary research team

Highly Desired Experience
  • First-authored publication or project work in relevant domains at top venues such as IEEE VIS, CHI, VLDB, ACM SIGMOD, SIGKDD, or IEEE Big Data

  • Expertise in visualization research for ML, including dataset cartography, latent space visualization, data-centric AI, or interactive ML tools

  • Hands-on experience building data visualization tools or interactive ML exploration interfaces - embedding viewers, dataset dashboards, annotation UIs, or similar

  • Experience with scalable data processing and model training

We will review applications on a rolling basis. For the best chance to have your resume reviewed and considered, we recommend submitting your application by June 26, 2026.
Eligibility
Currently enrolled in a doctoral program. Recent graduates who are within 6 months of graduation are also eligible to apply. Must be available to work full-time, Monday through Friday, for 12 weeks between September 2026 and December 2026.
The start date for this internship is as follows (please note these dates are not flexible):
  • September 21, 2026

The Atlanta area base hourly range for this internship position is $53/hr and can vary if outside of this location. Our hourly ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific hourly range and perks and benefits for your location during the hiring process.
Dolby will consider qualified applicants with criminal histories in a manner consistent with the requirements of San Francisco Police Code, Article 49, and Administrative Code, Article 12
Equal Employment Opportunity:
Dolby is proud to be an equal opportunity employer. Our success depends on the combined skills and talents of all our employees. We are committed to making employment decisions without regard to race, religious creed, color, age, sex, sexual orientation, gender identity, national origin, religion, marital status, family status, medical condition, disability, military service, pregnancy, childbirth and related medical conditions or any other classification protected by federal, state, and local laws and ordinances.