Multimodal Learning Jobs in Seattle, WA (NOW HIRING)

The ideal Research Scientist candidate will leverage their practical expertise in robot foundation models, tactile sensing, multimodal learning, and robotics systems to innovate on full-stack systems.

Senior Agentic AI Research Scientist

Seattle, WA · On-site +1

$112K - $142K/yr

You will advance the state-of-the-art in machine learning and multimodal technology and apply your research findings to create new responsible AI capabilities for Axon products. You will collaborate ...

Formulate novel research problems at the intersection of GenAI, multimodal learning, and large-scale information retrieval, translating ambiguous business challenges into tractable scientific ...

Multimodal Learning information

Seattle, WA salary range: $23.9K to $130.3K per year, with an average of $70.2K.

How much do multimodal learning jobs pay per year?

As of Jun 28, 2026, the average yearly pay for multimodal learning jobs in Seattle, WA is $70,207, according to ZipRecruiter salary data. Most workers in this role earn between $46,700 and $81,900 per year, depending on experience, location, and employer.

What is multimodal learning?

Multimodal learning is an area of machine learning that involves integrating and processing information from multiple types of data, such as text, images, audio, and video. The goal is to create models that can understand and make predictions based on more than one data modality, similar to how humans use various senses. This approach is used in applications like speech recognition with visual cues, image captioning, and video analysis. By combining different data types, multimodal learning systems can achieve better accuracy and more robust understanding.
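
One simple way to combine modalities is late fusion, where each modality produces its own class scores and the scores are then merged. The sketch below is illustrative only: the function names, the example scores, and the equal weights are assumptions made for this example and do not come from any listing on this page.

```python
from typing import Dict, List


def late_fusion(scores_by_modality: Dict[str, List[float]],
                weights: Dict[str, float]) -> List[float]:
    """Return the weighted average of per-class scores across modalities."""
    # Normalize by the total weight of the modalities actually present.
    total_weight = sum(weights[name] for name in scores_by_modality)
    num_classes = len(next(iter(scores_by_modality.values())))
    fused = [0.0] * num_classes
    for name, scores in scores_by_modality.items():
        for i, score in enumerate(scores):
            fused[i] += weights[name] * score / total_weight
    return fused


def predict(fused: List[float]) -> int:
    """Return the index of the class with the highest fused score."""
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical per-class scores from a text model and an image model.
scores = {"text": [0.6, 0.4], "image": [0.2, 0.8]}
weights = {"text": 0.5, "image": 0.5}

fused = late_fusion(scores, weights)
label = predict(fused)
```

Here the text model alone would pick class 0, but the image model's stronger evidence moves the fused prediction to class 1. Production systems more often fuse learned features inside a single neural network, but the idea of combining evidence across data types is the same.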

What is the difference between a Multimodal Learning role and a Data Scientist role?

Aspect | Multimodal Learning | Data Scientist
Required Credentials | Advanced degrees in AI, Machine Learning, or Computer Science | Bachelor's or Master's in Data Science, Statistics, or related fields
Work Environment | Research labs, AI development teams, academia | Business, tech companies, analytics teams
Industry Usage | AI research, multimedia applications, robotics | Data analysis, predictive modeling, business insights

Multimodal Learning focuses on developing AI models that process and integrate multiple data types like images, text, and audio. Data Scientists analyze data to extract insights, build models, and support decision-making. While both roles involve data and algorithms, Multimodal Learning is specialized in AI model development for complex data integration, whereas Data Scientists work broadly across data analysis and interpretation.

What are the key skills and qualifications needed to thrive as a Multimodal Learning Specialist, and why are they important?

To excel as a Multimodal Learning Specialist, you need a solid background in machine learning, data science, and computer vision, often supported by an advanced degree in a related field. Familiarity with deep learning frameworks like TensorFlow or PyTorch, experience integrating data from diverse sources (e.g., text, audio, images), and knowledge of relevant algorithms are crucial. Strong problem-solving abilities, creativity, and effective collaboration are standout soft skills for this role. These competencies are vital for developing innovative models that can process and interpret complex, multi-source data to drive impactful AI solutions.

What are some common challenges faced by professionals working in multimodal learning roles, and how can they be addressed?

Professionals in multimodal learning frequently encounter challenges related to integrating and aligning data from multiple sources, such as text, images, audio, or video. Ensuring data quality and consistency across modalities can be complex, and developing models that effectively combine heterogeneous information often requires advanced technical skills and innovative thinking. Collaboration with domain experts and other data scientists is key to overcoming these obstacles, as is staying up to date with the latest research and tools in machine learning. Regular team meetings and cross-disciplinary workshops can help foster a collaborative environment and promote knowledge sharing.

[Infographic: Multimodal Learning job openings in Seattle, WA as of June 2026. Employment types: 33% Internship, 67% Full Time. Work arrangement: 100% In-person. Average salary: $70,207 per year, or $33.8 per hour.]

2026 Fall Applied Science Internship - Natural Language Processing and Speech Technologies - United

Amazon

Seattle, WA • On-site

$17 - $22.75/hr

Full-time

Medical, Retirement

Posted 13 days ago


Amazon company rating: 7.4 out of 10

Based on 6,908 frontline employees who took The Breakroom Quiz

6th of 39 rated national retailers


Job description

Shape the Future of Human-Machine Interaction
Are you a master of natural language processing, eager to push the boundaries of conversational AI? Amazon is seeking exceptional graduate students to join our cutting-edge research team, where they will have the opportunity to explore and push the boundaries of natural language processing (NLP), natural language understanding (NLU), and speech recognition technologies.
Imagine waking up each morning, fueled by the excitement of tackling complex research problems that have the potential to reshape the world. You'll dive into production-scale data, exploring innovative approaches to natural language understanding, large language models, reinforcement learning with human feedback, conversational AI, and multimodal learning. Your days will be filled with brainstorming sessions, coding sprints, and lively discussions with brilliant minds from diverse backgrounds.
Throughout your journey, you'll have access to unparalleled resources, including state-of-the-art computing infrastructure, cutting-edge research papers, and mentorship from industry luminaries. This immersive experience will not only sharpen your technical skills but also cultivate your ability to think critically, communicate effectively, and thrive in a fast-paced, innovative environment where bold ideas are celebrated.
Join us at the forefront of applied science, where your contributions will shape the future of AI and propel humanity forward. Seize this extraordinary opportunity to learn, grow, and leave an indelible mark on the world of technology.
Amazon has positions available for Natural Language Processing & Speech Applied Science Internships in, but not limited to, Bellevue, WA; Boston, MA; Cambridge, MA; New York, NY; Santa Clara, CA; Seattle, WA; Sunnyvale, CA.
Key job responsibilities
We are particularly interested in candidates with expertise in: NLP/NLU, LLMs, Reinforcement Learning, Human Feedback/HITL, Deep Learning, Speech Recognition, Conversational AI, Natural Language Modeling, Multimodal Learning.
In this role, you will work alongside global experts to develop and implement novel, scalable algorithms and modeling techniques that advance the state-of-the-art in areas at the intersection of Natural Language Processing and Speech Technologies. You will tackle challenging, groundbreaking research problems on production-scale data, with a focus on natural language processing, speech recognition, text-to-speech (TTS), text recognition, question answering, NLP models (e.g., LSTM, transformer-based models), signal processing, information extraction, conversational modeling, audio processing, speaker detection, large language models, multilingual modeling, and more.
The ideal candidate should possess the ability to work collaboratively with diverse groups and cross-functional teams to solve complex business problems. A successful candidate will be a self-starter, comfortable with ambiguity, with strong attention to detail and the ability to thrive in a fast-paced, ever-changing environment.
A day in the life
- Develop novel, scalable algorithms and modeling techniques that advance the state-of-the-art in natural language processing, speech recognition, text-to-speech, question answering, and conversational modeling.
- Tackle groundbreaking research problems on production-scale data, leveraging techniques such as LSTM, transformer-based models, signal processing, information extraction, audio processing, speaker detection, large language models, and multilingual modeling.
- Collaborate with cross-functional teams to solve complex business problems, leveraging your expertise in NLP/NLU, LLMs, reinforcement learning, human feedback/HITL, deep learning, speech recognition, conversational AI, natural language modeling, and multimodal learning.
- Thrive in a fast-paced, ever-changing environment, embracing ambiguity and demonstrating strong attention to detail.
BASIC QUALIFICATIONS
- Are enrolled in a PhD program
- Can relocate to where the internship is based
- Experience programming in Java, C++, Python, or a related language
- Experience with one or more of the following: Natural Language Processing/Understanding, Large Language Models, Reinforcement Learning, Human Feedback/HITL, Deep Learning, Speech Recognition, Conversational AI, Natural Language Modeling, Multimodal Learning
- Must be available for full-time (40 hours per week) internship for the whole duration of the internship
PREFERRED QUALIFICATIONS
- Have publications at top-tier peer-reviewed conferences or journals
- Experience in designing experiments and statistical analysis of results
- Experience in building speech recognition, machine translation and natural language processing systems (e.g., commercial speech products or government speech projects)
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.
The starting pay for this position is listed below. Final starting pay will be based on factors including experience, qualifications, and location. Starting Day 1 of employment, Amazon offers EAP, Mental Health Support, Medical Advice Line, 401(k) matching. Learn more about our benefits at https://hiring.amazon.com/why-amazon/benefits.
USA, WA, Seattle - 142,800.00 - 193,200.00 USD annually

About Amazon

Sourced by ZipRecruiter

Amazon.com, Inc., commonly known as Amazon, is an American multinational technology company. It was founded by Jeff Bezos in 1994 and initially started as an online marketplace for books. Since then, Amazon has expanded its operations and become one of the largest e-commerce companies in the world.

Amazon's primary business is its online retail platform, where customers can purchase a vast array of products, including electronics, clothing, books, home goods, and much more. The company offers a convenient and user-friendly shopping experience, with features such as fast shipping, customer reviews, and personalized recommendations.

In addition to its e-commerce platform, Amazon has diversified its business into various other areas. One of its notable ventures is Amazon Web Services (AWS), a comprehensive cloud computing platform that provides services such as storage, compute power, and database management to individuals and businesses. AWS has become a leader in the cloud computing industry, powering many websites and applications worldwide.

Amazon has also developed its own consumer electronics, including the popular Amazon Kindle e-reader, Fire tablets, Fire TV streaming devices, and the Alexa-powered Echo smart speakers. The Alexa voice assistant, integrated into these devices, allows users to interact with their devices using voice commands, perform tasks, and access information.

Furthermore, Amazon has expanded into media and entertainment. It operates Prime Video, a streaming service that offers a wide range of movies, TV shows, and original content. Amazon Music provides a platform for streaming and purchasing digital music, while Audible offers audiobooks and other audio content.

The company's commitment to customer satisfaction and convenience is demonstrated by its membership program, Amazon Prime. Prime members receive various benefits, including free two-day shipping, access to streaming services, exclusive deals, and more.

Industry

IT services, book publishers, retail, real estate, and computer and electronic product manufacturing

Company size

10,000+ Employees

Headquarters location

Seattle, WA, US