Multimodal Learning Jobs (NOW HIRING)

Machine Learning Engineer, Data Mining

Boston, MA · On-site +1

$144K - $192K/yr

Omnitag, our ML-powered multimodal data mining framework, is the engine that powers this discovery. As a Machine Learning Engineer on the Data Mining team, your mission is to help build the "Brain ...

Multimodal Learning information

[Salary chart markers: $21K, $61.7K, $114.5K]

How much do multimodal learning jobs pay per year?

As of Jun 28, 2026, the average yearly pay for multimodal learning in the United States is $61,692.00, according to ZipRecruiter salary data. Most workers in this role earn between $41,000.00 and $72,000.00 per year, depending on experience, location, and employer.

What is multimodal learning?

Multimodal learning is an area of machine learning that involves integrating and processing information from multiple types of data, such as text, images, audio, and video. The goal is to create models that can understand and make predictions based on more than one data modality, similar to how humans use various senses. This approach is used in applications like speech recognition with visual cues, image captioning, and video analysis. By combining different data types, multimodal learning systems can achieve better accuracy and more robust understanding.

What is the difference between Multimodal Learning and Data Scientist roles?

Aspect | Multimodal Learning | Data Scientist
Required Credentials | Advanced degrees in AI, Machine Learning, or Computer Science | Bachelor's or Master's in Data Science, Statistics, or related fields
Work Environment | Research labs, AI development teams, academia | Business, tech companies, analytics teams
Industry Usage | AI research, multimedia applications, robotics | Data analysis, predictive modeling, business insights

Multimodal Learning focuses on developing AI models that process and integrate multiple data types like images, text, and audio. Data Scientists analyze data to extract insights, build models, and support decision-making. While both roles involve data and algorithms, Multimodal Learning is specialized in AI model development for complex data integration, whereas Data Scientists work broadly across data analysis and interpretation.

What are the key skills and qualifications needed to thrive as a Multimodal Learning Specialist, and why are they important?

To excel as a Multimodal Learning Specialist, you need a solid background in machine learning, data science, and computer vision, often supported by an advanced degree in a related field. Familiarity with deep learning frameworks like TensorFlow or PyTorch, experience integrating data from diverse sources (e.g., text, audio, images), and knowledge of relevant algorithms are crucial. Strong problem-solving abilities, creativity, and effective collaboration are standout soft skills for this role. These competencies are vital for developing innovative models that can process and interpret complex, multi-source data to drive impactful AI solutions.

What are some common challenges faced by professionals working in multimodal learning roles, and how can they be addressed?

Professionals in multimodal learning frequently encounter challenges related to integrating and aligning data from multiple sources, such as text, images, audio, or video. Ensuring data quality and consistency across modalities can be complex, and developing models that effectively combine heterogeneous information often requires advanced technical skills and innovative thinking. Collaboration with domain experts and other data scientists is key to overcoming these obstacles, as is staying up to date with the latest research and tools in machine learning. Regular team meetings and cross-disciplinary workshops can help foster a collaborative environment and promote knowledge sharing.
More about Multimodal Learning jobs
Infographic showing Multimodal Learning job openings in the United States as of June 2026: employment types are 33% internship and 67% full-time, 100% of jobs are in-person, and the average salary is $61,692 per year, or $29.7 per hour.
Senior Machine Learning Engineer, Data Mining

Motional

Las Vegas, NV • On-site, Remote

$117K - $154K/yr

Full-time

Medical, Dental, Vision, Life, Retirement

Posted 17 days ago


Job description

Mission Summary:

At Motional, we're transforming how autonomous vehicles discover critical intelligence hidden within petabytes of multimodal sensor data. Our next-generation autonomous driving stack depends on finding the rare edge cases, long-tail scenarios, and model errors that matter most. Omnitag, our ML-powered multimodal data mining framework, is the engine that powers this discovery.

As a Senior Machine Learning Engineer on the Data Mining team, your mission is to build the "Brain" of this engine: designing massive multimodal Teacher models that understand the world, and distilling them into hyper-efficient Student models that can scour exabytes of data in near real-time. You will work at the intersection of large-scale representation learning, retrieval optimization, and reasoning systems. Your work will directly influence how we compress knowledge into efficient encoders for fast search, and how we apply reinforcement learning to optimize data discovery workflows and intelligent querying. By building smarter mining tools, you will accelerate the entire model improvement lifecycle for teams working on post-training analysis, error diagnosis, and dataset curation.

What You'll Do:

  • Architect and Train Distilled Models: Design and implement teacher-student model frameworks for multimodal sensor data. Develop training pipelines for knowledge distillation. Ensure student models maintain high accuracy while drastically reducing inference latency and memory footprint.
  • Reinforcement Learning for Data Discovery: Build RL-based policy learning and reasoning systems for autonomous driving applications. Implement and scale RL training workflows (e.g., PPO, DQN, actor-critic methods) for simulation and real-world interaction. Explore reward shaping, environment modeling, and multi-agent RL where applicable.
  • Optimize Model Deployment for Real-Time Inference: Collaborate with backend engineers to deploy distilled and RL models into production. Optimize for latency, throughput, and hardware efficiency across GPU/CPU clusters. Implement model versioning, A/B testing, and monitoring for performance regressions.
  • Research and Integrate Agentic Systems: Explore and prototype agentic workflows for autonomous reasoning, chain-of-thought prompting, and goal-directed behavior. Integrate such systems into our broader autonomy stack as experimental or production components.
  • Drive Production Reliability: Establish patterns for graceful degradation, fault tolerance, and cost optimization. Operate Omnitag as a mission-critical data platform serving the entire ML organization, with a focus on reliability, debuggability, and operational excellence.
  • Mentor and Collaborate: Work closely with ML scientists, data engineers, and autonomy teams to translate research advances into scalable engineering solutions. Guide junior engineers in best practices for model training, evaluation, and deployment.
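For readers unfamiliar with the teacher-student distillation named in the first responsibility above, the following is a minimal sketch of one common formulation: a temperature-softened KL divergence between teacher and student output distributions. The posting does not specify which loss Omnitag uses, so the function names, the temperature value, and the choice of loss are all illustrative assumptions, not a description of Motional's system.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max before exponentiating to avoid overflow
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Hypothetical helper for illustration only. The result is scaled by
    temperature**2, a common convention that keeps gradient magnitudes
    comparable across different temperatures.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    return kl * temperature ** 2


# Usage: a student that matches the teacher exactly incurs (near) zero loss,
# while a student that ranks the classes differently incurs a positive loss.
teacher = [4.0, 1.0, 0.5]
matching_student = [4.0, 1.0, 0.5]
diverging_student = [0.5, 1.0, 4.0]
loss_match = distillation_loss(matching_student, teacher)
loss_diverge = distillation_loss(diverging_student, teacher)
```

In practice a term like this is usually combined with a standard supervised loss on ground-truth labels, and the framework's built-in KL divergence would replace the hand-written sum.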

What We're Looking For:

  • BS in Computer Science, Machine Learning, or related field, or equivalent professional experience.
  • 6+ years of hands-on experience in machine learning engineering, with a focus on model post-training, optimization, and deployment.
  • Strong experience with model distillation or teacher-student training - practical knowledge of loss functions, training strategies, and evaluation of compressed models.
  • Proven experience with reinforcement learning in production or research settings: policy optimization, reward design, simulation environments, and RL-based reasoning.
  • Expert-level proficiency in Python and ML frameworks (PyTorch, TensorFlow, or JAX).
  • Strong software engineering fundamentals: testing, CI/CD, containerization, and system design.
  • Experience deploying ML models in cloud environments (AWS, GCP, or Azure) and optimizing for inference.
  • Demonstrated track record of shipping robust, well-tested, production-grade ML systems and mentoring junior engineers.

Bonus Points (Nice-to-Haves):

  • MS/PhD in Computer Science, Machine Learning, or related field.
  • Experience with agentic systems, autonomous reasoning, chain-of-thought models, or LLM-based planning.
  • Background in autonomous driving, robotics, or real-time decision-making systems.
  • Familiarity with multimodal learning, sensor fusion, or embodied AI.
  • Experience building active learning loops, using the model to find the data that breaks the model.
  • Experience with ML-based data mining, active learning, or contrastive learning.
  • Knowledge of model serving tools (TF Serving, Triton, TorchServe) and MLOps platforms.
  • Publications or open-source contributions in RL, distillation, or efficient ML.

We encourage a hybrid schedule with in-office time at one of our locations in Boston, Pittsburgh, or Las Vegas to support collaboration, or this role can be fully remote.

The salary range for this role is an estimate based on a wide range of compensation factors including but not limited to specific skills, experience and expertise, role location, certifications, licenses, and business needs. The estimated compensation range listed in this job posting reflects base salary only. This role may include additional forms of compensation such as a bonus or company equity. The recruiter assigned to this role can share more information about the specific compensation and benefit details associated with this role during the hiring process.

Candidates for certain positions are eligible to participate in Motional's benefits program. Motional's benefits include but are not limited to medical, dental, vision, 401k with a company match, health savings accounts, life insurance, pet insurance, and more.

Salary Range
$172,000 - $229,000 USD

Motional is a driverless technology company making autonomous vehicles a safe, reliable, and accessible reality. We're driven by something more.

Our journey is always people first.

We aren't just developing driverless cars; we're creating safer roadways, more equitable transportation options, and making our communities better places to live, work, and connect. Our team is made up of engineers, researchers, innovators, dreamers and doers, who are creating a technology with the potential to transform the way we move.

Higher purpose, greater impact.

We're creating first-of-its-kind technology that will transform transportation. To do so successfully, we must design for everyone in our cities and on our roads. We believe in building a great place to work through a progressive, global culture that is diverse, inclusive, and ensures people feel valued at every level of the organization. Diversity helps us to see the world differently; it's not only good for our business, it's the right thing to do.

Scale up, not starting up.

Our team is behind some of the industry's largest leaps forward, including the first fully autonomous cross-country drive in the U.S., the launch of the world's first robotaxi pilot, and operation of the world's longest-standing public robotaxi fleet. We're driven to scale; we're moving towards commercialization of our technology, and we need team members who are ready to embrace change and challenges.

Formed as a joint venture between Hyundai Motor Group and Aptiv, Motional is fundamentally changing how people move through their lives. Headquartered in Boston, Motional has operations in the U.S. and Asia. For more information, visit www.Motional.com and follow us on Twitter, LinkedIn, Instagram and YouTube.

Motional AD Inc. is an EOE. We celebrate diversity and are committed to creating an inclusive environment for all employees. To comply with Federal Law, we participate in E-Verify. All newly-hired employees are queried through this electronic system established by the DHS and the SSA to verify their identity and employment eligibility.