Audio Speech Machine Learning Jobs (NOW HIRING)

Audio Inference Engineer, Model Efficiency

... machine learning inference systems. • Proficiency with programming languages such as C++ and Python. • Hands-on experience with deep learning models for audio, speech, or language applications ...

Cohere

Zyphra Technologies Inc

Research Engineer - Audio & Speech Models

San Francisco, CA · On-site

As a Research Engineer - Audio & Speech Models, you will be a core contributor on Zyphra's Audio ... Previously published machine learning research in well-respected venues * Postgraduate degree in a ...

Zyphra

Research Engineer - Audio & Speech Models

San Francisco, CA · On-site

Cohere

Audio Inference Engineer, Model Efficiency

New York, NY · On-site

... audio or machine learning inference systems. * Proficiency with programming languages such as C++ and Python. * Hands-on experience with deep learning models for audio, speech, or language ...

RST Recruitment

Research Scientist-Voice and Audio Ai

Redwood City, CA

$225K - $400K/yr

As a Founding Machine Learning Research Engineer at Retell, you'll focus on advancing model ... PyTorch, LLMs, Audio/Speech Models, Text-to-Speech (TTS), Automatic Speech Recognition (ASR ...

Bland

Machine Learning Researcher, Audio

San Francisco, CA · On-site

$140K - $250K/yr

Machine Learning Researcher, Audio Location: San Francisco, CA or Remote About Bland At Bland.com ... Design and train large scale text-to-speech models capable of expressive, controllable, human ...

Meta

Research Scientist, Machine Learning - Audio

Redmond, WA

$122K/yr

You will operate at the intersection of acoustics, speech, and signal processing algorithms with hardware and software co-design. Research Scientist, Machine Learning - Audio Responsibilities:

New

Meta

Research Scientist, Machine Learning - Audio

Redmond, WA · On-site

$122K - $181K/yr

... Machine Learning Research Scientist, with a specialty in audio and speech processing to join our ... You will operate at the intersection of acoustics, speech, and signal processing algorithms with ...

New

Apple

Machine Learning Architect - Conversational Speech

Cupertino, CA · On-site

We are seeking a Machine Learning Architect to serve as a senior technical leader spanning the full ... audio, and text modalities. Direct experience building speech-to-speech conversational systems ...

HARMAN International

Audio ML Engineer (Research)

The Audio ML Engineer (Research) will develop machine learning models to enhance Intelligent Audio ... Preferred : • Experience with audio ML domains (speech enhancement, denoising, source separation ...

Apple

Machine Learning Architect - Conversational Speech

Cupertino, CA

$262K - $394K/yr

... Machine Learning Architect to serve as a senior technical leader spanning the full Speech ... speaker modeling, audio understanding-and an ability to reason about their interactions and ...

Paramount

Principal Machine Learning Engineer, Content Engineering

Manhattan, NY · On-site

$233K - $350K/yr

Principal Machine Learning Engineer, Content Engineering (45445) Overview: We are seeking a Senior ... Design and deploy models that synthesize signals across video (pixels), audio (speech/music), and ...

Paramount

Principal Machine Learning Engineer, Content Engineering

Manhattan, NY

$233K - $350K/yr

ForeFlight

Staff Machine Learning Engineer

Austin, TX · On-site +1

$208K - $255K/yr

Jeppesen ForeFlight is seeking a Senior Machine Learning Engineer to help build and scale domain-specialized automatic speech recognition (ASR) systems for aviation and operational audio workflows.

Abaka AI

Machine Learning Engineer

Mountain View, CA · On-site

... speech-audio modeling) and dataset optimization for model training. • Solid understanding of ML system design, including feature pipelines, data loaders, model serving, and evaluation frameworks ...

Abaka AI

Machine Learning Engineer

Mountain View, CA · On-site

Xforia, Inc.

Machine Learning Scientist

Laguna Hills, CA · On-site

Coursework in machine learning, computer vision, control systems, and time series modeling. Strong ... Image Processing/Computer Vision, ADAS, Anomaly Detection, Audio/Speech Processing, Automatic ...

Apple

Machine Learning Engineer, Siri Speech

Cupertino, CA · On-site

Good knowledge in machine learning technologies related to speech and audio processing; experience with image processing is a plus. Strong problem-solving skills and ability to work independently as ...

David AI

Senior Machine Learning Research Engineer

San Francisco, CA · On-site

$230K - $290K/yr

Speech is versatile, accessible, and human-it fits naturally into everyday life. As audio AI ... About our Machine Learning team Our Machine Learning team sits at the intersection of cutting-edge ...

NICE

Senior Machine Learning Engineer

Sandy, UT · Hybrid

$99K - $136K/yr

As Senior Machine Learning Engineer, you will own the evaluation and optimization of speech ... LoRA/PEFT for speech models, inference optimization (quantization, SGLang/vLLM serving for audio ...

New

Showing results 1-20

Audio Speech Machine Learning information

What are some common challenges faced when developing machine learning models for audio speech applications?

A key challenge in audio speech machine learning roles is dealing with diverse and noisy audio data, which can significantly affect model accuracy. Additionally, models must be robust to different accents, languages, and speaking styles, requiring large and varied datasets for training and validation. Collaboration with data engineers, linguists, and software developers is often necessary to ensure high-quality data pipelines and model integration into production systems. Staying updated with the latest research and optimizing models for real-time performance are also ongoing aspects of the role.

What is an Audio Speech Machine Learning Engineer?

An Audio Speech Machine Learning Engineer is a specialized professional who designs, develops, and implements machine learning models that process and analyze audio and speech data. Their work involves tasks like speech recognition, speaker identification, and audio event detection by leveraging algorithms and large datasets. These engineers collaborate with data scientists, software developers, and linguists to create applications such as voice assistants, transcription tools, and automated customer service systems. Expertise in signal processing, deep learning frameworks, and programming languages like Python is crucial for this role.

What is the difference between an Audio Speech Machine Learning Engineer and a Speech Data Analyst?

Aspect	Audio Speech Machine Learning	Speech Data Analyst
Required Credentials	Degree in Computer Science, Data Science, or related fields; knowledge of ML frameworks	Degree in Data Analysis, Statistics, or related fields; experience with data tools
Work Environment	Research labs, tech companies, AI startups	Data analysis teams, research institutions, tech firms
Industry Usage	Developing speech recognition, voice assistants, NLP applications	Analyzing speech datasets, improving speech models, reporting insights

Audio Speech Machine Learning focuses on developing algorithms for speech recognition and processing, often involving model training and AI development. Speech Data Analysts interpret speech data, generate insights, and support model improvements. Both roles require strong analytical skills, but their core tasks differ: one builds models, the other analyzes data.

What are the key skills and qualifications needed to thrive as an Audio Speech Machine Learning Engineer, and why are they important?

To thrive as an Audio Speech Machine Learning Engineer, you need a solid background in machine learning, signal processing, and programming (typically Python), along with a relevant degree in computer science or a related field. Familiarity with tools like TensorFlow or PyTorch, audio processing libraries (such as Librosa), and experience with speech datasets and ASR systems are commonly required. Critical soft skills include problem-solving, innovation, and effective communication for collaborating with cross-functional teams. These skills are essential to develop accurate, scalable speech recognition systems that advance voice-driven technology.

Infographic showing various Audio Speech Machine Learning job openings in the United States as of July 2026, with employment types broken down into 77% Full Time, 20% Part Time, 1% Temporary, and 2% Contract. Highlights an 89% Physical, 1% Hybrid, and 10% Remote job distribution.

Audio Inference Engineer, Model Efficiency

Cohere

Remote

Full-time

Re-posted 21 days ago

Job description

Job Summary:
Cohere is a company focused on scaling intelligence to serve humanity through advanced AI systems. The Audio Inference Engineer will work on optimizing audio inference serving efficiency and advancing core audio model serving metrics while collaborating with training and serving infrastructure teams.
Responsibilities:
• Build reliable machine learning systems and optimize audio inference serving efficiency using innovative techniques.
• Advance core audio model serving metrics, including latency, throughput, and quality by diving deep into systems, identifying bottlenecks, and delivering creative solutions for audio processing and streaming workloads.
• Collaborate closely with both the training and serving infrastructure teams to ensure seamless integration between model development and deployment, with a special focus on real-time and streaming audio inference.
Qualifications:
Required:
• Significant experience developing high-performance audio or machine learning inference systems.
• Proficiency with programming languages such as C++ and Python.
• Hands-on experience with deep learning models for audio, speech, or language applications.
• A bias for action and a strong results-oriented mindset.
Preferred:
• Experience with GPU programming, low-level system optimization, and model parallelization techniques across multiple GPUs.
• Experience with duplex real-time streaming architectures.
• Knowledge of the internals of machine learning frameworks for audio (such as PyTorch, TensorFlow, or specialized audio libraries).
• Experience with inference frameworks such as vLLM, SGLang, TensorRT-LLM, or custom distributed inference systems.
• Experience with sequence modeling (e.g., transformers for audio/speech) and end-to-end audio pipeline optimization.
Company:
Cohere develops enterprise artificial intelligence software and provides language models, retrieval tools, and workplace platforms. Founded in 2019, the company is headquartered in Toronto, Canada, with a team of 201-500 employees. The company is currently at the growth stage.
