Text To Speech Jobs (NOW HIRING)

Integrate speech recognition and text-to-speech technologies using Nuance. Monitor and maintain IVR systems to ensure high availability and performance. Troubleshoot and resolve issues related to IVR ...

Lead AI/ML Engineer

Mountain View, CA

$120K - $159K/yr

You will lead the design and delivery of end-to-end voice AI solutions, combining large language models with speech technologies such as speech-to-text, text-to-speech, and real-time streaming audio ...

Deepgram is the leading platform underpinning the emerging trillion-dollar Voice AI economy, providing real-time APIs for speech-to-text (STT) and text-to-speech (TTS). As a QA Engineering Manager ...

Text To Speech information

What is a Text To Speech (TTS) job?

A Text To Speech (TTS) job typically involves converting written text into spoken audio using specialized software or AI technology. Professionals in this field may work on developing, fine-tuning, or implementing TTS systems for various applications, such as virtual assistants, accessibility tools, or audiobooks. The role can also include tasks like voice data collection, script editing, and quality assurance of generated speech. TTS jobs are important for making digital content more accessible to people with visual impairments or reading difficulties. The field combines elements of linguistics, software engineering, and artificial intelligence.
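As a rough illustration of the "written text into spoken audio" step, TTS pipelines commonly normalize text into speakable words before any audio is synthesized. The sketch below shows that idea only. The `ABBREVIATIONS` table and the `number_to_words` and `normalize` helpers are hypothetical names made up for this example, not part of any TTS product mentioned on this page, and real systems use far larger, context-aware rule sets.

```python
import re

# Hypothetical lookup table for illustration only. Real normalizers need context:
# "Dr." can also mean "Drive", for example.
ABBREVIATIONS = {"Dr.": "Doctor", "approx.": "approximately"}

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0 to 999."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" " + number_to_words(rest) if rest else "")


def normalize(text: str) -> str:
    """Expand known abbreviations and standalone integers of up to three digits."""
    for abbr, full in ABBREVIATIONS.items():
        text = text.replace(abbr, full)
    # Longer digit runs are left untouched; decimals and dates are not handled.
    return re.sub(r"\b\d{1,3}\b", lambda m: number_to_words(int(m.group())), text)


print(normalize("Dr. Lee has 42 books in room 305."))
```

A production front end would also handle dates, currencies, decimals, and ambiguous abbreviations, which is part of why the role draws on linguistics as well as software engineering.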

What are some common challenges faced by professionals working in Text to Speech (TTS) development roles?

Professionals in Text to Speech development often encounter challenges such as fine-tuning synthetic voices to sound natural and expressive, handling diverse accents or languages, and optimizing algorithms for various platforms. Collaboration with linguists, UX designers, and software engineers is frequent, as ensuring accessibility and seamless integration across applications is a top priority. Staying updated on advances in AI and deep learning is essential, as the field evolves rapidly and demands continuous improvement in both technical and creative aspects.

What is the difference between Text To Speech and a Voice Actor?

| Aspect | Text To Speech | Voice Actor |
|---|---|---|
| Required Credentials | None or basic audio editing skills | Voice training, acting skills, often professional demos |
| Work Environment | Software, digital platforms, remote | Recording studios, on-location, remote |
| Industry Usage | Automation, AI, tech companies | Media, entertainment, advertising |
| Search & Comparison Intent | Automated voice solutions, TTS technology | Voice acting, narration, character voices |

Text To Speech involves using software to convert written text into spoken words, primarily for automation and digital applications. Voice Actors, on the other hand, provide human voice recordings for media, entertainment, and advertising. While TTS is tech-driven and often used in AI and accessibility tools, Voice Actors bring emotional nuance and personality to their performances. Both roles are essential in their respective industries, but they differ significantly in skills, environment, and purpose.

What are the key skills and qualifications needed to thrive as a Text to Speech Engineer, and why are they important?

To thrive as a Text to Speech Engineer, you need a strong background in computer science, linguistics, and digital signal processing, often supported by a relevant degree. Experience with machine learning frameworks, speech synthesis toolkits (like Tacotron or WaveNet), and programming languages such as Python or C++ is typically required. Creativity, analytical thinking, and cross-functional communication skills help you collaborate with diverse teams and innovate in voice technology. These skills ensure the development of accurate, natural-sounding speech systems that meet user and client needs.
More about Text To Speech jobs
Infographic showing Text To Speech job openings in the United States as of June 2026, with employment types broken down into 5% Full Time, 72% Part Time, 2% Temporary, 20% Contract, and 1% Nights. It also highlights a job distribution of 95% Physical and 5% Remote.

Research Engineer, Machine Learning Systems

Deepgram

Remote

Full-time

Posted 12 days ago


Job description

Job Summary:
Deepgram is the leading platform underpinning the emerging trillion-dollar Voice AI economy, providing real-time APIs for speech-to-text and text-to-speech. The Research Engineer will partner with research scientists to prototype and validate novel modeling ideas, focusing on scalable model training and tooling for speech technologies.
Responsibilities:
• Architect and manage horizontally scalable systems that dramatically accelerate the end-to-end training lifecycle for Speech-to-Text (STT) and Text-to-Speech (TTS) models.
• Design and implement internal UIs and tools that make ML systems and workflows accessible to non-technical stakeholders across the company.
• Oversee and manage training tooling, job orchestration, experiment tracking, and data storage.
Qualifications:
Required:
• Strong experience with the machine learning research pipeline, particularly in STT or related speech domains. This includes experimenting with and evaluating new architectures and modeling approaches, and implementing large-scale training systems.
• Proficiency with orchestration and infrastructure tools like Kubernetes, Docker, and Prefect.
• Familiarity with ML lifecycle tools such as MLflow.
• Experience building internal tools or dashboards for non-technical users.
• Hands-on experience with data engineering practices for unstructured audio and text data.
• Comfortable working in cross-functional teams that include researchers, engineers, and product stakeholders.
Company:
Deepgram provides a voice artificial intelligence platform for speech-to-text, text-to-speech, and voice applications. Founded in 2015, the company is headquartered in San Francisco, USA, with a team of 51-200 employees. The company is currently at the growth stage.