Voice Model Jobs (NOW HIRING)

Research Engineer, Voice

Palo Alto, CA · On-site

$225K - $325K/yr

Pi is a personal AI agent powered by Inflection AI's foundation model, proving that AI can be ... Research, develop, and optimize neural models for voice and audio, including text-to-speech ...

Research, develop, and optimize neural models for voice and audio, including text-to-speech, automatic speech recognition, audio generation, and spoken dialogue systems. * Build and maintain ...

A telecommunications network engineer must perform maintain data, voice and/or video network ... Serves as a role model to less experienced personnel. Experience with installation and ...

Voice Bot Engineer

$120K - $140K/yr

As part of our growing AI/ML division, you will work closely with product managers, data scientists, and engineers to build next-generation voice interactions powered by Large Language Models (LLMs ...

Have a clear understanding of Cisco and ShoreTel voice network deployment models and should understand the functioning of voice network devices. * Have worked on Cisco Unified Communications products like ...

Analysis of Design and Architecture documentation. Experience with the development models and ... Voice Portal (GVP), Nuance speech recognition, grammar development, Text to speech, Nuance ...

Salary: Seeking a Superstar Voice Teacher. Do you love working with kids? Do you want to make a ... We believe in a family first business model where we are raising tomorrow's leaders through the ...

Network Voice Engineer

Waltham, MA · On-site

$60K - $135K/yr

Network Voice Engineer City: Waltham State/Province: Massachusetts Posting Start Date: 6/10/26 ... Experience operating in an ITIL-based support model (ServiceNow) Desirable Skills * Experience ...

Voice Model information

Salary chart markers: $5, $48, $76

How much do voice model jobs pay per hour?

As of Jun 30, 2026, the average hourly pay for a voice model in the United States is $48.17, according to ZipRecruiter salary data. Most workers in this role earn between $39.18 and $60.10 per hour, depending on experience, location, and employer.

Is 24 too old to start voice acting?

Voice modeling as a career does not have an age limit, and many successful voice actors start at various ages, including in their twenties. Developing skills such as voice control, acting ability, and recording proficiency can be more important than age when entering the field.

What is a Voice Model job?

A Voice Model job involves providing high-quality voice recordings that are used to create or enhance AI-powered speech systems. These recordings help train text-to-speech (TTS) models, virtual assistants, and other voice-enabled applications. Voice Models may work on projects requiring natural speech, emotional expression, or specific accents and tones. The role can involve reading scripts, responding to prompts, or improvising speech patterns to capture a variety of vocal nuances.

How much do TV narrators get paid?

TV narrators typically earn between $200 and $1,000 per finished hour, depending on experience, the project's scope, and the network or production company. Experienced narrators or those working on high-profile projects can earn higher rates, and some may receive royalties or residuals for ongoing broadcasts.

Can I do voice acting with no experience?

Voice modeling jobs often do not require prior experience, as many employers provide training or accept beginners. Developing skills in voice control, clarity, and audio editing can improve chances of success, but entry-level roles are available for those new to the field.

What are the typical work arrangements and environments for a Voice Model?

Voice Models often work on a freelance basis or as part of talent agencies, providing voice recordings for commercials, animations, video games, and audiobooks. Many professionals operate from home studios using specialized equipment, while some projects require sessions at recording studios with directors and sound engineers present. Flexibility is important, as schedules can include last-minute bookings and varied project durations. Collaboration with creative teams, such as producers and scriptwriters, is common to ensure that the final product matches the intended vision. This dynamic environment offers both autonomy and opportunities for skill development across different media industries.

How to get hired for voice work?

To get hired as a voice model, develop a strong demo reel showcasing your vocal range and clarity, and build a professional portfolio. Audition for voice acting roles through online platforms, casting agencies, or direct contacts, and continuously improve your skills with training or coaching. Having good audio equipment and understanding industry standards can also increase your chances of securing voice work.

What are the key skills and qualifications needed to thrive in the Voice Model position, and why are they important?

To thrive as a Voice Model, you need an excellent command of vocal techniques, clear diction, and the ability to adjust your voice for various styles or characters, often supported by formal vocal or acting training. Familiarity with professional recording equipment, audio editing software, and sometimes home studio setups is essential. Adaptability, reliability, and taking direction well are important soft skills for succeeding in client-driven environments. These skills and qualities ensure Voice Models can deliver high-quality vocal performances that meet diverse client needs across advertising, entertainment, and media industries.

More about Voice Model jobs
What states have the most Voice Model jobs? States with the most job openings for Voice Model jobs include:
Infographic showing various Voice Model job openings in the United States as of June 2026, with employment types broken down into 3% As Needed, 61% Full Time, 34% Part Time, 1% Temporary, and 1% Contract. Highlights a 90% Physical, 2% Hybrid, and 8% Remote job distribution, with an average salary of $100,198 per year, or $48.2 per hour.

Research Engineer, Voice

Inflection AI

Palo Alto, CA • On-site

$225K - $325K/yr

Full-time

Medical, Dental, Vision, Retirement, PTO

Posted 20 days ago


Key responsibilities

  • Research, develop, and optimize neural models for voice and audio, including text-to-speech, automatic speech recognition, audio generation, and spoken dialogue systems.

  • Build and maintain production-grade training and inference pipelines for voice models with attention to latency, naturalness, and scalability.

  • Run experiments end-to-end, including data curation, model architecture design, training, evaluation, and ablation studies.


Job description

About Inflection AI
Inflection AI is a Public Benefit Corporation empowering people with human-centered, emotionally intelligent AI. We're shaping the future of AI by combining emotional intelligence (EQ) and raw intelligence (IQ) to elevate people's potential.
Inflection AI created Pi, the world's first emotionally intelligent AI, to help people work through decisions, emotions, and challenges. Pi is a personal AI agent powered by Inflection AI's foundation model, proving that AI can be personal, empathetic, and contextually aware.
About the Role
We're looking for a Member of Technical Staff (MTS), Research Engineer focused on voice and audio to help advance the spoken intelligence behind Pi. In this role, you'll work at the intersection of research and production: developing, training, and shipping neural models across the full spectrum of voice, including speech synthesis, recognition, audio generation, and real-time spoken dialogue. You'll collaborate closely with ML engineers, product teams, and infrastructure to turn cutting-edge ideas in areas like neural audio codecs, diffusion-based TTS, and multimodal foundation models into the natural, expressive voice experiences that millions of Pi users interact with every day.
What You'll Do
  • Research, develop, and optimize neural models for voice and audio, including text-to-speech, automatic speech recognition, audio generation, and spoken dialogue systems.
  • Build and maintain production-grade training and inference pipelines for voice models, with close attention to latency, naturalness, and scalability.
  • Run experiments end-to-end: data curation, model architecture design, training, evaluation, and ablation studies.
  • Collaborate with ML engineers, product teams, and infrastructure to integrate voice models into Pi's real-time conversational stack.
  • Explore and apply advances in neural audio codecs, diffusion-based synthesis, streaming architectures, and multimodal foundation models to improve Pi's voice experience.
  • Develop robust evaluation frameworks combining perceptual metrics, automated benchmarks, and user-facing quality signals.
  • Contribute to Inflection's research culture through publications, internal reviews, and knowledge sharing.

What We're Looking For
  • 2-5 years of research or engineering experience (including graduate work) in audio, speech, or multimodal ML.
  • Strong proficiency in PyTorch and hands-on experience training and debugging large-scale neural models on GPU/accelerator clusters.
  • Solid understanding of audio and speech fundamentals: spectrograms, mel features, vocoders, codec-based representations, and signal processing.
  • Demonstrated ability to take a research idea from prototype to production: equally comfortable reading papers and writing efficient, CUDA-aware training loops.
  • Familiarity with modern generative architectures for audio (e.g., diffusion models, autoregressive codecs, flow-matching) and their trade-offs.
  • Clear, collaborative communication: able to distill complex research into actionable insights for cross-functional partners.
  • Bachelor's degree or equivalent in Computer Science, Electrical Engineering, Linguistics, or a related field; MS or PhD strongly preferred.

Employee Pay Disclosures
At Inflection AI, we aim to attract and retain the best employees and compensate them in a way that appropriately and fairly values their individual contributions to the company. For this role, Inflection AI estimates a starting annual base salary to fall within the range of $225,000 to $325,000, depending on a candidate's qualifications and level of experience. This role also includes a meaningful equity component, allowing employees to share in the long-term success of the company.
Benefits
Inflection AI values and supports our team's mental and physical health. We are focused on building a positive, safe, inclusive and inspiring place to work. Our benefits include:
  • Diverse medical, dental and vision options
  • 401(k) matching program
  • Unlimited paid time off
  • Parental leave and flexibility for all parents and caregivers
  • Support of country-specific visa needs for international employees living in the Bay Area