AI Voice Collection Jobs (NOW HIRING)

ABOUT xAI xAI's mission is to create AI systems that can accurately understand the universe and aid ... Design and execute large-scale speech data curation and processing pipelines, including collection ...

Company description Service Measure (SM) is a field data collection company founded in 2013 in New ... Your role will help improve voice-command functionality and user experience for future in-car ...

Each month our AI voice technology analyzes 28+ billion calls, protecting over 550 million users ... collection, control validation, and security tooling, primarily during key periods aligned with ...

Senior GRC Engineer

New York, NY · On-site

$125K - $171K/yr

Aircall's AI Voice Agent automates routine calls, AI Assist streamlines post-call work, and AI ... collection and continuous control monitoring. * Engineer "compliance-as-code" workflows: codify ...

Dentist

West Bloomfield, MI · On-site

$850/day

Digital X-rays, Primescan, Overjet AI, Voice-enabled systems * Payer Mix: Medicaid, PPO, FFS * Average Production: $5,000 per doctor per day * Collection Rate: 94% Compensation * Daily Guaranteed ...

The position We're looking for an Applied AI Engineer to take our growing collection of foundation ... Ship real-time streaming pipelines (voice agents) alongside batch and request-response workloads

... voice of the customer and experience design, focused on improving experiences for healthcare ... Coordinate, support and operationalize data collection to enhance AI performance * Build and ...

AI Voice Collection information

Hourly pay range shown: $11 to $27, with $19 at the midpoint.

How much do AI voice collection jobs pay per hour?

As of Jun 8, 2026, the average hourly pay for AI voice collection in the United States is $19.12, according to ZipRecruiter salary data. Most workers in this role earn between $16.35 and $21.15 per hour, depending on experience, location, and employer.

What is AI voice collection?

AI voice collection is the process of gathering and recording human voices to create datasets that train artificial intelligence models, such as speech recognition or voice synthesis systems. These datasets often include samples from diverse speakers, languages, and contexts to improve the accuracy and versatility of AI applications. AI voice collection can be used for virtual assistants, transcription services, and accessibility tools, among other uses. The collected data is typically anonymized and processed to protect privacy before being used in machine learning.

What are the typical daily responsibilities of someone working in AI voice collection?

Professionals in AI voice collection typically spend their day recording scripted prompts, ensuring clear enunciation, and following specific technical guidelines for audio quality. They may also annotate or label audio files, manage data privacy requirements, and collaborate with technical teams to troubleshoot recording issues. Attention to detail is crucial, as clean and diverse audio samples are essential for training accurate speech recognition models. Coordination with project managers and linguists is common to meet language, accent, and demographic targets.

What are the key skills and qualifications needed to thrive in AI voice collection, and why are they important?

To thrive in AI voice collection, strong verbal communication, attention to detail, and the ability to follow precise instructions are essential; a background in linguistics or audio recording is often an advantage. Familiarity with recording software, high-quality microphones, and data management platforms is typically required. Reliability, adaptability, and professionalism help participants deliver consistent, high-quality voice samples. These skills ensure the accurate and effective collection of diverse voice data, which is critical for training robust AI speech models.

What is the difference between AI Voice Collection and Speech Data Annotator?

Aspect | AI Voice Collection | Speech Data Annotator
Required Credentials | Basic technical skills, familiarity with voice data | Attention to detail, understanding of annotation tools
Work Environment | Remote or office-based, data collection sites | Remote, using annotation software
Industry Usage | Voice AI development, speech recognition | Training datasets for speech models
Common Search Intent | Collecting voice data for AI training | Annotating speech data for AI models

AI Voice Collection involves gathering voice recordings for AI training and often requires data collection skills. A Speech Data Annotator focuses on labeling and annotating speech data to improve AI models. Both roles are essential in voice AI development but differ in the tasks and skills they require.

Infographic showing various AI Voice Collection job openings in the United States as of May 2026, with employment types broken down into 79% Full Time, 15% Part Time, and 6% Contract. Highlights a 94% Physical, 2% Hybrid, and 4% Remote job distribution, with an average salary of $39,779 per year, or $19.1 per hour.

Member of Technical Staff - Voice Model

xAI

Palo Alto, CA

$150K - $450K/yr

Full-time

Medical, Dental, Vision, Life, Retirement

Posted 23 days ago


Job description

ABOUT xAI

xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company's mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All employees are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates.

ABOUT THE ROLE:

You will join the Grok Voice Model team to help build the world's best voice AI. We deliver smooth, natural, low-latency spoken interactions — expressive, multilingual, and reliable across devices and real-time scenarios. We own the full training pipeline: massive data curation, premium audio processing, frontier speech-language pre-training, and intensive post-training to push quality, speed, and stability to the limit.

Our goal: make talking to AI feel like conversing with the most charming, kind, and knowledgeable person imaginable. We're seeking exceptionally smart, execution-oriented engineers to help us get there.

RESPONSIBILITIES:
  • Design and execute large-scale speech data curation and processing pipelines, including collection of diverse real-world audio, synthetic data generation, and automated annotation workflows to enable high-quality model training and evaluation.
  • Work on pre-training and post-training of speech-language models, with targeted enhancements through supervised fine-tuning, reinforcement learning, and other techniques to ensure Grok Voice responses are accurate, factually grounded, natural and idiomatic in spoken style, conversational in tone, and fluent across multiple languages.
  • Build and iterate a comprehensive evaluation framework covering objective metrics (accuracy, quality, latency, expressiveness), human preference studies, content factuality assessments, real-time interaction quality, and experimentation infrastructure to measure and improve performance.
  • Work closely with product teams to integrate voice models into applications and real-time environments, define spoken interaction specifications, and handle the full lifecycle from prototype to global-scale deployment for stable, low-latency, delightful voice experiences.
BASIC QUALIFICATIONS:
  • Python expert with deep proficiency in writing clean, efficient code for AI/ML systems.
  • Hands-on experience processing large-scale datasets using tools like Spark and Ray for cleaning, augmentation, and feature extraction.
  • Proficiency in pre-training and post-training speech-language models using JAX/PyTorch, including supervised fine-tuning, reinforcement learning, and optimizations for accuracy, factuality, natural spoken style, detail, and multilingual fluency.
  • Ability to set up and run rigorous evaluation pipelines: objective metrics, human preference studies, content factuality checks, and iterative A/B testing to drive model improvements.
  • Experience building or working with large-scale distributed training and inference systems on Kubernetes.
  • Proactive, self-driven attitude — ready to grind in a fast-paced, high-caliber team to deliver outstanding voice AI experiences.
COMPENSATION AND BENEFITS:

$150,000 - $450,000 USD

Base salary is just one part of our total rewards package at xAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short & long-term disability insurance, life insurance, and various other discounts and perks.

xAI is an equal opportunity employer. For details on data processing, view our Recruitment Privacy Notice.