... audio + text). • Advance multimodal capabilities including spatial-temporal compression, cross ... algorithm co-design, and scaling paradigms for state-of-the-art performance. • Build research ...
... as audio, motion, and sensor data to shape next-generation products impacting millions daily ... advanced algorithms for multimodal sensor fusion. The ideal candidate brings proven expertise in ...
Member of Technical Staff - Multimodal Understanding
Palo Alto, CA · On-site
$180K - $440K/yr
... audio + text). * Advance multimodal capabilities including spatial-temporal compression, cross ... Innovate on algorithms, modeling approaches, hardware/software/algorithm co-design, and scaling ...
... and algorithms. * Experience developing accessible technologies. * Experience with ChromeOS ... Powered by realistic state-of-the-art 3D imaging and spatial audio and integrated with today ...
Senior Software Engineer, Google Beam
Mountain View, CA · On-site
$144.50K - $190.50K/yr
... algorithms. * 1 year of experience in a technical leadership role. * Experience developing ... Powered by realistic 3D imaging and spatial audio and integrated with today's leading remote video ...
... algorithms. * 3 years of experience in a technical leadership role leading project teams and ... Powered by realistic Three-Dimensional imaging and spatial audio and integrated with today ...
You will use spatial, temporal, and cross-modal data augmentation to multiply the value of every ... Develop sophisticated post-processing algorithms to analyze force interactions and infer ...
Senior Experience Designer
Atlanta, GA · On-site
$98.10K - $104.80K/yr
... ML, algorithms, digital signal processing, audio engineering, image processing, computer vision ... Experience with immersive or spatial media, such as immersive audio, immersive video, or ...
Machine Learning Engineer: Multimodal Sensor Fusion
$147.40K - $272.10K/yr
... audio, motion, and sensor data to shape next-generation products impacting millions daily. Ready to ... advanced algorithms for multimodal sensor fusion. The ideal candidate brings proven expertise in ...
3D Machine Learning Engineer
Irvine, CA · On-site
Design and implement scalable machine learning pipelines for large-scale 3D spatial data processing ... Analyze diverse sensor inputs, including RGBD imagery, LiDAR point clouds, 360 photos, audio, and ...
... ML, algorithms, digital signal processing, audio engineering, image processing, computer vision ... spatial capture). * Drive projects that enhance image delivery, analysis, rendering, and content ...
Principal ML Engineer (Robotics)
Boston, MA · On-site
$200K - $275K/yr
... audio, tactile, spatial and temporal understanding powered by physical AI. You will develop ... Design and implement simulation environments and evaluation frameworks for algorithm validation.
... Audio/DSP, etc.), or equivalent practical experience. * 8 years of experience in camera or ISP ... Research and evaluate emerging camera algorithms, architectures, and technologies to foster ...
Apprentice Spatial Audio Algorithms information
$12.02 - $14.20: 2% of jobs
$14.20 - $16.39: 9% of jobs
$16.39 - $18.58: 15% of jobs
$18.58 - $20.76: 17% of jobs
$20.76 - $22.95: 18% of jobs
$22.95 - $25.13: 19% of jobs
$25.13 - $27.32: 10% of jobs
$27.32 - $29.50: 4% of jobs
$29.50 - $31.69: 2% of jobs
$31.69 - $33.87: 1% of jobs
$33.87 - $36.06: 3% of jobs
$18.50 is the 25th percentile. Wages below this are outliers.
The median wage is $21.66 / hr.
$24.58 is the 75th percentile. Wages above this are outliers.
How much do apprentice spatial audio algorithms jobs pay per hour?
Full-time
Posted 14 days ago
Job description
xAI is dedicated to creating AI systems that enhance human understanding of the universe. The role involves collaborating with the multimodal team to develop advanced capabilities in multimodal reasoning and real-time interactions across various data types, including image, video, audio, and text.
Responsibilities:
• Design, build, and optimize large-scale distributed systems for multimodal pre-training, post-training, inference, data processing, and tokenization at web/petabyte scale.
• Develop high-throughput pipelines for data acquisition, preprocessing, filtering, generation, decoding, loading, crawling, visualization, and management (images, videos, audio + text).
• Advance multimodal capabilities including spatial-temporal compression, cross-modal alignment, world modeling, reasoning, emergent abilities, audio/image/video understanding & generation, real-time video processing, and noisy data handling.
• Drive data quality and studies: curation (human/synthetic), filtering techniques, analysis, and scalable pipelines to support trillion-parameter models.
• Create evaluation frameworks, internal benchmarks, reward models, and metrics that capture real-world usage, failure modes, interactive dynamics, and human-AI synergy.
• Innovate on algorithms, modeling approaches, hardware/software/algorithm co-design, and scaling paradigms for state-of-the-art performance.
• Build research tooling, user-friendly interfaces, prototypes/demos, full-stack applications, and enable rapid iteration based on feedback.
• Work across the stack (pre-training → SFT/RL/post-training) to enable reasoning, tool calling, agentic behaviors, orchestration, and seamless real-time interactions.
Qualifications:
Required:
• Hands-on experience with multimodal pre-training, post-training, or fine-tuning (vision, audio, video, or cross-modal).
• Expert-level proficiency in Python (core language), with strong experience in at least one of: JAX / PyTorch / XLA.
• Proven track record building or optimizing large-scale distributed ML systems (training/inference optimization, GPU utilization, multi-GPU/TPU setups, hardware co-design).
• Deep experience designing and running data pipelines at scale: curation, filtering, generation, quality studies, especially for noisy/real-world multimodal data.
• Strong fundamentals in evaluation design, benchmarks, reward modeling, or RL techniques (particularly for interactive/agentic behaviors).
• Proactive self-starter who thrives in high-intensity environments and is passionate about pushing multimodal AI frontiers.
• Willingness to own end-to-end initiatives and do whatever it takes to deliver breakthrough user experiences.
Preferred:
• Experience leading major improvements in model capabilities through better data, modeling, algorithms, or scaling.
• Familiarity with state-of-the-art in multimodal LLMs, scaling laws, tokenizers, compression techniques, reasoning, or agentic systems.
• Proficiency in Rust and/or C++ for performance-critical components.
• Hands-on work with large-scale orchestration tools such as Spark, Ray, or Kubernetes.
• Background building full-stack tooling: performant interfaces, real-time research demos/apps, or end-to-end product ownership.
• Passion for end-to-end user experience in interactive, real-time multimodal AI systems.
Company:
xAI is an artificial intelligence startup that develops AI solutions and tools to enhance reasoning and search capabilities. It is a sub-organization of SpaceX. Founded in 2023, the company is headquartered in Palo Alto, USA, and has 1001-5000 employees. It is currently a late-stage company.