Unconventional AI

AI Systems, Training

Palo Alto, CA · On-site

$123K - $168K/yr

Full-time

Posted 6 days ago


Job description

Job Summary:
Unconventional AI is a company focused on redefining computing to solve the energy limitations of AI. They are seeking a key contributor to build a next-generation ML model training platform and co-design training systems alongside novel AI models and hardware.
Responsibilities:
• Build and maintain highly optimized, model-specific training stacks tuned for state-of-the-art generative vision, language, and world models.
• Design and scale multi-node distributed training systems, implementing elastic sharding and robust data streaming pipelines for fast, large-scale iteration. Implement robust model checkpointing and recovery mechanisms.
• Develop and optimize kernels using low-level programming models like CUDA and Triton. Design rigorous benchmarking suites to track Model FLOPs Utilization (MFU), memory bandwidth, and convergence stability.
• Act as a translator, discussing algorithmic trade-offs with theorists and converting model requirements into concrete specifications for infrastructure and hardware engineering teams.
Qualifications:
Required:
• Education: An MS/PhD or equivalent research/project experience in a quantitative field such as AI/Machine Learning, Computer Science, Physics, Electrical Engineering, or Applied Math.
• Experience: Veteran of the modern ML software stack. Demonstrated ability to map state-of-the-art AI model architectures (e.g., transformers, Mixture of Experts, diffusion models) to their system performance implications. Deep expertise in how models are partitioned across a cluster, with mastery of communication primitives and parallelism strategies.
• Software Development: Proven track record of implementing, debugging, and maintaining production-grade training frameworks (such as Megatron-LM, DeepSpeed, Ray, or PyTorch Lightning), turning raw compute into a reliable model-building factory.
Preferred:
• Unconventional Co-Design: A forward-looking perspective on co-designing algorithms for unconventional computing paradigms that map closely to the physics of underlying systems.
Company:
Unconventional AI rethinks the foundations of computing to improve the energy efficiency of AI. Founded in 2025, the company is headquartered in San Francisco, USA, and has a team of 11-50 employees. It is currently an early-stage company.