
Full Time Machine Learning Architect Jobs (NOW HIRING)

They are seeking a Staff Machine Learning Architect to lead the adaptation and optimization of machine learning models for their advanced optical inference engines, bridging the gap between cutting ...

Staff Machine Learning Architect

San Mateo, CA · On-site

$250K - $315K/yr

We are seeking an experienced machine learning architect to lead the porting and optimization of ... Full-time onsite position. Key Responsibilities: * Lead the porting of LLM applications, diffusion ...

ISEE is seeking a full-time Machine Learning Engineer to join our team. The ideal candidate has several years of work experience. Role responsibilities include: * Working on the intersection of ...


Full Time Machine Learning Architect information

Salary scale: $46.5K to $201.5K, average $128.8K

How much do full time machine learning architect jobs pay per year?

As of May 30, 2026, the average yearly pay for full time machine learning architects in the United States is $128,756, according to ZipRecruiter salary data. Most workers in this role earn between $91,000 and $166,000 per year, depending on experience, location, and employer.

What are the key skills and qualifications needed to thrive as a Full Time Machine Learning Architect, and why are they important?

To thrive as a Full Time Machine Learning Architect, you need advanced expertise in machine learning algorithms, data modeling, and software engineering, typically supported by a degree in computer science or a related field. Employers highly value familiarity with frameworks such as TensorFlow and PyTorch, experience with cloud platforms (AWS, Azure, GCP), and relevant certifications such as AWS Certified Machine Learning. Strong problem-solving, communication, and project management skills distinguish top performers in this role. These competencies are crucial for designing scalable ML solutions that drive business value and align technical teams toward shared objectives.

What are some common challenges faced by Full Time Machine Learning Architects when integrating models into production systems?

Full Time Machine Learning Architects often encounter challenges related to ensuring scalability, reliability, and maintainability when deploying models into production. Integrating machine learning solutions with existing infrastructure requires careful consideration of data pipelines, version control, and real-time monitoring. Additionally, collaborating with cross-functional teams such as data engineers, software developers, and product managers is essential for a smooth deployment process. Addressing issues like model drift, data quality, and system performance is a continuous responsibility in this role.

What does a Full Time Machine Learning Architect do?

A Full Time Machine Learning Architect is responsible for designing and overseeing the implementation of machine learning systems within an organization. They analyze business needs, choose appropriate machine learning frameworks, and create scalable architectures that support data processing and model deployment. Their role also involves collaborating with data scientists, engineers, and stakeholders to ensure that machine learning solutions are robust, efficient, and aligned with strategic goals. Additionally, they often help set best practices for model development, data management, and system integration.

What is the difference between a Full Time Machine Learning Architect and a Data Scientist?

| Aspect | Full Time Machine Learning Architect | Data Scientist |
| --- | --- | --- |
| Required Credentials | Master's or PhD in Computer Science, AI, or related fields; certifications in ML frameworks | Master's or PhD in Data Science, Statistics, or related fields; certifications in data analysis tools |
| Work Environment | Designing ML systems, overseeing architecture, collaborating with engineering teams | Analyzing data, building models, interpreting results for business insights |
| Employer & Industry Usage | Tech companies, AI-focused firms, large enterprises | Finance, healthcare, marketing, research institutions |

The main difference is that a Full Time Machine Learning Architect focuses on designing and implementing scalable ML systems and infrastructure, while a Data Scientist primarily analyzes data and develops models for insights. The Architect role is more technical and system-oriented, whereas the Data Scientist role emphasizes data analysis and interpretation.

More about Full Time Machine Learning Architect jobs
Infographic showing Full Time Machine Learning Architect job openings in the United States as of May 2026, with employment types broken down into 98% Full Time, 1% Part Time, and 1% Contract. It highlights a job distribution of 79% Physical and 21% Remote, with an average salary of $128,756 per year, or $61.90 per hour.

Staff Machine Learning Architect

Neurophos

San Jose, CA • On-site

Full-time

Posted 3 days ago


Job description

Job Summary:
Neurophos is a pioneering company focused on redefining AI computing through innovative optical architecture. They are seeking a Staff Machine Learning Architect to lead the adaptation and optimization of machine learning models for their advanced optical inference engines, bridging the gap between cutting-edge research and practical hardware applications.
Responsibilities:
• Lead the porting of LLM applications, diffusion models, and visual ML applications to Neurophos optical inference engines
• Adapt models from diverse sources, including GitHub, Hugging Face, other open-source repositories, and customer private models
• Work with models in various formats, including PyTorch, Triton, JAX, and emerging frameworks
• Develop and implement quantization strategies to migrate models from higher precision formats (FP8, INT8, and above) to our optimized 4-bit precision (FP4/INT4) for weights and activations
• Design and execute re-quantization, retraining, and other model adaptation techniques to minimize accuracy loss during precision reduction
• Create or integrate third-party tools and workflows for efficient model porting and optimization
• Optimize GEMM operations for high-throughput execution
• Develop benchmarking methodologies to measure and validate model quality post-porting, including perplexity metrics and other quality indicators
• Collaborate with hardware and software teams to co-optimize model architectures for optical compute characteristics
• Publish research papers on novel optimization techniques and methodologies (with appropriate IP protection)
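For readers unfamiliar with the precision reduction named in the responsibilities above, the following is a minimal sketch of symmetric per-tensor quantization to signed 4-bit integers. It is a generic textbook scheme shown for illustration only, not the method Neurophos uses; the function names `quantize_int4` and `dequantize` are hypothetical.

```python
from typing import List, Tuple


def quantize_int4(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization to signed 4-bit integers in [-8, 7].

    Illustrative assumption: a single scale is shared by the whole tensor,
    chosen so the largest magnitude maps to 7.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Fall back to a scale of 1.0 for an all-zero (or empty) tensor.
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code, then clamp to the INT4 range.
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Map integer codes back to approximate floating-point values."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    weights = [7.0, -3.0, 0.0, 1.2]
    codes, scale = quantize_int4(weights)
    restored = dequantize(codes, scale)
    # The gap between weights and restored is the quantization error.
    print(codes, scale, restored)
```

The difference between the original and restored values is the accuracy loss that re-quantization and retraining techniques aim to reduce.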
Qualifications:
Required:
• MS or PhD in Computer Science, Data Science, Machine Learning, Mathematics, or related field
• 7+ years of experience in machine learning engineering with at least 3 years focused on model optimization and deployment
• Deep expertise in neural network quantization techniques, including post-training quantization (PTQ) and quantization-aware training (QAT)
• Strong proficiency in PyTorch and familiarity with other ML frameworks (JAX, Triton, TensorFlow)
• Hands-on experience with transformer architectures, LLMs, and diffusion models
• Experience with low-precision inference optimization (INT8, FP8, or lower)
• Strong understanding of GEMM operations and linear algebra optimizations for deep learning
• Experience with model evaluation metrics, including perplexity, accuracy, and benchmark suites
• Track record of successfully deploying ML models on specialized hardware accelerators
• Excellent communication skills with the ability to collaborate across hardware and software teams
Preferred:
• Experience with sub-8-bit quantization (INT4, FP4) and mixed-precision inference
• Familiarity with Hugging Face Transformers library and model hub ecosystem
• Experience with ONNX, TensorRT, or other model optimization frameworks
• Background in analog or optical computing architectures
• Knowledge of in-memory computing paradigms and matrix-vector multiplication acceleration
• Published research in model compression, quantization, or efficient inference
• Experience with large-scale batch inference optimization
• Familiarity with prefill vs. decode optimization strategies in LLM inference
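Perplexity, listed among the evaluation metrics above, is commonly defined as the exponential of the average negative log-likelihood per token. The sketch below assumes you already have natural-log probabilities for each token from some model; the function name `perplexity` is illustrative and is not tied to any particular library.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Return exp of the mean negative natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


if __name__ == "__main__":
    # A model that assigns probability 0.25 to every token is as uncertain
    # as a uniform choice among four options.
    print(perplexity([math.log(0.25)] * 4))
```

Comparing this value before and after porting a model is one way to check how much quality was lost; lower is better.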
Company:
Neurophos develops photonic AI processing technology, with a focus on hardware for accelerating artificial intelligence inference. Founded in 2020, the company is headquartered in Austin, USA, with a team of 11-50 employees. It is currently an early-stage company.