Machine Learning Engineer Quantization Jobs in Colorado

Who We Are Looking For We're hiring a Staff Machine Learning Engineer to help move forward the ML ... Production experience with model serving for both LLMs and custom models; understands quantization ...

CO · On-site

Bachelor's degree in computer science, machine learning, data science, electrical engineering, or a similar discipline * Proficient in Python * Foundational understanding of machine learning concepts ...

Must-Have Skills 3+ years of ML engineering experience -- model training, fine-tuning, or post-training pipelines in research or production Strong Python and deep learning proficiency (PyTorch ...

Machine Learning Engineer III

Boulder, CO · On-site

$136K - $240K/yr

We're forming small, senior, cross-functional AI teams that bring together product leaders, machine learning engineers, and full-stack builders to create intelligent agents used by millions of people ...

Machine Learning Engineer Quantization information

What are some common challenges Machine Learning Engineers face when implementing quantization techniques in production models?

Machine Learning Engineers working on quantization often encounter challenges such as balancing reduced model size and computational efficiency with maintaining acceptable accuracy levels. Adapting quantization methods to different hardware platforms can also require significant testing and optimization. Additionally, engineers must frequently address compatibility issues with existing deployment pipelines and ensure that quantization-aware training is properly integrated to minimize performance degradation. Collaboration with hardware and software teams is essential to streamline deployment and achieve optimal results.

What are the key skills and qualifications needed to thrive as a Machine Learning Engineer specializing in quantization, and why are they important?

To thrive as a Machine Learning Engineer specializing in quantization, you need a solid background in machine learning, deep learning, and computer science, typically supported by a degree in a related field. Familiarity with quantization techniques, frameworks such as TensorFlow Lite or PyTorch, and experience with hardware accelerators are crucial. Strong problem-solving skills, attention to detail, and effective collaboration set top performers apart. These capabilities are vital for efficiently deploying high-performing models on resource-constrained devices and ensuring scalable, real-world AI solutions.

What does a Machine Learning Engineer specializing in quantization do?

A Machine Learning Engineer specializing in quantization focuses on optimizing machine learning models by reducing their size and computational requirements without significantly sacrificing accuracy. This involves converting model parameters and computations from high-precision formats (like 32-bit floating point) to lower-precision formats (such as 8-bit integers). Quantization enables faster inference and lower memory usage, and it allows models to run efficiently on edge devices and mobile platforms. These engineers work closely with data scientists and hardware teams to implement, test, and validate quantized models in production environments.
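
As a rough illustration of the conversion described above, the following sketch maps a list of floating-point values to signed 8-bit integers using affine (scale and zero-point) quantization, then maps them back to measure the rounding error. The function names `quantize_int8` and `dequantize` are hypothetical and chosen for this example only. Frameworks such as PyTorch or TensorFlow Lite provide their own quantization APIs, which typically add calibration data and per-channel scales that this sketch leaves out.

```python
from typing import List, Tuple

QMIN, QMAX = -128, 127  # representable range of a signed 8-bit integer


def quantize_int8(values: List[float]) -> Tuple[List[int], float, int]:
    """Affine quantization of floats to int8 (hypothetical helper).

    Returns the quantized integers plus the scale and zero point
    needed to map them back to approximate floats.
    """
    # Extend the range to include 0.0 so that zero is represented exactly.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)

    scale = (hi - lo) / (QMAX - QMIN)
    if scale == 0.0:
        scale = 1.0  # all-zero input: any non-zero scale works

    # Integer that the real value 0.0 maps to.
    zero_point = round(QMIN - lo / scale)

    quantized = []
    for v in values:
        q = round(v / scale) + zero_point
        quantized.append(max(QMIN, min(QMAX, q)))  # clamp into int8 range
    return quantized, scale, zero_point


def dequantize(quantized: List[int], scale: float, zero_point: int) -> List[float]:
    """Map int8 values back to approximate floats."""
    return [(q - zero_point) * scale for q in quantized]


weights = [-1.0, -0.25, 0.0, 0.5, 2.0]
q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print("int8 values:", q)
print("largest reconstruction error:", max_error)
```

Each stored value takes 8 bits instead of 32, which is where the memory saving comes from. The reconstruction error printed at the end is the accuracy cost, and it grows with the spread between the smallest and largest value because the same 256 integer levels must cover a wider range.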

What is the difference between a Machine Learning Engineer specializing in quantization and a Data Scientist?

Aspect | Machine Learning Engineer Quantization | Data Scientist
Required Credentials | Bachelor's or master's in CS, ML, or related; certifications in ML or AI | Bachelor's or master's in statistics, CS, or related; certifications in data analysis or statistics
Work Environment | Developing optimized ML models, deploying quantized models for efficiency | Analyzing data, building predictive models, interpreting results
Industry Usage | Tech companies, AI hardware firms, embedded systems | Finance, healthcare, marketing, research institutions

A Machine Learning Engineer specializing in quantization focuses on optimizing ML models for deployment efficiency, often working closely with hardware and software teams. A Data Scientist analyzes data and builds models to produce insights. While both roles require ML knowledge, quantization engineers specialize in model compression techniques, whereas data scientists focus on data analysis and interpretation.

Staff Machine Learning Engineer

AppFolio

Denver, CO

Full-time

Posted 17 days ago


Job description

Hi, We're AppFolio
We're innovators, changemakers, and collaborators. We're more than just a software company — we're building the AI-native platform where the real estate industry comes to do business. We're transforming Property Management; how property managers operate, how residents live, and how intelligence flows across an entire industry.
Realm-X is AppFolio's AI-native platform powering this transformation. It enables a new generation of intelligent capabilities across our products, including Realm-X Assistant (copilot), Flows (AI Agentic workflows) and Performers (autonomous AI Agents). Realm-X serves as both a foundation for internal teams to build and scale AI-powered products, and a core layer delivering intelligent, high-impact experiences directly to our customers.
At its core, Realm-X is built on a structured domain ontology and a set of shared business primitives—such as transactions, actions, reports, metrics, and skills—that enable AI systems to deeply understand and operate across the full context of property management workflows. This foundation allows us to build context-aware, action-oriented AI systems that go beyond simple assistance to power real automation and decision-making.
Who We Are Looking For
We're hiring a Staff Machine Learning Engineer to help move forward the ML platform that every AI initiative at AppFolio depends on — training, fine-tuning, inference, RAG, evaluation, and cost. You'll keep our AI cloud always-on, observable, and economical, while staying close enough to applications to influence model and agent design.
This role works at the intersection of ML infrastructure, applied AI, and cost discipline. You'll partner closely with our Voice & Agents and Research ML engineers to harden their prototypes into production systems, and help move forward the platform layer that lets Realm-X scale across AppFolio's entire customer base.
Your Impact
  • ML Platform: Design and operate AppFolio's ML infrastructure on AWS — ECS, SageMaker, GPU fleets, model serving, autoscaling, and cost controls.
  • Drive AI Cost Discipline: Optimize cost across all AI applications — provider routing, caching, batch vs. real-time, model size selection, and inference economics.
  • Multi-Provider Reliability: Maintain reliable, multi-provider LLM access across Google, OpenAI, and Anthropic with sensible fallbacks and abstractions.
  • Training & Fine-Tuning Stack: Build the training and fine-tuning stack for Small Language Models, including data pipelines, GPU orchestration, and evaluation.
  • Productionize Research: Partner with Voice & Agents and Research ML engineers to harden their prototypes into production systems with SLOs, on-call rotations, and observability.
  • AI Safety & Guardrails: Operate AppFolio's AI safety and authorization layer — guardrails on AWS, scoped tool permissions, and human-in-the-loop gates for autonomous agent actions.
Qualifications
  • Systems thinker: You think in terms of platforms and long-term leverage, not just features.
  • Production builder: You've built and scaled ML infrastructure in production with meaningful business impact.
  • Ambiguity: You operate effectively in high ambiguity, turning unclear infra problems into clear direction.
  • Owner-operator: You take ownership with a founder/owner-operator mindset, act with urgency, and focus on outcomes.
  • Pace: You have a strong desire to move fast and deliver impact, while maintaining sound engineering judgment.
  • Collaboration: You are humble, collaborative, and low-ego, and you elevate those around you.
  • Sustainability: You value work-life balance as a foundation for sustained high performance.
  • Reliability mindset: You treat ML infra like any other production system — SLOs, on-call, observability, postmortems.
Must Have
  • ML infra at scale: Has built and operated production ML infrastructure on AWS — ECS, SageMaker, GPUs, autoscaling, and cost controls.
  • Inference platforms: Production experience with model serving for both LLMs and custom models; understands quantization, batching, and routing.
  • Provider breadth: Direct experience integrating with Google (Vertex / Gemini), OpenAI, and Anthropic APIs in production.
  • Training capability: Has trained or fine-tuned language models end-to-end; comfortable with deep learning, evaluation, and inference.
  • Cloud-native engineering: Strong Python, Docker, dependency management, and CI/CD for AI workloads.
  • RAG & agents: Working knowledge of LangChain / LangGraph and modern RAG patterns over structured and unstructured data.
  • Cost optimization: Demonstrated experience reducing unit cost of AI workloads without regressing quality or latency.
  • AI safety & authorization: Hands-on experience operating AI guardrails, scoped tool permissions, and authorization layers for production AI systems.
Nice to Have
  • Experience training Small Language Models for production use.
  • GPU performance tuning (vLLM, TensorRT, Triton, or similar).
  • Prior Staff-level role at a company with a significant AI infra footprint.
  • Experience with ontology-driven systems or knowledge graphs supporting AI applications.
  • Contributions to open-source ML infrastructure or LLM tooling.
Location
Find out more about our locations by visiting our site. 
Compensation & Benefits
The compensation that we reasonably expect to pay for this role is: $200,000 - $250,000 base pay. The actual compensation for this role will be determined by a variety of factors, including but not limited to the candidate’s skills, education, experience, and internal equity.
Please note that compensation is just one aspect of a comprehensive Total Rewards package. The compensation range listed here does not include additional benefits or any discretionary bonuses you may be eligible for based on your role and/or employment type.
Regular full-time employees are eligible for benefits - see here.