
Deep Learning Quantization Jobs in Colorado (NOW HIRING)

Production experience with model serving for both LLMs and custom models; understands quantization ... Has trained or fine-tuned language models end-to-end; comfortable with deep learning, evaluation ...


Senior ML Engineer

Denver, CO · On-site +1

$107K - $147K/yr

Advanced Python and deep learning proficiency (PyTorch, HuggingFace Transformers, spaCy ... models via quantization, batching, and throughput tuning * Proficiency with inference ...

Deep Learning Quantization information

What are the key skills and qualifications needed to thrive as a Deep Learning Quantization Engineer, and why are they important?

To excel as a Deep Learning Quantization Engineer, you need a strong background in machine learning, applied mathematics, and computer science, usually supported by an advanced degree in a related field. Familiarity with deep learning frameworks (such as TensorFlow or PyTorch), quantization toolkits, and hardware acceleration platforms is crucial. Analytical thinking, problem-solving, and clear technical communication are standout soft skills in this role. These abilities are essential for efficiently optimizing models for deployment on resource-constrained hardware while maintaining accuracy and performance.

What is the difference between a Deep Learning Quantization Engineer and a Machine Learning Engineer?

Aspect | Deep Learning Quantization | Machine Learning Engineer
Required Credentials | Advanced degrees in AI, Computer Science, or related fields; knowledge of neural networks | Bachelor's or Master's in CS, Data Science, or related fields; programming skills
Work Environment | Research labs, AI development teams, hardware optimization settings | Software development teams, data-driven projects, product-focused environments
Industry Usage | AI hardware optimization, model deployment, edge computing | Model development, data analysis, software solutions across industries

Deep Learning Quantization focuses on reducing model size and improving inference speed through techniques like weight and activation quantization, often in hardware or embedded systems. Machine Learning Engineers develop, implement, and optimize machine learning models for various applications. While both roles require knowledge of AI and programming, Deep Learning Quantization is more specialized in model optimization techniques, whereas Machine Learning Engineers work broadly on model development and deployment.

What is deep learning quantization?

Deep learning quantization is the process of reducing the precision of the numbers used to represent a neural network's parameters, activations, or both. By converting the typically used 32-bit floating-point values to lower bit-width formats such as 16-bit floating point or 8-bit integers, quantization reduces the memory footprint and computational requirements of deep learning models. This technique helps deploy models efficiently on edge devices and mobile hardware while maintaining acceptable accuracy levels. Quantization is widely used in model optimization for faster inference and lower power consumption.
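
As a rough illustration of the idea, the sketch below maps a handful of floating-point weights to 8-bit integer codes and back. It assumes symmetric, per-tensor quantization with a single scale factor, which is only one of several common schemes; the function names and sample weights are hypothetical and not taken from any particular toolkit.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integer codes in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 if every value is zero.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only for illustration.
weights = [0.42, -1.27, 0.05, 0.9]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Worst-case rounding error is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

Each weight is stored as one byte instead of four, at the cost of a rounding error of at most half the scale per value.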

What are some common challenges faced when implementing deep learning quantization in production environments?

One of the main challenges in implementing deep learning quantization is balancing model accuracy with computational efficiency, as quantization can sometimes lead to a drop in model performance. Additionally, ensuring hardware compatibility and optimizing for different devices (such as CPUs, GPUs, or edge devices) can require extensive testing and tuning. Collaboration with data scientists, software engineers, and hardware specialists is often essential to successfully deploy quantized models at scale. Staying updated with the latest quantization techniques and frameworks is also important for overcoming these challenges.
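
The accuracy-versus-efficiency trade-off mentioned above can be seen in a small experiment. This is a minimal sketch, assuming symmetric per-tensor quantization and synthetic Gaussian weights in place of a real model; `quantization_error` is a hypothetical helper, and mean squared error on the weights is only a proxy for the drop in model accuracy.

```python
import random


def quantization_error(values, bits):
    """Mean squared error after symmetric quantization to the given bit width."""
    levels = 2 ** (bits - 1) - 1          # e.g. 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / levels
    restored = [
        max(-levels, min(levels, round(v / scale))) * scale for v in values
    ]
    return sum((v - r) ** 2 for v, r in zip(values, restored)) / len(values)


# Synthetic stand-in for a weight tensor (an assumption, not real model data).
rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# Fewer bits means less memory per weight but a larger reconstruction error.
errors = {bits: quantization_error(weights, bits) for bits in (8, 4, 2)}
for bits, err in errors.items():
    print(f"{bits}-bit: mse={err:.6f}")
```

In practice the comparison would be run on task metrics for the target hardware, since weight-level error does not translate directly into end-to-end accuracy.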

Staff Machine Learning Engineer

AppFolio

Denver, CO

Full-time

Posted 10 days ago


Job description

Hi, We're AppFolio
We're innovators, changemakers, and collaborators. We're more than just a software company — we're building the AI-native platform where the real estate industry comes to do business. We're transforming Property Management; how property managers operate, how residents live, and how intelligence flows across an entire industry.
Realm-X is AppFolio's AI-native platform powering this transformation. It enables a new generation of intelligent capabilities across our products, including Realm-X Assistant (copilot), Flows (AI Agentic workflows) and Performers (autonomous AI Agents). Realm-X serves as both a foundation for internal teams to build and scale AI-powered products, and a core layer delivering intelligent, high-impact experiences directly to our customers.
At its core, Realm-X is built on a structured domain ontology and a set of shared business primitives—such as transactions, actions, reports, metrics, and skills—that enable AI systems to deeply understand and operate across the full context of property management workflows. This foundation allows us to build context-aware, action-oriented AI systems that go beyond simple assistance to power real automation and decision-making.
Who We Are Looking For
We're hiring a Staff Machine Learning Engineer to help move forward the ML platform that every AI initiative at AppFolio depends on — training, fine-tuning, inference, RAG, evaluation, and cost. You'll keep our AI cloud always-on, observable, and economical, while staying close enough to applications to influence model and agent design.
This role works at the intersection of ML infrastructure, applied AI, and cost discipline. You'll partner closely with our Voice & Agents and Research ML engineers to harden their prototypes into production systems, and help move forward the platform layer that lets Realm-X scale across AppFolio's entire customer base.
Your Impact
  • ML Platform: Design and operate AppFolio's ML infrastructure on AWS — ECS, SageMaker, GPU fleets, model serving, autoscaling, and cost controls.
  • Drive AI Cost Discipline: Optimize cost across all AI applications — provider routing, caching, batch vs. real-time, model size selection, and inference economics.
  • Multi-Provider Reliability: Maintain reliable, multi-provider LLM access across Google, OpenAI, and Anthropic with sensible fallbacks and abstractions.
  • Training & Fine-Tuning Stack: Build the training and fine-tuning stack for Small Language Models, including data pipelines, GPU orchestration, and evaluation.
  • Productionize Research: Partner with Voice & Agents and Research ML engineers to harden their prototypes into production systems with SLOs, on-call rotations, and observability.
  • AI Safety & Guardrails: Operate AppFolio's AI safety and authorization layer — guardrails on AWS, scoped tool permissions, and human-in-the-loop gates for autonomous agent actions.
Qualifications
  • Systems thinker: You think in terms of platforms and long-term leverage, not just features.
  • Production builder: You've built and scaled ML infrastructure in production with meaningful business impact.
  • Ambiguity: You operate effectively in high ambiguity, turning unclear infra problems into clear direction.
  • Owner-operator: You take ownership with a founder/owner-operator mindset, act with urgency, and focus on outcomes.
  • Pace: You have a strong desire to move fast and deliver impact, while maintaining sound engineering judgment.
  • Collaboration: You are humble, collaborative, and low-ego, and you elevate those around you.
  • Sustainability: You value work-life balance as a foundation for sustained high performance.
  • Reliability mindset: You treat ML infra like any other production system — SLOs, on-call, observability, postmortems.
Must Have
  • ML infra at scale: Has built and operated production ML infrastructure on AWS — ECS, SageMaker, GPUs, autoscaling, and cost controls.
  • Inference platforms: Production experience with model serving for both LLMs and custom models; understands quantization, batching, and routing.
  • Provider breadth: Direct experience integrating with Google (Vertex / Gemini), OpenAI, and Anthropic APIs in production.
  • Training capability: Has trained or fine-tuned language models end-to-end; comfortable with deep learning, evaluation, and inference.
  • Cloud-native engineering: Strong Python, Docker, dependency management, and CI/CD for AI workloads.
  • RAG & agents: Working knowledge of LangChain / LangGraph and modern RAG patterns over structured and unstructured data.
  • Cost optimization: Demonstrated experience reducing unit cost of AI workloads without regressing quality or latency.
  • AI safety & authorization: Hands-on experience operating AI guardrails, scoped tool permissions, and authorization layers for production AI systems.
Nice to Have
  • Experience training Small Language Models for production use.
  • GPU performance tuning (vLLM, TensorRT, Triton, or similar).
  • Prior Staff-level role at a company with a significant AI infra footprint.
  • Experience with ontology-driven systems or knowledge graphs supporting AI applications.
  • Contributions to open-source ML infrastructure or LLM tooling.
Location
Find out more about our locations by visiting our site. 
Compensation & Benefits
The compensation that we reasonably expect to pay for this role is: $200,000 - $250,000 base pay. The actual compensation for this role will be determined by a variety of factors, including but not limited to the candidate’s skills, education, experience, and internal equity.
Please note that compensation is just one aspect of a comprehensive Total Rewards package. The compensation range listed here does not include additional benefits or any discretionary bonuses you may be eligible for based on your role and/or employment type.
Regular full-time employees are eligible for benefits - see here.