Deep Learning Quantization Jobs in Washington, DC

Software Engineer II

Herndon, VA · On-site +1

$100K - $137K/yr

... on deep learning * 1+ years of hands-on experience fine-tuning large foundation models (LLMs or ... Experience with model quantization and inference optimization (vLLM, TensorRT, ONNX) Experience ...

Software Engineer II

Herndon, VA · On-site

$100K - $137K/yr

... on deep learning * 1+ years of hands-on experience fine-tuning large foundation models (LLMs or ... Experience with model quantization and inference optimization (vLLM, TensorRT, ONNX) Experience ...

Deep understanding of machine learning architectures, model selection, training, and optimization ... Strong background in AI/ML performance optimization, including model compression, quantization, or ...

... learning, including multi-agent RL, deep Q-networks, or policy gradient methods • Experience with ... quantization, or edge deployment for resource-constrained environments • Experience with real ...

Deploy and manage machine learning models in production using tools like MLflow, Kubeflow, or AWS ... Optimize models for production (e.g., via quantization or pruning) and ensure efficient resource ...

Deep Learning Quantization information

Salary chart (per year): $12.5K · $95K · $158.6K

How much do deep learning quantization jobs pay per year?

As of Jun 19, 2026, the average yearly pay for deep learning quantization jobs in Washington, DC is $95,008, according to ZipRecruiter salary data. Most workers in this role earn between $81,500 and $157,400 per year, depending on experience, location, and employer.

What are the key skills and qualifications needed to thrive as a Deep Learning Quantization Engineer, and why are they important?

To excel as a Deep Learning Quantization Engineer, you need a strong background in machine learning, applied mathematics, and computer science, usually supported by an advanced degree in a related field. Familiarity with deep learning frameworks (such as TensorFlow or PyTorch), quantization toolkits, and hardware acceleration platforms is crucial. Analytical thinking, problem-solving, and clear technical communication are standout soft skills in this role. These abilities are essential for efficiently optimizing models for deployment on resource-constrained hardware while maintaining accuracy and performance.

What is the difference between a Deep Learning Quantization Engineer and a Machine Learning Engineer?

| Aspect | Deep Learning Quantization | Machine Learning Engineer |
| --- | --- | --- |
| Required Credentials | Advanced degrees in AI, Computer Science, or related fields; knowledge of neural networks | Bachelor's or Master's in CS, Data Science, or related fields; programming skills |
| Work Environment | Research labs, AI development teams, hardware optimization settings | Software development teams, data-driven projects, product-focused environments |
| Industry Usage | AI hardware optimization, model deployment, edge computing | Model development, data analysis, software solutions across industries |

Deep Learning Quantization focuses on reducing model size and improving inference speed through techniques like weight and activation quantization, often in hardware or embedded systems. Machine Learning Engineers develop, implement, and optimize machine learning models for various applications. While both roles require knowledge of AI and programming, Deep Learning Quantization is more specialized in model optimization techniques, whereas Machine Learning Engineers work broadly on model development and deployment.

What is deep learning quantization?

Deep learning quantization is the process of reducing the precision of the numbers used to represent a neural network's parameters, activations, or both. By converting the typically used 32-bit floating-point values to lower bit-width formats such as 16-bit floating point or 8-bit integers, quantization significantly reduces the memory footprint and computational requirements of deep learning models. This technique helps deploy models efficiently on edge devices and mobile hardware while maintaining acceptable accuracy levels. Quantization is widely used in model optimization for faster inference and lower power consumption.
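
As a rough illustration of the idea, the sketch below maps a handful of floating-point weights to signed 8-bit integers and back. It assumes a simple affine (scale plus zero point) scheme over the whole list; the function names and sample weights are hypothetical, and production toolkits typically add per-channel scales, calibration data, and hardware-specific kernels.

```python
def quantize_int8(values):
    """Affine quantization of a list of floats to signed 8-bit integers."""
    qmin, qmax = -128, 127
    # The representable range must include 0 so that zero maps exactly.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Map the 8-bit integers back to approximate float values."""
    return [(x - zero_point) * scale for x in q]


# Hypothetical weights, chosen only for illustration.
weights = [-1.2, -0.3, 0.0, 0.45, 2.1]
q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 values:", q)
print("restored:   ", [round(r, 4) for r in restored])
print("max error:  ", max_error)
```

Each value is stored in 8 bits instead of 32, a 4x reduction in storage, and for these sample weights the rounding error stays within half of one quantization step (`scale / 2`).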

What are some common challenges faced when implementing deep learning quantization in production environments?

One of the main challenges in implementing deep learning quantization is balancing model accuracy with computational efficiency, as quantization can sometimes lead to a drop in model performance. Additionally, ensuring hardware compatibility and optimizing for different devices (such as CPUs, GPUs, or edge devices) can require extensive testing and tuning. Collaboration with data scientists, software engineers, and hardware specialists is often essential to successfully deploy quantized models at scale. Staying updated with the latest quantization techniques and frameworks is also important for overcoming these challenges.
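
The accuracy versus efficiency trade-off can be sketched with a small experiment. The example below assumes symmetric quantization with one scale for the whole tensor and uses randomly generated numbers as a stand-in for real model weights; the helper names are hypothetical, and reconstruction error is only a proxy for the accuracy drop a real model would see.

```python
import random


def quantize_symmetric(values, bits):
    """Quantize floats to integers in [-qmax, qmax], then map them back to floats."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    scale = max_abs / qmax
    restored = []
    for v in values:
        q = max(-qmax, min(qmax, round(v / scale)))
        restored.append(q * scale)
    return restored


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


rng = random.Random(0)  # fixed seed so the run is repeatable
weights = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # stand-in for real weights

errors = {}
for bits in (8, 4, 2):
    restored = quantize_symmetric(weights, bits)
    errors[bits] = mean_squared_error(weights, restored)
    size_bytes = len(weights) * bits // 8
    print(f"{bits}-bit: {size_bytes} bytes, MSE = {errors[bits]:.6f}")
```

Storage shrinks in proportion to the bit width while the reconstruction error grows, which is why lower-precision deployments usually need the testing and tuning described above.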

What are popular job titles related to Deep Learning Quantization jobs in Washington, DC?

For Deep Learning Quantization jobs in Washington, DC, the most frequently searched job titles are:
Software Engineer II

Software Engineer II

Quevera LLC

Herndon, VA • On-site, Remote

$100K - $137K/yr

Other

Medical, Dental, Vision, Life, Retirement

Posted 21 days ago


Job description

Quevera is seeking a Software Engineer II to join our team. At Quevera, we don't just offer jobs; we provide opportunities to be part of a dynamic, forward-thinking community that fosters innovation, collaboration, and personal growth. You'll work with industry experts, take on exciting challenges, and have the creative freedom to build cutting-edge solutions, all while advancing your career in a space that truly values your skills and ideas.
REQUIRED - You MUST have a current TS/SCI with Polygraph clearance to apply for this role. Only those with a current TS/SCI with Poly clearance will be considered.
Duties and Responsibilities:
  • Design and execute fine-tuning pipelines for Vision-Language Models (VLMs) on domain-specific imagery datasets, including data preprocessing, training orchestration, and hyperparameter optimization
  • Develop and implement evaluation frameworks for multimodal model performance, including task-specific metrics for image understanding, visual question answering, and spatial reasoning
  • Build scalable training infrastructure on AWS (SageMaker, EC2 GPU instances) for distributed fine-tuning of large multimodal models
  • Engineer data pipelines for curating, annotating, and transforming geospatial imagery datasets into model-ready formats for supervised and instruction-tuning workflows
  • Collaborate with applied scientists and solutions architects to iterate on model architectures, adapter strategies (LoRA/QLoRA), and inference optimization techniques

Required Experience:
  • TS/SCI with CI Poly required with current NGA eligibility and SBU/SECNet/COE accounts
  • Must be willing to work in SCIF daily or as needed
  • 5+ years of professional machine learning engineering experience with a focus on deep learning
  • 1+ years of hands-on experience fine-tuning large foundation models (LLMs or VLMs)
  • Experience with parameter-efficient fine-tuning methods (LoRA, QLoRA, adapters)
  • Familiarity with supervised fine-tuning, instruction tuning, and RLHF/DPO alignment techniques
  • 4+ years of advanced Python development for ML workloads
  • Strong proficiency with PyTorch and the HuggingFace ecosystem (Transformers, PEFT, Datasets, Accelerate)
  • Experience with distributed training frameworks (DeepSpeed, FSDP, or Megatron)
  • 3+ years of experience with computer vision or multimodal models
  • Understanding of vision transformer architectures (ViT, CLIP, LLaVA-family models, or similar)
  • Experience processing and augmenting image datasets at scale
  • 3+ years of experience with AWS ML infrastructure
    SageMaker Training jobs, Processing jobs, and endpoint deployment
    GPU instance selection, multi-node training, and cost optimization on EC2 (P4/P5/G5/G6e)
    S3 data management for large-scale training datasets
  • 2+ years of experience building ML evaluation pipelines
    Automated benchmarking, metric computation, and result analysis
    Experience with both quantitative metrics and qualitative/human evaluation approaches
  • Strong software engineering fundamentals (version control, testing, CI/CD for ML workflows)

Desired Experience:
  • 2+ years of experience with geospatial or remote sensing imagery
    Familiarity with electro-optical and SAR satellite imagery formats and characteristics
    Understanding of geospatial metadata, coordinate systems, and imagery preprocessing
  • Experience with model quantization and inference optimization (vLLM, TensorRT, ONNX)
  • Experience with MLOps and experiment tracking tools (MLflow, Weights & Biases, SageMaker Experiments)
  • Familiarity with data annotation platforms and active learning workflows for imagery
  • Experience with containerized ML workflows (Docker, ECR, ECS/EKS)
  • 2+ years of experience with Authority to Operate (ATO) processes in government environments
    Implementation of NIST 800-53 controls and security compliance for ML systems
  • Experience deploying models in air-gapped or disconnected environments
  • Familiarity with multimodal evaluation benchmarks (MMMU, MMBench, GQA, or domain-specific equivalents)
  • Publications or demonstrated contributions in computer vision, VLMs, or multimodal AI
  • Experience with synthetic data generation for training data augmentation

Why Join Quevera?
Award-Winning Culture
Quevera was recognized as a Top Workplace in the Washington, DC/Baltimore region for 2025, marking our fifth consecutive year receiving this distinction based on employee feedback.
Outstanding Benefits
We invest in our employees and their families through a highly competitive benefits package, including:
  • 100% employer-paid medical coverage (optional plan)
  • Competitive options for Medical, Dental and Vision insurance
  • Employer-paid short-term and long-term disability coverage
  • Employer-paid life insurance
  • $5,000 annually for education, training, certifications, and professional development
  • Career advancement through our structured IQWay Program
  • Up to 6% 401(k) match
  • Additional 4% profit-sharing contribution

At Quevera, we believe exceptional people deserve exceptional opportunities. We're more than just a workplace: we're a team of innovators, problem-solvers, and industry experts committed to delivering mission-critical solutions while fostering professional growth, collaboration, and technical excellence.
Quevera is an equal opportunity/affirmative action employer. All qualified applicants will receive consideration for employment without regard to sex, gender identity, sexual orientation, race, color, religion, national origin, disability, protected veteran status, age or any other characteristic protected by law.