Deep Learning Quantization Jobs in Texas (NOW HIRING)

Deep understanding of LLMs, embeddings, vector databases (e.g., FAISS, Pinecone, Weaviate ... Use techniques like quantization, distillation, and caching to improve efficiency.

Senior ML Engineer

Austin, TX · On-site +1

$103K - $142K/yr

Advanced Python and deep learning proficiency (PyTorch, HuggingFace Transformers, spaCy ... models via quantization, batching, and throughput tuning * Proficiency with inference ...

Deep Learning Quantization information

What are the key skills and qualifications needed to thrive as a Deep Learning Quantization Engineer, and why are they important?

To excel as a Deep Learning Quantization Engineer, you need a strong background in machine learning, applied mathematics, and computer science, usually supported by an advanced degree in a related field. Familiarity with deep learning frameworks (such as TensorFlow or PyTorch), quantization toolkits, and hardware acceleration platforms is crucial. Analytical thinking, problem-solving, and clear technical communication are standout soft skills in this role. These abilities are essential for efficiently optimizing models for deployment on resource-constrained hardware while maintaining accuracy and performance.

What is the difference between a Deep Learning Quantization Engineer and a Machine Learning Engineer?

Aspect | Deep Learning Quantization | Machine Learning Engineer
Required Credentials | Advanced degrees in AI, Computer Science, or related fields; knowledge of neural networks | Bachelor's or Master's in CS, Data Science, or related fields; programming skills
Work Environment | Research labs, AI development teams, hardware optimization settings | Software development teams, data-driven projects, product-focused environments
Industry Usage | AI hardware optimization, model deployment, edge computing | Model development, data analysis, software solutions across industries

Deep Learning Quantization focuses on reducing model size and improving inference speed through techniques like weight and activation quantization, often in hardware or embedded systems. Machine Learning Engineers develop, implement, and optimize machine learning models for various applications. While both roles require knowledge of AI and programming, Deep Learning Quantization is more specialized in model optimization techniques, whereas Machine Learning Engineers work broadly on model development and deployment.

What is deep learning quantization?

Deep learning quantization is the process of reducing the precision of the numbers used to represent a neural network's parameters, activations, or both. By converting the commonly used 32-bit floating-point values to lower bit-width formats such as 16-bit floating point or 8-bit integers, quantization significantly reduces the memory footprint and computational requirements of deep learning models. This technique helps deploy models efficiently on edge devices and mobile hardware while maintaining acceptable accuracy levels. Quantization is widely used in model optimization for faster inference and lower power consumption.
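
As a rough illustration of the idea, the sketch below maps a handful of floating-point weights to signed 8-bit integers and back. It assumes a symmetric, per-tensor scale, which is only one of several possible schemes; the function names and sample weights are hypothetical and are not taken from any particular toolkit.

```python
def quantize_int8(values):
    """Map floats to signed 8-bit integers using one symmetric scale."""
    max_abs = max(abs(v) for v in values)
    # One integer step corresponds to `scale` in the original float range.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.82, -1.27, 0.03, 0.5, -0.641]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("int8 values:", q)
print("scale:", scale)
print("largest round-trip error:", max_error)
```

Each weight is stored in one byte instead of four, and the round-trip error stays within half a quantization step, which is the trade-off the paragraph above describes.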

What are some common challenges faced when implementing deep learning quantization in production environments?

One of the main challenges in implementing deep learning quantization is balancing model accuracy with computational efficiency, as quantization can sometimes lead to a drop in model performance. Additionally, ensuring hardware compatibility and optimizing for different devices (such as CPUs, GPUs, or edge devices) can require extensive testing and tuning. Collaboration with data scientists, software engineers, and hardware specialists is often essential to successfully deploy quantized models at scale. Staying updated with the latest quantization techniques and frameworks is also important for overcoming these challenges.
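
The accuracy side of that balance can be estimated before deployment. The following is a minimal sketch, assuming symmetric integer quantization at each bit width and randomly generated stand-in weights; `roundtrip_error` is a hypothetical helper, and a real evaluation would measure task accuracy on held-out data rather than raw weight error.

```python
import random


def roundtrip_error(values, bits):
    """Mean absolute error after symmetric integer quantization to `bits` bits."""
    levels = 2 ** (bits - 1) - 1          # e.g. 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / levels if max_abs > 0 else 1.0
    restored = [
        max(-levels, min(levels, round(v / scale))) * scale for v in values
    ]
    return sum(abs(v - r) for v, r in zip(values, restored)) / len(values)


random.seed(0)
# Stand-in weights drawn from a normal distribution (an assumption).
weights = [random.gauss(0.0, 1.0) for _ in range(1000)]

errors = {bits: roundtrip_error(weights, bits) for bits in (16, 8, 4)}
for bits, err in errors.items():
    print(f"{bits}-bit: mean abs error {err:.6f}")
```

Fewer bits save more memory but introduce larger error, so a comparison like this is one quick way to decide which layers can tolerate aggressive quantization.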

Senior Software Developer (Contractor) RITM1788244

RE/SPEC Inc.

Austin, TX • On-site

$54 - $71.25/hr

Full-time

This job post has expired today. Applications are no longer accepted.


Job description

Company Description
Big challenges need bold thinkers.
If you're someone who sees problems as opportunities, you'll thrive here. RESPEC is 100% employee-owned, which means we take ownership of every challenge. Here, your ideas drive real solutions. Since 1969, we've tackled complex challenges in energy transition, infrastructure resilience, digital transformation, and sustainability.
At RESPEC, you'll work alongside clients to take on critical problems. Depending on your expertise, you might design infrastructure in remote locations, develop renewable energy solutions for global projects, or apply data-driven technology to improve mining and water systems.
We bring deep technical knowledge, real-world experience, and a commitment to work that matters. If you're looking for a place where your contributions have real impact, you'll fit right in.
We do not accept unsolicited resumes from third-party recruiters.
Job Description
RESPEC is seeking an experienced Software Developer Specialist to support a major transportation technology initiative for our government client in Austin, Texas. This role focuses on designing, developing, deploying, and optimizing advanced Artificial Intelligence (AI), Machine Learning (ML), Natural Language Processing (NLP), Computer Vision, and cloud-based solutions that support large-scale operational and business objectives.
The ideal candidate will bring deep expertise across AI/ML engineering, cloud platforms, MLOps, DevOps, and production-grade software development while collaborating with technical and business stakeholders in a highly visible public-sector environment.
Responsibilities:
  • Design, develop, test, and deploy scalable AI/ML solutions in cloud environments.
  • Build and maintain production-grade machine learning pipelines and model deployment frameworks.
  • Develop applications leveraging Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), and transformer-based architectures.
  • Create and optimize NLP solutions, recommendation systems, forecasting models, and anomaly detection systems.
  • Design and implement computer vision solutions for real-time and large-scale data processing.
  • Develop and maintain MLOps workflows, model monitoring, and automated retraining processes.
  • Build and manage CI/CD pipelines supporting AI and software delivery.
  • Containerize and deploy applications using Docker and Kubernetes.
  • Collaborate with cross-functional teams to gather requirements and translate business needs into technical solutions.
  • Optimize model performance through quantization, pruning, distillation, and distributed training techniques.
  • Work with structured, unstructured, vector, and spatial datasets to support analytics and predictive modeling initiatives.
  • Document solutions, architectures, and deployment processes according to client standards.
  • Participate in technical reviews, troubleshooting, and ongoing operational support.

Qualifications
8+ Years of Experience Required:
Cloud Platforms & AI Infrastructure
  • AWS, Microsoft Azure, Google Cloud Platform (GCP), or Oracle Cloud Infrastructure (OCI)
  • Deploying and managing machine learning workloads in cloud environments
  • Utilizing AI/ML services across major cloud providers

DevOps & Platform Engineering
  • Ansible
  • Docker
  • Kubernetes
  • CI/CD implementation and automation

Database Technologies
  • SQL databases including PostgreSQL and MySQL
  • NoSQL databases
  • Vector databases

Automation & Scripting
  • Bash scripting
  • PowerShell scripting

CI/CD Tools
  • Azure DevOps
  • GitHub Actions
  • Jenkins
  • Comparable enterprise CI/CD platforms

3+ Years of Experience Required:
Python Development
  • Production-level Python application development
  • Building scalable backend and AI-driven solutions

Natural Language Processing & Large Language Models
  • Transformer architectures
  • Retrieval-Augmented Generation (RAG)
  • Fine-tuning models
  • Prompt engineering
  • LLM application development

Time Series Analytics
  • Forecasting
  • Sequential modeling
  • Anomaly detection
  • Real-time monitoring systems

Recommendation Systems
  • Collaborative filtering
  • Ranking algorithms
  • Personalization engines
  • Content recommendation platforms

MLOps
  • MLflow
  • Weights & Biases
  • Kubeflow
  • Apache Airflow
  • Similar MLOps platforms

Distributed AI Training
  • Multi-GPU environments
  • Multi-node training
  • Data parallelism
  • Large-scale model training

Computer Vision
  • PyTorch
  • TensorFlow
  • OpenCV
  • YOLO
  • Object detection
  • Image segmentation
  • Real-time inference systems

Feature Engineering
  • Feature stores such as Feast or Tecton
  • Advanced feature engineering methodologies

Model Optimization
  • Quantization
  • Pruning
  • Knowledge distillation

Alternative/Open-Source LLM Platforms
  • Ollama
  • Hugging Face
  • Other non-frontier/open-source model ecosystems

2+ Years of Experience Required:
Production AI/ML Delivery
  • Demonstrated experience building and deploying at least 2-3 machine learning models used by real-world users in production environments

Preferred Qualifications:
Candidates with one or more of the following qualifications will receive additional consideration:
  • GIS and spatial data analysis experience
  • Transportation, logistics, or smart-city technology experience
  • Computer vision applications involving infrastructure, roadway, or vehicle-related data
  • Public-sector data governance, compliance, and security experience
  • Unreal Engine experience
  • Digital twin implementation experience
  • Google Maps Cesium API experience
  • Polygonflow Dash experience

Additional Information
Schedule:
  • Monday through Friday
  • 8:00 AM to 5:00 PM Central Time
  • State holidays observed per client schedule
On-Site Requirement:
  • Minimum of 4 days per week on-site in Austin, Texas
  • Remote work flexibility is limited and subject to client approval
Important:
Candidates must be able to reliably commute to the Austin office throughout the engagement.
Work Authorization
Applicants must be legally authorized to work in the United States throughout the duration of the engagement.
Background Screening
Selected candidates must successfully complete required background investigations before beginning work, including:
  • Criminal history review
  • State and county-level checks
  • Sex offender registry review
  • Additional client-required screenings if applicable
Employment Conditions
  • Overtime may occasionally be required and must receive prior client approval.
  • Candidates may be asked to support occasional evening, weekend, or holiday activities based on project demands.
  • Time reporting must comply with client-established procedures and systems.
Candidate Considerations Before Applying
Please review the following carefully before applying:
  • Relocation assistance is not specified.
  • Candidates must be available to start near the anticipated project start date.
  • Extended absences may impact project eligibility.
  • Background screening is mandatory.
  • Only candidates authorized to work in the United States will be considered.
  • Compensation is subject to client-established limits and final approval.
If you are passionate about applying advanced AI, machine learning, computer vision, and cloud technologies to impactful public-sector initiatives, we encourage you to apply and join RESPEC's growing government technology practice.
All your information will be kept confidential according to EEO guidelines.