Assistant Cuda Jobs in Bothell, WA (NOW HIRING)

Software Engineer - C++ GPU Performance

Seattle, WA · On-site

$159.30K/yr

Strong knowledge of CUDA as applied to recent GPU microarchitectures (e.g., Ampere, Blackwell) and ... These tools assist our recruitment team but do not replace human judgment. Final hiring decisions ...

Solutions Architect, AI and ML

Seattle, WA · On-site

$71.75 - $94.50/hr

... assist customers with adoption of GPU hardware and Software, as well as building and deploying ... CUDA, RAPIDS, Triton etc.) * System-level experience specifically GPU-based systems * Experience ...

Solutions Architect, AI and ML

Seattle, WA

$71.75 - $94.50/hr

... assist customers with adoption of GPU hardware and Software, as well as building and deploying ... CUDA, RAPIDS, Triton etc.) * System-level experience specifically GPU-based systems * Experience ...

Solutions Architect, AI and ML

Redmond, WA · On-site

$70.50 - $93/hr

... assist customers with adoption of GPU hardware and Software, as well as building and deploying ... CUDA, RAPIDS, Triton etc.) * System-level experience specifically GPU-based systems * Experience ...

Solutions Architect, AI and ML

Redmond, WA

$70.50 - $93/hr

... assist customers with adoption of GPU hardware and Software, as well as building and deploying ... CUDA, RAPIDS, Triton etc.) * System-level experience specifically GPU-based systems * Experience ...

Senior Software Engineer, AI Resiliency

Redmond, WA · On-site

$137.20K - $180.90K/yr

Support Production Deployments: Assist in debugging and performance tuning large-scale AI workloads ... Hands-on experience with CUDA, NCCL, or MPI for GPU-accelerated computing, especially at extreme ...

Assistant Cuda information

What are the key skills and qualifications needed to thrive as an Assistant CUDA Developer, and why are they important?

To thrive as an Assistant CUDA Developer, you need strong programming skills in C/C++, a solid understanding of parallel computing concepts, and familiarity with GPU architectures, often backed by a degree in computer science or a related field. Proficiency with CUDA development tools, debugging utilities, and version control systems like Git is typically required. Attention to detail, problem-solving abilities, and effective communication are crucial soft skills for collaborating with teams and optimizing code. These skills ensure efficient development of high-performance applications and successful integration of GPU acceleration into software solutions.

What are some common challenges faced by Assistant CUDA developers when optimizing code for GPU performance?

Assistant CUDA developers often encounter challenges such as managing memory efficiently between the host and device, ensuring proper kernel parallelization, and avoiding thread divergence. Balancing occupancy and resource usage can also be tricky, as it requires a deep understanding of how CUDA schedules and executes threads. Collaborating closely with data scientists and other engineers is essential to identify performance bottlenecks and implement effective optimizations.
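
To make the occupancy trade-off mentioned above concrete, here is a minimal Python sketch of a theoretical-occupancy estimate. The function name is hypothetical, and the per-SM limits used as defaults (`max_threads_per_sm`, `max_blocks_per_sm`, `regs_per_sm`, `smem_per_sm`) are illustrative assumptions, not the specification of any particular GPU. The model is also simplified: it ignores register and shared-memory allocation granularity, so treat the result as a rough estimate and not a substitute for a profiler.

```python
def theoretical_occupancy(threads_per_block, regs_per_thread, smem_per_block=0,
                          max_threads_per_sm=2048, max_blocks_per_sm=32,
                          regs_per_sm=65536, smem_per_sm=102400, warp_size=32):
    """Estimate the fraction of an SM's warp slots a kernel can keep resident.

    All hardware limits are assumed, illustrative values; substitute the
    numbers for the actual target device.
    """
    # Threads are scheduled in whole warps, so round the block size up.
    warps_per_block = -(-threads_per_block // warp_size)
    max_warps_per_sm = max_threads_per_sm // warp_size

    # How many blocks fit under each per-SM resource limit.
    limit_warps = max_warps_per_sm // warps_per_block
    regs_per_block = regs_per_thread * warps_per_block * warp_size
    limit_regs = regs_per_sm // regs_per_block if regs_per_block else max_blocks_per_sm
    limit_smem = smem_per_sm // smem_per_block if smem_per_block else max_blocks_per_sm

    # The tightest limit decides how many blocks are resident at once.
    resident_blocks = min(max_blocks_per_sm, limit_warps, limit_regs, limit_smem)
    return resident_blocks * warps_per_block / max_warps_per_sm


if __name__ == "__main__":
    # Doubling register use per thread halves occupancy under these assumed limits.
    print(theoretical_occupancy(256, 32))  # 1.0
    print(theoretical_occupancy(256, 64))  # 0.5
```

Under these assumed limits, the second call shows why occupancy and resource usage pull against each other: a kernel that uses more registers per thread may run each thread faster but leaves fewer warps available to hide memory latency.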

What are Assistant Cuda roles and responsibilities?

An Assistant CUDA developer typically supports senior CUDA (Compute Unified Device Architecture) developers or teams working with NVIDIA's parallel computing platform. Responsibilities include helping to develop, test, and optimize GPU code that accelerates computing tasks, debugging CUDA applications, and maintaining documentation. The role may also cover routine work such as performance benchmarking, code reviews, and collaborating with other team members to implement efficient GPU solutions. It is most common in organizations that rely on high-performance computing, scientific simulations, or AI workloads.

What is the difference between Assistant Cuda vs Assistant Data Analyst?

| Aspect | Assistant Cuda | Assistant Data Analyst |
| --- | --- | --- |
| Required Credentials | Typically a relevant degree in computer science or related field | Often a degree in data science, statistics, or related field |
| Work Environment | Tech companies, software development teams, AI projects | Business, finance, marketing, or research departments |
| Employer & Industry Usage | Used in tech and AI industries for supporting CUDA programming tasks | Common in data-driven industries for data processing and analysis |

Assistant Cuda and Assistant Data Analyst roles share some technical background but differ mainly in focus. Assistant Cuda primarily supports GPU programming and AI development, while Assistant Data Analyst focuses on data interpretation and reporting. Both roles require relevant technical skills and are found in industries leveraging data and technology, but their daily tasks and industry applications vary significantly.


Senior AI Inference Engineer - Model Optimization & Deployment

Zoox

Seattle, WA • On-site

$242K - $290K/yr

Full-time

Medical, Life, PTO

Posted 19 days ago


Job description

The Perception team is pioneering the development of a multi-modality foundation model to drive the next generation of autonomous system intelligence.

As a Model Optimization & Deployment Engineer, you will focus on bringing highly efficient, production-ready large-scale models to our on-vehicle stack. We are looking for experts with hands-on experience in compressing, accelerating, and deploying complex models (LLMs, VLMs, or FMs) for power- and thermal-constrained vehicle SoCs. You will optimize ML models, write custom CUDA kernels, and build highly concurrent inference code to ensure real-time, deterministic execution on edge devices.
In this role, you will:
  • Optimize large-scale models (Multi-Modal Sensor Fusion models, LLMs, VLMs) using advanced quantization (PTQ, QAT), pruning, mixed-precision inference frameworks, and parameter-efficient fine-tuning (LoRA, QLoRA).
  • Architect and implement model conversion and compilation pipelines using TensorRT for edge deployment.
  • Perform rigorous parity checking, accuracy recovery, and latency benchmarking between PyTorch frameworks and compiled edge binaries.
  • Develop and optimize custom ML OPs and TensorRT Plugins with efficient CUDA kernels to minimize latency and maximize memory bandwidth on AI accelerators.
  • Write production-level, low-latency, memory-safe C++ and CUDA code for real-time inference on vehicle systems.
Qualifications:
  • Deep expertise in model quantization (PTQ, QAT) and mixed-precision inference frameworks (INT8, FP8, FP4, BF16/FP16).
  • Proven experience optimizing large-scale models (Multi-Modal Sensor Fusion models, LLMs, VLMs/VLAs) utilizing Efficient Attention mechanisms (e.g., FlashAttention, Linear Attention), KV-cache optimization (e.g., PagedAttention) and Speculative Decoding.
  • Extensive experience with model conversion/compilation pipelines (e.g., ONNX, TensorRT, torch.compile) and with rigorous latency benchmarking and model-quality parity evaluation.
  • Proficiency in low-level programming for AI accelerators, specifically developing and optimizing custom ML OPs and TensorRT Plugins with efficient CUDA kernel implementations.
  • Production-level C++ (14/17/20) and Python programming skills, with experience developing concurrent, memory-safe, real-time inference code for edge devices.
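
For readers less familiar with the quantization and parity-checking terms in the list above, the following is a minimal Python sketch of symmetric per-tensor INT8 quantization followed by a simple parity check. It is an illustration under stated assumptions, not the method used by this team or by TensorRT: the function names are hypothetical, the scale is derived from the maximum absolute value with no calibration data, and real pipelines typically work per channel on tensors and not on Python lists.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization to the range [-127, 127] (assumed scheme)."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map INT8 codes back to floating point."""
    return [q * scale for q in quantized]


def max_abs_error(reference, candidate):
    """Parity metric: largest element-wise deviation between two outputs."""
    return max(abs(r - c) for r, c in zip(reference, candidate))


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.003, 1.0]  # made-up example values
    codes, scale = quantize_int8(weights)
    restored = dequantize(codes, scale)
    error = max_abs_error(weights, restored)
    # Rounding to the nearest code bounds the error by half a quantization step.
    print(codes, error <= scale / 2 + 1e-12)
```

The same `max_abs_error` idea extends to comparing a reference framework's outputs against a compiled engine's outputs on identical inputs, usually alongside task-level accuracy metrics.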
Bonus Qualifications:
  • Familiarity with SOTA autonomous driving perception algorithms (temporal 3D object detection, BEV, 3D Occupancy Networks) and multi-modal sensor processing (Vision, LiDAR, Radar).
  • Experience with distributed training pipelines and model/tensor parallelism (PyTorch Distributed, Ray, DeepSpeed, Megatron-LM) and runtime efficiency optimization for GPU clusters.
  • Experience with end-to-end autonomous driving paradigms (VLM/VLA models, Foundation models) and edge deployment technologies (e.g., TensorRT-LLM).
$242,000 - $290,000 a year
Base Salary Range
 
There are three major components to compensation for this position: salary, Amazon Restricted Stock Units (RSUs), and Zoox Stock Appreciation Rights. A sign-on bonus may be offered as part of the compensation package. The listed range applies only to the base salary. Compensation will vary based on geographic location and level. Leveling, as well as positioning within a level, is determined by a range of factors, including, but not limited to, a candidate's relevant years of experience, domain knowledge, and interview performance. The salary range listed in this posting is representative of the range of levels Zoox is considering for this position.
 
Zoox also offers a comprehensive package of benefits, including paid time off (e.g. sick leave, vacation, bereavement), unpaid time off, Zoox Stock Appreciation Rights, Amazon RSUs, health insurance, long-term care insurance, long-term and short-term disability insurance, and life insurance.
About Zoox
Zoox is developing the first ground-up, fully autonomous vehicle fleet and the supporting ecosystem required to bring this technology to market. Sitting at the intersection of robotics, machine learning, and design, Zoox aims to provide the next generation of mobility-as-a-service in urban environments. We're looking for top talent that shares our passion and wants to be part of a fast-moving and highly execution-oriented team.

Accommodations
If you need an accommodation to participate in the application or interview process please reach out to [email protected] or your assigned recruiter.

A Final Note:
You do not need to match every listed expectation to apply for this position. Here at Zoox, we know that diverse perspectives foster the innovation we need to be successful, and we are committed to building a team that encompasses a variety of backgrounds, experiences, and skills.
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.