
Remote LLM Training Jobs (NOW HIRING)

GPU Cluster Architect

$184K - $318K/yr

LLM training, inference) to inform design tradeoffs across latency, bandwidth, and GPU density ... Remote work reimbursement: Up to $85/month for mobile and internet. * Disability & life insurance

... remote office setup in first year + $400 each following year Internet reimbursement up to $75 per ... You will act as a trusted advisor across LLM training, fine-tuning, RAG workloads, distributed ...

Improve accuracy by employing systematic experimentation, prompt engineering, data and training ... Guide teams in adopting and building LLM-powered applications, including context engineering ...

Utilize AI-assisted development tools (e.g., LLM coding assistants, code analysis tools) to enhance ... Familiarity with the AI/ML lifecycle including data preparation, model training, evaluation ...


Remote LLM Training information

Hourly pay shown in salary chart: $15, $42, $77

How much do remote LLM training jobs pay per hour?

As of Jul 4, 2026, the average hourly pay for remote LLM training in the United States is $42.21, according to ZipRecruiter salary data. Most workers in this role earn between $27.88 and $53.85 per hour, depending on experience, location, and employer.
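
To compare these hourly figures with annual salaries, a rough conversion can help. The sketch below assumes a full-time year of 40 hours per week for 52 weeks (2,080 paid hours); that assumption and the function name are illustrative and do not come from the salary data itself. Under it, the $42.21 average works out to roughly $87,800 per year.

```python
# Assumption (not from the source): a full-time year is 40 hours/week
# for 52 weeks, i.e. 2,080 paid hours.
HOURS_PER_YEAR = 40 * 52


def annual_from_hourly(hourly_rate: float) -> float:
    """Convert an hourly rate to an approximate annual salary."""
    return hourly_rate * HOURS_PER_YEAR


if __name__ == "__main__":
    # Hourly figures quoted in the text above.
    for label, rate in [("low", 27.88), ("average", 42.21), ("high", 53.85)]:
        print(f"{label}: ${annual_from_hourly(rate):,.0f} per year")
```

Actual annual pay will differ with unpaid leave, overtime, or part-time schedules, so treat the result as an estimate only.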

What are some common challenges faced by professionals in remote LLM training roles, and how can they be addressed?

Professionals in remote LLM (Large Language Model) training roles often face challenges such as managing distributed team communication, ensuring data privacy, and handling large-scale computational resources. Staying organized with asynchronous collaboration tools and maintaining clear documentation can help streamline teamwork. Additionally, understanding cloud-based infrastructure and adhering to strict data security protocols are essential for handling sensitive datasets. Regular check-ins and knowledge-sharing sessions also foster a supportive and productive remote work environment.

What is the difference between Remote LLM Training and Data Scientist roles?

Aspect | Remote LLM Training | Data Scientist
Required Credentials | Knowledge of NLP, machine learning, programming skills | Statistics, programming, domain expertise
Work Environment | Remote, collaborative teams, AI/ML companies | Remote or on-site, diverse industries
Industry Usage | AI development, NLP projects | Data analysis, predictive modeling

Remote LLM Training focuses on developing and fine-tuning large language models, requiring expertise in NLP and machine learning. Data Scientists analyze data to extract insights and build models across various industries. While both roles involve programming and data skills, Remote LLM Training specializes in AI model development, whereas Data Scientists work on broader data analysis tasks.

What is remote LLM training?

Remote LLM training refers to the process of training large language models (LLMs), such as GPT or similar AI models, on distributed computing resources that are accessed remotely. This allows data scientists and AI engineers to leverage powerful hardware, like GPUs or TPUs, which may not be available locally. Remote LLM training is commonly used to handle the massive computational requirements of modern AI models and enables collaboration among teams in different locations. It also provides scalability, flexibility, and cost-effectiveness for organizations working on advanced AI projects.

What are the key skills and qualifications needed to thrive as a Remote LLM Training Specialist, and why are they important?

To excel in Remote LLM Training, you need a strong background in machine learning, natural language processing, and computer science, often demonstrated by a relevant degree or industry experience. Familiarity with frameworks like PyTorch or TensorFlow, experience with large-scale data management, and knowledge of distributed computing systems are typically required. Strong problem-solving skills, effective communication, and the ability to work independently are vital soft skills in this remote, collaborative environment. These competencies ensure efficient model training, high-quality output, and seamless teamwork across distributed teams.
Infographic showing various Remote LLM Training job openings in the United States as of June 2026, with employment types broken down into 100% Full Time. Highlights a 100% Remote job distribution, with an average salary of $87,800 per year, or $42.2 per hour.
Principal AI Performance Modeling Architect

Advanced Micro Devices, Inc

Santa Clara, CA • On-site, Remote

Full-time

Posted 13 days ago


Company rating: 8.4 out of 10

Based on 7 frontline employees who took The Breakroom Quiz

22nd of 141 rated electronics manufacturers


Job description


WHAT YOU DO AT AMD CHANGES EVERYTHING 

At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond.  Together, we advance your career.  



THE ROLE:

As a Principal Engineer, you will spearhead the next generation of AI infrastructure by defining GPU architecture specifications that enable massive model training at scale. Your expertise will drive 2-3x performance gains in both training and inference pipelines through innovative system design and optimization. You will champion the adoption of cutting-edge techniques across the engineering organization, from efficient attention mechanisms to advanced parallelization strategies. By establishing comprehensive best practices for distributed ML systems, you will create a framework that enables seamless scaling from single-GPU to thousand-GPU deployments.

THE PERSON:

You have a deep understanding of GPU microarchitecture, memory hierarchies, and their impact on large-scale ML workloads. You are passionate about software engineering and possess the leadership skills to drive sophisticated issues to resolution. You communicate effectively and work well with different teams across AMD.

KEY RESPONSIBILITIES:

  • Lead performance modeling and optimization for multi-trillion-parameter LLM training/inference, including dense and Mixture of Experts (MoE) models across multiple modalities (text, vision, speech)
  • Model and optimize novel parallelization strategies across tensor, pipeline, context, expert, and data parallel dimensions
  • Architect memory-efficient training systems utilizing techniques like structured pruning, quantization (MX formats), continuous batching/chunked prefill, and speculative decoding
  • Incorporate and extend SOTA models such as GPT-4, reasoning models (DeepSeek-R1), and multi-modal architectures
  • Collaborate with internal and external stakeholders/ML researchers to disseminate results and iterate at a rapid pace

REQUIRED EXPERIENCE:

  • Extensive senior-level experience optimizing large-scale ML systems and GPU architectures
  • Deep expertise in CUDA programming, GPU memory hierarchies, and hardware-specific optimizations
  • Proven track record architecting large-scale distributed training systems
  • Expert knowledge of transformer architectures, attention mechanisms, and model parallelism techniques

PREFERRED EXPERIENCE:

  • PyTorch, CUDA, TensorRT, OpenAI Triton
  • Distributed systems: Ray, Megatron-LM
  • Performance analysis tools: Nsight Compute, nvprof, PyTorch Profiler
  • KV cache optimization, Flash Attention, Mixture of Experts
  • High-speed networking: InfiniBand, RDMA, NVLink

ACADEMIC CREDENTIALS:

  • Bachelor's, MS, or PhD in Computer Science/Engineering, or equivalent industry experience

LOCATION: Austin, TX or Santa Clara, CA strongly preferred; remote is a possibility for the right candidate

This role is not eligible for visa sponsorship.




Benefits offered are described:  AMD benefits at a glance.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law.   We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.

AMD may use Artificial Intelligence to help screen, assess or select applicants for this position.  AMD’s “Responsible AI Policy” is available here.

 

This posting is for an existing vacancy.

