Internship Remote Data Labelling Jobs in Massachusetts

Drive product decisions using data, partner feedback, and competitive analysis * Work cross ... Remote-first flexibility; work ET hours on your terms * Real decision-making authority with ...

Senior Scientist, Oceans Science

Boston, MA · On-site +1

$99K - $135K/yr

Remote US Home, Austin, Boulder, Boston, New York, San Francisco, Washington DC Duration: This is a ... Analyze, interpret, and communicate scientific data to state, federal, and international ...

Track social performance, report on KPIs, and use data to adjust strategy, improve content, and ... Internship or work experience at a beauty, hair care, fashion, lifestyle, consumer, or creator-led ...

Internship Remote Data Labelling information

What are the key skills and qualifications needed to thrive as an Internship Remote Data Labelling professional, and why are they important?

To excel as an Internship Remote Data Labelling professional, you need strong attention to detail, basic computer literacy, and familiarity with data annotation processes. Most roles require at least a high school diploma or equivalent. Experience with data labelling platforms such as Labelbox or Supervisely and an understanding of file formats like CSV or JSON are commonly expected. Reliability, time management, and effective communication are important soft skills for collaborating remotely and meeting deadlines. Together, these competencies produce the high-quality, consistent labels that accurate machine learning models depend on.

What is the difference between an Internship Remote Data Labelling role and a Data Annotation Specialist?

Aspect | Internship Remote Data Labelling | Data Annotation Specialist
Credentials | Typically students or entry-level with basic computer skills | Often requires experience or training in data annotation tools
Work Environment | Remote, flexible hours, internship setting | Remote or on-site, professional setting
Employer & Industry | Tech companies, AI startups, research projects | AI, machine learning, data services companies
Search & Comparison Intent | Learning opportunity, entry-level role | Professional data labeling work, career development

Internship Remote Data Labelling typically involves entry-level, temporary roles focused on training and learning, often suitable for students. Data Annotation Specialists are more experienced professionals performing detailed labeling tasks for ongoing projects. While both roles involve data labeling, the internship emphasizes skill development, whereas the specialist role centers on professional expertise.

What are some typical challenges faced by remote data labelling interns, and how can they be addressed?

Remote data labelling interns often encounter challenges such as managing repetitive tasks, maintaining high accuracy, and communicating effectively with team members across different time zones. To address these, it's helpful to establish a structured daily routine, regularly review quality guidelines, and use collaboration tools like Slack or Teams to stay connected. Seeking timely feedback from supervisors and participating in virtual team check-ins can also improve both efficiency and data consistency.

What is an Internship Remote Data Labelling job?

An Internship Remote Data Labelling job involves reviewing and tagging data—such as images, text, or audio—from a remote location to help train machine learning algorithms. Interns in this role classify, annotate, or categorize raw data according to specific guidelines provided by the employer or project. This work is crucial for improving the accuracy of AI models, as properly labeled data enables better learning outcomes. Remote data labelling internships are ideal for students or recent graduates looking to gain experience in AI, data science, or related fields while working from anywhere.

Research Engineer, Frontier Capabilities

Lila Sciences

Cambridge, MA • On-site, Remote

Other

Posted 19 days ago


Job description

Your Impact at LILA

The AI Research team is tackling one of the most exciting, open problems in AI: training LLMs to run long-horizon scientific discovery tasks. Our approach spans the full post-training stack - from SFT to asynchronous RL on agentic harnesses - teaching models to plan, use tools, and learn from experience in domains where the ground truth isn't a preference label, but a scientific result.

We're rapidly growing our Research Engineering org and seeking talented engineers and ML practitioners across levels to design, build, and optimize systems to push this frontier: scaling post-training, sharpening reasoning, and unlocking compute-intensive agentic-harness training. This is a rare chance to join an early team with the autonomy, flexibility, and compute to tackle frontier science problems.

We operate with high agency and a bias toward execution. Below are several focus areas within the team. We ask that candidates select the stream that best matches their experience and excitement.

Work Streams

Stream A: GPU Optimization & Training Performance

Maximize hardware utilization across 100B+ parameter asynchronous RL training runs. Responsibilities include profiling, performance optimization, custom kernel development, communication-computation overlap, and long-context throughput improvements. You set and maintain the performance baseline.

Stream B: Stack & Infrastructure

Own the post-training infrastructure end-to-end - supervised fine-tuning, asynchronous RL with tool integration, and data pipelines. Build modular, reproducible workflows with single-command execution. Manage upstream framework upgrades and deliver composable pipelines spanning Data, SFT, and RL stages. You work tightly with Research Scientists to develop and productionize novel algorithms to run at scale.

Stream C: Model Experimentation

Bring deep, hands-on experience training large language models. Lead experimentation on reasoning model development, including mixture-of-experts stabilization, curriculum design, and synthetic reasoning trace generation. You have a bias toward experimental design and tracking, and know how to prioritize runs that yield promising outcomes.

Stream D: Evaluations & Benchmarks

Design and build best-in-class scientific agentic benchmarks and harnesses, along with the dashboards and leaderboards that inform every training decision. You have experience working with well-known public benchmarks and have spent time building bespoke agentic benchmarks and harnesses.

Stream E: Agentic Capabilities & Frontier Research

Train models capable of planning, exploration, and tool use over extended horizons. Advance the state of the art in RL at scale with tool-calling, subgoal decomposition, and shared memory/skills across trials to expand the frontier of scientific agent capabilities.

What You'll Need to Succeed

  • Strong software engineering skills in Python; C++/CUDA a plus
  • Experience with distributed ML training frameworks (Megatron-LM, TorchTitan, DeepSpeed, Ray)
  • Understanding of large-scale model training techniques for 100B+ models
  • Experience with cloud or HPC environments
  • Ability to communicate technical results to internal and external stakeholders

Bonus Points For

  • Prior work with large-scale scientific datasets or domain-specific modeling
  • Contributions to open-source ML frameworks
  • Experience with RL post-training (RLHF, GRPO, tool-augmented RL)
  • Experience training MoE architectures

Location

San Francisco, CA or Cambridge, MA (Remote, Hybrid, and On-Site available depending on team needs).