Annotation Judge Jobs (NOW HIRING)

Annotation Data Scientist, Evaluation Integrity (Siri)

$157K - $280K/yr

This role sits at the intersection of data science, human annotation engineering, and evaluation methodology, and is instrumental in turning human judgment into a rigorous, reproducible signal that ...

Apple

Annotation Data Scientist, Evaluation Integrity (Siri)

Cambridge, MA

$157K - $280K/yr

Apple

Annotation Data Scientist, Evaluation Integrity (Siri)

Cambridge, MA · On-site

Q Analysts

Data Annotation Technician

Kirkland, WA · On-site

Data Annotation Technician Join Q Analysts and become part of a world-class organization. Q ... Analyze content and use best judgement for proper answers * Meet daily, weekly and monthly velocity ...

Apple

Data Scientist - Survey Design, Data Annotation, and Machine Learning Evaluation

Cupertino, CA

$144K - $263K/yr

This person will work closely with ML Engineers to manage and analyze our human and automated data annotation processes, and to develop, test, and refine LLM judges for generative AI model evaluation.

UnitedHealth Group

Manager of Clinical Annotation - Remote

Atlanta, GA · Remote

$91K - $163K/yr

The Manager of Clinical Annotation leads a team of Registered Nurse Clinical Annotators responsible ... Use Sound Judgement * Performance Value: Deliver Quality Results * Drive for Results * Manage Time ...

Apple

Data Scientist - Survey Design, Data Annotation, and Machine Learning Evaluation

Cupertino, CA · On-site

... Large Language Model judges. We are looking for a skilled Data Scientist to join our Machine ... A successful candidate is experienced in survey design, data annotation, LLM prompt engineering and ...

Q ANALYSTS LLC

Data Annotation Technician

Kirkland, WA · On-site +1

Q Analysts is looking for Data Annotation Technicians to support Ground Truth Data Collection ... Analyze content and use best judgement for proper answers * Meet daily, weekly and monthly velocity ...

Q ANALYSTS LLC

Data Annotation Technician

Kirkland, WA

BC Forward

Project Coordinator III (AI Annotation)

Indianapolis, IN · On-site

$51 - $55/hr

Project Coordinator III (AI Annotation) Location: Remote Duration: Contract - 6 months Pay Range ... adherence to company policies, judgment, stress management, safe and respectful work ...

Bloomberg LP

Team Leader - Annotations Operations and Governance

Princeton, NJ · On-site

This role leads two tightly coupled but distinct capabilities: (1) governing canonical annotation standards and judgment frameworks, and (2) applying those standards at scale through operational ...

Barker Staffing Solutions LLC

Remote Video Annotators

Hampton, VA · Remote

Apply Advanced Annotation Capability by using the following skills: Use your experience with ... Judgment by using the following skills: Strong understanding of traffic rules and right-of-way ...

UnitedHealth Group

Nurse Clinical Annotator - Remote in US

Atlanta, GA · Remote

Maintain detailed documentation of annotation decisions, rationales, and guideline interpretations ... Use Sound Judgement * Performance Value: Deliver Quality Results * Drive for Results * Manage Time ...

Welo Global

Data Annotation Specialist - Seattle (On Site)

Seattle, WA · On-site

Conduct detailed data annotation and quality assurance of natural language datasets following ... These tools assist our recruitment team but do not replace human judgment. Final hiring decisions ...

UnitedHealthcare At Home

Nurse Clinical Annotator - Remote in US

Review medical records and interpret clinical documentation with accuracy and clinical judgment ... Maintain detailed documentation of annotation decisions, rationales, and guideline interpretations

Johns Hopkins University

Assistant Research Scientist (PREP0004176)

Gaithersburg, MD · On-site

Gain familiarity with existing literature on data annotation and LLM as judge * Understand NIST's role and ongoing efforts in assessing and measuring the validity and reliability of AI-related risks ...

Apple

AI/ML Evaluation Specialist, Human Data

Cupertino, CA · On-site

... annotation, and human evaluation efforts across Apple Music, App Store, TV+, Podcasts, and Books ... You will identify where human judgment is essential and where it could be better directed, then ...

Welo Data

Data Annotation Specialist - Seattle (On Site)

Seattle, WA

Propio

AI Data Strategy Engineer / Applied Scientist, LLM Data

Leawood, KS · On-site

Experience with synthetic data generation, active learning, weak supervision, LLM-as-judge workflows, or automated data quality scoring. * Experience with modern annotation and data platforms such as ...

Propio Language Services

AI Data Strategy Engineer / Applied Scientist, LLM Data

Overland Park, KS · On-site

Propio

AI Data Strategy Engineer / Applied Scientist, LLM Data

Overland Park, KS

Annotation Judge Jobs

Annotation Judge information

What is an Annotation Judge?

An Annotation Judge is a professional who evaluates the quality and accuracy of labeled data, such as text, images, or audio, which has been annotated for use in machine learning and artificial intelligence projects. Their main responsibility is to review, verify, and ensure that the data annotations meet specific guidelines and standards. Annotation Judges play a critical role in improving the reliability of training datasets, which directly impacts the performance of AI systems. They often work closely with data annotators, quality assurance teams, and project managers to maintain high data quality.

What are the key skills and qualifications needed to thrive as an Annotation Judge, and why are they important?

To thrive as an Annotation Judge, you need strong analytical skills, attention to detail, and subject matter expertise relevant to the data being evaluated, usually supported by a degree in a related field. Familiarity with annotation platforms, data labeling tools, and quality assurance systems is typically required. Excellent communication, impartiality, and critical thinking help you provide clear feedback and maintain high annotation standards. These skills are crucial to ensure data accuracy and consistency, which directly impact the performance of machine learning models.

What is the difference between Annotation Judge vs Data Annotator?

Aspect	Annotation Judge	Data Annotator
Credentials	Typically requires basic education, sometimes certification in data labeling	Usually requires similar or less formal education, often on-the-job training
Work Environment	Office or remote, working with data labeling platforms	Office or remote, performing data labeling tasks
Industry Usage	Used across AI, machine learning, and data science projects	Common in AI, machine learning, and data preparation workflows
Search & Comparison Intent	Often compared for roles involving data review and quality control	Compared for entry-level data labeling roles

The main difference between an Annotation Judge and a Data Annotator lies in their roles. Annotation Judges typically review and validate annotations made by Data Annotators, ensuring quality and accuracy. Data Annotators perform the initial labeling of data. Both roles are essential in AI data pipelines, with Annotation Judges focusing on quality control and Data Annotators on data preparation.

What are some common challenges faced by Annotation Judges, and how can they effectively overcome them?

Annotation Judges often face challenges such as maintaining impartiality, handling ambiguous or subjective data, and ensuring high consistency across large volumes of work. To overcome these, it’s essential to follow established guidelines closely, communicate regularly with team members for clarification, and participate in calibration sessions. Staying detail-oriented and seeking feedback can also help maintain accuracy and fairness in their assessments.

Infographic showing various Annotation Judge job openings in the United States as of July 2026, with employment types broken down into 1% As Needed, 33% Full Time, 31% Part Time, 34% Contract, and 1% Nights. It highlights a job distribution of 34% Physical and 66% Remote.

Annotation Data Scientist, Evaluation Integrity (Siri)

Apple

Cambridge, MA

$157K - $280K/yr

Full-time

Medical, Dental, Retirement

Re-posted 29 days ago

Apple rating

8.1

Based on 670 frontline employees who took The Breakroom Quiz

5th of 30 rated technology retailers

Job description

Join the team redefining what a deeply personal and integrated assistant can be.
As part of the Siri organization, you will help shape one of the world's most widely used AI assistants, powered by our next-generation of Apple Intelligence, with capabilities like personal context understanding and on-screen awareness, built with privacy from the ground up. Your work will have direct, meaningful impact for users across iOS, iPadOS, macOS, watchOS, and visionOS.
This is a rare opportunity to build at the intersection of cutting-edge AI and human-centered design, shipping technology that is centered around users and their needs.
Description
Play a part in the ongoing revolution in human-computer interaction. Siri is evolving - and the way we evaluate it has to evolve with it. Join the Evaluation Integrity team to help build the trusted quality signal behind every Siri release.
Within the Siri evaluation organization, the Human Evaluation sub-team is responsible for answering the question: can we trust our evals? We do that by designing human-in-the-loop (HITL) annotation tasks that scrutinize every moving part of an agentic evaluation - the simulated user agent, the conversation it has with Siri, and the automated evaluators that grade the exchange. This role sits at the intersection of data science, human annotation engineering, and evaluation methodology, and is instrumental in turning human judgment into a rigorous, reproducible signal that directly informs pre-ship model and product decisions.
As an Annotation Data Scientist on the Evaluation Integrity team, you will design and run HITL annotation projects that evaluate the quality and authenticity of agentic user personae, the validity of agent-to-agent conversations, and the reliability of LLM-as-judge and rule-based evaluators against Siri's product specifications. You will own annotation initiatives end-to-end: from rubric design and tooling, through annotator calibration, to data science analysis that turns annotator judgments into actionable signal for modeling, planning, and product teams.
Responsibilities
Design HITL annotation tasks for agentic evaluation. Advise on rubrics and design workflows that ask annotators to assess (a) the quality and authenticity of user agent personae, (b) the validity of agent-to-agent conversations, and (c) whether agentic evaluators' verdicts align with Siri's product specifications and human interface guidelines.
Author, maintain, and iterate on annotation guidelines. Translate evolving Siri capabilities and product specs into clear, defensible rubrics for human grading aligned with agentic evaluators; run calibration sessions; monitor inter-annotator agreement; and refine guidelines based on edge cases surfaced during grading.
Manage multiple annotation programs in parallel. Plan, scope, and manage human evaluation tasks end-to-end - requirements gathering, annotator coordination, vendor management, timeline tracking, and stakeholder delivery.
Design custom annotation tooling in partnership with software engineers. Prototype task UIs, specify tool requirements, and collaborate with tooling engineers on the annotation platforms the Human Evaluation team relies on.
Apply data science rigor to human-labeled data. Use Python to build analysis pipelines that measure evaluator accuracy against the annotator pool, surface discrepancies between LLM-judge and rule-based evaluators, and quantify the reliability of each agentic evaluator as a source of truth.
Turn annotator feedback into evaluator improvements. Close the loop between annotators and the data scientists and software engineers who own user agents and automated evaluators, feeding findings back into prompts, rubrics, and product guidelines.
Contribute to the organization-wide eval health story. Partner with the User Feedback and Eval Science sub-team to ensure human signal is represented in the eval health report delivered to leadership.
Preferred Qualifications
Experience evaluating LLM-powered or agentic systems, including familiarity with LLM-as-judge methodologies, rubric-based grading, or trajectory and tool-call evaluation.
Familiarity with statistical methods that address accuracy and variability in human annotation data, such as inter-annotator agreement, Cohen's or Fleiss' kappa, Krippendorff's alpha, or bootstrapping.
Data-querying experience with SQL, Spark, or similar, and comfort working with large, complex, real-world datasets.
Experience building pre-ship evaluation pipelines for conversational or assistant products.
Experience with prompt engineering, or with designing simulated user personae for agent evaluation.
Experience running annotation programs across multiple locales or at large scale.
Excellent written and verbal communication skills, with the ability to explain technical topics clearly to data scientists, engineers, annotators, and cross-functional partners.
Proven ability to collaborate effectively across functions and drive projects of varying sizes and scopes - knowing when to dive deep and when to delegate.
Minimum Qualifications
Bachelor's or Master's degree in a quantitative or related field such as Data Science, Computer Science, Linguistics, Statistics, or Cognitive Science, or equivalent job-related experience.
5+ years of hands-on experience working with human-annotated datasets or human-in-the-loop evaluation methodologies for machine learning, natural language processing, or large language model systems.
5+ years of experience using Python for data processing, analysis, and prototyping, including experience with libraries such as pandas, Jupyter, and at least one data visualization library.
Experience designing, implementing, and communicating annotation schemas, rubrics, or ontologies for machine learning training or evaluation data.
Experience managing multiple concurrent dataset curation efforts, including scoping work, iterating on guidelines, coordinating with in-house or vendor annotators, and monitoring annotator performance metrics such as accuracy, throughput, and inter-annotator agreement.
Experience specifying or designing custom annotation tooling in collaboration with software engineers.
Pay & Benefits
At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $157,700 and $280,600, and your base pay will depend on your skills, qualifications, experience, and location.
Apple employees also have the opportunity to become an Apple shareholder through participation in Apple's discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple's Employee Stock Purchase Plan. You'll also receive benefits including: comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and for formal education related to advancing your career at Apple, reimbursement for certain educational expenses - including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation.
Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.

About Apple

Sourced by ZipRecruiter

Imagine what you could do here! At Apple, new ideas have a way of becoming extraordinary products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no telling what you could accomplish. Dynamic, intelligent people and inspiring, innovative technologies are the norm here. The people who work here have reinvented entire industries with all Apple Hardware products. The same real passion for innovation that goes into our products also applies to our practices strengthening our dedication to leave the world better than we found it.

Industry

Computer and electronic product manufacturing

Company size

10,000+ Employees

Headquarters location

Cupertino, CA, US

Year founded

1976

Website

apple.com

What Apple employees say

Pay

Most people get paid breaks

Most people get paid when they’re sick

The job rarely spills into unpaid time

Benefits

Sick days don’t use up paid time off

Most part-timers can get health insurance

Most part-timers get paid time off

Hours and flexibility

Less than 4 weeks notice of work schedule

Most people don’t worry about their hours

Only some people can choose their shifts

Workplace

Most people feel treated with respect

Most people get breaks without interruption

Most people are stressed out
