Annotation Judge information
What is the difference between an Annotation Judge and a Data Annotator?
| Aspect | Annotation Judge | Data Annotator |
|---|---|---|
| Credentials | Typically requires basic education, sometimes certification in data labeling | Usually requires similar or less formal education, often on-the-job training |
| Work Environment | Office or remote, working with data labeling platforms | Office or remote, performing data labeling tasks |
| Industry Usage | Used across AI, machine learning, and data science projects | Common in AI, machine learning, and data preparation workflows |
| Search & Comparison Intent | Often compared for roles involving data review and quality control | Compared for entry-level data labeling roles |
The main difference between an Annotation Judge and a Data Annotator lies in their roles. Annotation Judges typically review and validate annotations made by Data Annotators, ensuring quality and accuracy. Data Annotators perform the initial labeling of data. Both roles are essential in AI data pipelines, with Annotation Judges focusing on quality control and Data Annotators on data preparation.

Full-time
Posted 21 days ago
Johns Hopkins Medicine rating
7.5
Based on 199 frontline employees who took The Breakroom Quiz
216th of 864 rated healthcare providers
Job description
PREP Research Associate
This position is part of the National Institute of Standards and Technology (NIST) Professional Research Experience Program (PREP). NIST recognizes that its research staff may want to collaborate with researchers at academic institutions on specific projects of mutual interest and, therefore, requires those institutions to be recipients of a PREP award. The PREP program involves staff from a wide range of backgrounds conducting scientific research across various fields. Individuals in this position will perform technical work supporting the collaboration's scientific research.
Research Title:
Reliability of Human and LLM Annotations for AI Risk Assessment
The work will entail:
This project focuses on using Large Language Models (LLMs) to annotate evaluation data (an approach known as LLM as judge) and on designing an Inter-Annotator Agreement study to assess the reliability of both human and LLM annotations. The candidate will explore how to assess the indicators of a given AI-related risk, how to identify them, and how to provide annotators with examples for annotating the presence of various risks. The project aims to develop an annotation framework for AI risk assessment and establish metrics for data quality in AI risk research, supporting broader work at NIST in assessing and measuring the validity and reliability of AI-related risks in data annotation.
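As an illustration of what an Inter-Annotator Agreement measurement between a human and an LLM annotator might look like, the sketch below computes Cohen's kappa for two label sequences. The posting does not name a specific agreement metric; Cohen's kappa is assumed here as one common choice, and the function name, the label values (`risk`, `no_risk`), and the sample data are all hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each annotator's
    label frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty item set")

    n = len(labels_a)
    # Fraction of items on which the two annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of the two annotators' marginal frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical annotations of eight items (illustrative data only).
human = ["risk", "risk", "no_risk", "no_risk", "risk", "no_risk", "no_risk", "risk"]
llm = ["risk", "no_risk", "no_risk", "no_risk", "risk", "no_risk", "risk", "risk"]

kappa = cohen_kappa(human, llm)
print(f"Cohen's kappa (human vs. LLM): {kappa:.2f}")
```

In this made-up sample the annotators agree on 6 of 8 items (0.75 observed) against 0.50 expected by chance, which gives a kappa of 0.50. Raw percent agreement alone would overstate reliability, which is why a chance-corrected statistic is typically preferred in such studies.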
U.S. Citizen Preferred
Key responsibilities will include but are not limited to:
- Gain familiarity with existing literature on data annotation and LLM as judge
- Understand NIST's role and ongoing efforts in assessing and measuring the validity and reliability of AI-related risks in data annotation
- Contribute to developing an annotation framework for AI risk assessment
- Collaborate effectively with cross-functional and interdisciplinary stakeholders to ensure successful project outcomes
Deliverables
- Contributions to a NIST report that supports ongoing NIST AI evaluation efforts focused on the design of an Inter-Annotator Agreement study to assess the reliability of both human and LLM annotations.
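When an agreement study pools more than two annotators (for example, several human annotators plus one LLM judge), a multi-rater statistic is needed. The sketch below uses Fleiss' kappa as one possible option; this is an assumption for illustration, not a method stated in the posting, and the function name, labels, and data are hypothetical.

```python
def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for items that were each labeled by the same number of raters.

    ratings: list of items, where each item is a list of labels (one per rater).
    categories: list of all possible labels.
    """
    n_items = len(ratings)
    if n_items == 0:
        raise ValueError("At least one item is required")
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("Every item needs the same number (>= 2) of ratings")

    # counts[i][j] = number of raters who assigned category j to item i.
    counts = [[item.count(c) for c in categories] for item in ratings]

    # Overall proportion of assignments that went to each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(len(categories))
    ]

    # Per-item agreement: share of rater pairs that agree on the item.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_bar = sum(p_item) / n_items          # mean observed agreement
    p_exp = sum(p * p for p in p_cat)      # agreement expected by chance

    if p_exp == 1.0:
        # Every rating fell into a single category.
        return 1.0
    return (p_bar - p_exp) / (1 - p_exp)


# Hypothetical data: four items, each labeled by two humans and one LLM.
ratings = [
    ["risk", "risk", "risk"],
    ["no_risk", "no_risk", "no_risk"],
    ["risk", "risk", "no_risk"],
    ["no_risk", "no_risk", "risk"],
]

fk = fleiss_kappa(ratings, ["risk", "no_risk"])
print(f"Fleiss' kappa: {fk:.3f}")
```

Fleiss' kappa treats raters as interchangeable, so it does not show whether the LLM disagrees with humans more than humans disagree with each other. A study interested in that question could also compare pairwise scores.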
Qualifications
- Background in Computer Science, Data Science, or related field.
- Education level: Bachelor's or Graduate Degree
- Strong interest in data annotation and AI risks
- Familiarity with scientific reading and technical writing
Application Instructions
Please upload the following with your application:
• CV/Resume
*Please limit your CV to 3 pages and include ONLY a valid email address as your contact information. Your resume will not be considered if any of the following information is included on your CV/resume:
• Self portraits
• Phone number
• Home address/Country
• Citizenship status
• Languages spoken
• Sex/Gender
Privacy Act Statement
Authority: 15 U.S.C. § 278g-1(e)(1) and (e)(3) and 15 U.S.C. § 272(b) and (c)
Purpose: The National Institute of Standards and Technology (NIST) hosts the Professional Research Experience Program (PREP), which is designed to provide valuable laboratory experience and financial assistance to undergraduates, post-bachelor's degree holders, graduate students, master's degree holders, postdocs, and faculty.
PREP is a 5-year cooperative agreement between NIST laboratories and participating PREP Universities to establish a collaborative research relationship between NIST and U.S. institutions of higher education in disciplines including (but not limited to) biochemistry, biological sciences, chemistry, computer science, engineering, electronics, materials science, mathematics, nanoscale science, neutron science, physical science, physics, and statistics. This collection of information is needed to facilitate administrative functions of the PREP Program.
Routine Uses: NIST will use the information collected to perform the requisite reviews of the applications to determine eligibility, and to meet programmatic requirements. Disclosure of this information is also subject to all the published routine uses as identified in the Privacy Act System of Records Notices: NIST-1: NIST Associates.
Disclosure: Furnishing this information is voluntary. When you submit the form, you are indicating your voluntary consent for NIST to use the information you submit for the purpose stated.
About Johns Hopkins Hospital
Sourced by ZipRecruiter
Industry
Health care and social assistance
Company size
10,000+ Employees
Headquarters location
Baltimore, MD, US
Year founded
1889