Text Annotation information
What are typical day-to-day responsibilities for someone working in text annotation?
Text Annotation professionals spend much of their day reading and labeling text data according to specific guidelines, ensuring that information is correctly categorized and flagged. This can involve highlighting entities, identifying sentiments, tagging parts of speech, or annotating complex relationships within text documents. They frequently collaborate with project managers, data scientists, and quality assurance teams to clarify instructions and maintain data consistency. The role often involves independent work, but regular check-ins and feedback sessions help maintain accuracy and enhance understanding of evolving annotation requirements. This combination of independent and collaborative tasks makes the position dynamic and integral to successful AI or NLP project outcomes.
What are the key skills and qualifications needed to thrive in the Text Annotation position, and why are they important?
Strong language proficiency, attention to detail, and critical thinking are essential skills for succeeding as a Text Annotation specialist, often supported by a bachelor's degree in linguistics, computer science, or a related field. Familiarity with annotation tools like Labelbox, Prodigy, or the Amazon Mechanical Turk platform, as well as knowledge of data privacy and handling protocols, is typically required. Excellent communication, self-motivation, and the ability to focus on repetitive tasks help individuals excel in this position. These capabilities ensure high-quality, consistent data labeling for machine learning models, supporting the development of cutting-edge AI solutions.
What is a Text Annotation job?
A Text Annotation job involves labeling and categorizing text data to help train machine learning models. Annotators add tags, metadata, or classifications to text, enabling AI systems to understand language patterns. This work is essential for applications like chatbots, search engines, and sentiment analysis. Strong attention to detail and language proficiency are key skills for this role.

Full-time
Medical, Dental, Vision, Life, Retirement, PTO
Posted 22 days ago
Thomson Reuters rating
8.9
Based on 19 frontline employees who took The Breakroom Quiz
18th of 428 rated business services
Job description
Senior Applied Scientist, Document Understanding
About the Role
This is an applied science position focused on designing, building, and deploying production-grade document understanding systems that power Westlaw, PracticalLaw, and CoCounsel.
You will work across semantic chunking, document enrichment, and knowledge graph construction for complex legal, tax, and accounting content - delivering foundational intelligence that multiple product teams depend on at scale.
About You
You hold a PhD or Master's in Computer Science, AI, NLP, or a related field, with 5+ years of post-degree industry experience shipping document understanding, information extraction, or knowledge graph systems into production. You have hands-on depth across model development, distillation, evaluation, and deployment. You work independently, lead through influence in an applied research setting, and measure success by what ships and performs in production.
What You'll Do
Design and deploy semantic chunking models for lengthy, non-uniformly structured legal documents with adjustable granularity across use cases
Build document enrichment systems that classify documents according to legal and customer-defined taxonomies and extract rich metadata
Develop LLM-based knowledge graph construction pipelines that extract and link citations, entities, and legal concepts across diverse legal content
Build scalable synthetic data generation systems for model training, multi-hop query simulation, and hallucination-free answer generation
Apply knowledge distillation techniques to compress large models into latency-constrained, production-ready SLMs
Design evaluation frameworks - component-level and end-to-end - using expert annotation and synthetic data
Drive independent technical decisions on chunking strategy, classification approach, knowledge extraction methods, and multi-document reasoning architecture
Partner with engineering on delivery, reliability, and scale across multiple product lines
Contribute to published research at venues such as ACL, EMNLP, ICLR, NeurIPS, SIGIR, and KDD, and to intellectual property
Required Qualifications
PhD or Master's in Computer Science, AI, NLP, or a related field
5+ years of post-degree industry experience shipping document understanding, information extraction, or knowledge graph systems into production - not research-only experience
Publications at ACL, EMNLP, ICLR, NeurIPS, SIGIR, KDD, or equivalent
Experience leading through influence in an applied research setting
Production Python and experience with PyTorch, Hugging Face Transformers, and DeepSpeed
Hands-on production depth required in:
Document layout analysis and semantic chunking beyond fixed-size or paragraph-based methods
Hierarchical, multi-label document classification with domain-specific and customer-defined schemas
Entity recognition and linking, relation extraction, citation parsing, and knowledge graph construction from unstructured text
LLM-based information extraction, few-shot and multi-task learning, and post-training
Knowledge distillation, model compression, and SLM deployment under latency constraints
Synthetic data generation for NLP: query-answer generation with verification and scalable data augmentation
Annotation workflow design and evaluation framework development for document understanding tasks
Preferred Qualifications
Legal document understanding, legal information extraction, or legal AI applications
Complex document structures common in legal content: nested hierarchies, cross-references, non-uniform formatting, and embedded elements
Retrieval, QA, or analysis systems over large document collections
Knowledge graph frameworks for legal or enterprise applications
RAG and agentic workflows for enterprise knowledge systems
AzureML or AWS SageMaker
New Position: This position is open due to an existing vacancy to support our evolving business needs.
What's in it For You?
Flexibility & Work-Life Balance: Flex My Way is a set of supportive workplace policies designed to help manage personal and professional responsibilities, whether caring for family, giving back to the community, or finding time to refresh and reset. This builds upon our flexible work arrangements, including work from anywhere for up to 8 weeks per year, empowering employees to achieve a better work-life balance.
Career Development and Growth: By fostering a culture of continuous learning and skill development, we prepare our talent to tackle tomorrow's challenges and deliver real-world solutions. Our Grow My Way programming and skills-first approach ensures you have the tools and knowledge to grow, lead, and thrive in an AI-enabled future.
Industry Competitive Benefits: We offer comprehensive benefit plans that include flexible vacation, two company-wide Mental Health Days off, access to the Headspace app, retirement savings, tuition reimbursement, employee incentive programs, and resources for mental, physical, and financial wellbeing.
Culture: Globally recognized, award-winning reputation for inclusion and belonging, flexibility, work-life balance, and more. We live by our values: Obsess over our Customers, Compete to Win, Challenge (Y)our Thinking, Act Fast / Learn Fast, and Stronger Together.
Social Impact: Make an impact in your community with our Social Impact Institute. We offer employees two paid volunteer days off annually and opportunities to get involved with pro-bono consulting projects and Environmental, Social, and Governance (ESG) initiatives.
Making a Real-World Impact: We are one of the few companies globally that helps its customers pursue justice, truth, and transparency. Together, with the professionals and institutions we serve, we help uphold the rule of law, turn the wheels of commerce, catch bad actors, report the facts, and provide trusted, unbiased information to people all over the world.
About Us
Thomson Reuters informs the way forward by bringing together the trusted content and technology that people and organizations need to make the right decisions. We serve professionals across legal, tax, accounting, compliance, government, and media. Our products combine highly specialized software and insights to empower professionals with the data, intelligence, and solutions needed to make informed decisions, and to help institutions in their pursuit of justice, truth, and transparency. Reuters, part of Thomson Reuters, is a world-leading provider of trusted journalism and news.
We are powered by the talents of 26,000 employees across more than 70 countries, where everyone has a chance to contribute and grow professionally in flexible work environments. At a time when objectivity, accuracy, fairness, and transparency are under attack, we consider it our duty to pursue them. Sound exciting? Join us and help shape the industries that move society forward.
As a global business, we rely on the unique backgrounds, perspectives, and experiences of all employees to deliver on our business goals. To ensure we can do that, we seek talented, qualified employees in all our operations around the world regardless of race, color, sex/gender, including pregnancy, gender identity and expression, national origin, religion, sexual orientation, disability, age, marital status, citizen status, veteran status, or any other protected classification under applicable law. Thomson Reuters is proud to be an Equal Employment Opportunity Employer providing a drug-free workplace.
We also make reasonable accommodations for qualified individuals with disabilities and for sincerely held religious beliefs in accordance with applicable law. More information on requesting an accommodation here.
Learn more on how to protect yourself from fraudulent job postings here.
More information about Thomson Reuters can be found on thomsonreuters.com