MTurk information
What are the key skills and qualifications needed to thrive as an MTurk worker, and why are they important?
To thrive as an Amazon Mechanical Turk (MTurk) worker, you need strong attention to detail, good time management, and basic computer literacy; no formal degree is required. Familiarity with the MTurk platform and proficiency with internet browsers and spreadsheet software are typically beneficial. Self-motivation, reliability, and the ability to follow instructions closely distinguish top performers. These skills help you complete tasks to a high standard, meet requester specifications, and maximize your earnings.
What are some common challenges faced by MTurk workers, and how can they be overcome?
One of the most common challenges MTurk workers encounter is variability in task availability and payment rates, which makes earnings unpredictable. Tasks can also be repetitive, demand meticulous attention to detail, or come with strict completion guidelines. Successful workers overcome these challenges by regularly monitoring new tasks, carefully reviewing instructions, and building a reputation for high-quality submissions, which can lead to access to better-paying HITs. Networking with other workers through online communities can also provide valuable tips and support.
What is an MTurk job?
An MTurk job, also known as a Human Intelligence Task (HIT), is a small task that workers complete on the Amazon Mechanical Turk platform. These tasks often involve data validation, surveys, content moderation, or other simple work that requires human intelligence. Workers complete each HIT for a payment set by the requester. MTurk is a flexible, gig-based platform, and tasks can be completed remotely.

$200K - $260K/yr
Full-time
Medical, Dental, Vision, Retirement, PTO
Posted 16 days ago
Job description
Cantina Labs is a social AI company, developing a suite of advanced real-time models that push the boundaries of expression, personality, and realism. We bring characters to life, transforming how people tell stories, connect, and create. We build and power ecosystems. Cantina, our flagship social AI platform, is just the beginning.
If you're excited about the potential AI has to shape human creativity and social interactions, join us in building the future!
About the Role:
We are looking for a new Member of Technical Staff to build and scale the data pipelines behind our large video generation models. This role is focused on collecting large amounts of relevant video data, preparing high-quality training samples, and developing robust preprocessing, filtering, and parsing workflows. You'll orchestrate annotation pipelines across platforms such as MTurk and own the full lifecycle of training data, from raw ingestion to clean, model-ready samples that directly drive quality improvements. This role sits at the intersection of data engineering and ML research, making it central to how we turn messy real-world data into the fuel that moves our models forward.
What You'll Do:
- Build and maintain data pipelines for large video generation models, including data ingestion, parsing, filtering, preprocessing, and dataset curation at scale, using tools such as AWS S3 and DynamoDB.
- Design and run annotation workflows across platforms such as MTurk (Amazon Mechanical Turk) and Prolific, including task design, quality control, and label validation.
- Train, evaluate, and improve smaller supporting models used for data filtering, quality assessment, preprocessing, or other parts of the ML pipeline.
- Partner closely with research and engineering teams to turn experimental workflows into scalable, repeatable systems that support model training and evaluation.
- Own data quality across the pipeline by identifying bottlenecks, failure modes, and low-quality sources, and continuously improving tooling and processes.
- Build internal tools and automation that make it easier to prepare datasets, launch annotation jobs, monitor outputs, and support model development end to end.
- Drive larger pipeline projects from start to finish, such as new dataset creation efforts or upgrades to labeling and preprocessing infrastructure.
- Work within a Kubernetes-based training infrastructure, ensuring datasets are properly prepared, formatted, and delivered to training clusters.
- Profile and optimize research model inference scripts used in preprocessing steps, ensuring that model-driven filtering and transformation stages run within practical time and cost constraints when applied to large-scale raw data.
What You'll Bring:
- 3+ years of experience in machine learning, applied ML, data pipelines, or related engineering roles, ideally working on large-scale multimodal, video, or vision-based systems.
- Strong programming skills in Python and solid experience building reliable data processing and preprocessing pipelines for ML workflows.
- Hands-on experience preparing training data for ML models, including parsing, filtering, dataset curation, quality control, and large-scale data handling using tools such as AWS S3 and DynamoDB.
- Familiarity with annotation and labeling workflows, including task design, vendor or crowd-platform orchestration such as MTurk or Prolific, and methods for ensuring label quality.
- Experience working with Kubernetes for orchestrating distributed workloads, including data preprocessing, pipeline execution, and dataset delivery to training clusters.
- Comfort working across cloud and on-demand compute environments such as AWS and RunPod, with the ability to port and optimize pipelines across infrastructure.
- Familiarity with distributed data processing frameworks and experience designing systems that operate reliably at scale across many nodes or workers.
- Working knowledge of PyTorch and the broader deep learning stack, with the ability to read, debug, and optimize research model inference code for use in production preprocessing pipelines.
- Ability to work cross-functionally with research and engineering teams and translate experimental ideas into robust, scalable systems.
- Bachelor's, Master's, or PhD in Computer Science, Machine Learning, Engineering, Mathematics, or a related technical field; experience in generative video, computer vision, or multimodal ML is strongly preferred.
- Bonus: Experience training, evaluating, or fine-tuning smaller ML models used for classification, filtering, ranking, quality assessment, or other supporting tasks in an ML pipeline.
Compensation:
The anticipated annual base salary range for this role is $200,000 to $260,000 (€170,000 to €225,000). Compensation is determined by a number of factors, including skills, experience, job scope, location, and competitive compensation market data.
Benefits for U.S.-based roles:
- Competitive salary and generous company equity
- Medical, dental, and vision insurance - 99.99% of premiums covered by Cantina
- 42 days of paid time off, including:
  - 15 PTO days
  - 10 sick days
  - 15 company holidays
  - 2 floating holidays
- Generous parental leave & fertility support
- 401(k) retirement savings plan
- Lifestyle spending account - $500/month to use however you'd like
- Complimentary lunch and snacks for in-office employees
- One Medical membership, and more!
About Cantina
Sourced by ZipRecruiter