Synthetic Data Jobs (NOW HIRING)

Platform Engineer, Data

Austin, TX · On-site

$113K - $136K/yr

You'll partner closely with our ML engineers to orchestrate ingestion, synthetic data generation, and versioned releases, ensuring that every dataset is not only high-integrity and available but ...

Senior Robotics Data Engineer - Only W2

Warren, MI · On-site

$99K - $135K/yr

... and synthetic data generation. · Manage data versioning, metadata, and dataset governance to support model training, evaluation, and regression testing. · Collaborate with Robotics Perception ...

Generate representative synthetic data sets for systems development and data science initiatives * Build and secure data pipelines to enable teammates, engineers, and client stakeholders

Help innovate and explore in one of our areas of expertise, like structured and unstructured synthetic data, generative and multimodal language models, etc. * Help monitor new advances in AI and ...

Data Collection

San Jose, CA · On-site

$150K - $250K/yr

That means running end-to-end campaigns across human feedback, synthetic data, and product-embedded signals. The quality of what we collect shapes the quality of what we ship, and this role owns that ...

OR · On-site

$63 - $83/hr

You will help partners use Omniverse, Cosmos, synthetic data, and coding-agent-assisted digital twins workflows to define architectures, compute footprints, test plans, and rollout strategies. What ...

AI Engineer

Leawood, KS · On-site

$111K - $133K/yr

Support post-training data workflows such as SFT, instruction tuning, preference data, RLHF/DPO-style data, reward model data, and synthetic data generation. * Use modern annotation tools and AWS ...

TDM Architect

Raleigh, NC · On-site

$61.25 - $80.75/hr

TDM Senior Tech Architect(CA TDM, Delphix, Informatica TDM, etc.) Test Data provisioning, Data Profiling & masking (subsetting, synthetic data, PII protection) Deep, hands on TDM experience (strategy ...

Synthetic Data information

What are the key skills and qualifications needed to thrive as a Synthetic Data Engineer, and why are they important?

To thrive as a Synthetic Data Engineer, you need a strong background in computer science, statistics, and data modeling, usually with a degree in a related field. Experience with programming languages like Python or R, familiarity with machine learning frameworks, and knowledge of data privacy tools are essential. Strong analytical thinking, attention to detail, and effective communication help in designing robust data solutions and collaborating with stakeholders. These skills ensure the creation of high-quality synthetic datasets that support research, model training, and compliance with data privacy regulations.

What is the difference between a Synthetic Data role and a Data Analyst role?

Aspect | Synthetic Data | Data Analyst
Credentials | None required, but knowledge of data generation tools is helpful | Bachelor's degree in data science, statistics, or a related field
Work Environment | Data labs, software development teams, AI/ML projects | Business environments, analytics teams, reporting platforms
Industry Usage | AI training, testing, privacy compliance | Data interpretation, reporting, decision support

While Synthetic Data involves creating artificial datasets for testing and training AI models, Data Analysts focus on interpreting real-world data to generate insights. Both roles require data literacy, but Synthetic Data specialists focus on data generation techniques, whereas Data Analysts analyze existing data to inform business decisions.

What are the main challenges faced by professionals working with synthetic data in a production environment?

One of the primary challenges in a synthetic data role is ensuring that the generated datasets accurately reflect real-world scenarios while maintaining privacy and compliance standards. Professionals often need to balance data utility with the risk of introducing bias or unrealistic patterns. Collaboration with data scientists, engineers, and domain experts is essential to validate results and integrate synthetic data into machine learning pipelines. Additionally, staying updated on evolving tools and best practices is crucial for maintaining data quality and relevance.

What is synthetic data and how is it used?

Synthetic data refers to artificially generated information that mimics real-world data but does not contain any actual personal or sensitive details. It is commonly used to train machine learning models, test software, and protect privacy when sharing datasets. By using synthetic data, organizations can avoid data privacy concerns and still gain valuable insights or test algorithms effectively. This approach is especially valuable in industries like healthcare and finance where real data may be restricted. Synthetic data can be generated using various statistical techniques, simulations, or machine learning models.
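
As a rough illustration of the "statistical techniques" route, the sketch below fits a Gaussian to a small numeric column and samples new values from it. The sample values and the function names (`fit_gaussian`, `sample_synthetic`) are invented for this example. A single fitted distribution ignores correlations between columns and does not by itself guarantee privacy, so treat it as a toy rather than a production method.

```python
import random
import statistics


def fit_gaussian(values):
    """Estimate the mean and sample standard deviation of real values."""
    return statistics.mean(values), statistics.stdev(values)


def sample_synthetic(values, n, seed=0):
    """Draw n synthetic values from a Gaussian fitted to the real values."""
    mu, sigma = fit_gaussian(values)
    rng = random.Random(seed)  # seeded so the example is reproducible
    return [rng.gauss(mu, sigma) for _ in range(n)]


# Hypothetical "real" measurements, made up for illustration only.
real_ages = [34, 45, 29, 52, 41, 38, 47, 33]
synthetic_ages = sample_synthetic(real_ages, n=1000)

print(f"real mean:      {statistics.mean(real_ages):.2f}")
print(f"synthetic mean: {statistics.mean(synthetic_ages):.2f}")
```

The synthetic column should track the real column's mean and spread without copying any individual record.
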
More about Synthetic Data jobs
[Infographic: Synthetic Data job openings in the United States as of June 2026. Employment type: 100% As Needed. Work arrangement: 89% Physical, 3% Hybrid, 8% Remote.]

$113K - $136K/yr

Other

Medical, Dental, Vision, PTO

Posted 25 days ago


Job description

Data Platform Engineer

Austin, TX

Allen Control Systems (ACS) is a cutting-edge defense startup founded by two ex-Navy electrical engineers with a proven track record in robotics and software. We are developing a small, autonomous gun turret that employs advanced computer vision and control systems to precisely target and neutralize small drones and loitering munitions. Our innovative approach requires overcoming significant technical challenges, making this an exciting and dynamic environment for experienced engineers.

With an engineering-first culture, ACS values technical excellence and continuous learning. Backed by our founders' successful exits from two previous ventures acquired for a combined $180M in 2022, we are committed to ensuring that the groundbreaking technologies we develop have a real-world impact.

Position Overview:

We are seeking a Data Platform Engineer who combines expert-level data infrastructure skills with a strong knowledge of AI & Machine Learning principles. In this role, you will go beyond simple data validation scripts; you will apply your understanding of model training dynamics to design and implement existing and novel approaches to optimize our datasets.

You will build and maintain large-scale image and video pipelines, but with a focus on data curation strategies such as coreset selection, embedding-based filtering, and automated complexity scoring. You'll partner closely with our ML engineers to orchestrate ingestion, synthetic data generation, and versioned releases, ensuring that every dataset is not only high-integrity and available but strictly optimized to maximize model performance.
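
For readers unfamiliar with coreset selection, one common variant is k-center greedy: repeatedly pick the sample farthest from everything already chosen, so a small subset still covers the embedding space. The sketch below is a minimal illustration under that assumption, not a description of ACS's actual pipeline. The function name `k_center_greedy` and the 2-D points are invented, and real image embeddings would have many more dimensions.

```python
import math


def k_center_greedy(points, k, start=0):
    """Return indices of up to k points (k >= 1) chosen by k-center greedy."""
    selected = [start]
    # Distance from each point to its nearest already-selected point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < min(k, len(points)):
        # Pick the point that is currently worst covered.
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(nxt)
        min_dist = [
            min(d, math.dist(p, points[nxt]))
            for d, p in zip(min_dist, points)
        ]
    return selected


# Hypothetical 2-D embeddings forming three loose clusters.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 5.0)]
coreset = k_center_greedy(points, k=3)
print(coreset)  # one index from each cluster
```
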

What You'll Do:
  • Design and develop a scalable data infrastructure, focusing on organization and curation to support continuing increases in data volume and complexity.
  • Design and implement existing and novel approaches to optimize datasets for model training (e.g., hard example mining, class balancing, de-duplication, embedding-based filtering).
  • Support the data infrastructure required for optimal ingestion, transformation, and storage of datasets.
  • Develop and use synthetic data generation workflows to create realistic synthetic training data for computer vision models.
  • Design and own end-to-end image and video pipelines for computer vision model training: multi-source ingestion, QA and visualization, standardization, and organization.
  • Coordinate collection of real-world data; coordinate label creation and QA with labelers.
  • Develop and use data quality tooling: metrics for balance, drift, and annotation error; active-learning sampling to target gaps; feedback loops from production back to curation.
  • Implement and own dataset versioning, release management, and lineage and metadata cataloging.
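
As one concrete example of the curation work listed above, embedding-based de-duplication can be sketched as a greedy pass that drops any sample whose embedding is too similar to one already kept. This is a minimal sketch under assumed inputs: the function names, the 0.95 cosine threshold, and the toy 3-D embeddings are hypothetical, and a pipeline at real scale would likely use an approximate nearest-neighbor index instead of this quadratic loop.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def deduplicate(embeddings, threshold=0.95):
    """Return indices of samples kept after greedy near-duplicate removal."""
    kept = []
    for i, emb in enumerate(embeddings):
        # Keep a sample only if it is not too close to any kept sample.
        if all(cosine_similarity(emb, embeddings[j]) < threshold for j in kept):
            kept.append(i)
    return kept


# Hypothetical embeddings; index 1 is a near-duplicate of index 0.
embeddings = [
    [1.0, 0.0, 0.0],
    [0.99, 0.05, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
kept = deduplicate(embeddings)
print(kept)
```
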
What You'll Need:
  • 3+ years of experience in data engineering or equivalent fields.
  • Solid understanding of data structures and systems design for orchestrating data-related workflows in a rapidly growing environment.
  • Proficient in using AWS for data management and processing.
  • Proficient in Python for scripting and data processing; proficient with SQL and Linux.
  • Educational Background: Bachelor's or Master's degree in Computer Science or a related field.
  • Proven ability to communicate well across engineering teams, and write and maintain effective documentation.
You'll Stand Out:
  • 5+ years of industry experience.
  • Experience in image/video data engineering for computer vision projects.
  • Experience with PyTorch DeepCore.
  • Experience with Unreal Engine.
What We Offer:
  • Competitive salary
  • ACS Equity Package
  • Health, Dental, Vision Insurance
  • Paid Time Off

Allen Control Systems is an Equal Opportunity Employer, providing equal employment opportunities to all employees and applicants for employment. Allen Control Systems prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.