Evaluation Engineer Jobs (NOW HIRING)

An engineer starting on a new feature should be able to quickly add examples and run an eval. Ensuring evaluations are accurate and reliable * We need to evaluate how well Elicit actually helps with ...

Description: Plan and lead prototype evaluations (serviceability, water leak, and wire harness routing). Conduct drawing reviews during drawing stage activities, proposing structure countermeasures to ...

AMERICAN SYSTEMS is seeking a Test and Evaluation Engineer with demonstrated expertise supporting Department of Defense (DoD) programs. This position will be responsible for providing subject matter ...

Test & Evaluation Engineer

Norfolk, VA · On-site

$100K - $130K/yr

Overview AMERICAN SYSTEMS is seeking a Test and Evaluation Engineer with demonstrated expertise supporting Department of Defense (DoD) programs. This position will be responsible for providing ...

As a Test & Evaluation Engineer, you'll play a pivotal role in the success of our projects. Your responsibilities will include: Collaborating with CWL and MTWS engineers to conduct testing and ...

We are seeking a motivated Test & Evaluation Engineer to join our dynamic team. This role will support developmental test and evaluation (DT&E) activities for Electronic Warfare systems and ...

Senior Design Evaluation Engineer

Durham, NC · On-site

$101K - $138K/yr

Senior Design Evaluation Engineer The Power Control team is seeking a motivated Design Evaluation Engineer to support our Datacenter and Energy Business Unit at ADI's Research Triangle Park location ...

Job Type Full-time Description We are seeking a motivated Test & Evaluation Engineer to join our dynamic team. This role will support developmental test and evaluation (DT&E) activities for ...

Senior Design Evaluation Engineer

Durham, NC · On-site

$100K - $135K/yr

Senior Design Evaluation Engineer The Power Control team is seeking a motivated Design Evaluation Engineer to support our Datacenter and Energy Business Unit at ADI's Research Triangle Park location ...

Description We are seeking a motivated Test & Evaluation Engineer to join our dynamic team. This role will support developmental test and evaluation (DT&E) activities for Electronic Warfare systems ...

Evaluation Engineer information

Salary range: $55.5K · $128.4K (average) · $166K

How much do evaluation engineer jobs pay per year?

As of Jun 23, 2026, the average yearly pay for an evaluation engineer in the United States is $128,450, according to ZipRecruiter salary data. Most workers in this role earn between $118,500 and $150,000 per year, depending on experience, location, and employer.

What does an evaluation engineer do?

An evaluation engineer assesses the performance, reliability, and safety of products, systems, or processes through testing, analysis, and data collection. They often use specialized tools and techniques to ensure standards are met and may prepare reports for stakeholders. This role typically requires strong analytical skills and knowledge of engineering principles.

What engineers make $300,000 a year?

Senior engineers in specialized fields such as petroleum, aerospace, or software engineering can earn $300,000 or more annually, especially with extensive experience, advanced skills, and leadership roles. High compensation often involves working in high-demand industries, holding managerial or executive positions, or possessing rare technical expertise and certifications.

What engineers make $500,000?

Senior engineers in specialized fields such as petroleum, aerospace, or software engineering with extensive experience and advanced skills can earn $500,000 or more annually. High compensation often involves leadership roles, bonuses, stock options, or working in high-demand industries with complex projects.

What is an Evaluation Engineer?

An Evaluation Engineer is a professional who assesses products, systems, or processes to ensure they meet specified standards and performance criteria. They are responsible for designing and conducting tests, analyzing results, and recommending improvements or changes. Evaluation Engineers work in various industries, including manufacturing, electronics, software, and automotive, to support product development and quality assurance. Their work helps companies deliver reliable and effective products to the market.

What are some common challenges faced by Evaluation Engineers when assessing new products or systems?

Evaluation Engineers often encounter challenges such as tight project deadlines, rapidly evolving technology, and the need to balance thorough testing with efficiency. They may also face difficulties in obtaining comprehensive data or replicating real-world scenarios during evaluations. Collaborating closely with cross-functional teams—like design, manufacturing, and quality assurance—is essential to address these challenges and ensure accurate, actionable results.

What is the difference between an Evaluation Engineer and a Test Engineer?

Aspect | Evaluation Engineer | Test Engineer
Required Credentials | Bachelor's in Engineering, certifications in testing or evaluation methods | Bachelor's in Engineering, certifications in testing or quality assurance
Work Environment | Research labs, product development, quality assessment | Manufacturing plants, testing labs, product validation
Industry Usage | Used in electronics, aerospace, automotive for evaluating performance | Used across industries for testing products and systems

Evaluation Engineers focus on assessing product performance, reliability, and compliance through detailed analysis, often in research or development settings. Test Engineers primarily execute testing procedures to identify defects and ensure quality during manufacturing or pre-release stages. While both roles require technical skills and certifications, Evaluation Engineers emphasize evaluation and analysis, whereas Test Engineers concentrate on testing execution and defect detection.

What engineer is in highest demand?

Evaluation engineers are in high demand in industries such as manufacturing, aerospace, and electronics, especially those with skills in testing, data analysis, and quality assurance. Their expertise in assessing product performance and compliance makes them valuable as companies prioritize reliability and safety, often requiring certifications and proficiency with testing tools. The demand for evaluation engineers continues to grow with advancements in technology and quality standards.

What are the key skills and qualifications needed to thrive as an Evaluation Engineer, and why are they important?

To thrive as an Evaluation Engineer, you need a solid background in engineering principles, analytical problem-solving, and experience with product testing, often supported by a degree in engineering or a related field. Familiarity with testing equipment, data analysis tools (such as MATLAB or LabVIEW), and industry-specific standards or certifications is typically required. Strong attention to detail, effective communication, and collaboration skills help Evaluation Engineers accurately assess products and share findings with cross-functional teams. These skills are crucial for ensuring product quality, safety, and compliance with regulatory and customer requirements.
Evaluation Engineer

Elicit

Oakland, CA • On-site, Remote

Full-time

Medical, Dental, Vision, Life, Retirement, PTO

Posted 2 days ago


Job description

About Elicit
Elicit is an AI research platform that uses language models to help researchers figure out what's true and make better decisions, starting with common research tasks like literature review.
What we're aiming for:
  1. Elicit radically increases the amount of good reasoning in the world.
    • For experts, Elicit pushes the frontier forward.
    • For non-experts, Elicit makes good reasoning more affordable. People who don't have the tools, expertise, time, or mental energy to make well-reasoned decisions on their own can do so with Elicit.
  2. Elicit is a scalable ML system based on human-understandable task decompositions, with supervision of process, not outcomes. This expands our collective understanding of safe AGI architectures.

Visit our Twitter to learn more about how Elicit is helping researchers and making progress on our mission.
The mission of Elicit evals
Some orgs build evals to warn us about dangerous capabilities. Some build evals to understand trends and predict where models are heading. Some build evals to hill-climb toward models that users will like more.
At Elicit, we're after something different. We want to understand, and hill-climb toward, models that help us make better decisions.
This is harder than "what will users like better." Decision support is difficult to evaluate, and users' knee-jerk reactions don't always track with what actually helps them decide. Because it's hard, and because the sales pitch is more complicated, few are doing it well. If we get this right, we have a real shot at pushing AI toward better decision-making, both inside Elicit and beyond.
Why we're hiring for this role
We need someone to own the technical foundation of our auto-evaluation systems. Our evals are much slower than they need to be, and our interfaces aren't built for the range of people who rely on them: ML engineers iterating on models, product managers monitoring quality, and customers assessing how much to trust a result.
This role goes beyond building infrastructure. You'll work out what it actually means for Elicit to support decision-making in pharma, and encode that understanding into our evaluation systems.
What you'll own
The core auto-eval platform
You'll build a comprehensive system that runs fast, is easy to use, and supports quickly building new evals:
  • Speed: You'll build a lightning-fast basic evals infrastructure that schedules tasks with practically no added latency. Then you'll figure out clever ways to reduce the fundamental sources of latency (building a version of Elicit, running it on a query, and evaluating it using LMs).
  • Interfaces: ML engineers need evals to kick off automatically on relevant commits, with results they can see at a glance and drill into. Product managers need dashboards showing performance over time and what's going wrong in production.
  • Architecture: Your code must be well-architected so other team members and ML engineers can understand and build on it. An engineer starting on a new feature should be able to quickly add examples and run an eval.
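To make the speed and architecture goals above concrete, here is a minimal sketch of a concurrent eval runner in Python. It is an illustration only, not Elicit's code: `Example`, `score_example`, and `run_eval` are hypothetical names, and the "system under test" is a toy stand-in that simulates model latency with a short sleep.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Example:
    """One eval case: a query and the expected answer (hypothetical schema)."""
    query: str
    expected: str


async def score_example(example: Example) -> float:
    """Stand-in for running the system on a query and grading the output.

    A real grader would call a model; here we simulate I/O latency and
    treat uppercasing the query as the toy 'system' being evaluated.
    """
    await asyncio.sleep(0.01)  # simulated network/model latency
    return 1.0 if example.query.upper() == example.expected else 0.0


async def run_eval(examples: list[Example], max_concurrency: int = 8) -> float:
    """Score all examples concurrently, capped by a semaphore; return the mean."""
    if not examples:
        raise ValueError("run_eval needs at least one example")
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(example: Example) -> float:
        # The semaphore limits how many examples are in flight at once.
        async with semaphore:
            return await score_example(example)

    scores = await asyncio.gather(*(bounded(e) for e in examples))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    cases = [Example("abc", "ABC"), Example("def", "DEF"), Example("ghi", "xyz")]
    print(asyncio.run(run_eval(cases)))
```

In this sketch, adding a new example is a one-line change to the list, and the concurrency cap keeps many slow grading calls overlapping rather than running one after another.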

Ensuring evaluations are accurate and reliable
  • We need to evaluate how well Elicit actually helps with decision-making in pharma, not just measure what's easy to measure. This requires encoding real knowledge about how pharma customers make decisions (for example, choosing appropriate gold standards).
  • You'll provide appropriate statistical tests and confidence intervals so we can trust our results.
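As one illustration of what a confidence interval on an eval result can look like, the sketch below computes a Wilson score interval for a pass rate. The choice of the Wilson interval is an assumption made for this example; the posting does not say which statistical methods Elicit uses, and `wilson_interval` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def wilson_interval(passes: int, total: int, confidence: float = 0.95) -> tuple[float, float]:
    """Return the Wilson score interval for a binomial pass rate."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= passes <= total:
        raise ValueError("passes must be between 0 and total")

    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = passes / total

    denom = 1 + z**2 / total
    center = (p_hat + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    low, high = wilson_interval(passes=80, total=100)
    print(f"pass rate 0.80, 95% interval: [{low:.3f}, {high:.3f}]")
```

For 80 passes out of 100 examples, the interval runs from roughly 0.71 to 0.87, which shows how much uncertainty remains in a headline score from a small eval set.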
A month in the role
In a typical month, expect to spend:
  • 60% working on the core eval platform
  • 15% working closely with the evals team to build and improve specific evals (e.g., an eval of our paper search within our systematic review flow)
  • 10% mentoring our evals engineering intern
  • The rest on learning how people interact with the eval system so you can make it work better for them, and understanding what our users want from Elicit so evals measure what matters
What you bring to the role
Requirements
  • At least 3 years of experience as a professional software engineer, with demonstrated experience building complex backend systems (e.g., backend for a complex website, data pipelines, etc.)
  • Aptitude and interest in evaluating how Elicit helps with pharma decision-making. There's no particular experience you must have, but we'll evaluate your aptitude.

Will make you more competitive for the role
  • Knowledge of statistics (e.g., for calculating power and credence intervals for evals)
  • Experience with advanced Python (asyncio/trio and parallel processing strategies)
  • Front-end experience and strong UX sensibility (you'll be building dashboards). TypeScript experience is a plus.
  • Experience building developer tools (ML engineers are one of your most important clients)
  • Previous experience as a data engineer or working on AI infrastructure
  • Knowledge of pharma/biomed
  • Experience evaluating ML systems
  • Experience building language-model-based systems (helps with understanding Elicit and how to evaluate it)

This is a diverse list of nice-to-haves. We expect the candidate we select to have some, but not all, of these. Other team members can fill in for skills you lack.
Location and travel
We have a great office in Oakland, CA, and we'd love to see you there if you're local. That said, we're just as happy for you to work remotely. We do get the whole team together for a quarterly retreat somewhere fun, because in-person time matters to us.
Benefits and perks
In addition to working on important problems as part of a productive and positive team, we also offer great benefits (with some variation based on location):
  • Flexible work environment: work from our office in Oakland or remotely with time zone overlap (between GMT and GMT-8), as long as you can travel for in-person retreats and coworking events
  • Fully covered health, dental, vision, and life insurance for you, and generous coverage for the rest of your family
  • Flexible vacation policy, with a minimum recommendation of 20 days/year + company holidays
  • Every Elician receives a $200 monthly wellbeing stipend to spend on whatever supports your health and wellbeing
  • 401(k) with a 6% employer match
  • A new Mac + $1,000 budget to set up your workstation or home office in your first year, then $500 every year thereafter
  • $1,000 quarterly AI Experimentation & Learning budget, so you can freely experiment with new AI tools, take courses, purchase educational resources, or attend AI-focused conferences and events
  • A team administrative assistant who can help you with personal and work tasks

Compensation
For all roles at Elicit, we use a data-backed compensation framework to keep salaries market-competitive, equitable, and simple to understand. For this role, we target starting ranges of:
  • Career (L3): $140-185k + equity
  • Senior (L4): $175-230k + equity