Evaluations Engineer Jobs (NOW HIRING)

Senior AI Agent & Evaluations Engineer

Portland, OR · On-site

$111K - $152K/yr

We're looking for a hands-on Senior AI Agent & Evals Engineer to own the intelligence layer behind these systems. You'll be responsible for designing agent behavior, building evaluation frameworks ...

Senior AI Agent & Evaluations Engineer

Portland, OR · On-site +1

$110K - $152K/yr

We're looking for a hands-on Senior AI Agent & Evals Engineer to own the intelligence layer behind these systems. You'll be responsible for designing agent behavior, building evaluation frameworks ...

Senior AI Agent & Evaluations Engineer

Portland, OR · On-site +1

$111K - $152K/yr

We're looking for a hands-on Senior AI Agent & Evals Engineer to own the intelligence layer behind these systems. You'll be responsible for designing agent behavior, building evaluation frameworks ...

ML engineers iterating on models, product managers monitoring quality, and customers assessing how ... Ensuring evaluations are accurate and reliable * We need to evaluate how well Elicit actually helps ...

AMERICAN SYSTEMS is seeking a Test and Evaluation Engineer with demonstrated expertise supporting Department of Defense (DoD) programs. This position will be responsible for providing subject matter ...

Evaluations Engineer information

[Chart: hourly pay scale for evaluations engineer jobs, with marked values of $12, $55, and $80]

How much do evaluations engineer jobs pay per hour?

As of Jun 23, 2026, the average hourly pay for an evaluations engineer in the United States is $55.99, according to ZipRecruiter salary data. Most workers in this role earn between $40.14 and $74.52 per hour, depending on experience, location, and employer.

What does an Evaluations Engineer do?

An Evaluations Engineer is responsible for assessing the performance, quality, and reliability of products, systems, or processes. They design and conduct tests, analyze data, and prepare detailed reports to ensure that products meet industry standards and client requirements. Evaluations Engineers often collaborate with design, manufacturing, and quality assurance teams to recommend improvements and address any deficiencies. Their role is critical in ensuring that products are safe, effective, and ready for market release.

What engineer makes $500,000 a year?

Highly experienced engineers in specialized fields such as petroleum engineering, aerospace engineering, or senior software engineering roles at large tech companies can earn $500,000 or more annually, often including bonuses and stock options. These positions typically require advanced skills and extensive experience, and they often involve leadership or executive responsibilities.

What engineers make $300,000 a year?

Senior engineers in specialized fields such as petroleum, aerospace, or software engineering with extensive experience and advanced skills can earn $300,000 or more annually. High-level roles often require advanced degrees, certifications, and leadership responsibilities, especially in industries with high technical demands and project complexity.

How does an Evaluations Engineer typically collaborate with other departments during product assessments?

Evaluations Engineers often work closely with cross-functional teams such as product development, quality assurance, and marketing to thoroughly assess products or systems. They provide technical feedback, prepare testing protocols, and communicate findings to stakeholders to inform design improvements or go-to-market decisions. Regular meetings and clear documentation are key, as Evaluations Engineers must ensure that all departments understand the evaluation outcomes and their implications for the product lifecycle.

What is the difference between Evaluations Engineer vs Quality Assurance Engineer?

Criteria | Evaluations Engineer | Quality Assurance Engineer
Required Credentials | Bachelor's in Engineering or related field; certifications vary | Bachelor's in Engineering, Computer Science, or related; QA certifications often preferred
Work Environment | Design and conduct evaluations of products or systems, often in labs or testing facilities | Develop and execute testing plans, often in labs or production environments
Employer & Industry Usage | Used in manufacturing, aerospace, automotive, and tech industries | Common in software, manufacturing, and tech sectors
Common Search & Comparison Intent | Evaluations Engineer vs Quality Assurance Engineer

Evaluations Engineers focus on testing and analyzing products or systems to ensure they meet specifications, often involving detailed assessments and technical evaluations. Quality Assurance Engineers primarily develop testing procedures to identify defects and improve product quality. While both roles involve testing, Evaluations Engineers tend to focus on technical evaluations, whereas QA Engineers emphasize process and defect detection.

What does an evaluation engineer do?

An evaluation engineer assesses the performance, safety, and reliability of products, systems, or processes through testing, analysis, and data collection. They often use specialized tools and techniques to ensure standards are met and may prepare reports or recommendations based on their findings.

What are the 4 types of engineers?

Engineers are typically categorized into four main types: civil, mechanical, electrical, and chemical engineers. Each type specializes in different fields such as infrastructure, machinery, electronics, or chemical processes, and often requires specific technical skills and certifications. In the context of evaluations engineering, professionals may focus on assessing systems, products, or processes within these engineering disciplines.

What are the key skills and qualifications needed to thrive as an Evaluations Engineer, and why are they important?

To thrive as an Evaluations Engineer, you need a strong background in engineering principles, data analysis, and problem-solving, often supported by a relevant engineering degree. Familiarity with testing tools, statistical analysis software, and quality management systems is typically required, and certifications such as Six Sigma are advantageous. Excellent attention to detail, communication skills, and a collaborative mindset help you interpret results and work effectively with cross-functional teams. These skills ensure precise evaluations, drive continuous improvement, and support informed decision-making within technical projects.
More about Evaluations Engineer jobs
Infographic showing various Evaluations Engineer job openings in the United States as of June 2026, with employment types broken down into 84% Full Time, 12% Part Time, and 4% Contract. Highlights an 87% Physical, 5% Hybrid, and 8% Remote job distribution, with an average salary of $116,463 per year, or $56 per hour.
Senior AI Agent & Evaluations Engineer

Vacatia

Portland, OR • On-site

$111K - $152K/yr

Full-time

Posted 6 days ago


Job description

Join Vacatia and Help Build the Future of AI-Powered Vacation Ownership
Location: Portland, OR (Hybrid - Three Days In Office)
Remote considered for exceptional candidates.
About Vacatia
Vacatia is building the future of vacation ownership. We operate in a fragmented, operationally complex industry where AI has the potential to fundamentally transform how decisions are made, how customers are supported, and how businesses scale.
We're developing AI agents that sit at the center of critical business workflows, helping owners, supporting operations, surfacing insights, and automating decisions that historically required significant human effort. These agents interact with real customers and influence real business outcomes, making reliability, safety, and performance essential.
We're looking for a hands-on Senior AI Agent & Evals Engineer to own the intelligence layer behind these systems. You'll be responsible for designing agent behavior, building evaluation frameworks, creating guardrails, and continuously improving agent performance as our AI footprint expands across the organization.
If you're passionate about prompt engineering, agent reliability, and creating measurable AI systems that solve meaningful business problems, we'd love to meet you.
Why You'll Love Working at Vacatia
  • Build the Future of Applied AI: Design and improve AI agents that directly impact customer experiences, operational efficiency, and business outcomes across our organization.
  • Work on Problems That Matter: Your work will influence real-world decisions involving customer communications, mortgage outcomes, rental operations, and owner experiences.
  • Own the Intelligence Layer: Take full ownership of prompt design, agent behavior, evaluation systems, guardrails, and continuous performance improvement.
  • Measure What Matters: Build sophisticated evaluation frameworks, golden datasets, and automated scoring systems that ensure our agents continually improve.
  • Partner Across the Business: Collaborate closely with engineers, operators, and subject matter experts to transform business knowledge into scalable AI systems.
  • Join a Small Team with Outsized Impact: Work alongside experienced engineers and leaders who believe AI can create meaningful competitive advantages in a traditionally underserved industry.
Your Impact
  • Design, refine, and optimize prompts, tool definitions, routing logic, and decision-making behavior across Vacatia's AI agent ecosystem
  • Build and maintain evaluation frameworks, golden datasets, grading systems, and regression testing pipelines that measure agent quality and reliability
  • Develop guardrails and safe-failure mechanisms that ensure agents operate responsibly in customer-facing and financially sensitive workflows
  • Monitor production performance, investigate failures, identify edge cases, and continuously improve agent outcomes through data-driven iteration
  • Partner with business stakeholders to translate policies, operational requirements, and domain expertise into measurable agent behavior
  • Collaborate with engineering teams to define context requirements, tool contracts, and integration specifications that support agent success
  • Create scalable frameworks and reusable patterns for deploying AI agents across new business workflows and use cases
  • Establish best practices for prompt engineering, evaluation methodologies, observability, and agent operations

What You Bring
  • Proven experience shipping and owning production AI agents or LLM-powered systems beyond proof-of-concept environments
  • Deep expertise in prompt engineering, including system prompts, tool usage, context management, output constraints, and agent behavior design
  • Hands-on experience building evaluation frameworks using golden datasets, scoring rubrics, LLM-as-judge methodologies, and regression testing
  • Strong familiarity with modern AI development tools such as Claude Code, Codex, or similar coding agents
  • Experience with agent observability and evaluation platforms such as LangSmith, Langfuse, Arize, Galileo, or comparable solutions
  • Ability to distinguish prompt issues from data, tooling, model, or evaluation failures and systematically improve agent performance
  • Strong written and verbal communication skills with the ability to work effectively across engineering and business teams
  • Demonstrated ownership mindset with a passion for building reliable, measurable, and continuously improving AI systems

Strongly Preferred
  • Experience building agents that process communication-based workflows including emails, support tickets, chat interactions, or transcripts
  • Experience with multiple agent frameworks and a practical understanding of their tradeoffs
  • Familiarity with the evolving LLM landscape and model selection strategies
  • Experience designing and implementing end-to-end evaluation pipelines and agent operations workflows
  • Production experience with online evaluation systems and automated scoring of live traffic

Nice to Have
  • Experience integrating AI systems with Salesforce, AWS Connect, or customer engagement platforms
  • Background in customer-facing industries where accuracy, compliance, and communication quality are critical
  • Contributions to open-source projects, technical writing, or public thought leadership in AI, prompt engineering, or agent development

Join Us
Join us at the forefront of applied AI innovation. If you're excited about building intelligent systems that solve complex business problems, improving agent behavior through rigorous evaluation, and helping shape the future of vacation ownership, we'd love to hear from you.
At Vacatia, you'll have the opportunity to build AI solutions that matter, work alongside talented teammates, and create technology that drives real business impact.