Building software to improve our evaluations. We don't just try and run the same evaluation over ... a cap-exempt H-1B visa for this role. We encourage you to apply even if your background may not ...
Project Analyst - CAP and Fee Group
Temecula, CA · On-site +1
$65K - $100K/yr
Full Time, Exempt Location:Remote or Temecula, CA Salary Range: $65,000 - $100,000 NBS has standard ... In addition, NBS licenses its proprietary software, called D-FAST, to local government agencies ...
Project Analyst - CAP and Fee Group
Temecula, CA · On-site
$65K - $100K/yr
Analyst - Project Classification: Full Time, Exempt Location: Remote or Temecula, CA Salary Range ... In addition, NBS licenses its proprietary software, called D-FAST ® , to local government agencies ...
Member of Technical Staff
Berkeley, CA · On-site
$285K - $503K/yr
Software Engineering * You balance rapid prototyping with the creation of maintainable, scalable ... a cap-exempt H-1B visa for this role. We encourage you to apply even if your background may not ...
Cloud Evals Infrastructure Engineer
Berkeley, CA · On-site
$285K - $428K/yr
Background in supporting researchers and software engineers * Familiarity with the wacky world of ... If you lack US work authorization, we can likely sponsor a cap-exempt H-1B visa for this role. We ...
Robotics Test Engineer
$298K - $376K/yr
... Cap'n Proto middleware stack ... Simulation-Based Testing - Build the simulation infrastructure that lets us run autonomy software ...
Robotics Test Engineer
San Francisco, CA · On-site
$298K - $376K/yr
... Cap'n Proto middleware stack ... Simulation-Based Testing - Build the simulation infrastructure that lets us run autonomy software ...
Case Manager
$20.20 - $22.72/hr
*This position is Full Time, Non-exempt Hourly; Starting Range $20.20 to $22.72; Day (8am-5pm) and ... Collaborate with the Street Outreach Specialist, Housing Resource Specialist, and CAP Coordinator ...
Process Engineer Utilities
Oxnard, CA · On-site
$85K - $105K/yr
... gas/cap-and-trade compliance. * Operates and supports wastewater treatment systems to meet ... This is an exempt position. Qualifications: * Chemical Engineer, Environmental Engineer or Pulp ...
Electrical and Instrumentation Technician
Galt, CA · On-site
$38.82 - $47.64/hr
IUOE, Local 39 REPRESENTED NON-EXEMPT Performs highly skilled electrical/instrumentation ... Computers and software programs (e.g., Microsoft software packages) to conduct, compile, and/or ...
Civil Engineer - Assessments
Temecula, CA · On-site
$90K - $150K/yr
Civil Engineer Classification: Full Time, Exempt Location: Remote or Temecula, CA Salary Range: $90 ... Expected to partner with the Utility Rate and CAP and Fee Groups to support specialized studies ...
Program Manager, Arleta, Ca
CA · On-site
$60K - $75K/yr
Exempt SALARY: $60,000.00 - $75,000.00 * Work Shift: To Be Determined * All positions require a ... determine census cap for operational planning (i.e.) room changes, meal planning etc. • ...
This position is classified as non-exempt and is eligible for overtime pay. Employees are paid bi ... Experience using Canva (or similar graphic design software) to create visually compelling marketing ...
Program Manager, Arleta, Ca
Los Angeles, CA · On-site
$60K - $75K/yr
Exempt SALARY: $60,000.00 - $75,000.00 * Work Shift: To Be Determined * All positions require a ... determine census cap for operational planning (i.e.) room changes, meal planning etc. • ...
Civil Engineer - Assessments
Temecula, CA · On-site +1
$90K - $150K/yr
Civil Engineer Classification: Full Time, Exempt Location: Remote or Temecula, CA Salary Range: $90 ... Expected to partner with the Utility Rate and CAP and Fee Groups to support specialized studies ...
Civil Engineer
Temecula, CA · On-site
$90K - $150K/yr
Civil Engineer Classification: Full Time, Exempt Location: Remote or Temecula, CA Salary Range: $90 ... Expected to partner with the Utility Rate and CAP and Fee Groups to support specialized studies ...
Cap Exempt Software information
What is the difference between Cap Exempt Software vs Software Developer?
| Aspect | Cap Exempt Software | Software Developer |
|---|---|---|
| Credentials | Typically requires relevant technical certifications or degrees in software engineering or computer science | Requires degrees in computer science or related fields; certifications are optional |
| Work Environment | Often employed in government, nonprofit, or educational institutions with cap-exempt status | Works in various industries including tech companies, startups, or corporate settings |
| Employer & Industry Usage | Commonly used in organizations with cap-exempt status for federal funding or grants | Widely used across private and public sectors for software development projects |
In summary, Cap Exempt Software professionals typically work in organizations with specific funding or tax-exempt status, focusing on software solutions within those environments. Software Developers have a broader role across industries, developing applications regardless of funding status.
Other
PTO
Posted 7 days ago
Job description
We are a nonprofit research organization that develops scientific methods to assess AI capabilities, risks, and mitigations, with a specific focus on threats related to AI R&D automation and misalignment.
METR has consistently set precedents for catastrophic AI risk evaluations, including the first independent safety evaluations (working informally with Anthropic and OpenAI in 2022), the first loss-of-control evaluations and first agentic dangerous capability evaluations, the first evaluations using finetuning (mentioned briefly here), the first independent evaluations using internal information about training, the first review partnership for company risk analysis, the first embedded redteaming, and the first evaluations of internal deployments.
We've been consulted and/or favorably referenced by groups on opposite ends of various spectra, including a16z, Khosla, Gary Marcus, Obama, and Dean Ball, and are known for producing one of the most positive results on AI capabilities (the time horizon trend) and the most negative (our downlift study). We're generally referenced as the canonical third party assessor, e.g. as the obvious candidate to verify conditional pause agreements.
We believe it is robustly good for policymakers and civil society to have a clear understanding of risks from AI systems, and we are extremely excited to build a team of ambitious, excellent people to tackle one of the most important challenges of our time.
What this role looks like
Running models on tasks. Often this means integrating models into our agent scaffolds, running them on our infrastructure and checking the results carefully. (METR both develops our own tasks internally and runs external evaluations.)
Communicating results and takeaways. This includes designing useful graphs, writing up conclusions for different audiences (system cards, risk reports, regulators, X, etc.), and having great takes on what matters for risk.
Building software to improve our evaluations. We don't just try to run the same evaluation over and over again. We also run faster, more informative evaluations over time; this means making the right investments (with the support of our platform team).
Project management. Live evaluations require keeping track of a bunch of threads and staying organized. With our recent risk report process, we were running many evaluations at once.
Strong and professional communication. We run important and sensitive evaluations, and so the team needs to coordinate with METR leadership, lab contacts, regulators, and others.
As part of informing the world about risk from frontier AI systems, METR often runs and publishes evaluations of frontier models.
Our evaluations are a central tool the world uses to understand AI progress. Our Time Horizon methodology has been included in system cards, called an "obsession" by the NYT, has wide reach online, and is used by governments to inform national policy.
We're expanding the ambition and scale of our evaluations. We have recently begun to measure model propensities and monitorability, and we are increasing the speed, reliability, and quantity of evaluations we aim to do so that we can keep the world informed.
Time Horizon is close to saturation, so we're currently working on Time Horizon 2.0, which we expect to be running on models over the next 6 to 18 months.
We're gearing up for our first large-scale publication on monitorability, which we believe will be similar to Time Horizon in helping folks understand trends over time.
We spent the past three months working on a large, industry-wide third-party risk assessment program - which includes us collecting information (and running evaluations!) for both monitorability and propensities/alignment. We expect to do much more work as part of our own risk assessment programs in the future.
In general, many ambitious impact stories for METR require us to have the capacity to run many more evaluations than we have run historically. For example, while our evaluations currently inform many key decisionmakers about AI capabilities, they are not yet consistently run with the scale, reliability, and speed necessary to play concrete, codified roles in regulatory frameworks. Unlocking this capacity is part of the near-future vision for evaluation execution.
Software engineering. You're a strong engineer with solid infra fundamentals. You can dig into unfamiliar systems, debug from logs, and identify and fix performance bottlenecks.
Speed and scrappiness. You get things done quickly. You're able to quickly identify what 80/20 looks like, and then do that.
High attention to detail. You read closely, can spot bugs in transcripts, and pay attention to the important fiddly bits.
Research understanding and taste. You understand research ideas and priorities, and have good intuitions for which plots are informative and which analyses are worth running to poke at the data.
Strong external communicator. You communicate well with external stakeholders, and we trust you to stay on the ball with communications with, e.g., lab contacts.
Project management. You can juggle many balls at once, keep stakeholders updated, and track and anticipate blockers.
Strong writing ability. You can be a solid contributor to METR's writeups of evaluation results (see, e.g., our GPT-5 report).