Experience with mechanistic interpretability and/or alternative approaches to understanding model internals (e.g., activation analysis, circuit-level reasoning, representation learning). Background ...
Familiarity with interpretability, mechanistic interpretability, or model internals (sparse autoencoders, feature steering, etc.). Our values Goodfire is looking for individuals who embody our values ...
... mechanistic interpretability. Key Responsibilities • Conduct original research in AI security, including adversarial machine learning, model robustness, and secure AI system design. • Develop and ...
Field Team - Member of Technical Staff
San Francisco, CA · On-site
$200K - $325K/yr
Familiarity with interpretability, mechanistic interpretability, or model internals (sparse autoencoders, feature steering, etc.). Our values Goodfire is looking for individuals who embody our values ...
Experience in explainable and interpretable AI, such as feature attribution methods like LIME and SHAP, example- or influence-based attribution, or mechanistic interpretability. * Track record of ...
Research Lead
Berkeley, CA · On-site
Mechanistic Interpretability: finding issues with Sparse Autoencoders, probing deception using AmongUs, understanding learned planning in SokoBan, and interpretable data attribution. * Red-teaming ...
$170K - $270K/yr
Mechanistic Interpretability: finding issues with Sparse Autoencoders, probing deception using AmongUs, understanding learned planning in SokoBan, and interpretable data attribution. Red-teaming ...
AI Strategy Leader, R&D
Mountain View, CA · On-site
In this pivotal role, you will translate cutting-edge AI trends-such as agentic and multi-agent systems, advanced reasoning frameworks, embodied/physical AI, mechanistic interpretability, efficient ...
Senior Program Scientist, AI and Advanced Computing Institute
Manhattan, NY · On-site
$100K - $137K/yr
AI model evaluation and red-teaming for scientific reliability, Mechanistic interpretability or auditing methods, Multi-agent systems and emergent behavior, AI-accelerated simulation frameworks • ...
Research Engineer, AI
New York, NY · Hybrid
$214K - $375K/yr
Mechanistic interpretability of biological foundation models: extracting new biological knowledge directly from model weights * Scientific data at unprecedented scale: AI systems to collect, curate ...
Research Scientist, AI
Redwood City, CA · Hybrid
$214K - $375K/yr
Mechanistic interpretability of biological foundation models: extracting new biological knowledge directly from model weights * Scientific data at unprecedented scale: AI systems to collect, curate ...
Research Scientist, AI
New York, NY · Hybrid
$214K - $375K/yr
Mechanistic interpretability of biological foundation models: extracting new biological knowledge directly from model weights * Scientific data at unprecedented scale: AI systems to collect, curate ...
Research Engineer, AI
Redwood City, CA · Hybrid
$214K - $375K/yr
Mechanistic interpretability of biological foundation models: extracting new biological knowledge directly from model weights * Scientific data at unprecedented scale: AI systems to collect, curate ...
Research Scientist, AI
New York, NY · On-site +1
$214K - $375K/yr
Mechanistic interpretability of biological foundation models: extracting new biological knowledge directly from model weights * Scientific data at unprecedented scale: AI systems to collect, curate ...
Research Engineer, AI
New York, NY · On-site +1
$214K - $375K/yr
Mechanistic interpretability of biological foundation models: extracting new biological knowledge directly from model weights * Scientific data at unprecedented scale: AI systems to collect, curate ...
Research Engineer, AI
Redwood City, CA · On-site +1
$214K - $375K/yr
Mechanistic interpretability of biological foundation models: extracting new biological knowledge directly from model weights * Scientific data at unprecedented scale: AI systems to collect, curate ...
Mechanistic Interpretability salary information
| Salary range | Share of jobs |
|---|---|
| $31K - $32.8K | 13% |
| $32.8K - $34.5K | 56% |
| $34.5K - $36.3K | 26% |
| $36.3K - $38.1K | 1% |
| $38.1K - $39.9K | 0% |
| $39.9K - $41.6K | 0% |
| $41.6K - $43.4K | 0% |
| $43.4K - $45.2K | 1% |
| $45.2K - $47K | 1% |
| $47K - $48.7K | 1% |
| $48.7K - $50.5K | 1% |
$33.2K is the 25th percentile. Wages below this are outliers.
$35K is the 75th percentile. Wages above this are outliers.
How much do mechanistic interpretability jobs pay per year?
What is the difference between Mechanistic Interpretability and Data Scientist roles?
| Aspect | Mechanistic Interpretability | Data Scientist |
|---|---|---|
| Required credentials | Advanced degrees in AI, ML, or related fields | Degree in Data Science, Statistics, or Computer Science |
| Work environment | Research labs, AI development teams | Business, tech companies, consulting firms |
| Industry usage | AI research, model transparency, safety | Data analysis, predictive modeling, insights |
| Search intent | Understanding model internals, interpretability techniques | Data analysis, insights, model building |
Mechanistic Interpretability focuses on understanding how AI models work internally, often requiring deep technical expertise. Data Scientists analyze data to build models and extract insights. While both roles involve data and algorithms, Mechanistic Interpretability is more research-oriented, emphasizing transparency and safety of AI systems, whereas Data Scientists focus on practical data analysis and modeling for business applications.
Full-time
Posted 29 days ago
Job description
The Applied Research Laboratory for Intelligence & Security (ARLIS) at the University of Maryland is a University-Affiliated Research Center (UARC) dedicated to advancing research, innovation, and technology transition to improve decision making for U.S. national security. ARLIS combines deep scientific expertise with operational insight to address challenges in intelligence analysis, cybersecurity, artificial intelligence / machine learning, quantum science, and human-machine teaming. Researchers, scientists, engineers, and analysts at ARLIS collaborate with government agencies, industry partners, and academic institutions to deliver actionable insights and transformative solutions through research and development. Employees at ARLIS work on projects of critical importance, contribute directly to the nation's security, and are supported by a culture that values integrity, collaboration, and professional growth.
ARLIS is seeking a Postdoctoral Associate in AI Security to conduct cutting-edge research at the intersection of machine learning, cybersecurity, and national security.
This position focuses on advancing the science and practice of securing advanced AI systems, such as large language models (LLMs), reasoning systems, and agentic architectures, against sophisticated adversaries. The role operates within a mission-driven R&D environment supporting government and Intelligence Community (IC) partners, where the threat model assumes highly capable actors with deep technical access to deployed systems. Opportunities include basic and open research, publishing in top-tier venues, and transitioning capabilities into operational use. The successful candidate will contribute to frontier research spanning adversarial machine learning, secure AI deployment, and other approaches to security and safety, such as mechanistic interpretability.
Key Responsibilities
• Conduct original research in AI security, including adversarial machine learning, model robustness, and secure AI system design.
• Develop and evaluate novel attack and defense techniques for modern AI systems, including:
  • Mechanistic and white-box analysis of model behavior and safety mechanisms
  • Multi-turn and adaptive adversarial interactions with AI systems
  • Security of reasoning models and agent-based architectures
• Design and implement experimental frameworks for evaluating AI system vulnerabilities across deployment scenarios (e.g., open-weight, API-based, and hybrid systems).
• Apply interpretability techniques (e.g., circuit analysis, feature attribution, sparse autoencoders) to understand internal model behavior and failure modes.
• Contribute to the development of benchmarks, evaluation methodologies, and datasets for AI security research.
• Collaborate with interdisciplinary teams including machine learning researchers, systems engineers, and national security domain experts.
• Translate research findings into actionable insights for government sponsors, including technical reports and briefings.
• Publish research in leading conferences and journals (e.g., NeurIPS, ICML, ICLR, IEEE S&P, CCS), consistent with program objectives.
Must be able to obtain a U.S. security clearance. If selected, you must meet the requirements for access to classified information and will be subject to a government security clearance investigation that includes criminal and credit history checks, as well as verification of U.S. citizenship, birth, education, employment, and military history.
Final offer is contingent upon the candidate's ability to successfully obtain the necessary interim Secret security clearance, as determined by the U.S. Government, prior to commencing employment.
Research Areas of Interest
Candidates may contribute to one or more of the following focus areas:
Adversarial AI & Red Teaming
• Adaptive, multi-turn attacks and reasoning-based adversarial strategies
• Evaluation of model robustness under realistic threat models
Secure AI Systems & Deployment
• Security of agentic systems, tool use, and multi-model architectures
• Supply chain and fine-tuning risks in open-weight models
AI Evaluation & Benchmarking
• Development of security-focused benchmarks and evaluation pipelines
• Measurement of robustness, safety degradation, and attack transferability
Mechanistic AI Security
• Circuit-level analysis of safety and capability mechanisms
• Feature geometry, representation learning, and interpretability-driven security
Work Environment & Impact
• Engage in high-impact research directly supporting national security missions.
• Work alongside leading experts in AI, cybersecurity, and intelligence applications.
• Access to advanced computing infrastructure and unique government-relevant problem sets.
• Opportunity to shape emerging standards and practices for securing advanced AI systems.
• Balance of publishable academic research and mission-driven applied work.
Why This Role:
AI systems are rapidly becoming foundational to national security operations. At the same time, their attack surface is evolving toward more sophisticated threat models, including adversaries with deep technical access and the ability to exploit internal model behavior.
This position offers a unique opportunity to define how next-generation AI systems are secured, combining foundational research with real-world mission impact.
Physical Demands:
Sedentary work performed in a normal office environment; exerts up to 10 pounds of force occasionally and/or a negligible amount of force frequently or constantly to lift, carry, push, pull, or otherwise move objects, including the human body. Ability to attend meetings both on and off campus. Requires spending long hours in front of a computer screen.
Minimum Qualifications
• Ph.D. in Computer Science, Machine Learning, Cybersecurity, or a related technical field.
• Demonstrated research experience in one or more of the following areas:
  • Machine learning (deep learning, LLMs, reinforcement learning)
  • Adversarial machine learning or AI safety/security
  • Systems security, applied cryptography, or cyber operations
• Strong programming skills in Python and experience with ML frameworks (e.g., PyTorch, TensorFlow).
• Experience designing and executing empirical research, including experimentation and evaluation.
• Ability to work in a collaborative, interdisciplinary research environment.
• Ability to obtain and maintain a U.S. security clearance.
Preferences:
• Familiarity with white-box threat models and evaluation of open-weight AI systems.
• Experience with MLOps or large-scale training infrastructure, including distributed training, GPU clusters, or ML experimentation platforms.
• Knowledge of AI system deployment architectures, including RAG systems, multi-agent systems, or tool-augmented models.
• Experience with adversarial evaluation frameworks, red-teaming methodologies, or benchmark development.
• Experience with mechanistic interpretability and/or alternative approaches to understanding model internals (e.g., activation analysis, circuit-level reasoning, representation learning).
• Background in national security applications, including work with DoD, IC, or federally funded research programs.
• Record of publications in top-tier conferences or journals.
Licenses/Certifications: N/A
Additional Job Details
Required Application Materials: Cover Letter, Resume, List of References
Best Consideration Date: 6/13/26
Posting Close Date: N/A
Open Until Filled: YES
For more information on Financial Disclosure, please visit Maryland's State Ethics Commission website.
For more information on Regular Faculty benefits, select this link.
Offers of employment are contingent on completion of a background check. Information reported by the background check will not automatically disqualify anyone from employment. Before any adverse decision, the finalist will have an opportunity to provide information to the University regarding disclosable background check information. The University reserves the right to rescind the offer of employment or otherwise decline or terminate employment if the information reported by the background check is deemed incompatible with the position, regardless of when the background check is completed.
Employment Eligibility
The successful candidate must complete employment eligibility verification (on Form I-9) by presenting documents that establish identity and work authorization within the timeframe required by federal immigration law, and where applicable, to demonstrate renewed employment authorization. Failure to complete employment eligibility verification or reverification within the timeframe set forth by law may result in suspension or termination of employment.
EEO Statement
The University of Maryland, College Park is an Equal Opportunity Employer. All qualified applicants will receive equal consideration for employment. Please read the University's Equal Employment Opportunity Statement of Policy.
Title IX Non-Discrimination Notice
Resources
Learn how military skills translate to civilian opportunities with O*Net Online