What we do

BERI provides operational infrastructure and fiscal sponsorship to AI safety and existential risk initiatives worldwide through four programs. Every program delivers the same core value: we handle operational complexity so teams can focus entirely on their mission.

From university-affiliated programs to independent research projects, we support this work by managing a wide range of operations and grant administration, including contracts, financial operations, tax compliance, and more.

Our collaborations

Since 2025, BERI has been supporting Coefficient Giving in administering grants awarded through its Technical AI Safety RFP. BERI manages research support expenses for 26 participating research groups across seven countries, including cloud computing costs, API credits, LLM subscriptions, and other software expenses. BERI also provides living stipends to some researchers in the program and currently engages five independent contractors to support this work.

Agent Misalignment Dataset | Ram Potham | United States

AI Debate for Control | Ethan Elasky and Frank Nakasako | United States

Capability Shaping with Gradient Routing in MoEs | Krishna Patel | United States

Chain of Thought Faithfulness | Marmik Chaudhari | United States

Chain of Thought Legibility in Reasoning Models | Arun Jose | India

Control in Distributed Attacks | Benjamin Arnav | United States

Detecting Deception Through Directional Suppression | Christian Hardy | United States

Discovering Non-Linear Representations | Ignacio de la Serna | Spain

Eliciting Encoded Reasoning in LLMs | Usman Anwar | United Kingdom

Emergent Alignment Faking | Robert Graham | Canada

Evals Synthesis Using Interpretability | Isabelle Lee | United States

Evaluation & Mitigation of Deception | Shi Feng | United States

Graph SAE: Mapping+Merging SAE Features | Zach Maas | United States

Interpretability Sprints | Bart Bussman | Netherlands

Jailbreak Effectiveness | Matthew Bozoukov | United States

Making Unlearning More Targeted | Filip Sondej | Poland

Mechanistic Data Attribution | Joseph Lee | United States

Mechanisms of Out-of-Context Reasoning | Atticus Wang | United States

Optimizing Against AI Awareness | Sohaib Imran | United Kingdom

Probabilistic Verification of Neural Networks | Noah Schwartz | United States

Steganography in R1's Chain-of-Thought | Kabir Khandpur | United Kingdom

Training LLMs to Introspect & Explain SAEs | Adam Karvonen | United States

Triggering Alignment Faking | Allen Thomas | United States

Understanding AI Control Scalability | Monika Jotautaite | United Kingdom

Understanding Emergent Misalignment | Aiden Ewart | United Kingdom

Upstream Merging | Eric Easley | United States

© 2026 Berkeley Existential Risk Initiative. All rights reserved.

440 N. Barranca Ave. #2374
Covina, CA 91723

contact@existence.org
