What we do
BERI provides operational infrastructure and fiscal sponsorship to AI safety and existential risk initiatives worldwide through four programs. Across all of them, we deliver the same core value: handling operational complexity so teams can focus entirely on their mission.
From university-affiliated programs to independent research projects, we support our collaborators by managing a wide range of operations and grant administration, including contracts, financial operations, tax compliance, and more.
Our collaborations
Launched in 2025, this collaboration supports Coefficient Giving in administering grants awarded through their Technical AI Safety RFP. BERI manages research support expenses for 26 participating research groups in seven countries, including cloud computing costs, API credits, LLM subscriptions, and other software expenses. BERI also provides living stipends to some researchers in the program, and five independent contractors currently support this work.
Project | Researcher(s) | Country
Agent Misalignment Dataset | Ram Potham | United States
AI Debate for Control | Ethan Elasky and Frank Nakasako | United States
Capability Shaping with Gradient Routing in MoEs | Krishna Patel | United States
Chain of Thought Faithfulness | Marmik Chaudhari | United States
Chain of Thought Legibility in Reasoning Models | Arun Jose | India
Control in Distributed Attacks | Benjamin Arnav | United States
Detecting Deception Through Directional Suppression | Christian Hardy | United States
Discovering Non-Linear Representations | Ignacio de la Serna | Spain
Eliciting Encoded Reasoning in LLMs | Usman Anwar | United Kingdom
Emergent Alignment Faking | Robert Graham | Canada
Evals Synthesis Using Interpretability | Isabelle Lee | United States
Evaluation & Mitigation of Deception | Shi Feng | United States
Graph SAE: Mapping+Merging SAE Features | Zach Maas | United States
Interpretability Sprints | Bart Bussman | Netherlands
Jailbreak Effectiveness | Matthew Bozoukov | United States
Making Unlearning More Targeted | Filip Sondej | Poland
Mechanistic Data Attribution | Joseph Lee | United States
Mechanisms of Out-of-Context Reasoning | Atticus Wang | United States
Optimizing Against AI Awareness | Sohaib Imran | United Kingdom
Probabilistic Verification of Neural Networks | Noah Schwartz | United States
Steganography in R1's Chain-of-Thought | Kabir Khandpur | United Kingdom
Training LLMs to Introspect & Explain SAEs | Adam Karvonen | United States
Triggering Alignment Faking | Allen Thomas | United States
Understanding AI Control Scalability | Monika Jotautaite | United Kingdom
Understanding Emergent Misalignment | Aiden Ewart | United Kingdom
Upstream Merging | Eric Easley | United States
440 N. Barranca Ave. #2374
Covina, CA 91723
© 2026 Berkeley Existential Risk Initiative. All rights reserved.