Bilal Chughtai
About
Research Engineer @ Google DeepMind, working on mechanistic interpretability and AGI safety. Trying to make the development of transformative AI go well for humanity.
If you'd like to chat about work or life, please email me.
Links
google scholar // github // linkedin // lesswrong // substack // twitter // strava // last.fm // email
Posts
2025
- How I fixed my sleep
- Difficulties in evaluating a deception detector for AIs
- GDM mechanistic interpretability team updates
- How to save 1/3 off TfL rail fares
- Coaching is good, actually
- Should you spend time making things more efficient?
- An opinionated guide to building a good to-do system
- everything2prompt
- My health dashboard
- Joining Google DeepMind
- Detecting strategic deception using linear probes
- Open problems in mechanistic interpretability
- Intellectual progress in 2024
- Activation space interpretability may be doomed
2024
- Book Summary: Zero to One
- Reasons for and against working on technical AI safety at a frontier AI lab
- You should remap your caps lock key
- You should consider applying to PhDs (soon!)
- Understanding positional features in layer 0 SAEs
- Unlearning via RMU is mostly shallow
- Transformer circuit faithfulness metrics are not robust
- Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs
Tags
ai
- Difficulties in evaluating a deception detector for AIs
- GDM mechanistic interpretability team updates
- everything2prompt
- Joining Google DeepMind
- Detecting strategic deception using linear probes
- Open problems in mechanistic interpretability
- Activation space interpretability may be doomed
- Reasons for and against working on technical AI safety at a frontier AI lab
- You should consider applying to PhDs (soon!)
- Understanding positional features in layer 0 SAEs
- Unlearning via RMU is mostly shallow
- Transformer circuit faithfulness metrics are not robust
- Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs
productivity
- On being "busy"
- How I fixed my sleep
- How to save 1/3 off TfL rail fares
- Coaching is good, actually
- Should you spend time making things more efficient?
- Product recommendations
- An opinionated guide to building a good to-do system
- everything2prompt
- You should remap your caps lock key
interpretability
- Difficulties in evaluating a deception detector for AIs
- GDM mechanistic interpretability team updates
- Joining Google DeepMind
- Detecting strategic deception using linear probes
- Open problems in mechanistic interpretability
- Activation space interpretability may be doomed
- Understanding positional features in layer 0 SAEs
- Unlearning via RMU is mostly shallow
- Transformer circuit faithfulness metrics are not robust
research
- Activation space interpretability may be doomed
- Understanding positional features in layer 0 SAEs
- Unlearning via RMU is mostly shallow
- Transformer circuit faithfulness metrics are not robust
- Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs
paper
- Difficulties in evaluating a deception detector for AIs
- Detecting strategic deception using linear probes
- Open problems in mechanistic interpretability
- Transformer circuit faithfulness metrics are not robust
- Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs
careers
- Joining Google DeepMind
- Reasons for and against working on technical AI safety at a frontier AI lab
- You should consider applying to PhDs (soon!)