Alignment Research Blog

Informal updates from the OpenAI team
2025
Debugging misaligned completions with sparse-autoencoder latent attribution
Efficiently finding features that cause behaviors.
A Practical Approach to Verifying Code at Scale
We train and deploy an AI review agent optimised for precision and real-world use, enabling oversight to scale with autonomous code generation.
Hello World
Introducing our blog on alignment research.