
Bloom is an open-source tool from Anthropic for automatically generating behavioral evaluations of AI models. Given a specific behavior, such as self-preferential bias or sabotage, Bloom creates scenarios that probe for it and quantifies how often the behavior occurs across models. Its results distinguish aligned from misaligned models and correlate strongly with human judgment, which makes behavioral evaluation both scalable and reliable.
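
To make "quantifying behavior occurrence" and "correlating with human judgment" concrete, here is a minimal sketch in plain Python. It is not Bloom's code or API: the function names, the 1-10 score scale, the threshold of 7, and all of the numbers are hypothetical choices made for illustration. It assumes each scenario run has already been given a numeric behavior score by some judge, then computes the fraction of runs at or above the threshold and a Spearman rank correlation between judge scores and human labels.

```python
import math
from statistics import mean


def elicitation_rate(scores, threshold=7):
    """Fraction of runs whose score meets the (hypothetical) threshold."""
    if not scores:
        raise ValueError("need at least one score")
    return sum(s >= threshold for s in scores) / len(scores)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    return pearson(average_ranks(xs), average_ranks(ys))


# Made-up judge scores for two hypothetical models (not real results).
judge_scores = {
    "model_a": [2, 3, 8, 1, 2],
    "model_b": [8, 9, 7, 3, 10],
}
for name, scores in judge_scores.items():
    print(name, elicitation_rate(scores))

# Made-up judge scores and human labels for the same five transcripts.
judge = [2, 3, 8, 1, 9]
human = [1, 4, 7, 2, 10]
print("spearman:", spearman(judge, human))
```

In this toy data the second model shows the behavior in a larger share of runs than the first, which is the kind of gap that would separate two models on a given behavior. Rank correlation is used here because judge and human scores only need to agree on ordering, not on absolute scale.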