The limits of black-box evaluations: two hypotheticals

Apr 11, 2025

A prominent approach to AI safety goes under the name of "evals" or "evaluations". These are a critical component of plans that various major labs have, such as Anthropic&