AI Safety

Is Agency Identifiable?

Identifiability in IRL

One of my favorite papers is "Occam's razor is insufficient to infer the preferences of irrational agents". It relates to an area of

A baseline for regulation of ML models

Machine learning researchers and engineers love baselines. Baselines serve as an important starting point for improvement, and checking new ideas against them helps measure and incentivize progress. For hard