Tag: evals

Fit Testing AI Benchmarking
In February I got in touch with CaML as part of the AIxAnimals incubator, run by Sentient Futures. They tasked me with putting MORU Bench (Moral Reasoning Under Uncertainty) up on Inspect, an evaluation framework developed by the UK’s AI Security Institute. My PR was accepted, and I completed the project within three weeks. It was…