Tag: evals

  • Fit Testing AI Benchmarking

    In February I got in touch with CaML as part of the AIxAnimals incubator, run by Sentient Futures. They tasked me with putting MORU Bench (Moral Reasoning Under Uncertainty) up on Inspect, a benchmarking service run by the UK’s AI Security Institute. My PR was accepted, and I completed the project within three weeks. It was…