Understanding Benchmarks

Why most AI benchmarks tell us so little

On Tuesday, startup Anthropic released a family of generative AI models that it claims achieve best-in-class performance. Just a few days later, rival Inflection AI unveiled a model that it asserts ...

VentureBeat

AI researchers launch SuperGLUE, a rigorous benchmark for language understanding

Facebook AI Research, together with Google ...

JD Supra

Human rights benchmarks: Corporate performance rankings on the rise: The business case for understanding human rights and ESG benchmarks

Some companies are skeptical about engaging with human rights and ESG benchmarking, because they question whether human rights and ESG disclosures and compliance have a direct economic effect on their ...

InfoWorld

Why benchmarks are key to AI progress

AI benchmark numbers are meaningless — here's what to look for instead

The way we measure progress in AI is terrible

Crowdsourced AI benchmarks have serious flaws, some experts say

Portfolio Performance: Understanding Attribution Analysis