AI groups rush to redesign model testing and create new benchmarks

Rapidly advancing technology is outpacing current methods of evaluating and comparing large language models.