China’s DeepSeek, in collaboration with researchers from Tsinghua University, has developed a technique to improve the reasoning capabilities of large language models (LLMs) by combining generative reward modeling (GRM) with self-principled critique tuning, SCMP reported, citing a paper published on Friday. The dual approach aims to enable LLMs to deliver better and faster answers to general queries. According to the paper, as reported by SCMP, the resulting DeepSeek-GRM models outperformed existing methods.