LLM Model Benchmarks - Search News

News

DeepSeek R1-0528 arrives in powerful open source challenge to OpenAI o3 and Google Gemini 2.5 Pro

Additionally, the model’s hallucination rate has been reduced, contributing to more reliable and consistent output.

Live Science on MSN 8d

AI benchmarking platform is helping top companies rig their model performances, study claims

LMArena, a popular benchmark for large language models, has been accused of giving preferential treatment to AIs made by big ...

New Claude 4 AI model refactored code for 7 hours straight

In particular, that marathon refactoring claim reportedly comes from Rakuten, a Japanese tech services conglomerate that ...

NextBigFuture 9d

Qwen 2.5 Coder and Qwen 3 Lead in Open Source LLM Over DeepSeek and Meta

Qwen 2.5 Coder/Max is currently the top open-source model for coding, with the highest HumanEval (~70–72%), LiveCodeBench (70 ...

YourStory 5d

Sarvam AI brings 24B-parameter LLM for Indian languages, reasoning

The Bengaluru startup noted that Sarvam-M sets a new benchmark for models of its size in Indian languages, as well as in math ...

Fortune India 6d

Sarvam AI launches India's sovereign LLM 'Sarvam-M', claims edge over LLaMA-3, Gemma 3 on key benchmarks

Founded in July 2023 by Vivek Raghavan and Pratyush Kumar, Sarvam aims to make Generative AI accessible at scale in India. In ...

Stark Insider 7d

Claude 4 is here – ChatGPT responds

Anthropic this week unveiled its latest LLM (Large Language Model), which can act as both a chatbot and AI assistant. Its special sauce -- coding -- seems ...

NewsBytes 6d

Sarvam AI launches flagship LLM, comparable to Meta, Google models

Sarvam AI claims that the advanced Sarvam-M model outperforms Meta's LLaMA-4 Scout on most benchmarks and is comparable to ...

Sarvam AI debuts flagship open-source LLM with 24 billion parameters

Sarvam-M, a 24-billion-parameter hybrid language model boasting strong performance in math, programming, and Indian languages.

2d on MSN

Sarvam-M is a large language model, or LLM, developed by Indian startup Sarvam AI.

Despite criticism over whether the model is “good enough” to compete globally, Sarvam-M’s launch has significantly raised the profile of Indian efforts in the AI space. The model is now publicly ...

2d on MSN

DeepSeek: Everything you need to know about the AI chatbot app

DeepSeek has gone viral. Chinese AI lab DeepSeek broke into the mainstream consciousness this week after its chatbot app rose ...

Devdiscourse 4d

Reverse engineering reveals cognitive gaps in current AI systems

Researchers identified two consistent failure modes in LLM reasoning: overcomplication and overlooking. In the ...
