LLM Inference Benchmark

News

AMD: OpenAI Endorsement Is A Game Changer

Advanced Micro Devices' partnership with OpenAI and strong AI tailwinds make it an undervalued growth stock. Click here to ...

InfoQ, 1d

Nvidia's GB200 NVL72 Supercomputer Achieves 2.7× Faster Inference on DeepSeek V2

In collaboration with NVIDIA, researchers from SGLang have published early benchmarks of the GB200 (Grace Blackwell) NVL72 ...

Navigating a Patent Gap for AI and Machine Learning Algorithms

Opinion: Lidiya Mishchenko and Pooya Shoghi explain how to bridge a gap preventing successful patent claims to protect new ...

Decrypt, 4d

Inference Labs Raises $6.3M to Secure AI Agents Through Verifiable Inference Protocol

As autonomous AI agents increasingly influence decisions in critical domains—healthcare, finance, governance, and more—the ...

7d on MSN

ROCm 7: AMD’s big open-source bet on the future of AI

If 2023 and 2024 were the years NVIDIA set the pace for AI acceleration, 2025 is shaping up to be the year AMD answers back ...

10d

Top AI Advances for Enterprise and Deployment Challenges, According to Storage Exec

AI services have slashed inference costs up to 100x in two years, fueling a surge in enterprise adoption and $30B in ...

TMCnet, 13d

APTO Releases High-Accuracy Japanese Reasoning Data for LLM Fine-Tuning, Free of Charge

TOKYO, June 17, 2025 /PRNewswire/ -- APTO is pleased to announce the release of a free dataset for fine-tuning reasoning models, such as OpenAI's o1 and DeepSeek's DeepSeek R1. This dataset can ...

17d

Everyone Saw The Earnings; Few Saw This

While Nvidia’s record-breaking earnings grabbed headlines, its release of NVLink Fusion reveals a deeper strategy to entrench itself as the indispensable backbone of global AI infrastructure.

Business Wire, 22d

VeriSilicon’s Ultra-Low Energy NPU Provides Over 40 TOPS for On-Device LLM Inference in Mobile Applications

VeriSilicon announced that its ultra-low power NPU IP now supports on-device inference of LLMs with AI computing performance scaling beyond 40 TOPS.

Computer Weekly, 1mon

Red Hat launches llm-d community & project

Red Hat has announced the launch of llm-d, a new open source project designed to address generative AI’s future with inference at scale. Powered by a native Kubernetes architecture, llm-d ...

GitHub, 2mon

Michaelvll/llm-ie-benchmarks: A collection of reproducible inference engine benchmarks

This collection of open-source LLM inference engine benchmarks provides fair and reproducible one-line commands to compare different inference engines on identical hardware on different ...
