News

Here are five common misconceptions about AI inferencing and what leaders can do differently to future-proof their ...
Rubrik Inc. today announced plans to acquire Predibase Inc., a startup that develops software for fine-tuning large language ...
And while Blackwell will increase inference performance fourfold over Hopper, it will not come close to the performance of Cerebras. And Cerebras is just getting started on models like ...
The tradeoff between inference-time and pre-training compute. The dominant approach to improving LLM performance has been to scale up model size and pre-training compute. However, this approach has ...
There’s a new Apple research paper making the rounds, and if you’ve seen the reactions, you’d think it just toppled the ...
Apple’s benchmarks show that this method generates 2.7x more tokens per second compared to ... ReDrafter extends its impact by enabling faster LLM inference on Nvidia GPUs widely used in ...
The goal is to provide security teams with guidance for picking the best LLM for their organization.
KAYTUS, a leading provider of end-to-end AI and liquid cooling solutions, today announced the release of the latest version of its MotusAI AI DevOps Platform at ISC High Performance 2025. The upgraded ...
To meet these unique requirements, Alluxio has collaborated with the vLLM Production Stack to accelerate LLM inference performance by providing an integrated solution for KV Cache management.