Live Coding Decoding - Search News

5 days ago

DeepSeek open sources DSpark, a new framework to speed up LLM inference by up to 85%

DSpark can make decoding faster, but acceptance quality still determines how much speed the system actually realizes.

DeepSeek Releases DSpark: Speculative Decoding Makes V4 Up to 85 Percent Faster

DeepSeek speculative decoding framework DSpark went live June 27 on V4-Flash and V4-Pro, reporting up to 85 percent faster ...

techtimes

GLM-5.2 Open Weights Live: Top Coding Benchmark, but API Use Carries China Data Risk

A mobile phone's screen showing the logo of Chinese AI Zhipu in Beijing on January 21, 2026. Investor confidence in Chinese AI startups is riding high, but obstacles to their long-term success range ...

18 days ago

Z.ai’s open-weights GLM-5.2 beats GPT-5.5 on multiple long-horizon coding benchmarks for ...

It allows engineering teams to host frontier-level AI on their own sovereign infrastructure, entirely eliminating vendor lock ...

the-decoder

Zhipu AI's GLM-5.2 closes in on closed-source leaders in coding marathons

Chinese AI lab Zhipu AI releases GLM-5.2 with a stable 1-million-token context under the MIT license. On hours-long coding tasks, the open-source model trails Anthropic's Opus models by just a few ...

GizChina

Xiaomi MiMo-V2.5-Pro Just Hit 1,000 Tokens Per Second!

Most people know Xiaomi for phones and scooters. Not for breaking AI inference records. That changes today. Working with inference partner TileRT, Xiaomi has hit over 1,000 tokens per second on a ...

GitHub

TileRT: Tile-Based Runtime for

🎉 2026-02-14 · v0.1.3 Released. The v0.1.3 release introduces full support for the latest GLM-5 model, achieving up to 500 tokens/s on GLM-5-FP8 and up to 600 tokens/s on DeepSeek-V3.2. TileRT is a ...
