Code Signal Coding Score

AI Coding Benchmark Scores Are Inflated by Answer Retrieval, Cursor Study Finds

AI coding benchmark scores that labs, enterprises, and investors use to compare frontier models are inflated by answer retrieval — not genuine reasoning — and the smarter the model, the more inflated ...

Tech Times

Open-Source Coding Model Ornith-1.0 Writes Its Own Training Scaffold in Reinforcement Learning

Open-source agentic coding model Ornith-1.0, released today under the MIT license, uses a self-improving reinforcement ...

21 days ago

Kimi K2.7-Code cuts thinking tokens 30% — but practitioners say the benchmarks don't ...

Kimi K2.7-Code claims 30% fewer thinking tokens and a drop-in API swap path, but independent benchmarks show kernel regressions and no DeepSWE submission.

1 day ago

Meta's new app 'Pocket' is a social feed of vibe-coded mini games

Pocket is a new app from Meta that lets you create and share interactive content, like mini games, with friends. It's powered ...

the-decoder

Zhipu AI's GLM-5.2 closes in on closed-source leaders in coding marathons

Chinese AI lab Zhipu AI releases GLM-5.2 with a stable 1-million-token context under the MIT license. On hours-long coding tasks, the open-source model trails Anthropic's Opus models by just a few ...

1 day ago

Lenny Rachitsky Swears By These 36 High-Signal Books to Be a Better Manager and Builder

I was tearing through so many coding books that my dad started returning the ones I’d finished to the bookstore so we could ...

GitHub

bradAGI/awesome-cli-coding-agents

A CLI coding agent is an AI-powered tool that runs in your terminal and can autonomously read, write, and execute code in your repository. Unlike chat-based assistants, these agents have direct access ...

17 days ago

Why Weibo’s tiny VibeThinker-3B has the AI world arguing over benchmarks again

VibeThinker-3B, a 3-billion-parameter AI model, is challenging OpenAI, Google and DeepSeek on math and coding benchmarks while reigniting the debate over AI scaling, benchmark gaming and small-model reasoning.

3 days ago on MSN

Anthropic launches Claude Sonnet 5 as a cheaper way to run agents

Anthropic’s Claude Sonnet 5 brings stronger agentic capabilities, lower pricing, and improved safety, positioning the model ...

AMBCrypto

Top 15 Free AI Stock Trading Bots for Beginners in 2026: A Practical Guide

AI stock trading bots are becoming easier for beginners to explore in 2026. Many platforms now offer free trials, paper ...
