
A 2025 review of large language models, from DeepSeek R1 and RLVR to inference-time scaling, benchmarks, architectures, and predictions for 2026.

In June, I shared a bonus article containing my curated and bookmarked research-paper lists with the paid subscribers who make this Substack possible.

Understanding How DeepSeek's Flagship Open-Weight Models Evolved

Linear Attention Hybrids, Text Diffusion, Code World Models, and Small Recursive Transformers

Multiple-Choice Benchmarks, Verifiers, Leaderboards, and LLM Judges with Code Examples

A Detailed Look at One of the Leading Open-Source LLMs

And How They Stack Up Against Qwen3

From DeepSeek-V3 to Kimi K2: A Look At Modern LLM Architecture Design

A topic-organized collection of 200+ LLM research papers from 2025

KV caches are one of the most important techniques for efficient LLM inference in production.
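To make the idea concrete, here is a minimal sketch of a KV cache for single-head causal attention. It is illustrative only, assuming NumPy and my own made-up names and shapes (not any particular library's API): during step-by-step decoding, each token's key and value projections are computed once and appended to a cache, instead of reprojecting the entire prefix at every step.

```python
import numpy as np

d = 8  # head dimension (toy size for illustration)

def attention(q, K, V):
    # q: (d,), K/V: (t, d); causal by construction, since the cache
    # only ever contains past and current positions.
    scores = K @ q / np.sqrt(d)          # (t,)
    w = np.exp(scores - scores.max())    # numerically stable softmax
    w /= w.sum()
    return w @ V                         # (d,)

rng = np.random.default_rng(0)
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
xs = rng.standard_normal((5, d))         # embeddings for 5 tokens

# Incremental decoding with a KV cache: per step, project only the
# new token and append its key/value to the cache.
K_cache, V_cache, cached_out = [], [], []
for x in xs:
    K_cache.append(Wk @ x)
    V_cache.append(Wv @ x)
    cached_out.append(attention(Wq @ x, np.array(K_cache), np.array(V_cache)))

# Reference: naively recompute K and V for the whole prefix at every step.
full_out = [attention(Wq @ xs[t], xs[:t + 1] @ Wk.T, xs[:t + 1] @ Wv.T)
            for t in range(len(xs))]

# Same outputs, but the cached version avoids redundant projections.
assert np.allclose(cached_out, full_out)
```

The savings grow with sequence length: without the cache, the key/value projections for a prefix of length t are redone at every step, so generation costs grow quadratically in those projections rather than linearly.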

Why build LLMs from scratch? It's probably the best and most efficient way to learn how LLMs really work. Plus, many readers have told me they had a lot of fun doing it.

Understanding GRPO and New Insights from Reasoning Model Papers

Welcome to the next stage of large language models (LLMs): reasoning. LLMs have transformed how we process and generate text, but their success has been driven largely by statistical pattern recognition. New advances in reasoning methodologies now enable them to tackle more complex tasks, such as solving logical puzzles or multi-step arithmetic. Understanding these methodologies is the central focus of this book.

Inference-Time Compute Scaling Methods to Improve Reasoning Models

Methods and Strategies for Building and Refining Reasoning Models

Six influential AI papers from July to December

Six influential AI papers from January to June

A curated list of interesting LLM-related research papers from 2024, shared for those looking for something to read over the holidays.

An introduction to the main techniques and latest models

Finetuning a GPT Model for Spam Classification