
A 2025 review of large language models, from DeepSeek R1 and RLVR to inference-time scaling, benchmarks, architectures, and predictions for 2026.

In June, I shared a bonus article containing my curated and bookmarked research-paper lists with the paid subscribers who make this Substack possible.

Understanding How DeepSeek's Flagship Open-Weight Models Evolved

Linear Attention Hybrids, Text Diffusion, Code World Models, and Small Recursive Transformers

Multiple-Choice Benchmarks, Verifiers, Leaderboards, and LLM Judges with Code Examples

A Detailed Look at One of the Leading Open-Source LLMs

And How They Stack Up Against Qwen3

From DeepSeek-V3 to Kimi K2: A Look At Modern LLM Architecture Design

A topic-organized collection of 200+ LLM research papers from 2025

KV caches are one of the most important techniques for efficient LLM inference in production.
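To make the idea concrete, here is a minimal sketch of a KV cache for single-head causal attention. It is illustrative only, assuming NumPy and my own made-up names and shapes (not any particular library's API): during step-by-step decoding, each token's key and value projections are computed once and appended to a cache, instead of reprojecting the entire prefix at every step.

```python
import numpy as np

d = 8  # head dimension (toy size for illustration)

def attention(q, K, V):
    # q: (d,), K/V: (t, d); causal by construction, since the cache
    # only ever contains past and current positions.
    scores = K @ q / np.sqrt(d)          # (t,)
    w = np.exp(scores - scores.max())    # numerically stable softmax
    w /= w.sum()
    return w @ V                         # (d,)

rng = np.random.default_rng(0)
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
xs = rng.standard_normal((5, d))         # embeddings for 5 tokens

# Incremental decoding with a KV cache: per step, project only the
# new token and append its key/value to the cache.
K_cache, V_cache, cached_out = [], [], []
for x in xs:
    K_cache.append(Wk @ x)
    V_cache.append(Wv @ x)
    cached_out.append(attention(Wq @ x, np.array(K_cache), np.array(V_cache)))

# Reference: naively recompute K and V for the whole prefix at every step.
full_out = [attention(Wq @ xs[t], xs[:t + 1] @ Wk.T, xs[:t + 1] @ Wv.T)
            for t in range(len(xs))]

# Same outputs, but the cached version avoids redundant projections.
assert np.allclose(cached_out, full_out)
```

The savings grow with sequence length: without the cache, the key/value projections for a prefix of length t are redone at every step, so generation costs grow quadratically in those projections rather than linearly.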

Why build LLMs from scratch? It's probably the best and most efficient way to learn how LLMs really work. Plus, many readers have told me they had a lot of fun doing it.

Understanding GRPO and New Insights from Reasoning Model Papers

Welcome to the next stage of large language models (LLMs): reasoning. LLMs have transformed how we process and generate text, but their success has been driven largely by statistical pattern recognition. New advances in reasoning methodologies now enable them to tackle more complex tasks, such as solving logical puzzles or multi-step arithmetic. Understanding these methodologies is the central focus of this book.

Inference-Time Compute Scaling Methods to Improve Reasoning Models

Methods and Strategies for Building and Refining Reasoning Models

Six influential AI papers from July to December

Six influential AI papers from January to June

A curated list of interesting LLM-related research papers from 2024, shared for those looking for something to read over the holidays.

An introduction to the main techniques and latest models

Finetuning a GPT Model for Spam Classification