cs/ai.

What is a Transformer?
Learnings from building AI agents
Building Effective AI Agents
Building a Linear Regression from Scratch with Python & Mathematics
The Annotated Kolmogorov-Arnold Network (KAN)
The Annotated Transformer
How do LLMs work?
Large Lambda Model
A Field Guide to Rapidly Improving AI Products
The 2025 AI Engineer Reading List
The Lost Reading Items
Monty Anderson
How to train a model on 10k H100 GPUs?
Tensor Labbet · A blog of deep learnings
naklecha/llama3-from-scratch: llama3 implementation one matrix multiplication at a time
A Visual Guide to Quantization - by Maarten Grootendorst
Ask HN: What are some "toy" projects you used to learn neural networks hands-on? | Hacker News
soulmachine/machine-learning-cheat-sheet: Classical equations and diagrams in machine learning
interdb.jp
Let's reproduce GPT-2 (1.6B): one 8XH100 node, 24 hours, $672, in llm.c · karpathy/llm.c · Discussion #677
Welcome … — Physics-based Deep Learning
Deep-ML
Trying Kolmogorov-Arnold Networks in Practice - Casey Primozic's Homepage
NMI_Review
karpathy/LLM101n: LLM101n: Let's build a Storyteller
Ilya 30u30
A Visual Guide to Vision Transformers | MDTURP
What are 1-bit LLMs? The Era of 1-bit LLMs with BitNet b1.58 | by Mehul Gupta | Data Science in your pocket | Mar, 2024 | Medium
A Deep Dive into the Underlying Architecture of Groq's LPU