Building an LLM from Scratch: From Tokens to Aligned Model

Shriira Press

Preface

Welcome to Building an LLM from Scratch: From Tokens to Aligned Model.

This is a hands-on, build-it-yourself guide to constructing a large language model from the ground up. You will build a working GPT-style model end to end: tokenizing its input, designing the architecture, training, sampling, fine-tuning, and aligning it, with every component written in readable PyTorch. It is the eighth volume in a series, and the series' engine room: the transformer and next-token prediction you build here are the exact machinery the companion books on audio, video, and vision repeatedly invoke. Where they survey, this book implements.

This title is part of the Shriira library and is free to read in full right here, our small contribution to making world-class knowledge easy to reach.

A note on reading it: open the Contents menu at the top of the reader to jump between chapters, and use the Aa menu to set a comfortable text size, a theme (light, sepia, or night), and a single- or two-page layout. Your place is saved automatically, so you can always pick up where you left off.

We hope it serves you well.

— Shriira Press

Contents

  1. Chapter 1 — What Is an LLM, and What We're Building
  2. Chapter 2 — Text to Tokens: Tokenization
  3. Chapter 3 — Embeddings and the Data Pipeline
  4. Chapter 4 — Attention
  5. Chapter 5 — Multi-Head Attention and Causal Masking
  6. Chapter 6 — The Transformer Block
  7. Chapter 7 — Assembling the GPT Model
  8. Chapter 8 — Pretraining: The Training Loop
  9. Chapter 9 — Generating Text: Decoding and Sampling
  10. Chapter 10 — Scaling, Efficiency, and Practical Training
  11. Chapter 11 — Fine-Tuning and Instruction Tuning
  12. Chapter 12 — Alignment: RLHF, DPO, and Preferences
  13. Chapter 13 — Evaluation, Deployment, and Limitations
  14. Appendix A — Notation and Symbols
  15. Appendix B — Further Reading