Thanos: Highly Available, Long-Term Prometheus

Shriira Press

Preface

Scale Prometheus up. Add a global query view across clusters, unlimited long-term retention via cheap object storage, and high availability — without replacing Prometheus.

Welcome to Thanos: Highly Available, Long-Term Prometheus.

Thanos is the CNCF project that extends Prometheus into a highly available, long-term, globally queryable metrics platform. It adds a global query view across all your Prometheus servers, effectively unlimited long-term retention via cheap object storage, high availability with deduplication, and downsampling, all while staying Prometheus-compatible. This free book teaches it from the ground up: the Prometheus scaling problem and what Thanos is; Prometheus and metrics concepts; Thanos's architecture (the components and the two models); the Sidecar and global query (the foundation, external labels); long-term storage (object storage, the Store Gateway); the Compactor and downsampling; high availability and deduplication; the Receiver and push-based ingestion; querying, rules, and operations; and using Thanos in practice. Ten focused chapters with clear diagrams make scaling Prometheus concrete: keep your Prometheus servers and add Thanos to get a unified global view across clusters, affordable multi-year retention, and reliable HA monitoring. You can adopt it incrementally and without relearning, because PromQL and Grafana stay the same.

This title is part of the ShriIra library and is free to read in full, right here — our small contribution to making world-class knowledge easy to reach.

A note on reading it: open the Contents menu at the top of the reader to jump between chapters, use the Aa menu to set a comfortable text size, theme (light, sepia, or night), and single- or two-page layout. Your place is saved automatically, so you can always pick up where you left off.

We hope it serves you well.

— Shriira Press

Contents

  Chapter 1: What Thanos Is
  Chapter 2: Prometheus and Metrics
  Chapter 3: Thanos Architecture
  Chapter 4: The Sidecar and Global Query
  Chapter 5: Long-Term Storage
  Chapter 6: The Compactor and Downsampling
  Chapter 7: High Availability and Deduplication
  Chapter 8: The Receiver and Push-Based Ingestion
  Chapter 9: Querying, Rules, and Operations
  Chapter 10: Thanos in Practice