Technology · Ebook

Fluid: Data Orchestration for Kubernetes

by Shriira Press

4.7 (400) · 186 pages · Published 2026

Fluid is a Kubernetes-native distributed dataset orchestration and acceleration engine for data-intensive applications such as AI/ML and big data. It makes data a first-class, cached, locality-aware citizen of Kubernetes. This free book teaches Fluid from the ground up: the data problem in cloud-native compute and what Fluid is; the data-locality problem (decoupled storage, repeated fetching, idle GPUs); Fluid's architecture (Datasets, Runtimes, controllers); the Dataset abstraction; caching runtimes (Alluxio, JuiceFS, ThinRuntime); data acceleration (caching, prefetching, fast reads); data-aware scheduling (bringing compute to the data); AI/ML and big data use cases (training, serving, analytics); operating Fluid (managing caches, observability, autoscaling, consistency); and using Fluid in practice. Its ten focused chapters use clear diagrams to show how to keep expensive compute fed by caching data near compute, reusing it, and scheduling for locality, so that data stops being the bottleneck.

Contents

  1. Preface
  2. Chapter 1: What Fluid Is
  3. Chapter 2: The Data Locality Problem
  4. Chapter 3: Fluid's Architecture
  5. Chapter 4: The Dataset Abstraction
  6. Chapter 5: Caching Runtimes
  7. Chapter 6: Data Acceleration
  8. Chapter 7: Data-Aware Scheduling
  9. Chapter 8: AI/ML and Big Data Use Cases
  10. Chapter 9: Operating Fluid
  11. Chapter 10: Using Fluid in Practice