Fluid: Data Orchestration for Kubernetes

by Shriira Press

Rating: 4.7 (400 ratings) · 186 pages · Published 2026

Fluid is a Kubernetes-native distributed dataset orchestration and acceleration engine for data-intensive applications such as AI/ML and big data. It makes data a first-class, cached, locality-aware citizen of Kubernetes. This free book teaches Fluid from the ground up: the data problem in cloud-native compute and what Fluid is, the data-locality problem (decoupled storage, repeated fetching, idle GPUs), Fluid's architecture (Datasets, Runtimes, controllers), the Dataset abstraction, caching runtimes (Alluxio, JuiceFS, ThinRuntime), data acceleration (caching, prefetching, fast reads), data-aware scheduling (bringing compute to the data), AI/ML and big data use cases (training, serving, analytics), operating Fluid (managing caches, observability, autoscaling, consistency), and using Fluid in practice. Its ten focused chapters use clear diagrams to show how to keep expensive compute fed by caching data near compute, reusing it, and scheduling for locality, so that data stops being the bottleneck.

Preface
Chapter 1: What Fluid Is
Chapter 2: The Data Locality Problem
Chapter 3: Fluid's Architecture
Chapter 4: The Dataset Abstraction
Chapter 5: Caching Runtimes
Chapter 6: Data Acceleration
Chapter 7: Data-Aware Scheduling
Chapter 8: AI/ML and Big Data Use Cases
Chapter 9: Operating Fluid
Chapter 10: Using Fluid in Practice

Contents