Data Science · Ebook
Pandas: Data Wrangling in Python
by Shriira Press
A comprehensive, self-contained guide to pandas, the library at the heart of data work in Python: the tool that loads, cleans, reshapes, combines, and analyzes the messy tabular data of the real world before any model or chart ever sees it. If a Python data-science or machine-learning project starts with a CSV, a database table, or a spreadsheet, pandas is almost certainly the first thing it touches. This book teaches it from first principles: the Series and DataFrame, reading and writing data, selecting and cleaning, the split-apply-combine pattern behind groupby, merging and reshaping, time series, performance, and the end-to-end data-analysis workflow. It blends intuition, the concepts behind the API, and runnable code.
Contents
- Preface
- Chapter 1 — What Is Pandas?
- Chapter 2 — Series and DataFrame: The Core Data Structures
- Chapter 3 — Reading and Writing Data: I/O
- Chapter 4 — Indexing and Selection: loc, iloc, and Boolean Filtering
- Chapter 5 — Cleaning Data: Missing Values, Duplicates, and Types
- Chapter 6 — Transforming Columns: apply, map, and Vectorized Operations
- Chapter 7 — GroupBy: Split-Apply-Combine
- Chapter 8 — Combining Data: Merge, Join, and Concat
- Chapter 9 — Reshaping Data: Pivot, Melt, and Tidy Data
- Chapter 10 — Time Series
- Chapter 11 — Categorical Data, Text, and Advanced Types
- Chapter 12 — Performance, Scaling, and the Ecosystem
- Chapter 13 — The Data Analysis Workflow and the Profession
- Appendix A — Glossary and API Quick Reference
- Appendix B — Further Reading and Resources
