Computer Vision: From Pixels to Perception

Shriira Press

Preface

Welcome to Computer Vision: From Pixels to Perception.

This book is a comprehensive, self-contained guide to how machines learn to see: how they turn raw pixels into an understanding of what objects are present, where they are, how a scene is laid out in three dimensions, and what is happening over time. It is the seventh volume in a series and blends intuition, mathematics, and runnable code. It is also the discriminative counterpart to the companion Image Generation book: where that book asks "how do I create an image?", this one asks "how do I understand one?"

This title is part of the ShriIra library and is free to read in full, right here — our small contribution to making world-class knowledge easy to reach.

A note on reading it: open the Contents menu at the top of the reader to jump between chapters, and use the Aa menu to set a comfortable text size, a theme (light, sepia, or night), and a single- or two-page layout. Your place is saved automatically, so you can always pick up where you left off.

We hope it serves you well.

— Shriira Press

Contents

  Chapter 1: What Is Computer Vision?
  Chapter 2: Images, Pixels, and Formation
  Chapter 3: Classical Computer Vision
  Chapter 4: Convolutional Neural Networks for Vision
  Chapter 5: Image Classification and Architectures
  Chapter 6: Vision Transformers and Modern Backbones
  Chapter 7: Object Detection
  Chapter 8: Semantic and Instance Segmentation
  Chapter 9: Beyond 2D: Pose, Depth, Motion, and 3D
  Chapter 10: Self-Supervised and Representation Learning
  Chapter 11: Video Understanding
  Chapter 12: Datasets, Evaluation, and Deployment
  Chapter 13: Ethics: Facial Recognition, Surveillance, and Bias
  Appendix A: Notation and Symbols
  Appendix B: Further Reading