Quantization from the ground up

A thorough explainer on how quantization makes LLMs 4x smaller and 2x faster while losing only 5-10% accuracy. Covers floating point precision, compression techniques, and how to measure quality loss, with interactive examples throughout.

Published by hadi on April 8, 2026

Related Posts

A tour of my dotfiles

Rewriting Bun in Rust

CLAUDE.md is RAM, not disk