Building RAG in Laravel: Four Ingestion Bugs That Silently Wreck Retrieval

Every Laravel RAG tutorial builds the same ingestion pipeline (chunk, embed, store) and stops the moment the agent answers on screen. None of them check whether retrieval is any good. But retrieval quality is decided at ingestion, before the model runs once, and four decisions there fail with no error, no exception, no failed test:

Chunking that severs the answer mid-sentence, so answer@1 falls while source hit@1 still looks healthy.
An HNSW index built with vector_l2_ops while you query with cosine <=>. Postgres silently ignores the index and scans every row. Laravel 13’s native whereVectorSimilarTo() hardcodes <=>, so it’s easier to hit than ever. Shown with EXPLAIN.
The embedding dimension baked into the vector(1536) column type, so “shrink it to save storage” is a migration plus a full re-embed that quietly drops retrieval to 47%.
Ingesting and querying with different models, which turns every distance into noise.
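
The operator-class mismatch in the second item can be reproduced in a few statements. What follows is a minimal sketch, not code from the repo: the chunks table, the index names, and the 3-dimensional column are hypothetical and kept tiny for readability. It assumes the pgvector extension is available.

```sql
-- Hypothetical schema for illustration; a real embedding column would be e.g. vector(1536).
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE chunks (
    id        bigserial PRIMARY KEY,
    content   text NOT NULL,
    embedding vector(3) NOT NULL
);

-- The mismatch: vector_l2_ops builds an index that serves the L2 operator (<->) only.
CREATE INDEX chunks_embedding_l2
    ON chunks USING hnsw (embedding vector_l2_ops);

-- This query orders by cosine distance (<=>), which the L2 index cannot serve,
-- so the plan should not mention chunks_embedding_l2.
EXPLAIN
SELECT id
FROM chunks
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'
LIMIT 5;

-- The fix: build the index with the operator class that matches the query operator.
DROP INDEX chunks_embedding_l2;
CREATE INDEX chunks_embedding_cosine
    ON chunks USING hnsw (embedding vector_cosine_ops);
```

On a near-empty table the planner may choose a sequential scan either way, so run the EXPLAIN against a realistically sized table before drawing conclusions from the plan.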

Each bug is real code from a working repo, proven against an eval suite. This post is the prequel to my earlier “Evaluating RAG in Laravel” post: build it, prove it, tune it. Every example is verified against laravel/ai v0.7.2 and pgvector, and the full repo is available to clone.

Published by hadi on June 19, 2026
