RAG in Production: Beyond the Tutorial (Vector DB Selection, Chunking, Evaluation)
Every RAG tutorial follows the same script. Load documents, split into chunks, embed them, store in a vector database, retrieve the top-k results, pass to an LL...
All posts tagged with #production
Last year I shipped an AI agent that cost us $400 in a single afternoon. It got stuck in a loop, calling the same API endpoint over and over, burning tokens on ...
I have been building software for over 18 years. In the last two years, my work has shifted more dramatically than in any previous decade. The shift is not abou...
The first prompt you write for a production system is a string in your code. The fiftieth prompt is a liability if it changes without testing, if it diverges be...
There is a version of every AI agent that works in a conference room demo. The LLM call succeeds, the tool executes cleanly, the output looks impressive. Then s...
After managing over 50 production database migrations, from small schema changes to complete database engine switches affecting millions of users, I've learned th...