eve-sage

An agentic RAG system that is an expert on the framework it is built with - retrieval engineering and eval rigor are the product.

multi-hop retrieval · evals in CI

role: Architect and sole engineer
stack: eve (Vercel) · TypeScript · AI SDK · RAG
status: oss

// 01 - PROBLEM

Most RAG demos stop at "embed some docs, retrieve top-k, stuff the prompt." eve-sage takes the opposite bet: build agentic retrieval with an honest eval harness, on Vercel's brand-new eve framework, using eve's own documentation as the corpus - so anyone learning eve is using an eve agent to do it.

// 02 - APPROACH

Agentic multi-hop retrieval: hard questions get decomposed and re-searched until answerable, not one query and hope.
Inline citations on every answer, grounded in the eve / AI SDK / Workflow SDK docs.
A finite, high-quality corpus makes eval numbers honest and reproducible.
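The decompose-and-re-search loop described above can be sketched roughly as follows. This is a minimal illustration, not eve-sage's actual code: `Passage`, `Hooks`, `multiHopRetrieve`, and the `maxHops` default are hypothetical names, and none of them come from eve or the AI SDK. The hooks are synchronous to keep the sketch short; in a real agent they would most likely be async model and search calls.

```typescript
// Hypothetical sketch of multi-hop retrieval. All names are illustrative
// and are not part of eve, the AI SDK, or the eve-sage repository.
interface Passage {
  id: string;
  text: string;
}

interface Hooks {
  // Split a hard question into simpler sub-questions.
  decompose: (question: string) => string[];
  // Run a single retrieval call for one query.
  search: (query: string) => Passage[];
  // Decide whether the collected evidence is enough to answer.
  isAnswerable: (question: string, evidence: Passage[]) => boolean;
  // Propose follow-up queries when the evidence is still insufficient.
  refine: (question: string, evidence: Passage[]) => string[];
}

interface RetrievalResult {
  evidence: Passage[];
  hops: number;
  answerable: boolean;
}

function multiHopRetrieve(
  question: string,
  hooks: Hooks,
  maxHops = 3, // assumed budget; bounds cost and prevents endless loops
): RetrievalResult {
  // Deduplicate passages by id across hops.
  const seen: Record<string, Passage> = {};
  const collect = (): Passage[] => Object.keys(seen).map((id) => seen[id]);

  let queries = hooks.decompose(question);
  let hops = 0;

  while (hops < maxHops && queries.length > 0) {
    hops += 1;
    for (const query of queries) {
      for (const passage of hooks.search(query)) {
        seen[passage.id] = passage;
      }
    }
    const evidence = collect();
    if (hooks.isAnswerable(question, evidence)) {
      return { evidence, hops, answerable: true };
    }
    // Not answerable yet: search again with refined queries.
    queries = hooks.refine(question, evidence);
  }

  // Budget exhausted or no further queries: report what was found.
  return { evidence: collect(), hops, answerable: false };
}
```

Returning `answerable: false` instead of throwing lets the answering step decline or ask for clarification when the hop budget runs out, rather than guessing from thin evidence.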

// 03 - ARCHITECTURE

question → decompose → retrieve → rerank → answer → cite → eval

Self-demonstrating corpus: The agent's knowledge base is the documentation of its own stack - self-referential by design, and it keeps the project measurable instead of vibes-based.
Evals in CI, not as an afterthought: Retrieval quality regressions fail the build. The eval harness is the differentiator versus weekend chatbots.
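One way a retrieval regression could fail a build is a gate on mean recall@k over a fixed set of labeled questions. The sketch below assumes that metric; the source does not say which metrics eve-sage tracks, and `EvalCase`, `recallAtK`, `runRetrievalGate`, and the threshold parameter are all hypothetical names chosen for illustration.

```typescript
// Hypothetical sketch of a retrieval eval gate. The metric (recall@k) and
// every name here are assumptions, not taken from the eve-sage repository.
interface EvalCase {
  question: string;
  relevantIds: string[]; // ids of passages a correct retrieval must surface
}

// A retriever returns ranked passage ids for a question.
type Retriever = (question: string, k: number) => string[];

interface GateResult {
  meanRecall: number;
  passed: boolean;
}

// Fraction of the relevant passages that appear in the top k results.
function recallAtK(retrieved: string[], relevant: string[], k: number): number {
  if (relevant.length === 0) {
    return 1; // design choice: nothing to find counts as fully recalled
  }
  const topK = retrieved.slice(0, k);
  const hits = relevant.filter((id) => topK.indexOf(id) >= 0).length;
  return hits / relevant.length;
}

// Average recall@k over all cases and compare against a minimum threshold.
function runRetrievalGate(
  cases: EvalCase[],
  retrieve: Retriever,
  k: number,
  minMeanRecall: number,
): GateResult {
  const scores = cases.map((c) =>
    recallAtK(retrieve(c.question, k), c.relevantIds, k),
  );
  const meanRecall =
    scores.length === 0
      ? 0 // an empty eval set should not pass silently
      : scores.reduce((sum, s) => sum + s, 0) / scores.length;
  return { meanRecall, passed: meanRecall >= minMeanRecall };
}
```

A CI step would call `runRetrievalGate` against the real retriever and exit with a non-zero status when `passed` is false, which is what turns a quality drop into a failed build.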

// 04 - PRODUCTION-GRADE

Eval harness wired into CI
Hybrid retrieval with reranking
Tracing for every retrieval hop
Built on durable execution, with human-in-the-loop (HITL) approvals available from the framework
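To make "hybrid retrieval with reranking" concrete, here is a small sketch that merges a keyword ranking and a vector ranking with reciprocal rank fusion (RRF), then reorders the top candidates with a separate scoring function. RRF is a common fusion method chosen here for illustration; the source does not state which fusion or reranking method eve-sage uses, and `reciprocalRankFusion`, `rerank`, and their parameters are hypothetical.

```typescript
// Hypothetical sketch of hybrid retrieval: fuse two rankings, then rerank.
// RRF is an assumed technique; names are not from the eve-sage repository.

// Reciprocal rank fusion: each list contributes 1 / (k + rank) per id,
// with rank starting at 1. k = 60 is the commonly used default constant.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores: Record<string, number> = {};
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores[id] = (scores[id] || 0) + 1 / (k + index + 1);
    });
  }
  // Highest fused score first; ties broken by id so output is deterministic.
  return Object.keys(scores).sort(
    (a, b) => scores[b] - scores[a] || (a < b ? -1 : a > b ? 1 : 0),
  );
}

// Rerank only the top N fused candidates with a (usually costlier) scorer,
// for example a cross-encoder or an LLM relevance judgment.
function rerank(
  candidates: string[],
  score: (id: string) => number,
  topN: number,
): string[] {
  return candidates
    .slice(0, topN)
    .map((id) => ({ id, relevance: score(id) }))
    .sort((x, y) => y.relevance - x.relevance)
    .map((entry) => entry.id);
}
```

Fusing by rank rather than by raw score avoids having to normalize keyword scores against vector similarities, which live on different scales. Limiting the reranker to `topN` keeps the expensive step bounded.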

// 05 - ARTIFACTS

github.com/mikulgohil/eve-sage →