A practical guide for product leaders who want a working knowledge assistant prototype — without sinking three months in vendor demos.
Super Admin · April 28, 2026
Playbook · RAG · LLM
Why 7 days?
The goal of a 7-day RAG pilot is not to build production software. It's to answer the highest-priority uncertainty: "Does this technology actually work with my data?" Speed is the point. Every day you spend scoping is a day of organizational doubt.
Day 1–2: Define the questions and the corpus
Before touching a vector database, write down the three most common questions your users ask that take more than 5 minutes to answer. Those are your eval cases. Not personas, not journey maps: just questions.
Pick a single domain: a product FAQ, a compliance manual, or a set of internal SOPs.
Collect 20–50 documents from that domain. Not 500. Not your entire intranet.
Write 20 ground-truth Q&A pairs. You'll use these to measure accuracy objectively.
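One way to keep the ground-truth pairs honest is to store them as structured records and check the set before any retrieval code exists. The sketch below is an illustration only: the `EvalCase` fields, the `validate` helper, and the sample questions are hypothetical, and the minimum of 20 simply mirrors the number above.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class EvalCase:
    """One ground-truth pair. Field names are illustrative, not a standard."""
    question: str
    expected_answer: str
    source_doc: str  # document the answer should be found in


def validate(cases: list[EvalCase], minimum: int = 20) -> list[str]:
    """Return a list of problems; an empty list means the set is usable."""
    problems = []
    if len(cases) < minimum:
        problems.append(f"only {len(cases)} cases, need {minimum}")
    seen = set()
    for i, case in enumerate(cases):
        key = case.question.strip().lower()
        if not key or not case.expected_answer.strip():
            problems.append(f"case {i}: empty question or answer")
        if key in seen:
            problems.append(f"case {i}: duplicate question")
        seen.add(key)
    return problems


def to_jsonl(cases: list[EvalCase]) -> str:
    """One JSON object per line, so the set diffs cleanly in version control."""
    return "\n".join(json.dumps(asdict(c), ensure_ascii=False) for c in cases)


# Placeholder content, not real data. The second case is a deliberate duplicate.
cases = [
    EvalCase("How do I reset my password?", "Use the reset link on the login page.", "faq.md"),
    EvalCase("how do i reset my password?", "Duplicate on purpose.", "faq.md"),
]
print(validate(cases))
```

Keeping the expected source document next to each answer makes it possible to check later whether retrieval found the right file, separately from whether the LLM phrased the answer well.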
Day 3–4: Build the minimum RAG stack
You need four components: a document chunker, an embedding model, a vector store, and an LLM. Here is the minimal stack that has worked for every pilot we've shipped:
Embeddings: text-embedding-3-small (OpenAI) or nomic-embed-text (self-hosted)
Vector store: pgvector on a Postgres instance you already have, or Qdrant Cloud free tier
LLM: GPT-4o-mini for cost, GPT-4o for quality — run both and compare
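To see how the four components fit together before wiring up real services, here is a minimal, dependency-free sketch. It is not the stack listed above: `embed` is a bag-of-words stand-in for an embedding model, `InMemoryStore` is a stand-in for pgvector or Qdrant, and the final LLM call is left out, so the pipeline stops at the prompt. The chunk size, overlap, and all function names are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words; neighbours share `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, max(len(words) - overlap, 1), step)]


def embed(text: str) -> Counter:
    """STAND-IN embedding: word counts. Replace with a real embedding model."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class InMemoryStore:
    """STAND-IN vector store: a plain list scanned linearly on every query."""

    def __init__(self) -> None:
        self.rows: list[tuple[str, str, Counter]] = []

    def add(self, doc_id: str, text: str) -> None:
        for piece in chunk(text):
            self.rows.append((doc_id, piece, embed(piece)))

    def search(self, query: str, k: int = 3) -> list[tuple[str, str]]:
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[2]), reverse=True)
        return [(doc_id, piece) for doc_id, piece, _ in ranked[:k]]


def build_prompt(question: str, hits: list[tuple[str, str]]) -> str:
    """Assemble the text that would be sent to the LLM (the call itself is omitted)."""
    context = "\n\n".join(f"[{doc_id}] {piece}" for doc_id, piece in hits)
    return (
        "Answer using only the sources below and cite the source id.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


# Placeholder documents, not real data.
store = InMemoryStore()
store.add("refunds.md", "Refunds are processed within 14 days of the return request.")
store.add("shipping.md", "Standard shipping takes three to five business days.")
hits = store.search("How long do refunds take?", k=1)
print(build_prompt("How long do refunds take?", hits))
```

Because each stage is a single function or class, swapping the stand-ins for a hosted embedding model and a real vector store should not change the overall shape of the pipeline.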
"The best RAG stack is the one you can explain to your client in 10 minutes and debug at 2am without documentation." — Sainskerta Engineering Principle
Day 5: Measure, don't eyeball
Run your 20 ground-truth pairs through the pipeline. Score each answer on three dimensions: faithfulness (does it cite a real source?), relevance (does it answer the question asked?), and completeness (does it miss important details?). Target: at least 80% of answers passing on each of the three dimensions before you demo.
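The 80% bar is easy to check mechanically once each answer has been graded. The sketch below assumes a binary pass/fail grade per dimension, which is one possible grading scheme and not the only one; the `Score` class and the sample grades are hypothetical.

```python
from dataclasses import dataclass

DIMENSIONS = ("faithfulness", "relevance", "completeness")
TARGET = 0.80  # the 80% bar from the text


@dataclass
class Score:
    """Pass/fail grade for one answer. Binary grading is an assumption."""
    faithfulness: bool
    relevance: bool
    completeness: bool


def pass_rates(scores: list[Score]) -> dict[str, float]:
    """Fraction of answers that passed, per dimension."""
    if not scores:
        raise ValueError("no scores to aggregate")
    return {d: sum(getattr(s, d) for s in scores) / len(scores) for d in DIMENSIONS}


def ready_to_demo(scores: list[Score]) -> bool:
    """True only if every dimension meets the target, not just the average."""
    return all(rate >= TARGET for rate in pass_rates(scores).values())


# Placeholder grades for 20 answers: 18 faithful, 17 relevant, 15 complete.
scores = [Score(i < 18, i < 17, i < 15) for i in range(20)]
rates = pass_rates(scores)
print(rates)
print(ready_to_demo(scores))
```

In this made-up run completeness lands at 75%, so the check fails even though the other two dimensions clear the bar. Checking each dimension on its own keeps one strong score from masking a weak one.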
Day 6: Fix the top 3 failure modes
Common failure modes in week-one RAG pilots are chunks that are too large (retrieval misses narrow facts), no reranking (noisy top-k), and the wrong embedding model for your language (especially critical for Indonesian-language documents).
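Of the three, reranking is the simplest to prototype: retrieve a broad candidate set first, then re-score the candidates with a second, more precise scorer and keep only the best few. The sketch below shows that two-stage shape. The `overlap_score` function is a crude term-overlap stand-in for a real reranking model, and all names and sample passages are hypothetical.

```python
import re


def terms(text: str) -> set[str]:
    """Lowercased word set of a text."""
    return set(re.findall(r"\w+", text.lower()))


def overlap_score(query: str, passage: str) -> float:
    """STAND-IN relevance scorer: share of query terms that appear in the passage."""
    q = terms(query)
    return len(q & terms(passage)) / len(q) if q else 0.0


def rerank(query: str, candidates: list[str], scorer=overlap_score, keep: int = 3) -> list[str]:
    """Re-order first-stage candidates by the second-stage scorer and keep the top few."""
    return sorted(candidates, key=lambda p: scorer(query, p), reverse=True)[:keep]


# Placeholder passages standing in for a noisy top-k from the vector store.
candidates = [
    "Our office is closed on public holidays.",
    "Annual leave requests need manager approval five days in advance.",
    "Leave balances reset every January.",
]
top = rerank("How many days in advance must annual leave be requested?", candidates, keep=1)
print(top[0])
```

Passing the scorer in as a parameter means a stronger model can replace the stand-in later without touching the rest of the retrieval code.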
Day 7: Demo and decide
Show 5 real users 5 real queries. Record their reactions. The question is not "do they like it?" — it's "can they tell when it's wrong?" If they can, you have a viable product. If they can't, you have a risk to manage.
What comes next
A 7-day pilot answers the feasibility question. The production question — latency, cost, accuracy at scale, ongoing evaluation — is a separate engagement. But you can't answer the production question until you've answered the feasibility question. Start here.
How to scope your first RAG pilot in 7 days — Sainskerta Blog · Sainskerta