RAG Implementation

Build a production-grade Retrieval-Augmented Generation pipeline for Turkish documents.

15 min read

Getting Started Prompt Engineering RAG Fine-Tuning AI Agents Evaluation Deployment

Chunk your documents

Use recursive character splitting at 512 tokens with 10% overlap. For Turkish: avoid splitting mid-sentence — the language has long compound words that span chunk boundaries.

Choose an embedding model

For Turkish: multilingual-e5-large outperforms text-embedding-3-small on Turkish retrieval benchmarks. Self-host via Sentence-Transformers for cost control.

Vector store setup

Use pgvector on your Basefyio (Basefyio) instance — no additional infra. Enable HNSW index for <50ms retrieval at 1M vectors.

CREATE INDEX ON embeddings USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

Hybrid search

Combine dense (vector) and sparse (BM25/tsvector) retrieval. For Turkish, FTS with "turkish" dictionary handles morphological variants — büyümek matches büyüme, büyüdü, etc.

Reranking

Cross-encoder reranking (bge-reranker-v2-m3) reduces hallucination by surfacing the most relevant chunks. Run reranker on top-20, pass top-5 to LLM.