O16 · LLM-powered Enterprise Search (RAG)
Verified source
Prompt: "Design an LLM-powered Enterprise Search System." — Glassdoor. Credibility C.
Two pipelines
- Indexing pipeline: data ingest → clean/chunk → embed → vector store (+ keyword index).
- Query pipeline: query → rewrite (optional) → hybrid retrieval → rerank → context assembly → LLM generation with citations.
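The chunk step of the indexing pipeline can be sketched as a sliding window over tokens. This is a minimal illustration, not a production chunker: whitespace splitting stands in for a real tokenizer, and the function name `chunk_tokens` and its defaults (300 tokens, 45-token overlap, i.e. 15%) are assumptions chosen to sit inside the typical range these notes cite.

```python
from typing import List


def chunk_tokens(tokens: List[str], chunk_size: int = 300, overlap: int = 45) -> List[List[str]]:
    """Split tokens into windows of `chunk_size` that share `overlap` tokens.

    Hypothetical helper: the defaults are illustrative, not prescribed by the notes.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the current window has reached the end of the document,
        # so we do not emit a trailing chunk made only of overlap tokens.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Usage: whitespace split is a stand-in for a model-specific tokenizer.
text = "t0 t1 t2 t3 t4 t5 t6 t7 t8 t9"
small_chunks = chunk_tokens(text.split(), chunk_size=4, overlap=1)
```

With `chunk_size=4` and `overlap=1`, each chunk starts with the last token of the previous one, which is the property that keeps a sentence split across a boundary retrievable from either side.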
Architecture
flowchart LR
subgraph Indexing
A[Docs / Connectors] --> B[Cleaner]
B --> C[Chunker]
C --> D[Embedder]
D --> E[(Vector DB)]
C --> F[(Keyword Index)]
end
subgraph Query
Q[User Query] --> R[Query Rewrite]
R --> HV[Vector Search] --> M[Merge]
R --> HK[Keyword Search] --> M
M --> RR[Reranker] --> LLM
LLM --> ANS[Answer + Citations]
end
E --> HV
F --> HK
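The last hop of the query path (reranked chunks into the LLM, then an answer with citations) mostly comes down to how the context is assembled into a prompt. The sketch below is one possible layout, assuming chunks arrive as `(source_id, text)` pairs; the function name `build_grounded_prompt`, the source IDs, and the exact instruction wording are all hypothetical.

```python
from typing import List, Tuple


def build_grounded_prompt(question: str, chunks: List[Tuple[str, str]]) -> str:
    """Assemble a prompt that numbers each retrieved chunk so the model can cite it.

    `chunks` is assumed to be already reranked, best first.
    """
    lines = [
        "Answer using only the sources below.",
        "Cite the source number in square brackets after each claim.",
        "If the sources do not contain the answer, say you don't know.",
        "",
        "Sources:",
    ]
    for n, (source_id, text) in enumerate(chunks, start=1):
        # The bracketed number is what the model cites; source_id maps it back to a document.
        lines.append(f"[{n}] ({source_id}) {text}")
    lines += ["", f"Question: {question}", "Answer:"]
    return "\n".join(lines)


# Usage with made-up source IDs and text.
prompt = build_grounded_prompt(
    "How do I request VPN access?",
    [("wiki/vpn", "VPN access is requested through the IT portal."),
     ("wiki/onboarding", "New hires receive laptops on day one.")],
)
```

Numbering the sources in the prompt is what makes citation accuracy checkable later: each bracketed number in the answer can be resolved back to a chunk and compared against the claim it supports.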
Key design decisions
- Chunk size is the single most important trade-off: chunks that are too small lose context, chunks that are too large inject noise. A typical starting point is 200-500 tokens per chunk with 10-20% overlap.
- Hybrid retrieval: BM25 (sparse) + dense embeddings; merge the two result lists via reciprocal rank fusion (RRF).
- Cross-encoder rerank of the merged top-K, keeping only the best few (say, 50 candidates → 10). This adds latency and compute cost, but answer quality usually improves noticeably.
- HyDE (hypothetical document embeddings): have the LLM draft a hypothetical answer and embed that instead of the raw query, which can boost recall on rare queries.
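The merge step in hybrid retrieval can be sketched with reciprocal rank fusion, where each document scores the sum of 1 / (k + rank) over the lists it appears in. This is a minimal sketch: the function name is illustrative, `k = 60` is the constant commonly used with RRF rather than something these notes specify, and ties are broken by document ID only to keep the output deterministic.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one list, best first."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Only the rank matters, so BM25 scores and cosine similarities
            # never need to be put on a common scale.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending fused score, then by ID for a stable order on ties.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Usage with made-up document IDs.
bm25_hits = ["a", "b", "c"]
dense_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
# fused == ["b", "a", "d", "c"]
```

Documents found by both retrievers ("a" and "b") rise above those found by only one, which is the behavior that makes RRF a reasonable default before the cross-encoder reranks the top candidates.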
Evaluation
- Retrieval: Recall@K, MRR, NDCG.
- Generation: Faithfulness (grounded in context), answer relevancy, citation accuracy.
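The retrieval metrics above are small enough to compute by hand. The sketch below assumes binary relevance labels (a document is either relevant or not) and uses hypothetical document IDs; graded relevance would need a gain term in the NDCG sum.

```python
import math
from typing import List, Set


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked: List[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant document, or 0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Binary-relevance NDCG: discounted gain divided by the best achievable."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked[:k], start=1)
        if doc_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


def mean_reciprocal_rank(runs: List[List[str]], labels: List[Set[str]]) -> float:
    """MRR is the reciprocal rank averaged over queries."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, l) for r, l in zip(runs, labels)) / len(runs)


# Usage: one query, four retrieved documents, three labeled relevant.
ranked = ["d1", "d2", "d3", "d4"]
relevant = {"d2", "d4", "d9"}
recall = recall_at_k(ranked, relevant, k=3)   # only d2 is in the top 3
rr = reciprocal_rank(ranked, relevant)        # first hit is at rank 2
ndcg = ndcg_at_k(ranked, relevant, k=4)
```

Recall@K measures coverage, MRR rewards putting the first relevant hit early, and NDCG accounts for the position of every hit, so reporting all three gives a fuller picture than any single number.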
Follow-ups
- Knowledge-base updates: incrementally re-embed changed documents and delete stale chunks; use soft-delete with tombstones so removed chunks are filtered out at query time and purged later.
- Hallucination mitigation: constrain the model to "cite or say you don't know"; evaluate with a faithfulness metric.
- Multi-modal docs: PDFs with tables/images → layout-aware parsers + separate embedding for tables.
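The soft-delete idea from the knowledge-base update follow-up can be sketched with an in-memory index. Everything here is an illustrative assumption: the class `TombstoneIndex`, its method names, the chunk ID scheme, and the substring match that stands in for real vector or keyword search.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class TombstoneIndex:
    """Toy chunk store: deletes are marked first and physically removed later."""
    chunks: Dict[str, str] = field(default_factory=dict)               # chunk_id -> text
    doc_to_chunks: Dict[str, List[str]] = field(default_factory=dict)  # doc_id -> live chunk ids
    tombstones: Set[str] = field(default_factory=set)                  # soft-deleted chunk ids

    def delete_document(self, doc_id: str) -> None:
        # Soft delete: mark the chunks, leave the stored data untouched.
        self.tombstones.update(self.doc_to_chunks.pop(doc_id, []))

    def upsert_document(self, doc_id: str, version: int, chunk_texts: List[str]) -> None:
        # Incremental update: retire the old chunks, then add the new version.
        self.delete_document(doc_id)
        new_ids = [f"{doc_id}:v{version}:{i}" for i in range(len(chunk_texts))]
        for chunk_id, text in zip(new_ids, chunk_texts):
            self.chunks[chunk_id] = text
            self.tombstones.discard(chunk_id)  # re-used ids must become visible again
        self.doc_to_chunks[doc_id] = new_ids

    def search(self, term: str) -> List[str]:
        # Substring match is a stand-in for retrieval; tombstoned chunks are filtered out.
        return [
            chunk_id for chunk_id, text in self.chunks.items()
            if chunk_id not in self.tombstones and term.lower() in text.lower()
        ]

    def compact(self) -> int:
        # Background cleanup: physically remove tombstoned chunks.
        removed = sum(1 for cid in self.tombstones if self.chunks.pop(cid, None) is not None)
        self.tombstones.clear()
        return removed


# Usage with a made-up document.
index = TombstoneIndex()
index.upsert_document("hr-policy", 1, ["Vacation is 20 days", "Sick leave rules"])
index.upsert_document("hr-policy", 2, ["Vacation is 25 days"])
hits = index.search("vacation")
```

Filtering at query time makes a delete visible immediately, while the expensive physical removal can be batched in `compact()` during quiet periods.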