OpenAI · ★★ · Frequent · Hard · RAG · Vector DB

O16 · LLM-powered Enterprise Search (RAG)

Verified source

Prompt: "Design an LLM-powered Enterprise Search System." — Glassdoor. Credibility C.

Two pipelines

  1. Indexing pipeline: ingest data → clean/chunk → embed → vector store (+ keyword index).
  2. Query pipeline: query → rewrite (optional) → hybrid retrieval → rerank → assemble context → LLM generates an answer with citations.
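
The chunk step of the indexing pipeline can be sketched as a sliding window with overlap. This is a minimal sketch, not a production chunker: `chunk_tokens` is a hypothetical helper, the default window and overlap sizes are assumptions, and whitespace splitting stands in for a real tokenizer.

```python
def chunk_tokens(tokens, chunk_size=300, overlap=45):
    """Split a token list into overlapping windows (sliding-window chunking)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks


# Tiny demo: whitespace "tokens", window of 4 with 1 token of overlap.
words = "a b c d e f g h i j".split()
demo_chunks = chunk_tokens(words, chunk_size=4, overlap=1)
for chunk in demo_chunks:
    print(chunk)
```

Each chunk repeats the last token of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when the overlap is large enough.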

Architecture

flowchart LR
  subgraph Indexing
    A[Docs / Connectors] --> B[Cleaner]
    B --> C[Chunker]
    C --> D[Embedder]
    D --> E[(Vector DB)]
    C --> F[(Keyword Index)]
  end
  subgraph Query
    Q[User Query] --> R[Query Rewrite]
    R --> HV[Vector Search] --> M[Merge]
    R --> HK[Keyword Search] --> M
    M --> RR[Reranker] --> LLM
    LLM --> ANS[Answer + Citations]
  end
  E --> HV
  F --> HK
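
The last hop in the diagram, from reranked chunks to an answer with citations, might be assembled roughly as follows. This is a sketch under assumptions: `build_prompt`, the chunk dictionary shape, the instruction wording, and the character budget (a stand-in for a token budget) are all hypothetical, and the sample chunks are made up for illustration.

```python
def build_prompt(question, chunks, max_chars=2000):
    """Number each chunk so the model can cite it as [n], within a size budget."""
    lines, used = [], 0
    for n, chunk in enumerate(chunks, start=1):
        entry = f"[{n}] ({chunk['source']}) {chunk['text']}"
        if used + len(entry) > max_chars:
            break  # stop adding chunks once the context budget is spent
        lines.append(entry)
        used += len(entry)
    context = "\n".join(lines)
    return (
        "Answer using only the numbered sources below. "
        "Cite sources as [n]. If the sources do not contain the answer, "
        "say you don't know.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


# Illustrative chunks, assumed to arrive already ordered by the reranker.
chunks = [
    {"source": "hr/leave-policy.md", "text": "Employees accrue 1.5 vacation days per month."},
    {"source": "hr/faq.md", "text": "Unused vacation days roll over for one year."},
]
prompt = build_prompt("How many vacation days do I accrue per month?", chunks)
print(prompt)
```

Numbering the sources in the prompt lets a post-processing step map each `[n]` in the answer back to a document, which is what makes citation accuracy measurable.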

Key design decisions

  • Chunk size is the single most important trade-off: chunks that are too small lose context, chunks that are too large inject noise. 200-500 tokens with 10-20% overlap is typical.
  • Hybrid retrieval: BM25 (sparse) + dense embeddings; merge the two ranked lists via reciprocal rank fusion (RRF).
  • Cross-encoder rerank of the top-K candidates (say, 50 candidates → best 10). Latency and compute cost increase, but answer quality usually improves noticeably.
  • HyDE (hypothetical document embeddings): have the LLM draft a hypothetical answer and embed that instead of the raw query, which boosts recall on rare queries.
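
The merge step in hybrid retrieval can be sketched with reciprocal rank fusion, where each document scores the sum of 1 / (k + rank) over the lists it appears in. A minimal sketch: `reciprocal_rank_fusion` is a hypothetical helper, the document ids are made up, and k=60 is a commonly used default rather than something this design prescribes.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc ids: score(d) = sum of 1 / (k + rank_of_d)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["d1", "d2", "d3"]   # sparse (keyword) ranking, best first
dense_hits = ["d3", "d1", "d4"]  # dense (embedding) ranking, best first
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)
```

RRF uses only ranks, not raw scores, so BM25 scores and cosine similarities never need to be normalized onto a common scale. The fused list would then be truncated to the top-K and passed to the reranker.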

Evaluation

  • Retrieval: Recall@K, MRR, NDCG.
  • Generation: faithfulness (is the answer grounded in the retrieved context?), answer relevancy, citation accuracy.
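
The retrieval metrics can be computed from a ranked list and a set of relevant ids. This sketch assumes binary relevance labels and no duplicate ids in the ranking; the function names and sample ids are hypothetical. MRR is the mean of `reciprocal_rank` over all evaluation queries.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant docs that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for d in ranked[:k] if d in relevant)
    return hits / len(relevant)


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant result, or 0 if none is retrieved."""
    for rank, d in enumerate(ranked, start=1):
        if d in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG@k: DCG of this ranking over the ideal DCG."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, d in enumerate(ranked[:k], start=1) if d in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


ranked = ["d7", "d2", "d9", "d4"]   # system output for one query, best first
relevant = {"d2", "d4", "d5"}       # labeled relevant docs for that query
print(recall_at_k(ranked, relevant, k=3))
print(reciprocal_rank(ranked, relevant))
print(ndcg_at_k(ranked, relevant, k=4))
```

Recall@K asks whether the right chunks reach the reranker at all, while MRR and NDCG reward placing them near the top.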

Follow-ups

  • Knowledge-base updates: re-embed only new or changed documents and delete stale chunks; use soft-delete with tombstones so removed chunks are filtered out at query time and purged later.
  • Hallucination mitigation: constrain the model to "cite or say you don't know"; evaluate with a faithfulness metric.
  • Multi-modal docs: for PDFs with tables or images, use layout-aware parsers and embed tables separately.
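
Soft-delete with tombstones, from the knowledge-base update follow-up, can be illustrated with a toy in-memory index. This is a sketch under assumptions: `ChunkIndex`, its methods, and the versioned chunk-id scheme are hypothetical, and a real vector database would expose its own delete and compaction mechanisms.

```python
def dot(x, y):
    """Dot product, used here as a stand-in similarity score."""
    return sum(a * b for a, b in zip(x, y))


class ChunkIndex:
    """Toy index: deletes only mark chunks; compact() removes them later."""

    def __init__(self):
        self._chunks = {}         # chunk_id -> (doc_id, vector)
        self._tombstones = set()  # chunk ids hidden from search
        self._versions = {}       # doc_id -> latest version number

    def delete_document(self, doc_id):
        """Soft-delete: tombstone every chunk that belongs to doc_id."""
        for chunk_id, (owner, _) in self._chunks.items():
            if owner == doc_id:
                self._tombstones.add(chunk_id)

    def upsert_document(self, doc_id, vectors):
        """Incremental update: hide the old chunks, then add the new version."""
        self.delete_document(doc_id)
        version = self._versions.get(doc_id, 0) + 1
        self._versions[doc_id] = version
        for i, vec in enumerate(vectors):
            # Versioned ids keep new chunks from colliding with tombstoned ones.
            self._chunks[f"{doc_id}@v{version}#{i}"] = (doc_id, vec)

    def search(self, query, top_k=3):
        """Rank live (non-tombstoned) chunks by similarity to the query."""
        live = [(cid, vec) for cid, (_, vec) in self._chunks.items()
                if cid not in self._tombstones]
        live.sort(key=lambda item: -dot(query, item[1]))
        return [cid for cid, _ in live[:top_k]]

    def compact(self):
        """Physically remove tombstoned chunks; returns how many were purged."""
        removed = len(self._tombstones)
        for cid in self._tombstones:
            self._chunks.pop(cid, None)
        self._tombstones.clear()
        return removed


index = ChunkIndex()
index.upsert_document("handbook", [[1.0, 0.0], [0.0, 1.0]])
index.upsert_document("handbook", [[0.9, 0.1]])  # the document was edited
hits = index.search([1.0, 0.0])
removed = index.compact()
print(hits, removed)
```

Filtering at query time makes a delete visible immediately, while the expensive physical cleanup can run in the background.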

Related study-guide topics