OpenAI · ★★ · Frequent · Hard · RAG · Vector DB

O16 · LLM-powered Enterprise Search (RAG)

Verified source

Prompt: "Design an LLM-powered Enterprise Search System." — Glassdoor. Credibility C.

Two pipelines

Indexing pipeline: data ingest → clean/chunk → embed → vector store (+ keyword index).
Query pipeline: query → rewrite (optional) → hybrid retrieval → rerank → context assembly → LLM generation with citations.
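
The last two steps of the query pipeline (context assembly and generation with citations) can be sketched as a prompt builder. This is a minimal illustration under stated assumptions, not a reference implementation: `Chunk`, `build_prompt`, the `max_chars` budget, and the wording of the instruction are all hypothetical, and a real system would budget in tokens rather than characters.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """A retrieved passage plus the document it came from (hypothetical type)."""
    doc_id: str
    text: str


def build_prompt(query: str, chunks: list[Chunk], max_chars: int = 4000) -> str:
    """Number each reranked chunk so the model can cite it as [n]."""
    parts: list[str] = []
    used = 0
    for i, chunk in enumerate(chunks, start=1):
        entry = f"[{i}] (source: {chunk.doc_id})\n{chunk.text}"
        if used + len(entry) > max_chars:
            break  # stop once the (assumed) context budget is spent
        parts.append(entry)
        used += len(entry)
    context = "\n\n".join(parts)
    return (
        "Answer using only the numbered sources below. "
        "Cite sources as [n]. If the sources do not contain the answer, "
        "say you don't know.\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}\nAnswer:"
    )
```

Chunks are assumed to arrive in reranked order, so truncating at the budget drops the lowest-ranked passages first.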

Architecture

flowchart LR
  subgraph Indexing
    A[Docs / Connectors] --> B[Cleaner]
    B --> C[Chunker]
    C --> D[Embedder]
    D --> E[(Vector DB)]
    C --> F[(Keyword Index)]
  end
  subgraph Query
    Q[User Query] --> R[Query Rewrite]
    R --> HV[Vector Search] --> M[Merge]
    R --> HK[Keyword Search] --> M
    M --> RR[Reranker] --> LLM
    LLM --> ANS[Answer + Citations]
  end
  E --> HV
  F --> HK

Key design decisions

Chunk size is the single most important trade-off: chunks that are too small lose context, and chunks that are too large inject noise. A typical setting is 200-500 tokens with 10-20% overlap.
Hybrid retrieval: BM25 (sparse) + dense embeddings; merge the two ranked lists via reciprocal rank fusion (RRF).
Cross-encoder rerank of the top-K candidates (say K=50 → 10). Cost increases, but answer quality usually improves markedly.
HyDE (hypothetical document embeddings) to boost recall on rare queries.
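
Two of the decisions above, overlapping chunks and RRF merging, are small enough to sketch directly. The function names and defaults here are illustrative assumptions: `size=300` and `overlap=45` (15%) are picked from inside the ranges quoted above, tokens are assumed to be pre-split strings, and `k=60` is the smoothing constant conventionally used with RRF rather than anything this guide prescribes.

```python
from collections import defaultdict


def chunk_tokens(tokens: list[str], size: int = 300, overlap: int = 45) -> list[list[str]]:
    """Split a token list into fixed-size windows that share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in [0, size)")
    step = size - overlap
    chunks: list[list[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the document
    return chunks


def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank of d)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["a", "b", "c"]    # hypothetical keyword-search ranking
dense_hits = ["b", "d", "a"]   # hypothetical vector-search ranking
merged = rrf_merge([bm25_hits, dense_hits])
```

RRF uses only ranks, so BM25 scores and cosine similarities never need to be normalized onto a common scale, which is one reason it is a common default for hybrid merging.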

Evaluation

Retrieval: Recall@K, MRR, NDCG.
Generation: faithfulness (grounded in context), answer relevancy, citation accuracy.
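
The three retrieval metrics can be computed per query in a few lines. The sketch below assumes each ranked list contains unique document ids, that relevance labels come from a hand-built evaluation set, and that NDCG uses linear gain (some toolkits use an exponential gain instead); the function names are illustrative.

```python
import math


def recall_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant document, or 0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: list[tuple[list[str], set[str]]]) -> float:
    """MRR: average reciprocal rank over (ranking, relevant set) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def ndcg_at_k(ranked: list[str], gains: dict[str, float], k: int) -> float:
    """DCG of the ranking divided by the DCG of the ideal ordering."""
    dcg = sum(
        gains.get(doc_id, 0.0) / math.log2(i + 1)
        for i, doc_id in enumerate(ranked[:k], start=1)
    )
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 1) for i, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0
```

Recall@K checks whether the right chunks reach the reranker at all, while MRR and NDCG reward placing them near the top, so the first is most useful for the retrieval stage and the latter two for the reranked output.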

Follow-ups

Knowledge-base updates: incremental embed + delete; use soft-delete with tombstones.
Hallucination mitigation: constrain the model to "cite or say you don't know"; evaluate with a faithfulness metric.
Multi-modal docs: PDFs with tables/images → layout-aware parsers + separate embeddings for tables.
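
The soft-delete idea for knowledge-base updates can be illustrated with an in-memory metadata store. This is a sketch under assumptions, not any particular vector database's API: `ChunkRecord`, `ChunkStore`, and their methods are hypothetical, and a real deployment would apply the same tombstone flag as a metadata filter at query time and run compaction as a background job.

```python
from dataclasses import dataclass, field


@dataclass
class ChunkRecord:
    chunk_id: str
    doc_id: str
    version: int
    deleted: bool = False  # tombstone flag: hidden from search, not yet purged


@dataclass
class ChunkStore:
    records: dict[str, ChunkRecord] = field(default_factory=dict)

    def upsert_document(self, doc_id: str, version: int, chunk_ids: list[str]) -> None:
        """Tombstone chunks from older versions, then register the new ones."""
        for rec in self.records.values():
            if rec.doc_id == doc_id and rec.version < version:
                rec.deleted = True
        for chunk_id in chunk_ids:
            self.records[chunk_id] = ChunkRecord(chunk_id, doc_id, version)

    def delete_document(self, doc_id: str) -> None:
        """Soft-delete: mark every chunk of the document as tombstoned."""
        for rec in self.records.values():
            if rec.doc_id == doc_id:
                rec.deleted = True

    def live_ids(self) -> set[str]:
        """Chunk ids that retrieval is allowed to return."""
        return {cid for cid, rec in self.records.items() if not rec.deleted}

    def compact(self) -> int:
        """Physically remove tombstoned chunks; returns how many were purged."""
        dead = [cid for cid, rec in self.records.items() if rec.deleted]
        for cid in dead:
            del self.records[cid]
        return len(dead)
```

Tombstoning makes a deletion visible to queries immediately while deferring the expensive index rewrite, at the cost of filtering dead entries on every search until compaction runs.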
