O8 · Search/Recommendation System with LLMs
Verified source
Recruiter prompt: "We'll explore your experience with search, ranking, retrieval, and how to adapt LLMs to interact with such systems." — TeamBlind, 2025-10-22. Credibility C.
Split into two main chains
- Online retrieval + ranking: query → recall (keyword + vector) → rerank → result page.
- LLM insertion points: query understanding/rewrite, result summarization/conversational explanation, tool use to trigger secondary retrieval.
Reference architecture
flowchart LR
  U[User] --> QP[Query Parser]
  QP --> KR["Keyword Search (BM25)"]
  QP --> VR["Vector Search (ANN)"]
  KR --> M[Merge]
  VR --> M
  M --> RR["Reranker (Cross-Encoder)"]
  RR --> LLM[LLM Layer]
  LLM --> U
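The Merge → Reranker stage in the diagram can be sketched as follows. This is a minimal illustration, assuming each recall path returns an ordered list of doc IDs; Reciprocal Rank Fusion (RRF) is one common merge choice, and `score_fn` is a hypothetical stand-in for a real cross-encoder.

```python
def rrf_merge(ranked_lists, k=60):
    """Fuse several ranked candidate lists with Reciprocal Rank Fusion."""
    scores = {}
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

def rerank(query, doc_ids, score_fn, top_n=10):
    """Re-order merged candidates with a (hypothetical) cross-encoder score."""
    return sorted(doc_ids, key=lambda d: score_fn(query, d), reverse=True)[:top_n]

keyword_hits = ["d3", "d1", "d7"]   # BM25 path
vector_hits = ["d1", "d9", "d3"]    # ANN path
merged = rrf_merge([keyword_hits, vector_hits])
# d1 and d3 appear in both lists, so fusion lifts them to the top.
```

RRF is attractive here because it needs no score calibration between the BM25 and ANN paths, only ranks.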
Key trade-offs
- LLM in recall (query rewrite): cheaper downstream, but risks recall bias.
- LLM in rerank / summarization: safer, but more expensive per query.
- Index updates: real-time writes sacrifice throughput; batch updates sacrifice freshness.
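One way to hedge the recall-bias risk of LLM query rewriting is to retrieve with both the original and the rewritten query and interleave the candidates, so a bad rewrite cannot drop results the original query would have found. A minimal sketch, where `rewrite_fn` and `retrieve_fn` are hypothetical stand-ins for an LLM call and a search backend:

```python
def recall_with_rewrite(query, rewrite_fn, retrieve_fn, limit=20):
    """Recall on both the original and LLM-rewritten query, then interleave."""
    rewritten = rewrite_fn(query)
    a, b = retrieve_fn(query), retrieve_fn(rewritten)
    seen, merged = set(), []
    # Pad both lists so zip covers the longer one; interleave so neither
    # query's results dominate the head of the merged list.
    for pair in zip(a + [None] * len(b), b + [None] * len(a)):
        for doc_id in pair:
            if doc_id is not None and doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged[:limit]
```

This keeps the rewrite's upside (better matching for ambiguous queries) while bounding its downside to half the interleaved slots.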
Evaluation (dual-track)
- IR metrics: Recall@K, NDCG@10, MRR.
- Generation quality: faithfulness, helpfulness, LLM-as-judge + human pairwise comparison.
- Online: A/B tests on clicks, dwell time, and satisfaction proxies.
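The three offline IR metrics above are small enough to implement directly. A minimal sketch assuming binary relevance, where `ranked` is the system's ordered doc IDs and `relevant` the set of gold IDs:

```python
import math

def recall_at_k(ranked, relevant, k):
    """Fraction of relevant docs found in the top k."""
    return len(set(ranked[:k]) & relevant) / len(relevant)

def mrr(ranked, relevant):
    """Reciprocal rank of the first relevant hit (0 if none)."""
    for i, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / i
    return 0.0

def ndcg_at_k(ranked, relevant, k=10):
    """NDCG@k with binary gains: DCG over the ideal DCG."""
    dcg = sum(1.0 / math.log2(i + 1)
              for i, d in enumerate(ranked[:k], start=1) if d in relevant)
    ideal = sum(1.0 / math.log2(i + 1)
                for i in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal else 0.0
```

With binary gains NDCG rewards putting relevant docs early, while Recall@K ignores order within the cutoff; reporting both separates "did we find it" from "did we rank it well".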
Safety lens
If safety comes up, reference Anthropic's Constitutional AI as a structured way to layer policy into the LLM step. A moderation pipeline on both the query and the answer is standard.
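The both-ends moderation pattern can be sketched as a thin wrapper. `moderate` and `answer` are hypothetical stand-ins for a moderation classifier and the LLM layer; in a real system the policy behind `moderate` could be derived from a constitution-style rule set.

```python
REFUSAL = "Sorry, I can't help with that request."

def guarded_answer(query, moderate, answer):
    """Run moderation on the input query and again on the generated answer."""
    if not moderate(query):      # input-side check: block disallowed queries
        return REFUSAL
    draft = answer(query)
    if not moderate(draft):      # output-side check: catch unsafe generations
        return REFUSAL
    return draft
```

Checking the output as well as the input matters because a benign query can still elicit an unsafe generation once retrieval context is mixed in.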