X2 · Design DeepSearch (Agentic Web Research)
Verified source
Grok's DeepSearch feature launched in 2025; the pattern is described in xAI blog posts and matches OpenAI's Deep Research and Perplexity's research patterns. Credibility C.
Problem
DeepSearch takes a user question, decides whether it needs web research, plans sub-queries, fetches pages (often from X for fresh signal), synthesizes an answer, and cites sources. Design the pipeline to handle 1M queries/day within a 30-90s latency budget.
Architecture
flowchart TB
U[User] --> P[Planner LLM]
P --> Q{Need web?}
Q -->|no| A[Direct answer]
Q -->|yes| S[Sub-query generator]
S --> F[Fetcher pool]
F --> X[X firehose]
F --> W[Web crawler]
F --> C[Content cache]
F --> SUM[Per-page summarizer]
SUM --> SYN[Synthesizer LLM]
SYN --> CITE[Citation formatter]
CITE --> U
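The flow above can be sketched as an async pipeline. This is a minimal control-flow skeleton, not xAI's implementation: `planner`, `fetch`, `summarize`, and `synthesize` are hypothetical stubs standing in for the LLM and fetcher-pool calls.

```python
import asyncio

# Stubs standing in for the real components; names are placeholders.
async def planner(q):           # small, cheap LLM call in production
    return {"need_web": True, "sub_queries": [q + " site:x.com", q]}

async def fetch(sub_query):     # fetcher pool + content cache in production
    return [f"page for {sub_query!r}"]

async def summarize(page):      # per-page summarizer LLM
    return f"summary of {page}"

async def synthesize(question, summaries):  # single large synthesis call
    return f"answer to {question!r} from {len(summaries)} summaries"

async def deepsearch(question: str) -> str:
    plan = await planner(question)
    if not plan["need_web"]:
        return await synthesize(question, [])   # direct-answer path
    # Fan out: fetch all sub-queries in parallel, then summarize each page
    # so the final synthesis sees compact per-page summaries, not raw HTML.
    page_lists = await asyncio.gather(*(fetch(q) for q in plan["sub_queries"]))
    pages = [p for batch in page_lists for p in batch]
    summaries = await asyncio.gather(*(summarize(p) for p in pages))
    return await synthesize(question, summaries)

print(asyncio.run(deepsearch("who won the race")))
```

The two `asyncio.gather` fan-outs are where most of the 30-90s budget is spent, which is why fetching and summarization parallelize while planning and synthesis stay sequential.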
Key decisions
- Planner is a separate, smaller LLM call, decoupling planning from synthesis to reduce token cost.
- Parallel fetch pool with per-domain rate limits and a polite User-Agent.
- Per-page summarization before synthesis keeps the final context window manageable (e.g. 20 pages × 500 tokens ≈ 10k tokens).
- X-firehose integration is xAI's moat: fresh signal unavailable to OpenAI/Anthropic. Rank X posts by author credibility × recency.
- Stream intermediate reasoning to the user so a 60s latency feels interactive.
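The per-domain rate limiting mentioned above can be enforced with one lock and timestamp per domain, so slow domains never block fetches to fast ones. A minimal sketch (the `per_sec` rate and the bot User-Agent string are illustrative assumptions):

```python
import asyncio
import time
from collections import defaultdict
from urllib.parse import urlparse

# Placeholder polite identification; a real crawler would link a bot-info page.
POLITE_HEADERS = {"User-Agent": "DeepSearchBot/1.0 (+https://example.com/bot)"}

class DomainRateLimiter:
    """Allow at most `per_sec` requests per domain; domains are independent."""

    def __init__(self, per_sec: float):
        self.min_interval = 1.0 / per_sec
        self.locks = defaultdict(asyncio.Lock)   # one lock per domain
        self.last = defaultdict(float)           # last request time per domain

    async def acquire(self, url: str) -> None:
        domain = urlparse(url).netloc
        async with self.locks[domain]:
            # Sleep just long enough to honor the per-domain spacing.
            wait = self.last[domain] + self.min_interval - time.monotonic()
            if wait > 0:
                await asyncio.sleep(wait)
            self.last[domain] = time.monotonic()
```

Each fetcher task calls `await limiter.acquire(url)` before issuing the HTTP request; concurrency across domains stays unbounded while any single domain is throttled.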
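The "credibility × recency" ranking of X posts can be made concrete as a credibility score multiplied by an exponential recency decay. This is a sketch under assumptions: the half-life value and the scoring form are illustrative, not a published xAI formula.

```python
def rank_score(credibility: float, age_hours: float,
               half_life_hours: float = 6.0) -> float:
    """Score an X post: author credibility in [0, 1] times a recency decay
    that halves every `half_life_hours` (illustrative knob, not xAI's value)."""
    return credibility * 0.5 ** (age_hours / half_life_hours)
```

With a 6-hour half-life, a highly credible author's day-old post (score ≈ 0.06 for credibility 1.0) ranks below a moderately credible author's fresh post (score 0.5), which matches the intent of favoring fresh signal.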
Follow-ups
- How do you handle hallucinated citations? (Verify each citation against the fetched content.)
- How do you prevent prompt injection from fetched web pages?
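The citation check suggested in the first follow-up can be as cheap as a normalized substring match: a citation is kept only if its quoted text actually appears in the page it points at. A minimal sketch (function names are hypothetical; production systems would likely add fuzzy matching):

```python
import re

def verify_citation(quote: str, page_text: str) -> bool:
    """True if the cited quote appears in the fetched page after collapsing
    whitespace and case; a cheap guard against hallucinated citations."""
    norm = lambda s: re.sub(r"\s+", " ", s).strip().lower()
    return norm(quote) in norm(page_text)

def filter_citations(citations, pages):
    # citations: list of (quote, url); pages: dict mapping url -> fetched text.
    # Drop any citation whose quote cannot be grounded in its source page.
    return [(q, u) for q, u in citations
            if verify_citation(q, pages.get(u, ""))]
```

Running this between the synthesizer and the citation formatter means an ungrounded citation is dropped (or sent back for regeneration) before the user ever sees it.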