xAI · ★★ · Frequent · Hard · Agent / Search / RAG

X2 · Design DeepSearch (Agentic Web Research)

Verified source

Grok's DeepSearch feature launched in 2025; the pattern is described in xAI blog posts and matches OpenAI's Deep Research and Perplexity's research patterns. Credibility: C.

Problem

DeepSearch takes a user question, decides whether it needs web research, plans sub-queries, fetches pages (often from X for fresh signal), synthesizes an answer, and cites sources. Design the pipeline to handle 1M queries/day within a 30-90 s latency budget.
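A quick back-of-envelope check makes the scale concrete. Assuming queries arrive roughly evenly and each one fans out to about 20 page fetches (both assumptions, not stated targets):

```python
# Illustrative capacity math for 1M queries/day with a 30-90 s latency budget.
QUERIES_PER_DAY = 1_000_000
AVG_LATENCY_S = 60          # midpoint of the 30-90 s budget
PAGES_PER_QUERY = 20        # assumed fetch fan-out per research query

avg_qps = QUERIES_PER_DAY / 86_400          # average queries per second
in_flight = avg_qps * AVG_LATENCY_S         # queries active at any instant (Little's law)
fetches_per_s = avg_qps * PAGES_PER_QUERY   # sustained page-fetch rate

print(f"{avg_qps:.1f} qps, {in_flight:.0f} in flight, {fetches_per_s:.0f} fetches/s")
```

So even at the average rate the system holds roughly 700 long-running research sessions open at once, and peak traffic will be a multiple of that.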

Architecture

flowchart TB
  U[User] --> P[Planner LLM]
  P --> Q{Need web?}
  Q -->|no| A[Direct answer]
  Q -->|yes| S[Sub-query generator]
  S --> F[Fetcher pool]
  F --> X[X firehose]
  F --> W[Web crawler]
  F --> C[Content cache]
  F --> SUM[Per-page summarizer]
  SUM --> SYN[Synthesizer LLM]
  SYN --> CITE[Citation formatter]
  CITE --> U
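The control flow in the diagram can be sketched as an async pipeline. All helpers here (`plan`, `fetch`, `summarize`, `synthesize`) are hypothetical stubs standing in for LLM and crawler calls, so the sketch runs end to end:

```python
import asyncio

async def plan(question: str) -> list[str]:
    # Planner LLM: returns sub-queries; an empty list means "no web needed".
    return [f"{question} site:x.com", f"{question} background"]

async def fetch(sub_query: str) -> str:
    # Fetcher pool: X firehose / web crawler / content cache, stubbed here.
    return f"<page for {sub_query!r}>"

async def summarize(page: str) -> str:
    # Per-page summarizer: cap each page's contribution to the final context.
    return page[:500]

async def synthesize(question: str, summaries: list[str]) -> str:
    # Synthesizer LLM + citation formatter, stubbed.
    return f"Answer to {question!r} from {len(summaries)} sources"

async def deepsearch(question: str) -> str:
    sub_queries = await plan(question)
    if not sub_queries:                       # "Need web?" -> no: answer directly
        return await synthesize(question, [])
    # Parallel fetch, then parallel per-page summarization before synthesis.
    pages = await asyncio.gather(*(fetch(q) for q in sub_queries))
    summaries = await asyncio.gather(*(summarize(p) for p in pages))
    return await synthesize(question, summaries)

result = asyncio.run(deepsearch("grok deepsearch launch"))
print(result)
```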

Key decisions

  • The planner is a separate, smaller LLM call; decoupling planning from synthesis reduces token cost.
  • A parallel fetch pool enforces per-domain rate limits and a polite User-Agent.
  • Each page is summarized before synthesis, keeping the final context window manageable (e.g. 20 pages × 500 tokens).
  • X-firehose integration is xAI's moat: fresh signal unavailable to OpenAI or Anthropic. Rank X posts by author credibility × recency.
  • Stream intermediate reasoning to the user so a 60 s latency feels interactive.
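The credibility × recency ranking from the list above can be made concrete with an exponential decay on post age. The half-life and score scale here are assumptions for illustration, not xAI's actual parameters:

```python
# Rank X posts by author credibility weighted by recency decay (illustrative).

def rank_score(credibility: float, age_hours: float, half_life_hours: float = 6.0) -> float:
    """credibility in [0, 1]; a post's weight halves every half_life_hours."""
    recency = 0.5 ** (age_hours / half_life_hours)
    return credibility * recency

posts = [
    {"author_cred": 0.9, "age_hours": 24.0},  # credible but a day old
    {"author_cred": 0.6, "age_hours": 1.0},   # less credible but fresh
]
ranked = sorted(posts, key=lambda p: rank_score(p["author_cred"], p["age_hours"]),
                reverse=True)
```

With a 6-hour half-life the fresh post outranks the stale one despite its lower author credibility, which matches the "fresh signal" emphasis above; tuning the half-life trades freshness against authority.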

Follow-ups

  • How do you handle hallucinated citations? (Verify each citation against the fetched content.)
  • How do you prevent prompt injection from fetched web pages?
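The citation check in the first follow-up can be sketched as: accept a citation only if its quoted snippet actually appears, after normalization, in the page it cites. Real systems would use fuzzy or embedding-based matching; exact substring matching is the simplest illustration, and the example page content is made up:

```python
# Verify citations against fetched content to catch hallucinated references.

def normalize(text: str) -> str:
    # Lowercase and collapse whitespace so cosmetic differences don't fail the check.
    return " ".join(text.lower().split())

def verify_citation(snippet: str, fetched_pages: dict[str, str], url: str) -> bool:
    page = fetched_pages.get(url)
    if page is None:
        return False                      # cites a page we never fetched
    return normalize(snippet) in normalize(page)

pages = {"https://example.com/a": "Grok's DeepSearch launched in 2025."}
ok = verify_citation("DeepSearch launched in 2025", pages, "https://example.com/a")
bad = verify_citation("launched in 2019", pages, "https://example.com/a")
```

Citations that fail the check can be dropped or re-fetched before the answer is shown, so the formatter only ever emits claims grounded in retrieved text.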

Related study-guide topics