真题 Arena (Real-Question Arena)
100 verified System Design questions from OpenAI, Anthropic, Google, and xAI interviews, collected from LeetCode, Blind, Exponent, PracHub, Glassdoor, Jointaro, GitHub, Xiaohongshu (小红书), and company engineering blogs. Every question links to its primary source. Click any card to open the deep solution page with architecture diagrams, API design, data model, trade-offs, and expected follow-ups.
Billions of events/day, 24h retry window, per-endpoint ordering, idempotency, DLQ, multi-tenant isolation, cost controls.
REST resource semantics, cache invalidation, DB schema, queue retry semantics. Cache-aside vs write-through trade-offs made explicit.
An external dependency (ServiceB) for URL lookup plus a 24h retry window forces you to design a robust state machine, a TTL'd config cache, and a circuit breaker; the question steers toward state machines, external consistency, and reliable retries.
Real-time chat, channels, presence, offline delivery, fan-out strategy. The "2-week MVP" framing is a scope-discipline trap.
Multi-tenant workflow engine, runner pool, log/artifact storage, lease-based scheduling; "reliability first" per the interviewer.
Builds on O5 with productization: YAML config parsing/versioning, event integration, secret management, permission tokens, auditing.
The minimum distributed-workflow subset: state-driven scheduling via CDC, a step table, lease-based serialization.
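Lease-based serialization of steps can be sketched in a few lines. This is a minimal in-memory illustration with hypothetical names (`LeaseTable`, `acquire`); in a real system the compare-and-set below would be a conditional UPDATE on the step table, and expiry would use the database clock.

```python
import time

class LeaseTable:
    """In-memory sketch of lease-based serialization: at most one worker
    may hold the lease for a workflow step at a time."""

    def __init__(self):
        self._leases = {}  # step_id -> (owner, expires_at)

    def acquire(self, step_id, owner, ttl_s=30.0, now=None):
        """Grant the lease if it is free, expired, or already ours (renewal)."""
        now = time.monotonic() if now is None else now
        holder = self._leases.get(step_id)
        if holder is not None and holder[1] > now and holder[0] != owner:
            return False  # a live lease is held by someone else
        self._leases[step_id] = (owner, now + ttl_s)
        return True

    def release(self, step_id, owner):
        if self._leases.get(step_id, (None,))[0] == owner:
            del self._leases[step_id]
```

A worker that crashes simply lets its lease expire, after which another worker can take the step over.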
Hybrid retrieval (BM25 + vector), a reranker, where to insert the LLM (query rewrite vs summarize), offline/online eval.
SET/GET/DEL with TTL plus range scans; evolve to WAL, snapshots, sharding, replication. Follow-ups add GROUP BY and ORDER BY.
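A plausible single-node starting point for this question, before the WAL/sharding evolution, is a sorted key list plus a dict. All names here (`MiniKV`, `scan`) are illustrative, not from any specific store.

```python
import bisect
import time

class MiniKV:
    """Toy KV store: SET/GET/DEL with per-key TTL plus ordered range scans.
    WAL, snapshots, sharding, and replication are the follow-up steps."""

    def __init__(self):
        self._keys = []   # sorted key list, enables range scans
        self._data = {}   # key -> (value, expires_at or None)

    def set(self, key, value, ttl_s=None, now=None):
        now = time.monotonic() if now is None else now
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = (value, None if ttl_s is None else now + ttl_s)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        item = self._data.get(key)
        if item is None or (item[1] is not None and item[1] <= now):
            return None  # missing or expired
        return item[0]

    def delete(self, key):
        if self._data.pop(key, None) is not None:
            self._keys.remove(key)

    def scan(self, lo, hi, now=None):
        """Yield (key, value) for keys in [lo, hi), skipping expired entries."""
        i = bisect.bisect_left(self._keys, lo)
        j = bisect.bisect_left(self._keys, hi)
        for k in self._keys[i:j]:
            v = self.get(k, now=now)
            if v is not None:
                yield k, v
```

GROUP BY / ORDER BY follow-ups then become questions about what index the scan is driven by.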
URL frontier, politeness scheduler, robots.txt cache, Bloom-filter dedup, URL canonicalization, content-hash for idempotent writes.
Frontend wireframe, API layer, thread/message history schema, prompt versioning, streaming output, multi-model selector.
End-to-end scaling: session persistence, GPU fleet management, regional routing, conversation storage, usage metering.
Data-collection pipeline, model choice (rules vs classifier vs LLM-as-judge), latency budget, feedback loop, red-team flywheel.
Credit calculator with expiration, FIFO consume-oldest semantics, double-spend prevention, per-tenant fair queuing.
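The consume-oldest semantics can be sketched with a min-heap of grants. This is an assumption-laden toy (`CreditLedger` is a made-up name, and "oldest" is approximated as "soonest to expire"); a production version would make each deduction a serialized, idempotent transaction to rule out double-spend.

```python
import heapq

class CreditLedger:
    """Expiring-credit sketch: grants sit in a min-heap keyed by expiry,
    and a spend drains the soonest-to-expire grants first."""

    def __init__(self):
        self._grants = []  # min-heap of (expires_at, amount)

    def grant(self, amount, expires_at):
        heapq.heappush(self._grants, (expires_at, amount))

    def balance(self, now):
        return sum(a for exp, a in self._grants if exp > now)

    def spend(self, amount, now):
        while self._grants and self._grants[0][0] <= now:
            heapq.heappop(self._grants)  # drop expired grants lazily
        if self.balance(now) < amount:
            return False  # insufficient live credit, reject atomically
        while amount > 0:
            exp, avail = heapq.heappop(self._grants)
            take = min(avail, amount)
            amount -= take
            if avail > take:  # partially consumed grant goes back
                heapq.heappush(self._grants, (exp, avail - take))
        return True
```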
SSE/WebSocket, low TTFT, in-stream moderation pipeline, backpressure, reconnect with cursor, server-to-client event model.
RAG pipeline: ingestion → chunking → embedding → vector DB → hybrid retrieval → rerank → LLM generation with citations. Hallucination mitigation.
Token bucket vs leaky bucket vs sliding window. Token-based billing for LLM APIs. Distributed coordination via Redis.
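The token-bucket variant is worth being able to write on the spot. For LLM APIs, the cost of a request is its token count rather than 1, so the sketch below takes a `cost` argument; in the Redis version this state lives in a hash and the refill-and-deduct step runs as one atomic Lua script.

```python
class TokenBucket:
    """Token-bucket limiter: refill at rate_per_s up to burst, then deduct
    the request's cost (token count for LLM APIs) if available."""

    def __init__(self, rate_per_s, burst):
        self.rate, self.burst = rate_per_s, burst
        self.tokens = float(burst)  # start full
        self.last = 0.0             # time of last refill

    def allow(self, now, cost=1.0):
        # Lazy refill based on elapsed time since the last call.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Leaky bucket smooths output instead of admitting bursts; sliding window trades memory for more accurate per-window counts.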
Store and search billions of embeddings. ANN algorithms (HNSW, IVF-PQ), sharding, hybrid filter queries, ingestion pipeline.
Orchestrate training across thousands of GPUs: DP/TP/PP, ZeRO, checkpointing, fault tolerance, job scheduler, bandwidth-aware routing.
One of OpenAI's five SD-pool classics. Base62 vs hash-of-URL, redirect-path cache, write/read ratio, analytics, custom aliases, expiration.
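The Base62 side of that trade-off (counter-based IDs, as opposed to hashing the URL) fits in a few lines:

```python
import string

# 0-9, a-z, A-Z: 62 symbols, URL-safe without escaping.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def encode62(n: int) -> str:
    """Encode a sequence ID as a short Base62 slug."""
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, r = divmod(n, 62)
        out.append(ALPHABET[r])
    return "".join(reversed(out))

def decode62(s: str) -> int:
    """Invert encode62 so the redirect path can recover the row ID."""
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

Seven Base62 characters cover 62^7 ≈ 3.5 trillion IDs; the usual follow-up is how to issue the counter without a single point of contention (ranges per shard, for instance).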
Narrower than Slack (O4): single/multi chat rooms, WebSocket + pub/sub, presence, message ordering, history pagination. No DMs, workspaces, or threads.
Language-runtime flavor: lexer → parser → AST → tree-walking evaluator. Sandboxing, memory/time limits, stdlib surface area, REPL vs script mode, error UX.
A "Strong Hire" real question. Four layers: ingestion/chunking → retriever → evaluator → generator. Traceable citations, per-document ACLs, P95 < 2s, EN/ZH bilingual, online learning from feedback.
Streaming tokens, prefill vs decode split, KV cache, continuous batching, tail-latency control, GPU memory management. The canonical Anthropic question.
Flush policies (size/age/length spread), head-of-line-blocking mitigation, admission control, observability, overload handling.
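Two of those flush triggers, batch size and oldest-request age, can be shown in a toy batcher (names like `BatchFlusher` are illustrative); a length-spread trigger would be a third check over the pending sequence lengths.

```python
class BatchFlusher:
    """Request batcher sketch: flush when the batch is full or when the
    oldest pending request has waited longer than max_age_s."""

    def __init__(self, max_size=8, max_age_s=0.05):
        self.max_size, self.max_age = max_size, max_age_s
        self.pending = []  # list of (arrival_time, request)

    def add(self, req, now):
        self.pending.append((now, req))
        return self.maybe_flush(now)

    def maybe_flush(self, now):
        """Return the flushed batch, or None if no trigger fired."""
        if not self.pending:
            return None
        full = len(self.pending) >= self.max_size
        stale = now - self.pending[0][0] >= self.max_age
        if full or stale:
            batch = [r for _, r in self.pending]
            self.pending = []
            return batch
        return None
```

The age trigger bounds tail latency under light load; the size trigger bounds GPU underutilization under heavy load.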
Priority queues, credit-based fairness (WFQ/DRR), a result cache for determinism (temp=0), heterogeneous hardware pools.
POST a job and poll for results; idempotency, partial batch failures, result pagination, cost-optimized off-peak scheduling.
Control-plane/data-plane split, model registry, canary + rollback, A/B routing, autoscaling, warm/cold tiers.
SLOs (p95, availability, QPS), online feature store, rollout/canary, drift detection, degradation strategies.
A design-review rubric: fill in missing SLOs, find single points of failure, prioritize improvements (safety → efficiency → cost).
Manifest-driven releases, atomic symlink switch, rollback, thundering-herd avoidance, audit trail, integrity validation.
Prompt versioning, experiment management, side-by-side eval, prompt caching, collaboration/ACLs, huge-context strategy.
Optimize simulated machine cycles via benchmark-driven iteration. Includes an explicit warning that LLMs can "cheat" by modifying the tests.
Session management, streaming output, token-level billing, log aggregation, safety-filter integration.
Bandwidth-constrained distribution of large files (model weights, datasets) to thousands of machines. Peer discovery, tit-for-tat, chunk selection.
Horizontal scaling for throughput, request routing across replicas, GPU pool sizing, load-based autoscaling.
Consistent hashing, quorum reads/writes, vector clocks for conflict resolution, Merkle trees for anti-entropy, gossip for membership.
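The placement layer of a Dynamo-style store, consistent hashing with virtual nodes, is short enough to sketch. `preference_list` below returns the N distinct nodes that would hold a key's replicas (the hash and vnode count are arbitrary illustrative choices):

```python
import bisect
import hashlib

class HashRing:
    """Consistent-hash ring with virtual nodes. Each physical node owns
    many points on the ring, smoothing load when nodes join or leave."""

    def __init__(self, nodes, vnodes=64):
        self._ring = sorted(
            (self._h(f"{n}#{i}"), n) for n in nodes for i in range(vnodes)
        )

    @staticmethod
    def _h(s):
        return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")

    def preference_list(self, key, n=3):
        """Walk clockwise from the key's hash, collecting n distinct nodes."""
        start = bisect.bisect(self._ring, (self._h(key), ""))
        owners = []
        for i in range(len(self._ring)):
            node = self._ring[(start + i) % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
                if len(owners) == n:
                    break
        return owners
```

Quorum reads/writes then pick R and W out of these N owners with R + W > N.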
Agent loop (reason → plan → act), tool use via MCP, short/long-term memory, multi-agent coordination, sandbox and guardrails, infinite-loop prevention.
Distributed fetching, deduplication, multi-threaded/async design, rate control. Interviewers follow up on scaling and robots.txt handling.
A traditional exercise: transactional consistency, double-entry ledger, idempotent transfers, fraud detection, audit log, regulatory compliance.
1B documents, 1M QPS. Sharding strategies, hot-spot avoidance, multi-level cache, LLM inference scaling, GPU memory optimization.
Open-ended: clarify requirements → high-level architecture → safety/latency/reliability trade-offs. You must drive the conversation.
Feed generation is the crux: push vs pull vs hybrid. The interviewer will push on celebrities with millions of followers; the answer is hybrid. Follow-ups probe database read scaling.
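The hybrid answer can be made concrete in a few lines. Everything here is a toy with made-up names (`publish`, `read_feed`, the 10k threshold): normal accounts fan out on write into follower feeds, while accounts over the threshold are pulled and merged at read time.

```python
from collections import defaultdict, deque

FANOUT_THRESHOLD = 10_000          # follower count above which push is skipped

feeds = defaultdict(deque)         # user -> precomputed feed of post ids
celeb_posts = defaultdict(list)    # celebrity -> recent post ids

def publish(author, post_id, followers):
    """Write path: push into follower feeds unless the author is a celebrity."""
    if len(followers) <= FANOUT_THRESHOLD:
        for f in followers:
            feeds[f].appendleft(post_id)       # fan-out on write
    else:
        celeb_posts[author].append(post_id)    # fan-out on read, later

def read_feed(user, followed_celebs, limit=50):
    """Read path: merge the precomputed feed with pulled celebrity posts."""
    merged = list(feeds[user])
    for c in followed_celebs:
        merged.extend(celeb_posts[c])
    return merged[:limit]  # a real system merges by timestamp here
```

The write path stays cheap for celebrities, and the read path stays cheap for everyone else; the remaining cost is the read-time merge, which leads directly into the DB read-scaling follow-up.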
Heterogeneous GPU jobs scheduled across a pool, with priorities, preemption, and at-least-once delivery plus idempotent handlers.
An async endpoint: upload JSONL, poll for completion. 24h SLA at a 50% discount. The scheduler fills idle GPU capacity.
Server-side conversational state: persistent threads, tool calls, file search, SSE streaming, run lifecycle.
Dataset upload → validation → scheduled training → model artefact → inference under a private tag.
Targeting rules, percentage rollout, SDKs, low-latency evaluation, audit trail, A/B analysis.
Bidirectional audio with interruption, fully streamed ASR + LLM + TTS, first audio in under 300 ms.
Author graders, run suites, compare across runs, catch regressions before rollout.
Detect shared prefixes, cache KV states, serve subsequent tokens more cheaply.
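The prefix-matching piece can be sketched as a lookup from token-prefix to cached KV state (an opaque value here). Names like `PrefixCache` are illustrative; real servers do this per fixed-size block over paged KV memory rather than per whole prefix.

```python
class PrefixCache:
    """Toy shared-prefix lookup: store computed KV state keyed by token
    prefix, and on a new request find the longest cached prefix so only
    the suffix needs prefill."""

    def __init__(self):
        self._store = {}  # tuple(tokens) -> kv_state

    def put(self, tokens, kv_state):
        self._store[tuple(tokens)] = kv_state

    def longest_prefix(self, tokens):
        """Return (matched_length, kv_state), or (0, None) on a miss."""
        for end in range(len(tokens), 0, -1):
            key = tuple(tokens[:end])
            if key in self._store:
                return end, self._store[key]
        return 0, None
```

A hit means decode can start after prefilling only `len(tokens) - matched_length` tokens, which is where the cost saving comes from.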
Input and output moderation at the API boundary; a classifier ensemble; block/flag/review tiers; an appeals flow.
IDE-triggered completions: p95 < 200 ms, a very high cancellation rate, context enriched with repo retrieval, privacy tiers.
Text-to-image path, variable step counts, upscaling, NSFW filters, CDN for outputs.
Per-request token counts, aggregation by user/org, daily billing, rate-limit enforcement, audit-grade records.
High-cardinality logs/traces from the GPU fleet, cost-controlled sampling, fast search, retention tiers.
Execute untrusted LLM-generated code and tool calls in isolation: filesystem, network, and time limits.
A global front door for API traffic: auth, rate limiting, routing to regional fleets, automatic failover.
Per-org RPM/TPM/TPD buckets, tier-based limits, token-level (not just request-level) accounting, priority passes for enterprise.
Input, output, and tool-use moderation; a policy taxonomy; ASL gating; a red-team loop; eval feedback.
Corporate data (Slack, GDrive, Notion) ingested together with its ACLs; retrieval respects per-user permissions; zero cross-tenant leakage.
Long-running conversations beyond the context window, user-specific memory, opt-in/out, privacy controls.
A directory of Model Context Protocol servers with discovery, versioning, signing, security review, and per-user install scope.
Run agent-generated Python/Bash safely: network policy, filesystem policy, resource caps, artefact return.
At 10k+ GPUs, checkpoint every N steps without stalling training. Async sharded writes, resumable on failure.
An ephemeral KV cache keyed on prompt prefix; 5-minute TTL; explicit cache_control markers; 90% discount on hits.
Runs constitutional-AI and safety evals plus red-team results, compares across checkpoints, and blocks capability regressions.
Per-second token metering, monthly invoicing, commit plans, credit-card and invoice billing, a dispute flow.
Detect probing/jailbreak campaigns across users, cluster attack patterns, feed findings back to the safety team.
Google's Gemini API: multimodal (text + image + video + audio), 2M-token context, TPU fleet, global endpoints.
Managed pipelines: data → features → train → validate → deploy; DAGs, artefacts, retries, multi-tenancy.
Upload sources → chat and podcast generation grounded in those sources → a high citation rate.
A selective LLM summary above classic results: when to trigger it, retrieving diverse sources, citing links, sub-second p95.
A Borg-style scheduler for TPU pods: topology-aware, gang scheduling, interconnect locality, preemption.
Crawl billions of pages/day, respect robots.txt, re-crawl on a freshness policy, per-host politeness, dedup.
Build and serve an inverted index over billions of docs: term posting lists, sharding, tiered storage.
Top-K popular prefixes, personalisation, real-time trend surfacing, a safety filter.
Detect misspellings and propose corrections: edit distance + a language model + context-aware reranking.
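The edit-distance stage is the classic Levenshtein DP, shown here with the standard two-row memory optimization:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character inserts, deletes,
    and substitutions to turn a into b. Used to generate correction
    candidates before the language model reranks them."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        cur = [i]                   # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,              # delete ca
                cur[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb), # substitute (free on match)
            ))
        prev = cur
    return prev[-1]
```

At query-time scale this exact DP is too slow against a whole dictionary, which is why real systems precompute candidate sets (k-gram or BK-tree style indexes) and only score the survivors.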
Count views accurately but cheaply at billions/day; anti-inflation measures; approximate hot-video counts.
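One standard structure for the approximate hot-video side is a count-min sketch: fixed memory, never undercounts, overcounts bounded by width and depth. A minimal sketch (exact billing-grade counts would still flow through an idempotent aggregation pipeline):

```python
import hashlib

class CountMin:
    """Count-min sketch: depth independent hash rows over a fixed-width
    counter table; the estimate is the minimum across rows."""

    def __init__(self, width=2048, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _idx(self, row, item):
        d = hashlib.sha256(f"{row}:{item}".encode()).digest()
        return int.from_bytes(d[:8], "big") % self.width

    def add(self, video_id, n=1):
        for r in range(self.depth):
            self.table[r][self._idx(r, video_id)] += n

    def estimate(self, video_id):
        # Collisions only inflate counts, so the min row is the best bound.
        return min(self.table[r][self._idx(r, video_id)]
                   for r in range(self.depth))
```

Memory is width × depth counters regardless of how many distinct videos are counted, which is the property that makes it fit the billions/day constraint.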
Resumable upload of GB-scale video; parallel transcode to a resolution ladder; CDN distribution; thumbnails.
A two-stage recsys: candidate generation (two-tower/ANN) + ranking (deep model); cold start, freshness, diversity.
Origin → edge pull/push; HLS/DASH; live ingest with low-latency HLS; failover between POPs.
Serve pre-rendered raster/vector tiles at zoom levels 0-22; edge caching; incremental updates; offline support.
Shortest path on the road graph plus real-time traffic; learned ETA; billions of queries/day.
Globally distributed SQL with external consistency via TrueTime and 2PC over Paxos groups.
A petabyte-scale wide-column store: tablet-based sharding, GFS for persistence, Chubby for coordination.
A billion users × 10k emails each; labels, search, spam filtering, attachments, IMAP/SMTP compatibility.
Conflict-free multi-user editing, presence, offline support, a history timeline.
A second-price auction per query, ad ranking with CTR prediction, budget pacing, advertiser reporting.
NFC/tap-to-pay plus online checkout; tokenised PAN; fraud detection; reconciliation with issuers.
A hierarchical JSON tree with real-time listeners on mobile clients at scale; ACL rules; offline support.
Upload, dedup, face/object classification, search, albums, sharing.
FCM: developers push to billions of devices; priority classes; battery-efficient delivery.
Sync bookmarks/passwords/tabs across devices; end-to-end encryption; conflict resolution.
Serve Grok to 200M+ X users with sub-second first-token latency while competing on cost.
Grok's DeepSearch plans multi-step web queries, fetches and summarizes pages, and returns cited answers.
Ingest X's real-time post firehose for Grok training and real-time features.
Use Grok to re-rank candidate posts for X's For-You timeline based on user-intent signals.
Orchestrate a training run across 100,000+ H100 GPUs in Memphis, with fault tolerance and checkpointing.
Real-time voice conversations with Grok in the X mobile app.