OpenAI ★★★ Frequent Hard REST Caching Queue

O2 · Design a Webhook Service (REST API)

Verified source

Original prompt: "System design Webhook service…caching, db design, focus on failure and retry mechanism in message queue. Implement the REST service with JSON body and query for GET and POST." — TeamBlind, 2024-04-10, screening round. Credibility C.

Requirements clarification

Unlike O1 (internal platform), this framing treats the webhook service as a product API. Make the GET/POST resource semantics explicit: resource paths, filtering, pagination. Confirm whether delivery is at-least-once (almost always yes) and what the maximum retry window is.

High-level architecture

Split the read path (config lookups, dashboard) from the write path (delivery). Cache the read-heavy config layer, but avoid caching delivery results unless a clear hot-read pattern exists.

flowchart LR
  U[API Caller] --> G[REST API]
  G --> C[Config Cache]
  C -->|miss| D[(Config DB)]
  G --> Q[Delivery Queue]
  Q --> W[Workers]
  W --> T[Target URL]
  W --> L[(Delivery Log)]

MVP resource model

WebhookSubscription(subscription_id, tenant_id, event_type, endpoint_id)
WebhookDelivery(delivery_id, subscription_id, event_id, status, last_attempt)
WebhookAttempt(attempt_id, delivery_id, status_code, latency_ms, timestamp)
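
As a rough sketch, the three resources above could be written as Python dataclasses. The field names come from the model; the field types, the `DeliveryStatus` values, and the defaults are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class DeliveryStatus(str, Enum):
    # Hypothetical values: the model only names a `status` field.
    PENDING = "pending"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


@dataclass(frozen=True)
class WebhookSubscription:
    subscription_id: str
    tenant_id: str
    event_type: str
    endpoint_id: str


@dataclass
class WebhookDelivery:
    delivery_id: str
    subscription_id: str
    event_id: str
    status: DeliveryStatus = DeliveryStatus.PENDING
    last_attempt: Optional[datetime] = None  # None until the first attempt runs


@dataclass(frozen=True)
class WebhookAttempt:
    # Immutable: attempts are recorded once and never edited.
    attempt_id: str
    delivery_id: str
    status_code: int
    latency_ms: int
    timestamp: datetime
```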

API design

POST /subscriptions  { event_type, endpoint_id }
POST /deliveries     { subscription_id, payload, idempotency_key } → { delivery_id }
GET  /deliveries?subscription_id=&status=&cursor=
GET  /deliveries/{delivery_id}/attempts
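
To make the POST and GET semantics concrete, here is a minimal in-memory sketch, not a real service. `DeliveryStore`, its method names, the integer `delivery_id`, and the per-subscription scoping of `idempotency_key` are all assumptions; a real implementation would sit behind an HTTP framework and a database.

```python
import itertools
from typing import Optional


class DeliveryStore:
    """In-memory stand-in for the deliveries table (illustration only)."""

    def __init__(self) -> None:
        self._seq = itertools.count(1)
        self._rows = []      # dict rows, ordered by increasing delivery_id
        self._by_key = {}    # (subscription_id, idempotency_key) -> row

    def create_delivery(self, subscription_id: str, payload: dict,
                        idempotency_key: str) -> dict:
        # POST /deliveries: a retried request with the same key returns the
        # original delivery instead of enqueuing a duplicate.
        key = (subscription_id, idempotency_key)
        if key in self._by_key:
            return self._by_key[key]
        row = {"delivery_id": next(self._seq), "subscription_id": subscription_id,
               "payload": payload, "status": "pending"}
        self._rows.append(row)
        self._by_key[key] = row
        return row

    def list_deliveries(self, subscription_id: Optional[str] = None,
                        status: Optional[str] = None,
                        cursor: Optional[int] = None, limit: int = 50) -> dict:
        # GET /deliveries?subscription_id=&status=&cursor=
        # The cursor is the last delivery_id the caller has already seen.
        matches = [r for r in self._rows
                   if (cursor is None or r["delivery_id"] > cursor)
                   and (subscription_id is None or r["subscription_id"] == subscription_id)
                   and (status is None or r["status"] == status)]
        page = matches[:limit]
        # next_cursor is None on the last page, so clients know when to stop.
        next_cursor = page[-1]["delivery_id"] if len(matches) > limit else None
        return {"items": page, "next_cursor": next_cursor}
```

Calling `create_delivery` twice with the same `idempotency_key` yields the same `delivery_id`, which is what lets API callers retry a timed-out POST safely under at-least-once semantics.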

How to talk about caching like an engineer

  • Cache endpoint/subscription config: it is read-heavy, so use cache-aside and write to the DB first, then invalidate the cache entry. The write-through alternative adds write latency and is rarely worth it.
  • Don't cache delivery results except for well-known hot debug pages, and then only with a short TTL + singleflight.
  • Cache stampede mitigation: randomized TTL, singleflight, per-key rate limit. Mention these proactively or expect a follow-up question.
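
The cache-aside, randomized-TTL, and singleflight ideas above can be sketched together as follows. This is a single-process illustration under stated assumptions: `ConfigCache`, `load_from_db`, and the TTL parameters are hypothetical names, and a production setup would more likely use a shared cache such as Redis with the same pattern.

```python
import random
import threading
import time
from typing import Any, Callable


class ConfigCache:
    def __init__(self, load_from_db: Callable[[str], Any],
                 base_ttl: float = 60.0, jitter: float = 0.2) -> None:
        self._load = load_from_db
        self._base_ttl = base_ttl
        self._jitter = jitter
        self._entries = {}    # key -> (value, expires_at)
        self._inflight = {}   # key -> Event set when the current load finishes
        self._lock = threading.Lock()

    def get(self, key: str) -> Any:
        while True:
            with self._lock:
                hit = self._entries.get(key)
                if hit is not None and hit[1] > time.monotonic():
                    return hit[0]                      # fresh cache hit
                waiter = self._inflight.get(key)
                if waiter is None:
                    # Singleflight: this caller becomes the only loader.
                    done = self._inflight[key] = threading.Event()
                    break
            waiter.wait()   # someone else is loading; re-check when they finish
        try:
            value = self._load(key)                    # cache-aside: read the DB on miss
            # Randomized TTL so keys filled together do not expire together.
            ttl = self._base_ttl * (1 + random.uniform(-self._jitter, self._jitter))
            with self._lock:
                self._entries[key] = (value, time.monotonic() + ttl)
            return value
        finally:
            with self._lock:
                del self._inflight[key]
            done.set()

    def invalidate(self, key: str) -> None:
        # Call this after the DB write commits (write-DB-then-invalidate).
        with self._lock:
            self._entries.pop(key, None)
```

With this shape, many concurrent `get("ep1")` calls on a cold key trigger one DB load; the rest block briefly and then read the cached value.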

Consistency

  • Config updates are eventually consistent (the cache may be briefly stale). But disable_endpoint must take effect fast: push a versioned disable signal to the workers.
  • Delivery state uses an append-only attempt log plus a materialized view to avoid write amplification.
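
One way to picture the append-only log plus materialized view is the sketch below. `AttemptLog`, the `retrying` status, and the rule that any 2xx response means success are assumptions for illustration; in a real store the view would typically be a separate table updated asynchronously or in batches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attempt:
    attempt_id: int
    delivery_id: str
    status_code: int
    latency_ms: int
    timestamp: float


class AttemptLog:
    def __init__(self) -> None:
        self._log = []    # append-only: attempts are never updated or deleted
        self._view = {}   # delivery_id -> derived state (the materialized view)

    def append(self, attempt: Attempt) -> None:
        self._log.append(attempt)
        self._apply(self._view, attempt)

    @staticmethod
    def _apply(view: dict, a: Attempt) -> None:
        state = view.setdefault(
            a.delivery_id, {"status": "pending", "attempts": 0, "last_attempt": None})
        state["attempts"] += 1
        state["last_attempt"] = a.timestamp
        # Assumption: any 2xx counts as delivered, and success is terminal.
        if 200 <= a.status_code < 300:
            state["status"] = "succeeded"
        elif state["status"] != "succeeded":
            state["status"] = "retrying"

    def delivery_state(self, delivery_id: str) -> dict:
        return self._view[delivery_id]

    def rebuild(self) -> dict:
        # The view is disposable: replaying the log reproduces it exactly.
        view = {}
        for a in self._log:
            self._apply(view, a)
        return view
```

Because the log is the source of truth, a corrupted or lagging view can be dropped and rebuilt with `rebuild()` without losing delivery history.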

Common follow-ups

  1. Cache invalidation race with concurrent writes? Stamp each cached entry with a version (etag/updated_at), invalidate after the DB write, and correct on read when a stale version is detected.
  2. How do you implement delay queues? Either a delayed topic or a scheduled_at column with a worker that pulls due items.
  3. How to paginate the attempts list? Cursor-based (cursor=last_attempt_id); offset pagination skips or repeats rows when new attempts are written between page fetches.
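
The scheduled_at variant of the delay queue can be sketched in memory with a heap, together with a retry policy. The backoff shape (exponential with full jitter), the 24-hour `max_window`, and all function and class names here are assumptions; the source only asks for a retry mechanism and a maximum retry window.

```python
import heapq
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 3600.0) -> float:
    """Full-jitter exponential backoff: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


class DelayQueue:
    """Heap ordered by scheduled_at; stands in for an indexed scheduled_at column."""

    def __init__(self) -> None:
        self._heap = []   # (scheduled_at, seq, delivery_id)
        self._seq = 0

    def schedule(self, delivery_id: str, scheduled_at: float) -> None:
        self._seq += 1    # tie-breaker keeps FIFO order for equal timestamps
        heapq.heappush(self._heap, (scheduled_at, self._seq, delivery_id))

    def pull_due(self, now: float, limit: int = 100) -> list:
        # Worker loop body: take up to `limit` items whose time has come.
        due = []
        while self._heap and self._heap[0][0] <= now and len(due) < limit:
            due.append(heapq.heappop(self._heap)[2])
        return due


def retry_or_give_up(queue: DelayQueue, delivery_id: str, attempt: int, now: float,
                     first_attempt_at: float, max_window: float = 24 * 3600) -> bool:
    next_at = now + backoff_delay(attempt)
    if next_at - first_attempt_at > max_window:
        return False      # caller would mark the delivery failed or dead-letter it
    queue.schedule(delivery_id, next_at)
    return True
```

In a database-backed version, `pull_due` would roughly correspond to selecting rows with `scheduled_at <= now` in order, with some row-claiming mechanism (for example PostgreSQL's `FOR UPDATE SKIP LOCKED`) so that two workers do not pick up the same delivery.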

Related study-guide topics