OpenAI ★★★ Frequent Hard WebSocket Fan-out Chat

O4 · Design Slack

Verified source

Original prompt: "design slack…deliver a MVP in 2 weeks…message delivery scalability/reliability" — LeetCode, 2024-10-30. Also reported on Exponent and Glassdoor. Credibility C.

Framing the MVP correctly

A two-week MVP means ruthless scoping. In scope: 1:1 and small-group chat, send/receive, history fetch, online push over WebSocket, and basic auth. Out of scope: search, files, complex permissions, cross-device state sync, and read receipts.

Minimum architecture

flowchart LR
  C[Client] --> G[Gateway]
  G --> A[Auth]
  G --> M[Message Service]
  M --> D[(Message DB)]
  M --> Q[Fanout Queue]
  Q --> N[Notification/Push]
  N --> C

Semantics you MUST nail

Delivery guarantee: at-least-once is the standard choice; the client deduplicates on message_id.
Ordering: a monotonically increasing seq per channel; no ordering guarantee across channels.
Offline: persist every message; on reconnect, pull history since the last seen seq, then subscribe to the stream.
Multi-device: one user can hold multiple sessions; the server tracks last_read_seq per session.
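The dedup and ordering rules above can be sketched on the client side. This is a minimal illustration, not a Slack API: `ChannelInbox` and its method names are hypothetical, and it assumes per-channel seq values are dense (no holes), so a jump in seq signals a missed message that should be pulled from history.

```python
from dataclasses import dataclass, field


@dataclass
class ChannelInbox:
    """Client-side view of one channel (hypothetical helper)."""
    last_seq: int = 0                           # highest seq applied so far
    seen_ids: set = field(default_factory=set)  # message_ids already applied
    messages: list = field(default_factory=list)

    def apply(self, msg: dict) -> str:
        """Apply one delivered message; returns 'applied', 'duplicate' or 'gap'."""
        if msg["message_id"] in self.seen_ids or msg["seq"] <= self.last_seq:
            return "duplicate"   # at-least-once redelivery: safe to drop
        if msg["seq"] != self.last_seq + 1:
            return "gap"         # assumption: dense seq, so pull history first
        self.seen_ids.add(msg["message_id"])
        self.messages.append(msg)
        self.last_seq = msg["seq"]
        return "applied"
```

On a "gap" result the caller would fetch history after `last_seq` and feed those messages through `apply` before resuming the live stream.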

API (MVP)

POST /channels/{id}/messages  { client_msg_id, text }
GET  /channels/{id}/messages?before=&limit=
GET  /ws  (or /events/stream)  — streaming new messages
                                heartbeat every 30s, cursor resume on reconnect
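As a rough sketch of what the history endpoint's `before` and `limit` parameters might do, the function below pages backwards through a channel by seq. It assumes `before` is a seq value (the source does not say whether it is a seq, a timestamp, or a message_id); `page_messages` is a hypothetical name and the in-memory list stands in for the message store.

```python
def page_messages(messages, before=None, limit=50):
    """Return (page, next_before): up to `limit` messages with seq < before,
    newest first, plus the cursor for the next (older) page.

    next_before is None when the page came back short, meaning history is
    exhausted. A full page that happens to be the last one yields a cursor
    whose next request simply returns an empty page.
    """
    eligible = [m for m in messages if before is None or m["seq"] < before]
    eligible.sort(key=lambda m: m["seq"], reverse=True)
    page = eligible[:limit]
    next_before = page[-1]["seq"] if len(page) == limit else None
    return page, next_before
```

A client scrolling up would call this repeatedly, passing the returned cursor as the next `before`, until the cursor is None.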

Data model

Message(channel_id, message_id, sender_id, created_at, payload, seq)
-- Index: (channel_id, seq DESC) for pagination
Channel(channel_id, type, members, next_seq)
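One way this model could assign seq is to bump `Channel.next_seq` and insert the message in a single transaction. The sketch below uses SQLite only so it runs as-is. It makes two assumptions beyond the model above: a `client_msg_id` column on the message table (taken from the POST body) so that retries return the original row, and channel membership kept elsewhere. `open_db` and `send_message` are hypothetical names.

```python
import sqlite3
import uuid


def open_db():
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE channel (
            channel_id TEXT PRIMARY KEY,
            type       TEXT,
            next_seq   INTEGER NOT NULL
        );
        CREATE TABLE message (
            channel_id    TEXT NOT NULL,
            message_id    TEXT PRIMARY KEY,
            sender_id     TEXT NOT NULL,
            created_at    TEXT DEFAULT CURRENT_TIMESTAMP,
            payload       TEXT NOT NULL,
            seq           INTEGER NOT NULL,
            client_msg_id TEXT NOT NULL,                   -- assumption: stored for dedup
            UNIQUE (channel_id, seq),                      -- also serves seq pagination
            UNIQUE (channel_id, sender_id, client_msg_id)  -- idempotency key
        );
    """)
    return db


def send_message(db, channel_id, sender_id, client_msg_id, text):
    """Insert a message and return (message_id, seq); a retry returns the same pair."""
    with db:  # one transaction: commits on success, rolls back on exception
        row = db.execute(
            "SELECT message_id, seq FROM message "
            "WHERE channel_id = ? AND sender_id = ? AND client_msg_id = ?",
            (channel_id, sender_id, client_msg_id),
        ).fetchone()
        if row is not None:
            return row  # duplicate send: do not allocate a new seq
        cur = db.execute(
            "UPDATE channel SET next_seq = next_seq + 1 WHERE channel_id = ?",
            (channel_id,),
        )
        if cur.rowcount == 0:
            raise KeyError(channel_id)
        seq = db.execute(
            "SELECT next_seq - 1 FROM channel WHERE channel_id = ?", (channel_id,)
        ).fetchone()[0]
        message_id = str(uuid.uuid4())
        db.execute(
            "INSERT INTO message (channel_id, message_id, sender_id, payload, seq, client_msg_id) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (channel_id, message_id, sender_id, text, seq, client_msg_id),
        )
        return (message_id, seq)
```

In a multi-writer database the same idea would rely on a row lock on the channel row (or a single writer per channel shard) to keep seq gapless.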

Scale & consistency

Strict per-channel ordering means single-partition writes per channel, which caps per-channel throughput. Mitigation: shard by channel_id so load spreads across channels; use the claim-check pattern for attachments (store the blob separately and send only a reference in the message).
Fanout-on-write (push a copy to each user's inbox) vs fanout-on-read (aggregate at read time). The MVP uses fanout-on-read for small groups; production systems mix the two to handle big channels.
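The trade-off between the two fanout strategies can be made concrete with a toy in-memory model. Both classes below are hypothetical illustrations: `members` maps a channel id to its set of user ids, and messages are plain values.

```python
from collections import defaultdict


class FanoutOnWrite:
    """Copy each message into every member's inbox at send time."""

    def __init__(self, members):
        self.members = members            # channel_id -> set of user ids
        self.inbox = defaultdict(list)    # user_id -> messages

    def send(self, channel_id, msg):
        for user in self.members[channel_id]:
            self.inbox[user].append(msg)  # O(members) writes per message

    def read(self, user):
        return list(self.inbox[user])     # cheap read: one lookup


class FanoutOnRead:
    """Store each message once per channel; aggregate when a user reads."""

    def __init__(self, members):
        self.members = members
        self.log = defaultdict(list)      # channel_id -> messages

    def send(self, channel_id, msg):
        self.log[channel_id].append(msg)  # one write per message

    def read(self, user):
        out = []
        for channel_id, users in self.members.items():
            if user in users:             # O(channels) work per read
                out.extend(self.log[channel_id])
        return out
```

With small groups the read-time aggregation stays cheap, which is why it suits the MVP; the write-time copy cost is what grows with channel size.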

Hot channel / big group

A 10,000-member channel can cause a fanout storm. Solution: push to online users in real time and let offline users pull on reconnect; big channels use a layered topic.
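A minimal sketch of the "push online, pull offline" split, under stated assumptions: `connections` is a hypothetical map from user id to that user's outbound queue (a list here, a WebSocket in practice), and `catch_up` stands in for the history query a reconnecting client would make. The layered-topic idea is not modeled.

```python
def fan_out(msg, members, connections):
    """Push msg only to members that currently have a live connection.

    Work is proportional to the number of online members, not to channel
    size. Returns the number of pushes performed.
    """
    targets = connections.keys() & members  # online users who belong to the channel
    for user in targets:
        connections[user].append(msg)
    return len(targets)


def catch_up(channel_log, last_seq):
    """Messages an offline user missed, given the last seq they saw."""
    return [m for m in channel_log if m["seq"] > last_seq]
```

Offline members cost nothing at send time; the persisted channel log is the only thing they need when they come back.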

Follow-ups

Exactly-once? Not in messaging; use at-least-once delivery plus client_msg_id dedup.
Edit/delete? Emit a new event (edit_event) carrying the same message_id; the client replays it.
Presence? Heartbeat → short-TTL cache; online/offline status is eventually consistent.
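Two of these answers can be sketched briefly. The event shape (`type`, `message_id`, `seq`, `text`), the assumption that edits and deletes draw seq from the same per-channel counter, and the 60-second TTL (two missed 30-second heartbeats) are all illustrative choices, not part of the original prompt.

```python
import time


def replay(events):
    """Fold message/edit/delete events, in seq order, into current channel state."""
    state = {}  # message_id -> current text
    for ev in sorted(events, key=lambda e: e["seq"]):
        mid = ev["message_id"]
        if ev["type"] == "message":
            state[mid] = ev["text"]
        elif ev["type"] == "edit" and mid in state:
            state[mid] = ev["text"]   # same message_id, newer text wins
        elif ev["type"] == "delete":
            state.pop(mid, None)
    return state


class Presence:
    """Heartbeat-backed presence with a TTL; the clock is injectable for testing."""

    def __init__(self, ttl=60.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.last_beat = {}           # user_id -> time of last heartbeat

    def heartbeat(self, user):
        self.last_beat[user] = self.clock()

    def is_online(self, user):
        beat = self.last_beat.get(user)
        return beat is not None and self.clock() - beat < self.ttl
```

Because a user only turns offline once the TTL lapses, other clients may see stale presence for up to one TTL, which is the eventual consistency the answer refers to.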

