A22 · P2P File Distribution (BitTorrent-style)
Verified source
Prompt: "Design a Peer-to-Peer File Distribution System" (bandwidth-constrained, thousands of machines). — Exponent. Credibility B/C.
Why P2P here
Distributing TB-scale model weights to thousands of hosts from a single origin saturates the origin's uplink, so distribution time grows linearly with the number of hosts. P2P spreads the upload load across peers: every host that has received a piece can serve it to others.
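To make the bandwidth argument concrete, the sketch below computes the standard textbook lower bounds on distribution time for origin-only versus P2P delivery. The function names and all numbers (1 TB file, 2000 hosts, 100 Gbit/s origin, 10 Gbit/s per host, identical upload rates for every peer) are illustrative assumptions, not figures from the prompt.

```python
def client_server_time(file_bits, n_hosts, origin_up, min_down):
    """Lower bound (seconds) when the origin uploads every copy itself."""
    return max(n_hosts * file_bits / origin_up, file_bits / min_down)


def p2p_time(file_bits, n_hosts, origin_up, peer_up, min_down):
    """Lower bound (seconds) when every peer re-uploads what it receives.

    Assumes all peers have the same upload rate `peer_up`.
    """
    total_up = origin_up + n_hosts * peer_up
    return max(
        file_bits / origin_up,          # origin must send each bit at least once
        file_bits / min_down,           # slowest host must download the whole file
        n_hosts * file_bits / total_up  # aggregate upload capacity of the swarm
    )


# Illustrative numbers only (assumptions, not measurements).
F = 8e12   # 1 TB expressed in bits
N = 2000   # number of hosts
cs = client_server_time(F, N, origin_up=100e9, min_down=10e9)
p2p = p2p_time(F, N, origin_up=100e9, peer_up=10e9, min_down=10e9)
print(f"origin only: {cs / 3600:.1f} h, P2P: {p2p / 60:.1f} min")
```

With these assumed numbers the origin-only bound is roughly 44 hours, while the P2P bound is about 13 minutes and is limited by each host's own download rate rather than by the origin.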
Core mechanics (BitTorrent-style)
- Chunk the file into fixed-size pieces, each with a SHA-256 hash so a receiver can verify every piece independently.
- Tracker discovery: a simple centralized tracker returns, for each chunk, the list of peers that hold it. (Stock BitTorrent trackers return peers per torrent; piece availability is exchanged between peers.)
- Chunk selection: rarest-first, so scarce chunks are replicated early and no download stalls waiting on a last chunk that few peers hold.
- Peer selection: tit-for-tat unchoking, i.e. prefer uploading to peers that upload back to you.
- Super-seeding mode for the initial seeder: it offers each peer a different chunk first, so the seeder's uplink is not spent on duplicate copies.
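The mechanics above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a protocol implementation: `PIECE_SIZE`, `piece_hashes`, `verify_piece`, `rarest_first`, and `choose_unchoked` are hypothetical names, peer availability is modeled as plain sets of piece indices, and the unchoke step omits the optimistic-unchoke slot that real BitTorrent clients add.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Optional, Set

PIECE_SIZE = 4  # bytes; tiny for demonstration, real pieces are MiB-sized


def piece_hashes(data: bytes, piece_size: int = PIECE_SIZE) -> List[str]:
    """Split data into fixed-size pieces; return one SHA-256 hex digest per piece."""
    return [hashlib.sha256(data[i:i + piece_size]).hexdigest()
            for i in range(0, len(data), piece_size)]


def verify_piece(piece: bytes, expected_hex: str) -> bool:
    """Check a downloaded piece against the hash from the manifest."""
    return hashlib.sha256(piece).hexdigest() == expected_hex


def rarest_first(have: Set[int], peer_pieces: Dict[str, Set[int]]) -> Optional[int]:
    """Pick the missing piece held by the fewest peers (ties: lowest index)."""
    counts = Counter(p for pieces in peer_pieces.values() for p in pieces)
    candidates = [p for p in counts if p not in have]
    if not candidates:
        return None
    return min(candidates, key=lambda p: (counts[p], p))


def choose_unchoked(rate_from_peer: Dict[str, float], slots: int = 3) -> List[str]:
    """Tit-for-tat: unchoke the peers that currently upload to us the fastest."""
    ranked = sorted(rate_from_peer, key=lambda peer: rate_from_peer[peer], reverse=True)
    return ranked[:slots]


hashes = piece_hashes(b"abcdefghij")            # pieces: b"abcd", b"efgh", b"ij"
peers = {"a": {0, 1, 2}, "b": {0, 1}, "c": {0}}
next_piece = rarest_first(have={0}, peer_pieces=peers)   # piece 2: only peer "a" has it
unchoked = choose_unchoked({"a": 5.0, "b": 9.0, "c": 1.0}, slots=2)
```

Breaking ties by lowest index keeps the example deterministic; real clients usually randomize among equally rare pieces so that peers do not all request the same one.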
Production tweaks for model weights
- Signed manifests (the list of piece hashes, signed by the publisher) to prevent poisoning.
- Rack-aware peer selection (prefer same-AZ peers).
- Staged rollout: 1% → 10% → 100%, pushed AZ by AZ with validation at each stage.
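As a rough sketch of the first two tweaks: the code below authenticates a manifest and orders candidate peers by locality. It uses HMAC-SHA256 from the Python standard library as a stand-in for a signature; HMAC is a shared-key MAC, so a production system would more likely use an asymmetric scheme such as Ed25519, letting hosts verify without holding the signing key. The manifest layout, the `rack`/`az` fields, and all function names are assumptions made for illustration.

```python
import hashlib
import hmac
import json
from typing import Dict, List


def sign_manifest(manifest: dict, key: bytes) -> str:
    """Return an HMAC-SHA256 tag over a canonical JSON encoding (stand-in for a signature)."""
    payload = json.dumps(manifest, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_manifest(manifest: dict, tag: str, key: bytes) -> bool:
    """Constant-time comparison of the expected and received tags."""
    return hmac.compare_digest(sign_manifest(manifest, key), tag)


def rank_peers(me: Dict[str, str], peers: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Order peers: same rack first, then same AZ, then everything else."""
    def distance(peer: Dict[str, str]) -> int:
        if peer["az"] == me["az"] and peer["rack"] == me["rack"]:
            return 0
        if peer["az"] == me["az"]:
            return 1
        return 2
    return sorted(peers, key=distance)  # stable sort keeps tracker order within a tier


key = b"demo-key"  # hypothetical shared secret, for the example only
manifest = {"name": "weights.bin", "piece_hashes": ["aa", "bb"]}
tag = sign_manifest(manifest, key)
tampered = dict(manifest, piece_hashes=["aa", "cc"])

me = {"az": "az1", "rack": "r1"}
candidates = [
    {"id": "far", "az": "az2", "rack": "r9"},
    {"id": "same_az", "az": "az1", "rack": "r2"},
    {"id": "same_rack", "az": "az1", "rack": "r1"},
]
ordered = [p["id"] for p in rank_peers(me, candidates)]
```

Because each piece hash lives in the authenticated manifest, a host that verifies the manifest once can reject any poisoned piece by hash alone, whichever peer served it.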