O18 · Design a Vector Database O18 · 设计向量数据库
Verified source经核实出处
Prompt: "Design a Vector Database." — Jobright. Credibility C.
Problem size规模
Billions of 768-1536 dim vectors, sub-100ms query latency, filtered by metadata (tenant, date, source).数十亿 768-1536 维向量,查询延迟 < 100ms,支持 metadata 过滤(租户、日期、来源)。
ANN index familiesANN 索引家族
- HNSW: graph-based, best recall/speed trade-off, higher memory.HNSW:基于图,召回/速度权衡最佳,内存更高。
- IVF-PQ: inverted file + product quantization, lower memory, worse recall at high K.IVF-PQ:倒排 + 乘积量化,内存低,高 K 召回偏弱。
- ScaNN (Google) and DiskANN (Microsoft) for disk-resident indices.ScaNN(Google)与 DiskANN(微软)用于磁盘索引。
Architecture架构
flowchart LR W[Writer] --> SHARD[Sharder] SHARD --> S1[Shard 1: HNSW] SHARD --> S2[Shard 2: HNSW] SHARD --> S3[Shard 3: HNSW] Q[Query] --> ROUTE[Router] ROUTE --> S1 & S2 & S3 S1 & S2 & S3 --> MERGE[Top-K Merger] MERGE --> RET[Return]
Key decisions关键决策
- Sharding: random or by tenant. Random = balance but query fans to all shards. By tenant = locality but hot-tenant risk.分片:随机或按租户。随机 = 均衡但 query 扇出;按租户 = 局部性但有热租户风险。
- Metadata filter: pre-filter can crash recall; post-filter wastes compute. Hybrid "filter-then-index" per-shard is often best.Metadata 过滤:预过滤会破坏召回;后过滤浪费算力。每分片 hybrid「filter-then-index」通常最优。
- Ingestion pipeline: batched writes; async index rebuild; immutable segments like LSM-tree.采集管道:批量写;异步重建索引;不可变段(类 LSM)。
- Freshness: if real-time writes required, maintain a live search layer + periodic merge into cold HNSW index.新鲜度:若需实时写入,维护实时搜索层 + 周期合并到冷 HNSW 索引。