OpenAI ★★ Frequent Medium Metering Billing Exactly-Once

O35 · Design Usage Metering & Billing for LLM API

Verified source

Cross-referenced in multiple onsite reports (Blind, 1Point3Acres). Credibility B.

Architecture

flowchart LR
  API --> EV[Usage Event - request_id, tokens]
  EV --> STREAM[(Durable log - Kafka)]
  STREAM --> AGG[Streaming Aggregator]
  AGG --> LIVE[(Hot counters - Redis)]
  AGG --> WH[(Warehouse)]
  LIVE --> RL[Rate limiter]
  WH --> BILL[Billing reconciliation]
  BILL --> INV[Invoice service]
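
The flow above can be sketched with in-memory stand-ins for each box. This is a minimal illustration, not a production design: the `Pipeline` class, its method names, and the `account_id` field are assumptions made for the example, and plain Python lists and dicts take the place of Kafka, Redis, and the warehouse.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    request_id: str
    account_id: str  # hypothetical field, not shown in the diagram
    tokens: int


class Pipeline:
    """In-memory stand-ins for the boxes in the flowchart."""

    def __init__(self):
        self.log = []                  # durable log (Kafka in the diagram)
        self.hot = defaultdict(int)    # hot counters (Redis in the diagram)
        self.warehouse = []            # warehouse rows
        self._offset = 0               # aggregator's read position in the log

    def emit(self, event):
        # API side: append to the log only; no counter is touched here.
        self.log.append(event)

    def aggregate(self):
        # Streaming aggregator: consume unread log entries, fan out to both sinks.
        for event in self.log[self._offset:]:
            self.hot[event.account_id] += event.tokens
            self.warehouse.append(event)
        self._offset = len(self.log)

    def over_limit(self, account_id, limit):
        # Rate limiter reads only the hot counters.
        return self.hot[account_id] >= limit

    def invoice_total(self, account_id):
        # Billing reads only the warehouse.
        return sum(e.tokens for e in self.warehouse if e.account_id == account_id)
```

Tracking a read offset means a second `aggregate()` call with no new events changes nothing, which mirrors how a log consumer resumes from its committed position.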

Key decisions

  • **Dual ledger**: fast Redis counters enforce rate limits, while the durable Kafka log is the auditable record for billing.
  • **Idempotent events keyed on request_id**: a reconciliation job rebuilds the ledger from the log.
  • **Streaming responses**: the meter emits incremental token counts as the response is generated; the last event carries the final count.
  • **Time-bucket reconciliation**: hourly snapshots detect drift between the hot counters and the durable ledger.
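
One way these decisions could fit together is sketched below, under stated assumptions: incremental events carry token deltas and feed only the hot counters, the final event carries the authoritative total and feeds only the ledger, and every event of a request is bucketed by the request's start time. The field names `account_id`, `started_at`, and `final`, and all function names, are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

HOUR = 3600  # bucket width in seconds


@dataclass(frozen=True)
class UsageEvent:
    request_id: str
    account_id: str      # hypothetical field
    started_at: int      # request start, epoch seconds (hypothetical field)
    tokens: int          # delta for incremental events, total for the final event
    final: bool = False


def hour_bucket(ts):
    return ts - ts % HOUR


def update_hot(hot, event):
    # Hot path for rate limiting: apply incremental deltas as they arrive.
    if not event.final:
        hot[(event.account_id, hour_bucket(event.started_at))] += event.tokens


def rebuild_ledger(log):
    # Billing path: replay the log, keeping one final event per request_id.
    ledger = defaultdict(int)
    seen = set()
    for event in log:
        if event.final and event.request_id not in seen:
            seen.add(event.request_id)
            ledger[(event.account_id, hour_bucket(event.started_at))] += event.tokens
    return dict(ledger)


def detect_drift(hot, ledger):
    # Compare hourly buckets; report hot minus ledger wherever they disagree.
    drift = {}
    for key in set(hot) | set(ledger):
        diff = hot.get(key, 0) - ledger.get(key, 0)
        if diff != 0:
            drift[key] = diff
    return drift
```

In this sketch a retried final event does not change the ledger, while a duplicated delta inflates only the hot counter, and the hourly comparison surfaces that difference as drift.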

Follow-ups

  • Retries? The aggregator deduplicates by request_id.
  • Multi-region? Region-local counters plus a daily cross-region merge.
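
A hedged sketch of the multi-region answer follows. It assumes each region keeps its own list of `(request_id, account_id, tokens)` rows for the day, and that a retry may land in a different region than the original request, so the daily merge deduplicates on `request_id` across regions. The function names and region names are hypothetical.

```python
from collections import Counter


def local_totals(events):
    # Region-local counters: per-account token totals within a single region.
    totals = Counter()
    for _request_id, account_id, tokens in events:
        totals[account_id] += tokens
    return dict(totals)


def merge_regions(region_events):
    """Daily merge. region_events maps region name -> list of
    (request_id, account_id, tokens) rows for one day."""
    seen = set()
    totals = Counter()
    for region in sorted(region_events):  # fixed order keeps the merge deterministic
        for request_id, account_id, tokens in region_events[region]:
            if request_id in seen:
                continue  # the same request was already counted from another region
            seen.add(request_id)
            totals[account_id] += tokens
    return dict(totals)


day = {
    "us-east": [("r1", "acct", 100), ("r2", "acct", 50)],
    "eu-west": [("r2", "acct", 50), ("r3", "acct", 25)],  # r2 was retried here
}
merged = merge_regions(day)
```

Here the region-local totals sum to 225 tokens, but the merged total is 175 because `r2` is counted once. Region-local counters stay fast and independent for rate limiting, at the cost of being briefly out of step with the merged figure.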

Related study-guide topics