OpenAI ★★ Frequent Hard Accounting Fairness

O14 · GPU Credit / Quota Scheduling System

Verified source

Prompt: "Design a GPU Scheduling System (Credits)." — Glassdoor. Credibility C.

Clarifications

  • Granularity: are credits charged per GPU-second, per token, or per request?
  • Consumption order: oldest grant first (FIFO by grant date)?
  • Expiration: per grant or per account?
  • Is overdraft allowed?

Data model

Account(account_id, tenant_id, status)
CreditGrant(grant_id, account_id, amount, granted_at, expires_at, source)
CreditLedger(entry_id, account_id, grant_id, delta, reason, job_id, created_at)
-- Running balance = SUM(delta) grouped by account
-- FIFO consumption: consume from oldest non-expired grant first
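
The running-balance rule above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the `LedgerEntry` class and the `balances` and `grant_remaining` helpers are hypothetical names, and the sketch assumes that a grant is itself recorded in the ledger as a positive delta, with consumption recorded as negative deltas.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LedgerEntry:
    # Mirrors the CreditLedger columns (created_at omitted for brevity).
    entry_id: int
    account_id: str
    grant_id: str
    delta: int                 # assumption: positive = grant, negative = consumption
    reason: str
    job_id: Optional[str] = None


def balances(entries):
    """Running balance per account: SUM(delta) grouped by account_id."""
    totals = defaultdict(int)
    for e in entries:
        totals[e.account_id] += e.delta
    return dict(totals)


def grant_remaining(entries):
    """Credits left in each grant: SUM(delta) grouped by (account_id, grant_id)."""
    totals = defaultdict(int)
    for e in entries:
        totals[(e.account_id, e.grant_id)] += e.delta
    return dict(totals)
```

Keeping the ledger append-only and deriving balances from it makes every charge auditable; a production system would likely cache the sums rather than rescan the ledger on each request.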

Key algorithms

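  • Reserve → commit → release (two-phase): reserve the estimated cost before the job starts, commit the actual usage when it finishes, and release the hold if the job fails or is cancelled. Holding credits up front prevents concurrent jobs from racing to overspend the same balance.
  • FIFO consumption: maintain a view of non-expired grants sorted by expires_at ASC and greedily consume from the front. This spends the soonest-to-expire credits first, which is the same as oldest-first by granted_at whenever all grants have the same lifetime.
  • Idempotent commits keyed on job_id: a retried commit must not double-charge.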
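
The three points above can be combined in one in-memory sketch. Everything here is an assumption made for illustration: the `Grant`, `CreditAccount`, and `InsufficientCredits` names, integer credit amounts, and a single-threaded caller (a real system would do this inside a database transaction or under a per-account lock).

```python
from dataclasses import dataclass


@dataclass
class Grant:
    grant_id: str
    remaining: int
    expires_at: float  # epoch seconds


class InsufficientCredits(Exception):
    pass


class CreditAccount:
    def __init__(self, grants):
        self.grants = {g.grant_id: g for g in grants}
        self.holds = {}      # job_id -> [(grant_id, amount)] currently reserved
        self.committed = {}  # job_id -> amount charged (idempotency record)

    def reserve(self, job_id, amount, now):
        """Phase 1: hold credits, drawing from soonest-expiring grants first."""
        if job_id in self.holds or job_id in self.committed:
            return  # retried reserve for a known job: no-op
        live = sorted(
            (g for g in self.grants.values()
             if g.expires_at > now and g.remaining > 0),
            key=lambda g: g.expires_at,
        )
        if sum(g.remaining for g in live) < amount:
            raise InsufficientCredits(job_id)
        hold, need = [], amount
        for g in live:
            if need == 0:
                break
            take = min(g.remaining, need)
            g.remaining -= take
            hold.append((g.grant_id, take))
            need -= take
        self.holds[job_id] = hold

    def commit(self, job_id, used):
        """Phase 2: charge actual usage (capped at the hold), refund the rest."""
        if job_id in self.committed:
            return self.committed[job_id]  # retry: return the original charge
        hold = self.holds.pop(job_id)      # KeyError if nothing was reserved
        reserved = sum(amount for _, amount in hold)
        charge = min(used, reserved)
        refund = reserved - charge
        # Refund from the latest-expiring grant backwards, so the charge
        # that remains still falls on the soonest-expiring grants.
        for grant_id, amount in reversed(hold):
            back = min(amount, refund)
            self.grants[grant_id].remaining += back
            refund -= back
        self.committed[job_id] = charge
        return charge

    def release(self, job_id):
        """Abort path: return every held credit to its grant."""
        for grant_id, amount in self.holds.pop(job_id, []):
            self.grants[grant_id].remaining += amount
```

Capping the charge at the reserved amount is one possible answer to the overdraft question; if overdraft is allowed, the commit step would instead record the excess as a negative balance.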

Fairness layer

Credits answer "can you run?"; a separate WFQ (weighted fair queueing) scheduler answers "who runs next?". Don't conflate them.
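
To make the "who runs next?" half concrete, here is a minimal sketch of a weighted fair queue based on virtual finish times. It is a simplified variant (the virtual clock simply advances to the finish tag of the job just dispatched), not a full WFQ implementation, and the `WFQScheduler` class, per-tenant weights, and `cost` units are all hypothetical. The credit check is assumed to have passed before `submit` is called.

```python
import heapq
import itertools


class WFQScheduler:
    def __init__(self, weights):
        self.weights = weights                        # tenant -> relative share
        self.last_finish = {t: 0.0 for t in weights}  # last finish tag per tenant
        self.virtual_time = 0.0
        self._heap = []
        self._seq = itertools.count()                 # tie-breaker: submission order

    def submit(self, tenant, job_id, cost):
        """Tag the job with a virtual finish time; heavier weight finishes sooner."""
        start = max(self.virtual_time, self.last_finish[tenant])
        finish = start + cost / self.weights[tenant]
        self.last_finish[tenant] = finish
        heapq.heappush(self._heap, (finish, next(self._seq), tenant, job_id))

    def next_job(self):
        """Dispatch the queued job with the smallest virtual finish time."""
        if not self._heap:
            return None
        finish, _, tenant, job_id = heapq.heappop(self._heap)
        self.virtual_time = max(self.virtual_time, finish)
        return tenant, job_id
```

With weights such as `{"A": 2, "B": 1}` and equal-cost jobs from both tenants, A's jobs receive finish tags spaced half as far apart as B's, so A is dispatched about twice as often while both have work queued. A tenant's credit balance never enters this ordering, which is the separation the paragraph above argues for.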

Related study-guide topics