O14 · GPU Credit / Quota Scheduling System
Verified source
Prompt: "Design a GPU Scheduling System (Credits)." — Glassdoor. Credibility C.
Clarifications
- Granularity: credits per GPU-second, per token, per request?
- Consumption order: oldest first (FIFO by grant date)?
- Expiration: per grant or per account?
- Overdraft allowed?
Data model
Account(account_id, tenant_id, status)
CreditGrant(grant_id, account_id, amount, granted_at, expires_at, source)
CreditLedger(entry_id, account_id, grant_id, delta, reason, job_id, created_at)
-- Running balance = SUM(delta) grouped by account
-- FIFO consumption: consume from oldest non-expired grant first
Key algorithms
- Reserve → commit → release: reserve credits before the job starts (two-phase), commit the actual usage when it finishes, and release any unused hold. This avoids races between concurrent jobs and prevents overspend.
- FIFO consumption: maintain a view of grants sorted by (expires_at ASC) and greedily consume from the earliest-expiring grant first. This matches FIFO by grant date only when all grants share the same lifetime.
- Idempotent commits keyed on job_id: retries must not double-charge.
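The three points above can be sketched together in a few dozen lines. This is a minimal in-memory illustration, not a production design: the class names `Grant` and `CreditAccount`, the `held`/`used` counters, and the per-job reservation table are all assumptions made for the example. It also assumes integer credits, no overdraft, and a single thread; a real system would need transactions or row locks around each step.

```python
from dataclasses import dataclass, field


@dataclass
class Grant:
    grant_id: str
    amount: int       # credits originally granted
    expires_at: int   # epoch seconds
    used: int = 0     # credits already committed
    held: int = 0     # credits reserved by in-flight jobs

    def available(self) -> int:
        return self.amount - self.used - self.held


@dataclass
class CreditAccount:
    grants: list = field(default_factory=list)
    # Hypothetical bookkeeping: job_id -> {grant_id: credits held}
    reservations: dict = field(default_factory=dict)
    # job_id -> credits charged; doubles as the idempotency record
    committed: dict = field(default_factory=dict)

    def _grant(self, grant_id: str) -> Grant:
        return next(g for g in self.grants if g.grant_id == grant_id)

    def reserve(self, job_id: str, amount: int, now: int) -> bool:
        """Phase 1: place a hold, earliest-expiring grant first."""
        if job_id in self.reservations or job_id in self.committed:
            return True  # retry of a job that was already accepted
        live = sorted((g for g in self.grants if g.expires_at > now),
                      key=lambda g: g.expires_at)
        if sum(g.available() for g in live) < amount:
            return False  # assumption: overdraft is not allowed
        plan, remaining = {}, amount
        for g in live:
            take = min(g.available(), remaining)
            if take:
                g.held += take
                plan[g.grant_id] = take
                remaining -= take
            if remaining == 0:
                break
        self.reservations[job_id] = plan
        return True

    def commit(self, job_id: str, actual: int) -> int:
        """Phase 2: charge actual usage (capped at the hold), free the rest."""
        if job_id in self.committed:
            return self.committed[job_id]  # idempotent: never charge twice
        plan = self.reservations.pop(job_id)  # KeyError if never reserved
        charged = 0
        for grant_id, held in plan.items():  # insertion order = expiry order
            g = self._grant(grant_id)
            take = min(held, actual - charged)
            g.held -= held
            g.used += take
            charged += take
        self.committed[job_id] = charged
        return charged

    def release(self, job_id: str) -> None:
        """Abort path: drop the hold without charging anything."""
        for grant_id, held in self.reservations.pop(job_id, {}).items():
            self._grant(grant_id).held -= held
```

In this sketch, `commit` caps the charge at the reserved amount; whether a job that overruns its reservation should be billed the difference is the overdraft question from the clarifications and is left open here.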
Fairness layer
Credits answer "can you run?"; a separate weighted fair queueing (WFQ) scheduler answers "who runs next?". Don't conflate them.
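To make the separation concrete, here is a rough sketch of the "who runs next?" side. It uses a self-clocked approximation of WFQ (virtual time is advanced to the finish tag of the job being dispatched) rather than exact WFQ, and the names `WFQScheduler`, `submit`, and `next_job`, along with the per-tenant `weights` and per-job `cost`, are assumptions for illustration. The scheduler never looks at credit balances; admission is assumed to have been decided beforehand.

```python
import heapq
from collections import defaultdict


class WFQScheduler:
    def __init__(self, weights):
        self.weights = weights                  # tenant -> relative share
        self.virtual_time = 0.0
        self.last_finish = defaultdict(float)   # tenant -> last finish tag
        self.heap = []                          # (finish_tag, seq, tenant, job_id)
        self.seq = 0                            # tie-breaker: submission order

    def submit(self, tenant, job_id, cost):
        # A job starts after the tenant's previous job, or "now" in virtual time.
        start = max(self.virtual_time, self.last_finish[tenant])
        finish = start + cost / self.weights[tenant]
        self.last_finish[tenant] = finish
        heapq.heappush(self.heap, (finish, self.seq, tenant, job_id))
        self.seq += 1

    def next_job(self):
        if not self.heap:
            return None
        finish, _, tenant, job_id = heapq.heappop(self.heap)
        self.virtual_time = finish  # self-clocked simplification
        return tenant, job_id


sched = WFQScheduler({"a": 2, "b": 1})
for i in range(1, 5):
    sched.submit("a", f"a{i}", cost=10)
for i in range(1, 3):
    sched.submit("b", f"b{i}", cost=10)
order = [sched.next_job() for _ in range(6)]
```

With weights 2:1 and equal job costs, tenant `a` is dispatched roughly twice as often as tenant `b` while both have work queued, regardless of how many credits either holds.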