OpenAI ★★ Frequent · Hard · Fine-Tuning · Dataset · Training

O27 · Design a Fine-Tuning Platform

Verified source

OpenAI's fine-tuning API is publicly documented (docs); the onsite asks for the end-to-end platform design. Credibility: A.

Architecture

flowchart LR
  U[User] --> UP[Upload Dataset]
  UP --> VAL[Validator - schema, PII, safety]
  VAL --> Q[(Training Queue)]
  Q --> SCH[Scheduler]
  SCH --> T[Trainer Workers - LoRA / full]
  T --> REG[(Model Registry)]
  REG --> SERVE[Inference Tenancy]
  U -->|use ft:...| SERVE
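The validate-before-queue gate in the flow above can be sketched as follows. This is a minimal illustration: the `messages`/`role`/`content` field names mirror the chat-format JSONL that OpenAI documents for fine-tuning, but the token budget and the heuristic token count are assumptions.

```python
import json

MAX_TOKENS_PER_EXAMPLE = 4096  # assumed per-example budget


def rough_token_count(text: str) -> int:
    # Crude heuristic: ~4 characters per token. A real validator
    # would run the target model's tokenizer instead.
    return max(1, len(text) // 4)


def validate_example(line: str) -> list[str]:
    """Return a list of problems; an empty list means the example passes the gate."""
    try:
        ex = json.loads(line)
    except json.JSONDecodeError:
        return ["not valid JSON"]
    msgs = ex.get("messages")
    if not isinstance(msgs, list) or not msgs:
        return ["missing 'messages' list"]
    problems, total = [], 0
    for m in msgs:
        if m.get("role") not in ("system", "user", "assistant"):
            problems.append(f"bad role: {m.get('role')!r}")
        total += rough_token_count(m.get("content", ""))
    if total > MAX_TOKENS_PER_EXAMPLE:
        problems.append("example exceeds token budget")
    return problems


good = '{"messages": [{"role": "user", "content": "hi"}, {"role": "assistant", "content": "hello"}]}'
bad = '{"messages": [{"role": "robot", "content": "hi"}]}'
assert validate_example(good) == []
assert validate_example(bad) == ["bad role: 'robot'"]
```

Only lines that return an empty problem list are written to the training queue; everything else is rejected with a per-line error report before any GPU time is spent.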

Decisions

  • **LoRA by default**: 10-100x cheaper and faster than full fine-tuning; full fine-tunes are reserved for the enterprise tier.
  • **Dataset validation as a gate**: schema check, token count, dedup, safety classifier; reject before queueing.
  • **Multi-tenant model registry**: model ids embed the org_id; the inference fleet loads LoRA adapters on demand.
  • **Incremental eval**: each run produces a comparison curve against the base model; the user reviews it before paying to deploy.
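The multi-tenant registry and on-demand adapter loading can be illustrated with a small sketch. The `ft:<base>:<org>:<suffix>` id shape loosely follows OpenAI's public fine-tuned-model naming, but the exact format and the bounded LRU adapter cache are assumptions for illustration.

```python
from collections import OrderedDict


def make_model_id(base: str, org_id: str, suffix: str) -> str:
    # Embeds the owning org in the id so the serving layer can
    # enforce tenancy at lookup time. Format is illustrative.
    return f"ft:{base}:{org_id}:{suffix}"


class AdapterCache:
    """Bounded LRU cache of LoRA adapters, loaded on demand per model id."""

    def __init__(self, capacity: int, loader):
        self.capacity = capacity
        self.loader = loader          # fetches adapter weights from the registry
        self.cache = OrderedDict()

    def get(self, model_id: str):
        if model_id in self.cache:
            self.cache.move_to_end(model_id)  # mark as most recently used
            return self.cache[model_id]
        adapter = self.loader(model_id)       # cache miss: pull from registry
        self.cache[model_id] = adapter
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict least recently used
        return adapter


loads = []
cache = AdapterCache(2, loader=lambda mid: loads.append(mid) or f"weights({mid})")
mid = make_model_id("gpt-4o-mini", "org-123", "support-bot")
cache.get(mid)
cache.get(mid)
assert loads == [mid]  # second request served from cache, no registry fetch
```

Keeping the base model resident and swapping only small adapters is what makes serving thousands of fine-tunes per fleet affordable; the cache bound caps adapter memory per host.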

Safety

Watch-out

Untrusted training data can poison models. Mitigations: PII scrubbing, toxic-sample filtering, jailbreak-pattern detection, and post-training red-team evals.
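A minimal version of the PII-scrub step might look like the sketch below. The regex patterns and placeholder format are illustrative only; a production scrubber would combine many detectors, including trained classifiers.

```python
import re

# Illustrative patterns only; real scrubbers use far broader detector sets.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def scrub_pii(text: str) -> tuple[str, list[str]]:
    """Replace detected PII with typed placeholders; report what was found."""
    found = []
    for kind, pattern in PII_PATTERNS.items():
        if pattern.search(text):
            found.append(kind)
            text = pattern.sub(f"[{kind.upper()}]", text)
    return text, found


clean, found = scrub_pii("Contact alice@example.com or 555-867-5309")
assert clean == "Contact [EMAIL] or [PHONE]"
assert found == ["email", "phone"]
```

Running this before queueing (rather than at training time) means rejected or redacted data never reaches a trainer worker, and the `found` report can be surfaced back to the uploader.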

Follow-ups

  • Checkpoint strategy? Shard checkpoints to object storage every N steps.
  • Cost attribution? Bill per GPU-hour plus per 1M tokens trained.
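The cost-attribution model in the second follow-up can be made concrete. The rates below are invented placeholders, not real pricing; the point is that billing combines a compute term and a data-volume term so both long runs and large datasets pay their share.

```python
from dataclasses import dataclass

# Illustrative placeholder rates, not real pricing.
GPU_HOUR_RATE = 2.50        # dollars per GPU-hour
PER_1M_TOKENS_RATE = 8.00   # dollars per 1M tokens trained


@dataclass
class TrainingRun:
    org_id: str
    gpu_hours: float
    tokens_trained: int


def bill(run: TrainingRun) -> float:
    """Sum a compute-time term and a token-volume term for one run."""
    compute = run.gpu_hours * GPU_HOUR_RATE
    tokens = (run.tokens_trained / 1_000_000) * PER_1M_TOKENS_RATE
    return round(compute + tokens, 2)


run = TrainingRun("org-123", gpu_hours=4.0, tokens_trained=2_500_000)
assert bill(run) == 30.0  # 4 * 2.50 + 2.5 * 8.00
```

Attributing both terms to the `org_id` on the run keeps per-tenant cost reporting straightforward even when runs share a GPU pool.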

Related study-guide topics