OpenAI ★★★ · Frequent · Hard · Product Design · Streaming · Versioning

O11 · Design the OpenAI Playground

Verified source

Prompt: "Design the OpenAI Playground." — Exponent (candidate-verified, 2024–2025). Credibility B.

Scope clarification

  • Target audience: developers debugging prompts, or end users? (Developers.)
  • Which surfaces: web only, or web + API + SDK?
  • Are multi-turn threads persisted? Sharable? Versioned?
  • Cost visibility: a per-run token counter?

Architecture

flowchart LR
  UI[Web UI / SDK] --> GW[API Gateway]
  GW --> AUTH[Auth + Rate Limit]
  GW --> TH[Thread Service]
  GW --> MS[Model Service]
  TH --> DB[(Threads DB)]
  MS --> LLM[LLM Inference]
  MS --> METER[Usage Metering]
  MS --> STREAM[SSE / WebSocket]
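The request path in the diagram above can be sketched in-process. This is a minimal illustration, not the real service layout: `RateLimiter`, `Gateway`, and the routing rules are all hypothetical names, and the auth check is reduced to a per-user rate limit.

```python
# Minimal sketch of the gateway hop: auth + rate limit first,
# then dispatch to the thread or model service by route prefix.
from dataclasses import dataclass, field

@dataclass
class RateLimiter:
    limit: int
    counts: dict = field(default_factory=dict)

    def allow(self, user_id: str) -> bool:
        n = self.counts.get(user_id, 0)
        if n >= self.limit:
            return False
        self.counts[user_id] = n + 1
        return True

@dataclass
class Gateway:
    limiter: RateLimiter

    def handle(self, user_id: str, route: str) -> str:
        # Rate limiting happens before any downstream service call.
        if not self.limiter.allow(user_id):
            return "429 Too Many Requests"
        if route.startswith("/threads"):
            return "thread-service"
        return "model-service"

gw = Gateway(RateLimiter(limit=2))
print(gw.handle("u1", "/threads"))   # thread-service
print(gw.handle("u1", "/models"))    # model-service
print(gw.handle("u1", "/threads"))   # 429 Too Many Requests
```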

Thread / Message data model

Thread(thread_id, user_id, title, model, system_prompt, created_at, updated_at)
Message(message_id, thread_id, role, content, parent_id, token_count, model_version)
Run(run_id, thread_id, message_id, model, params, status, latency_ms, cost_usd)
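The schema above, written out as Python dataclasses. Field types are assumptions for illustration; a real schema would also carry indexes and foreign-key constraints.

```python
# The Thread / Message / Run records as dataclasses (a sketch).
from dataclasses import dataclass
from typing import Optional

@dataclass
class Thread:
    thread_id: str
    user_id: str
    title: str
    model: str
    system_prompt: str
    created_at: float
    updated_at: float

@dataclass
class Message:
    message_id: str
    thread_id: str
    role: str                  # "system" | "user" | "assistant"
    content: str
    parent_id: Optional[str]   # non-linear history: enables forked branches
    token_count: int
    model_version: str

@dataclass
class Run:
    run_id: str
    thread_id: str
    message_id: str
    model: str
    params: dict               # temperature, top_p, max_tokens, ...
    status: str                # e.g. "queued" | "running" | "done" | "error"
    latency_ms: int
    cost_usd: float
```

Pinning `model_version` on Message and the full `params` on Run is what makes a past run reproducible later.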

API

POST /threads                        -> { thread_id }
POST /threads/{id}/messages          -> SSE stream of tokens
GET  /threads/{id}/messages?cursor=
POST /threads/{id}/fork              (for A/B compare)
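The `cursor=` parameter on GET messages implies cursor pagination. A minimal sketch of the server-side logic, assuming messages are returned in insertion order and the cursor is simply the last `message_id` the client saw (names are illustrative):

```python
# Cursor pagination for GET /threads/{id}/messages (sketch).
def page_messages(messages, cursor=None, limit=2):
    """Return (page, next_cursor); next_cursor is None on the last page."""
    start = 0
    if cursor is not None:
        ids = [m["message_id"] for m in messages]
        start = ids.index(cursor) + 1
    page = messages[start:start + limit]
    next_cursor = page[-1]["message_id"] if len(page) == limit else None
    return page, next_cursor

msgs = [{"message_id": f"m{i}", "content": f"msg {i}"} for i in range(5)]
page1, c1 = page_messages(msgs)             # m0, m1; next cursor "m1"
page2, c2 = page_messages(msgs, cursor=c1)  # m2, m3; next cursor "m3"
```

Cursor pagination stays stable as new messages append, unlike offset pagination, which can skip or duplicate rows.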

Key engineering topics

  • Streaming: SSE per request; reconnect via message_id + token cursor.
  • Versioning: model version and system prompt are pinned per Run for reproducibility.
  • Metering: count tokens twice (local estimate + server truth) to show a pre-send budget.
  • Multi-model selector: unified API with model-specific context-window enforcement.
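The reconnect-with-cursor idea from the streaming bullet can be sketched as follows. This is an illustration under the assumption that the server buffers the tokens already emitted for each message; `TokenStream` and its methods are hypothetical names.

```python
# Resumable streaming sketch: the server keeps the tokens emitted so far
# per message_id, so a client that drops can reconnect with
# (message_id, cursor) and replay from the cursor offset onward.
class TokenStream:
    def __init__(self):
        self.buffers = {}  # message_id -> list of tokens emitted so far

    def emit(self, message_id, token):
        self.buffers.setdefault(message_id, []).append(token)

    def resume(self, message_id, cursor=0):
        """Replay tokens from the given offset onward."""
        return self.buffers.get(message_id, [])[cursor:]

s = TokenStream()
for t in ["Hello", ",", " world"]:
    s.emit("m1", t)
# Client disconnected after receiving 1 token; reconnects with cursor=1.
print(s.resume("m1", cursor=1))  # [',', ' world']
```

The same cursor doubles as the SSE `Last-Event-ID`, so a standard EventSource reconnect resumes without duplicating tokens.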

Related study-guide topics