O11 · Design the OpenAI Playground

Verified source
Prompt: "Design the OpenAI Playground." — Exponent (candidate-verified, 2024–2025). Credibility B.
Scope clarification

- Target audience: developers debugging prompts, or end users? (Developers.)
- Which surfaces: web only, or web + API + SDK?
- Are multi-turn threads persisted? Shareable? Versioned?
- Cost visibility: is a per-run token counter shown?
Architecture
```mermaid
flowchart LR
    UI[Web UI / SDK] --> GW[API Gateway]
    GW --> AUTH[Auth + Rate Limit]
    GW --> TH[Thread Service]
    GW --> MS[Model Service]
    TH --> DB[(Threads DB)]
    MS --> LLM[LLM Inference]
    MS --> METER[Usage Metering]
    MS --> STREAM[SSE / WebSocket]
```
Thread / Message data model
Thread(thread_id, user_id, title, model, system_prompt, created_at, updated_at)
Message(message_id, thread_id, role, content, parent_id, token_count, model_version)
Run(run_id, thread_id, message_id, model, params, status, latency_ms, cost_usd)

API
POST /threads -> { thread_id }
POST /threads/{id}/messages -> SSE stream of tokens
GET /threads/{id}/messages?cursor=
POST /threads/{id}/fork (for A/B compare)

Key engineering topics
- Streaming: SSE per request; reconnect via message_id + a token cursor.
- Versioning: pin the model version and system prompt to each Run for reproducibility.
- Metering: count tokens twice (local estimate + server truth) so a pre-send budget can be shown.
- Multi-model selector: unified API, with per-model context-window enforcement.
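The streaming bullet above can be sketched with a cursor-based replay buffer: the server retains the tokens already emitted for a message, and a reconnecting client passes `(message_id, cursor)` to resume where it dropped. This is a simplified in-process sketch; names like `TokenStreamBuffer` are assumptions, and a real service would deliver tokens over SSE (e.g. with `Last-Event-ID`) and keep the buffer in durable storage.

```python
# Hypothetical cursor-based reconnect for token streaming (illustrative).
from collections import defaultdict
from typing import Iterator


class TokenStreamBuffer:
    def __init__(self) -> None:
        self._buffers: dict[str, list[str]] = defaultdict(list)

    def publish(self, message_id: str, token: str) -> None:
        # Called as the model emits tokens for a message.
        self._buffers[message_id].append(token)

    def subscribe(self, message_id: str, cursor: int = 0) -> Iterator[tuple[int, str]]:
        # Replay everything after `cursor`; each pair is (next_cursor, token),
        # so the client stores next_cursor and resumes from it on reconnect.
        tokens = self._buffers[message_id]
        for i in range(cursor, len(tokens)):
            yield i + 1, tokens[i]
```

On reconnect the client calls `subscribe(message_id, last_seen_cursor)` and receives only the tokens it missed, so no output is duplicated or lost.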
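The metering and context-window bullets can be combined into one pre-send check: a cheap local token estimate (later reconciled against the server's true count) plus a per-model window limit. The ~4-characters-per-token heuristic and the window sizes below are illustrative assumptions, not real model limits; production code would use the model's actual tokenizer.

```python
# Pre-send budget check: local token estimate + per-model context window.
# Window sizes are assumed values for illustration only.
CONTEXT_WINDOWS = {"model-small": 4_096, "model-large": 128_000}


def estimate_tokens(text: str) -> int:
    # Rough local estimate: ~4 characters per token for English text.
    # The server's metered count is the source of truth for billing.
    return max(1, len(text) // 4)


def check_budget(model: str, prompt: str, max_output_tokens: int) -> bool:
    # True if the estimated prompt plus the requested output budget
    # fits within the selected model's context window.
    window = CONTEXT_WINDOWS[model]
    return estimate_tokens(prompt) + max_output_tokens <= window
```

The UI can run `check_budget` on every keystroke to warn before a request is sent, while the Usage Metering service records the authoritative server-side count per Run.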