O26 · Design OpenAI Assistants / Threads API O26 · 设计 OpenAI Assistants / Threads API
Verified source经核实出处
OpenAI public product (Assistants docs). Asked at OpenAI onsites per Blind. Credibility A.
Architecture架构
flowchart TB C[Client] --> API[Assistants API] API --> ORCH[Run Orchestrator] ORCH --> MEM[(Thread / Message Store)] ORCH --> LLM[Inference] ORCH --> TOOLS[Tool Dispatcher] TOOLS --> VS[File / Vector search] TOOLS --> CI[Code Interpreter Sandbox] ORCH --> S[SSE Stream Gateway] S --> C
Key decisions关键决策
- **Run as a state machine**: queued to in_progress to requires_action (tool call) to in_progress to completed/failed/cancelled.**Run 视为状态机**:queued → in_progress → requires_action(工具调用)→ in_progress → completed|failed|cancelled。
- **Thread = ordered append-only log** of messages + file ids; forking via thread.copy().**Thread = 有序 append-only 日志** + 文件 id;通过 thread.copy 分叉。
- **Tool-call loop** on server side: model emits tool_calls, orchestrator executes or awaits submission, feeds observation back.**服务端工具循环**:模型输出 tool_calls,编排执行或等待提交,将观察回灌。
- **Memory scaling**: long threads compacted via windowed summarisation + RAG over full log.**长上下文**:长 thread 做窗口摘要 + 对完整日志 RAG 检索。
API surfaceAPI 界面
POST /threads
POST /threads/{id}/messages
POST /threads/{id}/runs
GET /threads/{id}/runs/{run_id}/events (SSE)
POST /threads/{id}/runs/{run_id}/submit_tool_outputs
Follow-ups追问
- Concurrent runs on same thread? serialise per-thread; second returns 409.同 thread 并发 run?串行化;第二个 409。
- Cancel mid-tool-call? set run.status=cancelling; sandbox honours SIGTERM.中途取消?置 cancelling;沙箱响应 SIGTERM。