OpenAI ★★ Frequent Hard Lexer/ParserASTSandbox

O22 · Design a Toy Language Interpreter O22 · 设计自定义语言解释器

Verified source经核实出处

Reported on 小红书 @Infra+RL+Robots as one of OpenAI's 5 SD-pool questions (2026). Fits OpenAI's taste for "real engineering" SD — not a distributed system but a language runtime you'd ship inside Code Interpreter / tool-use sandboxes. Credibility B.来自小红书 @Infra+RL+Robots(2026),OpenAI 五大 SD 题池之一。符合 OpenAI 偏好「工程实感」的 SD 风格——不是分布式系统而是 Code Interpreter / 工具沙箱可能用到的语言运行时。可信度 B

Pipeline执行流水线

flowchart LR
  S[Source] --> L[Lexer
tokens] L --> P[Parser
AST] P --> SA[Semantic Analyzer
resolve + type-check] SA --> E[Evaluator
tree walker] E --> R[Runtime
heap, env, stdlib] R --> O[Output / stdio] E -.quota.-> SB[Sandbox limits
steps, mem, time]

Language surface (scoped for a 60-min whiteboard)语言特性(按 60 分钟白板量裁剪)

  • Values: int, float, bool, string, list, map, function.:int、float、bool、string、list、map、function。
  • Syntax: let x = ..., if/else, while, fn name(args) { ... }, first-class functions.语法let x = ...if/elsewhilefn name(args) { ... }、一等函数。
  • Scoping: lexical; nested environments (parent-pointer trees).作用域:词法;嵌套 env(父指针树)。
  • Stdlib: print, len, range, JSON round-trip. Everything else is a tool call.标准库printlenrange、JSON 往返。其他全部走工具调用。

Parser choice语法分析选型

OptionWhen
Recursive descentBest for whiteboard; hand-written; easy debugging.白板首选;手写;调试友好。
Pratt parsingClean operator precedence; pairs with recursive descent.优雅处理运算符优先级;与递归下降配合。
Parser generator (ANTLR, yacc)Mention for production; don't write one live.生产环境可提;白板不现场写。

Evaluator: tree-walk vs bytecode求值:树遍历 vs 字节码

Start with tree-walking: eval(node, env) dispatches on AST type. Simple, debuggable, ~1-2× slower than bytecode. Mention bytecode VM (stack machine, constant pool, instruction decode loop) as the natural upgrade when perf matters.先用树遍历eval(node, env) 按 AST 类型分发。简单、易调试、比字节码慢 1-2 倍。提及字节码 VM(栈机器、常量池、指令解码循环)作为性能升级方向。

Sandbox & safety (OpenAI's real reason to ask this)沙箱与安全(OpenAI 真正在乎的)

  • Step budget: increment counter per eval step; throw OutOfSteps after N.步数预算:每步自增计数器;超 N 抛 OutOfSteps
  • Memory cap: custom heap with allocated-byte tracking; refuse allocations past ceiling.内存上限:自定义堆跟踪已分配字节;超上限拒绝分配。
  • Wall-clock watchdog: separate thread cancels execution past deadline.墙钟看门狗:独立线程到期取消执行。
  • No host escape: no raw FS, network, or exec — only explicit stdlib + tool calls.无宿主逃逸:无裸 FS、网络、exec —— 只暴露标准库 + 工具调用。
  • Deterministic mode: fix PRNG seed; freeze time; makes testing and judging reproducible.确定性模式:固定 PRNG 种子;冻结时间;测试与判题可复现。

Error UX错误提示

  • Token / node carries (line, col, span); errors render a caret-underlined snippet.Token / 节点带 (line, col, span);错误渲染带下划线的片段。
  • Separate SyntaxError, NameError, TypeError, RuntimeError.区分 SyntaxErrorNameErrorTypeErrorRuntimeError

Follow-ups追问

  • Add closures? Capture surrounding environment by reference in the Function value.加闭包?Function 值中按引用捕获外层 env。
  • Add types? HM type inference for let/fn; escape hatch with any.加类型?let/fn 用 HM 类型推导;用 any 做逃逸口。
  • REPL vs script? REPL keeps one long-lived env; script starts fresh. Both wrapped in same sandbox.REPL vs 脚本?REPL 保持长 env;脚本每次新 env。都在同一沙箱。
  • How would Code Interpreter use this? Each user message opens a fresh VM; tool-call results come back as values injected into the env.Code Interpreter 如何用?每条用户消息开新 VM;工具调用结果作为值注入 env。

Related study-guide topics相关学习手册专题