G2 · Design Vertex AI Training Pipelines
Verified source
Vertex AI Pipelines is GA (docs). Google Cloud interview staple. Credibility A.
Key decisions
- **DAG + caching by input hash**: only dirty steps (those whose inputs or code changed) are recomputed; everything else is served from cache.
- **Artefact lineage**: every output is linked to its inputs and the code SHA that produced it.
- **Pluggable runners**: Kubernetes, managed, or on-prem.
- **Multi-tenant isolation** via GCP projects, with quotas enforced per project.
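The first two decisions can be illustrated together. The following is a minimal in-memory sketch, not the Vertex AI implementation: `cache_key`, `StepCache`, and the record fields are hypothetical names chosen for illustration, and it assumes step inputs are JSON-serialisable keyword arguments.

```python
import hashlib
import json


def cache_key(step_name, code_sha, inputs):
    """Derive a deterministic key from the step name, code SHA, and inputs."""
    payload = json.dumps(
        {"step": step_name, "code_sha": code_sha, "inputs": inputs},
        sort_keys=True,  # key order must not change the hash
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class StepCache:
    """Hypothetical in-memory cache that also records lineage per execution."""

    def __init__(self):
        self._outputs = {}  # cache key -> step output
        self.lineage = []   # one record per real (non-cached) execution

    def run(self, step_name, code_sha, inputs, fn):
        """Return (output, cache_hit); execute fn only on a cache miss."""
        key = cache_key(step_name, code_sha, inputs)
        hit = key in self._outputs
        if not hit:
            self._outputs[key] = fn(**inputs)
            # Lineage: tie the output to its inputs and the code SHA.
            self.lineage.append(
                {"step": step_name, "code_sha": code_sha,
                 "inputs": inputs, "output_key": key}
            )
        return self._outputs[key], hit
```

Calling `run` twice with identical arguments executes the step once; changing either an input or the code SHA yields a new key, which is what makes the step "dirty".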
Architecture
flowchart LR
  UI --> CTL[Pipeline Controller]
  CTL --> SCH[Scheduler]
  SCH --> POD[Step pods - GKE]
  POD --> ART[(Artefact store)]
  ART --> POD
  POD --> MDS[(Metadata store)]
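As a rough illustration of the scheduler's role in this diagram, the sketch below runs steps in dependency order, passing upstream artefacts to each step and appending a metadata record per run. It is a single-process toy under stated assumptions: `run_pipeline`, the dict-based artefact store, and the list-based metadata store are hypothetical stand-ins for the real components, and steps run in-process rather than in GKE pods.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(dag, steps):
    """Execute steps in topological order.

    dag:   maps each step name to the set of step names it depends on.
    steps: maps each step name to a callable taking a dict of upstream outputs.
    Returns (artefacts, metadata).
    """
    artefacts = {}  # stand-in for the artefact store
    metadata = []   # stand-in for the metadata store
    for name in TopologicalSorter(dag).static_order():
        # Step pods read their inputs back from the artefact store.
        upstream = {dep: artefacts[dep] for dep in dag.get(name, ())}
        artefacts[name] = steps[name](upstream)
        metadata.append({"step": name, "inputs": sorted(upstream)})
    return artefacts, metadata
```

For a chain such as `prep -> train -> eval`, the scheduler runs `prep` first because `train` lists it as a dependency, mirroring the `ART --> POD` edge where a later pod consumes an earlier pod's output.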
Follow-ups
- Roll back a bad model? Redeploy the previous known-good version from the artefact store; use shadow evaluation to vet candidates before they take traffic.
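One way to talk through the rollback answer is with a toy registry plus a shadow comparison. This is a sketch under assumptions, not a Vertex AI API: `ModelRegistry`, `shadow_eval`, and the artefact URIs are hypothetical, models are plain callables, and labelled requests are assumed to be available for scoring.

```python
class ModelRegistry:
    """Hypothetical deployment history keyed by artefact URI."""

    def __init__(self):
        self._history = []  # artefact URIs in deployment order

    @property
    def live(self):
        return self._history[-1] if self._history else None

    def deploy(self, uri):
        self._history.append(uri)

    def rollback(self):
        """Drop the current deployment and return the previous URI."""
        if len(self._history) < 2:
            raise RuntimeError("no earlier version to roll back to")
        self._history.pop()
        return self.live


def shadow_eval(live_model, candidate, requests, labels):
    """Serve only live predictions; score the candidate silently alongside."""
    if not requests or len(requests) != len(labels):
        raise ValueError("need equally sized, non-empty requests and labels")
    served, live_ok, cand_ok = [], 0, 0
    for x, y in zip(requests, labels):
        live_pred = live_model(x)
        served.append(live_pred)          # users only ever see this
        live_ok += int(live_pred == y)
        cand_ok += int(candidate(x) == y)  # candidate output is never served
    n = len(labels)
    return served, live_ok / n, cand_ok / n
```

The design point is that rollback is a pointer move to an immutable artefact rather than a retrain, while shadow evaluation lets a candidate be compared on real requests without affecting responses.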