apple-ml-research

Adaptive Thinking: Large Language Models Know When to Think in Latent Space

May 8, 2026 · English original

Abstract

This study examines how to allocate the thinking budget in LLM test-time computing. It focuses on how model capability and query complexity relate to compute-optimal inference, and it uses self-consistency as a proxy for whether CoT thinking is needed.

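Recent advances in test-time computing for large language models (LLMs) have introduced the ability to perform intermediate chain-of-thought (CoT) reasoning (thinking) before generating an answer. Although increasing the thinking budget yields smooth performance gains at inference time, the relationship between LLM capability, query complexity, and optimal budget allocation remains poorly understood when the goal is compute-optimal inference. To address this challenge, we use self-consistency (the agreement across multiple reasoning paths) as a proxy for whether thinking is needed. We first identify…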
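As a rough illustration of the self-consistency proxy described above, the following sketch measures agreement as the share of sampled answers that match the most frequent one, then flags a query as needing thinking when that share is low. This is only a minimal sketch under stated assumptions: the function names `self_consistency` and `needs_thinking`, the exact-match normalization, and the 0.8 threshold are all hypothetical choices made here for illustration, not the paper's method, which the truncated abstract does not specify.

```python
from collections import Counter
from typing import Sequence


def self_consistency(answers: Sequence[str]) -> float:
    """Fraction of sampled answers that agree with the most common answer.

    Assumption: answers are compared by exact match after trimming
    whitespace and lowercasing. A real system may need a task-specific
    notion of answer equivalence.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(a.strip().lower() for a in answers)
    top_count = counts.most_common(1)[0][1]
    return top_count / len(answers)


def needs_thinking(answers: Sequence[str], threshold: float = 0.8) -> bool:
    """Hypothetical decision rule: think only when agreement is low.

    The threshold value is an illustrative assumption, not a value
    taken from the paper.
    """
    return self_consistency(answers) < threshold


# Hand-written stand-ins for answers sampled from a model without thinking.
easy_query_answers = ["42", "42", "42", "42", "42"]
hard_query_answers = ["17", "23", "17", "9", "31"]

print(self_consistency(easy_query_answers), needs_thinking(easy_query_answers))
print(self_consistency(hard_query_answers), needs_thinking(hard_query_answers))
```

In this toy setup, a query whose sampled answers all agree would skip the extra thinking budget, while a query whose answers scatter would receive it. How the agreement signal maps onto a concrete budget is left open here, since the abstract does not say.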

Translated from apple-ml-research · recorded May 8, 2026