News: ArXiv AI Papers 2026-05-12

On Distinguishing Capability Elicitation from Capability Creation in Post-Training: A Free-Energy Perspective

arXiv:2605.08368v1 Announce Type: new Abstract: Debates about large language model post-training often treat supervised fine-tuning (SFT) as imitation and reinforcement learning (RL) as discovery. But this distinction is too coarse. What matters is whether a training procedure increases the probability of behaviors the pretrained model could already produce, or whether it changes what the model ca…
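The abstract's core distinction can be sketched as a probability-threshold check: did post-training amplify a behavior the base model could already produce, or introduce one that was effectively absent? The following toy is illustrative only; `classify_shift` and its `floor` threshold are invented here and are not the paper's actual criterion.

```python
def classify_shift(p_base: float, p_post: float, floor: float = 1e-6) -> str:
    """Classify a post-training change for one behavior (toy illustration).

    p_base: probability the pretrained model assigns to the behavior.
    p_post: probability the post-trained model assigns to it.
    floor:  below this, the behavior is treated as effectively absent.
    """
    if p_post <= p_base:
        return "no gain"
    if p_base >= floor:
        # Behavior was already reachable; post-training amplified it.
        return "elicitation"
    # Behavior was (effectively) absent before post-training.
    return "creation"


# A behavior the base model could already produce, now amplified:
print(classify_shift(p_base=1e-3, p_post=0.4))  # elicitation
# A behavior with negligible base probability that appears after training:
print(classify_shift(p_base=1e-9, p_post=0.2))  # creation
```

In practice such probabilities would come from scoring sampled outputs under each model; the hard part, which the paper presumably addresses, is choosing a principled threshold rather than an arbitrary `floor`.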
