Prediction: Pending
Optimally trained models will serve roughly as many inference tokens as they saw in pre-training, implying current frontier models are ~100x over-trained relative to Chinchilla-optimal.
- Who: Reiner Pope
- Topic: Over-training
- How it gets scored: Does a credible analysis confirm a ~150T-token frontier model serves at least 10T inference tokens before deprecation?
- Resolves: 2029-05-22
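
The arithmetic behind the "~100x over-trained" figure can be sketched as follows. This is a rough illustration, not part of the prediction itself: it assumes the commonly cited Chinchilla rule of thumb of about 20 training tokens per parameter, and the function names and the derived parameter count are hypothetical. The 150T and 10T figures come from the scoring criterion above.

```python
# Assumption: Chinchilla-optimal training uses roughly 20 tokens per parameter.
# This is a rule of thumb, not a number stated in the prediction.
CHINCHILLA_TOKENS_PER_PARAM = 20


def overtraining_factor(train_tokens, params,
                        tokens_per_param=CHINCHILLA_TOKENS_PER_PARAM):
    """How many times more tokens were used than the Chinchilla-optimal budget."""
    return train_tokens / (tokens_per_param * params)


def implied_params(train_tokens, factor,
                   tokens_per_param=CHINCHILLA_TOKENS_PER_PARAM):
    """Parameter count implied by a token budget and an over-training factor."""
    return train_tokens / (factor * tokens_per_param)


def meets_scoring_threshold(inference_tokens, threshold=10e12):
    """Scoring check from the card: at least 10T inference tokens served."""
    return inference_tokens >= threshold


train_tokens = 150e12  # ~150T-token frontier model from the scoring criterion
params = implied_params(train_tokens, factor=100)
print(f"{params / 1e9:.0f}B parameters")  # prints "75B parameters"
```

Under these assumptions, a 150T-token model that is 100x over-trained would have on the order of 75B parameters. Changing the assumed tokens-per-parameter ratio shifts that figure proportionally.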