Claim ≈ Approx
Using ~100B active params and ~150T pre-training tokens, frontier models look ~100x over-trained versus Chinchilla-optimal.
- Who: Reiner Pope
- Topic: Over-training math
- Verification note: The 150T figure was an uncited rumor offered by the host; the Chinchilla math is standard. The conclusion is internally consistent but depends on unverified inputs.
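
The arithmetic behind the claim can be checked with a short sketch. It assumes the commonly quoted Chinchilla rule of thumb of roughly 20 training tokens per parameter, applied to active parameters; that ratio and the function name `overtraining_factor` are illustrative assumptions, not something stated in the source. Under those assumptions the inputs give about 75x, which is the same order of magnitude as the quoted ~100x.

```python
# Assumption: Chinchilla rule of thumb of ~20 training tokens per parameter.
# Applying it to *active* parameters is also an assumption of this sketch.
CHINCHILLA_TOKENS_PER_PARAM = 20


def overtraining_factor(active_params: float,
                        train_tokens: float,
                        tokens_per_param: float = CHINCHILLA_TOKENS_PER_PARAM) -> float:
    """Ratio of actual training tokens to the compute-optimal token count."""
    optimal_tokens = active_params * tokens_per_param
    return train_tokens / optimal_tokens


# Inputs from the claim: ~100B active params, ~150T tokens (the latter unverified).
factor = overtraining_factor(100e9, 150e12)
print(f"Compute-optimal tokens: {100e9 * CHINCHILLA_TOKENS_PER_PARAM:.1e}")
print(f"Over-training factor: ~{factor:.0f}x")
```

Because the result scales linearly with the token count, any error in the rumored 150T figure carries straight through to the over-training factor.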