Claim: ≈ Approx

Assuming ~100B active parameters and ~150T pre-training tokens, frontier models are trained on roughly 100x more tokens than Chinchilla-optimal scaling would prescribe.
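The arithmetic behind this claim can be sketched as follows. This is a rough check, not the speaker's own calculation: it assumes the commonly quoted Chinchilla rule of thumb of about 20 training tokens per parameter, which the claim itself does not state, and the function name `overtraining_ratio` is hypothetical.

```python
# Rough check of the over-training ratio.
# Assumption (not from the source): Chinchilla-optimal is about
# 20 training tokens per parameter, the commonly quoted rule of thumb.
CHINCHILLA_TOKENS_PER_PARAM = 20


def overtraining_ratio(active_params: float,
                       train_tokens: float,
                       tokens_per_param: float = CHINCHILLA_TOKENS_PER_PARAM) -> float:
    """Return actual training tokens divided by the Chinchilla-optimal token count."""
    optimal_tokens = active_params * tokens_per_param
    return train_tokens / optimal_tokens


# Inputs taken from the claim: ~100B active parameters, ~150T tokens.
ratio = overtraining_ratio(active_params=100e9, train_tokens=150e12)
print(f"Chinchilla-optimal tokens: {100e9 * CHINCHILLA_TOKENS_PER_PARAM:.1e}")
print(f"Over-training ratio: {ratio:.0f}x")
```

Under that assumption the optimal budget is about 2T tokens and the ratio comes out at 75x, which agrees with the claimed ~100x only to an order of magnitude. The result scales linearly with the unverified 150T token figure.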

Who: Reiner Pope
Topic: Over-training math
Verification note: The 150T figure was an uncited rumor offered by the host; the Chinchilla math is standard. The conclusion is internally consistent but depends on unverified inputs.
Source: Reiner Pope — The math behind how LLMs are trained and served (01:18:59)