Prediction (Pending)
Today's models can't reliably pick the next experiment or do lateral 'return to first principles' thinking; successor models may close this gap.
- Who: Eric Jang
- Topic: Automated research
- How it gets scored: Does a benchmarked agent autonomously pivot away from a dead-end research track without human prompting, confirmed in a peer-reviewed eval by end of 2027?
- Resolves: 2027-12-31