• "Measuring no CoT math time horizon (single forward pass)" by ryan_greenblatt
    Dec 27 2025
    A key risk factor for scheming (and misalignment more generally) is opaque reasoning ability. One proxy for this is how good AIs are at solving math problems immediately, without any chain-of-thought (CoT), in a single forward pass. I've measured this on a dataset of easy math problems and used it to estimate a 50%-reliability no-CoT time horizon using the same methodology introduced in Measuring AI Ability to Complete Long Tasks (the METR time horizon paper).

    Important caveat: To get human completion times, I ask Opus 4.5 (with thinking) to estimate how long it would take the median AIME participant to complete a given problem. These times seem roughly reasonable to me, but getting some actual human baselines and using them to correct Opus 4.5's estimates would be better.

    Here are the 50% reliability time horizon results:

    I find that Opus 4.5 has a no-CoT 50%-reliability time horizon of 3.5 minutes, and that this time horizon has been doubling every 9 months.
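
    For context, here is a minimal sketch of the METR-style fit this relies on (not code from the post): regress per-problem success on log human completion time with a logistic curve, and read off the time at which the predicted success probability crosses 50%. The data and numbers below are illustrative placeholders, not the post's results.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Illustrative data (NOT from the post): estimated human completion time
# in minutes for each problem, and whether the model answered correctly.
human_minutes = np.array([0.5, 1, 1, 2, 2, 3, 4, 5, 8, 10, 15, 30])
correct       = np.array([1,   1, 1, 1, 0, 1, 0, 1, 0, 0,  0,  0])

# Logistic fit of success probability against log(human time).
# Large C ~= an (almost) unregularized maximum-likelihood fit.
X = np.log(human_minutes).reshape(-1, 1)
clf = LogisticRegression(C=1e6, max_iter=1000).fit(X, correct)

# 50%-reliability horizon: predicted success = 0.5 where the logit
# b0 + b1 * log(t) equals zero, i.e. t = exp(-b0 / b1).
b0, b1 = clf.intercept_[0], clf.coef_[0, 0]
horizon = np.exp(-b0 / b1)
print(f"50% reliability horizon ~= {horizon:.1f} minutes")

# With a doubling time of ~9 months, the horizon extrapolates as
# horizon(t) = horizon_now * 2 ** (months_ahead / 9).
months_ahead = 18
print(f"in {months_ahead} months: ~{horizon * 2 ** (months_ahead / 9):.1f} minutes")
```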

    In an earlier post (Recent LLMs can leverage filler tokens or repeated problems to improve (no-CoT) math performance), I found that repeating the problem substantially boosts performance. In the above plot, if [...]

    ---

    Outline:

    (02:33) Some details about the time horizon fit and data

    (04:17) Analysis

    (06:59) Appendix: scores for Gemini 3 Pro

    (11:41) Appendix: full result tables

    (11:46) Time Horizon - Repetitions

    (12:07) Time Horizon - Filler

    The original text contained 2 footnotes which were omitted from this narration.

    ---

    First published:
    December 26th, 2025

    Source:
    https://www.lesswrong.com/posts/Ty5Bmg7P6Tciy2uj2/measuring-no-cot-math-time-horizon-single-forward-pass

    ---



    Narrated by TYPE III AUDIO.

    13 mins
  • "Recent LLMs can use filler tokens or problem repeats to improve (no-CoT) math performance" by ryan_greenblatt
    Dec 23 2025
    Prior results have shown that LLMs released before 2024 can't leverage 'filler tokens'—unrelated tokens prior to the model's final answer—to perform additional computation and improve performance.[1] I did an investigation on more recent models (e.g. Opus 4.5) and found that many recent LLMs improve substantially on math problems when given filler tokens. That is, I force the LLM to answer some math question immediately without being able to reason using Chain-of-Thought (CoT), but do give the LLM filler tokens (e.g., text like "Filler: 1 2 3 ...") before it has to answer. Giving Opus 4.5 filler tokens[2] boosts no-CoT performance from 45% to 51% (p=4e-7) on a dataset of relatively easy (competition) math problems[3]. I find a similar effect from repeating the problem statement many times, e.g. Opus 4.5's no-CoT performance is boosted from 45% to 51%.
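
    As a rough illustration of the kind of comparison described above (the post's exact prompt format and statistical test are not reproduced here), one could prepend filler tokens before forcing an immediate answer and then check whether an accuracy gain like 45% to 51% is statistically significant. The prompt wording, the sample size, and the choice of an unpaired two-proportion test below are all assumptions for illustration; a paired test over the same problems would typically have more power.

```python
from math import sqrt
from statistics import NormalDist

def with_filler(problem: str, n_filler: int = 200) -> str:
    """Prepend simple counting filler (e.g. 'Filler: 1 2 3 ...') before
    forcing an immediate, no-CoT answer. Format is illustrative only."""
    filler = "Filler: " + " ".join(str(i) for i in range(1, n_filler + 1))
    return f"{filler}\n\n{problem}\n\nAnswer immediately with only the final answer."

# Placeholder accuracies (the post reports 45% -> 51%); n is a made-up
# per-condition sample size, not the post's dataset size.
n = 2000
p_base, p_fill = 0.45, 0.51

# Unpaired two-proportion z-test on the accuracy difference.
p_pool = (p_base * n + p_fill * n) / (2 * n)
se = sqrt(p_pool * (1 - p_pool) * (2 / n))
z = (p_fill - p_base) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"z = {z:.2f}, two-sided p = {p_value:.2g}")
```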

    Repeating the problem statement generally works better and is more reliable than filler tokens, especially for the relatively weaker models I test (e.g. Qwen3 235B A22B), though the performance boost is often very similar, especially for Anthropic models. The first model that is measurably uplifted by repeats/filler is Opus 3, so this effect has been around for a [...]

    ---

    Outline:

    (03:59) Datasets

    (06:01) Prompt

    (06:58) Results

    (07:01) Performance vs number of repeats/filler

    (08:29) Alternative types of filler tokens

    (09:03) Minimal comparison on many models

    (10:05) Relative performance improvement vs default absolute performance

    (13:59) Comparing filler vs repeat

    (14:44) Future work

    (15:53) Appendix: How helpful were AIs for this project?

    (16:44) Appendix: more information about Easy-Comp-Math

    (25:05) Appendix: tables with exact values

    (25:11) Gen-Arithmetic - Repetitions

    (25:36) Gen-Arithmetic - Filler

    (25:59) Easy-Comp-Math - Repetitions

    (26:21) Easy-Comp-Math - Filler

    (26:45) Partial Sweep - Gen-Arithmetic - Repetitions

    (27:07) Partial Sweep - Gen-Arithmetic - Filler

    (27:27) Partial Sweep - Easy-Comp-Math - Repetitions

    (27:48) Partial Sweep - Easy-Comp-Math - Filler

    (28:09) Appendix: more plots

    (28:14) Relative accuracy improvement plots

    (29:05) Absolute accuracy improvement plots

    (29:57) Filler vs repeat comparison plots

    (30:02) Absolute

    (30:30) Relative

    (30:58) Appendix: significance matrices

    (31:03) Easy-Comp-Math - Repetitions

    (32:27) Easy-Comp-Math - Filler

    (33:50) Gen-Arithmetic - Repetitions

    (35:12) Gen-Arithmetic - Filler

    The original text contained 9 footnotes which were omitted from this narration.

    ---

    First published:
    December 22nd, 2025

    Source:
    https://www.lesswrong.com/posts/NYzYJ2WoB74E6uj9L/recent-llms-can-use-filler-tokens-or-problem-repeats-to

    ---



    Narrated by TYPE III AUDIO.

    37 mins
  • "Turning 20 in the probable pre-apocalypse" by Parv Mahajan
    Dec 23 2025
    Master version of this on https://parvmahajan.com/2025/12/21/turning-20.html

    I turn 20 in January, and the world looks very strange. Probably, things will change very quickly. Maybe, one of those things is whether or not we’re still here.

    This moment seems very fragile, and perhaps more than most moments, will never happen again. I want to capture a little bit of what it feels like to be alive right now.

    1.

    Everywhere around me there is this incredible sense of freefall and of grasping. I realize with excitement and horror that over a semester Claude went from not understanding my homework to easily solving it, and I recognize this is the most normal things will ever be. Suddenly, the ceiling for what is possible seems so high - my classmates join startups, accelerate their degrees; I find myself building bespoke bioinformatics tools in minutes, running month-long projects in days. I write dozens of emails and thousands of lines of code a week, and for the first time I no longer feel limited by my ability but by my willpower. I spread the gospel to my friends - “there has never been a better time to have a problem” - even as [...]

    ---

    Outline:

    (00:34) 1.

    (03:12) 2.

    ---

    First published:
    December 21st, 2025

    Source:
    https://www.lesswrong.com/posts/S5dnLsmRbj2JkLWvf/turning-20-in-the-probable-pre-apocalypse

    ---



    Narrated by TYPE III AUDIO.

    5 mins
  • "Alignment Pretraining: AI Discourse Causes Self-Fulfilling (Mis)alignment" by Cam, Puria Radmard, Kyle O’Brien, David Africa, Samuel Ratnam, andyk
    Dec 23 2025
    TL;DR

    LLMs pretrained on data about misaligned AIs themselves become less aligned. Luckily, pretraining LLMs with synthetic data about good AIs helps them become more aligned. These alignment priors persist through post-training, providing alignment-in-depth. We recommend labs pretrain for alignment, just as they do for capabilities.

    Website: alignmentpretraining.ai
    Us: geodesicresearch.org | x.com/geodesresearch

    Note: We are currently garnering feedback here before submitting to ICML. Any suggestions here or on our Google Doc are welcome! We will be releasing a revision on arXiv in the coming days. Folks who leave feedback will be added to the Acknowledgment section. Thank you!

    Abstract

    We pretrained a suite of 6.9B-parameter LLMs, varying only the content related to AI systems, and evaluated them for misalignment. When filtering out the vast majority of the content related to AI, we see significant decreases in misalignment rates. The opposite was also true - synthetic positive AI data led to self-fulfilling alignment.

    While post-training decreased the effect size, benign fine-tuning[1] degrades the effects of post-training: models revert toward their midtraining misalignment rates. Models pretrained on realistic or artificially upsampled negative AI discourse become more misaligned with benign fine-tuning, while models pretrained on only positive AI discourse become more aligned.

    This [...]



    ---

    Outline:

    (00:15) TL;DR

    (01:10) Abstract

    (02:52) Background and Motivation

    (04:38) Methodology

    (04:41) Misalignment Evaluations

    (06:39) Synthetic AI Discourse Generation

    (07:57) Data Filtering

    (08:27) Training Setup

    (09:06) Post-Training

    (09:37) Results

    (09:41) Base Models: AI Discourse Causally Affects Alignment

    (10:50) Post-Training: Effects Persist

    (12:14) Tampering: Pretraining Provides Alignment-In-Depth

    (14:10) Additional Results

    (15:23) Discussion

    (15:26) Pretraining as Creating Good Alignment Priors

    (16:09) Curation Outperforms Naive Filtering

    (17:07) Alignment Pretraining

    (17:28) Limitations

    (18:16) Next Steps and Call for Feedback

    (19:18) Acknowledgements

    The original text contained 1 footnote which was omitted from this narration.

    ---

    First published:
    December 20th, 2025

    Source:
    https://www.lesswrong.com/posts/TcfyGD2aKdZ7Rt3hk/alignment-pretraining-ai-discourse-causes-self-fulfilling

    ---



    Narrated by TYPE III AUDIO.

    21 mins
  • "Dancing in a World of Horseradish" by lsusr
    Dec 22 2025
    Commercial airplane tickets are divided up into coach, business class, and first class. In 2014, Etihad introduced The Residence, a premium experience above first class. The Residence isn't very popular.

    The reason The Residence isn't very popular is economics. A Residence flight is almost as expensive as a private charter jet. Private jets aren't just a little bit better than commercial flights. They're a totally different product. The airplane waits for you, and you don't have to go through TSA (or Un-American equivalent). The differences between flying coach and flying on The Residence are small compared to the difference between flying on The Residence and flying a low-end charter jet.

    It's difficult to compare costs of big airlines vs private jets for a myriad of reasons. The exact details of this graph should not be taken seriously. I'm just trying to give a visual representation of how a price bifurcation works. Even in the rare situations where The Residence is slightly cheaper than a private jet, nobody should buy it. Rich people should just rent low-end private jets, and poor people shouldn't buy anything more expensive than first class tickets. Why was Etihad's silly product created? Mostly for the halo [...]

    ---

    Outline:

    (01:55) Definition

    (03:16) Product Bifurcation

    (04:55) The Death of Live Music

    ---

    First published:
    December 16th, 2025

    Source:
    https://www.lesswrong.com/posts/7zkFzDAjGGLzab4LH/dancing-in-a-world-of-horseradish

    ---



    Narrated by TYPE III AUDIO.

    8 mins
  • "Contradict my take on OpenPhil’s past AI beliefs" by Eliezer Yudkowsky
    Dec 21 2025
    At many points now, I've been asked in private for a critique of EA / EA's history / EA's impact and I have ad-libbed statements that I feel guilty about because they have not been subjected to EA critique and refutation. I need to write up my take and let you all try to shoot it down.

    Before I can or should try to write up that take, I need to fact-check one of my take-central beliefs about how the last couple of decades have gone down. My belief is that the Open Philanthropy Project, EA generally, and Oxford EA particularly, had bad AI timelines and bad ASI ruin conditional probabilities; and that these invalidly arrived-at beliefs were in control of funding, and were explicitly publicly promoted at the expense of saner beliefs.

    An exemplar of OpenPhil / Oxford EA reasoning about timelines is that, as late as 2020, their position on timelines seemed to center on Ajeya Cotra's "Biological Timelines" estimate which put median timelines to AGI at 30 years later. Leadership dissent from this viewpoint, as I recall, generally centered on having longer rather than shorter median timelines.

    An exemplar of poor positioning on AI ruin is [...]

    ---

    First published:
    December 20th, 2025

    Source:
    https://www.lesswrong.com/posts/ZpguaocJ4y7E3ccuw/contradict-my-take-on-openphil-s-past-ai-beliefs

    ---



    Narrated by TYPE III AUDIO.

    6 mins
  • "Opinionated Takes on Meetups Organizing" by jenn
    Dec 21 2025
    Screwtape, as the global ACX meetups czar, has to be reasonable and responsible in his advice giving for running meetups.

    And the advice is great! It is unobjectionably great.

    I am here to give you more objectionable advice, as another organizer who's run two weekend retreats and a cool hundred rationality meetups over the last two years. As the advice is objectionable (in that, I can see reasonable people disagreeing with it), please read with the appropriate amount of skepticism.

    Don't do anything you find annoying

    If any piece of advice on running "good" meetups makes you go "aurgh", just don't do those things. Supplying food, having meetups on a regularly scheduled basis, doing more than just hosting board game nights, building organizational capacity, honestly who even cares. If you don't want to do those things, don't! It's completely fine to disappoint your dad. Screwtape is not even your real dad.

    I've run several weekend-long megameetups now, and after the last one I realized that I really hate dealing with lodging. So I am just going to not do that going forwards and trust people to figure out sleeping space for themselves. Sure, this is less ideal. But you [...]

    ---

    Outline:

    (00:41) Don't do anything you find annoying

    (02:08) Boss people around

    (06:11) Do not accommodate people who don't do the readings

    (07:36) Make people read stuff outside the rationality canon at least sometimes

    (08:11) Do closed meetups at least sometimes

    (09:29) Experiment with group rationality at least sometimes

    (10:18) Bias the culture towards the marginal rat(s) you want

    The original text contained 4 footnotes which were omitted from this narration.

    ---

    First published:
    December 19th, 2025

    Source:
    https://www.lesswrong.com/posts/HmXhnc3XaZnEwe8eM/opinionated-takes-on-meetups-organizing

    ---



    Narrated by TYPE III AUDIO.

    16 mins
  • "How to game the METR plot" by shash42
    Dec 21 2025
    TL;DR: In 2025, we were in the 1-4 hour range, which has only 14 samples in METR's underlying data. The topic of each sample is public, making it easy to game METR horizon length measurements for a frontier lab, sometimes inadvertently. Finally, the “horizon length” under METR's assumptions might be adding little information beyond benchmark accuracy. None of this is to criticize METR—in research, it's hard to be perfect on the first release. But I’m tired of what is being inferred from this plot, pls stop!

    14 prompts ruled AI discourse in 2025

    The METR horizon length plot was an excellent idea: it proposed measuring the length of tasks models can complete (in terms of estimated human hours needed) instead of accuracy. I'm glad it shifted the community toward caring about long-horizon tasks. They are a better measure of automation impacts, and economic outcomes (for example, labor laws are often based on number of hours of work).

    However, I think we are overindexing on it, far too much. Especially the AI Safety community, which, based on it, makes huge updates to timelines and research priorities. I suspect (from many anecdotes, including roon's) the METR plot has influenced significant investment [...]
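
    To make the sample-size worry concrete, here is a toy sketch (entirely synthetic data, not METR's) of how a logistic-fit 50% horizon can move when just a couple of outcomes among a handful of long tasks change.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def horizon_50(minutes: np.ndarray, correct: np.ndarray) -> float:
    """50%-reliability horizon from a logistic fit of success vs log(time)."""
    clf = LogisticRegression(C=1e6, max_iter=1000).fit(
        np.log(minutes).reshape(-1, 1), correct)
    return float(np.exp(-clf.intercept_[0] / clf.coef_[0, 0]))

rng = np.random.default_rng(0)
# Toy task set: many short tasks, but only 14 tasks in the 1-4 hour range,
# echoing the sparsity described above. Outcomes are synthetic.
minutes = np.concatenate([rng.uniform(1, 60, 120), rng.uniform(60, 240, 14)])
correct = (rng.random(minutes.size) < 1 / (1 + minutes / 40)).astype(int)

base = horizon_50(minutes, correct)

# Flip the outcomes of the two longest tasks and refit.
flipped = correct.copy()
longest = np.argsort(minutes)[-2:]
flipped[longest] = 1 - flipped[longest]
print(f"horizon: {base:.0f} min -> {horizon_50(minutes, flipped):.0f} min "
      "after flipping 2 of the 14 long-task outcomes")
```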

    ---

    Outline:

    (01:24) 14 prompts ruled AI discourse in 2025

    (04:58) To improve METR horizon length, train on cybersecurity contests

    (07:12) HCAST Accuracy alone predicts log-linear trend in METR Horizon Lengths

    ---

    First published:
    December 20th, 2025

    Source:
    https://www.lesswrong.com/posts/2RwDgMXo6nh42egoC/how-to-game-the-metr-plot

    ---



    Narrated by TYPE III AUDIO.

    12 mins