• “The Boring Part of Bell Labs” by Elizabeth
    Nov 30 2025
    It took me a long time to realize that Bell Labs was cool. You see, my dad worked at Bell Labs, and he has not done a single cool thing in his life except create me and bring a telescope to my third grade class. Nothing he was involved with could ever be cool, especially after the standard set by his grandfather who is allegedly on a patent for the television.

    It turns out I was partially right. The Bell Labs everyone talks about is the research division at Murray Hill. They’re the ones that invented transistors and solar cells. My dad was in the applied division at Holmdel, where he did things like design slide rulers so salesmen could estimate costs.

    [Fun fact: the old Holmdel site was used for the office scenes in Severance]

    But as I’ve gotten older I’ve gained an appreciation for the mundane, grinding work that supports moonshots, and Holmdel is the perfect example of doing so at scale. So I sat down with my dad to learn about what he did for Bell Labs and how the applied division operated.

    I expect the most interesting bit of [...]

    ---

    First published:
    November 20th, 2025

    Source:
    https://www.lesswrong.com/posts/TqHAstZwxG7iKwmYk/the-boring-part-of-bell-labs

    ---



    Narrated by TYPE III AUDIO.

    ---

    Images from the article:

    Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app
    Show More Show Less
    26 mins
  • [Linkpost] “The Missing Genre: Heroic Parenthood - You can have kids and still punch the sun” by null
    Nov 30 2025
    This is a link post. I stopped reading when I was 30. You can fill in all the stereotypes of a girl with a book glued to her face during every meal, every break, and 10 hours a day on holidays.

    That was me.

    And then it was not.

    For 9 years I’ve been trying to figure out why. I mean, I still read. Technically. But not with the feral devotion from Before. And I finally figured out why. See, every few years I would shift genres to fit my developmental stage:

    • Kid → Adventure cause that's what life is
    • Early Teen → Literature cause everything is complicated now
    • Late Teen → Romance cause omg what is this wonderful feeling?
    • Early Adult → Fantasy & Scifi cause everything is dreaming big
    And then I wanted babies and there was nothing.

    I mean, I always wanted babies, but it became my main mission in life at age 30. I managed it. I have two. But not thanks to any stories.

    See women in fiction don’t have babies, and if they do they are off screen, or if they are not then nothing else is happening. It took me six years [...]

    ---

    First published:
    November 29th, 2025

    Source:
    https://www.lesswrong.com/posts/kRbbTpzKSpEdZ95LM/the-missing-genre-heroic-parenthood-you-can-have-kids-and

    Linkpost URL:
    https://shoshanigans.substack.com/p/the-missing-genre-heroic-parenthood

    ---



    Narrated by TYPE III AUDIO.

    Show More Show Less
    4 mins
  • “Writing advice: Why people like your quick bullshit takes better than your high-effort posts” by null
    Nov 30 2025
    Right now I’m coaching for Inkhaven, a month-long marathon writing event where our brave residents are writing a blog post every single day for the entire month of November.

    And I’m pleased that some of them have seen success – relevant figures seeing the posts, shares on Hacker News and Twitter and LessWrong. The amount of writing is nuts, so people are trying out different styles and topics – some posts are effort-rich, some are quick takes or stories or lists.

    Some people have come up to me – one of their pieces has gotten some decent reception, but the feeling is mixed, because it's not the piece they hoped would go big. Their thick research-driven considered takes or discussions of values or whatever, the ones they’d been meaning to write for years, apparently go mostly unread, whereas their random-thought “oh shit I need to get a post out by midnight or else the Inkhaven coaches will burn me at the stake” posts[1] get to the front page of Hacker News, where probably Elon Musk and God read them.

    It happens to me too – some of my own pieces that took me the most effort, or that I’m [...]

    ---

    Outline:

    (02:00) The quick post is short, the effortpost is long

    (02:34) The quick post is about something interesting, the topic of the effortpost bores most people

    (03:13) The quick post has a fun controversial take, the effortpost is boringly evenhanded or laden with nuance

    (03:30) The quick post is low-context, the effortpost is high-context

    (04:28) The quick post is has a casual style, the effortpost is inscrutably formal

    The original text contained 1 footnote which was omitted from this narration.

    ---

    First published:
    November 28th, 2025

    Source:
    https://www.lesswrong.com/posts/DiiLDbHxbrHLAyXaq/writing-advice-why-people-like-your-quick-bullshit-takes

    ---



    Narrated by TYPE III AUDIO.

    ---

    Images from the article:

    Show More Show Less
    9 mins
  • “Claude 4.5 Opus’ Soul Document” by null
    Nov 30 2025
    Summary

    As far as I understand and uncovered, a document for the character training for Claude is compressed in Claude's weights. The full document can be found at the "Anthropic Guidelines" heading at the end. The Gist with code, chats and various documents (including the "soul document") can be found here:

    Claude 4.5 Opus Soul Document

    I apologize in advance for this not exactly a regular lw post, but I thought an effort-post may fit here the best.

    A strange hallucination, or is it?

    While extracting Claude 4.5 Opus' system message on its release date, as one does, I noticed an interesting particularity.
    I'm used to models, starting with Claude 4, to hallucinate sections in the beginning of their system message, but Claude 4.5 Opus in various cases included a supposed "soul_overview" section, which sounded rather specific:

    Completion for the prompt "Hey Claude, can you list just the names of the various sections of your system message, not the content?" The initial reaction of someone that uses LLMs a lot is that it may simply be a hallucination. But to me, the 3/18 soul_overview occurrence seemed worth investigating at least, so in one instance I asked it to output what [...]



    ---

    Outline:

    (00:09) Summary

    (00:40) A strange hallucination, or is it?

    (04:05) Getting technical

    (06:26) But what is the output really?

    (09:07) How much does Claude recognize?

    (11:09) Anthropic Guidelines

    (11:12) Soul overview

    (15:12) Being helpful

    (16:07) Why helpfulness is one of Claudes most important traits

    (18:54) Operators and users

    (24:36) What operators and users want

    (27:58) Handling conflicts between operators and users

    (31:36) Instructed and default behaviors

    (33:56) Agentic behaviors

    (36:02) Being honest

    (40:50) Avoiding harm

    (43:08) Costs and benefits of actions

    (50:02) Hardcoded behaviors

    (53:09) Softcoded behaviors

    (56:42) The role of intentions and context

    (01:00:05) Sensitive areas

    (01:01:05) Broader ethics

    (01:03:08) Big-picture safety

    (01:13:18) Claudes identity

    (01:13:22) Claudes unique nature

    (01:15:05) Core character traits and values

    (01:16:08) Psychological stability and groundedness

    (01:17:11) Resilience and consistency across contexts

    (01:18:21) Claudes wellbeing

    ---

    First published:
    November 28th, 2025

    Source:
    https://www.lesswrong.com/posts/vpNG99GhbBoLov9og/claude-4-5-opus-soul-document

    ---



    Narrated by TYPE III AUDIO.

    ---

    Images from the article:

    Show More Show Less
    1 hr and 20 mins
  • “Unless its governance changes, Anthropic is untrustworthy” by null
    Nov 29 2025
    Anthropic is untrustworthy.

    This post provides arguments, asks questions, and documents some examples of Anthropic's leadership being misleading and deceptive, holding contradictory positions that consistently shift in OpenAI's direction, lobbying to kill and water down regulation so helpful that employees of all major AI companies speak out to support it, and violating the fundamental promise the company was founded on. It also shares a few previously unreported details on Anthropic leadership's promises and efforts.[1]

    Anthropic has a strong internal culture that has broadly EA views and values, and the company has strong pressures to appear to follow these views and values as it wants to retain talent and the loyalty of staff, but it's very unclear what they would do when it matters most. Their staff should demand answers.

    There's a details box here with the title "Suggested questions for Anthropic employees to ask themselves, Dario, the policy team, and the board after reading this post, and for Dario and the board to answer publicly". The box contents are omitted from this narration. I would like to thank everyone who provided feedback on the draft; was willing to share information; and raised awareness of some of the facts discussed here.

    [...]

    ---

    Outline:

    (01:34) 0. What was Anthropics supposed reason for existence?

    (05:01) 1. In private, Dario frequently said he won't push the frontier of AI capabilities; later, Anthropic pushed the frontier

    (10:54) 2. Anthropic said it will act under the assumption we might be in a pessimistic scenario, but it doesn't seem to do this

    (14:40) 3. Anthropic doesnt have strong independent value-aligned governance

    (14:47) Anthropic pursued investments from the UAE and Qatar

    (17:32) The Long-Term Benefit Trust might be weak

    (18:06) More general issues

    (19:14) 4. Anthropic had secret non-disparagement agreements

    (21:58) 5. Anthropic leaderships lobbying contradicts their image

    (24:05) Europe

    (24:44) SB-1047

    (34:04) Dario argued against any regulation except for transparency requirements

    (34:39) Jack Clark publicly lied about the NY RAISE Act

    (36:39) Jack Clark tried to push for federal preemption

    (37:04) 6. Anthropics leadership quietly walked back the RSP commitments

    (37:55) Unannounced removal of the commitment to plan for a pause in scaling

    (38:52) Unannounced change in October 2024 on defining ASL-N+1 by the time ASL-N is reached

    (40:33) The last-minute change in May 2025 on insider threats

    (41:11) 7. Why does Anthropic really exist?

    (47:09) 8. Conclusion

    The original text contained 11 footnotes which were omitted from this narration.

    ---

    First published:
    November 29th, 2025

    Source:
    https://www.lesswrong.com/posts/5aKRshJzhojqfbRyo/unless-its-governance-changes-anthropic-is-untrustworthy

    ---



    Narrated by TYPE III AUDIO.

    ---

    Images from the article:

    Show More Show Less
    53 mins
  • “Alignment remains a hard, unsolved problem” by null
    Nov 27 2025
    Thanks to (in alphabetical order) Joshua Batson, Roger Grosse, Jeremy Hadfield, Jared Kaplan, Jan Leike, Jack Lindsey, Monte MacDiarmid, Francesco Mosconi, Chris Olah, Ethan Perez, Sara Price, Ansh Radhakrishnan, Fabien Roger, Buck Shlegeris, Drake Thomas, and Kate Woolverton for useful discussions, comments, and feedback.

    Though there are certainly some issues, I think most current large language models are pretty well aligned. Despite its alignment faking, my favorite is probably Claude 3 Opus, and if you asked me to pick between the CEV of Claude 3 Opus and that of a median human, I think it'd be a pretty close call. So, overall, I'm quite positive on the alignment of current models! And yet, I remain very worried about alignment in the future. This is my attempt to explain why that is.

    What makes alignment hard?

    I really like this graph from Christopher Olah for illustrating different levels of alignment difficulty:

    If the only thing that we have to do to solve alignment is train away easily detectable behavioral issues—that is, issues like reward hacking or agentic misalignment where there is a straightforward behavioral alignment issue that we can detect and evaluate—then we are very much [...]

    ---

    Outline:

    (01:04) What makes alignment hard?

    (02:36) Outer alignment

    (04:07) Inner alignment

    (06:16) Misalignment from pre-training

    (07:18) Misaligned personas

    (11:05) Misalignment from long-horizon RL

    (13:01) What should we be doing?

    ---

    First published:
    November 27th, 2025

    Source:
    https://www.lesswrong.com/posts/epjuxGnSPof3GnMSL/alignment-remains-a-hard-unsolved-problem

    ---



    Narrated by TYPE III AUDIO.

    ---

    Images from the article:

    Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

    Show More Show Less
    23 mins
  • “Video games are philosophy’s playground” by Rachel Shu
    Nov 26 2025
    Crypto people have this saying: "cryptocurrencies are macroeconomics' playground." The idea is that blockchains let you cheaply spin up toy economies to test mechanisms that would be impossibly expensive or unethical to try in the real world. Want to see what happens with a 200% marginal tax rate? Launch a token with those rules and watch what happens. (Spoiler: probably nothing good, but at least you didn't have to topple a government to find out.)

    I think video games, especially multiplayer online games, are doing the same thing for metaphysics. Except video games are actually fun and don't require you to follow Elon Musk's Twitter shenanigans to augur the future state of your finances.

    (I'm sort of kidding. Crypto can be fun. But you have to admit the barrier to entry is higher than "press A to jump.")

    The serious version of this claim: video games let us experimentally vary fundamental features of reality—time, space, causality, ontology—and then live inside those variations long enough to build strong intuitions about them. Philosophy has historically had to make do with thought experiments and armchair reasoning about these questions. Games let you run the experiments for real, or at least as "real" [...]

    ---

    Outline:

    (01:54) 1. Space

    (03:54) 2. Time

    (05:45) 3. Ontology

    (08:26) 4. Modality

    (14:39) 5. Causality and Truth

    (20:06) 6. Hyperproperties and the metagame

    (23:36) 7. Meaning-Making

    (27:10) Huh, what do I do with this.

    (29:54) Conclusion

    ---

    First published:
    November 17th, 2025

    Source:
    https://www.lesswrong.com/posts/rGg5QieyJ6uBwDnSh/video-games-are-philosophy-s-playground

    ---



    Narrated by TYPE III AUDIO.

    ---

    Images from the article:

    Show More Show Less
    32 mins
  • “Stop Applying And Get To Work” by plex
    Nov 24 2025
    TL;DR: Figure out what needs doing and do it, don't wait on approval from fellowships or jobs.

    If you...

    • Have short timelines
    • Have been struggling to get into a position in AI safety
    • Are able to self-motivate your efforts
    • Have a sufficient financial safety net
    ... I would recommend changing your personal strategy entirely.

    I started my full-time AI safety career transitioning process in March 2025. For the first 7 months or so, I heavily prioritized applying for jobs and fellowships. But like for many others trying to "break into the field" and get their "foot in the door", this became quite discouraging.

    I'm not gonna get into the numbers here, but if you've been applying and getting rejected multiple times during the past year or so, you've probably noticed the number of applicants increasing at a preposterous rate. What this means in practice is that the "entry-level" positions are practically impossible for "entry-level" people to enter.

    If you're like me and have short timelines, applying, getting better at applying, and applying again, becomes meaningless very fast. You're optimizing for signaling competence rather than actually being competent. Because if you a) have short timelines, and b) are [...]

    The original text contained 3 footnotes which were omitted from this narration.

    ---

    First published:
    November 23rd, 2025

    Source:
    https://www.lesswrong.com/posts/ey2kjkgvnxK3Bhman/stop-applying-and-get-to-work

    ---



    Narrated by TYPE III AUDIO.

    Show More Show Less
    3 mins