Effective Altruism News
- TL;DR: We estimate how often Qwen 3 4B exhibits rare harmful behaviors with 30× fewer rollouts than naive sampling, using a new method that interpolates between the model and a less-safe variant in logit space. Authors: Francisco Pernice (MIT), Santiago Aranguri (Goodfire).
- Please help Andre. He's struggling. 🆘 Plus: community weekend recap, movie nights, and one (1) puppy with an agenda 🐶 ...
- Anthropic are now actively using the approach to alignment often called “Alignment Pretraining” or “Safety Pretraining”: using Stochastic Gradient Descent on a large body of natural or synthetic documents showing the AI assistant doing the right thing in morally challenging situations.
- TL;DR: vLLM-Lens is a vLLM plugin for top-down interpretability techniques such as probes, steering, and activation oracles. We benchmarked it as 8–44× faster than existing alternatives for single-GPU use, though we note a planned version of nnsight closes this gap.
- I have tried and failed to write a longer post many times, so here goes a short one with less detail. Discourse has primarily focused on models' ability to develop new exploits against important software from scratch. That capability is impressive, but the tech industry has been dealing with people regularly finding 0-day exploits for important pieces of software for more than twenty years.
- TL;DR: Voters are now surprisingly open to talking about existential risk from AI. This seems to have changed in the last 6 months. When campaigning for AI safety-friendly politicians (e.g., Alex Bores), we should talk more about AI in general, and about AI risk in particular. This is currently actionable for the CA-11 and NY-12 Democratic primaries.
- Imagine you’re looking for a personal trainer. You open one trainer’s webpage and read their testimonials: “I had an experience tied for the most intense experiences of my life”; “They do it all with fun, care, and a sense of humour.” You notice that none of the testimonials mention improved body composition, fitness, or bloodwork. What would you think?
- The post Senior Video Operations appeared first on 80,000 Hours.
- What might explain AI researcher pay, and why it matters
- I’m an EA who has been trying to find ways to make animal suffering more salient. I’ve been working on a feature-length documentary called ‘The Dying Trade’ for the last 5 years and I’ve just released it on YouTube.
- Cross-posted from The Counterfactual by the Forum Team. Subtitle: A concrete strategy for deploying the largest wave of philanthropic capital in history. The OpenAI Foundation holds $180 billion in equity. Anthropic’s co-founders have pledged to donate 80% of their wealth. When the time comes to spend all this money, what should we actually do with it? Here’s my best guess.
- Last month, Anthropic announced Mythos Preview, the most powerful cyberweapon in history, capable of finding and exploiting zero-day vulnerabilities in every major operating system and web browser. Meanwhile, many frontier AI company employees increasingly expect full automation of AI R&D in the next year or two, followed by the rapid automation of thousands of other important tasks and jobs.
- Teenage panic attacks are not uncommon. Teenagers are going through a crucial time of learning how to manage emotions and deal with stress, and this can be a tough challenge at times. Teenage panic attacks can occur just one or a few times, but in some cases they can develop into panic disorder (chronic, repeated panic attacks).
- Over the years, I’ve written two op-eds for The New York Times about quantum computing, at the NYT editors’ invitation. I’ve also visited the NYT office and helped NYT reporters with numerous stories about quantum computing and beyond. In the wake of Cade Metz’s infamous NYT hatchet job against Scott Alexander and the rationalist community, […]...
- After the AI super PAC endorsed her and two other Democrats, Rep. Val Hoyle went back and forth on whether she was happy with their support
- MIRI CEO Malo Bourgon at the Buckley Institute at Yale: Humans didn't wipe out 10,000+ species because we were evil. We did it because our goals weren't aligned with theirs. A superintelligence relates to us the same way. Not hostile. Just indifferent, and far more capable.
- On the dangers of being self-enamored
- Tom Davidson explains how AI could enable a small group to seize power, why he puts the risk of an AI-enabled coup at 10% in the next 30 years, and what democracies must do to prevent it. The conversation covers robot armies, the mechanics of takeover, democratic backsliding, the AI race, and the steps companies and governments should take to maintain a balance of power.
- Meat is the flesh of tortured innocent animals who did not want to die
- Disclaimer: I’m not vegan. I’m not even vegetarian. I eat meat all the time. I’ve been a firm critic of efforts to objectively quantify the difference in suffering across very different species. That said, I cannot help but agree that eating meat is probably the morally worst thing I do, and I also have to agree that eating different kinds of meat are different levels of bad.
- What looks like public education about farming is often industry PR in disguise. This blog breaks down how agriculture front groups manufacture public trust in Canada, and how advocates can counter these efforts. The post Agriculture Front Groups In Canada And The Public Trust Agenda appeared first on Faunalytics.
- Today, beneath the gilded ceilings of the House of Lords, one King delivered his speech to the nation, while we, no less crowned (and rather better armoured), listened from our rocky throne, antennae poised, claws crossed - only to find, once again, that we magnificent 10-legged creatures had been entirely overlooked.
- The strange path to global monopoly
- EA Forum Digest #291: Global development takes the spotlight this week. Hello! It’s In Development Highlight week on the EA Forum! The authors and Editor in Chief from the new global development magazine are on the Forum all week, ready to answer your questions. Start by reading their articles:
- Alicorn writes things sometimes
- Should you be worried about the hantavirus outbreak? Should you be afraid? Should you be panicking? Should you start freaking out? If you’ve been following the coverage of the hantavirus outbreak aboard the cruise ship MV Hondius, these are the questions you’ve seen posed in headlines. And a small tip from inside the media: If […]...
- I say this with love
- I had a conversation with someone who claimed offhandedly that AI will dramatically raise agricultural productivity (via agritech advancements) in low-income countries and trigger growth as a result. My instinct was to respond that we've already had substantial advancements in agricultural technology, and yet it hasn't resulted in the magnitude of yield growth, let alone economic growth, you'd...
- Red Button, Blue Button. On April 24th, 2026, Tim Urban put forth the following poll on Twitter/X: Everyone in the world has to take a private vote by pressing a red or blue button. If more than 50% of people press the blue button, everyone survives. If less than 50% of people press the blue button, only people who pressed the red button survive. Which button would you press?
- Estimating the resources CAISI needs to deliver on American AI readiness
- Today's model specs are written for current and near-future versions of LLMs, and AI labs typically treat them as provisional. But what if the AI behaviors we set now stick around and end up governing far more capable future models by default?
- Gen Z are a bunch of cowards…or are they risking it all on crypto? The editors of The New Critic report on their generation’s Risk-geist.
- Overview. When asked about how they would give away money, or about how to have a moral career, the leading LLMs typically give answers in an EA spirit, and informed by thinking from people and organizations in the EA community. In many cases the term “effective altruism”, and/or EA jargon, are used explicitly.
- (An LLM Whisperer placed a strong request that I put this 2024 story somewhere not on Twitter, so it could be scraped for AI datasets besides Grok's. I perhaps do not fully understand or agree with the reasoning behind this request, but it costs me little to fulfill and so I shall. -- Yudkowsky). And another day came when the Ships of Humanity, going from star to star, found Sapience.
- Looking over my favourite posts, I notice that many of them are making specific versions of a more general claim, which is essentially: don’t confuse selective processes for predictive processes. Here, I’m going to try to make that more general claim, rehash some examples in light of it, and end with a few ambient confusions I think this framework can help with, for the reader to ponder.
- And we're hiring
- Explaining, for those out of the loop, what is coming and how we know
- Kroger's "Fresh for Everyone" slogan stops at the cage door. Unmask the truth behind their broken promise and help end cage cruelty for good. The post Kroger’s Cage-Free Egg Policy: Unmasking the Truth Behind the Broken Pledge appeared first on Mercy For Animals.
- An examination of video footage from an Australian rodeo found that calves experience fear and stress while confined in the chute — before the calf-roping event even begins. The post Rodeo Calves Experience Fear While In The Chute appeared first on Faunalytics.
- The EU's AI Act and Code of Practice require providers of the most advanced AI models to meet the ‘state of the art’ (SOTA) in safety and security. In a new policy memo, we argue that SOTA is best understood as a process-driven concept, advanced by the broader expert ecosystem.
- This is a crosspost of the full text of Money for nothing: the roles of evidence in GiveDirectly’s journey to $1 billion delivered from In Development, made for the EA Forum's In Development Highlight Week. GiveDirectly will be taking part in the discussion thread, but the author, Paul Niehaus, may not see your comments here.
- “I would volunteer and work at a bunch of nonprofits, but it just never felt good enough. Then when I found effective altruism… it just blew my mind.” -Kearney Capuano, Program Associate at Coefficient Giving. See more impact stories at 👉 effectivealtruism.org/stories #EffectiveAltruism #EffectiveAltruismStories...
- for those whose eyes evolved to see
- A disease that was once a death sentence is increasingly treatable
- Local shoppers pressure one of the nation’s largest grocers after it failed to fulfill its 2025 commitment. LOS ANGELES — Kroger promised customers it would go 100% cage-free. Instead, the nation’s number one supermarket chain failed to deliver, leaving millions of hens confined in cages across its supply chain, raising serious concerns about corporate accountability and […].
- For the past decade, the fight to make it legal and feasible to build housing at scale in California felt Sisyphean. California YIMBY and our allies pushed against exclusionary land use policies, and a political class content to blame the… The post On the Race for California Governor: An Abundance of Pro-Housing Candidates appeared first on California YIMBY.
- The Availability Problem: Imagine you have cancer, or chronic pain, or a progressive degenerative disease of some sort. You have exhausted the traditional treatment options available to you, and none of them have worked. However, there are treatments that are still undergoing clinical trials which might help you.
- New York lawmakers are advancing legislation that could make the state the first on the East Coast to preemptively ban octopus factory farming, a practice scientists and advocates warn would pose significant animal welfare and environmental concerns. This week, a key Assembly bill advanced out of committee with a favorable vote, marking a major step […].
- GiveWell is launching a new request for information (RFI) to expand and strengthen our malaria grantmaking in Africa and help our donors make a greater impact. Expressions of interest can be submitted through one of two tracks, the first for malaria chemoprevention and vector control pilot programs and the second for research and evaluation.
- This post was drafted by Buck, and substantially edited by Anders. "I" refers to Buck. Thanks to Alex Mallen for comments. People who work inside AI companies get access to information that I only get later or never. Quantitatively, how big a deal is this access? Here’s an operationalization of this. Consider the following two ways my knowledge could be augmented:
- 1.1 Tl;dr. Alignment is often conceptualized as AIs helping humans achieve their goals: AIs that increase people’s agency and empowerment; AIs that are helpful, corrigible, and/or obedient; AIs that avoid manipulating people. But that last one—manipulation—points to a challenge for all these desiderata: a human’s goals are themselves under-determined and manipulable, and it’s awfully hard to...
- This is a crosspost of the full text of Exporters Without Borders: Why You Should Start a Company Instead of Working in Aid from In Development, made for the EA Forum's In Development Highlight Week. If you enjoy the article, you can subscribe to In Development's substack here. June Jambiha was a quintessential hustler.
- It really is Sydney Sweeney’s world, and we’re all just living in it. Human female breasts are an evolutionary mystery along several dimensions. First, breast permanence is unique to humans. All other mammals develop breast prominence during pregnancy or nursing, and the mammary tissue recedes after weaning. This process is called “involution”.
- Anthony Aguirre is the CEO of the Future of Life Institute. He joins the podcast to discuss A Better Path for AI, his essay series on steering AI away from races to replace people. The conversation covers races for attention, attachment, automation, and superintelligence, and how these can concentrate power and undermine human agency.
- Executive summary
- More Than Good is a new podcast from Effective Altruism Australia, aimed at introducing the ideas and principles of effective altruism to a broader audience. The episodes are framed around moral questions and how people think about doing good, covering topics like global inequality, animal welfare, ethics, philosophy and more. For a global movement, there is relatively little content that is...
- In a recent tweet, Anthropic seems to have asserted that hyperstition is responsible for observed misalignment in their AIs. Strangely, the research they use as evidence actually doesn’t seem to be related to hyperstition at all?
- On inflating your case
- My median guess: it's as good as a crystal ball that sees 2.5 months into the future.
- A survey of Muslim consumers in Türkiye revealed significant gaps in public awareness around animal welfare in halal practices. However, many demonstrated a willingness to change their behavior when given accurate information. The post Halal’s Animal Welfare Gap: What Muslim Consumers Believe And Know appeared first on Faunalytics.
- The Center for Open Science (COS) is introducing the Open Scholarship Training for Researchers Series, a collection of seven self-paced online courses developed by COS in response to what researchers have told us they actually need. Enrollment is now open for the first two courses, with additional courses launching through Winter 2026.
- This talk was recorded live at Vision Weekend USA, held December 5–7, 2025 in the Bay Area. Vision Weekends are our flagship conference series, bringing together leading scientists, entrepreneurs, funders, and policymakers to explore frontier science and technology and to imagine paths toward flourishing futures. Hosted on Acast. See acast.com/privacy for more information.
- Haleh Fotowat | Harnessing Biological Intelligence for Building Living Machines with Nervous Systems. This talk was recorded live at Vision Weekend USA, held December 5–7, 2025 in the Bay Area. Vision Weekends are our flagship conference series, bringing together leading scientists, entrepreneurs, funders, and policymakers to explore frontier science and technology and to imagine paths toward flourishing futures. Hosted on Acast. See acast.com/privacy for more information.
- Import AI 456: RSI and economic growth; radical optionality for AI regulation; and a neural computer. What laws does superintelligence demand?
- Greetings from a world where…...
- an oddly specific but brief gripe-post
- A forecaster's breakdown of the Hondius cruise ship outbreak
- I wrote this essay as a submission to Dwarkesh Patel’s blog prize, though I have been meaning to write this up for a while. Usually, for a company to become profitable, they need to increase revenue, decrease costs, or some mixture of the two.
- The biggest hook they had in me was this fear that I’m dangerously inadequate and *they* somehow held the keys to mitigating that.
- What I'm reading, May '26, pt.1
- I used an LLM to help draft this post, but I’ve edited/rewritten it extensively and endorse it. AI in Context is a channel about transformative AI and its risks, published by 80,000 Hours. Writing up our current approach to thumbnails, which is nowhere near perfect, for easy shareability and cross-pollination of lessons. Would love to hear what other people are trying! Making thumbnails.
- A few years back, I got a big pile of money from working at a tech startup. I put a lot of that money into a donor-advised fund. Since now I make hardly any money, that DAF might represent the majority of my lifetime donations. How much of my DAF should I donate per year? In particular, how much should I donate in light of short AI timelines? I created a simple model to answer this question.
- Most people have an above average number of legs, and what that means for our political imagination
- And a review of girl scouting in general. The post Book review: Girl Scout Handbook 1956 appeared first on Otherwise.
- Engineered pathogens pose a grave threat to society, plausibly constituting an existential risk (‘x-risk’) to humanity. Yet remarkably few people are working full-time on this problem. By my count, there are ~160 people on the planet whose full-time job is reducing bio x-risk. This entire group could fit on a single short-haul flight.
- What can countries with high stunting rates today learn from Japan’s experience of going from 70% to 5%?