Effective Altruism News
- Michael Thatcher, President and CEO of Charity Navigator: “There are a lot of problems in the world, and so figuring out where you can have the highest level of impact with the resources that you have is actually the smartest thing you can do.” See more impact stories at 👉 effectivealtruism.org/stories #EffectiveAltruism #EffectiveAltruismStories
- OpenAI insists it doesn’t fund or direct LTF — but one of the super PAC’s operatives describes it as a “corporate funder” with “a say”...
- You can save lives by writing on the internet, says a guy writing on the internet
- A qualitative study of German slaughterhouse workers reveals how they manage — and occasionally struggle with — the emotional demands of killing animals. The post How Slaughterhouse Workers Learn To Emotionally Detach appeared first on Faunalytics.
- Polo GGB opened its doors to local high school students with the aim of bringing young people closer to the world of scientific research and raising awareness about malaria and vector control. Through direct interaction with our researchers, students are introduced to the innovative technologies and research projects carried out at Polo GGB, including the […].
- I used an LLM to help draft this post and it likely contains >10% AI-generated text, but I’ve edited/rewritten it extensively and endorse it. TL;DR: It’s unclear how much the intelligence explosion will directly affect agriculture, because it's one of the least cognitive-labor-intensive industries.
- Just beyond central Vancouver, the Squamish Nation is building one of the most ambitious and unusual housing developments in the world, and getting rich in the process. How they did it has lessons for
- [Update (June 11, 2026): Anthropic has since "un-silenced" the new safeguards (source).]. [Thanks to Julian Minder for helpful discussion and review.]. Claude Fable 5 and its new safeguards. Yesterday, Anthropic publicly released Claude Fable 5. Fable 5 is a Mythos-class model – a model class above Opus, Anthropic's previous premium tier – and, as assessed by multiple benchmarks, it is...
- Suffering-focused ethics (SFE) is a family of moral views that gives special priority to reducing suffering. As you might know, we at the Center for Reducing Suffering find SFE deeply compelling—it is, after all, the backbone of our work. Part of our mission is to research and build a field around SFE. Unfortunately, SFE remains highly neglected in both academia and broader moral discourse.
- Fresh features in time for the World Cup!
- Detecting Hidden Behaviors in LLMs via Activation-matched Finetuning — preprint, 2026. [ Paper] [ Code]. TLDR. Given a model with some unknown, abnormal behavior (backdoors, censorship, reward hacking,...), construct an aligned reference by training a clean model to match the suspect's residual-stream activations on a benign prompt corpus.
- California YIMBY is excited to announce our endorsement of Xavier Becerra to be the next Governor of California. This election will be pivotal for California’s future. And the choice could not be any clearer. Xavier Becerra is the best candidate… The post California YIMBY Endorses Xavier Becerra for Governor appeared first on California YIMBY.
- “If you need your working day to be fulfilling, if you need to feel like you’re making a difference, trust that voice, because it means you have the passion to actually make a difference in the world. Pursue it, because we need more people with that passion doing good work.”
- Last week, the AI company Anthropic released a blog post titled “When AI builds itself”. This led to a media frenzy, with news outlets around the world publishing headlines that the company was urging a global pause on AI development, or calling for AI non-proliferation. However, the post does not call for a pause.
- TL;DR: Recent work from Goodfire & UK AISI – Verbalized Eval Awareness Inflates Measured Safety – shows that newer open-weight models verbalize evaluation-awareness (VEA) more often, and that this inflates measured safety. Between OLMo-3-32B-Think and OLMo-3.1-32B-Think – identical base, SFT, DPO, and RL data, differing only in an additional ~3 weeks of the RLVR stage – VEA roughly doubles.
- We are seeking a Policy and Research Associate to join our team to address poverty and insecurity in low and middle-income countries by incubating solutions, rigorously testing them in the field, and working with local partners to scale what works.
- The Data and Research Associate will primarily work with Jishnu Das on (a) rolling out a health insurance study in Kenya and Nigeria; (b) harmonizing and analyzing a unique dataset of Standardized Patients studies; and (c) providing support on IRB applications and new data collection in the field.
- When somebody says something, either they mean it, or they are responsible for meaning it.
- And a vibes-based assessment of what they mean.
- Today we celebrate that, thanks to our donors, Ayuda Efectiva has already saved 1,000 lives. But big numbers are hard to visualize: what does that milestone really mean?
- (see full author list at the end). About a year ago, METR showed that the length of tasks frontier models can reliably complete doubles every few months. A related safety-relevant question is this: what length of tasks can models complete without any chain of thought (CoT)? We investigate in our new paper.
- This is a short post to explain a distinction between three different types of model organism (MO) research. Type: worst-case model organisms. Purpose: stress-test safety and control techniques by making the problem as hard as possible. Examples: password-locked models for capability elicitation; sleeper agents for stress-testing alignment training; red-team malign inits in control.
- Models' no-CoT time horizon has doubled roughly every year.
- Statement: I'm far from an EA or AI expert; these are all opinions from a 19-year-old EA layperson (and are probably biased or woefully wrong). You are welcome to give me any critique in the comments (rather than just downvoting me).
- This working draft of AI Now’s upcoming report traces corporate power in the data center industry in the United States, focusing on the flows of money and power that determine who both drives and benefits from the current data center boom. The aim of this research is to help local communities and their advocates fight […].
- In this case study, we used a method called process tracing to demonstrate the impact of our Animal Product Impact Scales on Anima International France and their decision to change their organizational strategy. The post Tracking Our Direct Impact: A Case Study Using Process Tracing appeared first on Faunalytics.
- UK AISI, Model Transparency Team. Epistemic status: Most experiments were run over a period of ~2-3 days during a hackathon at UK AISI, and were fairly heavily vibe coded. Expect some of this to be rough around the edges. Tl;dr:
- EA Forum Digest #295 Hello!. CEA is hiring for a financial controller, a recruiter, and for roles on the Events team. All roles listed here. It’s also organisation update week, so check out this thread for jobs, research updates and opportunities relating to EA orgs. — Toby (for the Forum team) We recommend:
- This post is based on my personal views, which mostly overlap with the views of my employer ControlAI but does not necessarily fully reflect them. This applies in particular, but not exclusively, to technical opinions about AI development and geopolitical predictions. You might’ve heard that superintelligent AI (ASI) poses extreme risks like human extinction and other comparably undesirable...
- How do we know when the world has changed? On June 1, a team of scientists published a preprint scientific paper claiming they had edited human embryonic DNA with more precision than any previous attempt. As a technical achievement, the work is undoubtedly impressive, largely avoiding the errors that had accompanied earlier efforts to gene […]...
- guest post!
- The Claude Fable 5/Mythos 5 System Card has a section in which they talk about illegible reasoning, and provide an "extreme" example thereof. Models developing their own uninterpretable, unmonitorable internal language has been a major theoretical concern for a while, and when o3 was released last year with its disclaim overshadow disclaim vantage style word salad CoT, it seemed like the...
- Your farmed animal advocacy update for early June 2026
- A new outlet for discussion in Ulster
- The post Optimizing Government-Led Community Health: A New Model for Sustainable Scale appeared first on Living Goods.
- This is a linkpost for https://www.anthropic.com/news/claude-fable-5-mythos-5.
- Ten years ago, a shocking discovery sparked a movement. Today, Crustacean Compassion is celebrating a decade of changing how the world sees and treats crabs, lobsters, prawns and crayfish.
- I grew up in South Florida, which leads the nation in drowning deaths for children.
- A simple taxonomy of the main proposals for post-AGI universal redistribution
- I'm a freelance web designer and developer who has been concerned about AI and prioritising a transition into AI safety since late 2025. This post is a summary of my experience so far, as a possibly useful addition to the conversation around the need for generalists in AI safety.
- In my post “ Why I’m not a Bayesian”, I argued that the Bayesian approach of assigning credences to propositions with binary truth values only works in simple and restricted domains. Instead, I claimed, a better approach to epistemology is to assign degrees of truth to models of the world.
- Over the past 15 months or so, ARC's technical agenda has developed quite a bit. The advent of the Matching Sampling Principle (MSP), and ideas like it, has begotten a host of concrete technical problems; progress on those problems has given us more philosophical clarity on the big picture, which has led to even more technical progress.
- June 2026: We've just launched this program and are inviting the first Affiliates. We expect to invite more over time; register your interest below. About the program Research Affiliates pursue their own research directions for reducing risks of astronomical suffering (s-risks), with CLR’s funding, affiliation, and research community.
- In a new paper in Cyber Security: A Peer-Reviewed Journal, Sarah Powazek, Director of CLTC’s Public Interest Cybersecurity Program, addresses the challenge of “usability” in cybersecurity, particularly for…. The post New Paper Highlights the Need for Usable Cybersecurity appeared first on CLTC.
- In an op-ed published by Tech Policy Press, Ann Cleaveland, Executive Director of the Center for Long-Term Cybersecurity, argues that, in the face of significant new cyber threats…. The post Op-Ed Calls for “Project Kaleidoscope” to Bolster Community Cyber Defense in the Age of AI appeared first on CLTC.
- TL;DR: My new prior is that top-of-the-line LLMs working on easy tasks generate code that is maybe 10% more complicated than necessary. I also think we accept this complexity too easily, because it comes from code that is right here, right now, solving an immediate problem.
- TL;DR: What is slop, and why? Is it fundamental? Is it in the room with us right now? And, most importantly, how do we exorcise it? Previously in this series: This Week In Fashion and On Automatic Ideas. A potential post for this Substack starts when I pick up an idea by talking to a smart person or revisiting an evergreen topic.
- You won’t believe how low big tech has stooped in their slime campaign against Alex Bores...
- The battle lines of the AI morality debate are being laid down. On one side you have the ChatGPT dogma: AI as mere tools with no real preferences or even beliefs. On the other you have the twitter AI whisperers: AIs as complex beings with rich personalities and desires which deserve our respect. And in the middle you have the official Anthropic line, that they are genuinely uncertain, as is...
- This study reveals how guided Arctic king crab tours normalize animal suffering through storytelling, shaping tourist behavior, and masking ethical concerns. The post Safari Of Suffering: The Reality Of King Crab Tourism appeared first on Faunalytics.
- works better than you'd think
- The Great Exhibition Road Festival is a free annual celebration of science and the arts each summer in South Kensington, led by Imperial College London. Visitors could enjoy hands-on workshops, interesting talks, performances and installations from iconic museums, research and culture organisations in South Kensington.
- By Abhi Kumar, Associate Program Officer in Farm Animal Welfare. Note: We used AI (Claude) to draft this post from other documents related to this RFP. All content was reviewed by Abhi and the CG team for accuracy. Over 100 billion animals are farmed and slaughtered for food every year.
- "LLMs just imitate humans." An often-repeated claim about AI, and it's false. In this clip from Modern Wisdom, Eliezer Yudkowsky breaks down how the recent breakthrough of applying reinforcement learning to chain of thought lets models move past imitation. Have the model take 20 attempts at a problem, find the one that works best, then train it to think more like that successful attempt.
- Grateful to Benjamin Vincent and Alex Rubinsteyn for our many conversations on this topic, and comments on drafts of this essay! Introduction. When most people hear of a “cancer vaccine,” they’ll think of normal vaccines. Perhaps they’ll even think of what ostensibly is a cancer vaccine: the HPV vaccine.
- I often use what I’ll call the “safety-usefulness tradeoff model”, which is: developers face a tradeoff between "safety" and "usefulness" of an AI deployment, and the developer has only limited willingness or ability to sacrifice usefulness for the sake of safety.
- When is "increasing safety budget" a useful concept?
- TL;DR: Bun is a very large and very influential open-source project. It is being migrated from the easier-to-read Zig programming language to harder-to-read but memory-safe Rust. This is done almost entirely by the AI tool Claude Code.
- When the world wakes up to the unacceptable danger of AI development, what happens to those responsible? The Berkeley trials, perhaps.
- "We see these AIs as a galaxy glittering with capabilities, but at their center, invisible to the naked eye, holding all the constellations together, is an unimaginably massive black hole of data."
- Executive summary
- Most flags used to be ugly. They were probably better that way.
- Hi everyone! Over the last six months or so, those of you who listen to the 80,000 Hours Podcast might occasionally have heard an unfamiliar voice asking questions to our guests. The person behind that unfamiliar voice is me, Zershaaneh! I'm not saying I'm also Banksy, but I'm not not saying that.
- She knew she wanted to help animals, she just couldn’t decide how. Becca Rogers had been sitting with that question since 2019, when she left PETA after 1.5 years doing undercover work and stepped into a tech ed company. She still cared deeply about animals and she needed to find her way back, but the […]...
- The post Our top tips for becoming a better applicant appeared first on 80,000 Hours.
- We're not on indigenous land
- What does an AI even ‘want’ anyway?
- A survey of 500 U.S. dog guardians explores how ethical beliefs about animals influence training methods, showing that human-centered views are linked to punishment while welfare-focused views favor gentler approaches. The post Ethical Beliefs Shape How People Train Their Dogs appeared first on Faunalytics.
- When will markets price the singularity?
- At Clearer Thinking, we're running a collaboration survey about the various psychological challenges related to working on high-impact problems (e.g., existential risk, AI safety, climate change, animal welfare, global health, bio/nuclear safety, and other topics), and what people find helpful in dealing with those challenges. We are interested in hearing from you whether you...
- In just 10 days over the summer of 1854, 500 people died of cholera in the Soho neighborhood of London. The city’s population had more than doubled to 2.3 million people in the first half of the 1800s, and its sewage system could not keep up. But the streams of human waste flowing into the […]...
- If by whiskey.....
- Greetings from a world where…...
- In philosophy of mind, "mental causation" means mental entities have causal effects, especially physical ones. If physicalism is true, then physical effects are explainable in terms of physical causes (or at least, fundamental physical laws), needing no recourse to causation by anything that is not in fundamental physics.
- We're looking at livelihoods research again (contribute ideas or reach out to support the team with your expertise), sharing our new methods site, and releasing updated SADs guidance. Looking at livelihoods and growth again.
- "We are approaching a runaway to superintelligence that could threaten our shared human future."
- Image credit: Jebulon. To prevent superintelligent AI from killing everyone, I would like there to be a strong international agreement banning the development of ASI until it can be proven safe. But that sort of agreement requires a lot of political buy-in and coordination. In the meantime, it may be easier to get light-touch AI safety regulations passed.
- Hive Slack Threads: May
- Instead of using static position increments (+1) per token, RoPE-based language models can learn per-token and per-layer position increments. This has no detectable effect on model performance but allows us to see what the model thinks the distance is between each position and how this varies per-layer... Example sentence with each character plotted based on per-layer learned position increments.
- Crosspost. The best charities save lives for a few thousand dollars. If you earn $100,000 per year and give away 10%, you can save about 100 people over the course of your life. I think we generally aren’t good at visualizing what’s really at stake. So here is me attempting vaguely to grok 100 lives. Ted Bundy killed around 30 people.
- What happens after we #PauseAI
- We introduce an evaluation for activation verbalizers: can they surface a target model's reasoning as it solves a math problem in a single forward pass? For open-weight NLAs, the answer seems to be: "possibly, but definitely not reliably". Lots of important capabilities currently require AI models to reason "out loud" in a natural-language chain of thought, which means that we can monitor...
- Plus some other stuff
- When I speak to you one-on-one, assuming you’re listening, we’re in parity.
- I'm leading a non-profit team building a pathogen-agnostic early-warning system. As AI systems become increasingly capable substitutes for human biologist expertise, the risk that someone could engineer a pathogen to spread widely before detection is going up.
- Epistemic status: don’t know whether I actually believe all of this, but I think it’s worth considering. A “corrigible” agent, per the LW wiki, is: …one that doesn’t interfere with what we would intuitively see as attempts to ’correct’ the agent, or ’correct’ our mistakes in building it; and permits these ’corrections’ despite the apparent instrumentally convergent reasoning saying otherwise.