Crossposted from https://stephencasper.com/reframing-ai-safety-as-a-neverending-institutional-challenge/. Stephen Casper. “They are wrong who think that politics is like an ocean voyage or a military campaign, something to be done with some particular end in view, something which leaves off as soon as that end is reached. It is not a public chore, to be got over with.
Buy tickets now! June 6th - 8th
Introduction. Anthropic recently released Stage-Wise Model Diffing, which presents a novel way of tracking how transformer features change during fine-tuning. We've replicated this work on a TinyStories-33M language model to study feature changes in a more accessible research context.
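As a rough illustration of what "tracking how features change" can mean in practice, here is a minimal sketch that compares each feature's direction before and after fine-tuning using cosine similarity. This is an assumption made for illustration, not necessarily the procedure used in Stage-Wise Model Diffing or in the replication; the function names and the toy vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as maximally dissimilar
    return dot / (norm_u * norm_v)


def most_changed_features(before, after, top_k=3):
    """Return (feature_index, similarity) pairs, least similar first.

    `before` and `after` are hypothetical lists of feature direction
    vectors, one per feature, taken before and after fine-tuning.
    """
    scores = [
        (index, cosine_similarity(b, a))
        for index, (b, a) in enumerate(zip(before, after))
    ]
    return sorted(scores, key=lambda pair: pair[1])[:top_k]


if __name__ == "__main__":
    # Toy data: feature 1 rotates by 90 degrees, the others stay put.
    before = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    after = [[1.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
    for index, similarity in most_changed_features(before, after):
        print(f"feature {index}: cosine similarity {similarity:.3f}")
```

Features with low similarity are the ones whose direction moved most, which makes them natural candidates for closer inspection.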
TL;DR Having a good research track record is some evidence of good big-picture takes, but it's weak evidence. Strategic thinking is hard, and requires different skills. But people often conflate these skills, leading to excessive deference to researchers in the field, without evidence that that person is good at strategic thinking specifically.
We made a long list of concrete projects and open problems in evals with 100+ suggestions! https://docs.google.com/document/d/1gi32-HZozxVimNg5Mhvk4CvW4zq8J12rGmK_j2zxNEg/edit?usp=sharing. We hope that makes it easier for people to get started in the field and to coordinate on projects.
Here's his blog https://substack.com/@truthandtradition
Here is mine https://benthams.substack.com/
Clear proof that Charlie Kirk makes lots of terrible arguments
Also better dinner parties and the seven habits of highly depolarizing people
It is relatively easy to identify a list of things that we want, in the sense of preferring a life with more of them to less of them.
Here, I debunk the debunkers: the moral skeptics...
I’m writing a new guide to careers to help AGI go well, in collaboration with 80,000 Hours.
From Trolleys to Drowning Children
If you’re anything like me — a policy dork who spends too much time on X — you’ve been unable to escape discussion of a new book called Abundance. Written by the Atlantic’s Derek Thompson and the New York Times’s Ezra Klein (also a co-founder of Vox), Abundance is one of those policy books with […]...
Having a good research track record is some evidence of good big-picture takes about AGI, but it's weak evidence. Strategic thinking is hard, and requires different skills. But people often conflate these skills, leading to excessive deference to researchers in the field, without evidence that that person is good at strategic thinking specifically.
What markets think will happen to the "Gold Card" pathway to citizenship, the Department of Education, and the rule of law in the US
AI's biggest impact will come from broad labor automation—not R&D—driving economic growth through scale, not scientific breakthroughs.
TLDR: Consider indicating your approximate location on the EA Forum community members map, so local organizers can find you, and maybe take initiative yourself to organize lunch meetups, walks, coworking, or anything else you want, all within 10 minutes of where you live or work...
AIs are inching ever-closer to a critical threshold. Beyond this threshold lie great risks—but crossing it is not inevitable.
Transformer Weekly: California report on frontier AI risks, a new AI benchmark, and more Action Plan comments
Author's note: Hi everyone! This is a cross-post from my blog. It's a short, accessible piece intended for those who find the idea of measuring suffering "icky," uncomfortable, or cold. I've noticed this as a fairly common reaction to effective altruism, and I wanted to write from a place of sincerely empathising with it.
I recently left OpenAI to pursue independent research. I’m working on a number of different research directions, but the most fundamental is my pursuit of a scale-free theory of intelligent agency. In this post I give a rough sketch of how I’m thinking about that. I’m erring on the side of sharing half-formed ideas, so there may well be parts that don’t make sense yet.
I'm not writing about what's most important. The post Writing about non-AI topics feels weird appeared first on Otherwise.
Demographer Lyman Stone writes:
How do governments defend controversial laws that limit public oversight of factory farms? This study explores the rise of ag-gag laws in Canada. The post Ag-Gag Politics: How Governments Justify Secrecy In Agriculture appeared first on Faunalytics.
Editors’ Note: Ellen Aprill explains why the hybrid nature of the United States Institute of Peace (USIP), both a government and nonprofit entity, was at the heart of the standoff between the Department of Government Efficiency (DOGE) and USIP officials earlier this week. The United States Institute of Peace (USIP) is the latest target of President …
Cate Hall is the CEO of Astera. She’s a former Supreme Court attorney and the ex-No. 1 female poker player in the world. Before joining Astera, she co-founded and served as COO and later co-CEO of Alvea, a pandemic medicine company that set the record for the fastest startup to take a drug candidate to Phase I clinical trial.
In this special episode, we feature Nathan Labenz interviewing Nicholas Carlini on the Cognitive Revolution podcast. Nicholas Carlini works as a security researcher at Google DeepMind, and has published extensively on adversarial machine learning and cybersecurity.
Dr. Alfredo Parra discusses why it's so important to address extreme suffering, highlighting cluster headaches as a vivid example of severe pain that's often overlooked by common global health metrics. Alfredo explains the shortcomings of tools like DALYs in representing the true severity of intense yet less prevalent pain conditions.
Some rich people have worldviews that are uncommon within Effective Altruism. These people might be on board with doing good, but less aligned with the pragmatic, calculated approach common in Effective Altruist circles. Last year, I joined a group of “strange stakeholders” investing in a co-created tantric retreat centre.
In the first few months of the Covid-19 pandemic, the media did not exactly cover itself in glory. To quote myself from an early February 2020 piece, when the virus had already been spreading for more than a month in China and the US already had confirmed cases: In the last week or so, new […]...
Sightsavers will host two events and two exhibition spaces at the event in Berlin on 2-3 April, calling on attendees to join us in targeting inequality.
The post Why AGI could be here by 2028 appeared first on 80,000 Hours.
The co-facilitators from Spain and Costa Rica have published the Zero Draft for the modalities of the Independent International Scientific…
Compassion in World Farming Polska is part of CIWF, an international organisation advocating for better conditions for farm animals, promoting plant-based diets, and pushing for more sustainable agriculture. Their campaigns have led to real changes for animals, and they are passionate about fighting injustice against them. Learn more and apply here. Media and Campaigns Specialist.
One of the skills I think is most valuable and I simultaneously do not have is the ability to write well.
The post When do experts expect AGI to arrive? appeared first on 80,000 Hours.
How the dismal science can help us end the dismal treatment of farm animals. By Martin Gould. Note: This post was crossposted from the Open Philanthropy Farm Animal Welfare Research Newsletter by the Forum team, with the author's permission. The author may not see or respond to comments on this post. This year we’ll be sharing a few notes from my colleagues on their areas of expertise.
To understand how close we are to transformative AI, here’s the metric I find most interesting right now: how long are the tasks AI can do?
We're seeking experts and researchers in food safety, toxicology, biomanufacturing, cell biology, food science, and other related fields to discuss cultured meat and food safety as part of our US govt-funded workshop series. The post Join our next working session on cultured meat safety: June 4, 2025 in Chicago, IL appeared first on New Harvest.
Ordinary yellow pineapples were once so precious they were rented for display at dinner parties, but centuries of innovation made them commonplace.
'Societal resilience' measures might offer some protection against the proliferation of dangerous AI capabilities
“Everything’s an emergency” is the title of a book that I have not read. It’s also the title of a very good substack by Bess Stillman, wife of the recently deceased Jake Seliger.
It’s now well established that for decades, major oil companies knew that burning fossil fuels would cause global warming, and yet did everything in their power to obstruct climate policy. They intensively lobbied policymakers, ran advertising campaigns, and funded think tanks to cast doubt on climate science. According to two new papers recently published in […]...
Old MacDonald had a... Shrimp?
Hello! Our favourite links this month include:
Yes, Shrimp Matter, a piece by the CEO of the Shrimp Welfare Project, explaining why he left a career in private equity to improve the lives (and deaths) of shrimp.
A new paper from Fin Moorhouse and Will MacAskill on...
This systematic review investigates the key factors influencing how zoo animal welfare is perceived by the public, as zoos strive to maintain public support while simultaneously fulfilling their conservation mission. The post Factors Influencing Public Perceptions Of Zoo Animal Welfare appeared first on Faunalytics.
This systematic review investigates the key factors influencing how zoo animal welfare is perceived by the public, as zoos strive to maintain public support while simultaneously fulfilling their conservation mission. The post Views Of Zoo Animal Welfare Are Complex And Contradictory appeared first on Faunalytics.
We’re excited to announce the Vancouver Forum for Effective Altruism! Whether you’re new to Effective Altruism, or have already made high-impact work a part of your life, you’re invited to join the Forum. Join us this April 4-5 in Vancouver, Canada to make new connections, learn about the latest in EA, and enjoy a meal together... Registration is now open!
Will MacAskill went on Rob Wiblin’s 80k hours podcast last week to talk about AI.
There is good reason to think that lawyers may be one of the most automatable professions under the current trajectory of AI development.
Plus: DOD looks to accelerate software acquisition, Trump floats CHIPS repeal, White House tries to cut off maintenance of Chinese chip tools, and the confirmation hearing for new OSTP head. The post OpenAI and Anthropic try to fend off competition with new models, ideas for the U.S. AI Action Plan, and important new misalignment research appeared first on Center for Security and Emerging...
TL;DR: In a sentence: We are shifting our strategic focus to put our proactive effort towards helping people work on safely navigating the transition to a world with AGI, while keeping our existing content up. In more detail: We think it's plausible that frontier AI companies will develop AGI by 2030.
I made the following infographic adapting the Introduction to Effective Altruism post. The goal I set for myself was to make the post more accessible and easier to digest to broaden its potential audience, and to make the ideas more memorable through graphics and data visualizations.
This article is the long, in-depth version of the piece we wrote for Queer Majority, diving deeper into the data and exploring...
Here's Tovia's YouTube channel https://www.youtube.com/@ToviaSinger1/videos
The post The top 25 happiest countries in 2025 plus our four favourite findings from the 2025 World Happiness Report appeared first on Happier Lives Institute.
Tl;dr: Despite huge investment and continuing growth, alt proteins remain an uncertain cause area. A huge number of surveys and some data from the real world have failed to produce a consensus around the effectiveness of alternative proteins in replacing animal products; however, the preponderance of evidence supports the idea that alternative proteins have displaced animal products to some...
"I'm coming around to the maybe-lines-just-keep-going-up-indefinitely thing"
In Sparse Feature Circuits (Marks et al. 2024), the authors introduced Spurious Human-Interpretable Feature Trimming (SHIFT), a technique designed to eliminate unwanted features from a model's computational process. They validate SHIFT on the Bias in Bios task, which we think is too simple to serve as meaningful validation. To summarize:
Students from around the world can apply by March 25, 2025 to receive a cultured meat starter kit plus $1000 to get your cell ag research off the ground. The post Meat the Future Challenge: Win $1000 USD and an Integriculture Cultured Meat Starter Kit appeared first on New Harvest.
EA Forum Digest #232
Hello! I liked a lot of posts this week, and I hope you have an extra minute! (JP, for the Forum team)
We recommend: Existential Choices Debate Week led to a bunch of good posts:
Read the live symposium comment thread with Will MacAskill and other special guests (Toby Tremlett🔹) — 150 comments!
11 recommendations to recruit the world's top AI talent
Like a collector, I gather resources and information, storing and categorizing them. I do this for any project I undertake. But for Forecasting AI Futures, I realized the collection might be quite useful for others. So I set to work restructuring it and making it look nice! Here is the result: Forecasting AI Futures Resource Hub.
Summary: We propose measuring AI performance in terms of the length of tasks AI agents can complete. We show that this metric has been consistently exponentially increasing over the past 6 years, with a doubling time of around 7 months.
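To make the doubling-time claim concrete, here is a minimal sketch of the arithmetic behind a constant doubling time: task length after t months is the starting length times 2^(t/7). The 7-month figure comes from the summary above; the 60-minute starting point, the function names, and the assumption that the trend continues unchanged are illustrative only.

```python
import math

DOUBLING_TIME_MONTHS = 7.0  # figure quoted in the summary above


def projected_task_length(current_minutes, months_ahead,
                          doubling_time_months=DOUBLING_TIME_MONTHS):
    """Extrapolate task length, assuming the doubling time stays constant."""
    return current_minutes * 2 ** (months_ahead / doubling_time_months)


def months_until(current_minutes, target_minutes,
                 doubling_time_months=DOUBLING_TIME_MONTHS):
    """Months needed to grow from current to target task length."""
    return doubling_time_months * math.log2(target_minutes / current_minutes)


if __name__ == "__main__":
    start_minutes = 60.0  # hypothetical starting point, not from the source
    for months in (0, 7, 14, 28):
        length = projected_task_length(start_minutes, months)
        print(f"after {months:2d} months: {length:.0f} minutes")
    one_work_day = 8 * 60.0
    wait = months_until(start_minutes, one_work_day)
    print(f"months until an 8-hour task: {wait:.0f}")
```

Under these assumptions a 7-month doubling time corresponds to growth of roughly 3.3x per year, since 2^(12/7) is about 3.28.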
Cross-posted from my blog. Contrary to my carefully crafted brand as a weak nerd, I go to a local CrossFit gym a few times a week. Every year, the gym raises funds for a scholarship for teens from lower-income families to attend their summer camp program.
We often talk about ensuring control, which in the context of this doc refers to preventing AIs from being able to cause existential problems, even if the AIs attempt to subvert our countermeasures. To better contextualize control, I think it's useful to discuss the main threats. I'll focus my discussion on threats induced by misalignment which could plausibly increase existential risk.
I am excited to be taking on a new role as the Pan-African Engagement Officer. Having previously served as the Senior Stakeholder Engagement and Communications Advisor for Target Malaria Ghana, I am eager to continue to contribute to the project’s mission on a broader scale. My background is in the humanities, having earned a BA […].
What we've been reading: science, metascience, tech, housing, fertility and more...
In the last week of February, the Target Malaria Burkina Faso team held two meetings with the women’s associations of Bobo-Dioulasso and the educational community in the Karangasso Sambla district on the fight against malaria. The objective of these meetings was to inform these key stakeholders about research into genetic modification technology as a complementary […].
TLDR: ENAIS is teaming up with community builders from The Netherlands to seed AIS Netherlands. You can read about the role below, and if you would be interested in founding such an organisation, fill out this expression of interest form.
Faunalytics founder Che Green shares hard-won lessons and insights from 25 years of data-driven advocacy in animal protection. The post 25 Years, 10 Lessons: Insights From Faunalytics’ Founder Che Green appeared first on Faunalytics.
sorry regular readers I'm writing this because I want to be able to link it on X
Pig farming stakeholders tend to focus more on their current challenges than their vision for the future, highlighting gaps between industry and public expectations. The post Facing The Future Of Pig Farming appeared first on Faunalytics.
Reviewing progress and results from the first year of our 35-partner, 17-country, EU-funded consortium project. The post New Harvest hosts FEASTS 2nd General Assembly appeared first on New Harvest.
Fish Welfare Opportunities Ahead
Europe, Canada, Chile... There are plenty of chances to help fishes
Hello readers, welcome to the Animal Ask newsletter,
Thinking about what to include and highlight in this edition of our newsletter was a no-brainer: we have been...
The post Africa Health Agenda International Conference 2025: Leaders call for health financing reforms appeared first on Living Goods.
I have, like I suspect many readers, been in quite a bad mood for the last two months. My go-to joke explaining why — which I feel like should land with readers of this newsletter — has become: “I didn’t realize quite how much my overall optimism about the state of the world depended on […]...
Humanity’s battle against tuberculosis has been one of slow and imperfect progress. The disease no longer kills one in seven people in the US, as it did in the 19th century. But look elsewhere and its burden is still terrible: TB killed more than 1.2 million people in 2023, likely making it once again the […]...
The central question being discussed in the current debate is whether marginal efforts should prioritize reducing existential risk or improving the quality of futures conditional on survival. Both are important, both are neglected, though the latter admittedly more so, at least within EA.
This is a light, informal taxonomy of jobs that conceptually will display significant resistance to automation, even after the diffusion of transformative AI systems that can outperform humans in all forms of cognitive and physical labor. This is intended to provide practical, non-technical context around plausible human jobs in a post-AGI economy.
A radioactive model of personal identity
A simple parable about a drowning child sparks a moral revolution. Can AI help us do the most good in the world? Good Robot was made in partnership with Vox’s Unexplainable team. Episodes will be released on Wednesdays and Saturdays. For more, go to vox.com/goodrobot
Shopper-led investigation reveals that Whole Foods profits from one of the worst factory-farming practices, contradicting public ethical food claims and raising concerns about meat quality and nutrition. LOS ANGELES — Mercy For Animals’ new shopper-led report, “White Striping at Whole Foods,” spotlights the use of fast-growing chicken breeds, referred to as “Frankenchickens,” in Whole Foods’ […].
Shoppers led a groundbreaking investigation into Whole Foods’ chicken supply across North America—and the results point to shocking cruelty and a broken promise. The post The Proof Is In—Shoppers Record Evidence of Whole Foods’ Chicken Cruelty appeared first on Mercy For Animals.
Greater than the sum: Designing credit that works for women
For Peace, keeping her head above water as a business owner was challenging as she balanced caring for her six children. It wasn't until she gained access to loans and received financial training that she was finally able to expand her business, grow her income, and ensure her teenagers continued in school.
Video: Why spending smarter beats bigger budgets
The good news on global development is that key indicators, like child mortality and school enrollment, are better today than at any point in human history. However, the bad news is that though many more children are surviving, large numbers are not thriving. elewis@poverty…
Tue, 03/18/2025 - 22:13...