Convergence 2024 Impact Review. Impact overview. 2024 marked the first full year with the new 9-person Convergence Analysis team. This year we published 20 articles on understanding and governing transformative AI. Our research impacted regulatory frameworks internationally.
Why you can build a European airliner, but not a European Google
Opinion | The case for thoughtful cellphone restrictions in schools
For schools adopting cellphone policies, there is no one size fits all approach, and the Legislature is right to leave the specifics of cellphone restrictions up to individual districts. spriyabalasubr…
Tue, 03/25/2025 - 05:43...
People say they prefer stories written by humans over AI-generated works, yet new study suggests that’s not quite true
We conducted a study to test whether this preference of humans over AI in creative works actually translates into consumer behavior.
More color! The post Clothing prints could be more beautiful appeared first on Otherwise.
We often talk about ensuring control, which in the context of this doc refers to preventing AIs from being able to cause existential problems, even if the AIs attempt to subvert our countermeasures. To better contextualize control, I think it's useful to discuss the main countermeasures (building on my prior discussion of the main threats).
I’m proud to announce this special World Aquatic Animal Day session from ICARE. Giulia Malerbi from the Aquatic Life Institute (ALI) joins moderator Dr Paulina Siemieniec to discuss the fight for stronger legal protections for aquatic life, with a spotlight on preventive bans against octopus farming. Learn more & Register.
I am super excited to announce that we've officially started shooting our feature-length documentary on the risks posed by powerful AI.
I recently decided to check how much I've donated to GiveWell's Top Charity Fund over the last three years. I only got properly bought into EA in 2021/22, and only started donating on a monthly basis in Jan 2022. I also don't earn a huge amount, so haven't been donating much monthly in real terms, although 18 months ago I stepped up my donation amount.
Here's my blog https://benthams.substack.com/
Standard decision theory says that all decisions combine two key factors: opinions on values, and beliefs about facts.
Overview: By training neural networks with selective modularity, gradient routing enables new approaches to core problems in AI safety. This agenda identifies related research directions that might enable safer development of transformative AI. Introduction.
In this doc, I'll try to explain my current understanding of the high level methods for handling non-concentrated failures with control. I'll discuss the regimes produced by different methods and the failure modes of these different regimes. Non-concentrated failures are issues that arise from the AI doing a large number of problematic actions over a long period.
If we naively apply RL to a scheming AI, the AI may be able to systematically get low reward/performance while simultaneously not having this behavior trained out because it intentionally never explores into better behavior. As in, it intentionally puts very low probability on (some) actions which would perform very well to prevent these actions from being sampled and then reinforced.
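A minimal toy sketch of this failure mode, assuming a two-action bandit trained with plain REINFORCE. The setup, the reward values, and names such as `train_bandit` are illustrative assumptions, not taken from the post. A policy that starts with a vanishingly small probability on the high-reward action never samples it, so that action is never reinforced.

```python
import math
import random


def softmax(logits):
    """Convert logits to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_bandit(logits, rewards, steps=2000, lr=0.5, seed=0):
    """Run plain REINFORCE on a bandit; return the final action probabilities."""
    rng = random.Random(seed)
    logits = list(logits)
    for _ in range(steps):
        probs = softmax(logits)
        # Only sampled actions can ever receive reward and be reinforced.
        action = rng.choices(range(len(probs)), weights=probs)[0]
        reward = rewards[action]
        # REINFORCE: d log pi(action) / d logit_k = 1[k == action] - probs[k]
        for k in range(len(logits)):
            grad = (1.0 if k == action else 0.0) - probs[k]
            logits[k] += lr * reward * grad
    return softmax(logits)


REWARDS = [0.1, 1.0]  # hypothetical rewards: action 1 is far better

# A policy that explores normally learns to prefer the better action.
honest = train_bandit([0.0, 0.0], REWARDS)

# A policy that deliberately starts with almost no mass on the better
# action never samples it, so training never moves it there.
sandbagging = train_bandit([0.0, -40.0], REWARDS)

print("explores normally, P(best action):", honest[1])
print("withholds exploration, P(best action):", sandbagging[1])
```

In this toy setting the second policy keeps collecting the low reward indefinitely. The sketch says nothing about how a real model could implement or sustain such a strategy.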
This is a brief overview of a recent release by Transluce. You can see the full write-up on the Transluce website.
AI systems are increasingly being used as agents: scaffolded systems in which large language models are invoked across multiple turns and given access to tools, persistent state, and so on.
Thanks to Jesse Richardson for discussion. Polymarket asks: will Jesus Christ return in 2025? In the three days since the market opened, traders have wagered over $100,000 on this question. The market traded as high as 5%, and is now stably trading at 3%. Right now, if you wanted to, you could place a bet … Continue reading Will Jesus Christ return in an election year? →...
I stand with Sisson
Researchers surveyed the perceptions and attitudes of key Chinese workers towards farmed chickens, who endure some of the worst living conditions in cage housing systems. The post What Do Chinese Poultry Producers Think About Animal Welfare? appeared first on Faunalytics.
Summary: Understanding how people first encounter EA and what helps them get involved in EA is important for understanding how to optimize recruitment. With the Meta Coordination Forum survey suggesting that it would be ideal for the EA community to grow rapidly, this is of high importance.
Don't let the title fool you - it's practical and for everyone
More important things happening in biotechnology and medicine
Note: We know this piece is a little longer than usual, so feel free to jump to the 'Wrap-up and Takeaways' section at the end. However,...
on the gap between available-in-theory and used-in-practice
Plus: Consciousness and LLMs, human augmentation, and realistic cyber offense testing
Today marks the launch of a new federal election campaign from Australians for AI Safety, bringing together leading AI experts to advocate for crucial safety policies ahead of the federal election. Good Ancestors has been leading the coordination of this initiative over the past few months and we need your help now! What the campaign includes:
How vaccines cause odalisques
Greetings from a world where…
CASCADE Outcome Assessment for Future Food Safety and Basic Nutrition Training of Food Vendors. gloireri
Mon, 03/24/2025 - 11:27
A. Background. The Private Service Provider (PSP) knowledge, attitudes, and practices (KAPs) assessment was done to see how well PSPs who were trained by the CASCADE project and worked with the National Home-Grown School Feeding Program (NHGSFP) remembered and...
The African Union (AU) has recognized the potential of gene drive technology, endorsed its development and established a high-level panel in 2017 to foster a conducive environment for research, development of regulatory frameworks, and stakeholder engagement across African Union Member States, while ensuring a collaborative approach to the ongoing fight against malaria. With support from […].
Call for Applications: Co-Innovation Platform
gloireri
Mon, 03/24/2025 - 07:52
Call for Applications. Join the First Cohort of Co-Innovation Partnerships for Nutrition-Sensitive Social Protection. Apply Now. Are you working to enhance the nutrition impacts of social protection systems?
In many countries, a significant share of aid is spent domestically on hosting refugees, offering student scholarships, and administrative costs.
I describe a threat model by which AI R&D capabilities could cause harm, a specific capability threshold at which this risk becomes unacceptable, early warning signs for detecting that threshold, and the protective measures needed to continue development safely past that threshold. I recommend that labs start measuring for the warning signs today.
1. Introduction. One of the most reasonable ethical aims from a variety of perspectives is to focus on s-risk reduction, namely on steering the future away from paths that would entail vastly more suffering than Earth so far.
Robert Evans has made an in-depth four-part series on the Zizians, which includes a lot of background on Rationality in the Bay Area, CFAR, and a lot of the ideas leading up to the murders (first three episodes).
Part One: The Zizians: How Harry Potter Fanfic Inspired a Death Cult. Part Two: The Zizians: Birth of a Cult Leader. Part Three: How The Zizians Went Full On Death Cult.
this week in security — march 23 edition
Paragon spyware customers and victims revealed, CISA scrambles to contact fired staff, Wiz to join Google Cloud, and more. ~this week in security~. a cybersecurity newsletter by @zackwhittaker
volume 8, issue 12
THIS WEEK, TL;DR. Paragon spyware used by at least...
All across the world, people are having fewer children. The birth rate is declining quickly in countries ranging from the United States to Finland to Mexico and Turkey. Fertility data used to be a fairly esoteric concern. Not anymore. Vice President JD Vance talks about it regularly, and Elon Musk calls it civilization’s greatest threat. There’s […]...
TL;DR: Promise: Far-UVC demonstrates remarkable air-cleaning capabilities. In one of the most promising studies, its effectiveness was equivalent to “changing the air completely over in the room 184 times every hour,” compared to the CDC’s recommendation of 5+ changes—and even hospital operating rooms only require 20 (Wood 2020).
80K recently shifted their strategic priorities to focus on AGI (see "80,000 Hours is shifting its strategic approach to focus more on AGI", EA Forum). As a uni group organiser, 80K advising has been my 'go to' recommendation for people who want careers advice beyond what I can provide. This has been a pretty universal recommendation regardless of career / cause area interests.
Hi everyone. So far, Vetted Causes has published two charity reviews, and people have questioned our intentions and criticized us for being anonymous. Based on this I (the founder of Vetted Causes) have decided it’s best to reveal my identity. Previously, I’ve referred to myself as Isaac on this forum, but that was a pseudonym. My real name is Adam Hebert.
Crossposted from https://stephencasper.com/reframing-ai-safety-as-a-neverending-institutional-challenge/. Stephen Casper. “They are wrong who think that politics is like an ocean voyage or a military campaign, something to be done with some particular end in view, something which leaves off as soon as that end is reached. It is not a public chore, to be got over with.
Buy tickets now! June 6th - 8th
Introduction. Anthropic recently released Stage-Wise Model Diffing, which presents a novel way of tracking how transformer features change during fine-tuning. We've replicated this work on a TinyStories-33M language model to study feature changes in a more accessible research context.
TL;DR Having a good research track record is some evidence of good big-picture takes, but it's weak evidence. Strategic thinking is hard, and requires different skills. But people often conflate these skills, leading to excessive deference to researchers in the field, without evidence that that person is good at strategic thinking specifically.
We made a long list of concrete projects and open problems in evals with 100+ suggestions! https://docs.google.com/document/d/1gi32-HZozxVimNg5Mhvk4CvW4zq8J12rGmK_j2zxNEg/edit?usp=sharing. We hope that makes it easier for people to get started in the field and to coordinate on projects.
Here's his blog https://substack.com/@truthandtradition
Here is mine https://benthams.substack.com/
Clear proof that Charlie Kirk makes lots of terrible arguments
Also better dinner parties and the seven habits of highly depolarizing people
It is relatively easy to identify a list of things that we want, in the sense of preferring a life with more of them to less of them.
Here, I debunk the debunkers: the moral skeptics.
I’m writing a new guide to careers to help AGI go well, in collaboration with 80,000 Hours.
From Trolleys to Drowning Children
If you’re anything like me — a policy dork who spends too much time on X — you’ve been unable to escape discussion of a new book called Abundance. Written by the Atlantic’s Derek Thompson and the New York Times’s Ezra Klein (also a co-founder of Vox), Abundance is one of those policy books with […]...
What markets think will happen to the "Gold Card" pathway to citizenship, the Department of Education, and the rule of law in the US
AI's biggest impact will come from broad labor automation—not R&D—driving economic growth through scale, not scientific breakthroughs.
TLDR: Consider indicating your approximate location on the EA Forum community members map, so local organizers can find you, and maybe take initiative yourself to organize lunch meetups, walks, coworking, or anything else you want, all within 10 minutes of where you live or work...
AIs are inching ever-closer to a critical threshold. Beyond this threshold lie great risks—but crossing it is not inevitable.
Transformer Weekly: California report on frontier AI risks, a new AI benchmark, and more Action Plan comments
Author's note: Hi everyone! This is a cross-post from my blog. It's a short, accessible piece intended for those who find the idea of measuring suffering "icky," uncomfortable, or cold. I've noticed this as a fairly common reaction to effective altruism, and I wanted to write from a place of sincerely empathising with it.
I recently left OpenAI to pursue independent research. I’m working on a number of different research directions, but the most fundamental is my pursuit of a scale-free theory of intelligent agency. In this post I give a rough sketch of how I’m thinking about that. I’m erring on the side of sharing half-formed ideas, so there may well be parts that don’t make sense yet.
I'm not writing about what's most important. The post Writing about non-AI topics feels weird appeared first on Otherwise.
Demographer Lyman Stone writes:
How do governments defend controversial laws that limit public oversight of factory farms? This study explores the rise of ag-gag laws in Canada. The post Ag-Gag Politics: How Governments Justify Secrecy In Agriculture appeared first on Faunalytics.
Editors’ Note: Ellen Aprill explains why the hybrid nature of the United States Institute of Peace (USIP), both a government and nonprofit entity, was at the heart of the standoff between Department of Government Efficiency (DOGE) and USIP officials earlier this week. The United States Institute of Peace (USIP) is the latest target of President …
Cate Hall is the CEO of Astera. She’s a former Supreme Court attorney and the ex-No. 1 female poker player in the world. Before joining Astera, she co-founded and served as COO and later co-CEO of Alvea, a pandemic medicine company that set the record for the fastest startup to take a drug candidate to Phase I clinical trial.
In this special episode, we feature Nathan Labenz interviewing Nicholas Carlini on the Cognitive Revolution podcast. Nicholas Carlini works as a security researcher at Google DeepMind, and has published extensively on adversarial machine learning and cybersecurity.
Dr. Alfredo Parra discusses why it's so important to address extreme suffering, highlighting cluster headaches as a vivid example of severe pain that's often overlooked by common global health metrics. Alfredo explains the shortcomings of tools like DALYs in representing the true severity of intense yet less prevalent pain conditions.
Some rich people have worldviews that are uncommon within Effective Altruism. These people might be on board with doing good, but less aligned with the pragmatic, calculated approach common in Effective Altruist circles. Last year, I joined a group of “strange stakeholders” investing in a co-created tantric retreat centre.
In the first few months of the Covid-19 pandemic, the media did not exactly cover itself in glory. To quote myself from an early February 2020 piece, when the virus had already been spreading for more than a month in China and the US already had confirmed cases: In the last week or so, new […]...
Sightsavers will host two events and two exhibition spaces at the event in Berlin on 2-3 April, calling on attendees to join us in targeting inequality.
The post Why AGI could be here by 2028 appeared first on 80,000 Hours.
The co-facilitators from Spain and Costa Rica have published the Zero Draft for the modalities of the Independent International Scientific…
Compassion in World Farming Polska is part of CIWF, an international organisation advocating for better conditions for farm animals, promoting plant-based diets, and pushing for more sustainable agriculture. Their campaigns have led to real changes for animals, and they are passionate about fighting injustice against them. Learn more and apply here. Media and Campaigns Specialist.
One of the skills I think is most valuable and I simultaneously do not have is the ability to write well.
The post When do experts expect AGI to arrive? appeared first on 80,000 Hours.
How the dismal science can help us end the dismal treatment of farm animals. By Martin Gould. Note: This post was crossposted from the Open Philanthropy Farm Animal Welfare Research Newsletter by the Forum team, with the author's permission. The author may not see or respond to comments on this post. This year we’ll be sharing a few notes from my colleagues on their areas of expertise.
To understand how close we are to transformative AI, here’s the metric I find most interesting right now: how long are the tasks AI can do?