This is an interim report sharing preliminary results. We hope this update will be useful to related research occurring in parallel. Executive Summary. Problem: Qwen1.5 0.5B Chat SAEs trained on the pile (webtext) fail to find sparse, interpretable reconstructions of the refusal direction from Arditi et al.
This is an excerpt from the Introductions section to a book-length project that was kicked off as a response to the framing of the essay competition on the Automation of Wisdom and Philosophy. Many unrelated-seeming threads open in this post that will come together by the end of the overall sequence. If you don't like abstractness, the first few sections may be especially hard going.
Donald Trump has won a second term in the White House, and if his next administration is anything like his first, he’ll likely further weaken what few legal protections exist for animals. During his first four years in office, Trump’s Cabinet: When slaughterhouses became Covid-19 hot spots in the early days of the pandemic, Trump […]...
We are excited to introduce four new charities launched through our August-October 2024 Charity Entrepreneurship Incubation Program! This...
Interesting research: “Hacking Back the AI-Hacker: Prompt Injection as a Defense Against LLM-driven Cyberattacks”: Large language models (LLMs) are increasingly being harnessed to automate cyberattacks, making sophisticated exploits more accessible and scalable. In response, we propose a new defense strategy tailored to counter LLM-driven cyberattacks.
Despite industry marketing and claims of improved welfare standards for chickens, this report reveals that most companies are failing to make progress, and that misleading labels continue to undermine consumer choice. The post Chicken Industry Welfare Washing appeared first on Faunalytics.
The progress studies conference didn't have anyone to defend fusion energy, so I will.
And what the US government is doing about it
Thursday, November 7
Rethink Priorities has been conducting a range of surveys and experiments aimed at understanding how people respond to different framings of Effective Altruism (EA), Longtermism, and related specific cause areas. There has been much debate about whether people involved in EA and Longtermism should frame their efforts and outreach in terms of Effective altruism, Longtermism, Existential risk,...
Based on a simple ecosystem model with predators and prey, I argue that fishing may increase total harms to wild animals whereas agriculture may decrease total harms. However, as this is based on a very simplified model (it only includes …
There are self-locating probabilities
Since I started PauseAI, I’ve encountered a wall of paranoid fear from EAs and rationalists that the slightest amount of wrongthink or willingness to use persuasive speech as an intervention will taint the person’s mind for life with self-deception: that “politics” will kill their mind.
This summer, I participated in a human challenge trial at the University of Maryland. I spent the days just prior to my 30th birthday sick with shigellosis. What? Why? Dysentery is an acute disease in which pathogens attack the intestine. It is most often caused by the bacteria Shigella. It spreads via the fecal-oral route.
Really interesting research: “An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection”: Abstract: Large Language Models (LLMs) have transformed code completion tasks, providing context-based suggestions to boost developer productivity in software engineering.
The UK just announced plans to create a continuous metagenomic sequencing system for early detection of new respiratory pathogens. The system is based on point-of-care sequencing of patients showing up with severe respiratory infections, aiming for diagnosis within 6 hours and allowing the NHS to track emerging pathogens of either artificial or natural origin.
Feeling responsible for what you can’t control is pointless labor that no one asked for, like drinking poison and hoping your ebbing strength will magically be given to those in need.
Crossposting notes. This funding memo was written by Nuño Sempere and Rai Sur, the cofounders of Sentinel; I (Saul) am crossposting to the Forum with their permission. I've made some minor changes in formatting to match the Forum, but all of the content is the same.
Disclaimer: This is a weak take — mostly based on anecdotes and vibes — but I thought it’s worth sharing nonetheless, if only as a means of soliciting better takes. Disclaimer #2: I am very appreciative of everyone who is working tirelessly to pursue farmed animal protection via policy, and I want nothing more than to see it succeed!
Most matter pushes against nearby matter, via a positive “pressure.” That’s why Earth holds its shape, instead of collapsing down to a point mass.
2024 election takes
A wishlist
'Twas a Republican sweep. At least we have funny tweets? A politics update by Jacob Cohen (@Conflux).
The cleanest argument that current-day AI models will not cause a catastrophe is probably that they lack the capability to do so. However, as capabilities improve, we’ll need new tools for ensuring that AI models won’t cause a catastrophe even if we can’t rule out the capability.
TL;DR: EAGxVirtual 2024 (15–17 November) is coming up in less than 2 weeks! This free, online conference is an opportunity to learn from and meet with EAs in 75+ different countries. Applications are open until 14 Nov! Apply now. In our announcement post, we explained why we are hosting EAGxVirtual on 15–17 November – to connect the global EA community.
Wednesday, November 6
The case for cautious optimism
This month's Faunalytics Index provides facts and stats about companion animal care costs, the environmental impact of fur, shrimp farming, and more. The post Faunalytics Index – November 2024 appeared first on Faunalytics.
Human intervention in wild animal welfare is a complex issue within animal ethics. Depending on your risk tolerance, certain interventions on behalf of wild animals may be more justifiable. The post Risk Aversion In Wild Animal Welfare appeared first on Faunalytics.
About three hours ago I posted a brief argument about why one of the US presidential candidates is strongly preferable to the other on five criteria of concern to EAs: AI governance, nuclear war prevention, climate change, pandemic preparedness and overall concern for people living outside the United States.
A repeal of last year's executive order, for one thing
EA Forum Digest #214
Hello! Funding Strategy Week is now underway on the Forum (until 10th Nov). Follow the link to see all the posts that have been published so far. We’re particularly interested in hearing from people who have lost funding from Good Ventures, so consider contributing to this thread if this describes you.
This is not advice
Tuesday might have been the last traditional Election Day of my life in Washington, DC, where I’ve been voting for the past 12 years. The ballot included Initiative 83, a measure adopting ranked choice voting (RCV); it passed overwhelmingly. While it’s possible that the DC government could just refuse to implement the measure (they’ve done […]...
Huge US election volumes, Robinhood added presidential prediction markets, Richard Ngo on why he is not a Bayesian.
Microsoft is warning Azure cloud users that a Chinese-controlled botnet is engaging in “highly evasive” password spraying. Not sure about the “highly evasive” part; the techniques seem basically what you get in a distributed password-guessing attack:
GAIN@COP29
tsood
Wed, 11/06/2024 - 11:59
11-22 November 2024. Background. GAIN returns to the UNFCCC Conference of Parties (COP) committed to maintaining momentum and building upon the strong foundations established over the last few years.
TLDR: The shortest version of this argument is very simple: your expectations for an organization should be higher where their budget and staff size are higher. In other words, we should have different expectations for a 20-person organization with a $1.5 million budget than for a 2-person organization with a $150,000 budget.
Hi! We're researchers from Animal Charity Evaluators (ACE), a 501(c)(3) non-profit that uses evidence and reason to help people help animals. We conduct charity evaluations to identify the organizations that will likely make the most significant difference for animals.
We share initial findings from 84 global experts who contributed to the first part of our open consultation for France's 2025 AI Action Summit, revealing four key shifts needed in AI governance and concrete recommendations for action. The complete analysis will be presented at The Athens Roundtable on December 9.
People have tweaked the decoder-only Transformer architecture enough in 4 years that we’re apparently now calling the current recipe “Transformer++”.
i.e. a Transformer but with: a fused attention implementation (the scaled dot-product backend -> FlashAttention). Subquadratic memory complexity in input sequence length. Practically: can double GPU utilization and so halve training time.
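To make the memory point concrete, here is a toy sketch of the streaming ("online") softmax idea that fused attention kernels such as FlashAttention build on. It is plain Python for a single query vector, not the real GPU kernel, and the names `naive_attention`, `streaming_attention` and `block_size` are made up for illustration. The streaming version visits keys and values one block at a time and keeps only a running max, a running normalizer and a running weighted sum, so its extra memory does not grow with sequence length.

```python
import math


def naive_attention(q, keys, values):
    """Reference version: compute every score for one query, then softmax."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(dim)]


def streaming_attention(q, keys, values, block_size=2):
    """Same output, but keys/values are consumed one block at a time.

    Hypothetical illustration: only a running max, a running normalizer
    and a running weighted sum are stored between blocks.
    """
    scale = 1.0 / math.sqrt(len(q))
    dim = len(values[0])
    running_max = float("-inf")
    normalizer = 0.0
    acc = [0.0] * dim
    for start in range(0, len(keys), block_size):
        block_k = keys[start:start + block_size]
        block_v = values[start:start + block_size]
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in block_k]
        new_max = max(running_max, max(scores))
        # Rescale everything accumulated under the previous max.
        correction = math.exp(running_max - new_max)
        normalizer *= correction
        acc = [a * correction for a in acc]
        for s, v in zip(scores, block_v):
            w = math.exp(s - new_max)
            normalizer += w
            acc = [a + w * x for a, x in zip(acc, v)]
        running_max = new_max
    return [a / normalizer for a in acc]
```

Both functions should agree up to floating-point error for any block size; the difference is only in how much intermediate state is held at once.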
In my upbringing there was a deep sectarian division between the neds and the moshers. I was mosherlike (by default, being weird) but had some ned friends.
We would drive around the country lanes doing “circuits”, aimless circles looking for someone else wanting to race. They would blast happy hardcore and trance: “dance” music.
A preview of what Election Night will look like, and my relationship with politics, interspersed. By Jacob Cohen (@Conflux).
The Inter-American Development Bank and J-PAL LAC Launch Collaborative Visiting Program to Foster Evidence-Based Policy in Latin America and the Caribbean
The Inter-American Development Bank (IDB) and the Abdul Latif Jameel Poverty Action Lab Latin America and the Caribbean (J-PAL LAC) have officially launched a new Visiting Researcher Program, marking a new phase in the strategic...
I hear there was some famous guy who was big on the whole loving your neighbor thing
To succeed in a crowded international governance landscape, the AI Summit Series needs a clearly defined niche. Its defining traits should be that it: (a) invites companies and civil society to...
In favor of minimally competent utopia design
Some marine animals were historically feared but are now more favored and protected. Is it possible to also improve the social standing of sharks, who currently face significant bias? The post Effective Messaging For Sharks appeared first on Faunalytics.
Plus: New tunnels, monorails, canals, small modular reactors, and horseless carriages
Joseph Rotblat was born in Warsaw on November 4, 1908. He is one of the advisors I’d like to have on my shoulder, so on his birthday, I want to share a bit about his life and reflect on it. As a brief overview, Rotblat: Was the only scientist to leave the Manhattan Project on moral grounds.
Tuesday, November 5
I’ve been writing about the possibility of AIs automatically discovering code vulnerabilities since at least 2018. This is an ongoing area of research: AIs doing source code scanning, AIs finding zero-days in the wild, and everything in between. The AIs aren’t very good at it yet, but they’re getting better.
Here’s some anecdotal data from this summer:
EAGx Virtual and EA conferences around the world
Since stepping into the role of CEO at PxD in April, the past six months have been a whirlwind – challenging, rewarding and full of momentum. As I reflect on all the things we’ve accomplished... The post CEO Diaries: Six Months of Wins, Challenges, and Growth appeared first on Precision Development.
In June this year, Good Ventures announced that it would stop supporting certain sub-causes, and not expand into new cause areas by default. Neither Good Ventures nor Open Philanthropy included a public list of the sub-causes or organisations they were no longer supporting.
The US presidential election is imminent. Given the amount of polarization and partisan sorting of political issues in the US, you might...
Probably the most impactful thing you can do about the presidential election right now
Helping Children and Families Live Healthier. Helen Keller Intl believes in a world where no one is deprived of the opportunity to live a healthy life and reach their true potential. Working with our global community, we’re supporting millions of families across Africa, Asia, and the United States to overcome barriers to good health, sound nutrition, […].
This analysis summarizes and reflects on the following research: Citation: Piazza, J. A. (2023). Political polarization and political violence. Security Studies, 32(3), 476-504. https://doi.org/10.1080/09636412.2023.2225780 Talking Points: Political polarization makes support for and the occurrence of political violence more likely.
For democracy, against AIDS, lead, and factory farming
Effective altruism, happiness, writing, social justice, science, short stories, fun.
The idea of responsible scaling policies is now over a year old. Anthropic, OpenAI, and DeepMind each have something like an RSP, and several other relevant companies have committed to publish RSPs by February.
Welcome to the November 4, 2024 Main edition of The HomeWork, the official newsletter of California YIMBY: legislative updates, news clips, housing research and analysis, and the latest writings from the California YIMBY team. News from Sacramento: We are… The post The HomeWork: November 4, 2024 appeared first on California YIMBY.
This paper introduces a method to monetize animal welfare, allowing animal lives to be considered alongside human interests in policy analyses. The post The Price Of Welfare: Monetizing Animal Lives For Better Policies appeared first on Faunalytics.
TL;DR: I think funding diversification is an easy thing to pitch as positive, but it comes with real tradeoffs to growth and time spent fundraising. In this post, I go into why I think these trades are not obvious but still end up thinking they are worth making.
Rethink Priorities' Worldview Investigations Team is sharing research agendas that provide overviews of some key research areas as well as projects in those areas that we would be keen to pursue. This is the first of three such agendas to be posted in November 2024. If we want to do the most good per dollar, then we need to measure the value of various outcomes on a common scale.
On doing well from doing good
Looking back from 2041
Monday, November 4
Do you believe AI will hit a wall?
The Chinese AI chip firm goes public
Really interesting story of Sophos’s five-year war against Chinese hackers.
Election Day Update: For anyone who’s still undecided (?!?), I can’t beat this from Sam Harris. When I think of Harris winning the presidency this week, it’s like watching a film of a car crash run in reverse: the windshield unshatters; stray objects and bits of metal converge; and defenseless human bodies are hurled into […]...
HPV vaccines offer a rare opportunity to effectively eliminate one type of cancer. By taking this opportunity, it’s possible to save hundreds of thousands of women each year.
Tuberculosis still kills over 1.2 million people a year. In a two-part essay by Kamal Nahas for Asimov Press, we explore why and how to change that. In the 1924 novel, The Magic Mountain, Thomas Mann describes a sanatorium patient named Anton Ferge as he undergoes a painful tuberculosis (TB) treatment. “I lie there with my face covered, so I can’t see anything,” Anton says.