Does Your Charity Actually Help?
On cost-effectiveness, evidence, and the philosophical questions you can't avoid
This is the second post in Rethink Priorities’ Cross-Cause Prioritization series. In my last post, I introduced this series and explained why systematic cross-cause prioritization is important yet difficult. This post focuses on the necessity of setting an optimal bar for evidence and cost-effectiveness in funding decisions.
Where Giving Goes Wrong
When a household product you buy doesn’t work, you might ask for a refund and leave a bad review. When a financial product doesn’t do what its makers promise, you might not just leave a bad review; you might contact the authorities. But when a charitable intervention fails to do what its promoters claimed? That’s called a Tuesday.
The phenomenon of suboptimal charitable interventions isn’t about ill intentions. People in this space genuinely want to help, and that’s exactly what makes the failures so instructive.
In the 1990s, a development organization tackling water scarcity in sub-Saharan Africa landed on what seemed like a brilliant idea. Kids play. Kids have energy. What if that energy could pump clean water from the ground? Install a merry-go-round connected to a pump mechanism, and communities would benefit simply by letting kids be kids.
The PlayPump project raised over $60 million. It didn’t work. In practice, the system was expensive and very inefficient: it required children to play for hours, and more often the task fell to women in the communities, who found it an impractical and undignified way to source water.
The PlayPump is a well-known cautionary tale not because the idea was obviously bad, but because it seemed obviously good. It had a compelling story, a clear mechanism, and sympathetic beneficiaries. And yet, PlayPumps failed:
It failed on evidence: $60 million was committed before anyone rigorously tested whether the tool worked.
It failed on cost-effectiveness: that same money, spent elsewhere, could have helped vastly more people.
These aspects are easy to get wrong. Most donors and foundations underweight at least one of them, and the costs, measured in lives that could have been saved and well-being that could have been improved, are enormous.
Why you should care about cost-effectiveness
Most people understand cost-effectiveness instinctively. If two grocery stores sell the same pasta and one charges half the price, you go to the cheaper one. The same logic should apply when what you’re spending on is human or animal well-being: if you can help twice as many people for the same money, you should.
In practice, however, giving rarely works this way.
Let’s take a look at the numbers. As of 2026, if you earn $23,000 per year after taxes, you’re in the top 10% of earners globally. If you earn $68,500, you’re in the top 1%. A person living in poverty in a low-income country may live on just a few dollars per day, lacking adequate food, medical care, and housing. The result is that a dollar directed toward helping in the low-income context can go dramatically further than a dollar spent on causes in high-income countries, not because those causes don’t matter, but because the baseline need is so different.
Consider vitamin A supplementation. A vitamin A supplement costs under a dollar to deliver, and yet vitamin A deficiency leaves young children severely vulnerable to infection and death across sub-Saharan Africa and South Asia. According to GiveWell’s 2024 analysis, it costs between $1,000 and $8,500 to avert a child’s death through vitamin A supplementation campaigns, and the program is estimated to be between 9 and 59 times as cost-effective as simply giving cash directly to people living in extreme poverty. Meanwhile, in the United States, the average cost of one night’s stay in a hospital is $3,000. The same moral commitment to keeping children alive, applied where the need is greatest and the solutions are cheapest, produces outcomes that are orders of magnitude apart.
Now, say that you care about animal welfare. Rescuing a companion animal from a US shelter costs somewhere between $200 and $1,000 and is a genuinely good thing to do. But for roughly $1, you can spare approximately 11 factory-farmed chickens from a lifetime in cages with less standing area than a sheet of paper, where they cannot express natural behaviors and suffer injuries throughout their lives. Tens of billions of chickens are farmed under these conditions every year. Spending $200 could save one shelter animal or spare roughly 2,200 chickens.
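The arithmetic behind that comparison is simple enough to write out. Here is a minimal Python sketch that takes the figures quoted above as rough point estimates; the function and constant names are made up for illustration, and the real estimates carry wide uncertainty that this ignores.

```python
# Figures quoted in the post, treated here as rough point estimates.
CHICKENS_SPARED_PER_DOLLAR = 11
SHELTER_RESCUE_COST_RANGE = (200, 1_000)  # dollars per companion animal


def chickens_spared(budget_usd):
    """Chickens spared from cages for a given budget, at ~11 per dollar."""
    return budget_usd * CHICKENS_SPARED_PER_DOLLAR


def shelter_animals_rescued(budget_usd, cost_per_rescue_usd):
    """Shelter animals rescued for a given budget and cost per rescue."""
    return budget_usd / cost_per_rescue_usd


budget = 200
low_cost, high_cost = SHELTER_RESCUE_COST_RANGE

print(chickens_spared(budget))                     # 2200
print(shelter_animals_rescued(budget, low_cost))   # 1.0
print(shelter_animals_rescued(budget, high_cost))  # 0.2
```

Even at the cheapest end of the shelter range, the same $200 reaches one animal in one case and a couple of thousand in the other. How to weigh a rescued dog against a cage-free hen is a separate question, and not one the arithmetic can answer.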
This doesn’t mean that you, under all circumstances, must save the chickens. Or that only one type of cause matters, or that cost-effectiveness is the only measure worth considering. But disregarding cost-effectiveness has a moral cost. It means helping far fewer people or animals than you otherwise could.
Why you should care about evidence
The PlayPump-style failures are more common than most donors realize. Ideas that sound promising often fail to deliver. Ideas that look mundane can be among the most powerful tools we have. The difference, usually, comes down to whether anyone actually tested them.
The credibility revolution in economics, which generated demand for the scientific identification of causal effects through Randomized Controlled Trials (RCTs), transformed our ability to distinguish between these two cases. Before RCTs, it was often difficult to determine whether an intervention caused the observed outcomes or whether unrelated factors drove them.
RCTs have since helped debunk more than a few celebrated interventions. Research from the Abdul Latif Jameel Poverty Action Lab (J-PAL) found only limited evidence that microcredit (the concept of providing small loans at low interest rates to new businesses in the developing world) reduces poverty, despite a compelling theory of change. Similarly, providing laptops to schoolchildren was widely celebrated before trials revealed a more complicated picture.
RCTs can also show which mundane interventions work. The Against Malaria Foundation distributes insecticide-treated bed nets to prevent mosquitoes from spreading malaria. The evidence base here consists of many RCTs, including a Cochrane meta-analysis that covers studies from different countries and time periods, with consistent results. We can be nearly certain that bed nets prevent malaria, and GiveWell considers them highly cost-effective. That’s a high bar that relatively few interventions meet.
The takeaway here isn’t just to run more RCTs. It’s that the strength of evidence you have for an intervention should substantially affect your confidence that it will actually work, and that confidence should flow through to how you allocate resources.
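One way to picture confidence "flowing through" to allocation is to discount an intervention's claimed impact by the probability that it works at all. The Python sketch below does this with entirely hypothetical numbers and names; it is an illustration of the idea, not a method any particular evaluator is known to use.

```python
def expected_impact_per_dollar(impact_if_works, prob_works):
    """Claimed impact per dollar, discounted by confidence that it works."""
    if not 0.0 <= prob_works <= 1.0:
        raise ValueError("prob_works must be between 0 and 1")
    return impact_if_works * prob_works


# Hypothetical options: (impact per dollar if it works, probability it works).
options = {
    "compelling story, untested": (10.0, 0.1),
    "mundane, backed by many trials": (4.0, 0.9),
}

# Rank options by evidence-adjusted impact, best first.
ranked = sorted(
    options,
    key=lambda name: expected_impact_per_dollar(*options[name]),
    reverse=True,
)

for name in ranked:
    print(name, expected_impact_per_dollar(*options[name]))
```

With these made-up inputs, the untested idea promises more on paper but ranks lower once its weaker evidence is priced in. The probabilities are judgment calls, which is exactly why it helps to write them down.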
Demanding “perfect” evidence is a mistake
So far, this sounds like a straightforward argument for demanding rigorous evidence. But demanding only the highest-quality empirical evidence as a condition of funding anything is actually a bad idea.
Consider what gets excluded the moment you treat high-quality RCTs as the sole price of admission.
Observational evidence, carefully applied, underpins most of empirical social science, including our confidence that smoking causes cancer, a finding no one established by randomizing people to smoke.
Modeling evidence drives virtually all public health decision-making and modern epidemiology.
Philosophical evidence, as I’ll talk about shortly, shapes every judgment about whose interests count, how to weigh present versus future people, and what kind of good we are even trying to produce.
Dismissing any of these categories wholesale would require ruling out a vast range of human understanding.
There is also a temporal dimension. Acting only in domains where high-quality empirical investigation is possible would have ruled out preparing for nuclear war before nuclear weapons were deployed. It would have ruled out early action on many foreseeable but not yet realized risks. Uncertainty about future impacts does not mean the probability of harm should be rounded to zero.
If you set the bar too high, you leave a lot of potentially important and valuable work unfunded. The right question isn’t “does this meet a specific evidentiary standard?” but rather: What does the full picture of available evidence suggest, and how confident should I be?
That is a harder question, but it is the decision-relevant question for donors. It requires holding the quality and type of evidence in mind alongside the stakes of the decision. When you are deciding whether to fund something that might save a life or to do nothing, the threshold for action shouldn’t require a Cochrane review, but it also shouldn’t be only a plausible story and good intentions.
There is a cost to setting the bar too low just as there is a cost to setting it too high. Neither extreme is good. Getting it right requires calibration and the willingness to hold uncertainty explicitly.
If you’re doing charity, you can’t opt out of philosophy
You can’t avoid doing philosophy when doing charity. There is no theory-neutral approach to doing good. If you are working across domains, you ultimately are at least implicitly making decisions about tradeoffs that heavily involve philosophical judgments, even if they aren’t stated outright.
One example of this in practice: GiveWell is one of the most evidence-focused organizations in the charitable sector. And yet it engages in philosophy: it is, by its own description, deeply engaged in questions of moral weights, because without answering those questions, it is impossible to compare the value of preventing a malaria death to doubling someone’s income. Those aren’t empirical questions. They are philosophical ones.
Any time you make a tradeoff between different types of goods, any time you decide what to value and who counts, you are making a philosophical judgment. The decision to focus on global health over animal welfare, or on near-term poverty over existential risk, is not a conclusion that emerges from data. It is in part a philosophical stance, whether you acknowledge it or not.
Moreover, intuitions or common sense cannot substitute for explicit philosophical reasoning. As I have noted elsewhere, common sense is underspecified and doesn’t resolve hard tradeoffs. The question “should humanity have a flourishing future?” has a common-sense answer. The question “how many resources should we place on creating that future relative to preventing global poverty now?” does not. Every answer to the latter takes stances on philosophical and empirical questions, even if many people leave the philosophical positions they are taking implicit when answering. But implicit stances are not more rigorous than explicit ones. They are just less examined.
As Keynes observed about a parallel dynamic in economics: “practical men, who believe themselves to be quite exempt from any intellectual influences, are usually the slaves of some defunct economist.” The same applies here. Views that people describe as mere intuitions or common sense are often downstream of philosophical ideas promoted in their environments. When someone thinks such intuitions and implicit stances are indisputable or obviously correct, it’s more often than not evidence they haven’t examined or tested those ideas at all.
Philosophy also doesn’t need to be definitive to be worth using. The evidence for many empirical interventions is mixed and contested. The debate over the effects of deworming is a prime example, with a complex and still-unresolved evidence base. We act despite incomplete certainty all the time. Expecting certainty from philosophical reasoning but accepting uncertainty in empirical reasoning would be a double standard.
At Rethink Priorities, we’ve tried to make philosophical reasoning transparent. One example of this is our Donor Compass Tool, a quiz donors can take that outputs a recommended giving allocation that reflects their philosophical beliefs. It is a concrete attempt to handle moral uncertainty explicitly. Rather than assuming you have one correct view, this approach asks: given that you hold non-zero credence in multiple plausible moral frameworks, how should they shape your allocation decisions? The answer is often different from what you would arrive at by acting on only your “favorite” theory.
The tool is coming out next month.
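To see why acting under moral uncertainty can differ from acting on a favorite theory, consider a deliberately simple Python sketch. It is not a description of how the Donor Compass Tool works. The worldviews, credences, and preferred causes are all hypothetical, and splitting a budget in proportion to credence is just one of several possible rules.

```python
# Hypothetical credences in three worldviews (they sum to 1 here).
credences = {
    "near-term human welfare": 0.5,
    "animal welfare": 0.3,
    "long-term future": 0.2,
}

# Hypothetical top cause according to each worldview.
top_cause = {
    "near-term human welfare": "global health",
    "animal welfare": "cage-free campaigns",
    "long-term future": "existential risk reduction",
}


def favorite_theory_allocation(credences, top_cause, budget):
    """Give the whole budget to the top cause of the most credible view."""
    favorite = max(credences, key=credences.get)
    return {top_cause[favorite]: float(budget)}


def credence_weighted_allocation(credences, top_cause, budget):
    """Split the budget across worldviews in proportion to credence."""
    total = sum(credences.values())
    allocation = {}
    for view, credence in credences.items():
        cause = top_cause[view]
        allocation[cause] = allocation.get(cause, 0.0) + budget * credence / total
    return allocation


print(favorite_theory_allocation(credences, top_cause, 1000))
print(credence_weighted_allocation(credences, top_cause, 1000))
```

Under the favorite-theory rule, the views holding the other half of your credence get nothing. The proportional split gives each view a share, which is closer to what it means to take your own uncertainty seriously.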
Putting It Together: Three Questions, One Framework
Cost-effectiveness, evidence quality, and philosophical judgment are not independent concerns. They are three dimensions of the same underlying question: how do we make decisions that actually improve the world as much as possible?
The version of charitable giving that takes all three seriously looks messier than the version that latches onto a single metric or a single type of evidence. Starting somewhere, making your assumptions explicit, and refining over time is how you get better.
This also means being honest about the costs of getting it wrong.
If you ignore cost-effectiveness, you help far fewer people than you could.
If you ignore evidence, you may help no one at all, or cause harm.
If you ignore philosophy, you are still doing philosophy, just less transparently (and likely doing it worse).
We shouldn’t let the perfect be the enemy of the good. But I also believe we shouldn’t let the plausible be a substitute for evidence, or mistake unexamined intuitions for a principled framework.
This post is part of a series on how cause prioritization can go wrong — and ways to think about it better. See the first post in the series here and subscribe to receive the next parts straight to your inbox.
Thanks to Sarina Wong, Urszula Zarosa, and Elisa Autric for feedback on this post.

