In a recent commentary, I examined what economic theory can tell us about the effects a universal basic income would have on work incentives. But theory alone is not enough. We need also to look at evidence. The following will review the evidence from a set of experiments that were conducted in the 1970s as part of an attempt to make antipoverty policies of that era more effective.
These welfare experiments, or income maintenance experiments (IMEs) as we should more properly call them, were true randomized field trials. Such trials are considered the gold standard for testing new medicines or new crop varieties, but they are used all too rarely for testing economic policies. (By way of exception, another set of welfare experiments was conducted in the 1990s in conjunction with the welfare reforms of the Clinton years.)
Critics often say that UBI supporters pay insufficient attention to the IMEs. As Bryan Caplan puts it in a recent piece for the Library of Economics and Liberty,
If I were an enthusiastic UBI advocate, I would know this experimental evidence forwards and backwards. Almost all of the advocates I’ve encountered, in contrast, have little interest in numbers or past experience. What excites them is the “One Ring to Rule Them All” logic of the idea: “We get rid of everything else, and replace it with an elegant, gift-wrapped UBI.” For a policy salesman, this evasive approach makes sense: Slogans sell; numbers and history don’t. For a policy analyst, however, this evasive approach is negligence itself. If you scrutinize your policy ideas less cautiously than you read Amazon reviews for your next television, something is very wrong.
Writing on the Heritage Foundation website, Robert Rector and Mimi Teixeira echo Caplan’s sentiments. They point to the IMEs of the 1970s to support their view that a UBI would harm recipients and increase dependence on government. Their conclusion:
Universal basic income policy is an idea with a record of failure; policymakers seeking to reform the welfare state should focus instead on policies proven to work.
But are those really the lessons of the IMEs?
First, what the experiments can’t tell us
Each of the income maintenance experiments of the 1970s enrolled from several hundred to several thousand households, divided into two groups. They assigned one group to an experimental income support policy while a control group continued to be covered by existing welfare programs. IMEs testing various policies took place in New Jersey, Iowa, North Carolina, Indiana, Colorado, and Washington. They covered both urban and rural areas, both single-parent and two-parent households, and various ethnic groups.
Critics are right to say it is important to understand exactly what these experiments can and can’t tell us. Let’s begin with the key negative: We can learn nothing directly from the IMEs about the effects of a UBI because they did not test such a policy.
Instead, the IMEs tested several variants of a negative income tax (NIT). An NIT and a UBI are not the same thing. As I explained in my earlier post, a negative income tax is means-tested. All versions of the NITs tested in the 1970s experiments incorporated substantial benefit reduction rates. Each added dollar earned by recipients resulted in a reduction of benefits ranging from 30 to 80 cents. In contrast, a UBI has no benefit reductions. As a result, the IMEs offer no more direct evidence about the effects of a UBI than a clinical trial of the effects of vitamin C on heart attacks would offer about the protective effects of aspirin.
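To make the distinction concrete, here is a minimal sketch comparing the two payment schedules. The $10,000 guarantee and the 50 percent benefit reduction rate are hypothetical numbers chosen for illustration, not parameters from any of the experiments, and the function names are likewise illustrative. The NIT is modeled as a simple linear schedule, which is an assumption.

```python
def nit_benefit(earnings, guarantee, reduction_rate):
    """Negative income tax (assumed linear): the payment shrinks as earnings rise."""
    return max(0.0, guarantee - reduction_rate * earnings)


def ubi_benefit(earnings, guarantee):
    """Pure UBI: the full payment goes to everyone, whatever they earn."""
    return float(guarantee)


# Hypothetical parameters, for illustration only.
GUARANTEE = 10_000
REDUCTION_RATE = 0.5

for earnings in (0, 5_000, 10_000, 20_000, 30_000):
    nit = nit_benefit(earnings, GUARANTEE, REDUCTION_RATE)
    ubi = ubi_benefit(earnings, GUARANTEE)
    print(f"earnings {earnings:>6}: NIT pays {nit:>6.0f}, UBI pays {ubi:>6.0f}")
```

Under these assumed numbers, the NIT payment falls to zero once earnings reach $20,000, while the UBI payment is the same at every earnings level.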
The confusion between a UBI and an NIT is partly a matter of terminology. The organization US BIG uses the term Basic Income Guarantee (BIG) to refer to a broad family of income support programs that have the common feature of guaranteeing a minimum level of income to everyone without work requirements. A UBI, as I have defined it, is the most “universal” member of the BIG family in the sense that everyone gets the full payment regardless of income, with no benefit reductions.
More generally, BIGs also include NIT policies such as Milton Friedman’s early version, the NIT variants tested in the IMEs, and related programs like a plan advanced by Charles Murray. All of those policies include provisions that reduce the basic benefit by a fraction of a dollar for each dollar earned, beyond some defined amount.
What the IMEs do show about incentives
Let’s turn now to what we can learn from the IMEs, beginning with what they tell us about the incentive effects of a negative income tax.
As critics point out, the raw data from the IMEs show that almost all experimental groups reduced their average work efforts compared to their controls. Gary Burtless of the Brookings Institution summarizes the data in a paper prepared for a 1986 conference sponsored by the Boston Fed. His summary table shows that husbands reduced their work by an average of 119 hours per year, wives by an average of 93 hours, and single female heads of households by an average of 133 hours per year. Only two subgroups, black husbands in New Jersey and black wives in Gary, Indiana, increased their work compared to their control groups.
Those results are consistent with the theoretical findings in my earlier post on work incentives. Figure 2 there showed that “sweetening” an existing means-tested welfare scheme by increasing the minimum income guarantee and reducing the benefit reduction rate would produce ambiguous results. Some participants would increase their work efforts and others would cut back.
The greater the increase in the minimum income guarantee, the more likely a reduction in average work effort, because of the income effect. The greater the decrease in the benefit reduction rate, the more likely an increase in work effort, because of the substitution effect. Also, either change would raise the income level at which benefits phase out and so increase the number of people eligible for the program, thereby potentially reducing the work effort of people who previously had incomes just above the new cutoff level.
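The eligibility point follows from simple arithmetic. If benefits fall by a fixed fraction of each dollar earned, payments reach zero at earnings equal to the guarantee divided by the benefit reduction rate. The sketch below assumes such a linear schedule and uses hypothetical numbers, not parameters from the experiments; `break_even_income` is an illustrative name.

```python
def break_even_income(guarantee, reduction_rate):
    """Earnings at which benefits phase out to zero.

    Assumes a linear schedule: benefit = guarantee - reduction_rate * earnings.
    """
    if not 0 < reduction_rate <= 1:
        raise ValueError("reduction_rate must be in (0, 1]")
    return guarantee / reduction_rate


# Hypothetical baseline plan and two "sweetened" variants.
plans = {
    "baseline (8,000 guarantee, 80% rate)": (8_000, 0.8),
    "higher guarantee (10,000, 80% rate)": (10_000, 0.8),
    "lower reduction rate (8,000, 50% rate)": (8_000, 0.5),
}

for label, (guarantee, rate) in plans.items():
    cutoff = break_even_income(guarantee, rate)
    print(f"{label}: benefits end at earnings of about {cutoff:,.0f}")
```

With these assumed numbers, both sweetened variants push the cutoff above the baseline, so households with earnings between the old and new cutoffs become eligible for payments they could not previously receive.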
Interpretation of the raw data on work responses in the IMEs is complicated by the fact that the NIT plans faced by the experimental groups included variations in both the minimum income guarantee and the benefit reduction rate. Furthermore, the tested NITs were not always “sweeter” in both respects compared to the welfare policies available to their respective control groups.
Some experimental groups received minimum income guarantees of as much as 135 percent of the poverty level, well above what they would have received from AFDC and food stamps, while others received as little as 50 percent of the poverty level. Some experimental groups faced benefit reduction rates of up to 80 percent, which would have been higher than the benefit reduction rates faced by at least some households in the control groups. Other experimental groups faced benefit reduction rates of as little as 30 percent, which would have been lower than those faced by the control groups.
All in all, according to analysis of the data presented in Burtless’ Table 4, the effects of changes in each parameter, taken separately, appear to be broadly consistent with the theoretical model presented in my earlier post:
- For both intact families and single heads of household, groups facing a 75 percent benefit reduction rate under the NIT exhibited greater average labor withdrawal than those facing a 50 percent rate.
- For both intact families and single heads of household, groups with higher guaranteed minimums had a greater reduction in work hours, other things being equal.
- Husband-wife families showed a greater reduction in work than single-parent families, which is what we would expect if the control groups of the single-parent families were more likely to be on welfare plans with high benefit reduction rates.
Unfortunately, those findings are clouded by methodological flaws in the IMEs. An overview of the findings of the Boston Fed conference points to numerous problems with design, execution and analysis, including inadequate theoretical models, poor formulation of objectives, and unsatisfactory management and administration. These methodological problems cast doubt on whether evidence from the IMEs really meets the “gold standard” characterization.
The most important problem was apparently widespread underreporting of work effort by participants in the experimental groups. To quote Burtless,
Several analysts have found evidence that at least part of the employment and earnings reduction reported in the experiments was spurious. Recipients of negative income tax payments had a clear incentive to underreport their employment and earnings, because to do so permitted them to receive a larger payment than the one to which they were legally entitled. Wage earners enrolled in the control group did not face this kind of misreporting incentive.
Burtless goes on to discuss studies that use other data sources, including IRS records, to correct the reporting bias. In the Gary experiment, underreporting appears to have accounted for all of the negative work response. In the Seattle-Denver experiment, underreporting did not greatly change the work response of heads of households, but the reported reduction in hours disappeared for secondary workers.
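The misreporting incentive Burtless describes can be illustrated with a small calculation. The sketch below assumes a linear NIT schedule and uses hypothetical earnings, guarantee, and benefit reduction rates. It is not a reconstruction of any experiment's payment rules, and the function names are illustrative.

```python
def nit_benefit(reported_earnings, guarantee, reduction_rate):
    """Payment under an assumed linear NIT, based on what the recipient reports."""
    return max(0.0, guarantee - reduction_rate * reported_earnings)


def gain_from_underreporting(true_earnings, reported_earnings,
                             guarantee, reduction_rate):
    """Extra payment received by reporting less than was actually earned."""
    honest = nit_benefit(true_earnings, guarantee, reduction_rate)
    shaded = nit_benefit(reported_earnings, guarantee, reduction_rate)
    return shaded - honest


# Hypothetical household: earns 8,000 but reports only 6,000.
TRUE_EARNINGS = 8_000
REPORTED_EARNINGS = 6_000
GUARANTEE = 10_000

for rate in (0.3, 0.5, 0.8):
    gain = gain_from_underreporting(TRUE_EARNINGS, REPORTED_EARNINGS,
                                    GUARANTEE, rate)
    print(f"benefit reduction rate {rate:.0%}: extra payment of about {gain:,.0f}")
```

In this hypothetical case, concealing $2,000 of earnings is worth roughly $1,000 in extra payments at a 50 percent benefit reduction rate, and more at higher rates. A control-group household whose payments do not depend on what it reports to the experimenters gains nothing from the same concealment, which is why reported hours can diverge between the two groups even when actual hours do not.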
In an invited response to Burtless’ paper, Orley Ashenfelter of Princeton University notes that a failure to address the problem of underreporting in advance represented a serious design flaw of the IMEs:
Only an experiment fully informed at the design stage about the possibility for income underreporting, and that tested for its effect, would shed any light on this critical issue. Sadly, the design of none of these experiments was so informed.
By ignoring the evidence of underreporting, critics overstate the case not only against a UBI, but also against an NIT. As if that were not enough, they compound the overstatement by implying that the observed work reductions represented withdrawals from the labor force. In fact, according to research cited by Dylan Matthews in a post on Vox, even among participants in the IMEs who reduced their hours worked, full withdrawal from the labor force was a relative rarity.
Instead, the reduction in hours worked more often took the form of longer periods of job search between spells of employment. For some that might mean loafing, but for others, it could well mean a more thorough search process resulting in a better job match. In the case of young secondary workers in families receiving NIT benefits, reduction in work often meant more time spent in school. As one participant in the Boston Fed conference reported, the probability of graduation from high school was 25 to 30 percent higher in families receiving the NIT than in the control group.
A UBI, although not studied in the experiments of the 1970s, could well have similar effects on job search and schooling.
Critics are right to say that UBI advocates should study the income maintenance experiments of the 1970s. Those experiments show that incentives do matter. A successful UBI policy (or any form of BIG, including a negative income tax) needs to take incentive effects into account. The experiments support at least three specific conclusions that apply to any form of basic income.
- Policies that include high benefit reduction rates are highly likely to have adverse work incentives. A UBI, which, in its pure form, has no benefit reductions, is far less likely to have such effects.
- Adding any kind of a basic income, whether a pure UBI or an NIT, is more likely to have adverse work incentives if it is added as a “sweetener” on top of existing means-tested programs like SNAP, TANF, and housing assistance rather than replacing them.
- Increasing marginal income tax rates to finance a basic income would also have negative effects on work effort. To avoid those effects, a basic income should, as explained in another earlier commentary, be financed to the greatest extent possible by replacing existing income support programs, including benefits for people with higher levels of income.