NIH Nutrition Research Takes it on the Chin
A "top" nutrition researcher exits, followed by two damning articles challenging his studies and claiming that a flagship $170 million NIH initiative is "designed to fail."
[Marion] Nestle, a professor emerita at New York University and author of the book “Food Politics,” called it one of the most important nutrition studies done since the discovery of vitamins.
On Thursday, she called [Kevin] Hall’s early retirement a “national tragedy,” writing that it casts doubt on the MAHA movement’s credibility.
CNN, “Top NIH nutrition researcher studying ultraprocessed foods departs, citing censorship under Kennedy,” April 17, 2025
If I were judging MAHA (Make America Healthy Again) credibility in the early months of RFK Jr.’s leadership of the Department of Health and Human Services, the retirement of an NIH nutritionist, “top” or otherwise, would not be high on my list of concerns.
What would? How the NIH leadership responds to the criticism that they’re embarking on a $170 million research project—Nutrition for Precision Health (NfPH)—that is, as the headline of a STAT op-ed phrased it just six days later, “designed to fail.” If NIH-funded nutrition research was the serious endeavor we all wish it to be, that STAT op-ed and the BMJ article on which it is based (both published on April 22nd) might have dominated nutrition discourse in late April.
Instead, the story that went viral was the early retirement of NIH physicist-turned-nutrition/obesity researcher Kevin Hall. It’s probably easier to count the major news organizations that didn’t run with the story (along with variations on the headline “Top NIH official accuses RFK Jr's agency of ‘censorship’”) than those that did.
The two stories are inextricably linked, but it was Hall’s retirement that had the kind of spin that the media has taken to embracing these days: courageous researcher fighting insidious MAHA policies. It also had the ultra-processed food angle that is front and center these days in nutrition journalism, as I discussed in a January post.
The designed-to-fail critique of NIH’s flagship initiative is ultimately the far more important story. The NfPH, after all, constitutes the future of nutrition research in the U.S. But the BMJ and STAT articles were making the exact same point about Hall’s so-very-influential research, which is what entangles the two stories.
If the NfPH, the flagship of NIH nutrition research for the next decade, is indeed “designed to fail,” as the STAT and BMJ articles imply, then we can assume, as they also implied, that Hall’s wildly influential research has failed us as well.
So there’s a lot to cover here. As a result, this will be a multi-part post. Part 1 will cover Kevin Hall and one of the two issues with the research that made him so influential. Part 2 will cover the second issue and the NfPH.
The context: that annoying challenge of establishing reliable knowledge
When I think about why nutrition journalists, confronted by an obvious dearth of meaningful studies to cover weekly, would avoid a story with these weighty implications, the obvious reason is that it’s not, well, clickbait. It seems like inside baseball. Researchers might care about these issues, but why would anyone else (let alone an editor who has to sign off on the story)?
These methodological issues of study design don’t speak directly to what we should eat or why particular foods might be good or bad for us, but rather indirectly: Whether we should believe the studies that make those claims. If the journalist has already reported on the claims as though we should believe them, that’s another reason to ignore them. (And, yes, it should be the opposite.)
Regrettably, few issues are more fundamental to a functioning scientific enterprise, to the establishment of reliable knowledge—i.e., what science is supposed to accomplish—than the proper design of the relevant experiments. All experiments have inherent flaws and limitations and those will always constrain how they should be interpreted. The history of bad science, of the science of things that aren't so,1 is the history of the over-interpretation of poorly designed experiments.
This is why the Nobel Laureate biophysicist Georg von Békésy once wrote about the benefits that scientists accrue from both friends and enemies.2 The friends “are willing to spend the time necessary to carry out a critical examination of the experimental design beforehand and the results after the experiments have been completed.” Even better, he added, were the enemies. “An enemy is willing to devote a vast amount of time and brain power to ferreting out errors both large and small, and this without any compensation.”
The trouble with enemies, though, is that they only critique your experiments after they’re done and you’ve published your results. You typically don’t get the benefit of their critical thinking in advance.3 Good scientists will swallow their pride, acknowledge the criticisms or at least discuss them without rancor, and use the critiques to design their next experiments. Everyone benefits.
The reason outsiders find this problematic (journalists, for instance, covering science) is that enemies in science tend to be the researchers who disagree with the interpretation of the evidence, who are guided by, if not promoting, competing hypotheses. It’s these folks on the other side of an important scientific issue who care enough about it to air their criticisms publicly and risk the blowback that they’re likely to get. The back and forth of the criticism in these scientific conflicts (the punches and counterpunches) can go on for decades.
Most researchers lack the intellectual rigor exemplified by von Békésy. They’ll find it easier to dismiss criticism as the product of closed-minded opponents rather than engage with it seriously. Yet their critics may be no more stubborn than they are—simply unwilling to abandon views they’ve spent years developing and building careers around without far stronger evidence than is being offered.
In an ideal world, what the critics are saying is: if you do the right experiment and do it carefully enough, you may convince me. But you haven’t done it yet, and here’s why.
Hall’s story involves all these issues, but there’s nothing subtle about the criticisms of the studies he’s been doing, as the BMJ and STAT articles make clear, and the same problems are about to be repeated in the $170 million NfPH initiative.
Early retirement as a national tragedy
After 21 years at his “dream job,” as Hall wrote in a lengthy X post on April 16th, he was parting ways with NIH. He had, he said, “ambitious plans to more rapidly and efficiently determine how our food is likely making Americans chronically sick,” but he had come to believe that wouldn’t be possible, or at least not without unacceptable restrictions on his scientific freedom.
Unfortunately, recent events have made me question whether NIH continues to be a place where I can freely conduct unbiased science. Specifically, I experienced censorship in the reporting of our research because of agency concerns that it did not appear to fully support preconceived narratives of my agency’s leadership about ultra-processed food addiction.
I’ve worked with Hall in the past, having co-founded a not-for-profit, the Nutrition Science Initiative, that funded a study on which he was a principal investigator. I can easily imagine him bridling at this kind of administrative interference, as he did when he worked with us.
I suspect, though, that he’d have been willing to tolerate the NIH interference, if his research was being supported by the NIH leadership at the level he deemed necessary. Apparently it wasn’t.4 Before he could establish whether this situation would change for the better under the MAHA regime, as CNN explained, he had to make a quick decision whether to “accept voluntary early retirement, part of the federal government’s push to shed workers.” He took it.
Even without the MAHA interference angle, though, Hall’s early retirement was newsworthy for two reasons:
He had indeed become far and away the most influential nutrition researcher at NIH, if not one of the top handful nationwide.
He achieved that level of influence thanks to the study that the Washington Post, in its article about Hall’s retirement, called “a landmark study providing the most compelling evidence to date that ultra-processed foods are harmful to Americans’ health.” That harm, which we’ll have to keep in mind, is that they make people eat too much.
This was the study that Marion Nestle, grande dame of nutrition policy, called, per the CNN quote in the epigraph, almost literally the greatest thing since sliced bread.5 Hall himself, in his X post, said his research “leads the world on this topic.” And it does, although the value of its leadership is what’s at issue.
In January, I wrote about how journalists had covered Hall’s ultra-processed food (UPF) research. I also noted the paradox that it should have presented to nutrition journalists. The New Yorker, for instance, had described Hall’s study as “widely recognized as the most rigorous examination of the subject so far,” while also quoting Harvard’s Walter Willett describing that same study as “worse than meaningless.” This suggested far more to this story than the media was reporting. It still does.
Willett was an author of the April 22nd BMJ article along with his Harvard colleague David Ludwig, a nutritionist and endocrinologist, and the University of Pennsylvania biostatistician and epidemiologist Mary Putt. Ludwig was the primary author of the BMJ article. Ludwig and Putt are the authors of the STAT op-ed.
The two articles are effectively the answer to the question: why did Willett describe Hall’s UPF study as “worse than meaningless” and what does that imply about the coming NfPH science?
What gives this story an extra complication is that Ludwig, in particular, and Willett, to a lesser extent, have been criticizing Hall’s research and his papers for the better part of a decade. Whether Hall thinks of these Harvard researchers as “enemies,” I can’t say, but they’re certainly not friends.
Hall and Ludwig (and, again, to a lesser extent Willett) are proponents of very different ways of thinking about obesity, and so they come to different conclusions when they interpret the data. Hall assumes that people get fat because they consume too many calories and maybe too much fat. Ludwig, as I do, thinks carbohydrates are uniquely fattening and is a proponent, as am I, of a hypothesis known as the carbohydrate insulin model.
Hall almost invariably interprets his research as either falsifying or being “inconsistent with” this carbohydrate insulin model. If Ludwig’s right, Hall is wrong and vice versa. This is why Hall has publicly criticized Ludwig’s work (and, with it, his scientific integrity). And it’s why Ludwig has criticized Hall’s criticisms (and their occasionally ad hominem nature) and criticized his work. Willett has been a co-author with Ludwig on many of the critiques. I have co-authored articles with Ludwig and Willett, including those challenging Hall’s work and criticizing it implicitly, if not explicitly. Hall has also criticized my work in very public venues: the journal Science most notably. It is a tangled web.
The history of vehement and public disagreement can make the latest critiques by Ludwig, Willett and Putt (and mine in this post) look personal. Anyone who thinks this is unusual in scientific disputes hasn’t spent enough time reading up on the history of science. It’s easy to dismiss the arguments on that basis—the Harvard researchers (and Taubes) are going after the NIH guy yet again because they don’t like his results, and this is an old story. It is. But it’s a vitally important one.
At the heart of the problem is a point Francis Bacon made 400 years ago when he more or less inaugurated the scientific method: We all tend to believe the evidence that agrees with our preconceptions and ignore or reject the evidence that doesn’t. “The human understanding,” Bacon observed, “still has this peculiar and perpetual fault of being more moved and excited by affirmatives than by negatives, whereas rightly and properly it ought to give equal weight to both.”
Marion Nestle is a good example of how this particular cognitive bias plays out and the greater bias it leaves in its wake. Nestle knows both Hall and Ludwig well. She is a huge fan of Hall’s work (“national tragedy” and all that) and she’s co-authored opinion pieces in the past with Ludwig. (Here and here in JAMA.) She has thanked Ludwig on her influential foodpolitics.com blog for alerting her to important stories. Still, she does not consider Ludwig’s critiques of Hall’s work to be of interest to her readers, even though they speak to the validity of the study she finds so enormously influential. Her blog makes no mention of Ludwig’s critique.
Why not? Nestle believes, as she recently told the New York Times, that “[t]he biggest problem in American diets is people eat too much.” Hall’s UPF study purportedly shows that the problem with food ultra-processing is that it causes people to eat too much, and so they gain weight. In short, Nestle embraces Hall’s findings as so vitally important because they affirm her beliefs.
When I asked Nestle about this Ludwig v. Hall issue in an email exchange (we also know each other well) she said she had read Ludwig’s critiques but nonetheless described Hall’s findings as “unambiguous.” Nestle and I then spun down a spiral of illogic. I said that if Ludwig’s criticisms are right, then Hall’s findings cannot be unambiguous. The nature of Ludwig’s critique de facto creates ambiguity. She agreed that Ludwig makes good points (at least one), but reiterated that Hall’s findings are… unambiguous.
The nature of the flaws
What everyone agrees on in this business is that nutrition science comes with unique challenges. Every method of shedding light on the important questions of diet and health, as Ludwig, Willett and Putt (LW&P, for brevity) noted in the BMJ, comes with very specific limitations on what we can learn. The more affordable the study, regrettably, the greater those limitations.
We cannot assume, for instance, that animal research on specific diets translates to humans. We cannot assume that the associations between chronic disorders and diet observed in human populations, in epidemiological studies (Willett’s research, perhaps ironically), translate to causality.6 And when we do experiments (clinical trials) and randomize people to eat different diets while going about their lives, what LW&P call “behavioral counseling trials,” we can’t assume that the participants in these free-living studies actually eat as they are counseled to eat. After 3 to 6 months, it seems clear that they don’t.
To overcome these challenges [write LW&P], nutrition researchers often turn to feeding studies in which participants receive prepared diets, typically in an inpatient setting. Feeding studies provide the opportunity to maintain rigorous control of dietary intake (minimizing non-adherence) and other environmental conditions. Owing to cost and complexity, however, these trials are typically of short duration (diet arms ≤2 weeks).
This is the design that Hall embraced for his research and the NfPH will use as well. With these inpatient feeding studies, the researchers can have complete control over what their subjects eat. Nestle and the nutrition journalists can then correctly tout them as “tightly controlled”—the ideal of any experiment—but that control comes with a vitally important trade-off.
Housing study participants as in-patients in metabolic wards, as Hall did and the NfPH will do, is exorbitantly expensive.7 The longer the subjects are confined in the wards, the greater the expense. It doesn’t help that the subjects tend to go (non-technical term) stir crazy while confined. The longer they’re in, the more likely they’ll quit.
This is why (“owing to cost and complexity,” as LW&P put it) these tightly-controlled feeding periods tend to be short. In the experiments that established Hall’s eminence in the nutrition research stratosphere, he housed his patients for four weeks in the NIH metabolic ward, feeding them one test diet for two weeks and then another test diet for the following two weeks. In the landmark UPF study, one diet was ultra-processed, the other was minimally processed. Two weeks on each in random order. The NfPH, using three test diets, will also use two-week feeding periods.
So what’s wrong with this study design?
Let’s start with an obvious question: can we assume that the eating behavior of the participants in the two weeks on the test diets—how much they eat of each—can be extrapolated to their eating behavior over the months and years that obesity and other chronic diseases take to manifest themselves? Is two weeks long enough?8
Hall’s studies implicitly assume that we can. Ludwig and Putt, in their STAT essay, lead with a thought experiment to suggest, quite obviously, otherwise:
Imagine a clinical trial with sedentary, overweight adults. One group is assigned to remain sedentary, the other to undergo intensive physical training with daily runs, calisthenics, and sports. After a week or two, the training group would probably feel sore and tired, and their endurance might be reduced. But we wouldn’t conclude that physical activity is bad for health. Clearly, we’d need a better, longer study to see the benefits.
The same is true for diet, as Ludwig has been arguing since 2017 (if not before), because humans take longer than two weeks to adapt to significant changes in diet. Here’s LW&P in the BMJ on the issue of two-week-long trials, discussing only some of the physiological phenomena involved:
…adaptation to a major change in diet may require several weeks to months before transient effects subside and long term effects can be reliably observed. For instance, serum ketones reach steady state levels two to three weeks after a very low carbohydrate diet is started. Until then, nitrogen balance tends to be negative as the brain transitions from use of glucose (produced in part by gluconeogenesis from amino acids) to ketones as the primary metabolic fuel. Conversely, after switching from a ketogenic diet to a high carbohydrate diet, progressive changes in glucose tolerance can be observed for at least one month.
Although macronutrient adaptation presently lacks a formal definition, this prolonged multiorgan process can be assessed with numerous biomarkers… The preponderance of evidence indicates that metabolic adaptation cannot be considered complete after just a few days, contradicting a commonly stated rationale for use of diet arms of short term duration…
As the body adapts physiologically to the new diet and the human subjects adapt psychologically to the consequences of their new diet, how much they choose to eat (their eating or appetitive behavior) and how their bodies respond to what they eat (accumulating excess fat, for instance, or not) will change as well. Simply put, studying two weeks on a diet, no matter how tightly-controlled the study, tells the researcher (and so us) only what happens in two weeks on the new diet, no longer.
It tells the researchers about the transient effects of taking up a new diet—and perhaps how those effects are influenced by whatever the study subjects had been eating, as I’ll discuss in Part 2—but nothing about the long-term effects.
This seemed clear in 2021, when Hall and his colleagues published a Nature Medicine article comparing ad libitum energy intake (i.e., how much people eat when they can eat as much as they want) on a plant-based low-fat diet compared to an animal-based ketogenic diet (KD). The protocol was the same as Hall’s UPF study. Inpatients. Two weeks on each diet. The order of the diets determined by randomization. The results: the subjects consumed far more calories on the animal-based KD than on the plant-based, low-fat diet. Hall interpreted this, per his preconceptions, as “inconsistent” with predictions of Ludwig’s carbohydrate-insulin thinking.9
But the huge difference in calories consumed on the two diets observed in the study was far smaller in the second week of eating than the first. It was waning with time, suggesting precisely the adaptation phenomenon that Ludwig was pointing out. In week one, Hall’s subjects ate an average of 834 calories a day more on the KD than the low-fat diet. In week two it was down to 544 calories. [Thanks to Ludwig for pointing out my error.]
Hall had no way to know what would have happened had the trial continued for more than two weeks. It might be more of the same, but he had no way to know. Given adequate time for participants to adapt to the different diets, as LW&P proposed, their responses might have differed dramatically.10
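To make the point concrete, here is a minimal Python sketch that uses only the two weekly averages quoted above (834 and 544 calories a day). The three continuation rules (`plateau`, `linear`, `geometric`) are hypothetical illustrations, not models proposed by Hall or by LW&P. Each fits the two observed weeks exactly, and yet they disagree sharply about what week four would look like.

```python
# Observed average daily calorie gap (KD minus low-fat), from the text above.
WEEK1_GAP = 834
WEEK2_GAP = 544


def plateau(week: int) -> float:
    """Hypothetical: the gap stops shrinking after week two."""
    return WEEK1_GAP if week == 1 else WEEK2_GAP


def linear(week: int) -> float:
    """Hypothetical: the gap keeps dropping by the same amount each week."""
    weekly_drop = WEEK1_GAP - WEEK2_GAP
    return WEEK1_GAP - weekly_drop * (week - 1)


def geometric(week: int) -> float:
    """Hypothetical: the gap keeps shrinking by the same fraction each week."""
    weekly_ratio = WEEK2_GAP / WEEK1_GAP
    return WEEK1_GAP * weekly_ratio ** (week - 1)


if __name__ == "__main__":
    for week in range(1, 5):
        print(
            f"week {week}: plateau={plateau(week):7.1f}  "
            f"linear={linear(week):7.1f}  geometric={geometric(week):7.1f}"
        )
```

All three curves pass through the two observed data points, so two weeks of measurements cannot choose among them. By week four, the hypothetical plateau still shows a large gap, the straight line has crossed zero, and the geometric curve sits in between.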
The two-week-too-short problem is conspicuous in Hall’s UPF study as well. Here’s how Ludwig and Putt described it in their STAT op-ed:
During two-week inpatient stays, 20 volunteers initially ate about 600 calories more a day on the ultra-processed diet. However, this effect shrank by about 25 calories each day throughout the trial. At this rate, the diets would no longer differ after another two weeks…
Does ultra-processed food cause obesity? Maybe, but we’ll never know from short-term trials like these.
“Maybe, but we’ll never know” is not the answer we typically get from experiments that scientists invoke words like landmark to describe. The title of Hall’s paper had indeed been unambiguous (using Nestle’s preferred description): “Ultra-Processed Diets Cause Excess Calorie Intake and Weight Gain: An Inpatient Randomized Controlled Trial of Ad Libitum Food Intake.”
But that title didn’t tell the whole story, and maybe not the more important story.
Considering the rate at which the effect Hall observed was waning with time, as Ludwig and Putt noted, Hall could have been precise and titled the paper “Ultra-Processed Diets Cause Excess Calorie Intake and Weight Gain for two weeks and maybe four,” but that might have given away the game.
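The “at this rate” arithmetic in the Ludwig and Putt passage can be sketched in a few lines of Python. This is a back-of-the-envelope illustration only. It assumes the rounded figures they quote (a gap of about 600 calories a day at the start, shrinking by about 25 calories each day), it assumes the decline is a straight line, and it treats day 0 as the first day on the diet. None of those assumptions comes from Hall’s paper itself, and the constant and function names are made up for this sketch.

```python
# Rounded figures quoted from the STAT op-ed; the straight-line decline
# and the day-0 starting point are assumptions made for illustration.
INITIAL_GAP_KCAL = 600   # extra calories per day on the ultra-processed diet
DAILY_SHRINK_KCAL = 25   # how much the gap narrows each day
ARM_LENGTH_DAYS = 14     # length of each diet period in the trial


def gap_on_day(day: int) -> int:
    """Extrapolated calorie gap on a given day under the linear assumption."""
    return INITIAL_GAP_KCAL - DAILY_SHRINK_KCAL * day


def days_until_no_gap() -> float:
    """Day on which the extrapolated gap reaches zero."""
    return INITIAL_GAP_KCAL / DAILY_SHRINK_KCAL


if __name__ == "__main__":
    crossover = days_until_no_gap()
    print(f"gap at end of the 14-day arm: {gap_on_day(ARM_LENGTH_DAYS)} kcal/day")
    print(f"extrapolated gap reaches zero on day {crossover:.0f}")
    print(f"days beyond the measured period: {crossover - ARM_LENGTH_DAYS:.0f}")
```

With these rounded inputs the straight line reaches zero around day 24, roughly ten days past the end of a 14-day diet period. That is in the same range as the op-ed’s “another two weeks,” which presumably rests on the unrounded data. Either way, the result is a guess about days that nobody measured, which is exactly the problem with a two-week design.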
In Part 2, I’ll discuss the second major flaw in Hall’s studies and then how these same failings are built into the design of the NIH’s flagship Nutrition for Precision Health initiative. If I have space, I’ll suggest how these UPF studies should be done (if UPF research was a serious endeavor). If not, there may be a part 3.
Borrowing here from the Nobel Laureate Irving Langmuir’s famous, but not famous enough, definition of what he called pathological science.
The Von Békésy quote comes from the 1994 Nobel Prize lecture of George Olah, who said it was one of his “favorite quotations.”
Another problem, Békésy wrote, is that “enemies can sometimes develop into friends and lose a good deal of their zeal,” as had happened over his career: “Everyone, not just scientists, needs a few good enemies.”
This was an implication of the CNN article on his resignation, which included excerpts from the email Hall sent to NIH leadership:
“We have been hobbled on several occasions with intermittent inability to purchase food for our study participants or obtain research supplies,” he told Kennedy and [new NIH director Jay] Bhattacharya in the March 28 letter. “The future of our studies seems bleak given the inability to replace outgoing trainees who are the workhorses of our research.”
Further, Hall wrote, “I’ve also experienced incidences of censorship in my ability to discuss our research.”
Perhaps reading too much into this, the “further” suggests “censorship” was secondary.
CNN quoted an HHS spokesman disputing Hall’s portrayal of the interactions: “It’s disappointing that this individual is fabricating false claims. Any attempt to paint this as censorship is a deliberate distortion of the facts.”
The invention of sliced bread dates, coincidentally and conveniently, to 1928 and the golden era of vitamin research.
Metabolic wards are essentially the research version of hospital wards. If you’ve ever stayed overnight in a hospital and had to pay the bill, little more need be said.
One way to think of it, as I noted in January, is to simply ask why people wouldn’t eat less of a particular diet if they found themselves, after a couple of weeks, getting fatter because they ate too much of it? Humans, after all, are not laboratory rats. We will (at least in theory) consciously change our behavior in response to a change in circumstances.
It wasn’t for many reasons, to be discussed in future posts.
I made this point in a discussion with Hall on X when his study was published. I was curious how confident he was that the two week experience on these diets could be extrapolated to months, let alone years, as he was proposing. I asked him to predict what would have happened had the trial lasted, say, six months, rather than only two weeks. He chose not to respond.
Fabulous article Gary. I love the way you unpack research. I'm looking forward to the next one.
As always, I appreciate the deeper understanding of the science — in this case, the limitations with Hall’s UPF study.
I’m not sure yet how (or whether) that discussion relates to Hall’s complaints about censorship. Alice Callahan’s NYT article addresses the removal of the term “health equity” as one example of censorship. This has nothing to do with the MAHA agenda (the obliteration of DEI language is a MAGA agenda item). If that act of censorship was representative of a general culture shift at NIH, I would certainly understand Hall not wishing to stick around.
I understand that the greater point of this article is to address why journalists chose to spotlight the censorship story over the “doomed to fail” story, and I look forward to continued reading as your series unfolds.