How to Read a Supplement Study: Funding, Endpoints, Sample Size, and Effect Size (2026)

A practical science literacy guide to reading supplement research. Study hierarchy, surrogate vs hard endpoints, industry funding bias, p-hacking, subgroup traps, and effect size vs statistical significance.

Medically reviewed by Dr. Emily Torres, Registered Dietitian Nutritionist (RDN)

Most supplement marketing cites studies. Most of those studies do not actually support the claim. The gap between "a study showed" and "the evidence supports" is where the supplement industry lives. Learning to read a study takes under an hour of concept work and pays back forever. You need to know five things: where the study sits in the evidence hierarchy, what endpoint it measured, how many people it enrolled and for how long, who paid for it, and whether the effect size was clinically meaningful or just statistically significant. This guide walks through each.

Science literacy is not scientism. You do not need to dismiss every non-randomized study or reject every industry-funded paper. You need to calibrate confidence. A single small trial with a surrogate endpoint and an industry sponsor moves you a little. A Cochrane meta-analysis of multiple well-powered RCTs with hard endpoints moves you a lot.

The evidence hierarchy

From weakest to strongest

| Study type | Typical purpose | Common pitfalls | Weight in decisions |
| --- | --- | --- | --- |
| Case report | Describe rare event or novel observation | Not generalizable; no control | Hypothesis-generating only |
| Cross-sectional | Snapshot of prevalence/association | Cannot establish timing; confounders | Low (exploratory) |
| Case-control | Retrospective comparison | Recall bias; selection bias | Low to moderate |
| Prospective cohort | Follow groups forward in time | Unmeasured confounders; long duration | Moderate |
| Randomized controlled trial (RCT) | Test causal effect | Small samples; short duration; surrogate endpoints | High, if well run |
| Meta-analysis / systematic review | Pool multiple RCTs | Heterogeneity; publication bias | High |
| Cochrane review | Rigorously protocolized systematic review | Narrow question scope | Highest available for supplements |

What this means in practice

If a supplement is supported primarily by cross-sectional studies and a couple of small RCTs, you are looking at a signal, not a conclusion. If a Cochrane review has pooled the RCTs and found a small or null effect, that outweighs any new trial of similar size that contradicts it.

Endpoints: surrogate vs hard

Definitions

A hard endpoint is a clinically meaningful outcome: mortality, stroke, heart attack, fracture, hospitalization, diagnosis of a disease.

A surrogate endpoint is a biomarker believed to track a hard endpoint: LDL cholesterol, blood pressure, HbA1c, bone mineral density, inflammatory markers.

Why the distinction matters

Surrogate endpoints move faster and cheaper than hard endpoints, but they do not always translate. The history of medicine is full of drugs that moved a surrogate without moving mortality (the CAST trial on antiarrhythmics is a classic example). Supplement trials almost always use surrogates because hard endpoints require large, long, expensive studies.

When a supplement ad cites "clinically proven to lower LDL," the translation is: "a biomarker moved in a study." Whether that biomarker change produces longer or healthier life is a separate question.

Sample size and study duration

Why sample size is the first number to check

A study of 20 people cannot reliably detect anything but a massive effect, and most supplements do not produce massive effects. Small trials are prone to the "winner's curse": when a real but small effect happens to reach statistical significance, it is usually because the estimate overshot by chance, so the published effect looks larger than the truth and shrinks on replication.
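A small simulation makes the winner's curse concrete. This sketch (not from the article; all numbers are illustrative) runs many tiny two-group trials with a genuinely small effect, keeps only the ones that happened to reach significance, and shows that their average estimated effect is much larger than the truth:

```python
import random
import statistics

def simulate_winners_curse(true_d=0.2, n_per_group=20, trials=20_000, seed=42):
    """Run many small two-group trials with a real but small effect
    (true standardized difference true_d, unit variance) and return the
    average estimated effect among the trials that reached significance.
    Significance here is a simple two-sided z-test: the difference in
    group means must exceed 1.96 * sqrt(2 / n_per_group)."""
    rng = random.Random(seed)
    threshold = 1.96 * (2 / n_per_group) ** 0.5
    significant_estimates = []
    for _ in range(trials):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_d, 1.0) for _ in range(n_per_group)]
        diff = statistics.fmean(treated) - statistics.fmean(control)
        if abs(diff) > threshold:
            significant_estimates.append(abs(diff))
    return statistics.fmean(significant_estimates)

# True effect is d = 0.2, but the "significant" trials report an
# average effect several times larger.
inflated = simulate_winners_curse()
```

With 20 per group, only differences larger than about 0.62 can reach significance at all, so every published "positive" estimate is at least triple the true effect of 0.2.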

Power calculations

A credible study reports a pre-specified power calculation: "We enrolled 180 participants to have 80% power to detect a 10% difference at alpha 0.05." Studies that do not report power calculations, or that adjust sample size after looking at the data, should be read with extra skepticism.
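The arithmetic behind such a power calculation can be sketched with the standard normal-approximation formula for a two-group comparison of means. This is a back-of-envelope version, not what any particular trial used; the function name and defaults are illustrative:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-sample
    comparison of means, via the normal approximation:
        n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2
    where d is the standardized effect size (Cohen's d)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance cutoff
    z_beta = z.inv_cdf(power)           # quantile for the desired power
    return ceil(2 * ((z_alpha + z_beta) / effect_size_d) ** 2)

moderate = n_per_group(0.5)  # moderate effect: 63 per group
small = n_per_group(0.2)     # small effect: 393 per group
```

The asymmetry is the point: halving the expected effect size roughly quadruples the required sample, which is why underpowered supplement trials are so common.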

Duration

Many supplement endpoints require at least 8 to 12 weeks to measure. Six-week trials on skin elasticity, cartilage recovery, or cognitive performance often underestimate effects that need more time. Conversely, long trials that start reporting results at an early interim analysis can overstate short-term effects.

Funding and conflicts of interest

Industry-funded research skews positive

Lesser and colleagues (PLoS Medicine, 2007) found that nutrition studies funded by industry were more likely to report results favorable to the sponsor than independently funded studies. Later work on pharmaceutical and food industry funding has repeatedly replicated this pattern.

This does not mean industry-funded research is fabricated. It means that study design choices, endpoint selection, and selective publication all slightly tilt findings. A single industry-funded positive trial should update you less than an independent one of the same size.

Conflict of interest disclosures

Reputable journals require authors to disclose funding sources and conflicts. Read the disclosure section before reading the abstract. If the corresponding author is a paid consultant to the sponsor and the study is positive, calibrate accordingly.

P-hacking and multiple endpoints

What p-hacking looks like

A study measures 20 outcomes. One reaches p < 0.05 by chance. The paper headlines that finding. This is the multiple comparisons problem; when the analytic choices themselves are made after seeing the data, it becomes the subtler "garden of forking paths." Both inflate false positives.

Red flags

  • The primary endpoint listed at registration (check ClinicalTrials.gov) differs from the primary endpoint in the published paper.
  • The abstract emphasizes a secondary or subgroup analysis.
  • No correction (Bonferroni, Benjamini-Hochberg) is applied for multiple comparisons.
  • Significant results are reported only for subgroups (for example, "in men over 55 with low baseline vitamin D").
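The arithmetic behind the first and third red flags is simple enough to compute directly. A short sketch (the function names are my own) of the familywise error rate across independent tests, and the Bonferroni correction that caps it:

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false positive across n_tests
    independent tests when every null hypothesis is actually true."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test significance threshold that keeps the familywise
    error rate at or below alpha."""
    return alpha / n_tests

# 20 outcomes tested at p < 0.05: roughly a 64% chance that at least
# one comes up "significant" by chance alone.
fwer = familywise_error_rate(20)
# Bonferroni would demand p < 0.05 / 20 = 0.0025 for each outcome.
threshold = bonferroni_alpha(20)
```

So a paper reporting one significant result out of 20 uncorrected tests has found roughly what chance predicts.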

Subgroup analysis

Subgroup findings should be treated as hypothesis-generating, not conclusive, unless the study was pre-specified to test that subgroup with adequate power.

Effect size vs statistical significance

Why "statistically significant" is not enough

A p-value tells you how likely data at least as extreme as the observed data would be if the null hypothesis were true. It does not tell you how large the effect is or whether it matters clinically.

A well-designed study with 5,000 participants can detect a trivially small effect as statistically significant. The right question is: how large is the effect, and does it matter?

Useful effect size measures

  • Cohen's d: standardized difference between two means. d = 0.2 is small, 0.5 is moderate, 0.8 is large.
  • Risk ratio / odds ratio: how much more (or less) likely an outcome is in the treatment group.
  • Number needed to treat (NNT): how many people must take the supplement for one additional person to benefit. NNT of 10 is strong; NNT of 500 is trivial for most healthy people.
  • Absolute risk reduction: actual percentage-point change, not relative. A drop from 2% to 1% is a 50% relative reduction but only a 1-percentage-point absolute reduction.

Relative risk reductions are often used in marketing because they sound larger than they are.
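The absolute-versus-relative distinction can be made concrete with the 2% to 1% example from the list above. A minimal sketch (function name is my own) computing all three quantities from the two risks:

```python
def risk_summary(control_risk, treated_risk):
    """Summarize a treatment effect in absolute and relative terms.
    Risks are event probabilities, e.g. 0.02 means a 2% event rate."""
    arr = control_risk - treated_risk  # absolute risk reduction
    rrr = arr / control_risk           # relative risk reduction
    nnt = 1 / arr                      # number needed to treat
    return arr, rrr, nnt

# The article's example: event risk drops from 2% to 1%.
arr, rrr, nnt = risk_summary(0.02, 0.01)
# arr  -> 0.01  (one percentage point)
# rrr  -> 0.5   (the "50% reduction" a marketer would quote)
# nnt  -> 100   (100 people treated for one additional person to benefit)
```

Same data, three honest numbers; the marketing copy will usually quote only the 50%.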

Replication

One study is a hypothesis

No matter how well-designed, a single study is a starting point. Replication — ideally in different populations by different research groups — is what turns a finding into evidence. Supplements with positive single trials that fail to replicate (for example, resveratrol for longevity in humans) should be held loosely.

Pre-registration

Check whether the trial was pre-registered (ClinicalTrials.gov, ISRCTN, or a journal registration). Pre-registration reduces the opportunity for outcome switching and selective reporting.

Five questions to ask any supplement study

  1. Who paid? Industry sponsorship is a calibration factor, not a disqualifier.
  2. How big? Sample size and pre-specified power.
  3. How long? Does the duration match the biology of the claimed effect?
  4. What endpoint? Hard outcome or surrogate marker?
  5. Replicated? Is there a meta-analysis or Cochrane review on this question?

If you can answer these five questions, you can read most supplement research more critically than most of the marketing department citing it.

Nutrola and evidence-based choices

Nutrola is built around evidence-tracking rather than marketing claims. The app tracks 100+ nutrients, supplement intake, and biomarker changes at €2.50 per month with zero ads, so users can run their own n-of-1 alongside the published evidence. Nutrola Daily Essentials ($49/month, lab tested, EU certified, 100% natural) is positioned around ingredients with multi-RCT or Cochrane-level support rather than single-trial buzz.

Nutrola is rated 4.9 stars across 1,340,080 reviews.

Frequently Asked Questions

Is an RCT always better than a cohort study?

For causal questions about treatment effects, yes — a well-run RCT is stronger than a cohort study of similar size. But cohort studies are essential for long-term outcomes (mortality, chronic disease) that RCTs rarely measure. The two study types complement each other.

What is a clinically meaningful effect size?

It depends on the outcome. A 3 mmHg drop in systolic blood pressure is modest individually but meaningful at a population level. A 1-point improvement on a 100-point sleep scale is usually not meaningful. Always ask what magnitude matters for the specific outcome.

Should I trust an industry-funded study?

You can read it, but weight it less. Industry-funded studies are more likely to report favorable findings. A single industry-funded positive trial should not outweigh a Cochrane review showing null.

What is the difference between a systematic review and a meta-analysis?

A systematic review is a structured, protocolized search and summary of the literature. A meta-analysis quantitatively pools the results of multiple studies. Cochrane reviews are both.

How do I find out if a supplement has good evidence?

Start with the NIH Office of Dietary Supplements fact sheets, Cochrane reviews, and major meta-analyses in indexed journals. Supplement company websites are not evidence bases; they are sales materials that cite evidence selectively.

Why does Nutrola emphasize reading studies?

Because the gap between published evidence and marketing claims is the single biggest source of wasted money in this category. Teaching users how to read a study is cheaper and more durable than giving them a list of approved products.

Medical disclaimer

This article is for educational purposes and does not constitute medical advice. Study interpretation for personal health decisions should ideally be done with a qualified clinician. Do not start, stop, or change a supplement or medication based solely on a single study.

References

  1. Lesser LI, et al. Relationship between funding source and conclusion among nutrition-related scientific articles. PLoS Med. 2007.
  2. Higgins JPT, et al. Cochrane Handbook for Systematic Reviews of Interventions.
  3. Ioannidis JPA. Why most published research findings are false. PLoS Med.
  4. Chan AW, et al. SPIRIT 2013 statement: defining standard protocol items for clinical trials. Ann Intern Med.
  5. Schulz KF, et al. CONSORT 2010 statement: updated guidelines for reporting parallel group randomised trials. BMJ.
  6. Fleming TR, DeMets DL. Surrogate end points in clinical trials: are we being misled? Ann Intern Med.
  7. Head ML, et al. The extent and consequences of p-hacking in science. PLoS Biol.

Ready to Transform Your Nutrition Tracking?

Join thousands who have transformed their health journey with Nutrola!
