The “Impress Me” Test

Controversial therapies are usually fighting over scraps of “positive” evidence that damns them with faint praise

1,600 words, published 2009, updated Aug 23rd, 2014
by Paul Ingraham, Vancouver, Canada
I am a science writer, the Assistant Editor of ScienceBasedMedicine.org, and a former Registered Massage Therapist with a decade of experience treating tough pain cases. I’ve written hundreds of articles and several books, and I’m known for sassy, skeptical, referenced analysis and a huge bibliography. I am a runner and ultimate player, and live in beautiful downtown Vancouver, Canada. • full bio • about SaveYourself.ca

It is common for those who promote dubious therapies and treatments to claim scientific support based on studies that technically had statistically significant positive results — but when you look at the data you only find evidence of a trivial beneficial effect. The evidence may be positive, but it fails to impress. The treatment is damned with faint praise.
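
To see how that works with actual numbers, here’s a minimal simulation sketch in Python (all numbers made up for illustration; it assumes numpy and scipy are available). A treatment that shaves a trivial 1.5 points off a 0–100 pain score can still earn a “statistically significant” p-value, as long as the trial is big enough:

```python
# Hypothetical trial: a trivial 1.5-point benefit on a 0-100 pain scale
# still comes out "statistically significant" because the sample is huge.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 2000                                  # participants per group
placebo   = rng.normal(50.0, 20.0, n)     # mean pain 50, SD 20
treatment = rng.normal(48.5, 20.0, n)     # a measly 1.5-point improvement

t_stat, p = stats.ttest_ind(treatment, placebo)
cohens_d = (placebo.mean() - treatment.mean()) / 20.0  # effect size

print(f"p-value:   {p:.4f}")         # under 0.05: "significant!"
print(f"Cohen's d: {cohens_d:.2f}")  # ~0.08: trivial by any standard
```

Technically positive, headline-ready … and clinically meaningless.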

If a therapy actually works well, it should be fairly easy to prove it.1 When tested, the results should be impressive enough that they leave little room for argument. When a treatment is clearly shown to be effective, it’s exciting! It makes headlines, and it should.

But it’s also incredibly rare.

Most slightly “positive” study results are actually just bogus

The weaker a positive result, the more likely it is to simply be wrong: not actually positive at all. There are several ways that a positive study can actually be negative…

Early studies of a treatment tend to be sketchier and “positive,” often conducted by proponents trying to produce scientific justification for their methods. Eventually less biased investigators do better quality studies, and the results are negative. This is a classic pattern in the history of science.

So you can see why I’m a little skeptical when someone enthusiastically shows me one paper from an obscure journal reporting a “significant” benefit to, say, acupuncture — which has probably been the subject of more of these “positive” studies than any other treatment.

Better than nothing?

If you’re a glass-is-half-full person, you might be happy to say that weakly positive results are “better than nothing.” Science says chiropractic adjustment might improve my back pain by 3%? Heck, I’ll take 3%!

Sometimes the better-than-nothing interpretation is fair and fine,7 and I’ve used it myself many times. Just don’t confuse optimistic pragmatism with actual knowledge. Weakly positive results, even real ones, do not mean it’s truly established that a treatment “works a little bit.” The bar for that is higher.

Specifically, a treatment has to beat the null hypothesis in most studies over time. This is hard.

The null hypothesis — a pillar of the scientific method — is that most hopeful theories amount to nothing when carefully checked. In plain English, the null hypothesis says, “Most ideas turn out to be wrong.” And therefore most weakly positive results will turn out to be the product of bias and wishful thinking. And that’s fine.
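
A quick sketch makes the point (Python again, with entirely made-up trials): run fair tests of enough treatments that do nothing at all, and the conventional p < 0.05 threshold alone will hand a few of them “positive” results.

```python
# Run fair trials of 200 completely useless treatments and count how
# many squeak under the conventional p < 0.05 significance threshold.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
false_positives = 0
for _ in range(200):
    # both groups drawn from the same distribution: zero real effect
    control = rng.normal(50.0, 20.0, 30)
    treated = rng.normal(50.0, 20.0, 30)
    _, p = stats.ttest_ind(treated, control)
    if p < 0.05:
        false_positives += 1

print(f"{false_positives} of 200 useless treatments looked 'positive'")
# expect roughly 10 (about 5%), and that's before bias, p-hacking, or
# selective publication make the scraps look even shinier
```

And that is with perfectly honest researchers. Stack bias and selective reporting on top, and a trickle of weakly positive results is exactly what you’d expect from treatments that don’t work.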

Treatments should be considered useless until proven effective. The burden of proof is on the pusher of the idea (and it’s a heavy burden). Treatments must work well and clearly to actually beat the null hypothesis. They must impress! Until they do, the null hypothesis looms over them, still very likely to win in the long run.

The null hypothesis has kicked a lot of theories’ butts over the centuries. It is still the champ.

Maybe it’s just hard to confirm?

The wishful thinker is also inclined to say, “But maybe there is a strong effect and it’s just erratic, hard for science to pin down!”

Perhaps.

But any strong effect that is so hard for science to pin down that we can’t even prove it exists is also awkward or useless in practice. If a standardized treatment protocol can’t deliver the goods in a somewhat reliable fashion, it’s not really medicine — or at least it’s not medicine I want to spend my money on until its “erratic” nature is better understood.

Fighting over scraps

The science of painful problems is still surprisingly rudimentary and preliminary. We can try to critically assess it, and I do, but “replication needed” is usually all that really needs to be said. That covers all the bases. At the end of the day, if slightly promising results cannot be confirmed by other researchers, it doesn’t really matter what was wrong with the original research. Either a treatment works well enough to consistently produce impressive results … or it doesn’t.

Controversy about many popular therapies is much ado about not much, and we’re mostly fighting over pathetic scraps of evidence. After decades of study, the effectiveness of a therapy should be clear and significant in order to justify its continued existence or “more study.” If it’s still hopelessly mired in controversy after so many years — more than a century in some cases (*cough* homeopathy *cough*) — how good can it possibly be? Why would anyone — patient or professional — feel enthusiastic about a therapy that can’t clearly show its superiority in a fair scientific test? Where’s the value in even debating a therapy that is clearly not working any miracles, that has a trivial benefit at best?

The long-term persistence of such debate constitutes evidence of absence. Several dozen crappy studies with weakly positive results are roughly equivalent to proof that there’s no beef, with or without high quality studies to put the nail in the coffin. More research is a waste of time and resources.8

Science, as they say, really delivers the goods: missions to Mars, long lives, the internet. A therapy has to deliver the goods. It’s got to help most people a fair amount and most of the time … or who cares?

Until it impresses you, it’s just some idea that hasn’t yet shown much promise.

It’s okay not to know

Readers and patients are forever asking me what my “hunch” is about a therapy: does it work? Is there anything to it? I’m honoured that my opinion is so sought after, but I usually won’t take the bait. Like Carl Sagan, “I try not to think with my gut.”

It’s okay not to know. It’s okay for the jury to be out.

And it had better be, because there’s still a great deal of mystery in musculoskeletal health science. Most of the scientific evidence that I interpret for readers of SaveYourself fails the “impress me” test. Even when that evidence is actually positive — and it’s hard to tell — it’s often only slightly positive. Even when there’s evidence that a therapy works, it’s usually weak evidence: some studies concluded that maybe it helps some people, some of the time … while other studies, almost always the better ones, showed no effect at all. I’m supposed to get excited about this? To justify real confidence in a therapy, we want really good evidence, evidence that makes you sit up and take notice, evidence that ends arguments because it’s just that clear.

Anything less fails to impress!

I don’t want to believe. I want to know.

Carl Sagan

We must somehow find a way to make peace with limited information, eagerly seeking more, without being dogmatic about premature conclusions.

Science And The Game Of 20 Questions, by Val Jones

About Paul Ingraham

I am a science writer, former massage therapist, and assistant editor of Science-Based Medicine. I have had my share of injuries and pain challenges as a runner and ultimate player. My wife and I live in downtown Vancouver, Canada. See my full bio and qualifications, or my blog, Writerly. You might run into me on Facebook and Google, but mostly Twitter.

Notes

  1. Standard proof caveat: nothing is ever truly “proved,” of course. When we talk of proof in science, we don’t mean total certainty, but more like the certainty you feel about the sun rising tomorrow. BACK TO TEXT
  2. Many study results are called “statistically significant,” giving unwary readers the impression of good news. But it’s misleading: statistical significance means only that the measured effect of a treatment is probably real (not a fluke). It says nothing about how large the effect is. Many small effect sizes are reported only as “statistically significant” — it’s a nearly standard way for biased researchers to make it sound like they found something more important than they did. See Statistical Significance Abuse: A lot of research makes scientific evidence seem more “significant” than it is.

    BACK TO TEXT
  3. One of my favourites is another technically correct but misleading stats term, “trending.” When the results are positive but not statistically significant, paper authors will often still summarize by saying that there was a “positive trend” in the data: not enough to claim significance, mind you, but not actually negative. It’s a good way of making a worthless study still sound a little positive. BACK TO TEXT
  4. Except it’s usually noteworthy that, even by cheating and lying and bending every rule in their favour, they still couldn’t produce better results! BACK TO TEXT
  5. SY Ingraham. Bogus Citations: 11 classic ways to self-servingly screw up references to science, like “the sneaky reach” or “the uncheckable”. SaveYourself.ca. 1924 words. BACK TO TEXT
  6. Ioannidis. Why Most Published Research Findings Are False. PLoS Medicine. 2005. BACK TO TEXT
  7. Whether you use unimpressive positive results to justify giving a treatment a try depends largely on other factors: Is it expensive? Is it dangerous? Will it interfere with other, better treatment options? And so on. It’s a pragmatic calculation, not a scientific conclusion. BACK TO TEXT
  8. Gorski et al. Clinical trials of integrative medicine: testing whether magic works? Trends in Molecular Medicine. 2014.

    A lot of dead horses are getting beaten in alternative medicine: pointlessly studying silly treatments like homeopathy and reiki over and over again, as if it’s going to tell us something we don’t already know. This point has been made ad infinitum on ScienceBasedMedicine.org since its founding in 2008, but here Drs. Novella and Gorski make the case against testing “whether magic works” in a high-impact journal, Trends in Molecular Medicine.

    BACK TO TEXT