Widespread statistical malpractice

Alvaro de Menard writes,

It is difficult to convey just how low the standards are. The marginal researcher is a hack and the marginal paper should not exist. There’s a general lack of seriousness hanging over everything—if an undergrad cites a retracted paper in an essay, whatever; but if this is your life’s work, surely you ought to treat the matter with some care and respect.

You have to read the whole long post to see how he got to that point. Pointer from Tyler Cowen.

16 thoughts on “Widespread statistical malpractice”

  1. Write the conclusion and abstract first. That should clarify your thinking, and lead to better use of statistics.

  2. But actually diving into the sea of trash that is social science gives you a more tangible perspective, a more visceral revulsion, and perhaps even a sense of Lovecraftian awe at the sheer magnitude of it all: a vast landfill—a great agglomeration of garbage extending as far as the eye can see, effluvious waves crashing and throwing up a foul foam of p=0.049 papers. As you walk up to the diving platform, the deformed attendant hands you a pair of flippers. Noticing your reticence, he gives a subtle nod as if to say: “come on then, jump in”.

    This guy can write.

  3. Statistics have been misused in sociology for decades; glad people are finally starting to say something.

  4. de Menard’s explanations for the problem are:
    1. Lack of oversight from the government agencies footing the bill for most of the research
    2. Lack of due diligence by the journals publishing the research
    3. Journals’ incentives to publish startling papers
    4. Journals’ reluctance to retract bad papers
    5. Lack of theoretical basis/understanding for much of the research
    6. In the “publish or perish” world, peer reviewers also need to publish and will themselves need peer reviewers, leading to an unspoken, “I’ll scratch your back if you scratch mine,” understanding
    7. Researchers not bothering to read the papers they cite or to check whether the papers they cite have been retracted
    8. Ignorance of statistics

    His solutions center around reforming government oversight. Getting government out of the picture is likely to be more effective.

    • I don’t know how one reads, let alone writes, that post without concluding that the whole existing system has been so thoroughly corrupted and gamed and overwhelmed by “cargo cult science” that it cannot be fixed or salvaged at all. Certainly not with minor tweaks, and probably not even with major reforms. Radical upheaval or outright dissolution and replacement is the minimum response required if we actually care about reliably trustworthy results.

      Basically, every clever idea that requires human beings to do their duty without reliable detection and penalty for violation has already been thought of, implemented, and failed entirely. Not just failed entirely, which is bad enough, but made it around two orders of magnitude more burdensome to get papers done. Not good papers, just any papers, which are still mostly bad papers. “Huge additional costs, zero apparent benefit” is the worst of all possible worlds, and such a bad world that one just needs to move to a totally different planet.

      So the only solution is a completely different mechanism and institution of accountability. Notice that DARPA used subsidized markets. Publishing should be much easier and quicker, and much riskier, like placing big bets or starting a business, if one is making bold, confident claims.

      There will be fewer papers, but better papers.

      • “cargo cult science”

        +1. Lmk if you’re willing to license this one.

        Somewhat unrelated, but curious how these results might jibe with all of the social science on the virus. Any thoughts?

        • Cargo cult science is an old term; it goes all the way back to Feynman’s Caltech speech in 1974 and has its own Wikipedia page.

          Virus stuff is a mixed bag. There is “hard microbiology,” which seems the most reliable and replicable, and prudent about keeping claims tethered to what good statistics tell us about the confidence of the results.

          When the science gets softer, the evidence weaker, the causal density higher, the ‘spherical cow’ mathy modeling more central, or the issue more politically charged (e.g., variolation, or the effects of hydroxychloroquine), then it starts to become impossible to determine how much trust to place in the results.

        • As Handle says, the phrase “cargo cult science” goes back to a speech by Richard Feynman. It’s included in Feynman’s book “Surely You’re Joking, Mr. Feynman!” as the last chapter (IIRC).
          That book is a truly wonderful collection of Feynman stories about himself; he was a man of many parts, with a sense of humor that kept him from ever taking himself too seriously. Among other stories, he writes about how he “played” with math in his head for fun, and how he taught himself Portuguese when he was invited to give speeches in Brazil. Oh, and he played bongos.
          http://calteches.library.caltech.edu/51/2/CargoCult.pdf

          • Postscript: Feynman also taught himself to pick locks, and was a great fan of pranking his co-workers.

  5. Probably another ignorant argument from me, but here goes:

    I’m thinking that the ability to come up with interesting and statistically meaningful results is incredibly rare:

    “Smoking causes cancer”

    “Ulcers result from a bacterium called H. pylori”

    How many of these are we likely to get per century?

    I know that “big data” was supposed to take us to the next level. But so far it’s been mostly non-informative.

    In the meantime, we’ve got thousands of academics that need to publish something in order to get tenure. The equilibrium of mostly meaningless nonsense logically follows.

  6. Frankly there are just too many PhDs. The talent pool is too diluted.

    Imagine if the NBA had 1,000 teams instead of 30. Those extra 10,000+ players will be playing basketball by its official rules but it will mostly be of very poor quality compared to what we see from LeBron James. There’s nothing you can do about that if you’ve committed to having 1,000 teams. There simply aren’t 15,000 basketball players who are as talented as an average NBAer.

    Similarly, I suspect in most fields there is a 1% who consistently produce papers with meaningful results – the LeBron Jameses of their field. And then there is the 99% who are technically playing by the rules of the game (p < 0.05, etc.) but simply aren’t talented enough to consistently produce high-quality work. (A quick simulation below shows how low a bar p < 0.05 really is.)
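    To make that concrete, here is a minimal sketch (assuming numpy and scipy are available; the numbers are illustrative, not from de Menard’s post). Each simulated “study” measures 20 independent outcomes with no true effect anywhere and reports only the most favorable comparison. Followed literally, the p < 0.05 rule still hands roughly two out of three such studies a “significant” result.

      # Illustrative sketch: many comparisons under the null, reporting only the
      # best p-value, as a caricature of "playing by the rules" of p < 0.05.
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(0)

      def run_study(n_outcomes=20, n_per_group=30):
          """Return the smallest p-value across n_outcomes comparisons with zero true effect."""
          p_values = []
          for _ in range(n_outcomes):
              treatment = rng.normal(size=n_per_group)  # no real treatment effect
              control = rng.normal(size=n_per_group)
              _, p = stats.ttest_ind(treatment, control)
              p_values.append(p)
          return min(p_values)

      results = [run_study() for _ in range(2000)]
      share_significant = np.mean(np.array(results) < 0.05)
      print(f"Share of pure-noise studies reporting p < 0.05: {share_significant:.2f}")
      # Expected: roughly 1 - 0.95**20 = 0.64, even though every true effect is zero.

    The threshold by itself filters out almost nothing once a researcher has any freedom in what to report; the quality has to come from somewhere else.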

  7. “All social sciences are primarily dedicated to the identification and retrodiction of behavioral homogeneities, with the unintended consequence of ignoring or dismissing heterogenizing, particularizing, and potentially unpredictable “forces” such as ideas. Even when positivist social scientists have no predictive ambitions, and even when they treat ideas as dependent variables, they seem disinclined to treat them as robust causal forces that might scatter behavior idiosyncratically rather than regularizing it conveniently.”

    -Jeffrey Friedman in Power without Knowledge

    End public subsidies for research and higher education. End the destructiveness and waste of their tyrannical and entrenched rent-seeking regime, the worst in the world.

      • Well, I had hoped that Friedman’s “heterogenizing, particularizing, and potential unpredictables” would offer a relevant perspective on the reproducibility crisis. He writes in the introduction about what the book is getting at, the point supported by the first quote:

        “Technocracy, I will suggest, could be effective in achieving its ends—whether or not these ends are democratically determined—only if the human behavior that technocrats attempt to control can be reliably predicted. But the prediction of human behavior is an extremely difficult task, far more so than the predictive tasks at which natural science excels. An effective technocracy, therefore, may very well be out of reach.” That, at least, is closer to plain English.

        • I’ve got a hard copy of Friedman’s book right here next to me, but I’m reluctant to open it for fear of wanting to throw it against the wall shortly after starting. Why write a text that is intelligible only to the high priests? Major missed opportunity.

          Sorry, but I’m not going to be able to get through this with my sanity intact, nor is 99% of the population: “identification and retrodiction of behavioral homogeneities, with the unintended consequence of ignoring or dismissing heterogenizing…”

          And yes, Friedman is brilliant, as I’ve pointed out on this site previously.

  8. de Menard’s problem is that he went into this thinking the bulk of scientists are more intelligent and ethical than the bulk of any other profession. He wrote:

    I find this belief impossible to accept. The brain is a credulous piece of meat but there are limits to self-delusion. Most of them have to know.

    No, they don’t have to know. How many times do you have to see credentialed people mess up cause and effect before you realize that the concept itself is beyond almost all of them?
