The testing scam

I used to be a big proponent of testing to help manage the virus. But now I am backing off that. Here is the problem.

Suppose that as a scam, I say that I have a test for the virus. But in fact, I plan to use a random number generator that 5 percent of the time will say that you have the virus and 95 percent of the time will say that you don’t.

If half the population has the virus and half does not, then my scam will be exposed very quickly. My test will be making lots of mistakes, telling people who have it that they don’t and vice-versa.

But if less than 5 percent of the population has the virus, it may not be so clear. Most of the people who “test negative” in my scam will in fact be negative, so I will have that going for me. My problem, which may not be readily apparent, is that most of my positives will be false positives and a few of my negatives will be false negatives.
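
The arithmetic behind the scam can be sketched in a few lines. This is only an illustration: the function name `random_test_outcomes` and the 2 percent prevalence in the example are assumptions, not figures from any real test. Because the random number generator ignores who is infected, the share of its positives that are real simply equals the prevalence.

```python
def random_test_outcomes(prevalence, positive_rate=0.05):
    """Expected outcome shares for a 'test' that says positive at random.

    The result is independent of infection status, so each cell is a
    simple product of two probabilities.
    """
    true_pos = prevalence * positive_rate
    false_pos = (1 - prevalence) * positive_rate
    false_neg = prevalence * (1 - positive_rate)
    true_neg = (1 - prevalence) * (1 - positive_rate)
    return {
        "accuracy": true_pos + true_neg,
        "ppv": true_pos / (true_pos + false_pos),   # share of positives that are real
        "npv": true_neg / (true_neg + false_neg),   # share of negatives that are real
    }

# 50% prevalence is the case from the text; 2% is an assumed low-prevalence case.
for prevalence in (0.50, 0.02):
    shares = random_test_outcomes(prevalence)
    print(prevalence, {k: round(v, 3) for k, v in shares.items()})
```

At 50 percent prevalence the fake test is right only half the time. At the assumed 2 percent prevalence it is "right" about 93 percent of the time, even though 98 percent of its positives are false.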

I am not saying that existing tests are pure scams. But to be better than pure scams, there has to be a much lower margin of error than you might think.

The tests that we have are giving nonsensical results, such as a husband and wife with identical symptoms getting opposite results, or studies that if they were extrapolated would imply that more than 100 percent of New York state has had the virus.

I was one of those FDA-bashers who thought that requiring certification for tests was peacetime bureaucratic thinking. I have come to realize that in order to be useful, the tests have to be highly accurate. If that is where FDA was coming from, I can now appreciate that.

After I wrote the above, but before posting, a commenter pointed me to an essay by Peter Kolchinsky, which aligns with my thinking.

The meaning of getting a positive result also depends on the percent of the population that has been infected. If 50 percent of people have been infected, then a test with a 97 percent sensitivity and a 2 percent false-positive rate is still likely to be 98 percent right if it tells you you’re positive. If only 2 percent of people are infected, then such a test would be only 50 percent right if it said you’re positive.
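
Kolchinsky's two figures can be checked with Bayes' theorem. The sketch below uses only the numbers in the quote; the function name `positive_predictive_value` is mine, chosen for illustration.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """P(infected | positive test), via Bayes' theorem."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * false_positive_rate
    return true_pos / (true_pos + false_pos)

for prevalence in (0.50, 0.02):
    ppv = positive_predictive_value(
        prevalence, sensitivity=0.97, false_positive_rate=0.02
    )
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}")
```

The results come out at roughly 98 percent and roughly 50 percent, matching the quote.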

31 thoughts on “The testing scam”

  1. Paul Romer uses ‘toy model simulations’ to argue that (even very imperfect) universal tests, plus quarantine of those who test positive, greatly reduce (or delay?) contagion:

    https://paulromer.net/covid-sim-part3/

    “How much difference does it make if the test used to send people into quarantine is bad? Not as much as you might think.
    The simulated data here contrast policies that isolate people who test positive using four different assumptions about the quality of the test. Even a very bad test cuts the fraction of the population who are ultimately infected almost in half.”

    Many questions arise:
    Are simple toy models apt?
    Here, Romer models only false negatives. What about false positives?
    Would rapid investment in preemptive shielding and isolation of vulnerable demographics be a superior strategy (given scarcity of tests, problems of accuracy, etc)?
    At what level of general test inaccuracy does forcible quarantine of an individual who tests positive violate individual rights?
    And so on and so forth.

  2. Solution: “identify and isolate” plan B: assess everyone for mortality risk using mortality data against each individual’s physical and health state. Give everybody their “C19 mortality risk score”. Isolate and restrict the activity of all high risk individuals.

    What is anti-fragile Arnold’s mortality risk score?
    What is the score for Risky Randy?

    Goal: beat the virus, save the economy and minimize deaths.

  3. Even worse, there was this article in the weekend WSJ on reinfection of recovered patients in S Korea, https://tinyurl.com/yb7pdvv7. Even if you assume a 99% specificity with the 95% sensitivity, and assume a 1% prevalence (510,000 cases, roughly 50 times higher than the 10,700 confirmed cases), you get a positive predictive value (PPV) of 49 percent. Yet no one quoted in the article suggests that these reinfections might have been false positives. So has S Korea been isolating people who were not infected? Is the true confirmed case number half what they’ve reported? It seems the best way to partially address this is to do a repeat confirmatory test when an individual tests positive.

    • I made this exact point on another blog. The stories about reinfections keep popping up every day, and the most likely explanation is that the people involved were either false positives the first time or the second time around.

    • I agree with the repeat confirmatory test concept. But there is a concern that test failures are not random, but instead will be repeated in the same person.

      • Thinking about this further, the PPV in this analysis assumes random testing of the population, symptomatic and asymptomatic people alike. So, in a low-prevalence population, more asymptomatic people will be tested. But I assume that in S Korea they were mostly testing symptomatic people, so they’re testing a smaller population with a much higher prevalence of infections. This could significantly increase PPV. By how much? We would need to know the total number of people tested and how many of those tested positive. So, maybe the false positive issue is not as serious as I previously thought.

        • I found the S Korea testing data here, https://tinyurl.com/stjex5f
          They’ve tested 589,520 (assume unique individuals) and 10,708 were positive, for a prevalence of about 1.8 percent. This increases the PPV to about 64 percent. So false positives are still roughly one third of the total.
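
The figures in this thread can be recomputed, and the effect of a repeat confirmatory test sketched, as below. Several things here are assumptions rather than facts: the 95 percent sensitivity and 99 percent specificity are the thread's guesses, using the share of tested people who were positive as the pre-test probability is the thread's shortcut (that share itself contains false positives), and the second update assumes test errors are independent between the two tests.

```python
def update_probability(prior, sensitivity, specificity):
    """Posterior probability of infection after one positive result."""
    true_pos = prior * sensitivity
    false_pos = (1 - prior) * (1 - specificity)
    return true_pos / (true_pos + false_pos)

SENSITIVITY, SPECIFICITY = 0.95, 0.99   # figures assumed in the thread

# Pre-test probabilities: the 1% guess, and the positive share among those tested.
priors = {"1% of population": 0.01, "positive share of tested": 10708 / 589520}

for label, prior in priors.items():
    after_one = update_probability(prior, SENSITIVITY, SPECIFICITY)
    # Second positive result; assumes errors are independent between tests.
    after_two = update_probability(after_one, SENSITIVITY, SPECIFICITY)
    print(f"{label}: prior {prior:.2%}, "
          f"one positive {after_one:.1%}, two positives {after_two:.1%}")
```

Under these assumptions a single positive is right about 49 percent of the time at 1 percent prevalence and near 64 percent using the tested-positive share, while a second independent positive pushes both well above 95 percent. If test failures tend to repeat in the same person, the second test adds less than this sketch suggests.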

  4. Testing is still good for knowing whether the virus has already ripped through a population, and whether it will be safe to relax soon.

    Kolchinsky writes:

    But it’s also less of an issue if there is very high rate of positives, as in a German village that reported 14 percent positive for the virus—even a 2 percent false-positive rate can’t explain such a high number of cases. As SARS-CoV-2 spreads, even poorly calibrated tests will become more accurate, but it is hoped that that doesn’t have to happen.

    It happened. There is now also an American village that reported 21% positive called New York City.

  5. or studies that if they were extrapolated would imply that more than 100 percent of New York state has had the virus.

    Can you provide a link to this?

    • I believe this is coming from people extrapolating the Los Angeles County and Santa Clara antibody studies, which showed that the percentage of the population with antibodies is 28-55 times higher than reported cases in LA and 50-85 times higher in Santa Clara. However, I believe it makes no sense to extrapolate these under-report rates to other counties, because the rate depends on how widespread testing is in a particular location. Another important factor would be the age profile of the infected; assuming age doesn’t impact how likely someone is to be infected, if a location skews towards a population where younger, healthier people are infected at higher rates than in another location, perhaps also with a much greater proportion of asymptomatic cases, you would expect a higher under-reported number.

      The more insightful number is the estimate of how much of the population in the county was infected- the LA study shows 2.8-5.6%, which most would argue is highly plausible, and probably too low for a place like NYC, which is confirmed by their antibody testing indicating that 21% of the random sample were infected; again, a number that appears to be plausible.

      Also note that an antibody test showing 21% positives is going to be less impacted by false positives than one showing 5%. If 100 people are tested with a 2% false positive rate, you’d get 2 positives even if no one had the disease. In Santa Clara and LA that would be nearly half of your positives. In NYC that would be 2 out of the 21 positives. I don’t know what the false negative rates are, but when you get such a high positive rate it is a number that can’t simply be dismissed due to false positives. I find it highly unlikely that the test has a 20% false positive rate.
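
The comparison above amounts to a worst-case bound: if nobody in the sample were infected, false positives would show up at the full false-positive rate, so they can account for at most that rate divided by the observed positive rate. A minimal sketch follows; the function name is hypothetical and the 4 percent "LA-like" rate is an assumed value inside the 2.8-5.6% range quoted above.

```python
def max_false_positive_share(observed_positive_rate, false_positive_rate):
    """Upper bound on the fraction of observed positives that are false.

    Worst case: nobody is infected, so false positives appear at the full
    false-positive rate across the whole sample.
    """
    return min(1.0, false_positive_rate / observed_positive_rate)

# 0.04 is an assumed value within the quoted LA range; 0.21 is the NYC figure.
for place, observed in {"LA-like sample": 0.04, "NYC sample": 0.21}.items():
    share = max_false_positive_share(observed, false_positive_rate=0.02)
    print(f"{place}: up to {share:.0%} of positives could be false")
```

With a 2 percent false-positive rate, up to half of the positives in the low-prevalence sample could be false, against at most about one in ten in the NYC sample.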

  6. To be sure we don’t throw the baby out with the bath water, we probably still want to screen where we can cheaply. Checking for fever and low oxygen saturation can help alert to possible infection early and allow early treatment. The Asian countries seem to use temperature scanning much more than we do in the USA. I’ve read entry and exit to apartment buildings in Peking requires one to have one’s temperature checked. But there are alternatives to such a manpower intensive approach: Robots help get Hong Kong university exams on track
    https://sc.mp/f6zyl

    • And Steve Sailer has more on how checking your oxygen saturation with an oximeter (a cheap gadget that costs less than a single virus test) can save your life, relating anecdotal evidence of people not being seen by a doctor until their oxygen saturation is very low: https://www.unz.com/isteve/the-next-thing-to-stock-up-on/. He also has another post on why Dr Kling’s idea of a no-ventilator advance directive may be a good idea.

  7. I think you’re comparing testing to an ideal policy, when you should be comparing it to current policy. It’s the same error as people who point to market failure and conclude that a government takeover would solve all the problems without introducing new ones.

    Take Kolchinsky’s numbers with a 2% true infected rate, for an example. 100 people are tested. 4 of them test positive and are required to self-quarantine until they have 2 weeks without symptoms. Only 2 of them were actually sick. Meanwhile, 96 people go about their business – probably all of them truly uninfected. Isn’t that better than the current situation where 80 people are required to self-quarantine, 20 people are ‘essential’ and go to work, and the only indication on whether essential people are sick is whether they self-report symptoms?

    It’s true that two of the four people required to quarantine wouldn’t have had to if we had perfect knowledge. That’s still much better than the current situation where 78 of the people required to quarantine wouldn’t have to.

    Speaking of lockdown socialism, part of why it’s unsustainable is the sheer numbers. 20 people can’t pay for 80 to stay home for months, let alone years. With testing, 96 people would need to pay for 4 to stay home for weeks or a month – a much more tolerable ratio. Further, since the 4 have a clear start date, it’s likely their quarantine can have a clear end date as well – also an improvement from the current situation. We could even afford to make quarantine preferable to working if the quarantine ratio is small enough. Perhaps pay people more than their current salaries for proof that they are home. The test would prevent this from becoming moral hazard.

    False negatives are similar: they’re a huge problem compared to hypothetical perfect knowledge, but much much better than relying on self-reported symptoms for screening, which is essentially our current policy.

    All we need from a testing policy is ‘good enough’. It’s still clearly worse than perfect knowledge, but that’s an unreasonable point of comparison.
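
The head counts in this comment can be reproduced with a short calculation. This is a sketch under stated assumptions: the 97 percent sensitivity and 2 percent false-positive rate are Kolchinsky's figures from the post, and the lockdown line assumes infection is spread evenly between the 80 who stay home and the 20 who work, which the comment does not state.

```python
def quarantine_counts(population, prevalence, sensitivity, false_positive_rate):
    """Expected people quarantined under test-and-isolate, split by true status."""
    infected = population * prevalence
    healthy = population - infected
    return {
        "infected_quarantined": infected * sensitivity,
        "healthy_quarantined": healthy * false_positive_rate,
        "infected_missed": infected * (1 - sensitivity),
    }

testing = quarantine_counts(
    100, prevalence=0.02, sensitivity=0.97, false_positive_rate=0.02
)

# Lockdown comparison from the comment: 80 of 100 stay home regardless of status.
# Assumes (hypothetically) that infection is independent of who stays home.
lockdown_healthy_quarantined = 80 * (1 - 0.02)

print({k: round(v, 2) for k, v in testing.items()})
print("healthy people quarantined under lockdown:",
      round(lockdown_healthy_quarantined, 1))
```

Testing quarantines about two healthy people per hundred, against roughly 78 under the lockdown described.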

  8. A sentiment rarely seen on blogs: “I was wrong”. How refreshing to see it here. Raises your credibility. One reason I love this blog.

  9. Hmm, that doesn’t sound right to me. As I understand it, there is nothing about the COVID virus that makes it uniquely difficult to test for. The PCR technology used for testing is well established, and I wouldn’t think the rate of false positives or negatives would be high, provided one uses the correct protocols and reagent recipe. And as with any test, one must make sure that the sample is obtained from the correct place in the body. Perhaps they are still working out the kinks for that. But issues with that would only cause false negatives. I’m guessing that there are still some issues being worked out with the testing process for this particular virus. They’ve only been testing in the U.S. at significant scale for about a month, so I wouldn’t be surprised if there are issues.

    • Also, given that certain segments of the population are much more vulnerable than others, or have the potential to spread much more than others, there are testing venues that would yield substantially more benefits than others. For example, there would be few downsides, but many benefits, to frequent and large-scale testing of the elderly (especially in nursing homes) and healthcare workers, including those who work in nursing homes. Given the little I know about nursing homes, I can see a situation where, once established in a facility, the virus could really wreak havoc.

  10. I stated weeks ago that the testing resources were being wasted (the RT-PCR tests) on the general population. It was never a practical plan to test everybody every few days to determine who has to be quarantined at any given time- the time delays involved were a hurdle that could not be overcome, and then you have to add in the logistical problems of sample collection, test runs, quality control runs, and data entry at both ends. This plan for mass testing was always pure fantasy. The testing resources should have been used exclusively on hospital patients, nursing home residents, and the people who work closely with those groups.

    The antibody tests are more important for the general population; it might be possible to do all this at home with a commercial product at some point (think home pregnancy test). The data I have seen suggests you can get the sensitivity to around 99% and the specificity at least as good, and I expect that the prevalence is already high enough in lots of places that they can be used to assess actual fatality rates. I also expect that this prevalence will grow enough everywhere else that the error rates no longer matter.

    • Well, I agree with much of what you say, but not all of it. I agree that large-scale and frequent testing of the entire U.S. population is not happening, and was never going to happen. Some basic math shows this to be true, as there are roughly 330 million people in the U.S. and testing them all on, say, a weekly or monthly basis would imply a gigantic daily testing number. However, “symptomatic” persons come in various flavors. I would guess that the number of persons with some sort of symptoms is much larger than the number of persons currently tested, although I couldn’t tell you if that number is 2x, 20x or somewhere in between. Even in light of this uncertainty, I think increased testing would highlight a pretty significant portion of folks who are actually positive, but wouldn’t be identified given the current intensity of daily testing.

      To me, identifying folks who are positive, even if it is less than half of “actual” infections (and maybe even a lot less than half) probably has benefits, and there is a good chance that these benefits are significant or very significant:
      1. Allows these people to self isolate, if they so choose. Because many infections occur within the family unit, there is a good chance that folks will indeed self isolate if they live in a household with 2 or more people, as people usually are concerned with the health of their family members.
      2. Allows these folks, and their healthcare practitioners, to think ahead with respect to their healthcare options, should their condition worsen.
      3. Could facilitate some sort of test and trace, although I acknowledge that this may or may not happen.
      4. Provide better data on hotspots, which may help (or not help)
      5. Provide better data on actual transmission mechanisms

      That said, I am open to changing my mind. For example, I can see a case where testing provides few benefits if a) it turns out that the percentage of persons who have contracted the disease climbs so high (maybe 40%? not sure what the magic number is) that herd immunity kicks in in a major way, and b) the virus doesn’t mutate, and immunity doesn’t wear off, in a way that undoes that herd immunity. I can see other scenarios in which testing provides few benefits.

      • At this point, with the virus so widespread, if you have symptoms you should self-quarantine. No test needed (well, perhaps needed at the end to verify non-infectiousness). Testing might help in the test-and-trace scenario, where we look for the asymptomatic carriers who don’t know they are infectious. However, our ability to create a system to do such test-and-trace on a national or even state scale, in the time-frame required and with the effectiveness necessary, seems doubtful.

        A better place to do testing might be at a bottom-up level, where individual businesses test their employees and customers immediately, so that people entering would feel and be safe to work and shop. This would require fast and cheap tests, but perhaps these are coming.

        In the meantime, testing becomes just one of a suite of things we do to reduce the virus’ infectiousness. In other words, we have to learn how to live with it.

        • I agree. Testing is not a complete solution. And test-and-trace may or may not happen in an effective way. But I still believe that testing can probably provide lots of benefits, even if implemented in a highly imperfect way.

  11. The problem, though, with the antibody tests is going to be political- given the actions taken, no one in elected office is going to want to see any data that shows the fatality rate isn’t at least 0.5%, and they would prefer that no data shows it under 1%.

    • Right now, municipal NYC has around 12k deaths, with an extrapolated 21% of the population infected. It has a population of 8.4 million people. So that would mean 1.764 million infected, for an infected fatality rate of .68%. So for right now at least, it isn’t looking like NYC is seeing an infected mortality rate under .5%.
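
The estimate above is deaths divided by the estimated number of people ever infected. The sketch below reproduces it and shows how strongly it depends on the antibody-survey figure; the 15 and 30 percent alternatives are hypothetical values chosen for illustration, not survey results.

```python
def infection_fatality_rate(deaths, population, share_infected):
    """Deaths divided by the estimated number of people ever infected."""
    return deaths / (population * share_infected)

ifr = infection_fatality_rate(
    deaths=12_000, population=8_400_000, share_infected=0.21
)
print(f"NYC estimate: {ifr:.2%}")

# How the estimate moves with the surveyed share (hypothetical alternatives).
for share in (0.15, 0.21, 0.30):
    rate = infection_fatality_rate(12_000, 8_400_000, share)
    print(f"share infected {share:.0%}: {rate:.2%}")
```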

      My recollection is that S. Korea, the Diamond Princess cruise ship, and Guangdong province all have case fatality rates between .5% and 1.0%. I think Iceland and Germany are in that range as well. So given that multiple places with relatively extensive testing all seem to have fatality rates clustering in a band, and that the extrapolated data from NYC is falling in that band as well, I think that it is unlikely that we will see an infected fatality rate much below .5%, unless we have big improvements in case management (including perhaps medications currently in clinical trials).

      I am not sure what magnitude of impact the new protocol on ventilators and prone breathing will have on fatality rates.

      I am also not sure how quickly the virus will mutate to cause milder infections. I would suspect that perhaps that is already happening, as people with bad symptoms are most likely to be limiting their contact with others, outside of seeking treatment.

      So even if we don’t get a vaccine by the end of 2022, I would suspect that SARS-Cov-2 will no longer be an issue for the economy by then, due to some combination of improved case management reducing fatality and duration of hospitalization, lower reproductive numbers due to higher percentages of the population with acquired immunity, selection of more contagious but less severe strains of the virus, and changes in human behavior that slow the spread of the virus.

  12. I was one of those FDA-bashers who thought that requiring certification for tests was peacetime bureaucratic thinking. I have come to realize that in order to be useful, the tests have to be highly accurate. If that is where FDA was coming from, I can now appreciate that.

    WOW!!!!!

  13. This is the same problem we have with facial recognition algorithms for terrorists. If false positives are 1%, and the Atlanta airport sees 275,000 passengers per day, and an actual terrorist shows up once every five years or so, then you have 2,750 positives per day, essentially all false.
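
The airport numbers work out as follows. The passenger count, false-positive rate, and one-terrorist-per-five-years figure come from the comment; the perfect detection rate is an added assumption that flatters the system.

```python
PASSENGERS_PER_DAY = 275_000
FALSE_POSITIVE_RATE = 0.01
TERRORISTS_PER_DAY = 1 / (5 * 365)   # one every five years, per the comment
DETECTION_RATE = 1.0                 # assumption: the system never misses one

false_alarms = (PASSENGERS_PER_DAY - TERRORISTS_PER_DAY) * FALSE_POSITIVE_RATE
true_alarms = TERRORISTS_PER_DAY * DETECTION_RATE
chance_alarm_is_real = true_alarms / (true_alarms + false_alarms)

print(f"false alarms per day: {false_alarms:,.0f}")
print(f"chance a given alarm is real: {chance_alarm_is_real:.1e}")
```

Even with perfect detection, only about one alarm in five million would be real.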

  14. Just artificially seed the batch of tests with a few known positives and known negatives, every day. Check the results of those to see if the testers are well-calibrated, or just scamming.

    Doesn’t seem like it should be that hard.
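
A minimal sketch of that audit: score the lab's answers on the seeded controls and estimate sensitivity and specificity directly. The control data below are invented for illustration (20 known positives and 20 known negatives per lab), and the function name is hypothetical.

```python
def estimate_accuracy(controls):
    """Estimate sensitivity and specificity from seeded control samples.

    `controls` is a list of (truly_positive, reported_positive) booleans
    for samples whose true status the auditor already knows.
    """
    positives = [reported for truth, reported in controls if truth]
    negatives = [reported for truth, reported in controls if not truth]
    sensitivity = sum(positives) / len(positives)
    specificity = sum(not r for r in negatives) / len(negatives)
    return sensitivity, specificity

# Invented data. The "scam" lab flags 1 sample in 20 regardless of truth,
# mimicking a random generator that says positive 5% of the time.
scam_lab = ([(True, i == 0) for i in range(20)]
            + [(False, i == 0) for i in range(20)])
# The "honest" lab misses one known positive and raises no false alarms.
honest_lab = ([(True, i != 0) for i in range(20)]
              + [(False, False) for _ in range(20)])

print("scam lab:  ", estimate_accuracy(scam_lab))
print("honest lab:", estimate_accuracy(honest_lab))
```

A sensitivity and specificity that sum to about 1, as for the invented scam lab, is the signature of a test no better than chance. With only a few controls per day the estimates are noisy, so results would need to be pooled over time.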

  15. If the goal is to reopen large parts of the economy, Romer’s analysis is still valuable. Say that a cheap saliva test can be administered to all employees. Yes, an imperfect test will yield false positives. So you send those employees away until they are symptom-free and test negative three times.

    That is much better than the current situation which treats all employees as infectious. Don’t make the perfect the enemy of the good.

  16. Lots of tests give false positives. It’s a staple of introductory statistics to ask a question like you did: “the rate of lung cancer in the population is 1%, this test has a false positive rate of 5%, your test comes back positive, what’s the probability that you have lung cancer?” Most people answer 95%. Of course the answer is closer to 17%, even if the test catches every real case (hope I got that one right).
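
Counting people rather than probabilities makes the lung cancer question easier to see. The sketch below assumes a cohort of 10,000 and a test that catches every true case; the question as posed gives neither, so both are assumptions.

```python
def positives_per_cohort(cohort, base_rate, false_positive_rate, sensitivity=1.0):
    """Count true and false positives in a cohort (natural-frequency view).

    sensitivity=1.0 is an assumption; the question as posed does not give one.
    """
    sick = cohort * base_rate
    true_positives = sick * sensitivity
    false_positives = (cohort - sick) * false_positive_rate
    return true_positives, false_positives

true_pos, false_pos = positives_per_cohort(
    10_000, base_rate=0.01, false_positive_rate=0.05
)
print(f"{true_pos:.0f} true positives, {false_pos:.0f} false positives")
print(f"chance a positive is real: {true_pos / (true_pos + false_pos):.1%}")
```

Of 10,000 people, 100 have the disease and test positive, while 495 of the 9,900 healthy people also test positive, so a positive result is real roughly one time in six.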

    After you’ve had the cheap test, and it comes back positive, then you usually get the expensive one (depending on consequences, costs, etc).

    For covid19, it’s just economics:
    – availability and costs of tests with different levels of discrimination
    – the consequences overall of being sent home when you could be working (seeing as there’s no cure, there’s no medical benefit to you, apart from stress, in being certain)

  17. As many commenters have pointed out, the predictive value of any diagnostic test will vary based on the prevalence of the disease being tested for (i.e. the pre-test probability). We can’t escape the math of Bayes’ Theorem.

    What’s perhaps more problematic/interesting is that test sensitivities/specificities are typically validated relative to some “gold standard”, with the gold standard test often being too expensive/cumbersome/risky to use routinely (e.g. CT angiography vs. ‘real’ angiography to diagnose pulmonary embolism, or large-vessel stroke, or CT vs MRI for various brain pathologies).

    We haven’t really coalesced around a single “gold standard” for diagnosis, and I question comparisons of different testing regimens/products/techniques. We know that upper respiratory RT-PCR testing in some symptomatic patients will be negative at first but then positive a few days later. Is that our “gold standard”… validating a test against its repeat a few days later (is the test equally unreliable at each iteration, or has the disease progressed past a consistent threshold of detection)? Is it a compelling history of exposure combined with clinical features and imaging?

    In patients who have progressed to frank COVID pneumonia, lower respiratory samples may have better yield, but obtaining them is more invasive for the patient and higher risk to the healthcare provider. Anecdotes prove very little, but a former colleague of mine working in NYC had multiple negative nasal swabs before a bronchoscopic sample revealed the COVID infection that was strongly suggested by the clinical circumstances and CT scan (which was a horrific sight!).

    I’ll leave the question of how much is a “bad test” worth to those who have thought more about it, but it is worth bearing in mind that we can’t even rigorously assess how bad those tests are right now.

    -An anesthesiologist/intensivist
