We all know the famous line popularized by Mark Twain: “There are three kinds of lies: lies, damned lies, and statistics.” That and the adage that one can find a statistic to prove anything may suggest that statistics are simply untrustworthy.
In fact, measurement and analysis – the fundamentals of statistics – are the only honest way to understand a full story – with the caveat that the numbers and the methods for their collection are understood and sound.
Stand-alone numbers rarely help one understand an event. If I told you that 7 people in the room had hats, you would know almost nothing about the situation I describe.
But ratios are more descriptive. If I told you that 7 out of 100 (7/100 or 7%) of the people in the room had hats, you would know that most people were not wearing hats. The picture is clearer.
Even better is the time series. If I told you that 7% of the people in the room were wearing hats and that at last year’s event, 14% of people wore hats, you would know that the share of people who wear hats is small and fell by half from one year to the next. You would have a story about what was happening with the hats.
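The hat arithmetic can be sketched in a few lines of Python. This is only an illustration: the 7% and 14% figures come from the example above, but the room size of 100 for last year and the function name `share` are assumptions made for the sketch.

```python
def share(count: int, total: int) -> float:
    """Return count as a fraction of total (a simple ratio)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return count / total


# This year's figure is from the example: 7 hats among 100 people.
this_year = share(7, 100)
# Assumption: last year's room also held 100 people, 14 of them in hats.
last_year = share(14, 100)

# Difference between the two shares, in percentage points.
point_change = this_year - last_year
# Change relative to last year's share.
relative_change = point_change / last_year

print(f"This year: {this_year:.0%}, last year: {last_year:.0%}")
print(f"Change: {point_change * 100:+.0f} percentage points ({relative_change:+.0%})")
```

The single number 7 says little, the ratio says more, and the comparison across two years is what turns it into a story: the share of hat wearers halved.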
But this still isn’t enough. We also need to know how the numbers were collected if we are to trust the story they tell. If the number of people in the room was measured at lunch time one year and during the heart of the event a year later, we would have little confidence in the number of people in the room. Similarly, if the hats were counted on heads one year and inside the coatroom the other year, we would not have confidence in the number of hats.
Furthermore, it is important to discount any data that may be biased. If our hat statisticians offered a reward to ensure participation, they likely introduced selection bias (as mentioned in yesterday’s post). Instead of the number of hats in the room, they likely ended up with the number of hats worn by people who liked the reward. Imagine how different the results would be if the reward were a hat pin (a bias toward those who like hats) or a tube of sunblock (a bias toward those who do not wear hats).
Back on point, here’s a great example demonstrating how the COVID Infection Rate (positive tests divided by total tests) will be affected by selection bias. As we approach this year’s flu season, the infection rate is likely to go down even though the number of people infected with COVID stays the same or increases. The reason is that confusion between flu and COVID symptoms will lead more people to seek out COVID testing. This could dramatically increase the number of tests (the denominator), leading the infection rate to fall irrespective of the number of people infected.
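To see the denominator effect in numbers, here is a minimal sketch. It assumes the rate is computed as positive tests divided by total tests, and every figure in it is hypothetical, chosen only to show the arithmetic; none of it is real testing data.

```python
def positivity_rate(positive_tests: int, total_tests: int) -> float:
    """Positive tests as a fraction of all tests given."""
    if total_tests <= 0:
        raise ValueError("total_tests must be positive")
    return positive_tests / total_tests


# Hypothetical figures, not real data.
# Before flu season: 10,000 tests, 500 of them positive.
before = positivity_rate(positive_tests=500, total_tests=10_000)

# During flu season: flu symptoms send twice as many people to get tested,
# and the number of positive tests actually rises to 600.
during_flu = positivity_rate(positive_tests=600, total_tests=20_000)

print(f"Before flu season: {before:.1%}")
print(f"During flu season: {during_flu:.1%}")
```

In this made-up scenario the count of positive tests grows by 100, yet the rate drops because the number of tests doubled. The rate moved because of who showed up to be tested, not because fewer people were infected.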
Summing up, the clean numbers are fatality rate, hospital beds (available and filled), and population size. Ratios and time series that include these numbers will help us correctly analyze the situation. However, ratios that include biased numbers, such as the COVID Infection Rate, should be avoided. No matter how often the Governor repeats that rate, it does not describe the picture we are seeing.
Also, hats are cool.