Playing the odds

As with many things that are strange and unfamiliar to the general public, nuclear power stations attract the fear that something about them will harm the health of local residents. In this case the fear concerns the effects of radiation. Of course, no one can see radiation hurting anyone, so it is difficult to prove whether or not the population is being affected. Enter the epidemiologist.

Epidemiology is the study of the health of populations through statistical means. In issues surrounding the effects of nuclear power on public health, epidemiological studies usually focus on detecting an increased incidence of cancer in the vicinity of nuclear installations. The actual effects are discussed elsewhere; here we discuss the limitations of epidemiological studies.

MEASURING IN A RANDOM WORLD

In the chaotic real world, many types of events are random and governed by probability rather than determinism. For example, driving over a certain blood alcohol level will not definitely cause a crash, and similarly driving below that level will not definitely mean a safe journey. But the probability of a crash when driving drunk is much higher than when driving sober. Similarly, living in a more carcinogenic environment will not definitely cause cancer, and living in a less carcinogenic environment will not definitely mean good health. But in a more carcinogenic environment, the probability of cancer is higher.

Where the probability of cancer in a population increases, the expected number of cancer cases increases. Of course, this expectation cannot be calculated from first principles because the science is not well enough understood. Instead, it is estimated by averaging the cancer rate over all regions to obtain a baseline expectation, or mean. But in the real world, which is probabilistic and not deterministic, there is no guarantee that the actual cancer incidence in any given area will match that expectation. It could be higher or it could be lower. This fact makes things complicated for epidemiology. It is not enough to simply look at the area around a nuclear facility, see a cancer rate higher than the expectation and presume that cancer rates are elevated in the region. The possibility that the difference is just random noise must first be ruled out. To do this, epidemiologists employ a significance test.
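To see how much scatter pure chance can produce, here is a minimal sketch in Python. It assumes that every resident has the same independent chance of becoming a case, which is a simplification, and the population size, risk and function name are invented for illustration and are not real data.

```python
import random


def simulate_case_counts(n_regions, population, risk, seed=0):
    """Draw a case count for each region, all sharing the same underlying risk."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = []
    for _ in range(n_regions):
        # Assumption: each resident independently becomes a case with probability `risk`.
        cases = sum(1 for _ in range(population) if rng.random() < risk)
        counts.append(cases)
    return counts


if __name__ == "__main__":
    population, risk = 10_000, 0.005  # illustrative values, not real data
    counts = simulate_case_counts(20, population, risk)
    print("expected cases per region:", population * risk)
    print("simulated cases per region:", counts)
    print("lowest:", min(counts), "highest:", max(counts))
```

Every simulated region has exactly the same risk, yet the counts differ from one region to the next. Any gap between the lowest and the highest count in this sketch is random noise and nothing more.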

The significance test begins with a null hypothesis, which is provisionally taken to be true. Here the null hypothesis is that the mean cancer rate in the region under study is the same as the overall average, the rate that would be expected if the power station were having no effect at all on the population. The significance test is all about trying to reject this hypothesis by measuring the cancer rate and determining whether it is significantly different from what is expected. If the epidemiologist cannot find a cancer rate high enough to reject the null hypothesis, then they are forced to accept that there is no evidence of any increase, which of course is good news (unless you’re an anti-nuclear campaigner).

But how high does the cancer rate have to be to be considered different from the average? In the standard 5% test, if a rate at least as extreme as the measured one would have only a 5% probability or less of occurring under the null hypothesis, that is, if the region under study were normal, then the difference is considered significant and the null hypothesis is rejected. If the null hypothesis is rejected, then the alternative hypothesis, in which the mean in the region under study is different from the average, is accepted.
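As a rough sketch of how such a test might be carried out, the Python below assumes that case counts follow a Poisson distribution and that only an excess of cases is of interest (a one-sided test). Both are modelling assumptions made here for illustration, and the function names are hypothetical; real studies may use different models and corrections.

```python
import math


def poisson_upper_tail(observed, expected):
    """P(X >= observed) when X is Poisson-distributed with mean `expected`."""
    term = math.exp(-expected)  # P(X = 0)
    below = 0.0                 # running total of P(X < observed)
    for k in range(observed):
        below += term
        term *= expected / (k + 1)  # P(X = k + 1) computed from P(X = k)
    return max(0.0, 1.0 - below)


def is_significant(observed, expected, level=0.05):
    """True if a count at least this high is unlikely enough under the null hypothesis."""
    return poisson_upper_tail(observed, expected) <= level


if __name__ == "__main__":
    expected = 50  # illustrative baseline, not real data
    for observed in (55, 70):
        p = poisson_upper_tail(observed, expected)
        print(observed, "cases: p =", round(p, 4), "significant:", is_significant(observed, expected))
```

With a baseline of 50 expected cases, a count of 55 is well within ordinary random variation, while a count of 70 falls below the 5% threshold and would be declared significant under these assumptions.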

The problem with this method should be clear. If the alternative hypothesis is accepted whenever the measured rate has only a 5% probability of occurring under the null hypothesis, then for every twenty perfectly normal regions that are tested, we expect about one to be flagged as anomalous: a false positive. In other words, even if the cancer rate is statistically significant, it does not imply that it is definitely different from the baseline mean. On the flip side, even if the measured cancer rate is at the baseline, that does not rule out the possibility that the region does in fact experience a different cancer rate.
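The false positive problem can be illustrated with a small simulation. The Python sketch below assumes Poisson-distributed case counts and a one-sided 5% test, both chosen here for illustration, and all names and numbers in it are hypothetical. Every simulated region is perfectly normal, so every region flagged as significant is a false alarm.

```python
import math
import random


def poisson_upper_tail(observed, expected):
    """P(X >= observed) when X is Poisson-distributed with mean `expected`."""
    term = math.exp(-expected)  # P(X = 0)
    below = 0.0
    for k in range(observed):
        below += term
        term *= expected / (k + 1)
    return max(0.0, 1.0 - below)


def sample_poisson(rng, mean):
    """Draw one Poisson count using Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def false_positive_rate(n_regions, expected, level=0.05, seed=0):
    """Fraction of normal regions that a one-sided test at `level` flags anyway."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    flagged = 0
    for _ in range(n_regions):
        observed = sample_poisson(rng, expected)  # the null hypothesis is true here
        if poisson_upper_tail(observed, expected) <= level:
            flagged += 1
    return flagged / n_regions


if __name__ == "__main__":
    # Illustrative values: 2000 normal regions, 10 expected cases each.
    print("fraction flagged:", false_positive_rate(2000, 10.0))
```

Because case counts are whole numbers, the fraction flagged tends to sit at or a little below 5% rather than exactly on it, but it never drops to zero: some normal regions are always declared anomalous.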

This is the problem with statistics in a random world. They must be used with caution.