<!-- Reliability and validity are presented together because they are related, and are often
     confused with one another. -->
Reliability is a property of a measure that refers to its <b>precision</b>, or the degree to
which multiple observations of a given phenomenon yield identical results. In public health,
measures such as death rates or birth outcomes are often used to indicate the true underlying
risk of illness or disability in a population. But sometimes these measures of risk fluctuate
when the true underlying risk of disease does not. The reasons for the variability usually
include one or more of the following factors: 1) the health event is relatively rare, 2)
the population size is relatively small, or
3) the health events do not occur at regular time intervals.
Even for complete count datasets, such as birth and death certificate datasets, random
fluctuations over time can yield estimates that are not reliable. Consider the case of
low birth weight in a small community. In this community, one low birth
weight infant is born each month, on average. But low birth weight is a health event that does
not necessarily occur at regular intervals - there is randomness in the timing of low birth
weight occurrence. In our small community, if three mothers give birth to low birth weight infants
at the end of December of Year 1, and none do in January or February of Year 2, it may appear
as though the rate of low birth weight births has declined from Year 1 to Year 2.
Fortunately, statistical techniques can be used to help assess whether there was a significant
difference in rates from Year 1 to Year 2. The confidence
interval is the statistical measure that conveys the reliability of an estimate. The wider
the confidence intervals around two estimates, the less likely it is that the difference
between them will be statistically significant.
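As a rough illustration of how population size affects the width of a confidence interval, the sketch below computes a 95% interval for a rate. It assumes the event count is approximately Poisson and uses the normal approximation (standard error of a count = square root of the count), which is only one of several common methods and is known to be poor for very small counts. The function name `rate_confidence_interval` and the counts are hypothetical, chosen only for illustration.

```python
import math

def rate_confidence_interval(events, population, per=1000, z=1.96):
    """Return (rate, lower, upper), each expressed per `per` population.

    Assumption: normal approximation to the Poisson distribution,
    where the standard error of a count is sqrt(count).
    """
    if population <= 0:
        raise ValueError("population must be positive")
    if events < 0:
        raise ValueError("events must be non-negative")
    rate = events / population * per
    margin = z * math.sqrt(events) / population * per
    # A rate cannot be negative, so the lower limit is clipped at zero.
    return rate, max(0.0, rate - margin), rate + margin

# Hypothetical counts: identical rates, very different population sizes.
small = rate_confidence_interval(12, 150)
large = rate_confidence_interval(1200, 15000)
print("small community: %.1f (%.1f to %.1f)" % small)
print("large community: %.1f (%.1f to %.1f)" % large)
```

Both communities have the same rate of 80 per 1,000, but under this approximation the interval for the small community is ten times as wide, so a change of a given size is much harder to distinguish from random fluctuation there.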
Rates that fluctuate over time, in the absence of changes in underlying risk, are considered
unreliable. Such rates are also commonly referred to as "unstable." Since the underlying risk
typically changes very slowly, the term "unstable" is commonly used to refer to any rates
that fluctuate in a random pattern over relatively short timeframes.
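To get a feel for how much a count can fluctuate while the underlying risk stays fixed, the sketch below returns to the small community that averages one low birth weight infant per month (12 per year). It assumes, purely for illustration, that the yearly count follows a Poisson distribution. The helper names `poisson_pmf` and `prob_count_outside` are hypothetical.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k events when the expected count is `mean`."""
    # Computed on the log scale to avoid overflow for large k.
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def prob_count_outside(mean, low, high):
    """Probability that the count falls below `low` or above `high`."""
    inside = sum(poisson_pmf(k, mean) for k in range(low, high + 1))
    return 1.0 - inside

expected_per_year = 12  # one low birth weight infant per month, on average
p = prob_count_outside(expected_per_year, 9, 15)
print(f"Chance the yearly count differs from 12 by more than 3: {p:.2f}")
```

Under this assumption, roughly three years in ten would show a count of 8 or fewer, or 16 or more, even though the true underlying risk never changed.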
Validity is a property of a measurement that refers to its <b>accuracy</b>, or the degree to
which observations reflect the true value of a phenomenon. In public health, the validity
of most measures is quite good. The cause of death on a death certificate is certified by a
physician, survey measures have been tested to maximize validity, and birthweight is measured
and reported at the birth hospital. There are some measures that we question, for instance
self-reported drug and alcohol use, but on the whole, public health measures have a high
degree of validity.
In the three figures below, the bulls-eye of the target represents the true underlying
risk of disease in a population and the holes in the target represent multiple objective
measurements of the risk. In the first figure, the measure is reliable - it measures nearly
the same value each time. But the measure in Figure 1 is not valid - the average of the
scores is not close to the true underlying risk. In the second figure, the scores are not
very reliable - there is a lot of variability in the scores, but they center around the true
risk value, so they are valid (at least on average). In the third figure, the
measure is both reliable and valid.
[Bulls-eye diagram showing reliability and validity concepts]
<br/><br/>
The term "precision" is often used in relation to reliability, while the term "accuracy" is used
to describe validity.
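The three targets can also be mimicked with a small simulation. This is only a sketch under stated assumptions: measurements are drawn from a normal distribution, "bias" (distance of the average from the true value) stands in for lack of validity, and "spread" (standard deviation) stands in for lack of reliability. The true risk value, offsets, and function names are hypothetical.

```python
import random
import statistics

def simulate_measure(true_value, bias, spread, n=2000, seed=0):
    """Draw n measurements with a systematic offset (bias) and random noise (spread)."""
    rng = random.Random(seed)
    return [rng.gauss(true_value + bias, spread) for _ in range(n)]

def summarize(measurements, true_value):
    """Return (bias, spread): mean minus the true value, and the standard deviation."""
    bias = statistics.mean(measurements) - true_value
    spread = statistics.stdev(measurements)
    return bias, spread

TRUE_RISK = 10.0  # hypothetical true underlying risk (the bulls-eye)
scenarios = {
    "reliable, not valid": simulate_measure(TRUE_RISK, bias=3.0, spread=0.2),
    "valid, not reliable": simulate_measure(TRUE_RISK, bias=0.0, spread=3.0),
    "reliable and valid": simulate_measure(TRUE_RISK, bias=0.0, spread=0.2),
}
for label, values in scenarios.items():
    bias, spread = summarize(values, TRUE_RISK)
    print(f"{label}: bias={bias:+.2f}, spread={spread:.2f}")
```

A small spread with a large bias corresponds to the first figure, a large spread with little bias to the second, and small values of both to the third.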
