[sci.med] question about blood chemistry tests

glenm@athena.UUCP (10/24/86)

I am wondering how the "normal" ranges for blood chemistry
tests are established.  I assume they use mean and standard
deviation values to set up a range that includes 95% of the
population.

Is this really the case?  If so, doesn't this approach cause
problems in some cases?  For example, a person whose thyroid
hormone levels were near the upper end of the "normal" range would 
probably have more energy than one with levels near the low
end of the "normal" range.

	Glen McCluskey
	..tektronix!athena!glenm

oliver@unc.UUCP (Bill Oliver) (10/27/86)

In article <688@athena.UUCP> glenm@athena.UUCP (Glen McCluskey) writes:
>
>I am wondering how the "normal" ranges for blood chemistry
>tests are established.  I assume they use mean and standard
>deviation values to set up a range that includes 95% of the
>population.
>
>Is this really the case?  If so, doesn't this approach cause
>problems in some cases?  For example, a person whose thyroid
>hormone levels were near the upper end of the "normal" range would 
>probably have more energy than one with levels near the low
>end of the "normal" range.
>
>	Glen McCluskey
>	..tektronix!athena!glenm

For the most part, physicians in general, and virtually all
pathologists, have stopped using the term "normal" for individual laboratory
results because both senses of the word cause problems: "normal" meaning
not indicative of a disease, and "normal" meaning having a Gaussian
distribution.  Instead, most folk refer to a test's "reference range".

In setting up a reference range, the investigator must first find an
appropriate group of individuals representing the target population.  Since
most applications call for healthy individuals, you want to
use a healthy population -- students, technicians, etc. who have been
evaluated as to health and appropriate demographics.
If this isn't possible (or desirable), you choose
a population that is known to be free of the disease state that the
given test is used to evaluate -- thus you would look for absence of
liver disease in the population used to set up the reference range
for AST (a liver enzyme).  Some institutions use all admission lab data
for new patients regardless of disease state.

Once you have the test data, you look at the distribution.  Some tests, such
as serum sodium, potassium, bicarb, and chloride do have a Gaussian 
distribution at most institutions, and the reference range is set at
the mean +/- 2 standard deviations.  Most tests, however, are skewed, and it is
necessary to use a non-parametric evaluation.  
For instance, glucose, AST, and LD are skewed high and total protein,
albumin, and packed cell volume are skewed low.  Most of the time, a
percentile ranking method is used, and the central 95% of the population
makes up the reference range.  In some instances, when subpopulations
have known differences in distributions, they are broken out and listed
separately.  Thus, for some tests there are pediatric versus adult
reference ranges and for others there are separate reference ranges
based on sex, race, chemotherapy, or whatever.  Each individual 
institution thus puts out different reference ranges based on the
population the range is based on and how the local pathologists
and clinicians have decided they want the reference range displayed
(in many instances it is not necessary or desirable to actually
set out a subpopulation based reference range if the clinicians
are comfortable with their knowledge that a given population 
tends to run a little high or low).
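
As a rough sketch of the two approaches just described, here is how the
mean +/- 2 SD range and the central 95% percentile range could be computed
in Python.  The function names and the sample values are made up for
illustration; they are not real lab data and not any institution's procedure.

```python
import statistics

def parametric_range(values):
    """Mean +/- 2 SD; only sensible when the data look Gaussian."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return mean - 2 * sd, mean + 2 * sd

def percentile_range(values):
    """Central 95% of the data, with no assumption about the distribution."""
    # n=40 gives cut points every 2.5%: the first is the 2.5th percentile,
    # the last is the 97.5th percentile.
    cuts = statistics.quantiles(values, n=40, method="inclusive")
    return cuts[0], cuts[-1]

# Hypothetical, skewed-high results for 100 reference individuals.
sample = [10] * 50 + [12] * 30 + [15] * 15 + [40] * 4 + [200]

print("mean +/- 2 SD:", parametric_range(sample))
print("central 95%:  ", percentile_range(sample))
```

On this skewed sample the mean +/- 2 SD lower limit comes out below zero,
which is impossible for a concentration, while the percentile limits stay
within the values actually observed.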

In addition, each analytic method has a different reference range.  Thus,
an AST run at a hospital which uses an Ektachem analyzer will have a
different reference range than one which uses, say, an RA 1000 centrifugal
method.  Not only will the reference ranges be different, but virtually
all statistical parameters will be different.  This can sometimes cause
problems in small hospitals which use one machine during the day when
high volume batch processing of samples is possible, and then
switch to a different machine at night to cut down on overhead.  It
is sometimes impossible to compare values run on the different machines
with each other.

Deciding whether or not a single set of laboratory values indicates
a disease state thus involves a bit more than simply looking to
see if a value is in a given reference range.  If we use a test
which has, for instance, a true Gaussian distribution, we then are
accepting that about 5% of people without evidence of disease will be out
of the reference range.  In addition, since people who do have disease
will have their own distribution, and this distribution will overlap with
the reference range, some of the patients with the disease will fall into
the reference range.
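
To put rough numbers on this, here is a small sketch using Python's
statistics.NormalDist.  The healthy population is in standardized units
(mean 0, SD 1).  The diseased population, centered 3 SD higher with the same
spread, is purely an assumption chosen for illustration.

```python
from statistics import NormalDist

# Healthy reference population in standardized units.
healthy = NormalDist(mu=0.0, sigma=1.0)
lo, hi = -2.0, 2.0  # reference range: mean +/- 2 SD

# Fraction of healthy people who land outside the reference range.
healthy_outside = 1.0 - (healthy.cdf(hi) - healthy.cdf(lo))

# Hypothetical diseased population, shifted 3 SD upward (an assumption).
diseased = NormalDist(mu=3.0, sigma=1.0)

# Fraction of diseased people who still land inside the reference range.
diseased_inside = diseased.cdf(hi) - diseased.cdf(lo)

print(f"healthy but out of range: {healthy_outside:.1%}")
print(f"diseased but in range:    {diseased_inside:.1%}")
```

With exactly +/- 2 SD the healthy out-of-range fraction is a little under 5%;
the overlap figure depends entirely on how far apart the two distributions
are assumed to be.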

Accordingly, tests are designed with particular sensitivities and specificities.
Screening tests will have a rather narrow reference range, so that they will
pick up most or all people with the disease (a high sensitivity), but will
also call a bunch of folk without the disease out of range (a low specificity).
Those who are picked up on the screening test will then be further evaluated
with a more specific (usually more expensive) method.   Again, each test
method will have its own sets of values for false positives (people without
disease that are out of reference range), and false negatives (people
with disease who are in reference range), and thus each test method will
have its own sensitivity, specificity, and positive predictive value
(the probability that an out of range test result will be associated with
a given clinical disease state) or negative predictive value (the
probability that an in range result will be associated with not having the
disease).
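
These quantities all fall out of the four counts in a two-by-two table.  A
minimal sketch follows; the function name and the counts are hypothetical,
picked to resemble a screening test in a population where 5% have the disease.

```python
def predictive_values(tp, fp, tn, fn):
    """Compute test performance from two-by-two table counts.

    tp: diseased, out of range      fn: diseased, in range
    fp: not diseased, out of range  tn: not diseased, in range
    """
    return {
        "sensitivity": tp / (tp + fn),  # diseased who are flagged
        "specificity": tn / (tn + fp),  # non-diseased who are not flagged
        "ppv": tp / (tp + fp),          # out of range and truly diseased
        "npv": tn / (tn + fn),          # in range and truly not diseased
    }

# Made-up screening example: 1000 people, 50 with the disease.
result = predictive_values(tp=48, fp=95, tn=855, fn=2)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the test catches nearly everyone with the disease,
yet only about a third of the out of range results belong to people who
actually have it, which is why the follow-up with a more specific method
matters.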

Bill Oliver, MD