[bit.listserv.sas-l] Zip codes --> counties

TCD@CORNELLA.BITNET (Tim Dorcey) (02/06/90)

     I have a client who has a bunch of data arranged according to
zip codes.  He would like to aggregate the data for a given county (in the
state of New York) and plot the results using the map data sets provided with
SAS.  Does anyone know how to determine what county a zipcode region falls in?
Would it be something like  x1 < zip code < x2   ==> a certain county or is a
look-up table required?  Thanks,

Tim Dorcey                        BITNET:   TCD@CORNELLA
Statistical Software Consultant   Internet: TCD@CORNELLA.CIT.CORNELL.EDU
Cornell Information Technologies
Cornell University
Ithaca, NY  14853

PRAHL@MACC.WISC.EDU (Walter Prahl, MACC) (02/07/90)

>     I have a client who has a bunch of data arranged according to
>zip codes.  He would like to aggregate the data for a given county (in the
>state of New York) and plot the results using the map data sets provided with
>SAS.  Does anyone know how to determine what county a zipcode region falls in?
>Would it be something like  x1 < zip code < x2   ==> a certain county or is a
>look-up table required?  Thanks,

The SAS Institute sells a tape (identified as D103) that
cross-references ZIP codes and counties.  It is one of the tapes in
the Institute's "Data Library Series", which is briefly discussed on
page 449 of the Version 5 SAS/GRAPH manual.  I used this tape several
years ago to do something like what the client wants to do here.  I
don't have any of the material in front of me, but I remember finding
(to my disappointment) that this is a much more difficult problem than
one would like it to be.

For starters, realize that the cross-reference from ZIP code to county
is not unique: some ZIP codes cross county lines.  In fact, believe it
or not, even the cross-reference from ZIP code to STATE is not unique!
If I remember correctly, the Institute's tape handles this by
providing a "primary" county for each ZIP code, and then a "secondary"
county.  The problem is that, when a secondary county exists, a typical
data set offers no easy way to determine whether a given observation
comes from the primary or the secondary county (even though the primary
county is, presumably, the more likely).
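
To make the look-up-table approach concrete, here is a minimal sketch in
Python (not SAS).  Everything in it is an assumption for illustration:
the table layout, the placeholder ZIP codes and county names, and the
function name are invented and do not reflect the actual contents or
format of the D103 tape.  It totals each observation under its primary
county and separately tracks how much of each total came from ZIP codes
that also have a secondary county.

```python
from collections import defaultdict

# Hypothetical cross-reference: ZIP code -> (primary county, secondary county).
# A secondary county of None means the ZIP code lies in a single county.
# These entries are invented placeholders, not real assignments.
ZIP_TO_COUNTY = {
    "00001": ("County A", None),
    "00002": ("County A", "County B"),
    "00003": ("County B", None),
}


def aggregate_by_primary_county(records, xref):
    """Sum values per primary county.

    Returns (totals, ambiguous), where ambiguous holds the part of each
    county's total that came from ZIP codes with a secondary county.
    """
    totals = defaultdict(float)
    ambiguous = defaultdict(float)
    for zip_code, value in records:
        primary, secondary = xref[zip_code]
        totals[primary] += value
        if secondary is not None:
            ambiguous[primary] += value
    return dict(totals), dict(ambiguous)


# Made-up observations: (ZIP code, value to aggregate).
records = [("00001", 10.0), ("00002", 5.0), ("00003", 2.5), ("00002", 1.0)]
totals, ambiguous = aggregate_by_primary_county(records, ZIP_TO_COUNTY)
print(totals)
print(ambiguous)
```

Reporting the ambiguous portion alongside each total does not resolve
the primary-versus-secondary question, but it does show how much of a
county's figure depends on the "primary county" assumption.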

As I remember, there were also various other difficulties.  For
example, the data on the Institute's tape was several years old, and
ZIP codes are surprisingly volatile (the post office reassigns ZIP
codes sometimes).  Before ordering the tape, you might want to explain
these problems to the client.
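
One way to gauge how badly an out-of-date table would hurt is to check
which ZIP codes in the client's data are missing from it before doing
any aggregation.  The following is a rough Python sketch under the same
caveats as before: the ZIP codes are placeholders and the function name
is hypothetical.

```python
def split_by_coverage(zip_codes, known_zips):
    """Split the distinct ZIP codes in the data into those found in the
    cross-reference table and those missing from it."""
    distinct = set(zip_codes)
    matched = sorted(z for z in distinct if z in known_zips)
    unmatched = sorted(z for z in distinct if z not in known_zips)
    return matched, unmatched


# Placeholder values: ZIP codes in the table vs. ZIP codes seen in the data.
known = {"00001", "00002", "00003"}
data_zips = ["00001", "00004", "00002", "00004", "00009"]

matched, unmatched = split_by_coverage(data_zips, known)
coverage = len(matched) / (len(matched) + len(unmatched))
print(unmatched)
print(coverage)
```

A long unmatched list would suggest that the table predates many of the
ZIP codes in the data.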
----------------------------------
Walter Prahl (608) 262-0284
University of Wisconsin -- MACC
Internet:  prahl@vms.macc.wisc.edu
Bitnet:    prahl@wiscmacc.bitnet