[sci.astro] v16i090: Smithsonian Astronomical Observatory, Part01/49

rsalz@uunet.uu.net (Rich Salz) (12/07/88)
Submitted-by: Alan Wm Paeth <awpaeth@watcgl.waterloo.edu>
Posting-number: Volume 16, Issue 90
Archive-name: sao/part01

# This is a shell archive.  Remove anything before this line,
# then unpack it by saving it in a file and typing "sh file".
# Contents:  NOTES-SAO README build compile decode decodeall encode encodeall
#	makeshar saoconv.c saoextract stardouble stardust.c starsplit.c
 
echo x - NOTES-SAO
sed 's/^@//' > "NOTES-SAO" <<'@//E*O*F NOTES-SAO//'
-------------------------------------------------------------------------------
                             REDUCED SAO

                            Alan Wm Paeth
                     Computer Graphic Laboratory
                        University of Waterloo
                        Ontario, Canada N2L 3G1

-------------------------------------------------------------------------------


The following describes the SAO (Smithsonian Astronomical Observatory) dataset,
reduced to RA/DECL/MAG records and posted in ASCII format in 48 parts.

The SAO dataset was furnished c/o Nasa Goddard for non-commercial use. It is
probably the most comprehensive electronic stellar database having both full
sky coverage and general availability. Containing 258,997 records to magnitude
11.6, it is popularly considered to have a limiting magnitude (that point at
which omissions equals inclusions) of 9.5. The original 40 megabyte dataset
is distributed on tape. Its records contain detailed information on proper
motion, photometry and other information.

This reduced dataset provides about 1.4M of excerpted information, encoded to
minimized overall transmission costs. Decoding requires a UNIX software system
and C compiler. All data may be redistributed for non-commercial private use.
(This is encouraged to save unnecessary repeated postings from any one site).

The remainder of this document is in two sections. The first is a functional
description of how to decode the postings to form the canonical set of 48 files
which form the reduced SAO. The final section details the steps used to reduce
and distribute the dataset.


[1] UNPACKING INSTRUCTIONS

Summary

The SAO distribution is a suite 48 files, normally distributed on the net with
one file enclosure per posting (the largest file is intentionally just under
50K bytes). Each file is a free-standing entity describing the location, class
(single, double, variable) and magnitude of all stars within one spatial area,
sorted by increasing magnitude. The area represented is implicit in the file
name; no file extension is used. This arrangement by spatial region simplifies
both distribution and use of the dataset. A typical file from the distribution
suite is named "SAO22+60"

In a nutshell, decoding of any one file begins by stripping out unnecessary
message header information (if present) leaving a characteristic "uuencoded"
ASCII file of sixty-four character lines. Next, a specialized program (called
"stardust", but invoked implicitly during decoding) is compiled to assist the
decoding. A final script decodes the file. An example of the three steps is:

    #strip header from posting and save data as "SAO22+60.dzu"
    ./compile                         # make up the stardust program
    ./decode SAO22+60                 # decode the one file "SAO22+60"

Similarly, once all 48 source files have been retrieved from each posting and
saved as .dzu files, then the entire dataset may be created (with the program
compilation performed implicitly) be executing one shell command:

    ./decodeall                       # decodes entire dataset
    
For those more interested in the specifics of how coding takes place, the
operation of these files is described below.

Nuts and Bolts

Encoded SAO files stripped of message header information are stored with the
extension ".dzu".  This designation indicates that the data has been
"starDusted", "lempel-Zif compressed" and "Uuencoded", in that order. In nested
fashion, the reverse set of operations must then be performed to reconstruct
the original file. Each decoding step removes the final character of the file
extension (each character represents a successive coding step) until no file
extension remains, at which time the required SAO file will be rebuilt. Of
these three steps, two are common to Unix systems. The third step called
"stardust" is a C program written specifically to compress each SAO file. It
is distributed as part of the reduced SAO suite.

As an example, we wish to reconstruct the SAO file "SAO18-30" (the "teapot" of
Sagittarius) from a posting. We first strip off extraneous message detail and
form the file "SAO18-30.dzu". Assuming we have only the source for stardust.c,
we then perform:

    cc stardust.c -o stardust          # generate "stardust" tool
    uudecode SAO18-30.dzu              # map uuencoded ASCII to binary
    uncompress SAO18-30.dz             # uncompress binary to "dusted" ASCII
    ./stardust <SAO18-30.d >SAO18-30   # reintegrated stardusted file to SAO 

This entire operation (without the C compile) is present in the shell script
"decode". The parent "decodeall" runs this operation on all 48 files.
Similarly, the scripts "encode" and "encodeall" perform the reverse operation.

An additional software tool "starsplit.c" is also provided in the distribution
to split the entire dataset into its 48 constituent files (certain operations
are best performed on the entire data "en masse", but the datset is always
presented as 48 separate spatial entities for the sake of distribution and
convention). For additional details read the source code.


[2] CONVERSION DETAILS

Reduction

The SAO reduction is to a file format compatible with the "StarChart" software
suite published by Alan Paeth (awpaeth.waterloo.edu) on net.sources.unix (then
"net.sources") in 1987 and  now under active development by Craig Counterman
(ccount@royal.mit.edu). Because of the shear enormity of the dataset and the
cost of distribution, notable omissions in the reduction include SAO numbers
and spectral classes, leaving merely Ra/Decl/Mag and a star designator.

We do not consider the SAO reduction a replacement to the Yale Bright Star
Catalog reduction previously posted (the latter includes spectral classes plus
common names and designations), but a means to supplement Yale with additional
information for the express purpose of making detailed finder charts at a very
large scale. The SAO data is inappropriate for most wide-field or low power
charts and would provide needlessly cluttered output.

Conversion

The conversion moved stellar locations (through proper motion) from 1950.0 to
2000.0 positions. An epoch change to 2000.0 (owing to the precession of the
equinoxes) was then performed. Visual magnitudes were used, photographic
magnitudes were substituted when the former was absent. Missing records were
deleted (there are 230 SAO numbers no longer having a unique associated star).
These conversions all took place at the original precision of the dataset of
about 0.01", which was then rounded into standard Right Ascension and
Declination numeric values in the format used by the StarChart software suite.

The "raw" SAO data at the reduced resolution was then sorted by location and
magnitude and run through a simple awk (Unix record processor) script to merge
stars of identical location at the new positional resolution into unique stars.
A composite magnitude was computed by proper application of logarithms. Any
star designation of "single" was promoted to "double" and a new composite
magnitude computed. Multiple systems are similarly grouped and composited but
also show a "double" designation, which should be taken to mean "non-single".
Variable stars are much more rare and this designation is kept to represent
the aggregation on the whole should any one component be variable. This reduces
SAO by 560 records to a final value of 258,277 fixed length ASCII records,
which should not change.

File Splitting

The data (still sorted by magnitude) was then split into 48 files based on
spatial location. This simplifies both distribution and program use. Because a
detailed chart will rarely require more than a few of the forty eight files
simultaneously, this provides a simple means to limit the volume of data which
any program must process. The split boundaries lie on even hours of Right
Ascension (twelve splits) at the Celestial Equator and at +-30 degrees in
Declination. This conveniently allows the Ecliptic to lie within the inner set
of charts, with one chart representing roughly one constellation of the zodiac.
The equatorial charts are also (roughly) square, being 30 degrees on each side.
The polar charts span sixty degrees in Declination, but cover an identical
spatial area, as a consequence of the Equal Area Cylindrical projection (and
the fact that sin(30deg) = 0.5 exactly).

The 48 sorted, split files constitute the reduced SAO dataset, and they are
distributed as 48 separate postings. Sites are free to merge and sort the
datasets into single files, but in the interests of compatibility, the
"starsplit" C program is also distributed to allow spatial fragmentation back
into "canonical" pieces.

File Names and Format

File names are eight characters and of the form SAOhhsDD where "hh" is the Ra
designator taken from the set (00 02 04 06 08 10 12 14 16 18 20 22) and "sDD"
is the De designator taken from the set (-90 -30 +00 +30 +60). The designators
give the minimum values for Ra and De occurring in the file, bounded at the
high end by the next larger designator (with and implicit limit of 24 hours and
+90 degrees for Ra and De). For instance, SA002-60 contains stars with RA and
DE such that (02 hours <= RA < 04 hours) and (-60 deg <= DE < 0 deg).

Each fixed length record consists of sixteen ASCII characters (digits, +, -, S,
V, D are the only values which occur) followed by a carriage return. The format
is HHMMSS+DDMMmmmCC where "HH" is HHMMSS give hours, minutes and seconds in
Right Ascension, +DDMM give signed degrees and minutes in Declination, mmm give
magnitude*100 for positive values and magnitude*10 for negative and CC is one
of "SS", "SD" or "SV" signifying "star, single", "star, double [multiple] or
"star variable", respectively. This format cannot record magnitudes below 9.99.
However, because only 3719 records or 1.4% of the reduced SAO records are this
dim, they are retained at magnitude "9.99", while records of magnitude "10.0"
are recorded as "9.98" to provide unambiguity. The latter is an insignificant
change in magnitude, as the original SAO data records magnitude to the first
decimal point.

The four brightest stars in the reduced SAO are listed as a format example.

064510-1643-16SD
062358-5242-09SS
143937-6050010SD
183657+3847010SS

The dimmest stars in the original dataset were (here precessed to E2000.0):

    ra      decl   mag
 -------  ------  ----
 1h06m26s -31o01' 11.6
 6h52m42s -52o16' 11.6
 9h57m24s -42o26' 11.6 (now grouped with a star of mag 9.3)
11h39m55s -38o30' 11.6
21h41m10s -34o57' 11.6
 2h37m47s -41o54' 11.8
23h18m44s -34o56' 11.8
 7h55m39s -43o47' 11.9 (now grouped with a star of mag 7.5)

Being below the limiting magnitude of the dataset, these most often represent
faint companions to brighter stars.

Coding and Distribution

A three step coding process is used to reduce the size of each file prior
to distribution. The heart of this process is the second step: the Unix
"compress" program. A large increase in overall compression can be achieved by
preprocessing the data to compress into a form which is more regular, possibly
by coding patterns which "compress" has not taken advantage of.

The program which accomplishes this is called "stardust". Because the fixed
length records tend to have very similar characters in each card column
(especially considering that they lie in similar spatial areas and are sorted
by magnitude), we perform a differencing operation across adjacent scanlines,
in which a "0" indicates that no change has occurred and more generally, digits
will change modulo ten from previous digits (the exact details are implicit in
the source code). This regularity is normally missed by compress, because such
"runs" occur every record length of seventeen characters, not in adjacent
columns. The differencing operation makes for a much higher density of ASCII
zeros in the file, which compress happily encodes.

The final step is "uuencode" which remaps the 8-bit binary output of compress
into ASCII characters for subsequent transmission, thereby increasing the
overall file size by about 4/3. Following this step, the largest file is
still under 50K in size, a valuable consideration for many mail systems.
Unfortunately, a "full" version of compress is required for this compaction;
the "-b12" twelve-bit reduced coding table compress variant (the only version
available on some smaller systems) cannot achieve packing below 50K, despite
a compression difference of only a few percent.

Present and Future Plans

A modest extension to SAO would include the last three digits of each star's
SAO number and a spectral class indication. However, it is estimated that this
would increase the size of the entire dataset by 50% and will probably not be
done. Instead, astronomical sites may wish to obtain the original dataset and
reconvert as per their specific needs.

Present work involves identifying those stars within SAO which are already
present within the posted Yale data. By publishing a simple omissions list,
the file SAO' may then be formed. The superset of both Yale and SAO may then
be constructed simply by concatenating Yale with this SAO' and then sorting
(as appropriate). This would yield a master dataset with proper names, stellar
class and other detailed information for the first ~9000 stars, with the
SAO data providing detail for stars dimmer than about magnitude 6.5.

All software used to map from the tape dataset into the final distribution
files has been included or the sake of professional sites wishing to perform
custom, private reductions. Questions and comments should be directed to:
Alan Paeth (awpaeth@waterloo.edu, awpaeth@watcgl@waterloo.csnet).
@//E*O*F NOTES-SAO//
chmod u=rw,g=r,o=r NOTES-SAO
 
echo x - README
sed 's/^@//' > "README" <<'@//E*O*F README//'
Name          Description
----          -----------
NOTES-SAO     description of unpacking instructions and the conversion process
README        this file
SAO?????      canonical SAO files in dataset (48)
SAO?????.dzu  above in encoded format
build         convert raw tape data to final 48 (coded) files for distribution
compile       compile "stardust" program for decoding.
decode        decode one SAOxxxxx.dzu into an SAOxxxxx file
decodeall     decode all 48 SAO*.dzu files into the canonical dataset
encode        encode one SAOxxxxx file into a SAOxxxxx.dzu file
encodeall     encode all canonical SAO files into .dzu files
makeshar      file to shar the non-data portions of the SAO suite
saoconv.c     (used to convert the SAO tape into the raw dataset)
saoextract    runs saoconv; converts SAO tape into raw dataset
stardouble    awk script to remove duplicate spatial entries from raw sao 
stardust.c    source for stardust.c -- used to ready sao for (de)compression
starsplit.c   split raw sao into 48 constituent parts
@//E*O*F README//
chmod u=rw,g=r,o=r README
 
echo x - build
sed 's/^@//' > "build" <<'@//E*O*F build//'
#!/bin/csh -f
#
# This script converts the master (40M) tape dataset "sao" into 48 encoded
# canonical reduced SAO files for distribution. Site use requires access to
# the original SAO data as distributed by NASA.
#
# compile specialized tools as needed
#
cc saoconv.c -lm -o saoconv
cc starsplit.c -o starsplit
#
# run it
#
@./saoextract sao
@./starsplit  <sao.star
@./encodeall
@//E*O*F build//
chmod u=rwx,g=rx,o=rx build
 
echo x - compile
sed 's/^@//' > "compile" <<'@//E*O*F compile//'
#!/bin/csh -f
#
# compile custom program necessary for decoding (and encoding).
cc stardust.c -o stardust
@//E*O*F compile//
chmod u=rwx,g=rx,o=rx compile
 
echo x - decode
sed 's/^@//' > "decode" <<'@//E*O*F decode//'
#!/bin/csh -f
#
# decode - convert one SAO encoded file (with implicit .dzu extension) into
#          unencoded output. Note: no encoded files are deleted by this script.
#
# example "./decode SAO12+30"
#
uudecode $1.dzu
chmod +rw $1.dz
uncompress -c <$1.dz >$1.d
@./stardust <$1.d >$1
rm $1.dz $1.d
#rm $1.dzu
@//E*O*F decode//
chmod u=rwx,g=rx,o=rx decode
 
echo x - decodeall
sed 's/^@//' > "decodeall" <<'@//E*O*F decodeall//'
#!/bin/csh -f
#
# Run "decode" on all .dzu files (48) to form canonical dataset
#
@./compile
#
@./decode SAO00+00
@./decode SAO00+30
@./decode SAO00-30
@./decode SAO00-90
@./decode SAO02+00
@./decode SAO02+30
@./decode SAO02-30
@./decode SAO02-90
@./decode SAO04+00
@./decode SAO04+30
@./decode SAO04-30
@./decode SAO04-90
@./decode SAO06+00
@./decode SAO06+30
@./decode SAO06-30
@./decode SAO06-90
@./decode SAO08+00
@./decode SAO08+30
@./decode SAO08-30
@./decode SAO08-90
@./decode SAO10+00
@./decode SAO10+30
@./decode SAO10-30
@./decode SAO10-90
@./decode SAO12+00
@./decode SAO12+30
@./decode SAO12-30
@./decode SAO12-90
@./decode SAO14+00
@./decode SAO14+30
@./decode SAO14-30
@./decode SAO14-90
@./decode SAO16+00
@./decode SAO16+30
@./decode SAO16-30
@./decode SAO16-90
@./decode SAO18+00
@./decode SAO18+30
@./decode SAO18-30
@./decode SAO18-90
@./decode SAO20+00
@./decode SAO20+30
@./decode SAO20-30
@./decode SAO20-90
@./decode SAO22+00
@./decode SAO22+30
@./decode SAO22-30
@./decode SAO22-90
@//E*O*F decodeall//
chmod u=rwx,g=rx,o=rx decodeall
 
echo x - encode
sed 's/^@//' > "encode" <<'@//E*O*F encode//'
#!/bin/csh -f
#
# encode - convert one good SAO file into an encoded .dzu file.
#
# example: "./encode SAO12+30"
#
#./stardust <$1 -e | compress -c -b12 | uuencode $1.dz >$1.dzu # we wish!
@./stardust  <$1 -e | compress -c      | uuencode $1.dz >$1.dzu
@//E*O*F encode//
chmod u=rwx,g=rx,o=rx encode
 
echo x - encodeall
sed 's/^@//' > "encodeall" <<'@//E*O*F encodeall//'
#!/bin/csh -f
#
# encode all 48 canonical SAO files to form encoded .dzu set.
#
@./compile
#
@./encode SAO00+00
@./encode SAO00+30
@./encode SAO00-30
@./encode SAO00-90
@./encode SAO02+00
@./encode SAO02+30
@./encode SAO02-30
@./encode SAO02-90
@./encode SAO04+00
@./encode SAO04+30
@./encode SAO04-30
@./encode SAO04-90
@./encode SAO06+00
@./encode SAO06+30
@./encode SAO06-30
@./encode SAO06-90
@./encode SAO08+00
@./encode SAO08+30
@./encode SAO08-30
@./encode SAO08-90
@./encode SAO10+00
@./encode SAO10+30
@./encode SAO10-30
@./encode SAO10-90
@./encode SAO12+00
@./encode SAO12+30
@./encode SAO12-30
@./encode SAO12-90
@./encode SAO14+00
@./encode SAO14+30
@./encode SAO14-30
@./encode SAO14-90
@./encode SAO16+00
@./encode SAO16+30
@./encode SAO16-30
@./encode SAO16-90
@./encode SAO18+00
@./encode SAO18+30
@./encode SAO18-30
@./encode SAO18-90
@./encode SAO20+00
@./encode SAO20+30
@./encode SAO20-30
@./encode SAO20-90
@./encode SAO22+00
@./encode SAO22+30
@./encode SAO22-30
@./encode SAO22-90
@//E*O*F encodeall//
chmod u=rwx,g=rx,o=rx encodeall
 
echo x - makeshar
sed 's/^@//' > "makeshar" <<'@//E*O*F makeshar//'
#!/bin/csh -f
public shar -c NOTES-SAO README build compile decode decodeall encode encodeall makeshar saoconv.c saoextract stardouble stardust.c starsplit.c >SaoMaster.shar
@//E*O*F makeshar//
chmod u=rwx,g=rwx,o=rwx makeshar
 
echo x - saoconv.c
sed 's/^@//' > "saoconv.c" <<'@//E*O*F saoconv.c//'
/*
 * saoconv.c -- convert sao data into starchart format
 *
 * copyright (c) 1988 by Alan Wm Paeth. All rights reserved.
 *
 * SAO input is taken as the raw 150 character ASCII records as appearing
 * in the tape distribution. The output is in the form of StarChart style
 * records. This version includes facilities to clip to a limiting magnitude
 * of 10, and to record the number of dim (or missing) stars. Four command
 * line parameters may be provided to give limiting RA and DECL values, but
 * for current purposes the mapping into spatial regions is done later.
 * 
 */

#include <stdio.h>
#include <math.h>

/*
 * the following give the input and output proper motion locations and epochs
 */

#define EPIN   1950.0
#define EPOUT  2000.0
#define POSIN  1950.0
#define POSOUT 2000.0

#define STDIN 0
#define STDOUT 1
#define ISIZE 150
#define OSIZE   5

#define LR 0.0
#define HR 360.0
#define LD -90.0
#define HD 90.0

#define DLDEGSEC 3600.0
#define DLMINSEC 60.0
#define RAHRSSEC 54000.0
#define RAMINSEC 900.0
#define RASECSEC 15.0
#define RADTODEG (180.0/3.14159265358979324)
#define DEGTORAD (3.14159265358979324/180.0)

#define R90 (90.0*DEGTORAD)
#define R360 (360.0*DEGTORAD)

#define DSIN(x) sin((x)*DEGTORAD)
#define DCOS(x) cos((x)*DEGTORAD)
#define DTAN(x) tan((x)*DEGTORAD)

#define F(col1, col2) (in[col2] = '\0', atof(&in[col1-1]))
#define I(col1, col2) (in[col2] = '\0', atoi(&in[col1-1]))

double atof();
char in[ISIZE+1], out[OSIZE+2];
int nomag, dimmag;

main(argc, argv)
    char *argv[];
    {
    int rval;
    double ra, dl, ra2, dl2, mg;
    char tp;
    double lr, hr, ld, hd;
/*
 * arg checking
 */
    if (argc == 1)
	{
	lr = LR;
	hr = HR;
	ld = LD;
	hd = HD;
	}
    else if (argc == 5)
	{
	lr = atof(argv[1]);
	hr = atof(argv[2]);
	ld = atof(argv[3]);
	hd = atof(argv[4]);
	}
    else err("use: saoconv {ralow rahi declow dechi {flt degs}}\n");
    if (lr < LR) err("ra too low");
    if (hr > HR) err("ra too high");
    if (ld < LD) err("dcl too low");
    if (hd > HD) err("decl too high");
    if (lr >= hr) err("ra overlap");
    if (ld >= hd) err("decl overlap");
    lr *= DEGTORAD;
    hr *= DEGTORAD;
    ld *= DEGTORAD;
    hd *= DEGTORAD;
/*
 * the stuff
 */
     while (rval = saoread(&ra, &dl, &mg, &tp, POSIN, POSOUT))
	{
	if ((rval > 0) && (ra >= lr) && (ra < hr) && (dl >= ld) && (dl < hd) )
	    {
	    xform(ra, dl, &ra2, &dl2, EPIN, EPOUT);
	    saowrite(ra2, dl2, mg, tp);
	    }
	}
    fprintf(stderr, "nomag: %d, dimmag: %d\n", nomag,dimmag);
    exit(0);
    }

saoread(r, d, m, t, ein, eout)
    double *r, *d, *m, ein, eout;
    char *t;
    {
    int len;
    double dec, ras, mag, vis, pho;
    double mua, mud;
    int hdc, dup;
    char tp;
    len = read(STDIN, in, ISIZE);
    if (len == 0) return(0);
    if (len < 0)
	{
	fprintf(stderr, "read error; code = %d\n", len);
	fflush(stderr);
	return(0);
	}
    if (len != ISIZE)
	{
	fprintf(stderr, "runt record: %d-of-%d bytes \n", len, ISIZE);
	fflush(stderr);
	return(0);
	}
    if (in[6] == 'D') return(-1);
/*
 * warning: reads must be from successively lower numbered columns.
 */
    dec = F(140,150);
    ras = F(130,139);
    hdc = I(124, 124);
    dup = I( 95, 95);
    vis = F( 81, 84);
    pho = F( 77, 80);
    mud = F( 52, 57);
    mua = F( 18, 24);
/*
 * get brighter of visual/photographic magnitudes
 */
    mag = (vis < 20.0) ? vis : pho;
    if (mag > 12.0)
	{
	nomag++;
	return(-1);
	}
/*
 * CLIP DIM STARS TO MAG 9.99, (stars of mag 10.0 become 9.98 to disambiguate)
 */
    else if (mag > 10.0)
	{
	dimmag++;
	mag = 9.99;
	}
    else if (mag == 10.0) mag = 9.98;
    tp = 'S';
    if (dup > 4) tp = 'V';
    else if ((dup > 0) || hdc) tp = 'D';
    *r = ras + ( (eout-ein)*mua*RASECSEC/DLDEGSEC*DEGTORAD);
    *d = dec + ( (eout-ein)*mud         /DLDEGSEC*DEGTORAD);
    if (*r > R360) *r -= R360;
    if (*r < 0) *r += R360;
    if ((*d > R90) || (*d < -R90)) err("proper motion beyond +-90 decl");
    *m = mag;
    *t = tp;
    return(1);
    }

saowrite(r, d, m, t)
    double r, d, m;
    char t;
    {
    int rah, ram, ras, dld, dlm, sign, mag;
/*
 * convert ra to base 60
 */
    r *= RADTODEG;
    d *= RADTODEG;
    r *= DLDEGSEC;
    rah = r/RAHRSSEC;
    r -= rah*RAHRSSEC;
    ram = r/RAMINSEC;
    r -= ram*RAMINSEC;
    ras = r/RASECSEC + 0.5;
/*
 * round and propagate ra in seconds and minutes
 */
    if (ras >= 60)
	{
	ras -= 60;
	ram++;
	}
    if (ram >= 60)
	{
	ram -= 60;
	rah++;
	}
/*
 * convert de to base 60
 */
    sign = (d < 0.0);
    if (sign) d = -d;
    dld = d;
    d -= dld;
    dlm = d * DLMINSEC + 0.5;
/*
 * round and propage de in minutes
 */
    if (dlm >= 60)
	{
	dlm -= 60;
	dld++;
	}
/*
 * error check new output values (propagation/range errors)
 */
    if (rah >= 24)
    fprintf(stderr, "int. ra error %02d%02d%02d%s%02d%02d%03dS%c\n",
	rah, ram, ras, sign ? "-":"+", dld, dlm, mag, t);
    if (dld >= 90)
    fprintf(stderr, "int. decl. error %02d%02d%02d%s%02d%02d%03dS%c\n",
	rah, ram, ras, sign ? "-":"+", dld, dlm, mag, t);
/*
 * compute rounded magnitudes, print in three digits
 */
    mag = (m < 0.0) ? m*10 : m*100;
    if (mag > 999)
	{
	fprintf(stdout, "%02d%02d%02d%s%02d%02d%04dS%c\n",
	rah, ram, ras, sign ? "-":"+", dld, dlm, mag, t);
	fprintf(stderr, "mag too low\n");
	}
    else
    fprintf(stdout, "%02d%02d%02d%s%02d%02d%03dS%c\n",
	rah, ram, ras, sign ? "-":"+", dld, dlm, mag, t);
    }

xform(rin, din, rout, dout, ein, eout)
    double rin, din, *rout, *dout, ein, eout;
    {
    double t, t2, x, y, z, w, d;
/*
 * update
 */
    rin *= RADTODEG;
    din *= RADTODEG;
/* */
    t2 = ( (ein+eout)/2.0 - 1900.0 ) / 100.0;
    x = 3.07234 + (.00186 * t2);
    y = 20.0468 - (.0085 * t2);
    z = y/15;
    t = eout-ein;
    w = .0042 * t * (x + (z * DSIN(rin) * DTAN(din)) );
    d = .00028 * t * y * DCOS(rin);
    *rout = rin + w;
    if (*rout >= 360.0) *rout -= 360.0;
    if (*rout < 0.0) *rout += 360.0;
    *dout = din + d;
/*
 * update
 */
    *rout /= RADTODEG;
    *dout /= RADTODEG;
    }

err(s)
    char *s;
    {
    fprintf(stderr, "%s\n", s);
    fflush(stderr);
    exit(1);
    }
@//E*O*F saoconv.c//
chmod u=rw,g=r,o=r saoconv.c
 
echo x - saoextract
sed 's/^@//' > "saoextract" <<'@//E*O*F saoextract//'
#!/bin/csh -f
#
# This script runs "saoconv" with appropriate sorting and merging to convert
# the 40M SAO tape dataset (the standard input) into the master "StarChart"
# format SAO dataset "sao.star". Errors and summary detail logged in "saoerr".
#
# We don't use the "-n" sort flag because this buys about 1.2% in overall
# compression, which is valuable in reducing the largest dataset from a
# critical size (in terms of transport by e-mail) of 49.5kbytes to 48.9kbytes.
# The penalty incurred is in reversing the sorted order of the two brightest
# stars (Sirius and Alpha Centaurii alone possess negative magnitudes) which
# is unimportant. The favorable alphabetic sorting on ra/decl fields for
# records having like magnitudes (runs) produces more compressable output.
#
(./saoconv <$1 |\
   sort +0.0 |\
   ./stardouble - |\
   sort +0.11 -0.16 +0.0 -0.10 >sao.star ) >& saoerr
@//E*O*F saoextract//
chmod u=rwx,g=rx,o=rx saoextract
 
echo x - stardouble
sed 's/^@//' > "stardouble" <<'@//E*O*F stardouble//'
#!/bin/awk -f
  {
  id =  substr($0, 1,11);
  mag = substr($0,12, 3);
  des = substr($0,15, 2);
  tail= substr($0,17,99);
  if( id == oid )
      {
      mag = log(exp(-0.00921034*mag) + exp(-0.00921034*omag)) * (-100/.921034); 
      if (des != "SV") des = "SD";
      if (length(otail) > length(tail)) tail = otail;
      }
  else if( oid != 0 ) printf "%s%03d%s%s\n", oid, omag, odes, otail
  oid = id;
  omag = mag;
  odes = des;
  otail= tail;
  }
END { printf "%s%03d%s%s\n", oid, omag, odes, otail }
@//E*O*F stardouble//
chmod u=rwx,g=rx,o=rx stardouble
 
echo x - stardust.c
sed 's/^@//' > "stardust.c" <<'@//E*O*F stardust.c//'
/*
 * stardust.c -- (un)pulverize files into dust more digestable by compress
 *
 * stardust -e <file.star | compress >small # encode for better compression
 * uncompress small | stardust >file.star   # decode after decompress
 *
 * copyright (c) 1988 by Alan Paeth (awpaeth@watcgl)
 */

#include <stdio.h>

#define BS 17
#define STDIN 0
#define STDOUT 1
#define CD 15	/* buf offset to (S,V,D) star code; (0,1,2) if encoded */

int ln, en;

main(argc, argv)
    char *argv[];
    {
    char b1[BS], b2[BS], b3[BS], *cur, *lst, *t;
    if (argc > 2) err("too many cmd parms");
    if (argc == 2)
	{
	if (argv[1][0] != '-') err("switch expected");
	if (argv[1][1] != 'e') err("only '-e' {encode} allowed");
	en = 1;
	}
    cur = b3;
    lst = b2;
    if (!rline(lst)) err("input file is empty");
    wline(lst);
    while (rline(cur))
	{
	cline(b1, (!en && (ln>2)) ? b1 : lst, cur);
	wline(b1);
	t = cur; cur = lst; lst = t;
	}
    }

cline(bo, ba, bb)
    char *bo, *ba, *bb;
    {
    int i;
    for (i=0; i<BS; i++)
	{
	char c, t;
	c = *ba++ - (t=*bb++);
	if (t != '0')
	    {
	    if (c < 0) c += 10;
	    if (en && ((c < 0) || (c > 9))) err("non-digit or non-match");
	    }
	*bo++ = c + '0';
	}
    }

rline(b)
    char *b;
    {
    int l, cd;
    char *c;
    l = read(STDIN, b, BS);
    if (l == 0) return(0);
    ln++;
    if (l != BS) err( (l<0) ? "read error" : "runt record");
    c = &b[CD];			/* map star code to digits */
    cd = (*c >= '0') && (*c <= '9');
    if (!cd && !en) err("!you meant to encode!");
    if ( cd &&  en) err("!you meant to decode!");
    if (en)
	{
	if (*c == 'S') *c = '0'; else if (*c == 'D') *c = '1'; else *c = '2';
	}
    return(1);
    }
	
wline(b)
    char *b;
    {
    char t, *c = &b[CD];
    t = *c;
    if (!en)
	{
	if (*c == '0') *c = 'S'; else if (*c == '1') *c = 'D'; else *c = 'V';
	}
    write(STDOUT, b, BS);
    *c = t;
    }

err(s)
    char *s;
    {
    fprintf(stderr, "stardust error (line %d) -- %s\n", ln, s);
    exit(1);
    }
@//E*O*F stardust.c//
chmod u=rw,g=r,o=r stardust.c
 
echo x - starsplit.c
sed 's/^@//' > "starsplit.c" <<'@//E*O*F starsplit.c//'
/*
 * starsplit.c -- spatially scatter Starchart data into output files.
 *
 * copyright (c) 1988 by Alan Wm Paeth. All rights reserved.
 * 
 * This program operates on StarChart style datasets to "multiplex" data into
 * a number of output files. Code changes and multiple data runs will be
 * required on Unix systems which do not support 48 simultaneous files open
 * for output. The split points in the arrays rabreak and debreak are the
 * canonical splits for the reduced SAO distribution and should not be changed. 
 * When changed for non/custom SAO applications, read on.
 * 
 * The compile time arrays rabreak and debreak define N+1 and M+1 boundaries
 * which enclose the NxM area grid for output, written to files with names
 * formed from the "HEAD" constant and the smaller ra/de boundaries. Data not
 * fitting into any cell (as when ra or de don't span 0..24 or -90..90) are
 * written to stderr.
 *
 * NOTE: a "0" should appear in "debreak" to represent the celestial equator,
 * to deal with a code contrivance which assures that negative decl values
 * round in the right direction (the zero assures symmetry in the split).
 * This code exists to deal with the imprecision in decl (minutes are not
 * used to in splitting) which may move a boundary by 1 degree less epsilon.
 * Fortunately, the code always writes the input record to exactly one output
 * file, so that no data is lost
 */

int rabreak[] = { 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24 };
int debreak[] = { -90, -30, 0, 30, 90 };

#include <stdio.h>
#include <strings.h>	/* for strlen, can probably be omitted */

#define HEAD "SAO"	/* beginning of file name */

#define ARLEN(a) (sizeof(a)/sizeof(*a))
#define RA (ARLEN(rabreak)-1)
#define DE (ARLEN(debreak)-1)
#define FILES (RA*DE)

#define RAPOS 0
#define DEPOS 7
#define DESGN 6

main()
    {
    FILE *f[FILES];
    char buf[200], n[20];
    int r, d, i, ra, de, deneg;
    int monot, zerof, badra, cruft;
    monot = zerof = badra = 0;
    for (r = 0; r<RA; r++)
	{
	monot |= (rabreak[r] >= rabreak[r+1]);
	badra |= (ra < 0);
	for (d=0; d<DE; d++)
	    {
	    zerof |= (debreak[d] == 0);
	    monot |= (debreak[d] >= debreak[d+1]);
	    sprintf(n, "%s%02d%+03d", HEAD, rabreak[r], debreak[d]);
	    f[DE*r+d] = fopen(n, "w");
	    }
	}
    if (monot || !zerof || badra)
	{
	if (badra) fprintf(stderr, "negative ra value found\n");
	if (monot) fprintf(stderr, "coded ra/de tables non-monotonic\n");
	if (zerof) fprintf(stderr, "no zero (celstial equ.) in decl table\n");
	exit(1);
	}
    while (1)
	{
	fgets(buf, sizeof(buf), stdin);
        if (feof(stdin)) break;
	ra = (buf[RAPOS]-'0')*10 + buf[RAPOS+1]-'0';
	de = (buf[DEPOS]-'0')*10 + buf[DEPOS+1]-'0';
	deneg =(buf[DESGN] != '+');	/* note: -0 (+eps) decl. case */
	if (deneg) de = -de;
	for (r=0; r<RA; r++)
	    {
	    if ((rabreak[r] <= ra) && (ra < rabreak[r+1])) break;
	    }
	cruft = (ra >= rabreak[RA+1]);
	for (d=0; d<DE; d++)
	    {
	    if ((!deneg) && (debreak[d] <= de) && (de <  debreak[d+1])) break;
	    if (( deneg) && (debreak[d] <  de) && (de <= debreak[d+1])) break;
	    }
	cruft |= (de >= debreak[DE+1]);
	i = DE*r + d;
	if ((i<0) || (i > FILES))
	    {
	    fprintf(stderr, "bad file index. bogus input line?:\n%s\n", buf);
	    exit(1);
	    }
	fwrite(buf, 1, strlen(buf), cruft ? f[i] : stderr);
	}
    for (i=0; i<FILES; i++) fclose(f[i]);
    exit(0);
    }
@//E*O*F starsplit.c//
chmod u=rw,g=r,o=r starsplit.c
 
echo Inspecting for damage in transit...
temp=/tmp/shar$$; dtemp=/tmp/.shar$$
trap "rm -f $temp $dtemp; exit" 0 1 2 3 15
cat > $temp <<\!!!
     254    2084   13675 NOTES-SAO
      18     143    1033 README
      16      64     378 build
       4      16     106 compile
      13      51     311 decode
      54     114     958 decodeall
       8      46     252 encode
      54     114     956 encodeall
       2      20     174 makeshar
     278    1020    6104 saoconv.c
      18     146     887 saoextract
      19      78     507 stardouble
      97     359    1994 stardust.c
     100     550    3384 starsplit.c
     935    4805   30719 total
!!!
wc  NOTES-SAO README build compile decode decodeall encode encodeall makeshar saoconv.c saoextract stardouble stardust.c starsplit.c | sed 's=[^ ]*/==' | diff -b $temp - >$dtemp
if [ -s $dtemp ]
then echo "Ouch [diff of wc output]:" ; cat $dtemp
else echo "No problems found."
fi
exit 0

-- 
Please send comp.sources.unix-related mail to rsalz@uunet.uu.net.