[bionet.sci-resources] New Format for NIH Guide

kristoff@GENBANK.BIO.NET (Dave Kristofferson) (02/23/91)
There will be a format change for the NIH Guide postings next month to
make it more amenable to automatic processing and distribution.  Here
are the details as obtained from Bill Jones at NIH.  Note that the
majority of the text will still be easily readable by humans so most
of you can probably ignore this message unless you are interested in
processing this information further.

Dave Kristofferson

                ---------------

      Description of Delimiters Used for the NIH E-Guide

All delimited documents will have a maximum line length of 72
characters.  The delimiters used will begin with $$ in column 1.  To
make the document more visually pleasing, the last field of each
delimiter is followed by a blank, then padded to 72 characters with
asterisks (*).  Example: $$N1 BEGIN *********************************
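
For readers who want to generate or check delimiter lines, here is a
minimal Python sketch of the padding rule described above.  The names
pad_delimiter and LINE_LENGTH are illustrative assumptions and are not
part of the NIH description; the sketch also assumes that fields are
separated by single blanks.

```python
LINE_LENGTH = 72  # maximum line length stated in the description


def pad_delimiter(*fields):
    """Build a delimiter line: $$ in column 1, fields joined by blanks,
    one blank after the last field, then asterisks up to 72 characters."""
    text = "$$" + " ".join(fields) + " "
    if len(text) > LINE_LENGTH:
        raise ValueError("delimiter is longer than 72 characters")
    return text.ljust(LINE_LENGTH, "*")


line = pad_delimiter("N1", "BEGIN")
print(line)
print(len(line))
```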

The following describes the format and function of each delimiter:

-- Id Record

Format: $$XID file-name file-type file-descriptor sequence-id

The first record for the main body of the E-Guide and the first record
of each full text RFA transmitted will contain a $$XID record.  The
file-name and file-type fields will be 1 to 8 characters in length;
they are primarily for use by VMS systems that need such information.
The file-descriptor is a 1 to 20 character field that more
specifically describes the document.

For the main body of the E-Guide, the file-name will be NIHGUIDE, the
file-type will be yyyymmdd (where yyyy is the 4 digit year, e.g. 1990;
mm is the 2 digit month, and dd is the 2 digit day), and
file-descriptor will be VvvNnn (where vv is the volume and nn is the
number of the issue).

The sequence-id will be of the form PnOt, and is designed to handle
the rare case where the main body of the E-Guide or an RFA has more
than 1,500 lines and must be sent in multiple parts.  The "n"
value will be the current part number, and "t" will indicate the total
number of parts for the component being mailed.  Note that the
main body of the E-Guide and each RFA will have its own sequencing.
For example, P1O1 means part 1 of 1 (the normal case); P2O3 means
part 2 of 3 total parts.

Example: $$XID NIHGUIDE 19900914 V13N28 P1O2 ***************************
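
One possible way to read a $$XID record in Python is sketched below.
The function name parse_xid, the regular expression, and the
dictionary keys are my own assumptions for illustration; only the
field order and field lengths come from the description above.  Note
that the separator in the sequence-id is the capital letter O, not a
zero.

```python
import re

# Field lengths follow the description: file-name and file-type are
# 1 to 8 characters, file-descriptor is 1 to 20 characters.
XID_PATTERN = re.compile(
    r"^\$\$XID\s+(\S{1,8})\s+(\S{1,8})\s+(\S{1,20})\s+P(\d+)O(\d+)\s*\**\s*$"
)


def parse_xid(line):
    """Return the fields of a $$XID record, or None if the line is not one."""
    match = XID_PATTERN.match(line)
    if match is None:
        return None
    file_name, file_type, descriptor, part, total = match.groups()
    return {
        "file_name": file_name,
        "file_type": file_type,
        "file_descriptor": descriptor,
        "part": int(part),          # the "n" of PnOt
        "total_parts": int(total),  # the "t" of PnOt
    }


record = parse_xid(
    "$$XID NIHGUIDE 19900914 V13N28 P1O2 ***************************"
)
print(record)
```

A receiver could use the part and total_parts values to decide whether
more parts of the same component are still to come.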

For RFAs, the file-name will be RFA, the file-type will be the RFA
number with dashes and slashes ("-","/") removed, and the
file-descriptor will be the full RFA number.  The reason for the
difference between the file-type and file-descriptor is that some
RFAs would exceed 8 characters if the dashes and slashes were not
removed.

Example:  $$XID RFA DA9011 DA-90-11 P1O1 ***************************
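
The derivation of the RFA file-type from the RFA number can be
sketched in Python as follows.  The function names rfa_file_type and
rfa_xid_fields are hypothetical helpers chosen for this illustration.

```python
def rfa_file_type(rfa_number):
    """Return the RFA number with dashes and slashes removed."""
    file_type = rfa_number.replace("-", "").replace("/", "")
    if not 1 <= len(file_type) <= 8:
        raise ValueError("file-type must be 1 to 8 characters long")
    return file_type


def rfa_xid_fields(rfa_number, part=1, total=1):
    """Return the fields of a $$XID record for one part of an RFA."""
    # The sequence-id separator is the capital letter O, not a zero.
    sequence_id = "P%dO%d" % (part, total)
    return ["$$XID", "RFA", rfa_file_type(rfa_number), rfa_number, sequence_id]


fields = rfa_xid_fields("DA-90-11")
print(" ".join(fields))
```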

If, in the future, other documents are distributed in this electronic
fashion, they will also have the $$XID record.  I will update this
document and redistribute it if and when that occurs.

-- Index Record

Formats: $$INDEX BEGIN
         $$INDEX END
         $$INDEX id expiration-date (if applicable)

The Index record is used to delimit the beginning and end of the index,
and to mark the beginning of each index entry.

The $$INDEX BEGIN record will appear before the first line of the
index, and $$INDEX END will appear following the last line of the
index.  One $$INDEX id record will appear immediately preceding the
first line of each index entry.  The id has the form LetterNumber (ln)
where the Letter is N for notices, P for program announcements, E for
erratum, and R for RFAs.  The Number field starts at 1 for
each letter and increments by one for each entry.  If the document has
an expiration date, then that date (in mm/dd/yy form) will appear in
the expiration-date field.

Example:

   $$INDEX BEGIN
   $$INDEX N1 ******************************************************

   NATIONAL RESEARCH SERVICE AWARD STIPEND INCREASE
   Public Health Service
   Index:  PUBLIC HEALTH SERVICE

   $$INDEX R1 05/06/91 *********************************************

   CLINICAL CORE CENTERS FOR ORAL HEALTH RESEARCH (RFA DE-91-02)
   National Institute of Dental Research
   Index:  DENTAL RESEARCH

   $$INDEX END
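
The index layout above could be read with a short Python routine along
these lines.  This is only a sketch: parse_index, INDEX_ID, and the
dictionary keys are names I have made up, and the routine attributes
every non-blank line between two id records to the preceding entry, so
any section headings that appear inside the index would need extra
handling.

```python
import re

# $$INDEX id [expiration-date], optionally padded with asterisks.
INDEX_ID = re.compile(
    r"^\$\$INDEX\s+([NPER]\d+)(?:\s+(\d{2}/\d{2}/\d{2}))?\s*\**\s*$"
)


def parse_index(lines):
    """Collect the index entries found between $$INDEX BEGIN and $$INDEX END."""
    entries = []
    current = None
    inside = False
    for raw in lines:
        line = raw.strip()
        if line.startswith("$$INDEX BEGIN"):
            inside = True
            continue
        if line.startswith("$$INDEX END"):
            break
        if not inside:
            continue
        match = INDEX_ID.match(line)
        if match:
            current = {"id": match.group(1),
                       "expires": match.group(2),  # None if no date given
                       "text": []}
            entries.append(current)
        elif current is not None and line:
            current["text"].append(line)
    return entries


SAMPLE = """\
$$INDEX BEGIN
$$INDEX N1 ******************************************************

NATIONAL RESEARCH SERVICE AWARD STIPEND INCREASE
Public Health Service
Index:  PUBLIC HEALTH SERVICE

$$INDEX R1 05/06/91 *********************************************

CLINICAL CORE CENTERS FOR ORAL HEALTH RESEARCH (RFA DE-91-02)
National Institute of Dental Research
Index:  DENTAL RESEARCH

$$INDEX END
"""

entries = parse_index(SAMPLE.splitlines())
for entry in entries:
    print(entry["id"], entry["expires"], entry["text"][0])
```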

-- Entry Records

The text of each entry in the E-Guide is preceded by an Entry start
record and followed by an Entry termination record.

Format: $$ln BEGIN (document-number) (FULL TEXT [if available])
        $$ln END

The "ln" is the id from the Index record, and the document-number is the
RFA or PA number.  If full text is available, the string FULL TEXT
follows the document number.  The full text of RFAs and PAs will still
be sent as separate documents, the first line of which will be a $$XID
record where the file-descriptor field matches the document-number of the
Entry record.

Examples: $$N1 BEGIN ***********************************************
          $$N1 END *************************************************
          $$P1 BEGIN ***********************************************
          $$P1 END *************************************************
          $$R1 BEGIN DA-90-11 **************************************
          $$R1 END *************************************************
          $$R2 BEGIN HC-87-12 FULL TEXT ****************************
          $$R2 END *************************************************

Note in the above example, there is no document number for entry N1 or
P1, and no full text for entry R1.  The full text for R2 will be
transmitted separately.
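
A single Entry record can be taken apart in Python roughly as shown
below.  The function name parse_entry_record and the returned keys are
assumptions made for this sketch; it simply strips the asterisk
padding and splits the rest of the line on blanks.

```python
import re

ENTRY_ID = re.compile(r"[NPER]\d+")  # the "ln" id from the Index record


def parse_entry_record(line):
    """Return the fields of a $$ln BEGIN or $$ln END record, else None."""
    fields = line.strip().rstrip("*").split()
    if len(fields) < 2 or not fields[0].startswith("$$"):
        return None
    entry_id, kind = fields[0][2:], fields[1]
    if not ENTRY_ID.fullmatch(entry_id) or kind not in ("BEGIN", "END"):
        return None
    rest = fields[2:]
    # FULL TEXT, when present, follows the document number.
    full_text = rest[-2:] == ["FULL", "TEXT"]
    if full_text:
        rest = rest[:-2]
    return {
        "id": entry_id,
        "kind": kind,
        "document_number": rest[0] if rest else None,
        "full_text": full_text,
    }


r2 = parse_entry_record(
    "$$R2 BEGIN HC-87-12 FULL TEXT ****************************"
)
n1 = parse_entry_record(
    "$$N1 BEGIN ***********************************************"
)
print(r2)
print(n1)
```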


-- Summary

In the above described scheme, each document begins with an
identification record ($$XID).  Each entry in the E-Guide is
described by an index entry, which is given an id ("ln") by the
$$INDEX record (expiration dates are also given in the $$INDEX
record).  Each entry is delimited by the Entry start and termination
records, $$ln BEGIN and $$ln END.  If the entry describes an RFA or
PA, then the Entry start record also contains the RFA/PA number,
followed by the string FULL TEXT if full text is available.  The
actual full text is transmitted as a separate document, the first
line of which is a $$XID record where the file-descriptor field
matches the document-number of the Entry record.  (The Subject: and
X-Comment: informational records in the E-Guide header will remain
as is.)  The following is the structure of a small sample E-Guide:


$$XID NIHGUIDE 19910104 V20N1 P1O1 *************************************
NIH GUIDE - Vol. 20, No. 1, January 4, 1991


                                   NOTICES

$$INDEX BEGIN **********************************************************

$$INDEX N1 *************************************************************

NATIONAL RESEARCH SERVICE AWARD STIPEND INCREASE
Public Health Service
Index:  PUBLIC HEALTH SERVICE

                           NOTICES OF AVAILABILITY

$$INDEX R1 05/06/91 ****************************************************

CLINICAL CORE CENTERS FOR ORAL HEALTH RESEARCH (RFA DE-91-02)
National Institute of Dental Research
Index:  DENTAL RESEARCH

$$INDEX END ************************************************************

$$N1 BEGIN *************************************************************
.. text of notice N1
$$N1 END ***************************************************************

$$R1 BEGIN DE-91-02 FULL TEXT ******************************************
.. text of entry R1
$$R1 END ***************************************************************

The full text of RFA DE-91-02 will be distributed separately and have
the following format:

$$XID RFA DE9102 DE-91-02 P1O1 *****************************************
.. full text of RFA DE-91-02
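
Putting the pieces together, the body of each entry in a received
E-Guide could be pulled out with a Python sketch like the one below.
The function extract_entries and its error handling are my own
assumptions, not part of the NIH scheme; the sketch checks that every
$$ln END record carries the same id as the $$ln BEGIN record it
closes.

```python
import re

# Matches $$ln BEGIN and $$ln END; $$XID and $$INDEX records do not match.
ENTRY = re.compile(r"^\$\$([NPER]\d+)\s+(BEGIN|END)\b")


def extract_entries(lines):
    """Map each entry id to the list of text lines between BEGIN and END."""
    bodies = {}
    open_id = None
    for line in lines:
        match = ENTRY.match(line)
        if match:
            entry_id, kind = match.groups()
            if kind == "BEGIN":
                if open_id is not None:
                    raise ValueError("entry %s was never closed" % open_id)
                open_id = entry_id
                bodies[entry_id] = []
            else:
                if entry_id != open_id:
                    raise ValueError("unexpected END for %s" % entry_id)
                open_id = None
        elif open_id is not None:
            bodies[open_id].append(line)
    if open_id is not None:
        raise ValueError("entry %s was never closed" % open_id)
    return bodies


GUIDE = """\
$$XID NIHGUIDE 19910104 V20N1 P1O1 ***********************************
$$INDEX BEGIN ********************************************************
$$INDEX N1 ***********************************************************
NATIONAL RESEARCH SERVICE AWARD STIPEND INCREASE
$$INDEX END **********************************************************
$$N1 BEGIN ***********************************************************
.. text of notice N1
$$N1 END *************************************************************
$$R1 BEGIN DE-91-02 FULL TEXT ****************************************
.. text of entry R1
$$R1 END *************************************************************
"""

bodies = extract_entries(GUIDE.splitlines())
for entry_id, text in bodies.items():
    print(entry_id, text)
```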