[net.news.adm] more hobgoblins, thoughts on improving the situation

bobm@rtech.UUCP (Bob Mcqueer) (09/04/86)

[]---

Concerning the "#W" lines in map files: it seems to me that the "date"
portion of that line is a very useful piece of information, but non-adherence
to format is an even worse problem here than with the "#L" lines recently
discussed.  Looking over the map files, I see:

1) "date" format beginning with day-of-week
2) "date" format beginning with month (no day-of-week)
3) "date" format beginning with numeric day of month
4) six digit integer YYMMDD preceding semicolon
5) the YYMMDD integer of 4), but following the semicolon or with no semicolon
6) MM/DD/YY at some random placement
7) variations with and without time-of-day portions

The above appear in a significant number of entries.  There are also many
entries which convey the desired information to a human reader, but
don't adhere to any of the above formats.

Please note: it is not my intention AT ALL to chastise anybody for not
entering the dates "properly".  I'm just noting that as things stand now,
it is difficult to write a routine which will reliably pick the date out of
most of these lines.
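
To make the difficulty concrete, here is a rough Python sketch of such a
routine (not part of the original posting). It tries a handful of the formats
listed above in turn and gives up on anything else. The pattern list, the
names extract_date and to_date, and the rule that two-digit years mean 19YY
are all assumptions made for illustration; only three-letter English month
abbreviations are recognized.

```python
import re
from datetime import date

MONTHS = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split()
MON = "(" + "|".join(MONTHS) + ")"

# Each entry: (pattern, order of the captured fields: y=year, m=month, d=day).
PATTERNS = [
    # "[Sat] Jun 22 [03:35:16] [PDT] 1985"  (formats 1 and 2)
    (re.compile(r"\b" + MON + r"\s+(\d{1,2}),?\s+"
                r"(?:\d{1,2}:\d{2}(?::\d{2})?\s+)?(?:[A-Z]{3,4}\s+)?(\d{4})\b"),
     "mdy"),
    # "22 Jun 1985" or "22 Jun 85"  (format 3)
    (re.compile(r"\b(\d{1,2})\s+" + MON + r"\s+(\d{4}|\d{2})\b"), "dmy"),
    # "06/22/85"  (format 6)
    (re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4}|\d{2})\b"), "mdy"),
    # "850622"  (formats 4 and 5)
    (re.compile(r"\b(\d{2})(\d{2})(\d{2})\b"), "ymd"),
]


def to_date(order, groups):
    """Turn captured strings into a date; raises ValueError if impossible."""
    parts = dict(zip(order, groups))
    month = parts["m"]
    month = MONTHS.index(month) + 1 if month in MONTHS else int(month)
    year = int(parts["y"])
    if year < 100:
        year += 1900  # assumption: two-digit years mean 19YY
    return date(year, month, int(parts["d"]))


def extract_date(line):
    """Return the first plausible date found in a "#W" line, or None."""
    for pattern, order in PATTERNS:
        for match in pattern.finditer(line):
            try:
                return to_date(order, match.groups())
            except ValueError:
                continue  # e.g. month 13; keep looking
    return None
```

Even this much guessing only covers the regular cases; the free-form entries
mentioned above would still come back as None.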

I was wondering.  Could part of the collection process for maintaining
mapfiles and building the distributed archives be attachment of a time stamp
to each entry indicating when it was received?  The date stamp would then
be in a consistent format.  True, this hides the fact that somebody may
have mailed the map maintainer an ancient entry, but I think the guarantee
of being able to machine parse the date is an equitable trade-off.

Another thought: there ought to be a little program floating around which
would check the validity of map entries, including the format of lines
which could reasonably be machine readable.  Is there such?  If so, making
the tool available and known would help matters.  If everybody had the tool,
and it produced explicit diagnostics, it might become reasonable for the map
maintainer to insist on proper format before including entries in the archives.

Bob McQueer
-- 
{amdahl, sun, mtxinu, hoptoad, cpsc6a}!rtech!bobm

tgt@cbosgd.UUCP (Tim Thompson) (09/04/86)

In article <431@rtech.UUCP>, bobm@rtech.UUCP (Bob Mcqueer) writes:
> Concerning the "#W" lines in map files: it seems to me that the "date"
> portion of that line is a very useful piece of information, but non-adherence
> to format is an even worse problem here than with the "#L" lines recently
> discussed.  Looking over the map files, I see:
> 
>  [ formats omitted for brevity ]
>
> The above appear in a significant number of entries.  There are also many
> entries which convey the desired information to a human reader, but
> don't adhere to any of the above formats.
> Bob McQueer
> -- 
> {amdahl, sun, mtxinu, hoptoad, cpsc6a}!rtech!bobm

Bravo, Bob! I'm the map maintainer for Indiana, Illinois, Pennsylvania, Ohio,
North Carolina, and Tennessee. I just took over the job, and before I did
anything else, I went through by hand and standardized the "#W" line for
every single entry for each state. It was a TIME-CONSUMING job, but now
that it's done, all I have to do is make sure that incoming updates adhere
to the standard. Let me quote from the document that (from what I understand)
comes with the Usenet software:

 The entire map is intended to be processed by pathalias, a program that
 generates UUCP routes from this data.  All lines beginning in `#' are
 comment lines to pathalias, however the UUCP Project has defined a set
 of these comment lines to have specific format so that a complete
 database could be built.
 
 [...]
 
 #W      who last edited the entry and when
 
 This field should contain an email address, a name in parentheses,
 followed by a semi-colon, and the output of the date program.
 Example:
 
 #W	ucbvax!fair (Erik E. Fair); Sat Jun 22 03:35:16 PDT 1985
 
 The same rules for email address that apply in the contact's email
 address apply here also. (i.e. only one system name, and user name).
 It is intended that this field be used for automatic aging of the
 map entries so that we can do more automated checking and updating
 of the entire map. See getdate(3) from the netnews source for other
 acceptable date formats.

[ END OF QUOTATION ]

From this piece of documentation, I got the standard that I'm going to use
for these six states. So right now, if I were to update the entry for 
the site cbosgd, the entry would look like this:

#W	cbosgd!tgt (Tim Thompson); Thu Sep  4 13:42:14 EDT 1986

I never got a chance to see getdate(3) for other acceptable formats, and
besides, it seemed easier to use the output from an existing utility
than to have to format the date by hand.
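
For anyone who would rather generate the line than type it, a small Python
sketch (an illustration only, not something from the map documentation) that
lays out the current local time the way the example above does. The function
name w_line and its arguments are made up for this sketch, and the time zone
name is whatever the local system reports, which may not be a three-letter
abbreviation everywhere.

```python
import time

DAYS = "Mon Tue Wed Thu Fri Sat Sun".split()  # same order as tm_wday
MONTHS = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split()


def w_line(address, name, when=None):
    """Build a "#W" line; `when` is seconds since the epoch (default: now)."""
    t = time.localtime(when)
    stamp = "%s %s %2d %02d:%02d:%02d %s %d" % (
        DAYS[t.tm_wday], MONTHS[t.tm_mon - 1], t.tm_mday,
        t.tm_hour, t.tm_min, t.tm_sec,
        time.strftime("%Z", t),  # zone name as the platform reports it
        t.tm_year)
    return "#W\t%s (%s); %s" % (address, name, stamp)


print(w_line("cbosgd!tgt", "Tim Thompson"))
```

The day and month names are taken from fixed tables rather than from the
locale, so the result stays in English regardless of local settings.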


> Another thought: there ought to be a little program floating around which
> would check the validity of map entries, including the format of lines
> which could reasonably be machine readable.  Is there such?  If so, making
> the tool available and known would help matters.  If everybody had the tool,
> and it produced explicit diagnostics, it might become reasonable for the map
> maintainer to insist on proper format before including entries.

I'm currently gathering some thoughts on a program that would provide the
functions you mention. If such a beasty already exists out there on the
net somewhere, someone PLEASE send me e-mail, before I go ahead and
re-invent the wheel.
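
As a starting point for discussion, here is a minimal Python sketch of the
"#W" part of such a checker. It is an illustration only, not an existing
tool: check_w_line and its diagnostics are hypothetical, it accepts only the
date-program layout quoted above (not the other getdate(3) formats), and the
"exactly one site!user" test is my reading of the address rule, which may be
too strict.

```python
import re
from datetime import datetime

DAYS = "Mon Tue Wed Thu Fri Sat Sun".split()  # same order as datetime.weekday()
MONTHS = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split()

W_LINE = re.compile(
    r"^#W\s+(?P<addr>\S+)\s+\((?P<name>[^)]*)\)\s*;\s*(?P<date>.*)$")
DATE = re.compile(
    r"^(\w{3}) (\w{3}) ([ \d]\d) (\d\d):(\d\d):(\d\d) ([A-Z]{3,4}) (\d{4})$")


def check_w_line(line):
    """Return a list of diagnostics; an empty list means no problem was found."""
    m = W_LINE.match(line.rstrip("\n"))
    if not m:
        return ["expected '#W address (Name); date'"]
    problems = []
    # Assumption: "one system name and user name" means exactly site!user.
    if not re.match(r"^[^!@]+![^!@]+$", m.group("addr")):
        problems.append("address %r is not of the form site!user"
                        % m.group("addr"))
    if not m.group("name").strip():
        problems.append("name in parentheses is empty")
    d = DATE.match(m.group("date").strip())
    if not d:
        problems.append("date %r is not in date-program format"
                        % m.group("date"))
        return problems
    dow, mon, day, hh, mm, ss, _tz, year = d.groups()
    if mon not in MONTHS:
        problems.append("unknown month %r" % mon)
        return problems
    try:
        when = datetime(int(year), MONTHS.index(mon) + 1, int(day),
                        int(hh), int(mm), int(ss))
    except ValueError as err:
        problems.append("impossible date: %s" % err)
        return problems
    if DAYS[when.weekday()] != dow:
        problems.append("day of week is %r but that date falls on %s"
                        % (dow, DAYS[when.weekday()]))
    return problems
```

Run over a map file one "#W" line at a time, this prints nothing for the
cbosgd example above and one diagnostic for an entry dated only "860904".
The time zone field is checked for shape only, not against a list of known
zones.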

> Bob McQueer
> -- 
> {amdahl, sun, mtxinu, hoptoad, cpsc6a}!rtech!bobm

					Tim Thompson
					cbosgd!tgt


-- 
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 Timothy G. Thompson           AT&T Network Systems             Columbus, Ohio
                                    cbosgd!tgt
 DISCLAIMER:  These ramblings are my own. However, a thousand monkeys pounding
    on a thousand typewriters would eventually produce the exact same thing!!
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++