jordan@ucbarpa.Berkeley.EDU (Jordan Hayes) (08/29/86)
Brian Reid <reid@decwrl.UUCP> writes:
The vast majority of the USENET map entries have
latitude/longitude that is either missing or else in the wrong
format.
Here is the full set of #L fields from the map entries for all
of the Bay Area sites (most of the sites don't even have a
latitude and longitude entry). Notice how many of the lines
below are wrong:
acacia 122 17 59 W / 37 32 57 N
altunv 122 44 W / 38 27 N
amd 37.2 N / 122 W
[ ... ]
lll-bio ground zero nuclear wasteland; who gives a fuck?
This is classic_stuff!
/jordan
ps: I agree that this is a problem and should be cleared up. I also
think that the maps in general could use a shakedown. Wouldn't it
seem reasonable to reject map changes that are in the wrong format?
Who is taking care of the bay area maps?
jbuck@epimass.UUCP (Joe Buck) (08/30/86)
In article <15454@ucbvax.BERKELEY.EDU> jordan@ucbarpa.Berkeley.EDU (Jordan Hayes) writes:
[ Quote from Brian Reid about bad #L lines ]
>ps: I agree that this is a problem and should be cleared up. I also
> think that the maps in general could use a shakedown. Wouldn't it
> seem reasonable to reject map changes that are in the wrong format?
> Who is taking care of the bay area maps?

No, it's not reasonable at all. Accurate connection information
benefits the net as a whole; rejecting someone's entry because of an
error in the #L line is foolish. If someone sends a map update, it's
quite likely because the current map may cause mail to be lost. If the
person came close, the best solution would be for the map-maintaining
volunteer to fix it, and mail back the corrected copy to the sender.
If that's too much work, I'd rather have an incorrect #L line.

One way of putting pressure on people to improve their #[A-Z] lines
would be to post some software that actually does something
interesting with them. As long as they are just comments, there's
little incentive to maintain them.

On the other hand, out-of-date information is worse than no
information. Having map entries expire after, say, two years (or even
one year) would probably improve the accuracy tremendously.
--
- Joe Buck {ihnp4!pesnta,oliveb,nsc!csi}!epimass!jbuck
Entropic Processing, Inc., Cupertino, California
bobm@rtech.UUCP (Bob Mcqueer) (08/30/86)
[]----
Hmmmmm. What a coincidence. I've been toying around lately with
putting usenet stuff into a relational database (look who I work for
and guess which one) to be able to do nice sophisticated queries
thereon. I decided that the #L lines were being entered in a pretty
"loose" fashion. For what it's worth, my decisions on how to make
sense out of a reasonable number of those lines:

1) I assign an accuracy 0 - 4. 0 means I couldn't make sense out of
   the line, and you get a default based on country / region (default
   to the South Pole if I don't recognize the country). 1 means
   degrees accuracy, 3 means degrees / minutes and 4 degrees /
   minutes / seconds. 2 indicates better than "degrees", but "city"
   was attached. The rationale is that a city is generally bigger
   than a minute, but even the largest metropolitan areas are only on
   the order of one degree (of course the further north they're
   situated, the easier for them to cover multiple degrees of
   longitude).

2) I split into two pieces on the "/" first. Most people seem to be
   obeying this one.

3) For each piece, I look for "NnSsWwEe" first, assuming "N" and "W"
   if I don't find them (I know - geographical chauvinism). Then I
   use strtok() to grab things delimited by " \tNnSsWwEe\"'", and
   take the numbers as degrees / minutes / seconds, stopping early if
   I run out of numeric tokens. Then I see if the stopping token was
   "city".

4) The whole specification then gets the lower accuracy of the two
   pieces.

This represents my guess for something that will work a reasonable
amount of the time. The "accuracy" lets you filter the less reliable
ones depending on what you're doing with the data.

If you want something even MORE hopeless, try to parse reasonable
sub-pieces out of the telephone number lines. Not only do you have
non-uniformity of entry, but what constitutes an interesting sub-part
(such as US / Canadian area codes, which may be more useful than
geographic information sometimes) differs from country to country, and
there are various alternate carriers, extension specifications, etc.
I'm trying to fish out the area codes, and it seems to me that those
rules may also produce a useful sub-piece of the phone number for
other countries as well, but I'm not sure, since I don't know what the
"breaks" in those phone numbers mean.

THEN, what about the date that's supposed to be entered in the #W
line......

Bob McQueer
--
{amdahl, sun, mtxinu, hoptoad, cpsc6a}!rtech!bobm
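
[ The #L rules described in this post can be sketched roughly as
follows. This is a minimal Python sketch, not Bob's code (his used
strtok() in C). The names parse_piece and parse_location are
hypothetical. Two things are assumptions that go beyond the post:
the first piece defaults to "N" and the second to "W", and each piece
is routed to latitude or longitude by its hemisphere letter, so
entries with the two halves swapped still come out right. The
per-country default for accuracy 0 is left out. ]

```python
import re

# Same delimiter set the post gives to strtok(): space, tab, the
# hemisphere letters, and the two quote characters.
DELIMS = re.compile(r"[ \tNnSsWwEe\"']+")
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def parse_piece(piece, default_hemi):
    """Parse one half of a #L value.

    Returns (signed_degrees or None, hemisphere letter, accuracy 0-4).
    """
    found = re.search(r"[NnSsWwEe]", piece)
    hemi = found.group(0).upper() if found else default_hemi

    numbers = []
    stop = ""  # first token that is not taken as a number
    for tok in (t for t in DELIMS.split(piece) if t):
        if len(numbers) < 3 and NUMBER.fullmatch(tok):
            numbers.append(float(tok))
        else:
            stop = tok
            break

    if not numbers:
        return None, hemi, 0

    deg, minutes, seconds = (numbers + [0.0, 0.0])[:3]
    value = deg + minutes / 60 + seconds / 3600
    if hemi in "SW":
        value = -value

    if len(numbers) == 1:
        accuracy = 1                      # degrees only
    elif stop.lower() == "city":
        accuracy = 2                      # finer than degrees, but "city"
    else:
        accuracy = 3 if len(numbers) == 2 else 4
    return value, hemi, accuracy


def parse_location(text):
    """Return (lat, lon, accuracy); (None, None, 0) if unparseable."""
    parts = text.split("/")
    if len(parts) != 2:
        return None, None, 0

    coords = {}
    accuracy = 4
    for piece, default_hemi in zip(parts, "NW"):
        value, hemi, acc = parse_piece(piece, default_hemi)
        coords["lat" if hemi in "NS" else "lon"] = value
        accuracy = min(accuracy, acc)     # lower accuracy of the two wins

    if accuracy == 0 or len(coords) != 2:
        return None, None, 0
    return coords["lat"], coords["lon"], accuracy


if __name__ == "__main__":
    for line in ("122 17 59 W / 37 32 57 N",   # accuracy 4
                 "122 44 W / 38 27 N",         # accuracy 3
                 "37.2 N / 122 W",             # accuracy 1
                 "ground zero nuclear wasteland"):  # accuracy 0
        print(line, "->", parse_location(line))
```

[ Note that under these rules a decimal-degree entry such as "37.2 N"
counts as a single numeric token, so it is rated "degrees" accuracy
even though it carries more precision. ]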
reid@decwrl.DEC.COM (Brian Reid) (08/30/86)
In article <432@epimass.UUCP> jbuck@epimass.UUCP (Joe Buck) writes:
>One way of putting pressure on people to improve their #[A-Z] lines
>would be to post some software that actually does something
>interesting with them. As long as they are just comments, there's
>little incentive to maintain them.

I have a program that plots USENET maps using the CIA World Data Bank
II map data base, various cartographic projections, and the mod.map
data. It is pretty neat. I am still working on it; for example, at
this very moment I am adding an option that will switch between
Mercator projections and Lambert Conformal projections; sometimes one
is better than the other.

While I am willing to release the program eventually, it requires
almost half a gigabyte of support data to do its job, so I think that
for the moment I will just post the maps that it produces (PostScript
files) and not the program itself.

Brian
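
[ For readers who have not met the projections named in this post,
here is a minimal Python sketch of the standard spherical Mercator
formula, which turns a parsed latitude / longitude into plot
coordinates. It is not taken from Brian's program, which was not
posted. The function name mercator and the radius and central
meridian parameters are illustrative assumptions. ]

```python
import math


def mercator(lat_deg, lon_deg, lon0_deg=0.0, radius=1.0):
    """Spherical Mercator: map (lat, lon) in degrees to plane (x, y).

    x grows eastward from the central meridian lon0_deg; y grows
    northward from the equator. Latitudes of +/-90 have no finite
    image, so callers should clip near the poles.
    """
    lat = math.radians(lat_deg)
    x = radius * math.radians(lon_deg - lon0_deg)
    y = radius * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y


if __name__ == "__main__":
    # A Bay Area point, roughly where the acacia entry above lands.
    print(mercator(37.55, -122.30))
```

[ Mercator keeps compass directions straight but stretches areas
toward the poles, which is one reason a conic projection such as
Lambert Conformal can look better for a mid-latitude region. ]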
rlr@brahms.VOLUME.EDU (the real Rich Rosen) (09/08/86)
> lll-bio ground zero nuclear wasteland; who gives a fuck?
>
>This is classic_stuff!
>
>/jordan

It's a good thing old bandy boy didn't use the word 'anus', eh?
Otherwise you would have been real pissed off. Right, brain-of-newt?
Hahahahahaha.

ucbvax!brahms!rlr
the real Rich Rosen/UCB Math Dept/Berkeley CA 94720
It is good to be Rich--the Rabbi himself will give the eulogy at your funeral.