jwz@lucid.com (Jamie Zawinski) (11/09/90)
I know of two formats in which Webster's dictionary can be found online. I have a GNU Emacs package (by Jason Glasgow) for talking to one of them, and a Unix program (by Ed James) for talking to the other. mintaka.lcs.mit.edu runs a server of the first kind, and pasteur.berkeley.edu runs a server of the second kind; but pasteur won't talk to any machines not at berkeley, so I can't use it any more. This is unfortunate, because the second format is the better one.

So my first question is, are there any machines out there which run a server of the second sort which will talk to me?

My second question is, is either of these formats the same as the one the NeXT webster server uses? If not, what format does the NeXT server use? And are there any NeXTs out there which will answer webster connections from arbitrary machines on inet?

Here is a brief description of the two formats I know of, so you will know what I'm talking about.

The mintaka kind uses port 103; it is very simple, supporting single-word commands of the form "DEFINE word"; it does spelling correction as well, when you ask for the definition of a word that it doesn't know about, or when you issue the command "SPELL word". There is also a command for listing all words beginning with a given prefix. The definition which is sent back looks like

phi.lis.tine \'fil-*-.ste-n; f*-'lis-t*n, -.te-n; 'fil-*-st*n\ \-.iz-*m\ n
    cap
    1: a native or ingabitant of ancient Philistia
    often cap
    2a: a crass prosaic often priggish individual guided by material rather tha
    n intellectual or artistic values : BABBITT
    2b: one uninformed in a special area of knowledge
    - philistine aj

that is, the paragraphs come filled, and lines are pre-wrapped at 79 columns. There is little hope of making this look any prettier, since it's been chewed on already.
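A minimal sketch of a client for the mintaka-style commands described above, assuming Python. Only port 103 and the "DEFINE word" / "SPELL word" command forms come from the post; the CRLF line terminator, the way the end of a reply is detected, and the function names build_command and query are assumptions made for illustration. The prefix-listing command is left out because the post does not give its name.

```python
import socket

MINTAKA_PORT = 103  # port number given in the post


def build_command(verb: str, word: str) -> bytes:
    """Format a single-word command such as 'DEFINE word'."""
    verb = verb.upper()
    if verb not in ("DEFINE", "SPELL"):
        raise ValueError(f"unsupported command: {verb}")
    if not word or any(ch.isspace() for ch in word):
        raise ValueError("expected a single word")
    # ASSUMPTION: commands end with CRLF; the post does not say.
    return f"{verb} {word}\r\n".encode("ascii")


def query(host: str, verb: str, word: str, timeout: float = 10.0) -> str:
    """Send one command and return whatever text comes back.

    ASSUMPTION: the reply is read until the server closes the
    connection or the timeout expires; a real server may mark the
    end of a reply differently.
    """
    chunks = []
    with socket.create_connection((host, MINTAKA_PORT), timeout=timeout) as sock:
        sock.sendall(build_command(verb, word))
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        except socket.timeout:
            pass  # treat a quiet connection as end of reply
    return b"".join(chunks).decode("latin-1")
```

Keeping build_command separate from the socket code means the command formatting can be checked without a server to talk to.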
The other kind of server, of which pasteur.berkeley.edu is a variety, uses port 1964, and has an interface very much like SMTP or NNTP - responses begin with three-digit numbers, 2-- means ok, 5-- means failure, etc. The big win of this server is that it preserves font-change and special-character information.

The definition body that comes back is broken up into records. There are two levels of encoding; at the first level, structural elements of the definition are sent one per line, in a form like

    <character> : <field-1> ; <field-2> ; <field-3> ...

where the character says what kind of record this is (definition, label, cross-reference, etc). Each kind of record has a fixed number of fields in it, separated by semicolons. This means that if a word has several definitions (as philistine does, above) then each definition will be in its own record.

When the fields contain text, as definitions do, they contain typesetter information. Special characters and font-changes are encoded with "overstruck" characters, that is, a sequence like

    <char-1> <backspace> <char-2>

will either change the font, or will map to one or more different characters. No line-breaks are included, so a client gets to format and wrap the definitions as it likes.

One interesting fact is that the mintaka database was apparently derived from the pasteur database (or a common source), because I have come across definitions in mintaka's dictionary which have had the font information improperly stripped out! Parts of the font-change codes were still visible in a few cases.

So, any answers?

-- Jamie

PS: if you have access to a server of the same genotype as pasteur, and you have a TI Explorer Lisp Machine, you can use the code in /usr/jwz/public/dictionary-client.lisp on spice.cs.cmu.edu to talk to it with a hypertextized interface (clicking on words defines them, making it easy to navigate around the dictionary).
GNU Emacs code for talking to the other kind is available at your favorite emacs archive site.
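A rough sketch, assuming Python, of how a client might take apart the pasteur-style replies described in the post above: the three-digit status prefix, the "<character> : <field-1> ; <field-2> ..." record lines, and the "<char-1> <backspace> <char-2>" overstrike sequences. The function names, the record-type letter used in the usage note, and the tolerance for whitespace around separators are all assumptions; the post does not say which letters exist, how many fields each record kind has, or what any overstrike pair maps to, so this only splits the input and interprets nothing.

```python
from typing import List, Tuple, Union

BACKSPACE = "\b"


def reply_class(line: str) -> str:
    """Classify a reply line by its leading three-digit code."""
    code = line[:3]
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"no three-digit code in {line!r}")
    if code[0] == "2":
        return "ok"
    if code[0] == "5":
        return "failure"
    return "other"  # the post only spells out the 2-- and 5-- classes


def parse_record(line: str) -> Tuple[str, List[str]]:
    """Split '<character> : <field-1> ; <field-2> ...' into its parts.

    ASSUMPTION: semicolons never occur inside a field. A real client
    would use the fixed field count of each record kind instead.
    """
    kind, sep, rest = line.partition(":")
    kind = kind.strip()
    if not sep or len(kind) != 1:
        raise ValueError(f"not a record line: {line!r}")
    return kind, [field.strip() for field in rest.split(";")]


def split_overstrikes(text: str) -> List[Union[str, Tuple[str, str]]]:
    """Separate plain text from (char-1, char-2) overstrike pairs."""
    out: List[Union[str, Tuple[str, str]]] = []
    plain: List[str] = []
    i = 0
    while i < len(text):
        if i + 2 < len(text) and text[i + 1] == BACKSPACE:
            if plain:
                out.append("".join(plain))
                plain = []
            out.append((text[i], text[i + 2]))
            i += 3
        else:
            plain.append(text[i])
            i += 1
    if plain:
        out.append("".join(plain))
    return out
```

For example, with a made-up record letter "d", parse_record("d : 1 ; a native ; x") gives ("d", ["1", "a native", "x"]), and split_overstrikes("a(\bIb") gives ["a", ("(", "I"), "b"]. Mapping each pair to a font change or a special character would need the server's own table, which is not given here.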
royle@iuvax.cs.indiana.edu (Keenan Royle) (11/09/90)
iuvax.cs.indiana.edu is a webster server. It is also the home of the software for using a NeXT as a webster server for generic UNIX clients (anon ftp).

--
Keenan Royle
royle@cs.indiana.edu
postmaster@cs.indiana.edu
royle@iubacs.bitnet
pcg@cs.aber.ac.uk (Piercarlo Grandi) (11/19/90)
On 9 Nov 90 04:58:02 GMT, royle@iuvax.cs.indiana.edu (Keenan Royle) said:

royle> iuvax.cs.indiana.edu is a webster server.
royle> it is also the home of the software to use NeXT as a webster
royle> server for a generic UNIX clients. (anon ftp)

This is the second message with details about locating a webster server.

I am not sure, and apologies in advance if I am wrong, but I seem to remember that Webster is copyrighted material and that one has to pay a copyright fee for making copies of it, e.g. broadcasting or public performance fees, or for copying parts of it over the network.

Probably MIT, Berkeley and Indiana have paid the appropriate fees for their sites, but access from other sites is a copyright violation, if what I surmise above is true.

Not only that, but accessing somebody else's webster server without prior permission is not good net etiquette anyhow, just like accessing somebody else's NNTP server, unless it is explicitly made available for network-wide access, as anonymous FTP servers are.

--
Piercarlo Grandi                   | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk