[comp.protocols.tcp-ip] Seeking advice on establishing a LARGE centralized mail system

pritch@tut.cis.ohio-state.edu (Norm Pritchett) (04/26/89)

I would like to hear from individuals experienced in establishing a
centralized electronic mail service for a large user base (4 figures
or greater). 

Here at the Ohio State University we have a campus-wide token ring
network interconnecting individually-administered departmental networks
whose sizes range from a handful of hosts to hundreds.  It is not very easy to
provide a total count of hosts but it should be pretty close to a thousand.

Some of our departments already implement one scheme or another for
providing uniform addressing of mail for their users such that a sender
need not be concerned with which particular machine to direct the
message to.  In these cases, the sender addresses the message to the
department's Internet domain name (e.g. user@eng.ohio-state.edu or
user@cis.ohio-state.edu) and the message is delivered to the recipient
on his "home" machine.

We would like to implement a similar scheme at the university-wide
level where a sender could address a message to
some-userid@ohio-state.edu and have the message delivered to the
recipient on his home system.  The major obstacle is with the
"some-userid" part: we wish it to be representative of the recipient's
real name (or actually be his real name) while at the same time have
it uniquely identify him/her among the 75,000+ faculty, staff and
students where there are numerous unresolvable name collisions.  A
format of Firstname.MI.Lastname which eliminates many collisions still
leaves many remaining.
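
To make the collision problem concrete, here is a minimal sketch that builds
Firstname.MI.Lastname candidates and reports which ones are claimed by more
than one person. The roster and the function names are hypothetical,
invented only for illustration; a real roster would come from the
university's records.

```python
from collections import defaultdict

def candidate_address(first, middle, last):
    """Build a Firstname.MI.Lastname local part (MI omitted if no middle name)."""
    parts = [first]
    if middle:
        parts.append(middle[0].upper())
    parts.append(last)
    return ".".join(parts)

def find_collisions(people):
    """Return {address: [people]} for every address claimed by 2+ people."""
    groups = defaultdict(list)
    for person in people:
        # Compare case-insensitively, as mail local parts usually are in practice.
        groups[candidate_address(*person).lower()].append(person)
    return {addr: who for addr, who in groups.items() if len(who) > 1}

# Hypothetical roster of (first, middle, last) tuples.
roster = [
    ("John", "Allen", "Smith"),
    ("John", "Andrew", "Smith"),
    ("Mary", "", "Jones"),
    ("John", "Robert", "Smith"),
]
collisions = find_collisions(roster)
```

With this roster the two "John A. Smith" entries still collide even though
the middle initial separates them from "John R. Smith".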

If there is anyone who has experience in setting up a similar thing or
has constructive advice, please correspond with me via mail at one of the
following Internet addresses:

	pritch@cis.ohio-state.edu
	npritchett@osu-20.ircc.ohio-state.edu
	pritchett@eng.ohio-state.edu

-- 

Norm Pritchett, The Ohio State University College of Engineering Network
Internet: pritchett@eng.ohio-state.edu	BITNET: TS1703 at OHSTVMA
UUCP: pritch@sydney.columbus.oh.us	CCNET: ENG::PRITCHETT (6172::PRITCHETT)

af%sei.ucl.ac.be@CUNYVM.CUNY.EDU ("Alain FONTAINE ", Postmaster - NAD) (04/28/89)

For what it is worth (two Belgian cents = approx 0.0005 dollar..) :

We have established a unified address scheme here. But we did not find
any way to allow external correspondents to send mail to an individual
knowing only his name *and* avoid clashes... This seems
theoretically impossible. The sender must *know* and *specify* some more
information to guarantee uniqueness. So the addresses used are of the
form personal-identifier@unit.ucl.ac.be, where 'unit' is the
standardized three- or four-letter acronym of the laboratory or service in
which the person can be found. Of course, it is difficult for an
external correspondent trying to contact somebody for the first time to
guess the 'unit' to be used. On the other hand, clashes are a very
low-probability event, since units never count more than 50 persons.

Implementation: the DNS would be a marvelous tool for this, since each
unit could have and manage its own name server. Alas (one of my
favorite gripes), the arbitrary division of mail addresses into a local
and a domain part makes it impossible to use the DNS down to the
individual level. So the current situation is that one centralized
machine contains a centralized database of mail routing information, and
nearly all domain-addressed mail goes physically (uh, should we say that
about zeroes-and-ones on wires and disks and ...) through that machine.
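
A centralized routing database of this kind can be sketched as a table keyed
on (unit, personal identifier). This is only an illustration under stated
assumptions: the domain example.edu, the units, the identifiers and the home
mailboxes are all hypothetical, and the real database format is not
described in this message.

```python
SITE_DOMAIN = "example.edu"  # hypothetical site domain, not the real one

# Hypothetical routing table: (unit, personal identifier) -> home mailbox.
# The same identifier may appear in two units without clashing.
ROUTES = {
    ("phys", "jdupont"): "jdupont@vax1.phys.example.edu",
    ("chem", "jdupont"): "dupont@lab3.chem.example.edu",
}

def route(address):
    """Return the home mailbox for a unified address, or None if unknown."""
    local, _, domain = address.lower().partition("@")
    suffix = "." + SITE_DOMAIN
    if not domain.endswith(suffix):
        return None  # not of the form identifier@unit.<site domain>
    unit = domain[: -len(suffix)]
    return ROUTES.get((unit, local))
```

Because the key includes the unit, uniqueness only has to hold inside each
unit, which is what keeps clashes rare.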

Alain FONTAINE                       +--------------------------------+
Universite Catholique de Louvain     | If your mail software barks at |
Service d'Etudes Informatiques       | my address, you may try :      |
Batiment Pythagore                   |                                |
Place des Sciences, 4                |     FNTA80@BUCLLN11.BITNET     |
B-1348 Louvain-la-Neuve, BELGIUM     +--------------------------------+
phone +32 (10) 47-2625

mar@ATHENA.MIT.EDU (05/02/89)

We've been thinking of tackling this problem here at MIT.  Our initial
planning is as follows:

*  The full name of every member of the MIT community will be known to
   the mail hub.  Mail sent to someone's full name will result in one of:
	1) The mail is delivered if the name is unique and the person
	   has a mailbox 
	2) An error response is generated saying "[full name] does not
	   have an electronic mail address, please send mail to MIT
	   Room ..., Cambridge MA 02139"
	3) An error response is generated saying "[full name] is
	   ambiguous, please choose one:" followed by a list of people
	   giving the name, title, address, and a unique email
	   identifier.
	4) An error response saying "addressee unknown".
*  Every member of the MIT community will be given a unique
   identifier for email purposes.  For most active email users, this
   will be their login name.  For other people and those with name
   conflicts, it will be their initials and a number, similar to the
   NIC's whois database.
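
The four outcomes and the "initials and a number" identifiers can be
sketched roughly as follows. This is a sketch under assumptions, not the
planned mail hub code: the Person record, the function names, the sample
directory and the exact wording of the responses are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Person:
    full_name: str
    unique_id: str
    mailbox: Optional[str] = None  # None means no electronic mailbox

def resolve_full_name(full_name, directory):
    """Return (outcome, detail) for mail addressed to a full name."""
    matches = [p for p in directory if p.full_name.lower() == full_name.lower()]
    if not matches:
        return ("unknown", "addressee unknown")
    if len(matches) > 1:
        choices = ", ".join(f"{p.full_name} <{p.unique_id}>" for p in matches)
        return ("ambiguous", f"{full_name} is ambiguous, please choose one: {choices}")
    person = matches[0]
    if person.mailbox is None:
        return ("no_mailbox", f"{person.full_name} does not have an electronic mail address")
    return ("deliver", person.mailbox)

def assign_id(full_name, taken):
    """Assign initials plus the lowest unused number, e.g. 'jqd1', 'jqd2'."""
    initials = "".join(word[0] for word in full_name.split()).lower()
    n = 1
    while f"{initials}{n}" in taken:
        n += 1
    new_id = f"{initials}{n}"
    taken.add(new_id)
    return new_id

# Hypothetical directory for illustration.
directory = [
    Person("Ann B Lee", "abl1", "abl1@host.example.edu"),
    Person("Tom Ray", "tr1"),
    Person("Sam Poe", "sp1", "sp1@host.example.edu"),
    Person("Sam Poe", "sp2", "sp2@host.example.edu"),
]
```

Checking for "unknown" and "ambiguous" before looking at the mailbox means
the hub never has to guess which of several namesakes was meant.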

This information will be kept up-to-date by Moira, the Athena Service
Management System, and regularly updated on the mailhub.  Users will be
allowed to update some of their own information, and to become
unlisted if they want to.

Moira currently contains all of the necessary information for the
students here at MIT; only the staff and remaining faculty must be
added.  The primary development effort will be modifications to the
mail hub.
				-Mark Rosenstein
				MIT Project Athena Systems Development

cfe+@ANDREW.CMU.EDU ("Craig F. Everhart") (05/03/89)

The CMU installation of the Andrew system, andrew.cmu.edu, supports a
name space of 8500 users.  For an installation of this size, I believe
it to be difficult or impossible to make somebody's unique ID correspond
in a predictable way to their full legal name.  (Some smaller
installations, such as CMU's Computer Science department (cs.cmu.edu)
with maybe 3000 users, feel that they can make the unique ID be the
preferable Firstname.I.Lastname, so Andrew also supports that canonical
format for such installations.)

We tackled the problems of mapping name probes to the space of all names
in a distributed manner, and came up with Andrew's White Pages, a name
lookup service that can match probes using abbreviations, phonetic
heuristics, and the like.  The service runs via AFS on any workstation,
not simply on a mail hub.  We use it for mail delivery, as well, with
results such as:
	- mail delivery to the named user
	- error response generated if the name probe was unique but only a
	  fuzzy match
	- error response generated if the name probe was ambiguous; possible
	  matches listed if there aren't more than a given number of them
	- error response generated if no match could be found.
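
The four results above can be illustrated with a small sketch. It does not
reproduce the White Pages heuristics: Python's difflib similarity matching
is swapped in as a stand-in for the abbreviation and phonetic matching, and
the function name, the cutoff, the listing limit and the directory (assumed
to map unique full names to mailboxes) are all hypothetical.

```python
import difflib

def probe(name, directory, max_listed=5, cutoff=0.8):
    """Classify a name probe against {full name: mailbox}.

    Returns ("deliver", mailbox), ("fuzzy_unique", [name]),
    ("ambiguous", [names or empty if too many]) or ("no_match", []).
    """
    lowered = {full.lower(): full for full in directory}
    key = name.lower()
    if key in lowered:
        return ("deliver", directory[lowered[key]])
    # Ask for one more than the limit so "too many to list" is detectable.
    close = difflib.get_close_matches(key, list(lowered), n=max_listed + 1, cutoff=cutoff)
    if not close:
        return ("no_match", [])
    if len(close) == 1:
        return ("fuzzy_unique", [lowered[close[0]]])
    if len(close) <= max_listed:
        return ("ambiguous", [lowered[c] for c in close])
    return ("ambiguous", [])  # too many candidates to list

# Hypothetical directory for illustration.
people = {
    "Craig Everhart": "mbox-1",
    "Carol Everett": "mbox-2",
    "John Smith": "mbox-3",
    "Jon Smith": "mbox-4",
}
```

Here probe("Craig Everhardt", people) is a unique but fuzzy match, so it
produces a suggestion rather than a delivery.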
This has been in place for two years or more.  In progress is a
mechanism whereby people can update aspects of their own White Pages
entries automatically, with optional administrative approval.

Integrating larger lists of names, with optional mail deliveries such as
paper campus-mail delivery, is a cute idea.  We haven't pursued it very
hard, but we think it could be fun.

		Craig Everhart
		Andrew message system

pritch@tut.cis.ohio-state.edu (Norm Pritchett) (05/04/89)

I'd like to thank all those who answered my query regarding the
subject of this posting.  For those who wanted me to share what I found,
that will be forthcoming -- I still have messages coming in at a
steady rate and I'd like to wait for them to trickle off before I
share. 

From the collective responses I got I was able to devise a pretty good
scheme.  I won't share it yet because some ideas are still being
hashed out among some fellow networking folks on campus but if you are
familiar with DND there's a lot of similarity to that.

In my original posting I (intentionally) didn't present an accurate
idea of the size of userbase we had to address because I didn't want
to dissuade some people from responding just because they thought their
system wouldn't work for us.  I mentioned 4 figures or larger in my
message -- what we really need is a scheme that will comfortably
handle a population in excess of 75,000.  If some of you have thought
about trying to develop a system this large but have been
dissuaded for some reason or another (I've heard from a few such
places) I think we've got something for you... stay tuned.
-- 

Norm Pritchett, The Ohio State University College of Engineering Network
Internet: pritchett@eng.ohio-state.edu	BITNET: TS1703 at OHSTVMA
UUCP: pritch@sydney.columbus.oh.us	CCNET: ENG::PRITCHETT (6172::PRITCHETT)

mullen@itd.nrl.navy.mil (Preston Mullen) (05/04/89)

The Andrew Message System at Carnegie-Mellon University has a component
called "White Pages" that employs a fuzzy name recognition mechanism.
According to the author, Craig Everhart <cfe+@andrew.cmu.edu>, it
"matches name variants to people's names reasonably well without any
pre-identification of the possible variants of everybody's names."
(By the way, the + in his address is a flag that bypasses the smart
name recognition.)  The Andrew Message System is built on top of the
Andrew File System, but the White Pages name recognition component is
easy to separate out.

When I asked about this in October, I was told that the software is owned
by IBM and that the licensing policy had not yet been determined.

I had hoped (and still hope) to use this kind of name matching in a
general approach like that recently suggested by <mar@ATHENA.MIT.EDU>
in his message to tcp-ip of Mon, 1 May 89 13:28:20 EDT.  One might
want to set things up so that an address with exactly one match on
the wrong component (e.g., first name only) would result in a response
similar to the one sent for ambiguous names; in such a case, it might
be better to force the sender to confirm the intended addressee than
to deliver to the wrong person.
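
A rough sketch of that rule, assuming (hypothetically) that a unique match
on the last name is trusted for delivery while a match on the first name
alone only triggers a confirmation request. The record layout and the
function name are invented for illustration.

```python
def resolve_component(probe, directory):
    """directory: list of dicts with 'first', 'last' and 'mailbox' keys."""
    key = probe.strip().lower()
    by_last = [p for p in directory if p["last"].lower() == key]
    by_first = [p for p in directory if p["first"].lower() == key]
    if len(by_last) == 1:
        return ("deliver", by_last[0]["mailbox"])
    # Several last-name matches, or any match on the first name only:
    # ask the sender to confirm instead of delivering.
    candidates = by_last or by_first
    if candidates:
        return ("confirm", [f'{p["first"]} {p["last"]}' for p in candidates])
    return ("unknown", [])

# Hypothetical directory for illustration.
staff = [
    {"first": "Preston", "last": "Mullen", "mailbox": "mbox-a"},
    {"first": "Craig", "last": "Everhart", "mailbox": "mbox-b"},
]
```

A probe of "craig" matches exactly one person, but only on the first name,
so the sender gets a confirmation request rather than a delivery.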

	Preston Mullen
	Laboratory for the Study of Human-Computer Interaction (Code 5530)
	Naval Research Laboratory
	Washington DC 20375

P.S.  There is probably a better mailing list than tcp-ip for this topic,
      but which one?