[net.mail.headers] Name data base or Name and Host data base

Jacob_Palme_QZ@QZCOM.MAILNET (09/16/84)

I have an important design question for a future interface to
RFC mail, where I would like the advice of other people in the
header-people group.

I would like to automatically build up a data base of people,
with whom our users often communicate, from the names and
addresses of people in the header fields of incoming messages.

This data base can be organized in two ways:

(1) A data base of names of people. For each name, the full
return path to that person is stored.

(2) A data base of hosts and names. For each name, only the
host name is stored. The return path to that host is then
stored once, in the entry for that host.
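
As a rough illustration of the two organizations, here is a minimal
Python sketch. The dictionary names, the second user "FFF" and the
"{user}" template notation are all hypothetical, chosen only to make
the difference visible; they are not taken from any existing program.

```python
# Illustrative sketch only: every name and address below is made up.

# Organization (1): one entry per person, each holding a full return path.
people_db = {
    "CCC": "AAA!BBB!CCC%DDD@EEE",
    "FFF": "AAA!BBB!FFF%DDD@EEE",
}

# Organization (2): each person points at a host, and the return path
# to that host is stored once (here as a template with a {user} slot).
person_host = {"CCC": "BBB", "FFF": "BBB"}
host_path = {"BBB": "AAA!BBB!{user}%DDD@EEE"}


def address_v1(name):
    """Look up the full return path stored for this person."""
    return people_db[name]


def address_v2(name):
    """Build the return path from the person's host entry."""
    host = person_host[name]
    return host_path[host].format(user=name)
```

In this sketch both functions give the same address for "CCC", but
changing host_path["BBB"] changes the address of every user at BBB,
while changing people_db["CCC"] affects only that one person.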

Note that there are in-between alternatives. For a name like
AAA!BBB!CCC%DDD@EEE this could be stored as
(a) the name "AAA!BBB!CCC%DDD" at the host "EEE", or as
(b) the name "AAA!BBB!CCC" at the host "DDD" or as
(c) the name "CCC" at the host "BBB".
Of those three alternatives, I would much prefer alternative
(c) with the real name and the real host.
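
The three ways of splitting can be sketched as follows. This is only
an illustration: the function name split_alternatives is hypothetical,
and it assumes the address has exactly the shape of the example (a "!"
path, then "%", then "@"), and that "%" and "!" really mean what the
splitting supposes.

```python
def split_alternatives(address):
    """Return (name, host) pairs for alternatives (a), (b) and (c).

    Assumes an address shaped like AAA!BBB!CCC%DDD@EEE; anything
    else is rejected rather than guessed at.
    """
    if "@" not in address:
        raise ValueError("no '@' in address")
    # (a) split at the last "@": everything before it is the name.
    name_a, _, host_a = address.rpartition("@")

    if "%" not in name_a:
        raise ValueError("no '%' in local part")
    # (b) split the remaining name at the last "%".
    name_b, _, host_b = name_a.rpartition("%")

    parts = name_b.split("!")
    if len(parts) < 2:
        raise ValueError("no '!' path in local part")
    # (c) the last "!" component is the user, the one before it the host.
    name_c, host_c = parts[-1], parts[-2]

    return {
        "a": (name_a, host_a),
        "b": (name_b, host_b),
        "c": (name_c, host_c),
    }
```

For the example address this gives the name "CCC" at the host "BBB"
for alternative (c).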

But my main problem is selection between alternatives (1) and
(2) above. Some arguments:

Arguments for (1):

This solution will not cause any problem if two hosts have the
same name (except if two people have both the same name and the
same host, but this has VERY low probability). Also, faulty path
information for one person will only affect the return path to
that person, not to anyone else. It will thus be easier to
create the data base automatically from paths in incoming
messages. With solution (2), a new return path for a host, taken
from a new incoming message, cannot immediately be stored for
that host, since a faulty path would then cause all messages to
that host to go wrong. Thus, solution (2) will require more
manual checking.

Arguments for (2):

The correct and proved path to a certain host need only be
stored once in the data base, and will then be available for all
messages to be sent to anyone at that host in the future.
Less information is duplicated in the data base with this
solution.

Note: We presently are using solution (1) and it works rather
well. But since we plan to re-write everything, we can decide
on solution (2) when we design the next version of the program.

What are your opinions? Please help me!

Margolin@MIT-MULTICS.ARPA (Barry Margolin) (09/19/84)

A problem with your alternatives (b) and (c) is that they violate the
sanctity of the local-part.  Your software that might try to recognize
          AAA!BBB!CCC%DDD@EEE
as meaning the user CCC on Usenet host BBB would be making possibly
incorrect assumptions about how EEE parses its local parts.  On a
suitably unusual computer "AAA!BBB!CCC%DDD" could be a local user name;
there is no way for your site to know what those special characters
mean to EEE.  If, perhaps, you know that EEE treats % in the local part
as synonymous with @, then you have the next level of assumption, about
how DDD parses ITS local-part, and then AAA.
                                        barmar
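
One conservative reading of this objection can be sketched in Python:
split only at the last "@", and look inside the local part only for
hosts that are known to treat % as synonymous with @. The function
name peel_known_hops and the set percent_hosts are hypothetical, and
"!" paths are deliberately left unparsed here.

```python
def peel_known_hops(address, percent_hosts):
    """Split an address without guessing at unknown local-part syntax.

    percent_hosts is a (hypothetical) set of hosts known to treat
    "%" in the local part like "@". The local part is left opaque
    for every other host.
    """
    if "@" not in address:
        raise ValueError("no '@' in address")
    local, _, host = address.rpartition("@")
    hops = [host]
    # Peel one "%" level at a time, but only while the current host's
    # convention is actually known.
    while host in percent_hosts and "%" in local:
        local, _, host = local.rpartition("%")
        hops.append(host)
    return local, hops
```

With an empty percent_hosts the whole of "AAA!BBB!CCC%DDD" stays as
the local part at EEE. With percent_hosts = {"EEE"} one level is
peeled, leaving "AAA!BBB!CCC" at DDD, and nothing is assumed about
how DDD parses it.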