kre@ucbvax.ARPA (Robert Elz) (08/03/85)
Addressing semantics present the most difficult problem currently facing us. It is also the most urgent to solve. I'm going to try to discuss this without resort to any magic symbols (! @ and the like), because as soon as some people see any of those symbols various preconceptions leap out at them, and they are prevented from seeing the real issues. It's not going to be easy, and I'm sure that at least a couple of times I'm not going to be able to do it.

What we have to decide here is what information must be present in a mail address in order to uniquely specify the intended recipient. First I'm going to assume a couple of things that I think are non-contentious in this forum (though they are by no means necessary).

A mail address is just a means of specifying a person (or program, or whatever) that should receive the mail. In a tiny system that's all that we would ever need. Since there is no practicable way that we can make these names unique worldwide, we add to this the notion of a "host" - the place where this person receives his mail. In a smallish network this would be all that would be needed; however, as the number of hosts grows, specifying names for them uniquely also becomes impossible.

Here is where the recent discussion has exposed two possible methods for proceeding. We can associate with each host some additional "attribute" that we must also specify, to make its name unique. (And as necessary, we can add more attributes to the host name, or we can successively qualify the attributes already given.)

Or we can group hosts into "lumps", and then specify the name of the host, and the lump to which it belongs. Again, we can bundle lumps of hosts together to make bigger lumps, and so on as necessary. To name a host we specify its name, and the names of all lumps it belongs to.
(Sometimes as a contraction we are permitted to omit the names of any lumps that the sender also belongs to, but this is just an abbreviated form of addressing, and like all abbreviations, should be avoided wherever possible.)

The former technique is apparently the one that is being considered (adopted?) by the ISO MHS committee. (See Mark Horton's article <1343@cbosgd.UUCP>.) [Note: I am not claiming that the MHS naming scheme is anything very closely related to the one I am about to describe - just that the principles are similar.] It is also the technique advocated in a series of articles by Peter Honeyman (<537@down.FUN>, <545@down.FUN>, <552@down.FUN>). The attribute he suggests that we should use is (I suspect) not the same one that the ISO people would adopt, but the essentials of the scheme are still there. The "attribute" used is the name of some other host that this one has a direct link to. If that doesn't prove to be unique, then the attribute host is further qualified by giving it a similar attribute.

The "lumps" scheme is the one commonly labelled "domains".

Before looking at the advantages & disadvantages of each of these, let's consider a small anecdote to put things in perspective. While I have been in the US, my employer has seen fit to change its telephone number. This means that I am going to have to have new business cards printed. On those cards I will include my paper-mail address, and my phone number (and conceivably the telex number as well, though no-one I know would even want to contact me that way). Each of these things can be specified in a way that they can be used anywhere. Wherever I am, I can give a card to someone I meet, and if they have access to the proper equipment they can use the address on my card to contact me. The addresses also remain constant - other than actions I may take (or my employer may take) they are unlikely to change for years.
It is vaguely possible that the body responsible for getting messages to me may need to change my address, but they do this very rarely. Certainly nothing any other user of similar addresses can do can affect my address, in any of these forms.

I would also like to include my electronic-mail address. To really be useful, this must meet the same criteria. It must be usable (as it stands) by anyone, anywhere. Of course, the situation in the e-mail world is not nearly this stable, but ideally it would be, and just perhaps we can help push things this way, by leading from the front.

Now, to look at attribute addresses, and domains.

1) Domain addresses can be lengthy - they include a lot of information that is, in most cases, redundant. In many cases, some of this excess information is nearly meaningless (it doesn't mean anything recognisable to the average observer; it's simply a magic incantation).

2) Attribute addresses can be shorter; only enough attributes to make the address unique need be specified. Of course, it's perfectly legal to overspecify an address, but there is (usually) no way to determine in advance exactly which attributes need to be specified, or even whether any current set of attributes will give a unique address at all, or whether a new one will need to be used.

Now let's examine how a name gets assigned in each of the two cases.

With domains, the maintainer of the domain name list is asked to add a specific (chosen) name to the list of names known in that domain. If the name chosen is unacceptable (either because it violates the appropriate syntax requirements, or because it duplicates an existing name, and hence would cause ambiguities), it is rejected, and the proposer would then choose another. Having reserved a name, nothing more need ever be done.

With attributes, each site simply chooses a name, and makes known some relevant attributes that apply to it.
Should the chosen name duplicate the name of an existing site, then the attributes are used to disambiguate. This is the one BIG disadvantage of this scheme: if my site had been the one whose name was (inadvertently) duplicated, then my address now needs an attribute that it didn't need before, and until I know the attributes of the new site, I can have no idea what new attribute people mailing to me must specify. That is, there's no way I can predict this and print it on my business cards.

To my mind, domain type addresses are the only ones that make sense. Except in the case that all of the currently 'top' level domains are bundled together, and placed in a new 'top' domain, addresses don't change once issued. And that action is one that is taken only by the people who manage the name space, and is likely to be taken very rarely, and with plenty of advance warning. This is similar to the procedures when the telephone company decides to change your area code, or when the post office (or whatever appropriate body it is) decides to rename the street that you live in. Those well publicised changes we can tolerate; unannounced changes occasioned by some new site joining the network are intolerable.

Now domains do have some problems. There has to be someone to co-ordinate names in each domain (some "authority"). See Henry Spencer's article <5836@utzoo.UUCP> in which he co-opts me as a volunteer to do this work (:-). Henry makes the point that volunteer labour isn't easy to get to do this task. Of course, "labour" isn't really needed; we have all this computing power just waiting to be used. No-one has ever said that the naming authority must be a "human". The task to be performed isn't overly onerous, and creating a program to receive mail from someone wanting to register a name, check the proposed name for syntax problems and potential clashes, and either add the name to the list, or reject it, is not something I would feel to be beyond my capabilities.
Neither is it a perpetual task. The fact that this has not yet (to my knowledge) been automated only testifies to the comparative simplicity of the task.

There is a third possibility for addressing (not mentioned above). An address could be specified by detailing the route used to get to the address. This would be something akin to specifying a postal address as "go north 3 blocks, take a left, continue 2 blocks, take a right, then veer right again past the big tree, continue till you see a supermarket on the right, then take the next left, and the third house on the right is it". Of course this presupposes that everyone starts from the same point, which we usually solve by making that point be some 'well known monument' and leaving it up to the individual to work out how to get from where he is to the monument. Sometimes we may even give routes from a few well known monuments, so people can pick one that they know, and that is close to where they are starting from.

These addresses are unambiguous, and remain constant (unless someone blocks off a street, or there's a traffic jam, or ..). Of course, it's inconceivable that anyone would actually choose a scheme like this for addressing .. or is it? (See the article <686@umd5.UUCP>.)
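
The automated naming authority described in this article (take a request, check the proposed name for syntax problems and clashes, then add it or reject it) can be sketched roughly as follows. This is only an illustration under stated assumptions: the syntax rule, the class name DomainRegistry, and the domain and site names are all invented here, and the mail-receiving front end is left out.

```python
import re

# Hypothetical syntax rule (not from the article): 1-14 characters, lowercase
# letters, digits and hyphens, starting with a letter.
NAME_RE = re.compile(r"[a-z][a-z0-9-]{0,13}")


class DomainRegistry:
    """Illustrative automated naming authority for a single domain."""

    def __init__(self, domain):
        self.domain = domain
        self.names = {}  # registered name -> contact who registered it

    def register(self, name, contact):
        """Handle one registration request; return (accepted, reason)."""
        if not NAME_RE.fullmatch(name):
            return False, "syntax"   # violates the syntax requirements
        if name in self.names:
            return False, "clash"    # duplicates an existing name
        self.names[name] = contact   # reserved; nothing more need be done
        return True, "ok"


registry = DomainRegistry("exampledomain")        # hypothetical domain
first = registry.register("alpha", "admin-a")     # new name: accepted
second = registry.register("alpha", "admin-b")    # duplicate: rejected
third = registry.register("Bad_Name!", "admin-c") # bad syntax: rejected
print(first, second, third)
```

Because a rejected proposer simply chooses another name, no existing address ever has to change, which is the property the business-card argument relies on.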
franka@mmintl.UUCP (Frank Adams) (08/07/85)
I would like to propose a UUCP naming scheme which would be simple to implement, yet deal with the need to supply a unique, unvarying address.

What I propose is to designate a few sites as "root" sites. Your full address is a route from any one root site to your host (and then to you). The requirements to make this work are threefold:

1) The names of root sites must be reserved; no other site may be permitted to adopt such a name.

2) Each host must know how to deliver mail to a root site. (This may mean requiring the user to prefix a route to the root site to the destination address with existing mailers.)

3) Each root site must know how to deliver mail to every other root site.

The only problem I see with these is how to designate the root sites. I suspect that about a dozen are sufficient, so this could be resolved in an ad hoc manner.

Some caveats: I do not mean that mail should be forced to follow the implied route specified by the address. The point of this scheme is that "dumb" mailers can follow a simple set of directions to forward mail, while "smart" mailers can reroute mail without error.

I have not addressed the issues involved in cross-net mail here, either. I have the impression that those problems are more syntactic than semantic. Whether the items in an address represent machines or domain names does not matter *for a mailer on another network*. How they are presented does.

I believe that this scheme avoids the danger of a "takeover" of the net, as well. It would be relatively simple, if such were attempted, to redesignate the root systems; all that is required is a check that their names are not duplicated.
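
A rough sketch of how "dumb" and "smart" mailers might treat such a root-relative address is given below. It is not taken from any existing mailer: the site names, the list representation of an address (root site first, user last), and the function names dumb_path and smart_path are all assumptions made for illustration.

```python
ROOT_SITES = {"root1", "root2"}  # hypothetical reserved root-site names


def dumb_path(route_to_root, address):
    """Prefix the locally known route to the root onto the address.

    route_to_root maps a root name to the hops from here up to and
    including that root; address is [root, host, ..., host, user].
    """
    root = address[0]
    if root not in ROOT_SITES:
        raise ValueError("address does not start at a root site: " + root)
    return route_to_root[root] + address[1:]


def smart_path(known_routes, address):
    """Shortcut to the host nearest the destination that we can reach."""
    # Scan the hosts from the destination back toward the root; the final
    # element of the address is the user, not a host, so it is skipped.
    for i in range(len(address) - 2, -1, -1):
        if address[i] in known_routes:
            return known_routes[address[i]] + address[i + 1:]
    raise LookupError("no known route to any host in the address")


address = ["root1", "hubsite", "leafsite", "user"]
routes = {"root1": ["neighbour", "root1"], "hubsite": ["hubsite"]}

dumb = dumb_path(routes, address)
smart = smart_path(routes, address)
print(dumb)
print(smart)
```

The address itself is the same in both cases; only the path the mail actually follows differs, which is the sense in which the route in the address need not be forced.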
henry@utzoo.UUCP (Henry Spencer) (08/11/85)
> Now domains do have some problems. There has to be someone to
> co-ordinate names in each domain (some "authority")... volunteer
> labour isn't easy to get to do this task... No-one has ever said that
> the naming authority must be a "human". The task to be performed isn't
> overly onerous, and creating a program to [handle name registration]
> is not something I would feel to be beyond my capabilities. Neither is
> it a perpetual task...

This actually just shifts the issue, to finding a volunteer to provide the machine time for the job. For name registration, this probably isn't too much of a problem (although Lauren could probably tell you some interesting stories about the legal aspects, e.g. pinheads who feel they have a divine right to use some specific name and threaten to sue the registry when they find the name is taken already...[I kid you not]).

It brings in a more troublesome issue, however. What does random site X do when a user there (or a machine it connects to) asks for mail transmission to site foo.bar, which X doesn't know about? Right: it punts the mail to the domain-administration site. Given the explosive growth rate of the network, how long will it be before that site is swamped and its sponsors get fed up and pull the plug? Having multiple administration sites for each domain only postpones the problem slightly.

Of course, site X "ought" to know about any site it talks to frequently, so that it doesn't need to hit the domain administrator every time. But this assumes that the dissemination of such knowledge can keep pace with the growth of the network, which is an *assumption*, not a self-evident fact. I'm afraid I have little confidence in it.

One idea which almost nobody has discussed, but which might really help, is to take the position that site X must *not* just forward the mail to the domain administrator. It must ask the domain administrator for the routing information, and then use that information *itself*.
This has the disadvantage that it slows down mail traffic considerably, but the advantage that it gives X considerable incentive to do the work locally if at all possible -- incentive that is lacking otherwise. If we really want domains to work, it is vitally important to do everything possible to limit the load on the administering sites.

Perceptive observers may have noticed that utzoo is a fairly obvious candidate as one of the domain-administration sites for eastern Canada, and that utzoo has not yet volunteered to do it. Don't hold your breath.
--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
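
The difference between the two policies above (punting the mail to the administrator versus asking it for a route) can be illustrated with a toy model. Everything here is an assumption for the sake of the sketch: the Administrator class, the route to foo.bar, and above all the use of bytes handled as a stand-in for load, which ignores the cost of each separate connection.

```python
class Administrator:
    """Toy domain-administration site that tallies the bytes it handles."""

    def __init__(self, routes):
        self.routes = routes      # site name -> route (list of hops)
        self.bytes_handled = 0    # crude, assumed measure of load

    def forward(self, site, message):
        """Policy A: site X punts the whole message to the administrator."""
        self.bytes_handled += len(message)
        return self.routes[site]  # the administrator relays it onward

    def lookup(self, site):
        """Policy B: site X asks for the route and sends the mail itself."""
        route = self.routes[site]
        self.bytes_handled += len(site) + sum(len(hop) for hop in route)
        return route


message = "x" * 2000  # stand-in for a 2000-byte mail message

admin_a = Administrator({"foo.bar": ["hub", "foo"]})
admin_a.forward("foo.bar", message)

admin_b = Administrator({"foo.bar": ["hub", "foo"]})
admin_b.lookup("foo.bar")

print(admin_a.bytes_handled, admin_b.bytes_handled)
```

Under this (assumed) measure the query is far cheaper for the administrator, at the price of an extra round trip before site X can send anything.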
tp@ndm20 (08/22/85)
>is to take the position that site X must *not* just forward the mail to
>the domain administrator. It must ask the domain administrator for the
>routing information, and then use that information *itself*. This has . . .
>to limit the load on the administering sites.

As somewhat of an add-on to Henry's idea, how about this: if a site cannot route a message, it asks the domain administrator for the proper route. This route, through an *automatic* mechanism, is entered into the requestor's database, so he will never have to ask about that route again. This provides an automatic way for routing databases to be updated. As I understand the idea of domains, these routes would not automatically disseminate, as the idea is to minimize any node's knowledge of the full configuration of the net. This scheme allows a node to keep track of only the sites he actually mails to. If the route ever fails, then the node can ask the domain administrator and get an updated route.

The problem with this whole line of reasoning is that it requires new software that is completely different from what is already in place. The mailer would have to know to ask for a route, and hold the message until it got one. It should also receive undeliverable mail, contact the domain administrator for a new route, and re-send it. It could be a long time before someone found out his mail was undeliverable.

Unless the routes given out by the domain administrator are kept around, the administrator will be plagued by route requests, which probably account for just as much load (if not more) as if it just forwarded the message. The catch is that if they are kept around, you never know when they become invalid.

Terry Poot
Nathan D. Maier Consulting Engineers
(214)739-4741
Usenet: ...!{allegra|ihnp4}!convex!smu!ndm20!tp
CSNET: ndm20!tp@smu
ARPA: ndm20!tp%smu@csnet-relay.ARPA
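
A minimal sketch of the route-caching mailer described above might look like this. The Mailer class and the two callables it is given (ask_admin, standing in for a query to the domain administrator, and send, standing in for an actual delivery attempt) are hypothetical; a real mailer would hold the message while waiting for the answer rather than calling a function.

```python
class Mailer:
    """Keeps only the routes it has actually needed, refreshed on failure."""

    def __init__(self, ask_admin, send):
        self.ask_admin = ask_admin  # site -> route (hypothetical admin query)
        self.send = send            # (route, message) -> True if delivered
        self.routes = {}            # local routing database, filled on demand

    def deliver(self, site, message):
        # Try the remembered route first, if there is one.
        if site in self.routes and self.send(self.routes[site], message):
            return True
        # No route yet, or the remembered one failed: ask the domain
        # administrator, remember the answer, and re-send once.
        self.routes[site] = self.ask_admin(site)
        return self.send(self.routes[site], message)


# Simulated network: one destination whose working route later changes.
current = {"foo.bar": ["hub", "foo"]}
queries = []


def ask_admin(site):
    queries.append(site)            # record each request to the administrator
    return list(current[site])


def send(route, message):
    return route == current["foo.bar"]  # only the current route delivers


mailer = Mailer(ask_admin, send)
ok1 = mailer.deliver("foo.bar", "first")    # no route known: one query
ok2 = mailer.deliver("foo.bar", "second")   # remembered route: no query
current["foo.bar"] = ["newhub", "foo"]      # the old route goes stale
ok3 = mailer.deliver("foo.bar", "third")    # failure triggers a second query
print(ok1, ok2, ok3, len(queries))
```

Note that the sketch only learns a route is invalid when a delivery fails, which is exactly the catch raised above: a failure that takes days to report means days before the stale entry is replaced.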