[comp.mail.misc] dynamic routing for UUCP mail

dhesi@bsu-cs.UUCP (Rahul Dhesi) (08/02/87)

Currently, UUCP mail routing has these possibilities:

o    The sending site (the person or the software) creates a route, and that
     route is adhered to by intermediate sites
o    Intermediate sites can choose to reroute (using smail's rerouting
     options) either only if the first host in the route is not recognized,
     or unconditionally

DISADVANTAGE:  Typically, UUCP sends the message to a complete route
when it leaves the site.  In general, following smail suggestions,
intermediate sites will not do any rerouting except when for some
reason they do not recognize the first host in the route.  Thus
intermediate sites will not try to improve the route because they
cannot know whether the user wants them to reroute the message or not.

Even if the message left the originating site without a complete route,
the very next site will either reject it (host not recognized) *or* use
smail to create a complete route -- so we are back to square one, where
a hard-coded route now exists and will be used by all other sites.

Now consider the following possibility for smail.  This assumes that all
UUCP sites are running smail (not currently true but feasible in the near
future).

o    The sending site finds the best route, of the form a!b!c!d!user,
     but simply sends the message to a!d!user

o    The forwarding site `a' now finds the best route; suppose it is 
     e!f!g!d!user (i.e. it could well be different from what the
     originating site calculated);  site `a' sends the message to
     e!d!user

o    If a forwarding site does not recognize the host in an address,
     it sends the message on to some other neighboring host that is
     believed to have a current pathalias database;  prior agreement
     would be needed to avoid loops

Each forwarding site thus receives a message with an address of the
form host!user, calculates the best route, but sends the message on
*without* including the entire route.

The result is that messages are dynamically routed, each host only
sending the message on in the correct general direction, without trying
to specify a full route.

However, if the user wishes to use a specific route, he or she simply
specifies the entire route.  When an intermediate site sees a route
rather than an address, i.e. one that includes more than one host name,
it does *not* recalculate the best route if at all possible but simply
sends the message to the next machine in the route.

ADVANTAGE:  Intermediate hosts can dynamically route messages using
their best current knowledge of their vicinity, but senders can still
choose to specify a complete route if they wish.

IMPLEMENTATION:  The change in smail would be quite simple:  Do all
routing/rerouting as usual, but just before handing outgoing messages
to uucp, check to see whether the incoming message was being sent to a
route (a!b!c!...!d!user) or an address (d!user).  If it was an address,
then strip everything between the first and last hosts in the
new route before invoking uucp.
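
A minimal sketch of that stripping step, in Python rather than in smail's
own code.  The function names are hypothetical, and it assumes the route
lookup has already produced a full bang path; it only shows the decision
made just before the message is handed to uucp.

```python
def is_route(address: str) -> bool:
    """True if the address names more than one host (a!b!...!d!user)."""
    return address.count("!") > 1


def outgoing_address(incoming: str, computed_route: str) -> str:
    """Return the address to hand to uucp under the proposed rule.

    incoming       -- address as it arrived, e.g. "d!user" or "x!y!d!user"
    computed_route -- full route from the usual lookup, e.g. "a!b!c!d!user"
    """
    if is_route(incoming):
        # The sender asked for a specific route: route as usual, no stripping.
        return computed_route
    parts = computed_route.split("!")
    if len(parts) <= 3:
        # Already of the form d!user or a!d!user: nothing to strip.
        return computed_route
    # Keep the first host, the last host, and the user.
    return "!".join([parts[0], parts[-2], parts[-1]])
```

With an incoming address of d!user and a computed route of a!b!c!d!user,
this hands a!d!user to uucp, so site `a' again sees a plain address.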

The only obvious problem I see would be caused by the existence of
duplicate host names;  but since the UUCP Project apparently accepts
map entries for hosts with conflicting names, I think it's a safe bet
that this is not a significant problem.  If you are sending mail to a
site whose name duplicates that of another, you would simply have to
specify a complete route.

Another subtle point might be the handling of a destination host that
is not in the UUCP map.  Mail sent to x!y!user might be treated as
being sent to a route, even if the sender really wanted to simply make
sure x!y is used (x being in the maps) but with normal rerouting until
x is reached.  This could possibly be achieved by interpreting x+y!user
to mean:  treat x+y as a single host name unless you are x, in which
case send the message on to y!user.  (But pathalias lookups would still
be for ...x!y).  Again the change in smail should be simple.  We
would have to see if the presence of `+' confuses UUCP.

QUESTION:  Are there any unexpected pitfalls in this?  Would there be
some interaction with sendmail that would prevent incremental routing
like this?
-- 
Rahul Dhesi         UUCP:  {ihnp4,seismo}!{iuvax,pur-ee}!bsu-cs!dhesi

ron@topaz.rutgers.edu (Ron Natalie) (08/04/87)

Dynamic routing where each host runs pathalias on the last host in the
path is wrong.  This can only lead to problems.  Mail should work as follows:

1. The originator (or his agent) should generate the path by pathalias
   or some other reasonable method.  By the originator this can mean
   either the user composing the message or the mail system the message
   is posted from.  By the "or his agent" I mean, for example, that certain
   machines can be clustered around one mail forwarding machine. 
   They post with an address "destination-host!user" and always send
   the message to the agent machine.  The agent then substitutes the
   pathalias path.

2. Other hosts passing the message if they do not know the next hop 
   should return an error message to the sender.

3. The one optimization that might be done is path compression...
    "a!b!c!dest!user" may be sent directly to "c" if it knows
    a route.

-------------------
If you are going to have dynamic rerouting of the addresses, as you
described, you must operate in the following method...

You get a message with the address

    a!b!c!d!e!f!dest!user

You are machine "c" and you know a better way to "dest" is through
the path "x!y", then you rewrite the address...

    a!b!c!x!y!dest!user

and send the message to "x".

If you have no idea how to get to "d" and want to send the message to
"z" which you think is running pathalias, then you should rewrite to

    a!b!c!z!d!e!f!dest!user

In each case, the sender should examine the hosts preceding it in the
uucp path and make sure that none of the hosts that it is inserting
in the new route are listed (which is evidence of a loop).  Never rewrite
the information before your own host name.
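
The two rewrites and the loop check can be sketched as follows.  This is an
illustration in Python, not code from any existing mailer; the function
names are hypothetical, and it assumes the full path (including the hosts
already traversed) is available as one bang-separated string, as in the
examples above.

```python
def _split_at(path, me):
    """Split a bang path into (hosts up to and including me, the rest)."""
    hops = path.split("!")
    if me not in hops[:-1]:
        raise ValueError(f"{me} does not appear as a host in {path}")
    i = hops.index(me)
    return hops[: i + 1], hops[i + 1 :]


def _check_loop(history, inserted):
    """Refuse to insert a host the message has already passed through."""
    seen = set(history)
    for host in inserted:
        if host in seen:
            raise ValueError(f"inserting {host} would create a loop")


def reroute(path, me, better_hops):
    """Replace the hops between me and the destination with better_hops."""
    history, remainder = _split_at(path, me)
    if len(remainder) < 2:
        raise ValueError("no destination host after " + me)
    _check_loop(history, better_hops)
    # Everything up to and including our own name is left untouched.
    return "!".join(history + list(better_hops) + remainder[-2:])


def detour(path, me, via):
    """Insert one extra host after me, keeping the rest of the path."""
    history, remainder = _split_at(path, me)
    _check_loop(history, [via])
    return "!".join(history + [via] + remainder)
```

For machine "c", reroute("a!b!c!d!e!f!dest!user", "c", ["x", "y"]) gives
a!b!c!x!y!dest!user, and detour("a!b!c!d!e!f!dest!user", "c", "z") gives
a!b!c!z!d!e!f!dest!user.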

Making sure this works will require specific changes to UUCP mailers
as well.  However, path databases on UUCP are not updated promptly
enough for any amount of intervention to guarantee that loops will
not form, so you need to have a procedure for bypassing them when they
do happen.

-Ron

john@xanth.UUCP (John Owens) (08/04/87)

In article <915@bsu-cs.UUCP>, dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
> Now consider the following possibility for smail.  This assumes that all
> UUCP sites are running smail (not currently true but feasible in the near
> future)

Sorry, but I don't think so.  There are plenty of sites that are
well-maintained and managed, with reasonable system managers who are
interested in communicating with the rest of the world, and are still
not running smail or anything like it.  It is not likely that the vast
majority of sites who do not have these advantages will ever do so.

It's an important design goal (as far as I can tell) of smail and the
UUCP Maps that dumb uucp sites be able to be used as routes.

> The only obvious problem I see would be caused by the existence of
> duplicate host names;  but since the UUCP Project apparently accepts
> map entries for hosts with conflicting names, I think it's a safe bet
> that this is not a significant problem.

This is still a problem.  Although the UUCP Project will reject an
entry for a name that already exists, there's no way they can keep
people from listing some machine in their maps that doesn't have a map
entry but duplicates a registered site.  (There's no way to tell that
the site listed isn't the "real" one.)  Case in point - look at our
map entry (.odu.edu); we talk to a machine "popeye", but it's not the
one at AT&T registered with the UUCP Project.  We just put a comment
to that effect and do *not* list this "popeye" in our connectivity
data, but more naive people would readily do so.

> QUESTION:  Are there any unexpected pitfalls in this?

Yes - I think it's unworkable to do something that is incompatible
with the rest of the world without being very careful.  For example:

	you!me!someoneelse!x+y!user

Let's say you and me and x and y are all "smart" about your "+" idea,
but someoneelse isn't.  It tries to uux x+y!rmail user and it fails.
Or it's too smart for its own good and maybe the mail gets through but
the To: address has been rewritten to
	To: user@x+y.UUCP
or something like this.

In summary, I don't see that the disadvantages are troublesome enough
or the advantages are worthwhile enough for the headaches this will
cause.

Thanks for the suggestion, though - we wouldn't be where we are now
without suggestions and discussions like these!


-- 
John Owens		Old Dominion University - Norfolk, Virginia, USA
john@ODU.EDU		old arpa: john%odu.edu@RELAY.CS.NET
+1 804 440 4529		old uucp: {decuac,harvard,hoptoad,mcnc}!xanth!john

peter@ethz.UUCP (Peter Beadle) (08/05/87)

Introduction: I manage mail on a cluster of 5 vaxen, 36 sun3's and 2
symbolics work stations with about 12 links and gateways to 3 disparate
networks.  Internal mail is via HoneyDanBer UUCP on ethernet.  External
mail is via uucp, ACSnet and x400. We threw away sendmail about a year
ago and replaced it with upas (which at least has human readable
rewrite files and a name server). At the same time we forcibly switched
to domain addressing for uucp.

Debate: I have never seen smail, we can't run it for political reasons
(Switzerland lacks a top level domain name and someone to administer the
data base) so I think I am eminently qualified to comment about it.

The problem is that people still can't separate the network transport
level from the mail system. UUCP and UUX (which actually does the work)
copy a file from one unix machine to another and optionally run a
command on the remote machine. That is all they do. Someone a long time
ago thought "why don't I run the mail program on the remote machine so
I can mail between machines". This worked fine at Bell Labs with about
the same number of machines we have (43). Unfortunately the world is a
big place.

More than 10 years on, smail and the whole uucp map project is an
attempt to introduce a routing layer between the transport layer
(uucp) and the network application (mail). Hopefully one day it will
extend to file transfer and remote job execution like some other more
modern systems.

Smail replaces the rmail link to the mail program with a system that
takes the address on the incoming mail and returns a path route to the
system the mail is destined for from the current host.  A "sensible"
system would then send the mail with its address unchanged to the first
host in the path route where the process would be repeated.

The mail you send to eiger!peter will remain addressed to eiger!peter
even though smail says to route it to
seismo!mcvax!cernvax!ethz!eiger!peter. It will be dispatched from your
system with  a call to uux like:

	uux - -a dhesi seismo!rmail \(eiger!peter\)
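
As an illustration only, the same dispatch could be assembled like this in
Python.  The helper name is hypothetical; the arguments simply mirror the
command above, and nothing here is taken from upas or smail.

```python
def uux_command(sender, first_hop, address):
    """Build the argument list for the uux call shown above.

    The address is passed through untouched; only the first hop of the
    computed route is used to pick the machine that runs rmail.
    Handing this list directly to a process launcher (no shell) means
    the backslashes in front of the parentheses are not needed.
    """
    return ["uux", "-", "-a", sender, f"{first_hop}!rmail", f"({address})"]


def first_hop(route):
    """First host of a computed route such as seismo!mcvax!...!peter."""
    return route.split("!", 1)[0]
```

For example, uux_command("dhesi",
first_hop("seismo!mcvax!cernvax!ethz!eiger!peter"), "eiger!peter") yields
the arguments of the command above, with the address still eiger!peter.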

We have had a system similar to this working for about 1 year. Explicit
routes go through unchanged so the problem of unregistered machines and
multiple machine names goes away. They are disambiguated by using an
explicit route from a registered host (ethz!tardis!peter works even
though tardis is a machine in southern California, not Zuerich
Switzerland).

No additional meta characters are introduced.

Routing round dead links becomes easy because you don't have to rip
apart the address to do it.

The only problem of course is that routing has to be done at each node
and each node then has to know at least the first step of how to get to
every other node. This problem was solved long ago with hierarchical
domain addresses such as rfc822.

We finished the system by encouraging people to use rfc822 addresses
instead of ! addressed uucp so we could have minimal routing
information at each node. Of the 3611 remote delivery mail items sent
from our gateway machine since July 7, only 1818 have been ! addressed
(90% of the ! addressed messages were generated by the nameserver data
base because local ! addresses are marginally faster than rfc
addresses).

Some questions:

1. Is what I have described the original intent of the uucp map/smail
project?

2. If it wasn't the original intent, what was?

Thanks to Rahul Dhesi for raising the problem.

Peter Beadle

davidsen@steinmetz.steinmetz.UUCP (William E. Davidsen Jr) (08/05/87)

You propose that smail be changed to "send things in the correct general
direction." I'm all in favor of the idea, but it's far easier to write a
filter and run it on the output of the pathalias database.

I'll do it:
	sed 's/!.*!/!/' fullnames >right.direction

I use something like that for my database.
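
For readers without sed to hand, a Python equivalent of that substitution
is sketched below.  The format of the `fullnames' file is not given, so the
sample strings are assumptions.  Note that the pattern is greedy: it keeps
only the first host and whatever follows the last `!', so the destination
host is dropped as well.

```python
import re


def collapse(line):
    """Same effect as sed 's/!.*!/!/' on one line.

    The greedy match runs from the first `!' to the last `!', and only
    the first match on the line is replaced (sed without the g flag).
    """
    return re.sub(r"!.*!", "!", line, count=1)
```

On a path such as a!b!c!d!user this returns a!user, not a!d!user, and a
line with fewer than two bangs is left alone.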


-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {chinet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

zemon@felix.UUCP (Art Zemon) (08/06/87)

In article <13680@topaz.rutgers.edu> ron@topaz.rutgers.edu (Ron Natalie) writes:
>You get a message with the address
>
>    a!b!c!d!e!f!dest!user
>
>You are machine "c" and you know a better way to "dest" is through
>the path "x!y", then you rewrite the address...
>
>    a!b!c!x!y!dest!user

Please don't do this.  "Dest" may not be a unique name.  If
it isn't and you send it to the wrong one, I'll be pretty
upset.

I may have specified d!e!f!dest because I know that "dest"
is a little machine which only talks to "f" and no one else
knows about it.

Cheers,
--
	-- Art Zemon
	   FileNet Corporation
	   Costa Mesa, California
	   ...!hplabs!felix!zemon

henry@utzoo.UUCP (Henry Spencer) (08/06/87)

> ... There are plenty of sites that are
> well-maintained and managed, with reasonable system managers who are
> interested in communicating with the rest of the world, and are still
> not running smail or anything like it...

It is worth reminding people that this is not (always) due to problems
like lack of time to install things, or lack of suitable software to
install, or sheer sloth.  A case in point is utzoo:  we decided long ago
that *as a matter of policy* we would not offer a free routing service
to the world.  So any mail that arrives here for relaying onward had better
have a destination address that starts with "neighbor!", where "neighbor"
is one of our immediate neighbors.  (This is distinct from the issue of
what interface we offer our *own* users for sending mail.)
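
The relaying rule amounts to something like the following Python sketch.
It is not utzoo's actual software; the function name and the neighbor
names used in the example are made up.

```python
def relay(address, neighbors):
    """Dumb but predictable relaying.

    Accept the message only if the address starts with "neighbor!",
    where neighbor is an immediate neighbor.  Returns (neighbor, rest)
    for forwarding, or None if the message should be refused.
    No route lookup and no rewriting is attempted.
    """
    first, sep, rest = address.partition("!")
    if sep and rest and first in neighbors:
        return first, rest
    return None
```

With neighbors {"nbr1", "nbr2"}, relay("nbr1!far!user", ...) forwards
far!user to nbr1, while relay("far!user", ...) is refused.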

(People may be interested in the reasons for the policy.  There are two.
First, we are a rather well-known site with finite phone budgets, and we
don't want people saying "it's going to Canada, so just send it to utzoo
and let them sort it out".  Second, we have a long-standing preference
for software whose behavior is dumb but predictable, as opposed to smart
but nondeterministic.  Proper functioning of any multi-hop network depends
on at least semi-predictable behavior by relay sites; having had many
frustrating experiences in trying to push mail through "smart" relayers
elsewhere, we have opted for relaying stupidly but reliably.)
-- 
Support sustained spaceflight: fight |  Henry Spencer @ U of Toronto Zoology
the soi-disant "Planetary Society"!  | {allegra,ihnp4,decvax,utai}!utzoo!henry

kyle@xanth.UUCP (Kyle Jones) (08/07/87)

The problem with just sending the mail in "the right direction" is that every
host has its own idea about what the right direction is.  For some hosts the
right direction is always going to be the cheapest link that they talk to,
whether the mail gets any "closer" to its destination or not.  Loops can be
avoided by careful checking of the From_ line but isn't it just as bad to have
your mail sent through 450 sites before it reaches its destination?

Another drawback to dynamic routing (as presented in <915@bsu-cs.UUCP>) is
that a lot of CPU time is wasted by having every site along the way look up a
path, only to throw away all but the first hop.

The ultimate problem is of course to get all UUCP sites to run smail or an
equivalent.  In the words of James Gosling, "Not bloody likely."

earle@jplopto.uucp (Greg Earle) (08/08/87)

Once upon a time there was no `smart' rerouting, and the UUCP mail world was
without form and light :-).  Then came `uumail' and `pathalias' and suddenly
new vistas emerged.  Slowly but surely, `smail' then arrived, and soon
thereafter we all suddenly became enlightened to the fact that doing `fascist'
automatic re-routing of mail was, perhaps, Not A Good Thing.  Many examples
and counter-examples have been given.

Now consider that we have reached a Catch-22: the general consensus has been
reached that if an incoming `rmail' invocation contains a qualified bang
path, then one should not re-route it, and we should assume that the 
raison-d'etre for the full bang path was that the originator knew which
route he/she wished the mail to take.  Meanwhile, I was just struck with
the remembrance that one of the reasons for the development of smart
mailers in the first place was to get around the fact that replies to news
articles were using the incoming Path: to generate the return path, which
thus occasionally became many, many hostnames long!!  Using smart mailers
enabled using the From: line to generate single RFC822 `user@host.UUCP' or
`user@some.do.MAIN' paths.  Now, the sentiment for `leaving bang paths alone'
is all well and good; but given that replies to Usenet news articles probably
account for a healthy proportion of mail traffic, it seems that taking this
attitude isn't really doing any good in the long run.  6 of one, half a dozen
of the other as the saying goes ...

No, I don't have any solutions, either.  Just thought I'd point this out.

	Greg Earle		earle@jplopto.JPL.NASA.GOV
	(formerly of) JPL	earle%jplopto@jpl-elroy.ARPA	[aka:]
	(currently gainfully	earle%jplopto@elroy.JPL.NASA.GOV
	    unemployed)		seismo!cit-vax!elroy!smeagol!jplopto!earle

egisin@waterloo.edu (Eric Gisin) (08/08/87)

|Now consider that we have reached a Catch-22: the general consensus has been
|reached that if an incoming `rmail' invocation contains a qualified bang
|path, then one should not re-route it, and we should assume that the 
|raison-d'etre for the full bang path was that the originator knew which
|route he/she wished the mail to take.  Meanwhile, I was just struck with
|the remembrance that one of the reasons for the development of smart
|mailers in the first place was to get around the fact that replies to news
|articles were using the incoming Path: to generate the return path, which
|thus occasionally became many, many hostnames long!!  Using smart mailers
|enabled using the From: line to generate single RFC822 `user@host.UUCP' or
|`user@some.do.MAIN' paths.  Now, the sentiment for `leaving bang paths alone'
|is all well and good; but given that replies to Usenet news articles probably
|account for a healthy proportion of mail traffic, it seems that taking this
|attitude isn't really doing any good in the long run.  6 of one, half a dozen
|of the other as the saying goes ...
|
|No, I don't have any solutions, either.  Just thought I'd point this out.

Is there any need to support the reply-to-Path mechanism of news anymore?
I'm assuming sites without a smart mailer are small and have one uucp link
to the outside world and a few local links. Is that reasonable?
I think anyone with a number of long-distance links would install
a smart mailer for the obvious economic benefits.

If my assumption is correct, a simple mailer could be put into
future news readers that used the From: address. It would only have
to recognize local domains (or local-host.uucp), and pass everything
else to link-to-rest-of-world!domain!user. Of course that
assumes link-to-rest-of-world, or someone down the way, is running
smail or some other domain mailer.
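
A sketch of such a simple mailer's address handling, in Python.  The
function name, the gateway name and the sample domains are hypothetical,
and delivery inside the local domains is left to whatever the site already
does.

```python
def reply_address(from_addr, local_domains, gateway):
    """Turn a From: address into something a site without a smart
    mailer can send.

    from_addr     -- address from the From: line, e.g. "user@some.domain"
    local_domains -- lowercase set of domains handled locally
    gateway       -- the one link to the rest of the world
    """
    if "@" not in from_addr:
        return from_addr  # not a domain address: leave it alone
    user, _, domain = from_addr.rpartition("@")
    if domain.lower() in local_domains:
        return from_addr  # local: hand to normal local delivery
    # Everything else goes to the gateway, which is assumed to run
    # smail or some other domain mailer.
    return f"{gateway}!{domain}!{user}"
```

For instance, with gateway "gw" and local domain "here.uucp", a reply to
someone@far.edu is sent to gw!far.edu!someone.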

I'm probably being overly optimistic that most major sites
are running a domain mailer.

gnu@hoptoad.uucp (John Gilmore) (08/12/87)

egisin@waterloo.edu (Eric Gisin) wrote:
> Is there any need to support the reply-to-Path mechanism of news anymore?
> I'm assuming sites without a smart mailer are small and have one uucp link
> to the outside world and a few local links. Is that reasonable?

No, it isn't.  The mail routing software is one more major subsystem
that requires ongoing maintenance.  I don't run it because I want to
have time for development rather than system administration.  The
occasional 30 seconds I take to look up some site's map entry with a
grep script is nothing compared to the time required to
configure, install, and deal with a major piece of software.

As postmaster on hoptoad I regularly catch mail that "smart" mailers have
misdirected, and send mail to the postmaster of the site where it probably
got munged.

My sense is that many of the sites running smail are either large sites
with somebody permanently assigned to doing 'system administration', or
small sites where the owner finds it easier to maintain the software than
to understand the connectivity of the Usenet :-).

> I think anyone with a number of long-distance links would install
> a smart mailer for the obvious economic benefits.

Uh, what obvious economic benefits?  I'll be glad to hear of 'em, but
I already know which sites are local to me and how to route through them.
I also know how to be a good neighbor and send big stuff via paths that
the sender or recipient pays for, rather than the intermediate nodes.
I don't think smail is that smart yet.
-- 
{dasys1,ncoast,well,sun,ihnp4}!hoptoad!gnu	     gnu@postgres.berkeley.edu
Alt.all: the alternative radio of the Usenet.