[comp.mail.misc] Getting the most out of pathalias

msir@uhura.cc.rochester.edu (Mark Sirota) (01/06/89)

I suspect that I'm not getting as much out of pathalias as I could be.  I
downloaded and unshar'ed the files from comp.mail.maps, and ran pathalias
on them.  However, is there some way that I'm supposed to use arpatxt to
integrate my host table into that?

The documentation suggests that pathalias may sometimes come up with mixed
addresses (i.e. host!user@host), but the way I've done it, it will only
come up with pure UUCP addresses (since that's all that's in the maps,
right?)

Am I misunderstanding something?  Am I missing something?  What's the
*right* way to use pathalias?  (Right now, I'm just using it as a database
for looking up routes; I've written another little program to do so.)  Can
it somehow be automagically integrated into the mailer/sendmail or
something?

What I'm trying to ask is, "What's the Right Thing to do with pathalias?"
-- 
Mark Sirota - University of Rochester, Rochester, NY
 Internet: msir@cc.rochester.edu
 Bitnet:   msir_ss@uordbv.bitnet
 UUCP:     ...!rochester!ur-cc!msir

fair@Apple.COM (Erik E. Fair) (01/06/89)

arpatxt is a waste of time; you have to maintain a list of duplicate
UUCP and domain *component* names to prevent mis-routing when you try
to use arpatxt (tedious manual work). Also, NIC HOSTS.TXT will
eventually go away, and then where will you be? I've got a better hack
which I will describe below.

pathalias tries really hard to avoid hybrid addresses. In practice, I
never see any out of the UUCP maps, in part because of the rules that
the UUCP map coordinators apply when massaging the data before posting
it (I know; I'm the Northern California coordinator), and because
pathalias assigns DEAD (or worse) to anything that would generate that
kind of address.

The right thing to do with the pathalias output (all those paths) is to
build a dbm(3) database with them, and add the SEISMO/UUNET patch to
your sendmail source. It adds one new operator, ${name$}, with some
slightly strange semantics: it looks up that name in a dbmified path
database and, if successful, replaces it with the path found. You then
have to run the result through some cleanup rules, but it works quite
well, and it's an absolutely minimal intrusion into the sendmail
sources (it was written for 5.54 sendmail and slid right into 5.59 with
no change, to give you an idea).
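
As a rough illustration of what such a path database holds, here is a
minimal sketch in Python (not the sendmail patch itself). It assumes the
default pathalias output of one "host<TAB>route" pair per line, with %s
marking where the remaining address goes. A plain dict stands in for the
dbm file, and the names load_paths and expand are hypothetical.

```python
def load_paths(lines):
    """Parse pathalias output lines of the form 'host<TAB>route'."""
    paths = {}
    for line in lines:
        fields = line.split()
        if len(fields) == 2:          # skip blank or malformed lines
            paths[fields[0]] = fields[1]
    return paths


def expand(paths, host, user):
    """Return the bang path for user at host, or None if host is unknown."""
    route = paths.get(host.lower())
    if route is None:
        return None
    # Substitute the rest of the address for the %s placeholder.
    return route.replace("%s", user, 1)


sample = ["apple\tsun!apple!%s", "bletch\tsun!bar!bletch!%s"]
paths = load_paths(sample)
print(expand(paths, "bletch", "user"))   # sun!bar!bletch!user
```

A real installation would keep the table in a dbm file so that sendmail
does not have to reread the whole path list for every message.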

The effect of this here (with the particular way I've written the
sendmail config) is that if someone (here or elsewhere through here)
uses a path:

	foo!bar!bletch!user	(or bar!bletch!user@foo.UUCP)

sendmail will:

1) look up foo in class U (our UUCP neighbors) and send it on
	as normal if it's in there (no different than standard UUCP mail).

2) look up foo in the pathalias generated database and use the path
	found therein (special parsing goes on here for paths that begin:

		do.main!frob!nitz!user

	they get turned into:

		frob!nitz!user@do.main

	and sent on their merry Internet way (we're on the Internet,
	so this works; generation of such paths described below)).

3) failing 1 & 2, I error back thusly (with the #error mailer):

	Unknown UUCP host - not in the UUCP maps and not one of our
	UUCP neighbors

I do not believe that "rerouting" is a good idea, so I don't do it.
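
The three-step decision above can be sketched as follows. This is a
simplified Python illustration, not the actual sendmail rules: the
function name route, the neighbors set (standing in for class U) and the
paths dict (standing in for the dbm database) are all assumptions made
for the example.

```python
UNKNOWN = ("Unknown UUCP host - not in the UUCP maps and not one of our "
           "UUCP neighbors")


def route(address, neighbors, paths):
    """Pick a delivery for a bang path, returning (mailer, next_hop, rest)."""
    first, _, rest = address.partition("!")
    first = first.lower()
    # 1) direct UUCP neighbor: hand it over unchanged
    if first in neighbors:
        return ("uucp", first, rest)
    # 2) look the first host up in the pathalias-generated table
    path = paths.get(first)
    if path is None:
        # 3) neither a neighbor nor in the maps
        return ("error", None, UNKNOWN)
    full = path.replace("%s", rest, 1)
    hop, _, remainder = full.partition("!")
    if "." in hop:
        # do.main!frob!nitz!user -> frob!nitz!user@do.main
        return ("smtp", hop, remainder + "@" + hop)
    return ("uucp", hop, remainder)
```

Note that only the first host of the address is ever looked up; the rest
of the path the sender supplied is passed along untouched, which is what
"no rerouting" means here.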

Since I like the Internet, I use it as much as I can. To encourage
such usage (and cut back on the length of the UUCP paths generated by
pathalias) I run pathalias twice over the UUCP maps before generating
the dbm database used by sendmail. First pass is just for the raw
maps. Then I run the pathalias output through an awk script called
"mkglue" that generates a sizeable glue file. Then run pathalias again
over the raw maps with the glue file appended to the input.

The mkglue script walks through pathalias output looking for Internet
domain names (legal ones; I filter by top level domain name) and then
does two things: it reverses the equivalence statements, e.g.

	apple= apple.com

becomes

	apple.com= apple

and after all of these have been found, it produces a completely
connected network of the domain names, e.g.

	INTERNET= {
		apple.com,
		rutgers.edu,
		ucbvax.berkeley.edu,
		} (DEMAND+LOW)

The effect of this is that the paths generated tend to have their first
hop be Internet, rather than UUCP. It also causes UUCP references to
Internet sites to go Internet (if they declared the equivalence [NOT a
domain gateway] in the UUCP maps). It about halves the average path
length.
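
For readers who do not speak awk, the core of that transformation can be
sketched in Python. This is a cut-down illustration under stated
assumptions, not a port of mkglue: it handles only the simple case of a
dotted name whose route ends in "uucpname!%s", it uses a short sample
list of top-level domains, and the name make_glue is made up for the
example.

```python
TOP_LEVEL = {"arpa", "com", "edu", "gov", "mil", "net", "org"}


def make_glue(path_lines, cost="DEMAND+LOW"):
    """Build pathalias glue text from first-pass pathalias output."""
    reversed_eq = []
    internet = []
    for line in path_lines:
        fields = line.split()
        if len(fields) != 2:
            continue
        name, path = fields
        # Skip domain gateways (".bar.com") and plain UUCP names.
        if name.startswith(".") or "." not in name:
            continue
        if name.rsplit(".", 1)[1] not in TOP_LEVEL:
            continue
        hops = path.split("!")
        if len(hops) < 2:
            continue
        uucpname = hops[-2]           # the host just before the %s
        if uucpname == name or "." in uucpname:
            continue
        reversed_eq.append(name + "=" + uucpname)
        if name not in internet:
            internet.append(name)
    out = list(reversed_eq)
    out.append("INTERNET={")
    out.extend("\t" + host + "," for host in internet)
    out.append("\t}(" + cost + ")")
    return "\n".join(out)
```

The resulting text is what gets appended to the raw maps for the second
pathalias run.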

One other thing that I've tickled my sendmail config into doing
without adding a rule-per-site is domain forwarding with a naming hack.
It should be pretty clear from the config (which I will post
separately) what's going on.

I don't really have the time to answer detailed questions by mail;
Apple is keeping me *very* busy. Experiment with the stuff I post as
you please - if you discover a bona fide problem, I'd love to hear
about it, but it's all strictly "as is", caveat emptor.

	Erik E. Fair	apple!fair	fair@apple.com

fair@Apple.COM (Erik E. Fair) (01/06/89)

#!/bin/awk -f
# MKGLUE: UUCP map post processor
# Idea from Mel Pleasant via Eliot Lear
# Erik E. Fair <fair@apple.com>, August, 1988
#
# What we have here is a UUCP map postprocessor. To use:
#	pathalias uucpmaps > /tmp/paths.raw
#	mkglue /tmp/paths.raw > /tmp/glue
#	pathalias uucpmaps /tmp/glue > /tmp/paths.refined
#	do whatever you do with the maps here
#
# what this does is find Internet EQUIVALENCES for UUCP sites, e.g.
#
#	ucbvax=	ucbvax.berkeley.edu
#	apple= apple.com
#
# and then it reverses them, and puts all the domain names it finds into
# a completely connected network called "INTERNET", with COST defined
# below. That cost was determined experimentally on a Cray X/MP-48
# (pathalias will run on such a beast. It takes only 24 seconds to
# process all the maps and the glue file. It's amazing what you can do
# with a supercomputer). Your mileage may vary.
#
# The effect of this is to cause nearly all your paths to take their
# first hop through the Internet. DO NOT USE THIS POSTPROCESSOR, unless
# you're actually on the Internet, or you have multiple UUCP neighbors
# who are on the Internet of equivalent call cost to you.
#
# This script will NOT do anything with domain gateway declarations, e.g.
#
#	foo	.bar.com
#
# because these do not provide a mapping between the Internet name and
# the UUCP name of the UUCP host involved. This script makes no
# distinction between "real" Internet hosts and "fake" (MX'd) ones (how
# can I? The information isn't there). Even with an MX host, someone on
# the Internet is accepting mail for them (that's what MX is all about).
#
# Encourage your Internet friends and neighbors to put all the right
# information into the UUCP maps.
#
# Also, your mailer must be able to transform thusly:
#
#	do.main!foo!bar!bazz -> foo!bar!bazz@do.main
#
# since that's what the database will generate. I do it with sendmail,
# and I installed the uunet hacks to 5.59 sendmail to look stuff up in a
# DBM database. I expect that the IDA sendmail stuff can be similarly
# coerced to do this.
#
# If nothing else, you might find the report at the end of the glue file
# interesting.
#
BEGIN{
	COST = "DEMAND+LOW";
#
	domain["arpa"] = 1;
	domain["com"] = 1;	domain["gov"] = 1;
	domain["mil"] = 1;	domain["org"] = 1;
	domain["edu"] = 1;	domain["net"] = 1;
#
#	domain["ac"] = 1;	these aren't really real
#	domain["co"] = 1;
#	domain["gs"] = 1;
#	domain["or"] = 1;
#	domain["re"] = 1;
#
	domain["ar"] = 1;
	domain["at"] = 1;
	domain["au"] = 1;
	domain["ca"] = 1;
	domain["ch"] = 1;
	domain["cl"] = 1;
	domain["de"] = 1;
	domain["dk"] = 1;
	domain["du"] = 1;
	domain["es"] = 1;
	domain["fi"] = 1;
	domain["fr"] = 1;
	domain["il"] = 1;
	domain["is"] = 1;
	domain["jp"] = 1;
	domain["kr"] = 1;
	domain["my"] = 1;
	domain["nl"] = 1;
	domain["no"] = 1;
	domain["nz"] = 1;
	domain["pt"] = 1;
	domain["se"] = 1;
	domain["sg"] = 1;
	domain["uk"] = 1;
	domain["us"] = 1;

	nbad = 0;
	imon_inet = 0;
}

# ignore domain gateways (no clean mapping - we must know the internet name)
/^\./ {next}

$2 == "%s" {
# hopefully only one of these
	if ( $1 !~ /\./ ) {
		localuucpname = $1;
		next;
	}
}

# here's the meat of the matter - find real domains and reverse the
# equivalences so that pathalias will give us paths with internet
# names in them.
$1 ~ /\./ {
	hostname= $1;
	curbad = 0;
# check top of domain name for validity
	i = split(hostname, parts, ".");
	top = parts[i];
	if (domain[top] != 1) {
		printf("# bad domain - %s\n", hostname);
		badtop[top]++;
		nbad++;
		curbad = 1;
	} else domtop[top]++;
	n = split($2, path, "!");
	if (n > 1) {
		uucpname= path[n - 1];
		if (hostname == uucpname)
			next;
# skip two sided dot aliases
		i = split(uucpname, parts, ".");
		if (i < 2) {
			if (! curbad) {
				print hostname "=" uucpname;
				internet[hostname]++;
			}
		} else if (domain[parts[i]] == 1) {
			print uucpname "=" hostname;
			internet[uucpname]++;
		}
	} else if ($2 == "%s") {
		if (imon_inet && localuucpname != "" && !curbad) {
			print localinetname "=" localuucpname;
			internet[localinetname]++;
		}
		if (!curbad) {
			localinetname= $1;
			internet[localinetname]++;
			imon_inet++
		}
	}
}

# now create a completely connected network of the domain names,
# with a low cost, so that we mostly use the Internet in preference
# to any other path
END{
	if (imon_inet) {
		print localinetname "=" localuucpname;
	}
	print "INTERNET={"
	for(hostname in internet) {
		printf("\t%s,\n", hostname);
	}
	printf("\t}(%s)\n", COST);
#
# report on what we found while perusing the map data
#
	printf("# top level domains\n");
	for(top in domtop) {
		printf("#\t%s\t%d\n", top, domtop[top]);
	}
#
	if (nbad > 0) {
		printf("\n# unrecognized summary:\n");
		for(dom in badtop) {
			printf("#\t%s\t%d\n", dom, badtop[dom]);
		}
	}
}

fair@Apple.COM (Erik E. Fair) (01/06/89)

*** sendmail/src/sendmail.h.orig	Wed Jul 23 18:27:00 1986
--- sendmail/src/sendmail.h	Thu Feb 26 18:46:07 1987
***************
*** 321,326 ****
--- 325,336 ----
  # define HOSTBEGIN	'\035'	/* hostname lookup begin */
  # define HOSTEND	'\036'	/* hostname lookup end */
  
+ #if defined(DBM) && defined(UUCPDOMAIN)
+ /* brace characters for uucp name lookup */
+ # define UUCPBEGIN	'\016'	/* uucpname lookup begin */
+ # define UUCPEND	'\017'	/* uucpname lookup end */
+ #endif
+ 
  /* \001 is also reserved as the macro expansion character */
  /*
  **  Information about hosts that we have looked up recently.
***************
*** 531,536 ****
--- 543,549 ----
  EXTERN char	*mxhosts[MAXMXHOSTS+1];	/* for MX RRs */
  EXTERN char	*TrustedUsers[MAXTRUST+1];	/* list of trusted users */
  EXTERN char	*UserEnviron[MAXUSERENVIRON+1];	/* saved user environment */
+ EXTERN char	*PaliasFile;	/* location of pathalias file */
  /*
  **  Trace information
  */
*** sendmail/src/daemon.c.orig	Wed Jul 23 18:27:01 1986
--- sendmail/src/daemon.c	Wed Mar 18 13:18:36 1987
***************
*** 604,606 ****
--- 607,712 ----
  }
  
  #endif DAEMON
+ 
+ #if defined(DBM) && defined(UUCPDOMAIN)
+ /* XXX */
+ #ifdef sun
+ #undef NULL
+ #endif
+ #include <dbm.h>
+ /* XXX */
+ #ifdef sun
+ #undef NULL
+ #define	NULL	0
+ #endif
+ 
+ /*
+ **  PATHALIAS -- return the shortest path to this host.
+ **
+ **	Parameters:
+ **		hbuf -- a buffer containing a hostname.
+ **
+ **	Returns:
+ **		The shortest known path to this host.
+ **
+ **	Side Effects:
+ **		none.
+ */
+ datum
+ pathalias(hbuf)
+ register char *hbuf;
+ {
+ 	register char *bptr;
+ 	register datum hdatum;
+ 
+ 	hdatum.dptr = hbuf;
+ 	hdatum.dsize = strlen(hbuf) + 1;
+ 	hdatum = fetch(hdatum);
+ 	if (hdatum.dptr != NULL)
+ 		if ((bptr = rindex(hdatum.dptr, '%')) == NULL ||
+ 		    *(bptr + 1) != 's')
+ 			hdatum.dptr = NULL;
+ 		else if (*(bptr + 2) == '@')
+ 			/* path!%s@host -> path!host!%s */
+ 			(void) sprintf(bptr, "%s!%%s", bptr + 3);
+ 	return hdatum;
+ }
+ 
+ 
+ /*
+ **  MAPUUCPNAME -- turn a hostname into the shortest uucp path
+ **
+ **	Parameters:
+ **		hbuf -- a buffer containing a hostname.
+ **		hbsize -- the size of hbuf.
+ **
+ **	Returns:
+ **		none.
+ **
+ **	Side Effects:
+ **		Looks up the host specified in hbuf.  Replace it
+ **		with the shortest uucp path to that host.
+ */
+ mapuucpname(hbuf, hbsize)
+ 	register char *hbuf;
+ 	register int hbsize;
+ {
+ 	register char *bptr;
+ 	register datum hdatum;
+ 	register bool mapped = FALSE;
+ 
+ 	makelower(hbuf);
+ 	hdatum = pathalias(hbuf);
+ 	if (hdatum.dptr != NULL && hdatum.dsize <= hbsize) {
+ 		(void) strcpy(hbuf, hdatum.dptr);
+ 		return;
+ 	}
+ 	if ((bptr = rindex(hbuf, '.')) == NULL)
+ 		return;
+ 	if (strncmp(bptr, ".uucp", sizeof ".uucp") == 0) {
+ 		/* try host with .uucp removed */
+ 		*bptr = '\0';
+ 		hdatum = pathalias(hbuf);
+ 		*bptr = '.';
+ 		if (hdatum.dptr != NULL && hdatum.dsize <= hbsize)
+ 			(void) strcpy(hbuf, hdatum.dptr);
+ 	}
+ 	else
+ 	{
+ 		/* keep removing left-most subdomain until a match is found */
+ 		for (bptr = hbuf;
+ 		     !mapped && (bptr = index(++bptr, '.')) != NULL;
+ 		     mapped = hdatum.dptr != NULL &&
+ 			      hdatum.dsize + strlen(hbuf) + 1 <= hbsize)
+ 			hdatum = pathalias(bptr);
+ 		if (!mapped || (bptr = rindex(hdatum.dptr, '!')) == NULL)
+ 			return;
+ 		*bptr = '\0';
+ 		if ((bptr = malloc(strlen(hbuf) + 1)) != NULL) {
+ 			(void) strcpy(bptr, hbuf);
+ 			(void) sprintf(hbuf, "%s!%s!%%s", hdatum.dptr, bptr);
+ 			(void) free(bptr);
+ 		}
+ 	}
+ }
+ #endif
*** sendmail/src/main.c.orig	Thu Jan 30 14:03:53 1986
--- sendmail/src/main.c	Mon Jan 26 14:33:20 1987
***************
*** 808,813 ****
--- 812,822 ----
  	/* and finally the hostname lookup characters */
  	'[', HOSTBEGIN,	']', HOSTEND,
  
+ #if defined(DBM) && defined(UUCPDOMAIN)
+ 	/* and the uucpname lookup characters */
+ 	'{', UUCPBEGIN,	'}', UUCPEND,
+ 
+ #endif
  	'\0'
  };
  
*** sendmail/src/parseaddr.c.orig	Wed Apr  2 19:04:03 1986
--- sendmail/src/parseaddr.c	Wed Mar 18 12:10:50 1987
***************
*** 741,747 ****
--- 741,751 ----
  			char pvpbuf[PSBUFSIZE];
  			extern char *DelimChar;
  
+ #if defined(DBM) && defined(UUCPDOMAIN)
+ 			if (**rvp != HOSTBEGIN && **rvp != UUCPBEGIN)
+ #else
  			if (**rvp != HOSTBEGIN)
+ #endif
  				continue;
  
  			/*
***************
*** 753,759 ****
--- 757,769 ----
  			hbrvp = rvp;
  
  			/* extract the match part */
+ #if defined(DBM) && defined(UUCPDOMAIN)
+ 			while (*++rvp != NULL &&
+ 			       (**rvp != HOSTEND && **hbrvp == HOSTBEGIN ||
+ 				**rvp != UUCPEND && **hbrvp == UUCPBEGIN))
+ #else
  			while (*++rvp != NULL && **rvp != HOSTEND)
+ #endif
  				continue;
  			if (*rvp != NULL)
  				*rvp++ = NULL;
***************
*** 764,770 ****
--- 774,801 ----
  
  			/* look it up */
  			cataddr(++hbrvp, buf, sizeof buf);
+ #if defined(DBM) && defined(UUCPDOMAIN)
+ 			if (**(hbrvp - 1) == HOSTBEGIN)
+ 				maphostname(buf, sizeof buf);
+ 			else
+ 			{
+ #ifdef sun
+ 				(void) dbmclose();
+ #endif
+ 				if (dbminit(PaliasFile) == 0)
+ #ifdef sun
+ 				{
+ #endif
+ 					mapuucpname(buf, sizeof buf);
+ #ifdef sun
+ 					(void) dbmclose();
+ 				}
+ #endif
+ 				(void) dbminit(AliasFile);
+ 			}
+ #else
  			maphostname(buf, sizeof buf);
+ #endif
  
  			/* scan the new host name */
  			olddelimchar = DelimChar;
*** sendmail/src/alias.c.orig	Thu Apr 17 23:18:56 1986
--- sendmail/src/alias.c	Mon Jan 26 14:26:42 1987
***************
*** 92,97 ****
--- 92,100 ----
  		p = aliaslookup(a->q_user);
  	if (p == NULL)
  		return;
+ #if defined(DBM) && defined(UUCPDOMAIN)
+ 	p = newstr(p);
+ #endif
  
  	/*
  	**  Match on Alias.
*** sendmail/src/Makefile.orig	Wed Jul 23 18:29:21 1986
--- sendmail/src/Makefile	Sat Feb 14 13:23:50 1987
***************
*** 36,42 ****
  CHMOD=	chmod
  O=	-O
  COPTS=
! CCONFIG=-I../include -DVMUNIX -DMXDOMAIN -DOLDJEEVES
  CFLAGS=	$O $(COPTS) $(CCONFIG)
  ASMSED=	../include/asm.sed
  AR=	-ar
--- 36,42 ----
  CHMOD=	chmod
  O=	-O
  COPTS=
! CCONFIG=-I../include -DVMUNIX -DMXDOMAIN -DOLDJEEVES -DUUCPDOMAIN
  CFLAGS=	$O $(COPTS) $(CCONFIG)
  ASMSED=	../include/asm.sed
  AR=	-ar
*** sendmail/src/readcf.c.orig	Sat Jan 11 03:18:47 1986
--- sendmail/src/readcf.c	Thu Feb 26 18:41:55 1987
***************
*** 913,918 ****
--- 913,925 ----
  		WkTimeFact = atoi(val);
  		break;
  
+ 	  case 'p':		/* pathalias file */
+ 		if (val[0] == '\0')
+ 			PaliasFile = "palias";
+ 		else
+ 			PaliasFile = newstr(val);
+ 		break;
+ 
  	  default:
  		break;
  	}
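
The lookup order implemented by pathalias() and mapuucpname() in the
daemon.c hunk above can be restated as a short Python sketch. It is only
an illustration of the logic as read from the patch: a dict replaces the
dbm file, the buffer-size checks are left out, and the helper name
_lookup is hypothetical.

```python
def _lookup(db, key):
    """Fetch a route; reject it unless it ends in a usable %s marker."""
    route = db.get(key)
    if route is None:
        return None
    i = route.rfind("%")
    if i < 0 or route[i + 1:i + 2] != "s":
        return None
    if route[i + 2:i + 3] == "@":
        # path!%s@host -> path!host!%s
        return route[:i] + route[i + 3:] + "!%s"
    return route


def map_uucp_name(host, db):
    """Return a route for host, or host (lowercased) if nothing matches."""
    host = host.lower()
    route = _lookup(db, host)
    if route is not None:
        return route
    if "." not in host:
        return host
    if host.endswith(".uucp"):
        # try the name with .uucp removed
        route = _lookup(db, host[:-len(".uucp")])
        return route if route is not None else host
    # keep removing the left-most subdomain until a gateway entry matches
    pos = host.find(".", 1)
    while pos != -1:
        route = _lookup(db, host[pos:])
        if route is not None:
            head, bang, _ = route.rpartition("!")
            if not bang:
                return host
            return head + "!" + host + "!%s"
        pos = host.find(".", pos + 1)
    return host
```

For example, with a gateway entry ".bar.com" mapped to "gw!%s", the name
"foo.bar.com" comes back as "gw!foo.bar.com!%s".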

fair@Apple.COM (Erik E. Fair) (01/06/89)

###############################################################################
###                                                                         ###
###                          APPLE.COM                                      ###
###                                                                         ###
###            sendmail configuration for generic complex host              ###
###                with both UUCP and Internet connections                  ###
###                                                                         ###
###            Erik E. Fair <fair@apple.com>                                ###
###                                                                         ###
###############################################################################


# Our local domain ($D is added to $w [hostname] for official name in base.m4)
DDapple.com

# All the names we are known by (put all the names & nick names on the next
# line, separated by spaces. If you need another line, begin it with "Cw")
Cwapple

# Our UUCP name
DUapple

# the list of UUCP hosts that we speak to
FU/usr/lib/uucp/L.sys %[a-zA-Z0-9]


# our pathalias database
Op/usr/lib/mail/paths

###############################################################################
###   baseline definitions that sendmail needs to operate                   ###
###############################################################################


# "I"nternal domains
CI BITNET UUCP

##########################
###   Special macros   ###
##########################

DV25

# my official hostname (is $w (hostname(2)) fully qualified?)

Dj$w
# my name (the name on mailer bounces)
DnMAILER-DAEMON
# UNIX header format
DlFrom $g $d
# delimiter (operator) characters
Do.:%@!^/[]
# format of a total name
Dq$?x$x <$g>$|$g$.
# SMTP login message
De$j Sendmail $v/$V ready at $b

###################
###   Options   ###
###################

# we have full sendmail support here
Oa
# substitution for space (blank) characters
OB.
# default network name
ONARPA
# send a copy of mail headers we bounce to "bounces"
OPbounces
# default delivery mode (deliver in background)
Odbackground
# (don't) connect to "expensive" mailers
#Oc
# temporary file mode
OF0600
# default GID
Og1
# log level
OL9
# Send to me too (even if I'm in an alias expansion)
Om
# default messages to old style
Oo
# read timeout -- violates protocols (timeout an SMTP idle for 2 hours)
Or2h
# queue up everything before starting transmission
Os
# default timeout interval (returns undelivered mail after 3 days)
OT3d
# time zone names (V6 only)
OtPST,PDT
# default UID
Ou1
# encrypted wizard's password (for the undocumented "wiz" SMTP command)
OWnot-likely
# rebuild the aliasfile automagically
#OD
# maximum load average before queueing mail
Ox50
# maximum load average before rejecting connections
OX60

################################## FILES ######################################
# location of alias file
OA/usr/lib/aliases
# location of help file
OH/usr/lib/sendmail.hf
# queue directory
OQ/usr/spool/mqueue
# status file
OS/usr/lib/sendmail.st

###############################
###   Message precedences   ###
###############################

Pfirst-class=0
Pspecial-delivery=100
Pbulk=-60
Pjunk=-100

#########################
###   Trusted users   ###
#########################

Troot
Tdaemon
Tusenet
Tuucp

#############################
###   Format of headers   ###
#############################

H?P?Return-Path: <$g>
HReceived: $?sfrom $s$. by $j$?r with $r$. ($v/$V-eef)
	id $i; $b
H?D?Date: $a
H?F?From: $q
H?x?Full-Name: $x
H?M?Message-Id: <$t.$i@$j>
HSubject:
H?D?Resent-Date: $a
H?F?Resent-From: $q
H?M?Resent-Message-Id: <$t.$i@$j>


###############################################################################
#		RULESET ZERO PREAMBLE                                         #
###############################################################################

S0

# first make canonical
R$*<$*>$*		$1$2$3				defocus
R$+			$:$>3$1				make canonical

# handle special cases.....
R@			$#local$:$n			handle <> form

R$*<@[$+]>$*		$:$1<@$[[$2]$]>$3		lookup numeric internet addr
R$*<@[$+]>$*		$#smtp$@[$2]$:$1@[$2]$3		numeric internet spec
R$-<@$w>		$#local$:$1

# canonicalize using the nameserver if not internal domain

R$*<@$*.$~I>$*		$:$1<@$[$2.$3$]>$4
R$*<@$->$*		$:$1<@$[$2$]>$3

# now delete the local info
R$*<$*.>$*		$1<$2>$3			remove trailing dot
R$*<$*.>		$1<$2>				remove trailing dot
R$*<$*$=w.UUCP>$*	$1<$2>$4			strip UUCP
R$*<$*$=w.$D>$*		$1<$2>$4			strip domain name
R$*<$*$=w>$*		$1<$2>$4			strip unqualified
R$*<$*$w>$*		$1<$2>$3			strip domain name
R$*<$*.>$*		$1<$2>$3			remove trailing dot
R$*<$*.>		$1<$2>				remove trailing dot

R<@>:$*			$@$>0$1				retry after route strip
R$*<@>$*		$@$>0$1				strip null trash & retry
R$*<@>			$@$>0$1				strip null trash & retry

# return uucp mail that looks like decvax!ittvax!marsvax! since it
# will be rejected at the final site with no username on it
R$*!<@$-.UUCP>		$#error$:Destination address truncated



###############################################################################
###   Machine dependent part of ruleset zero (where we decide what to do)   ###
###############################################################################


# .forward on the end of the domain name is a magic cookie we put in our UUCP
# map data so that we only do domain forwarding for sites that we have set up
# by hand, rather than for anything in the UUCP maps. If you are not on
# the Internet, do not worry about this. If you are, and you want to be nice to
# that little UUCP site next door, offer them domain forwarding service, and
# put "theiruucpname	.their.domain.com.forward" into your PRIVATE UUCP map
# data.
R$*<@$+.$~I>		$:$>20$1<@$2.$3>
R$*<@$~U.UUCP>		$:$>8$1<@$2.UUCP>		route UUCP

# resolve UUCP links to hosts known to this machine
R$*<@$=U.UUCP>$*	$#uucp$@$2$:$1			resolve local uucp

R$*<@$-.UUCP>		$#error$:Unknown UUCP host - not in the UUCP maps and not one of our UUCP neighbors

# resolve various and sundry other unofficial networks
#R$*<@$+.BITNET>$*	$#smtp$@cunyvm.cuny.edu$:$1@$2.BITNET$3		BITNET
R$*<@$+.MFENET>$*	$#smtp$@nmfecc.arpa$:$1@$2.MFENET$3		MFENET
R$*<@$+.CSNET>$*	$#error$:Obsolete domain tag - please use a real domain name


# when all else fails, look up the whole name in the host table
R$*<@$+>$*		$#smtp$@$2$:$1@$2$3			user@domain

# remaining names must be local
R@			$n					fix magic token
R$+			$#local$:$1				everything else

###############################################################################
###   End of ruleset zero                                                   ###
###############################################################################

###########################
#  Name Canonicalization  #
###########################
S3

# handle "from:<>" special case
R<>			$@@				resolve into magic token

# basic textual canonicalization
R$*<$*<$*<$+>$*>$*>$*	$4				3-level <> nesting
R$*<$*<$+>$*>$*		$3				2-level <> nesting
R$*<$+>$*		$2				basic RFC821/822 parsing

# make sure <@a,@b,@c:user@d> syntax is easy to parse -- undone later
R@$+,$+			@$1:$2				change all "," to ":"

# localize and dispose of domain-based addresses
R@$+:$+			$@$>6<@$1>:$2			handle <route-addr>

# more miscellaneous cleanup
R$+:$*;@$+		$@$1:$2;@$3			list syntax
R$+:$*;			$@$1:$2;			list syntax
R$+@$+			$:$1<@$2>			focus on domain
R$+<$+@$+>		$1$2<@$3>			move gaze right
R$+<@$+>		$@$>6$1<@$2>			already canonical

# convert old-style addresses to a domain-based address
R$+^$+			$1!$2				convert ^ to !
R$+!$+			$@$>9$1!$2			uucp name hackery
R$+%$+			$:$>5$1%$2			user%host%host
R$+<@$+>		$@$>6$1<@$2>			canonical

# Given multiple %'s change rightmost % to @.
S5
R$*<$*>$*		$1$2$3				defocus
R$*%$*			$1@$2				First make them all @'s.
R$*@$*@$*		$1%$2@$3			Undo all but the last.
R$*@$*			$@$1<@$2>			Put back the brackets.

###############################################################################
####   Assorted name hackery to make things simple for people              ####
###############################################################################

# here we look for addresses of the form: user%host.domain@gateway
# and strip off the gateway name (for the ones that we know)

S6

# conventional percent format

R$+%$+.MFENET<@nmfecc.arpa>	$>5$1<%$2.MFENET>	strip
R$+%$+.BITNET<@$+>		$>5$1<%$2.BITNET>	strip

# regulation route-addr format

R<@nmfecc.arpa>:$+@$+.MFENET	$1<@$2.MFENET>		strip
R<@$+>:$+@$+.BITNET		$2<@$3.BITNET>		strip

R$+<@$+.$-.UUCP>		$1<@$2.$3>		fix rn


# mung up names for the outside world - called from smtp mailer
S7
R$+@$+.MFENET		$1%$2.MFENET@nmfecc.arpa	user@host.MFENET
R$+@$+.BITNET		$1%$2.BITNET@cunyvm.cuny.edu	user@host.BITNET

###############################################################################
####   UUCP address hackery                                                ####
###############################################################################

S9
R$+!$=w!$+		$3				collapse loops
R$=w!$+			$2				collapse loops
R$-.$+!$+		$@$>6$3<@$1.$2>			do.main!user
R$-!$+			$@$>6$2<@$1.UUCP>		host!user

################################
#  Sender Field Pre-rewriting  #
################################
S1
R$*<$*>$*		$1$2$3				defocus

###################################
#  Recipient Field Pre-rewriting  #
###################################
S2
R$*<$*>$*		$1$2$3				defocus

###################################
#  Final Output Post-rewriting    #
#  Standard Domain-based version  #
###################################
S4
R@			$n				handle <> error addr


# resolve numeric addresses to name if possible
R$*@[$+]$*		$:$1@$[[$2]$]$3			numeric internet addr

# externalize local domain info
R$*<$+>$*		$1$2$3				defocus
R@$+:$+:$+		$@@$1,$2:$3			<route-addr> canonical

# UUCP must always be presented in old form
R$+@$-.UUCP		$2!$1				u@h.UUCP => h!u

###############################################################################
###   Local, and Program Mailer specifications                              ###
###############################################################################

# Nota Bene: what mailer flags you use depends upon what version of /bin/mail
# you have:
#
# 4th Berkeley Software Distribution (4.1 BSD or later)
#
Mlocal, P=/bin/mail, F=SlsDFMmnr, S=10, R=10, A=mail -d $u
#
# USG UNIX (System III, System V, Xenix 3.0 or later)
#
# Mlocal, P=/bin/mail, F=SlsDFMPpmn, S=10, R=10, A=mail $u
#
# Also, if you are using System V, you should get the Berkeley version of
# /bin/mail as soon as you can and junk the one you've got: it doesn't
# believe in sendmail, so the wrong thing will happen when someone types
# mail user@host (i.e. it will attempt local delivery, rather than call
# sendmail). It also does header hacking when it shouldn't (like adding
# a To: field).

Mprog, P=/bin/sh, F=lsDFMxehu, S=10, R=10, A=sh -c $u

S10

# S20
# I use ruleset 20 for other stuff


###############################################################################
####    IP/TCP/SMTP mailer (going out to internet land)                    ####
###############################################################################

Msmtp,	P=[IPC], F=mnDFMeuXLC, S=14, R=14, A=IPC $h, E=\r\n

S14
R$*@[$+]$*		$@$1@[$2]$3		already ok (inet addr spec)
R@$+@$+			$@@$1@$2		already ok (route-addr)

# if not local, and not a "fake" domain, ask the nameserver
R$+@$+.$~I		$:$1@$[$2.$3$]		user@host.domain
R$*:$*			$1.$2			map colons to dots
R$+@$+			$:$>7$1@$2		fix up names for the internet
R$+@$=X.UUCP		$2!$1@$X		fix remote UUCP
R$+@$=Y.UUCP		$2!$1@$Y		fix remote UUCP
R$+@$=Z.UUCP		$2!$1@$Z		fix remote UUCP
R$+@$-.UUCP		$@$2!$1@$j		undo local UUCP hack
R$-			$@$1@$j			add our official host name

S24
# nothing here - sender and recipient addresses are handled the same

###############################################################################
####   UUCP mailer (bangland)                                              ####
###############################################################################
#
# By default, this mailer will send only one copy of a letter per host,
# regardless of the number of recipients there. However, 4.1 BSD UNIX
# sites have a version of "rmail" that can't deal with this (and so do
# sites that inherited that old mailer).  Some older (and brain-dead, but
# what can you expect from Microsoft?) versions of Xenix are similarly
# afflicted. The "m" flag in the "F=" statement below controls this behavior.
# If you must speak to a site broken in this way, you can handle it two ways:
# 
# 1. define(DUMBUUCP)dnl configuration option, create a CLASS that
#    contains the list of brain-damaged sites, and match that
#    class in ruleset zero, before matching for the normal UUCP sites.
#    For a class "D", the rule should look like this:
#
#    R$+<@$=D.UUCP>	$#dumbuucp$@$2$:$1	BROKEN UUCP SITES
#
# 2. eschew all sendmail.cf hacking, and remove the "m" flag from the "F="
#    statement below. This will cause multiple copies of letters bound for
#    multiple recipients on any single host to be sent, rather than just one
#    copy per host.
#
# If you want uucico to be invoked immediately after a letter is queued
# (i.e. initiate the phone call immediately) remove the "-r" flag in the
# uux command line. Bear in mind that this has significant overhead when
# your system does a lot of UUCP; you'll have lots of uucico's contending
# with each other for modems.
#
# If your uux can't do the "-a" flag, remove it from the command line.
# When present, if something goes wrong at the other end, their uuxqt (if
# they also understand it - if they don't, they'll ignore it, so it's
# harmless, and potentially helpful) will mail a notification to the
# address given, rather than to "daemon" or "uucp" on your system.

Muucp,	P=/usr/bin/uux, F=msDFMhuU, S=13, R=23, M=100000,
	A=uux - -a$f -r $h!rmail ($u)


Mdumbuucp,	P=/usr/bin/uux, F=sDFMhuU, S=13, R=23, M=100000,
	A=uux - -a$f -r $h!rmail ($u)

S13
R$+@$-.UUCP		$2!$1				u@host.UUCP => host!u
R$=w!$+			$2				zap dups

R@$+@$+			$@$U!@$1@$2			ugh, route-addrs
# unfortunately, I have to resolve route-addrs before this rule, because
# it is so general that it matches them too, with disastrous results. - EEF
R$+@$-.$+		$2.$3!$1			uucpize address
R$+			$:$U!$1				stick on our host name

S23
# nothing here because bangland user-agents are supposed to rewrite these
# headers relative to the sender by themselves anyway, and the mailers
# (MTAs) are supposed to leave them the hell alone.

S8
# magic UUCP shit
R$*<@$-.UUCP>		$:${$2$}!$1		look up UUCP site in maps
R$+!%s!$*		$:$1!$2			remove %s database cruft
R$+			$:$>9$1			recanon

S20
# find domains that we forward for
R$*<@$+.$~I>		$:${.$2.$3.forward$}!$2.$3!$1
R.$+.forward!$+.$~I!$+	$@$4<@$2.$3>		match failed - go away
R$*!.$*			$1!$2			remove extra dot
R$+!$+.forward!$*	$1!$3			remove .forward copy
R$+			$:$>8$1			remove %s & recanon
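
As an aside on ruleset 5 in the configuration above ("Given multiple %'s
change rightmost % to @"), the effect of its four rules can be shown with
a small Python sketch. This is an approximation written for illustration,
not something sendmail runs; the function name is made up.

```python
def rightmost_percent_to_at(addr):
    """Mimic ruleset 5: the last % (or @) becomes the focused <@host>."""
    addr = addr.replace("<", "").replace(">", "")   # defocus
    flat = addr.replace("%", "@")                   # first make them all @'s
    head, at, tail = flat.rpartition("@")
    if not at:
        return addr                                 # no % or @ present
    # undo all but the last, then put back the brackets
    return head.replace("@", "%") + "<@" + tail + ">"


print(rightmost_percent_to_at("user%host1%host2"))  # user%host1<@host2>
```

Only the rightmost host becomes the routing target; everything to its
left stays in percent form for that host to interpret.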

lmb@vicom.COM (Larry Blair) (01/07/89)

In article <23430@apple.Apple.COM> fair@Apple.COM (Erik E. Fair) writes:
=arpatxt is a waste of time; you have to maintain a list of duplicate
=UUCP/and domain *component* names to prevent mis-routing when you try
=to use arpatxt (tedious manual work). Also, NIC HOSTS.TXT will
=eventually go away, and then where will you be? I've got a better hack
=which I will describe below.
=
=Since I like the Internet, I use it as much as I can. To encourage
=such usage (and cut back on the length of the UUCP paths generated by
=pathalias) I run pathalias twice over the UUCP maps before generating
=the dbm database used by sendmail. First pass is just for the raw
=maps. Then I run the pathalias output through an awk script called
="mkglue" that generates a sizeable glue file. Then run pathalias again
=over the raw maps with the glue file appended to the input.

I've been using Erik's mkglue script for about 4 months now.  Our
sendmail has not been hacked for path routing; we use smail.  We are
not on the Internet, so there were a few problems that came up.

One problem that came up was that we had to play some games to get the
domain based routes to go to our preferred Internet neighbor, while not
ending up with all mail to our other domained neighbors going through
that site as well.  To do this, we changed the cost of INTERNET in the
awk script from 1 to 2 and set up the Path.local file:

vsi1	preferred(9), other1(10), other2(10), etc.

Another problem came when we got ourselves registered in the .com domain.
Now vicom.com appeared in the INTERNET group, meaning that most paths now
ended up in that paths file as "%s".  To stop this, we commented out the
lines in the awk script that added our site to INTERNET.

While many of the paths produced appear shorter, they may not be, since
the hop from the Internet site to the forwarder doesn't show up.  It also
often means sending mail thru uunet unnecessarily.  When I send mail to a
domainized site in the bay area that uses sun as their forwarder, it means
a slow trip.  Ideally, the non-Internet domained sites would be identified
by their forwarder, so that proper costing could take place.

Erik mentioned that he (and the other coordinators, he presumes) check the
map entries for accuracy re: domain aliases.  I have found that they are not
complete.  My particular complaint, made to the site admin at rice, but not
corrected, is that they _don't_ have rice = rice.edu in their map.
Unfortunately, cs.texas.edu lists a connection to rice.edu, creating an
unnecessarily long path; one that I often use for retrieving archived materials.

All in all, there has been little bounced mail, though; certainly less than
I was getting before I started inverting the lists.  I've thanked Erik before
and do so once again.  Using this script has really made the usenet world
look a lot more like the Internet one.
-- 
Larry Blair   ames!vsi1!lmb   lmb@vicom.com

page@swan.ulowell.edu (Bob Page) (01/11/89)

fair@Apple.COM (Erik E. Fair) wrote:
>arpatxt is a waste of time; you have to maintain a list of duplicate

Well, I'm the guy who primarily maintains the arpatxt arpa-privates
file; I've developed a number of crude scripts to save time in
updating the file.

Having said that, the version of arpa-privates coming out (this week)
will be the last from me.

Erik's 'mkglue' script is superior and a lot easier to maintain, so
I'm giving up arpatxt.

..Bob
-- 
Bob Page, U of Lowell CS Dept.  page@swan.ulowell.edu  ulowell!page
Have five nice days.

honey@mailrus.cc.umich.edu (peter honeyman) (01/11/89)

bob seconds erik's disparaging remarks regarding arpatxt, notwithstanding
that "i'm the guy who primarily maintains the arpatxt arpa-privates file."
bob concludes "i'm giving up arpatxt."

let me third that.  i'm the guy who wrote arpatxt, and while i still think
it's a neat hack, i gave up maintaining arpa-privates a few years ago.  (i
thank bob for picking it up -- service beyond the call of duty.)

i haven't looked at erik's stuff, but i trust erik and bob, so i suppose i'll
go that route too.

	peter