[net.sources] 16-bit pathalias, part 1 of 2

campbell@maynard.BSW.COM (Larry Campbell) (12/15/86)

#! /bin/sh
# This is a shell archive, meaning:
# 1. Remove everything above the #! /bin/sh line.
# 2. Save the resulting text in a file.
# 3. Execute the file with /bin/sh (not csh) to create:
#	README
#	CHANGES
#	pathalias.1
#	Makefile
#	def.h
#	config.h
# This archive created: Sun Dec 14 23:58:23 1986
export PATH; PATH=/bin:/usr/bin:$PATH
if test -f 'README'
then
	echo shar: "will not over-write existing file 'README'"
else
cat << \SHAR_EOF > 'README'
This is a hacked version of pathalias that uses a temporary file
to store the link database rather than using virtual memory.  While
this enables a 16-bit machine (like my Rainbow) to crunch the entire
map, it also makes pathalias v-e-r-y s-l-o-w.  This version is based
on pathalias 8.1, posted to mod.sources in January 1986.

I take no credit (or blame!) for this version.  The temp file hack
was done by someone who wishes to remain nameless -- he doesn't have
the time to respond to bug reports or such.

I have forwarded this version to Peter Honeyman and he is now working
on reconciling the two versions.  I've posted this to net.sources,
rather than mod.sources, in the hope that Peter will soon have a better
version more worthy of membership in the mod.sources archives.  The remainder
of this file is the original README written by Peter.

- Larry Campbell (campbell@maynard.BSW.COM, maynard!campbell)
  Sun Dec 14 23:52:02 EST 1986
-----------------------------------------------------------------
From princeton!honey Sun Jan 19 18:35:58 EST 1986
To: whom it may concern
Subject: pathalias instructions

edit config.h
make

if getopt is undefined, obtain a copy from mod.sources.  pathalias input
is regularly published in mod.map.

	peter

ps:

pathalias, written by steve bellovin and peter honeyman, is in the public
domain, and may be used by any person or organization, in any way and for
any purpose.

pps:

There is no warranty of merchantability nor any warranty of fitness for
a particular purpose nor any other warranty, either express or implied,
as to the accuracy of the enclosed materials or as to their suitability
for any particular purpose.  Accordingly, the authors assume no respon-
sibility for their use by the recipient.   Further, the authors assume
no obligation to furnish any assistance of any kind whatsoever, or to
furnish any additional information or documentation.
SHAR_EOF
fi
if test -f 'CHANGES'
then
	echo shar: "will not over-write existing file 'CHANGES'"
else
cat << \SHAR_EOF > 'CHANGES'
-- somewhat later in 9/86
Alter hash table as described below.

-- pdp11 version, 9/86
Remove link/host tracing
Make program much slower

Note that the hash table is still in memory.  If many more sites
are entered into the map, Table must be converted to an array of short,
and the temp file references must be converted to node/link numbers instead
of byte positions within the temp file.  Getnode(), putnode(), etc. must then
convert these references back to postions = sizeof(node)*nodenumber.

-- mod.sources, 4.3bsd, 1/86
Improved alias treatment.
Routes to domain gateways.
New domain treatment -- leading dot in name implies domain.
Print fully qualified domain name for private hosts.
Link/host tracing (-t option).
Use getopt().

-- mod.sources, 8/85
Private hosts documented.
Homegrown scanner -- it's true what they say about lex.
Makedb.
Domains and gateways.
DEAD back link.

-- net.sources, 1/85
No ! in dbm key.
Network character must be one of !@%: -- dot is dead.
Private hosts.
Discourage hybrid addresses.  
Magic @ <-> % rule.

-- 1983-1984
Reverse sense of the -c (cost) flag.
Use cheapest among duplicate links.
Elide network names in output.

-- epoch (smb version)
SHAR_EOF
fi
if test -f 'pathalias.1'
then
	echo shar: "will not over-write existing file 'pathalias.1'"
else
cat << \SHAR_EOF > 'pathalias.1'
.\" Thanks to Alan Silverstein for help with the manual entry.
.TH PATHALIAS 1 
.SH NAME
pathalias, makedb \- electronic address router
.SH SYNOPSIS
.B pathalias
[
.B \-ivc
] [
.BI \-t \0link
] [
.BI \-l \0host
] [
.BI \-d \0link
] [
.ig
.\" the -g option is for pathparse.  it's not really used by pathalias.
.BI \-g \0file
] [
..
.I files
]
.PP
.B makedb
[
.B \-a
] [
.BI \-o \0dbmfile
] [
.I files ...
]
.ad b
.SH DESCRIPTION
.I pathalias
computes the shortest paths and corresponding routes from one host
(computer system) to all other known, reachable hosts.
.I pathalias
reads host-to-host connectivity
information on standard input or in the named
.IR files ,
and writes a list of
host-route pairs on the standard output.
.PP
.I makedb
takes
.I pathalias
output and creates or appends to a
.IR dbm (3)
database.
.PP
Here are the
.I pathalias
options:
.TP 6
.B \-i
Ignore case:  map all host names to lower case.
By default, case is significant.
.TP
.B \-c
Print costs: print the path cost (see below) before each host-route pair.
.TP
.B \-v
Verbose: report some statistics on the standard error output.
.ig
.\" the -g option is for pathparse and is not for public consumption (yet!).
.TP
.BI \-g \0file
Dump the edges of the graph into the named file.
..
.TP
.BI \-l \0host
Set local host name to
.IR host .
By default,
.I pathalias
discovers the local host name in a system-dependent way.
.TP
.BI \-d \0arg
Declare a dead link, host, or network (see below).
If
.I arg
is of the form ``host1!host2,'' the link from host1 to host2
is treated as an extremely high cost (\fIi.e.\fP, \s-1DEAD\s0) link.
If
.I arg
is a single host name,
that host is treated as dead
and is be used as an intermediate host of last resort on any path.
If
.I arg
is a network name, the network requires a gateway.
.TP
.BR  \-t \0arg
Trace input for link, host or network on the standard error output.
The form of
.I arg
is as above.
.PP
Here are the
.I makedb
options:
.TP 6
.B \-a
Append to an existing database;
by default,
.I makedb
truncates the database.
.TP
.BI \-o \0dbmfile
Identify the output file base name.
.SS \fIpathalias\fP Input Format
A line beginning with white space continues the preceding line.
Anything following `#' on an input line is ignored.
.PP
A list of host-to-host connections consists of a ``from'' host in column 1,
followed by white space,
followed by a comma-separated list of ``to' hosts, called
.IR links .
A link may be preceded or followed by a network character to use
in the route.
Valid network characters are `!' (default), `@', `:', and `%'.
A link (and network character, if present) may be
followed by a ``cost'' enclosed in parentheses.
Costs may be arbitrary
arithmetic expressions involving numbers, parentheses, `+', `\-', `*',
and `/'.
The following symbolic costs are
recognized:
.PP
.RS
.nf
.ta 14mR 17m
\s-1LOCAL\s0	25	(local-area network connection)
\s-1DEDICATED\s0	95	(high speed dedicated link)
\s-1DIRECT\s0	200	(toll-free call)
\s-1DEMAND\s0	300	(long-distance call)
\s-1HOURLY\s0	500	(hourly poll)
\s-1EVENING\s0	1800	(time restricted call)
\s-1DAILY\s0	5000	(daily poll, also called \s-1POLLED\s0)
\s-1WEEKLY\s0	30000	(irregular poll)
.fi
.RE
.PP
In addition,
.SM DEAD
is a very large number (effectively infinite),
and
.SM HIGH
and
.SM LOW
are \-5 and +5 respectively,
for baud-rate or quality bonuses/penalties.
These symbolic costs represent an imperfect measure of bandwidth,
monetary cost, and frequency of connections.
For most mail traffic, it is important to minimize the number
of intermediaries in a route,
thus,
.IR e.g. ,
.SM HOURLY
is far greater than
.SM DAILY
/ 24.
If no cost is given,
a default of 4000 is used.
.PP
For the most part, arithmetic expressions that mix symbolic constants
other than
.SM HIGH
and
.SM LOW
make no sense.
.IR E.g. ,
if a host calls a local neighbor whenever there is work,
and additionally polls every evening,
the cost is
.SM DIRECT,
.B not
.SM DIRECT+EVENING.
.PP
Some examples:
.PP
.RS
.nf
.ta 10m 15m
down	princeton!(\s-1DEDICATED\s0), tilt,
	%thrash(\s-1LOCAL\s0)
princeton	topaz!(\s-1DEMAND\s0+\s-1LOW\s0)
topaz	@rutgers(\s-1LOCAL\s0)
.fi
.RE
.PP
If a link is encountered more than once,
the least-cost occurrence dictates the cost and network character.
Links are treated as bidirectional, to the extent that a
.SM DEAD
reverse link is assumed unless better information is available.
.PP
The set of names by which a host is known by its neighbors is
called its
.IR aliases .
Aliases are declared as follows:
.PP
.RS
name = alias, alias ...
.RE
.PP
The name used in the route to or through aliased hosts
is the name by which the host is known
to its predecessor in the route.
.PP
Fully connected networks, such as the
.SM ARPANET
or a local-area network,
are declared as follows:
.PP
.RS
net = {host, host, ...}
.RE
.PP
The host-list may be preceded or followed by a routing
character, and may be followed by a cost:
.PP
.RS
.nf
princeton-ethernet = {down, up, princeton}!(\s-1LOCAL\s0)
\s-1ARPA\s0 = @{sri-unix, mit-ai, su-score}(\s-1DEDICATED\s0)
.fi
.RE
.PP
See also the sections on
.I gateways
and
.I domains
below.
.PP
Connection data may be given while hiding host names
by declaring
.PP
.RS
private {host, host, ...}
.RE
.PP
.I pathalias
will not generate routes for private hosts, but may produce routes
through them.
The scope of a private declaration extends from the declaration to the end of
the input file in which it appears.
It is best to put private declarations at the beginning of the appropriate
input file.
.SS Output Format
A list of host-route pairs is written to the standard output,
where route is a string appropriate for use with
.IR printf (3),
.IR e.g. ,
.PP
.RS
.nf
.ta 10m 20m
rutgers	princeton!topaz!%s@rutgers
.fi
.RE
.PP
The ``%s'' in the route string should be replaced by the
user name at the destination host.
(This task is normally performed by a mailer.)
.PP
Except for
.I domains
(see below),
the name of a network is never used in
expansions.
Thus, in the earlier example, the path from down to
up would be ``up!%s,'' not ``princeton-ethernet!up!%s.''
.SS Gateways
A network is represented by
a pseudo-host and a set of network members.
Links from the members to the network have the weight given in
the input, while the cost from the network to the members is zero.
If a network is declared dead on the command line (with the
.B \-d
option),
the member-to-network links are marked dead,
which discourages paths to members by way of the network.
.PP
If the input also shows a link from a host to the network,
then that host will be preferred as a gateway.
Gateways need not be network members.
.PP
.IR E.g. ,
suppose
.SM CSNET
is declared dead on the command line
and the input contains
.PP
.RS
.nf
.ta 10m 20m
\s-1CSNET\s0 = {...}
csnet-relay	\s-1CSNET\s0
.fi
.RE
.PP
Then routes to
.SM CSNET
hosts will use csnet-relay as a gateway.
.PP
.I pathalias
discourages forwarding beyond dead networks.
.SS Domains
A host or network whose name begins with `.' is called
a domain.
Domains are presumed to require gateways,
.IR i.e. ,
they are \s-1DEAD\s0.
The route given by a path through a domain is similar to
that for a network, but here
the domain name is tacked onto the end of the next host.
Subdomains are permitted.
.IR E.g. ,
.PP
.RS
.nf
.ta 1i
.ta 10m 20m
harvard	.\s-1EDU\s0
\&.\s-1EDU\s0 = {.\s-1BERKELEY\s0}
\&.\s-1BERKELEY\s0	ernie
.fi
.RE
.PP
yields
.PP
.RS
.nf
.ta 10m 20m
ernie	...!harvard!ernie.\s-1BERKELEY\s0.\s-1EDU\s0!%s
.fi
.RE
.PP
Output is given for the nearest gateway
to a domain,
.IR e.g. ,
the example above gives
.PP
.RS
.nf
.ta 10m 25m
\&.\s-1EDU\s0	...!harvard!%s
.fi
.RE
.PP
Output is given for a subdomain if it has a different
route than its parent domain, or if all of its ancestor domains are private.
.SS Databases
.I Makedb
builds a
.IR dbm (3)
database from the standard input or from the named
.IR files .
(\fIMakedb\fP replaces the obsolete 
.B \-b
option of
.IR pathalias ,
which is no longer recognized.)
Input is expected to be sequence of
.SM ASCII
records,
each consisting of a key field and a data field separated by a single tab.
If the tab is missing, the data field is assumed to be empty.
.SH FILES ET AL.
.ta \w'/usr/local/lib/palias.{dir,pag}     'u
/usr/local/lib/palias.{dir,pag}	default dbm output
.br
newsgroup mod.map	likely location of some input files
.br
.IR getopt (3),
available from newsgroup mod.sources (if not in the C library).
.SH BUGS
The order of arguments is significant.
In particular,
.B \-i
and
.B \-t
should appear early.
.PP
.I pathalias
can generate hybrid (\fIi.e.\fP ambiguous) routes, which are
abhorrent and most certainly should not be given as examples
in the manual entry.
.PP
Multiple `@'s in routes are prohibited by many mailers, so
.I pathalias
resorts to the ``magic %'' rule when appropriate.
This convention is not documented anywhere, including here.
.PP
Domains constitute a futile attempt to defeat anarchy and otherwise
retard progress.
.SH AUTHORS
Steve Bellovin (ulysses!smb)
.br
Peter Honeyman (princeton!honey)
SHAR_EOF
fi
if test -f 'Makefile'
then
	echo shar: "will not over-write existing file 'Makefile'"
else
cat << \SHAR_EOF > 'Makefile'
#!/bin/make -f
# pathalias -- by steve bellovin, as told to peter honeyman

# if you can't or don't intend to use dbm files, don't bother with makedb
# DBM = -ldbm
# or if you roll your own ...
# DBM = dbm.o

CC = cc
CFLAGS = -O
# As of 9/86 you can do the whole map with split i/d.  If you don't use
# split i/d, it may still work.  Look at addnode.c's hash table size;
# you'll have to make the hash table smaller to get it to fit.
LDFLAGS = -i
YFLAGS = -d

OBJ =	addlink.o addnode.o extern.o local.o main.o mapit.o mapaux.o \
	mem.o parse.o putnode.o /usr/local/lib/getopt.o
HDRS = def.h config.h
CSRC = addlink.c addnode.c extern.c local.c main.c mapit.c mapaux.c mem.c putnode.c
SRC = $(CSRC) parse.y
LSRC = $(CSRC) parse.c

all: pathalias printit makedb

pathalias: $(OBJ)
	$(CC) $(LDFLAGS) $(OBJ) -o pathalias

printit: printit.o putnode.o config.h def.h extern.o mem.o
	$(CC) $(LDFLAGS) -o printit printit.o putnode.o extern.o mem.o

$(OBJ):	$(HDRS)
printit.o: $(HDRS)

parse.c: parse.y $(HDRS)
	$(YACC) $(YFLAGS) parse.y
	mv y.tab.c parse.c

makedb: makedb.o
	$(CC) $(LDFLAGS) makedb.o $(DBM) -o makedb

makedb.o: config.h

clean:
	rm -f *.o y.tab.? parse.c makedb pathalias printit

clobber: clean
	rm -f pathalias makedb

tags: $(SRC) $(HDRS) makedb.c
	ctags -w $(HDRS) $(SRC)

bundle:
	shar README CHANGES pathalias.1 Makefile ${HDRS} >Shar.1
	shar ${SRC} makedb.c >Shar.2

lint:	$(LSRC)
	lint $(CFLAGS) $(LSRC)
	lint makedb.c

install: pathalias printit makedb
	strip pathalias printit makedb
	alias news cp pathalias printit makedb /usr/spool/news/lib

SHAR_EOF
fi
if test -f 'def.h'
then
	echo shar: "will not over-write existing file 'def.h'"
else
cat << \SHAR_EOF > 'def.h'
/* pathalias -- by steve bellovin, as told to peter honeyman */
#ifndef lint
#ifdef MAIN
static char	*h_sccsid = "@(#)def.h	8.1 (down!honey) 86/01/19";
#endif /*MAIN*/
#endif /*lint*/

#include <stdio.h>
#include <ctype.h>
#include "config.h"

typedef	long Cost;
typedef struct node node;
typedef struct link link;

#ifdef lint
#define vprintf fprintf
#else /*!lint -- this gives null effect warning*/
/* because it's there ... */
#define vprintf		!Vflag ? 0 : fprintf
#endif /*lint*/

#define NTRACE	16	/* can trace up to NTRACE hosts/links */

/* scanner states (yylex, parse) */
#define OTHER 0
#define COSTING 1
#define NEWLINE 2

#define	isnetc(c)	((c)=='!' || (c)==':' || (c)=='@' || (c)=='%')

#define dirbits(c)	(c)

/* flags for n_flag */
#define ISPRIVATE  0x0001 /* this node invisible outside its definition file */
#define COLLISION  0x0002 /* collides with a private host name */
#define ATSIGN	   0x0004 /* seen an at sign?  used for magic @/% rules */
#define MAPPED	   0x0008 /* done mapping this node */
#define	NDEAD	   0x0010 /* node is dead */
#define HASLEFT	   0x0020 /* route has a left side net character */
#define HASRIGHT   0x0040 /* route has a right side net character */
#define	NNET	   0x0080 /* node is a network name */
#define INDFS	   0x0100 /* used when removing net cycles */
#define DUMP	   0x0200 /* we have dumped this net's edges */
#define GATEWAYIN  0x0400 /* heaped via gatewayed net */

#define ISADOMAIN(n) ((n)->n_name[0] == '.')
#define ISANET(n) (((n)->n_flag & NNET) || ISADOMAIN(n))
#define DEADHOST(n) (((n)->n_flag & (NNET | NDEAD)) == NDEAD)
#define GATEWAYED(n) (((n)->n_flag & (NNET | NDEAD)) == (NNET | NDEAD) || ISADOMAIN(n))


/*
 * save some space in nodes -- there are > 10,000 allocated!
 *
 *	node	*n_net		others in this network (parsing)
 * 	node	*n_root		root of net cycle (mapping)
 *
 *	node	*n_private	other privates in this file (parsing)
 *	char	*n_parent	parent in shortest path tree (mapping)
 *		
 */

#define n_root n_net_root
#define n_net n_net_root

#define n_private n_prvparent
#define n_parent  n_prvparent

/* WARNING: if > 2^16 nodes, type of n_tloc must change */
struct node {
	char	n_name[MAXNAME]; /* host name */
	int	n_offset;	/* location of node in temp file */
	int	n_link;		/* head of adjacency list */
	int 	n_net_root;
	int	n_prvparent;
	Cost	n_cost;		/* cost to this host */
	unsigned short	n_tloc;	/* back ptr to heap/hash table */
	short	n_flag;		/* see manifests above */
};

#define	DEFNET	'!'			/* default network operator */
#define	DEFDIR	LLEFT			/* host on left is default */
#define	DEFCOST	((Cost) 4000)		/* default cost of a link */
#define	INF	((Cost) 30000000)	/* infinitely expensive link */

/* data structure for adjacency list representation */

/* flags for l_dir */

/*
 * there's an ugly dependency between the following manifests and the
 * variable Netchars = "!:^@%", defined in extern.c.  this saves 2
 * bytes per link (of which there are well over 20k).  this does not
 * mean i'm satsified with bad design.
 */
#define NETDIR(l)	((l)->l_flag & LDIR)
#define NETCHAR(l)	(Netchars[(l)->l_flag & LNETCHARS])

#define LNETCHARS	0x3
#define LBANG		0x0
#define LCOLON		0x1
#define LAT		0x2
#define LPERCENT	0x3

#define LDIR	0x8	/* 0 for left, 1 for right */
#define LRIGHT	0x0	/* user@host style */
#define LLEFT	0x8	/* host!user style */

#define LDEAD	 0x10	/* this link is dead */
#define LTREE	 0x20	/* member of shortest path tree */
#define LALIAS	 0x40	/* alias */
#define LGATEWAY 0x80	/* this link is a gateway */

/*
 * borrow a field for link/node tracing
 *
 *	link	*l_next;	rest of adjacency list (not tracing)
 *	link	*l_from;	source node (tracing) -- requires a cast
 *		
 */

#define l_next	l_next_from
#define l_from	l_next_from

struct link {
	int	l_next_from;
	int	l_offset;	/* location of node in temp file */
	int	l_to;		/* adjacent node */
	Cost	l_cost;		/* edge cost */
	char	l_flag;		/* right/left syntax */
};

/*
 * static functions don't show up in prof(1), so ...
 * someday i'll be done profiling.
 * yeah, sure, like when hell freezes over.
 */
#define STATIC /*static*/

/* external functions */
extern int	addnode(), *newtable(), addprivate();
extern int	addlink(), addgateway();
extern node	*getnode(), *newnode(), *tmpgetnode();
extern link	*getlink(), *newlink(), *tmpgetlink();
extern char	*local();
extern void	pack();

/* external variables */
extern char	*optarg;
extern int	optind;
extern int	Home;
extern int	Private;
extern int	*Table;
extern char	*Cfile;
extern char	**Ifiles;
extern char	*ProgName;
extern int	Lineno;
extern long	Tabsize;
extern char	*Netchars;
extern int	Vflag;
extern int	Cflag;
extern int	Iflag;
extern int	Tflag;
extern int	Ncount;
extern int	Lcount;
extern char	*Graphout;
extern char	*Linkout;
extern long	Hashpart;
extern int	Scanstate;
SHAR_EOF
fi
if test -f 'config.h'
then
	echo shar: "will not over-write existing file 'config.h'"
else
cat << \SHAR_EOF > 'config.h'
/* pathalias -- by steve bellovin, as told to peter honeyman */

/*#define STRCHR	/* have strchr, not index -- probably system v */
/*#define UNAME		/* have uname() -- probably system v or 8th ed. */
/*#define MEMSET	/* have memset() -- probably system v or 8th ed. */

/* #define GETHOSTNAME	/* have gethostname() -- probably 4.2bsd */
/* #define BZERO	/* have bzero() -- probably 4.2bsd */

/* #define DEBUG /* Useful if you like core dumps (keeps temp files too) */

/* location of temporary files for nodes and links */
/* estimated size about 2000 blocks (1.0mbytes), but will increase */
#define NTEMPFILE "/usr/spool/tmp/tmpnodes"
#define LTEMPFILE "/usr/spool/tmp/tmplinks"

/* This is the maximum size of a site's name.  It rapidly increases the */
/* the temp file's size, so keep it just over the maximum length. */
/* Although long names are unnecessary, some sites use them. */
/* I have kept this as small as possible since I want the temp file to */
/* fit on an RS04, but if you don't have one you may make it larger */

#define MAXNAME 40

/* default place for dbm output of makedb (or use -o file at run-time) */
#define	ALIASDB	"/usr/spool/news/lib/palias"

/* location of the printit phase of this program */
#define PRINTIT "/usr/lib/news/printit"
#define PRINTITC "/usr/lib/news/printit -c"

/**************************************************************************
 *									  *
 * +--------------------------------------------------------------------+ *
 * |									| *
 * |			END OF CONFIGURATION SECTION			| *
 * |									| *
 * |				EDIT NO MORE				| *
 * |									| *
 * +--------------------------------------------------------------------+ *
 *									  *
 **************************************************************************/

#ifdef MAIN
#ifndef lint
static char	*c_sccsid = "@(#)config.h	8.1 (down!honey) 86/01/19";
#endif /*lint*/
#endif /*MAIN*/

#ifdef STRCHR
#define index strchr
#define rindex strrchr
#else
#define strchr index
#define strrchr rindex
#endif

#ifdef BZERO
#define strclear(s, n)	((void) bzero((s), (n)))
#else /*!BZERO*/

#ifdef MEMSET
extern char	*memset();
#define strclear(s, n)	((void) memset((s), 0, (n)))
#else /*!MEMSET*/
extern void	strclear();
#endif /*MEMSET*/

#endif /*BZERO*/

extern char	*malloc();
extern char	*strcpy(), *index(), *rindex();
SHAR_EOF
fi
exit 0
#	End of shell archive
-- 
Larry Campbell                                The Boston Software Works, Inc.
Internet: campbell@maynard.bsw.com          120 Fulton Street, Boston MA 02109
uucp: {alliant,wjh12}!maynard!campbell              +1 617 367 6846
ARPA: campbell%maynard.uucp@harvisr.harvard.edu      MCI: LCAMPBELL