[net.lang.c] Re: Identifier significance CHALLENGE

davidson@ihlpf.UUCP (12/08/83)

#R:sdcsvax:-6000:ihlpf:18400006:000:840
ihlpf!dap1    Dec  7 10:19:00 1983

It sounds like you are asking for a hashing function from the set of
(Berkeley identifiers) to the set of (Standard C identifiers).  There are
ways to do this, but none of them that I know of would lend themselves very
well to this particular situation.  You could perform normal hashing to get
a number from 1 to 10^8 - 1 and then prepend an X to the front, but your
identifiers won't be very identifiable.  I don't see how you are going to
create a utility that can come up with very reasonable choices when choosing
between identifiers such as "thisthing" and "thisthinghere".  It seems like
the best you could do is have a utility to identify any identifiers longer
than 8 characters, inquire for the new name from the user and then do the
replacement throughout.  I would think this could be accomplished in YACC.
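
As a rough illustration of the "hash to a number and prepend an X" idea,
here is a minimal sketch.  The function name hash_ident, the constant
SHORT_LEN and the multiply-by-33 hash are all assumptions made for the
example, not anything from the post.  The sketch uses seven digits rather
than eight so that the X plus the digits fits in eight characters.  Two
different names can still hash to the same result, so a real tool would
have to check for collisions as well.

```c
#include <assert.h>
#include <stdio.h>

#define SHORT_LEN 8   /* significant characters assumed for the target compiler */

/*
 * Hypothetical helper: map an identifier of any length to 'X' followed
 * by seven decimal digits, giving exactly SHORT_LEN characters.
 * out must have room for SHORT_LEN + 1 bytes.
 */
void hash_ident(const char *name, char out[SHORT_LEN + 1])
{
    unsigned long h = 5381;   /* arbitrary starting value */

    assert(name != NULL && out != NULL);

    /* Multiply-by-33 string hash, reduced modulo 10^7 at every step so
     * the intermediate value always fits in 32 bits. */
    while (*name != '\0')
        h = (h * 33 + (unsigned char)*name++) % 10000000UL;

    snprintf(out, SHORT_LEN + 1, "X%07lu", h);
}
```

Calling hash_ident("thisthinghere", buf) fills buf with an X and seven
digits, which shows the drawback mentioned above: the result says nothing
about the original name.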

Darrell Plank
BTL-IH

thomas@utah-gr.UUCP (Spencer W. Thomas) (12/08/83)

It is my understanding that some clever folks at Georgia Tech have
implemented such a program for mapping N-character Ratfor identifiers to
6-character FORTRAN identifiers.  I don't know the details, and it may
come up with G12345 type identifiers, but it HAS been done.

(This is my memory of a talk given at the Software Tools conference in
Boulder (1980?).)
=Spencer

ags@pucc-k (Seaman) (12/12/83)

One way to handle the hashing problem is to resolve collisions by permuting
the last character, if necessary, after truncating.  For example, suppose
you want a maximum identifier length of 6 characters and the original program
contains (in this order)

	thisisalongidentifier
	thisisanotherlongidentifier
	thisisyetanotherlongidentifier

These would be shortened as follows:

	thisisalongidentifier		->	thisis
	thisisanotherlongidentifier	->	thisit
	thisisyetanotherlongidentifier  ->	thisiu
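
The truncate-and-permute scheme above could be sketched roughly as
follows.  The names shorten, taken, MAXLEN and MAXIDS are hypothetical,
the table is a plain linear list, and stepping only through lowercase
letters is an assumption of this sketch.  It expects to be called once
per distinct identifier; a real tool would also register the program's
existing short names first and remember each mapping for the later
replacement pass.

```c
#include <assert.h>
#include <string.h>

#define MAXLEN 6      /* target identifier length, as in the example */
#define MAXIDS 256    /* arbitrary table size for this sketch */

static char used[MAXIDS][MAXLEN + 1];   /* short names handed out so far */
static int nused;

/* Return 1 if s has already been handed out, 0 otherwise. */
static int taken(const char *s)
{
    int i;

    for (i = 0; i < nused; i++)
        if (strcmp(used[i], s) == 0)
            return 1;
    return 0;
}

/*
 * Truncate name to MAXLEN characters.  On a collision, step the last
 * character through the lowercase alphabet until the result is unused.
 * Returns 0 on success, -1 if the table is full, the name is empty,
 * or all 26 attempts collide.
 */
int shorten(const char *name, char out[MAXLEN + 1])
{
    size_t n;
    int tries;
    char *last;

    assert(name != NULL && out != NULL);
    if (nused >= MAXIDS)
        return -1;

    strncpy(out, name, MAXLEN);
    out[MAXLEN] = '\0';
    n = strlen(out);
    if (n == 0)
        return -1;
    last = &out[n - 1];

    for (tries = 0; tries < 26 && taken(out); tries++)
        *last = (*last >= 'a' && *last < 'z') ? (char)(*last + 1) : 'a';

    if (taken(out))
        return -1;
    strcpy(used[nused++], out);
    return 0;
}
```

Fed the three identifiers above in order, this yields thisis, thisit
and thisiu.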



				Dave Seaman
				..!pur-ee!pucc-k!ags

chris@umcp-cs.UUCP (12/14/83)

Look at your K&R book.  Chapter 2, p. 33:

	"Only the first eight characters of an internal name are
	significant, although more may be used."

(They also mention that external names (i.e. globals) may have
less than 8 significant characters.)

So, a suggestion:

*Prepend* a unique identifier to those which are not already
unique.  For example, you might replace

	long_name_function () ...

with

	_1_long_name_function () ...

if "long_na" (or whatever the name becomes after compilation) is
not unique.  The next occurrence might have "_2_" prepended.

The nice thing about this is that the original names are easy to
find (and, if necessary, recreate).
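
A minimal sketch of the prefixing idea, assuming eight significant
characters.  The names uniquify, is_seen, SIG and MAXIDS are hypothetical.
This version leaves the first name of a clashing group alone and numbers
only the later ones, which is one possible reading of the suggestion.  It
only chooses the new names; replacing every use in the source is left to
the caller.

```c
#include <stdio.h>
#include <string.h>

#define SIG 8         /* significant characters, per the K&R quote */
#define MAXIDS 256    /* arbitrary table size for this sketch */

static char seen[MAXIDS][SIG + 1];   /* significant prefixes in use */
static int nseen;
static int counter;                  /* last number used in a "_N_" prefix */

/* Return 1 if the significant prefix key is already in use. */
static int is_seen(const char *key)
{
    int i;

    for (i = 0; i < nseen; i++)
        if (strcmp(seen[i], key) == 0)
            return 1;
    return 0;
}

/*
 * Copy name to out.  If its first SIG characters clash with an earlier
 * name, prepend "_N_" with the next unused N instead.
 * Returns 0 on success, -1 if the table is full or out is too small.
 */
int uniquify(const char *name, char *out, size_t outsize)
{
    char key[SIG + 1];
    int n;

    if (nseen >= MAXIDS)
        return -1;

    n = snprintf(out, outsize, "%s", name);
    for (;;) {
        if (n < 0 || (size_t)n >= outsize)
            return -1;
        strncpy(key, out, SIG);
        key[SIG] = '\0';
        if (!is_seen(key))
            break;
        n = snprintf(out, outsize, "_%d_%s", ++counter, name);
    }
    strcpy(seen[nseen++], key);
    return 0;
}
```

Given long_name_function, long_name_variable and long_name_table in that
order, the sketch returns long_name_function, _1_long_name_variable and
_2_long_name_table, so the original spelling is still visible after the
prefix.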
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci
UUCP:	{seismo,allegra,brl-bmd}!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris.umcp-cs@CSNet-Relay