[fa.info-vax] EUNICE Filename Hashing Scheme

info-vax@ucbvax.ARPA (10/26/84)

From: David L. Kashtan <KASHTAN@SRI-IU>

The basic EUNICE hashing algorithm is as follows (this is actually
a simplified version; there is extra code to handle cases such as
the user wanting UNIX filenames truncated to 14 characters):

	int hash = 0;
	int i;			/* loop counter used when encoding the hash */
	char *cp = filename;

	/*
	 *	Run through the filename string and accumulate a hash
	 *	value.
	 */
	while(*cp) {
		hash <<= 3;
		hash |= (hash & (3 << 16)) >> 16;
		hash += (unsigned char)*cp++;
	}
	/*
	 *	Restrict the hash value to 16 bits
	 */
	hash &= 0xffff;

	char *cp1 = Hashed_Filename;
	*cp1++ = 'H';
	*cp1++ = 'S';
	*cp1++ = 'H';
	*cp1++ = '0';
	for(i = 0; i < 4; i++) {
		*cp1++ = 'A' + (hash & 0xf);
		hash >>= 4;
	}
	*cp1++ = 'A';	/* This may be incremented later for collisions */

	The two extensions added later are ".HSH" for data (".DIR" for a
	directory) and ".HSN" for the hash-name <--> real-name translation.
	The .HSN file is a standard VMS text file: the 1st line is the
	real name and the 2nd line is the hashed name.
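
For anyone who wants to experiment with the scheme, here is a minimal
self-contained sketch of the fragment above. It is not the EUNICE source.
The function names (eunice_hash, eunice_hashed_name, eunice_with_ext) and
the buffer sizes are my own assumptions, and I use an unsigned type so the
shifts cannot overflow a signed int. The fold mask is kept exactly as it
appears in the fragment, and the collision handling and 14-character
truncation are left out.

```c
#include <stdio.h>

/*
 *	Hypothetical helper: accumulate the 16-bit hash of a UNIX
 *	filename using the loop from the post.
 */
static unsigned long eunice_hash(const char *filename)
{
	unsigned long hash = 0;
	const char *cp = filename;

	while (*cp) {
		hash <<= 3;
		/* Fold bits 16-17 back into the low bits (mask as posted) */
		hash |= (hash & (3UL << 16)) >> 16;
		hash += (unsigned char)*cp++;
	}
	/* Restrict the hash value to 16 bits */
	return hash & 0xffffUL;
}

/*
 *	Hypothetical helper: build the 9-character hashed name
 *	("HSH0", four letters A..P, then 'A') into out, which must
 *	hold at least 10 bytes.
 */
static void eunice_hashed_name(const char *filename, char *out)
{
	unsigned long hash = eunice_hash(filename);
	char *cp1 = out;
	int i;

	*cp1++ = 'H';
	*cp1++ = 'S';
	*cp1++ = 'H';
	*cp1++ = '0';
	/* One letter per 4-bit nibble, low nibble first */
	for (i = 0; i < 4; i++) {
		*cp1++ = (char)('A' + (hash & 0xf));
		hash >>= 4;
	}
	*cp1++ = 'A';	/* collision letter; never incremented in this sketch */
	*cp1 = '\0';
}

/*
 *	Hypothetical helper: append an extension such as ".HSH",
 *	".DIR" or ".HSN" to a hashed name.
 */
static void eunice_with_ext(const char *hashed, const char *ext,
			    char *out, size_t outlen)
{
	snprintf(out, outlen, "%s%s", hashed, ext);
}
```

With these definitions the UNIX name "a" hashes to 0x61 and comes out as
HSH0BGAAA, so its data file would be HSH0BGAAA.HSH and its translation
file HSH0BGAAA.HSN.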


Under VMS 4.0, the hashing scheme will remain, but an escape mechanism
that encodes most UNIX filenames as legal VMS 4.0 filenames will be
used (until the filename becomes too long for even VMS 4.0 to handle).
If there is interest, I will be glad to post it.
David
-------