johnl@ima.UUCP (Compilers mailing list) (09/05/86)
I would like to know if anyone has heard of the following algorithm, or knows how to write it, or has any ideas about how it could be written. Say you have N numbers, {X}, which you know ahead of time, and you want to assign them the integers {1-N} in any order, but in a unique fashion. Numbers not in {X} can be assigned anything, although it would be nice if they were assigned to, say 0, or N+1. This is nice for parse tables, or for case statements where the {X} cover a large spread and the compiler uses an if-else tree. What I want is a program which writes the above program. The subroutine: alg(Y) for( i = {1-N} ) if( Y = X[i] ) return i is of order N operations, probably 3 or 4 N. Better is a binary tree: alg(Y) if( Y = X[N/2] ) return N/2 else if( Y < X[N/2] ) if( Y = X[N/4] ) return N/4 else if( Y < X[N/4] ) ... else if( Y > X[N/2] ) if( Y = X[3N/4] ) return 3N/4 ... which is of order 2logN operations, but longer. Also, there is the lookup table of size max{X}. Yuk. None of these are optimal. Now we get to the meat of the problem. I bet that there is an algorithm which will find a set of binary operations, such as binary and, or, not, xor, etc, which, when handed a Y, will return {1-N}. This would be a black box with about logN operations, and would be optimal for both speed and space. For example: X = { 0, 3, 5 }. then alg(Y) return X & 3 will map ( 0 -> 0 ), ( 3 -> 3 ), ( 5 -> 1 ). 2 was skipped, but oh well. Anyway, any ideas? Brian Sutin Department of Astrophysical Sciences/Princeton University { princeton, ... }!astrovax!sutin [This problem is known as perfect hashing. The traditional application is taking a token from a program you're compiling and seeing if it's one of the reserved words. More formally, a perfect hash function for a set of tokens maps exactly one of the tokens into each of a set of hash buckets. Computing perfect hash functions is hard -- it takes minutes of computer time on something like a PDP-11. Once you have the hash function, using it is O(1) because you take a token, hash it, compare it to the token that belongs in the bucket hashed to, and either it's that token or it's not in your set. There has been a fair amount written on both the form of perfect hash functions and methods of computing them, typically by fitting some sort of polynomial. Do any of you readers know of good references? I can't find any right now. -John] -- Send compilers mail to ima!compilers or, in a pinch to Levine@YALE.EDU Plausible paths are { ihnp4 | decvax | cbosgd | harvard | yale | bbncca}!ima Please send responses to the originator of the message -- I cannot forward mail accidentally sent back to compilers. Meta-mail to ima!compilers-request
johnl@ima.UUCP (Compilers mailing list) (09/08/86)
Cc: >There has been a fair amount written on both the form of perfect hash functions >and methods of computing them, typically by fitting some sort of polynomial. >Do any of you readers know of good references? I can't find any right now. There is a very good (but short) article on reciprocal hashing (a form of perfect hashing) by G. Jaeschke in Comm ACM, 24,12 (Dec. 1981). The article explains advantages of using the reciprocal method to obtain perfect hash functions and briefly mentions other methods. References from this article that may also be useful are: (edited) 1. Cichelli, R.J. Minimal perfect hash functions made simple. Comm. ACM, 23, 1 (Jan 1980), 17-19. 2. Jaescjke, G. Minimal Perfect Hashing. Tech Rept. TR 80.02.0003 IBM Heidelberg Scientific Center, Niederlassung Heideberg, Germany. 3. Jaeschke, G., Osterburg, G. On Cichelli's minimal perfect hash functions method. Comm. ACM, 23, 12 (Dec 1980), 728-729. 4. Knott, G.D. Hashing Functions, Computer J., 18 (Aug 1975), 265-278 5. Sprugnoli, R. Perfect hashing functions: a single probe retrieving method for static sets. Comm. ACM, 20, 11 (Nov 1977), 841-850. Paul Guthrie ihnp4!ihdev!pdg -- Send compilers mail to ima!compilers or, in a pinch to Levine@YALE.EDU Plausible paths are { ihnp4 | decvax | cbosgd | harvard | yale | bbncca}!ima Please send responses to the originator of the message -- I cannot forward mail accidentally sent back to compilers. Meta-mail to ima!compilers-request