[net.lang.pascal] Hashing is not O

joel@decwrl.UUCP (Joel McCormack) (02/09/85)

-     Hashing IS NOT an O(1) operation!
- 
-     Hashing is an O(N)/buckets operation, where many times you can use enough
- buckets to make it very fast.

    This was my original message.  I have now gotten messages from three
individuals telling me, with various arguments about how to do good hashing
and non-collision, that I am wrong.

    I started to answer personally, but the traffic has gotten large enough
that I'm back here on this board.

    Look at pages 104-108 of Knuth, The Art of Computer Programming, Vol. 1.
Section 1.2.11 is titled "Asymptotic Representation."  You cannot place a
restriction on the size of a problem and then use big-O notation.  O notation
specifically addresses unbound functions, and how the function (or algorithm)
behaves dependent upon problem size (for example, number of records in the
dictionary being searched).

    The fact (alluded to in my original message, but evidently not made clear 
enough) that in many practical cases hashing can be used to give a bounded 
(and possibly constant) lookup time for a bounded dictionary has nothing to
do with being O(1).

-- 
- Joel McCormack {ihnp4 decvax ucbvax allegra}!decwrl!joel
		 joel@decwrl.arpa

jeff@gatech.UUCP (Jeff Lee) (02/11/85)

>-     Hashing IS NOT an O(1) operation!
>- 
>-     Hashing is an O(N)/buckets operation, where many times you can use enough
>- buckets to make it very fast.
> 		.
>		.
>		.
>     Look at pages 104-108 of Knuth, The Art of Computer Programming, Vol. 1.
> Section 1.2.11 is titled "Asymptotic Representation."  You cannot place a
> restriction on the size of a problem and then use big-O notation.  O notation
> specifically addresses unbound functions, and how the function (or algorithm)
> behaves dependent upon problem size (for example, number of records in the
> dictionary being searched).
> 
>     The fact (alluded to in my original message, but evidently not made clear 
> enough) that in many practical cases hashing can be used to give a bounded 
> (and possibly constant) lookup time for a bounded dictionary has nothing to
> do with being O(1).
> 

I was originally disagreeing with your statement that hashing is
O(N)/(table size). That's why I looked up several of the more popular
collision handling methods. I agree that looking up an item in a table
is not O(1), but hashing CAN be O(1). Hashing is computing, by some
arithmetic function, the "home" address of (normally) an identifier. There
is no reason that this function cannot be O(1). When it comes time to insert
it into the table, things begin to change but that doesn't mean that the
hashing is not O(1).

They make real good tables, though. A bound of 4-5 probes with a good hash
algorithm and a fixed size, chained overflow collision handling is a lot better
than a binary (figuring 5 probes on a tree would mean only 31 items in the
tree).

-- 
Jeff Lee
CSNet:	Jeff @ GATech		ARPA:	Jeff.GATech @ CSNet-Relay
uucp:	...!{akgua,allegra,rlgvax,sb1,unmvax,ulysses,ut-sally}!gatech!jeff

ndiamond@watdaisy.UUCP (Norman Diamond) (02/11/85)

Insertions in hash tables are not O(1) (period).

Insertions can be done in such a manner that searches and retrievals will
always be O(1).  Some of these insertions will be more expensive than
others.

Searches and retrievals in hash tables can be made O(1), period.
-- 

   Norman Diamond

UUCP:  {decvax|utzoo|ihnp4|allegra|clyde}!watmath!watdaisy!ndiamond
CSNET: ndiamond%watdaisy@waterloo.csnet
ARPA:  ndiamond%watdaisy%waterloo.csnet@csnet-relay.arpa

"Opinions are those of the keyboard, and do not reflect on me or higher-ups."