bimbart@kulcs.uucp (Bart Demoen) (10/27/89)
At NACLP'89, some people asked why we choose hashing for dynamic indexing over binary search: they argued that binary trees are easier to update and that hashing could outperform binary search only for really big tables. The former is difficult to prove or disprove, but I felt compelled to test the latter. I implemented a builtin predicate bin_index/1, similar to the rehash/2 builtin (of BIM_Prolog), but which 'binary indexes' a chosen - dynamic - predicate and I did some testing. The test program looks essentially like this test(N) :- M is 2 ^ N - 1 , assert_facts(M)) , % asserts fact(1) . fact(2) . ... fact(M) . rehash(fact/1,M) , % no collisions because of this choice time(call_fact(M),Thash) , bin_index(fact/1) , % the binary tree is perfectly balanced time(call_fact(M),Tbin) , write(Thash) , nl , write(Tbin) . call_fact(0) :- ! . call_fact(N) :- fact(N) , M is N - 1 , call_fact(M) . The results for different N are summarised in the table below. On purpose, neither the machine, nor the repetition factor of the test is given. N #facts hash-success bin-success overhead of the test harness 2 3 0.17 0.17 0.1 3 7 0.37 0.36 0.19 4 15 0.75 0.78 0.37 5 31 1.51 1.62 0.72 6 63 3.03 3.43 1.44 7 127 6.08 7.12 2.87 The table shows - as expected - that hash-success and overhead double as the number of facts doubles, but that bin-success more than doubles. But it also shows that hashing is a reasonable choice even for a small number of facts, while binary indexing might be unreasonable for a large number of facts. So, I feel we have made the correct choice. I only worry about possible comments coming from the southern hemisphere ... I have similar results for hash-fail and bin-fail - where calls to fact(0.5), fact(1.5), ... , fact(N+0.5) are made (yes, the type of an integer and a float are the same internally in BIM_Prolog, so that this was a reasonable test). Just to take away some suspicion you might have: hashing and binary search were both implemented in C, with the same relevant variables declared as registers and in the same style of coding. I am interested in similar or contradicting experiences. This mail expresses my personal opinion and experience, not BIM's nor KUL's ! bimbart@kulcs.uucp