ok@edai.UUCP (Richard O'Keefe) (04/01/84)
I've been asked to justify my contention that Prolog interpreters using tags are likely to be slower than Prolog interpreters which, like C-Prolog, use a few partitions for their type tests. C-Prolog in fact can do almost all its type testing by comparing against two boundaries (which can be kept in registers a lot of the time) and against 0. The basis of my claim is that I've looked at the code produced by the C compiler in both kinds of interpreters, and I've tried to hand-code unification routines in assembler.

There is a very important loop that crops up all over the place in a Prolog interpreter:

	a = <a pointer>;
	while (a <is a bound reference>) a = <contents of a>;

This is generally followed by a switch with

	case <a is unbound>  : ...
	case <a is atomic>   : ...
	case <a is compound> : ...

which takes two comparisons or their equivalent in either case. I can get this loop down to 4 instructions using C-Prolog's data representation, but with *byte* tags I can only get it down to 5 whichever end I put the tag. The VAX code is 11 bytes -vs- 19. The number of instructions would be the same on an M68000. Of course the skill of the programmer is a greater factor.
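For concreteness, here is a minimal C sketch of what a partition-based dereference might look like. The names (deref, var_lo, var_hi), the exact partition layout, and the convention that an unbound cell holds 0 (suggested by the "comparing against 0" remark above) are assumptions for illustration, not C-Prolog's actual source.

	/* One 32-bit word per cell; a word is a reference iff it falls
	 * inside the [var_lo, var_hi) partition.  All names assumed.
	 */
	typedef unsigned long cell;

	static cell var_lo, var_hi;    /* partition boundaries, set at start-up;
	                                  in practice often cached in registers */

	#define IsRef(w)     ((w) >= var_lo && (w) < var_hi)   /* two compares  */
	#define Contents(w)  (*(cell *)(w))

	cell deref(cell a)
	{
	    while (IsRef(a) && Contents(a) != 0)   /* bound reference?          */
	        a = Contents(a);
	    return a;                  /* now unbound, atomic, or compound      */
	}

The switch that follows can then classify the dereferenced word with the same two boundary comparisons plus the test against 0.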
andrewt@basser.SUN (Andrew Taylor) (04/03/84)
I cannot agree that a partitioned scheme is faster than tagging.
edai!ok claims that the loop for de-referencing a variable
is faster in a partitioned scheme:
{
I can get this loop down to 4 instructions using C-Prolog's data
representation, but with *byte* tags I can only get it down to
5 whichever end I put the tag. The VAX code is 11 bytes -vs- 19.
The number of instructions would be the same on an M68000.
}
I use a tagged scheme. The compiled VAX code for the aforesaid loop
involves only 3 instructions, 8 bytes. It looks like this:
	        jbr     L1
	L2:     movl    4(r11),r11
	L1:     tstl    (r11)
	        jeql    L2
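The VAX code implies a two-word cell: the first longword holds the tag (with 0 apparently meaning "reference", hence the tstl/jeql), and the second, at offset 4, holds the value. A minimal C rendering of the same loop, with the tag encoding and names assumed rather than taken from Andrew Taylor's source:

	/* Two-word tagged cell; tag 0 = reference (inferred from the
	 * tstl/jeql above), other tag values assumed.
	 */
	struct cell {
	    long         tag;           /* longword at (r11)             */
	    struct cell *val;           /* longword at 4(r11)            */
	};

	#define TAG_REF 0L

	struct cell *deref(struct cell *a)
	{
	    while (a->tag == TAG_REF)   /* tstl (r11); jeql L2           */
	        a = a->val;             /* movl 4(r11),r11               */
	    return a;
	}

The tag test is a single compare against zero, but each cell now occupies 8 bytes rather than 4, which is the trade-off Fernando Pereira raises below.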
A tagged scheme has other advantages. The unification routine can be
table-driven. This is, I think, faster and it certainly is much cleaner (a sketch follows below).
A partition scheme (or at least the C-Prolog version thereof) involves
horrible, heavily machine-dependent manipulations so that the bit patterns
for integers, floats and pointers can be distinguished.
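As a rough illustration of what a table-driven unification routine could look like (the tag values, handler names, and table contents here are assumptions, not Andrew Taylor's code; trailing of bindings and the recursion over compound terms are elided):

	enum tag { REF, UNBOUND, ATOMIC, COMPOUND };
	#define NTAGS 4

	struct cell { enum tag tag; struct cell *ptr; long atom; };

	typedef int (*unify_fn)(struct cell *, struct cell *);

	static int bind_first(struct cell *a, struct cell *b)   /* bind a to b */
	{ a->tag = REF; a->ptr = b; return 1; }        /* trail entry omitted  */

	static int bind_second(struct cell *a, struct cell *b)
	{ return bind_first(b, a); }

	static int same_atom(struct cell *a, struct cell *b)
	{ return a->atom == b->atom; }

	static int unify_args(struct cell *a, struct cell *b)
	{ return 0; /* placeholder: compare functors, recurse over arguments */ }

	static int fail(struct cell *a, struct cell *b)
	{ return 0; }

	static unify_fn table[NTAGS][NTAGS] = {
	/*  a\b           REF  UNBOUND      ATOMIC      COMPOUND    */
	/* REF      */  { 0,   0,           0,          0          },  /* never reached */
	/* UNBOUND  */  { 0,   bind_first,  bind_first, bind_first },
	/* ATOMIC   */  { 0,   bind_second, same_atom,  fail       },
	/* COMPOUND */  { 0,   bind_second, fail,       unify_args },
	};

	int unify(struct cell *a, struct cell *b)   /* a, b already dereferenced */
	{
	    return (*table[a->tag][b->tag])(a, b);
	}

One dispatch replaces the nested case analysis on both arguments' types; adding a new type means adding a row and a column rather than touching every branch.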
Using a tagged scheme the various storage areas (stacks) can be small
when created and "grown" to whatever size is necessary (a sketch follows below). This is harder
to do in a partitioned scheme. The C-Prolog approach of initially allocating
~1 megabyte is totally impractical in our environment. We have a swapping
(non-virtual memory) system with 4Mb of memory and sometimes 20-30 students
using Prolog.
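A sketch of why growing is easier with tags (the names, tag values, and use of realloc are assumptions): when an area moves, the cells holding pointers can be recognised by their tags alone, so they can be adjusted without knowing anything about the bit patterns or area boundaries in advance.

	#include <stdlib.h>

	enum tag { REF, UNBOUND, ATOMIC, COMPOUND };

	struct cell { enum tag tag; char *ptr; };          /* two-word cell */

	struct area { struct cell *base; long used, size; };

	int grow_area(struct area *s)                      /* double the area */
	{
	    struct cell *oldb = s->base;
	    struct cell *newb = realloc(oldb, 2 * s->size * sizeof *newb);
	    long i, delta;

	    if (newb == NULL)
	        return -1;
	    delta = (char *)newb - (char *)oldb;
	    if (delta != 0) {
	        /* adjust every pointer-bearing cell that pointed into the
	         * old block; pointers kept in other areas that point into
	         * this one would need the same treatment (elided here)   */
	        for (i = 0; i < s->used; i++)
	            if ((newb[i].tag == REF || newb[i].tag == COMPOUND)
	                && newb[i].ptr >= (char *)oldb
	                && newb[i].ptr <  (char *)(oldb + s->size))
	                newb[i].ptr += delta;
	    }
	    s->base  = newb;
	    s->size *= 2;
	    return 0;
	}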
-- Andrew Taylor
Pereira%SRI-AI@sri-unix.UUCP (04/16/84)
From: Fernando Pereira <Pereira@SRI-AI>

I think Andrew Taylor missed Richard O'Keefe's point. Richard was talking of term representations where a variable cell occupies a single 32-bit word. Andrew Taylor's code is clearly for a representation with 64 bits per cell. For large programs, this factor of 2 in space more than offsets other considerations. Furthermore, the number of memory accesses is a major (THE major?) factor in the efficiency of a Prolog system. Cells twice as big, twice as many accesses...

Since version 1.4, C-Prolog allows the user to specify the sizes of data areas on the command line. The relocation code needed to expand data areas dynamically is already there, in the function that restores saved states. To ensure that the areas being expanded do not walk over the malloc arena, malloc & friends need to be simulated within C-Prolog. Now if only someone out there would take the hint...

-- Fernando Pereira
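For comparison with the two-word cells above, here is a minimal sketch of the single 32-bit-word representation Pereira is referring to, with a byte tag packed into the word (one of several possible packings; the tag values, names, and the use of 24-bit heap offsets for references are assumptions, not any particular system's code):

	/* One 32-bit word per cell; the top byte carries the tag and the
	 * low 24 bits the value (a heap offset for references).
	 * Extracting the tag and reconstructing the pointer is roughly
	 * what costs the extra instruction O'Keefe mentions for byte tags.
	 */
	typedef unsigned long cell;

	#define TAG_SHIFT   24
	#define TAG_REF     0x00UL
	#define TAG_UNBOUND 0x01UL                 /* assumed encodings   */

	#define TagOf(w)    ((w) >> TAG_SHIFT)
	#define ValueOf(w)  ((w) & 0x00ffffffUL)

	static cell *heap;   /* base of the area references index into;
	                        set when the areas are allocated          */

	cell deref(cell a)
	{
	    while (TagOf(a) == TAG_REF)            /* unbound cells carry
	                                              TAG_UNBOUND, so the
	                                              loop terminates     */
	        a = heap[ValueOf(a)];
	    return a;
	}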