[comp.compilers] virtual Pascal machines

worley@eddie.mit.edu.uucp (Dale Worley) (07/18/88)

   From: djones@megatest.uucp (Dave Jones)
   Subject: pointed pointer questions

   The purpose of the exercise is to define a virtual machine compatible
   on the one side with Pascal, and on the other with any reasonable
   hardware. [...]

   So question number one is, to be safe, how many kinds of pointers
   should the virtual machine have?  If you define them according to
   the Pascal types, you get TEN of them for Pete's sake!

It's worse.  Let's look at the Pascal types in light of the IBM 360 (8
bit bytes, byte addressable, 4 byte word, operands must be aligned)
and (what I remember of the) CDC 6000 (60 bit words, word addressable):

Pascal type       360                       6000

real              4 or 8 bytes, aligned     1 word
int               4 bytes, aligned          18 bits (?), located
                                            in a specific place in
                                            either the upper or
                                            lower half of a word
char              8 bits, ordinary          7 bits, packed 5 (!) to a
                  pointer                   word, weird pointer
                  (I'm told Honeywell MULTICS has an even
                  stranger character pointer.)
bool              packed 8 to a byte        packed 60 to a word
file              some sort of internal structure, well-aligned
pointer           4 bytes, aligned          half a word
array, structure  alignment depends on      word aligned, unless
                  fields in array/struct    it's made of bits
enum              depends on range of enum (see any of the above)

Of course, you can reduce the number of weirdly different storage
requirements if you don't want to pack your datatypes so tightly.  (A
360 user may consider it OK to have one Bool per byte, but a 6000 user
might not like allocating 100,000 Bools one per word!)
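
To put rough numbers on that tradeoff, here is a small sketch of the
arithmetic.  The function name and the one-bit-per-Bool assumption are
mine, for illustration only; the unit sizes (8 bit bytes, 60 bit words)
are the ones given above.

```python
def units_needed(n_items, bits_per_item, unit_bits, packed):
    """Addressable units (bytes or words) needed to store n_items."""
    if not packed:
        return n_items  # one item per addressable unit
    items_per_unit = unit_bits // bits_per_item
    return -(-n_items // items_per_unit)  # ceiling division

N_BOOLS = 100_000

# 360: the addressable unit is an 8 bit byte.
bytes_360_unpacked = units_needed(N_BOOLS, 1, 8, packed=False)
bytes_360_packed = units_needed(N_BOOLS, 1, 8, packed=True)

# 6000: the addressable unit is a 60 bit word.
words_6000_unpacked = units_needed(N_BOOLS, 1, 60, packed=False)
words_6000_packed = units_needed(N_BOOLS, 1, 60, packed=True)

print("360 bytes: ", bytes_360_unpacked, "unpacked,", bytes_360_packed, "packed")
print("6000 words:", words_6000_unpacked, "unpacked,", words_6000_packed, "packed")
```

Unpacked storage costs the 360 user a factor of 8, but the 6000 user
a factor of 60, which is why the same packing decision looks so
different on the two machines.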

The trouble is that although no machine has more than about 3 pointer
types, you can't predict which objects get which pointer types.
(Arrays and structures are even worse, since their alignment depends
on what they are composed of.)

You do have the advantage that you needn't distinguish pointer types
that differ only in alignment requirements (rather than pointer
format), because if you allocate storage correctly, the generated
code will never produce an incorrectly aligned pointer.
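
As a rough illustration of "allocating storage correctly", the sketch
below lays out the fields of a record so that every field offset is a
multiple of that field's alignment.  The function name, the field
tuples, and the choice of natural alignment (alignment equal to size)
for the 360-style example are assumptions of mine, not part of any
particular compiler.

```python
def layout(fields):
    """Assign aligned offsets to fields.

    fields: list of (name, size, align) tuples, all in addressable units.
    Returns (offsets, total_size, record_align).
    """
    offset = 0
    record_align = 1
    offsets = {}
    for name, size, align in fields:
        # Round the running offset up to the next multiple of align.
        offset = (offset + align - 1) // align * align
        offsets[name] = offset
        offset += size
        record_align = max(record_align, align)
    # Pad the record so that consecutive array elements stay aligned.
    total_size = (offset + record_align - 1) // record_align * record_align
    return offsets, total_size, record_align

# Hypothetical 360-style record: a char, an int, and an 8 byte real.
offsets, total_size, record_align = layout(
    [("c", 1, 1), ("i", 4, 4), ("r", 8, 8)]
)
print(offsets, total_size, record_align)
```

Note that the record's own alignment falls out of its fields, which
is the reason arrays and structures cannot be given one fixed
alignment in advance.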

The best way to handle all this is to have each language type
represented by a different type in the virtual machine.  This results
in a proliferation of virtual types, but in any actual implementation
of the virtual machine the virtual types will be collapsed into a
small number of distinct implementations.  Ditto, the virtual op
codes.
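
A minimal sketch of that collapsing, under loudly stated assumptions:
the format names below are invented for illustration, the mapping is
only loosely based on the table above, and the 6000 column assumes an
implementation that gives ints and pointers a full word rather than
packing them into half words.  Arrays, structures, and enums are left
out because their representation depends on what they contain.

```python
# One virtual pointer type per (scalar) language type.
VIRTUAL_POINTER_TYPES = ["real", "int", "char", "bool", "file", "pointer"]

# Hypothetical per-target mapping from virtual pointer type to the
# actual pointer format used on that machine.
TARGETS = {
    "ibm360": {
        "real": "byte_address",
        "int": "byte_address",
        "char": "byte_address",
        "bool": "byte_address_plus_bit",   # packed 8 to a byte
        "file": "byte_address",
        "pointer": "byte_address",
    },
    "cdc6000": {
        "real": "word_address",
        "int": "word_address",             # assumes unpacked ints
        "char": "word_address_plus_char",  # several chars per word
        "bool": "word_address_plus_bit",   # packed 60 to a word
        "file": "word_address",
        "pointer": "word_address",         # assumes unpacked pointers
    },
}

def distinct_formats(target):
    """Set of actual pointer formats the target needs to implement."""
    return set(TARGETS[target].values())

for target in TARGETS:
    formats = distinct_formats(target)
    print(target, len(VIRTUAL_POINTER_TYPES), "virtual ->", len(formats), "actual")
```

The virtual machine definition keeps all six types distinct; each
implementation only pays for the two or three formats it really has.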

   I am tempted to boot the whole idea, and just say there is one
   kind of pointer: a thirty-two bit word.  And we will burn all
   the computers that don't behave that way.

But what if I want to pack bits in an array?  Let's have
bit-addressable computers with 64 bit words!

Dale
[From Dale Worley <think!compass!worley@eddie.mit.edu.uucp>]