[comp.lang.modula2] Intermediate format of FP numbers

josef@nixpbe.UUCP (Moellers) (07/18/90)

Hi,
I'm about to start to write a Modula-2 compiler as none of my machines
has one (apart from my ZCPR33 box, but that doesn't count).

I want to build a highly modular, multi-stage compiler, much like Per
Brinch Hansen's Edison compiler published in his book "Programming a
personal computer".

The first stage, lexical analysis, transforms the character string
making up the source file into an intermediate file consisting of tokens
and "arguments" to these tokens. E.g. the fragment
	x := 6;
would be translated into the sequence of tokens and "arguments"
	IDENT x BECOMES INTEGER 6 SEMICOLON
(where x is a number identifying identifier "x", e.g. 5)
which could be stored as the bytes
	1 0 0 0 5 2 3 0 0 0 6 4
	| | | | | | | | | | | +- Token SEMICOLON
	| | | | | | | +-+-+-+--- binary value of integer 6, MSB first
	| | | | | | +----------- Token INTEGER
	| | | | | +------------- Token BECOMES
	| +-+-+-+--------------- binary value of number identifying "x"
	+----------------------- Token IDENT
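
To make the layout concrete, here is a minimal sketch of such an encoder,
written in Python rather than Modula-2 purely for brevity. The token codes
1..4 are the ones from the byte dump; the function name encode and the
choice of 4-byte big-endian arguments are assumptions for illustration only.

```python
import struct

# Token codes as used in the byte dump; any other numbering would do.
IDENT, BECOMES, INTEGER, SEMICOLON = 1, 2, 3, 4

# Tokens that carry an "argument", with the struct format of that argument.
# ">I" / ">i" mean 4 bytes, most significant byte first.
ARG_FORMAT = {IDENT: ">I", INTEGER: ">i"}

def encode(tokens):
    """Encode a list of (code,) or (code, argument) tuples as bytes."""
    out = bytearray()
    for token in tokens:
        code = token[0]
        out.append(code)
        if code in ARG_FORMAT:
            out += struct.pack(ARG_FORMAT[code], token[1])
    return bytes(out)

# x := 6;   with "x" known as identifier number 5
stream = encode([(IDENT, 5), (BECOMES,), (INTEGER, 6), (SEMICOLON,)])
print(list(stream))  # [1, 0, 0, 0, 5, 2, 3, 0, 0, 0, 6, 4]
```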

Question: what format should I use for floating point numbers?
I do not want to lose precision during this phase.
On the other hand, I would like to use some standard format!
Would IEEE double precision be enough?
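
One way to probe the question is a small check of whether a given decimal
literal survives conversion to an IEEE double unchanged. The sketch below is
in Python (not Modula-2) and assumes the host's float is an IEEE double; the
helper name survives_double is made up for illustration.

```python
import sys
from decimal import Decimal

def survives_double(literal):
    """True if converting the decimal literal to a double is exact."""
    # Decimal(float) expands the double's binary value exactly, so the
    # comparison exposes any rounding done by float().
    return Decimal(float(literal)) == Decimal(literal)

# Number of decimal digits a double is guaranteed to preserve
# (15 for an IEEE double).
print(sys.float_info.dig)

for literal in ["6.0", "0.5", "0.1", "9007199254740993"]:
    print(literal, survives_double(literal))
```

Literals such as 0.1 have no exact binary representation, and integers above
2^53 start to lose their low-order digits, so whether double is "enough"
depends on how many digits the target's REAL type is expected to carry.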

--
| Josef Moellers		|	c/o Nixdorf Computer AG	|
|  USA: mollers.pad@nixbur.uucp	|	Abt. PXD-S14		|
| !USA: mollers.pad@nixpbe.uucp	|	Heinz-Nixdorf-Ring	|
| Phone: (+49) 5251 104662	|	D-4790 Paderborn	|

preston@titan.rice.edu (Preston Briggs) (07/20/90)

In article <josef.648309655@peun11> josef@nixpbe.UUCP (Moellers) writes:
>I'm about to start to write a Modula-2 compiler

Sounds good.

>I want to build a highly modular, multi-stage compiler, much like Per
>Brinch Hansen's Edison compiler published in his book "Programming a
>personal computer".

Check also the book "A Concurrent Pascal Compiler for Minicomputers",
by Hartmann (Brinch Hansen's student).  It's published in the Springer
lecture notes in computer science series, #50.


>The first stage, lexical analysis, transforms the character string
>making up the source file into an intermediate file consisting of tokens

>Question: what format should I use for floating point numbers?
>I do not want to lose precision during this phase.
>On the other hand, I would like to use some standard format!
>Would IEEE double precision be enough?

One reasonable way to proceed is to *not* convert the numbers.
Instead, keep the string representation and pass it along from phase
to phase.  This has the advantages of never losing precision,
saving FP arithmetic, making your intermediate code simpler
to debug by humans, and increasing portability.

So, the 1st pass (the lexical analyser) would examine the number and
make sure it has a legal format, but never bother converting it
into an internal form.
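
A minimal sketch of that idea, in Python rather than Modula-2 for brevity.
It assumes the real-literal syntax from Wirth's "Programming in Modula-2"
(digits, a mandatory ".", optional digits, optional "E" scale factor); the
token code REAL = 7 and the name lex_real are hypothetical.

```python
import re

# digit {digit} "." {digit} [ "E" ["+" | "-"] digit {digit} ]
REAL_LITERAL = re.compile(r"[0-9]+\.[0-9]*(E[+-]?[0-9]+)?")

REAL = 7  # hypothetical token code for a real literal

def lex_real(text):
    """Check the spelling of a real literal; pass the string on untouched."""
    if REAL_LITERAL.fullmatch(text) is None:
        raise ValueError("malformed real literal: " + repr(text))
    # No conversion happens here: the token's "argument" is the source text.
    return (REAL, text)

print(lex_real("3.14159265358979323846"))
print(lex_real("1.5E-3"))
```

Every digit the programmer wrote reaches the later passes, and only the code
generator for a particular target has to decide how to convert it.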

--
Preston Briggs				looking for the great leap forward
preston@titan.rice.edu

aubrey@rpp386.cactus.org (Aubrey McIntosh) (07/21/90)

In article <10068@brazos.Rice.edu> preston@titan.rice.edu (Preston Briggs) writes:
>One reasonable way to proceed is to *not* convert the numbers.
>Instead, keep the string representation and pass it along from phase
>to phase.  This has the advantages of never losing precision,
>saving FP arithmetic, making your intermediate code simpler
>Preston Briggs				looking for the great leap forward


I have just written a program, 'ToInteger', that takes a file of numbers
of the form [+|-][digits][.][digits][e|E][integer]
and normalizes them to a simpler form.

1.00E+002    --> 100
9e2          --> 900        (padding with 0)
005          --> 5          (implied decimal point alignment)
10.001e4     --> 100.01e3   (exact digit preservation)
0010.0010e4  --> 100.01e3   (leading/trailing 0 suppression)
0            --> 00         (known bug.)
00           --> 00         (not perverse, however)

I have a structure similar to:
  RECORD
    negative           : BOOLEAN;
    decimalBeforeDigit : CARDINAL;
    exponent           : INTEGER;
    mantissa           : string;  (* string is currently ARRAY [0..max-1] OF CHAR *)
  END;
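
For readers who want to experiment before the Modula-2 source is posted,
here is a rough Python sketch of parsing a number into a record of this
shape. It is not the ToInteger program. In particular, the reading of
decimalBeforeDigit as "index of the mantissa digit that the decimal point
precedes" is an assumption, as are the names parse_number and to_decimal.

```python
import re
from dataclasses import dataclass
from decimal import Decimal

# [+|-][digits][.][digits][e|E][integer]
NUMBER = re.compile(r"([+-]?)([0-9]*)(?:\.([0-9]*))?(?:[eE]([+-]?[0-9]+))?")

@dataclass(frozen=True)
class Number:
    negative: bool
    decimal_before_digit: int  # assumed meaning: point sits before mantissa[i]
    exponent: int
    mantissa: str

def parse_number(text):
    m = NUMBER.fullmatch(text.strip())
    if m is None or not (m.group(2) or m.group(3)):
        raise ValueError("not a number: " + repr(text))
    sign, whole, frac, exp = m.groups()
    whole = whole.lstrip("0")            # leading 0 suppression
    frac = (frac or "").rstrip("0")      # trailing 0 suppression
    exponent = int(exp or "0")
    if not whole:
        # 0.005 becomes mantissa "5" with the point in front: 0.5e-2
        stripped = frac.lstrip("0")
        exponent -= len(frac) - len(stripped)
        frac = stripped
    if not whole and not frac:
        return Number(False, 1, 0, "0")  # every spelling of zero
    return Number(sign == "-", len(whole), exponent, whole + frac)

def to_decimal(n):
    """Exact value of the record, for checking against the input text."""
    digits = tuple(int(d) for d in n.mantissa)
    shift = len(n.mantissa) - n.decimal_before_digit
    return Decimal((1 if n.negative else 0, digits, n.exponent - shift))

print(parse_number("10.001e4"))
print(parse_number("0010.0010e4") == parse_number("10.001e4"))
```

Keeping the digits as a string means no digit is ever rounded away; only the
position of the decimal point and the exponent are adjusted.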

(Is it possible to use vi to write Modula-2? :-) )
As soon as I sort out a little problem with InOut (Logitech 3.3t) and
redirected input, I'll post it.  Help is welcome...

-------------------------------

I'd like to have RealInOut print numbers in engineering format
(exponents restricted to powers of 1000. No, wait, that's multiples of 3...).
The smaller exponents, -21 .. 21, would be replaced with single
letters rather than E+nnn.

e.g.:
 1E-009	--> 1n
 1E-006	--> 1u
 1E-003	--> 1m
 1E+000	--> 1
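
A rough sketch of the intended output format, in Python rather than as a
patch to RealInOut. The function name engineering is made up, "u" stands in
for micro, and the fallback to E+nnn outside -21 .. 21 is just one possible
choice.

```python
from decimal import Decimal

# SI prefix letters for exponents -21 .. 21, in steps of 3.
PREFIXES = {-21: "z", -18: "a", -15: "f", -12: "p", -9: "n", -6: "u",
            -3: "m", 0: "", 3: "k", 6: "M", 9: "G", 12: "T", 15: "P",
            18: "E", 21: "Z"}

def engineering(text):
    """Format a decimal number with an exponent that is a multiple of 3."""
    value = Decimal(text)
    if value == 0:
        return "0"
    sign, digits, exp = value.as_tuple()
    # adjusted() is the exponent of the leading digit; floor division
    # rounds it down to a multiple of 3, also for negative exponents.
    eng_exp = 3 * (value.adjusted() // 3)
    # Rebuild the number with the exponent shifted; no digit is changed.
    scaled = Decimal((sign, digits, exp - eng_exp))
    body = format(scaled, "f")
    if eng_exp in PREFIXES:
        return body + PREFIXES[eng_exp]
    return f"{body}E{eng_exp:+04d}"

for s in ["1E-009", "1E-006", "1E-003", "1E+000", "4.7E+003", "1E-099"]:
    print(s, "-->", engineering(s))
```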

Has anyone else done this?  Would people want the diffs?
What's smaller than femto and atto?
Are there suggestions for large exponents, e.g. E-99?



-- 
Aubrey McIntosh  	"Find hungry samurai." -- The Old Man        
1502 Devon Circle       comp.os.minix, comp.lang.modula2         
Austin, TX 78723 
1-(512)-452-1540  (v)