[comp.lang.c++] Zortech floating point

bright@Data-IO.COM (Walter Bright) (08/14/90)

I'd like to take a moment to explain how Zortech's floating point works,
which I hope will explain the numbers posted previously.

By default, ZTC implements floating point by making function calls to
the library. The library has a run-time switch to use the 8087 or to
emulate it.

There is a compiler switch to generate inline 8087 code, but this code
will not work if no 8087 is present.

Other compilers generate trap instructions that correspond to 8087
instructions. The traps vector to the 8087 emulator. If an 8087 is
present, the emulator patches the trap instruction into the real
8087 instruction, so the next time through, that code runs as inline
8087 code.
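
To make the patching idea concrete, here is a rough simulation. It is
not any vendor's emulator: a real one rewrites the instruction bytes at
the trapping address, while this sketch uses a function pointer as a
stand-in for one call site. Every name in it (fpu_present, call_site,
fmul_trap, and so on) is hypothetical.

```cpp
// Simulation only: a function pointer stands in for one patched call site.
// All names here are hypothetical and come from no real emulator.

typedef double (*fmul_op)(double, double);

bool fpu_present = true;   // stand-in for the emulator's hardware probe
int  trap_count  = 0;      // counts how often the slow trap path runs

// Stand-in for the real 8087 instruction.
double fmul_hw(double x, double y) { return x * y; }

// Placeholder for the emulator's software arithmetic.
double fmul_emulated(double x, double y) { return x * y; }

double fmul_trap(double x, double y);

// The call site starts out as a "trap instruction".
fmul_op call_site = fmul_trap;

double fmul_trap(double x, double y)
{
    ++trap_count;
    if (fpu_present) {
        call_site = fmul_hw;        // patch: later passes skip the trap
        return fmul_hw(x, y);
    }
    return fmul_emulated(x, y);     // no 8087: emulate, trap again next time
}

double multiply(double x, double y)
{
    return call_site(x, y);
}
```

With fpu_present set, only the first call to multiply() pays for the
trap; after that the call site goes straight to fmul_hw.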

The disadvantage of ZTC's scheme is that if an 8087 is present, the code
is slower than inline code (as the numbers in the benchmark indicate).
The advantage is that it is reentrant (of importance in multi-threaded
OS/2 code and in special applications). Also, if no 8087 is present,
ZTC's scheme typically runs faster than the trapping/emulating style.
Therefore, which scheme is 'better' depends on your application and
on the target platform.

If, with ZTC's scheme, you need top speed on all platforms, a way to do
it is to isolate the performance critical floating point code to one
module, and write the module like:

	#if INLINE87
	#define	func	func_87
	#endif

	double func(double x, double y)
	{
		return x * y;
	}

compile it twice (with and without inline 8087 code generation):

	ztc -c -o+space -f -DINLINE87 func -ofunc87.obj
	ztc -c -o+space func

and call it with a runtime switch:

	if (_8087)			// !=0 if 8087 is present
		func_87(x,y);		// call 8087 version
	else
		func(x,y);		// call emulator version

This is discussed in more detail on pg. 224 of Zortech's "C++ Compiler
Reference".
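
If the test on every call is itself a concern, one variation is to pick
the version once and keep a function pointer. The sketch below is a
single-file stand-in under stated assumptions: in a real build dot and
dot_87 would be the same source compiled twice as described above, and
fpu_present is a hypothetical substitute for the library's _8087
variable. The names dot, dot_87 and select_dot are made up for the
example.

```cpp
// Hypothetical stand-ins. In a real build the two functions would be one
// source file compiled twice (default and inline 8087), and fpu_present
// would be replaced by the library's _8087 variable.

int fpu_present = 0;

// Default build: floating point done through library calls.
double dot(const double *a, const double *b, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}

// Inline 8087 build of the same source (body duplicated for this sketch).
double dot_87(const double *a, const double *b, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}

typedef double (*dot_fn)(const double *, const double *, int);

// Choose the version once, e.g. at startup, instead of testing per call.
dot_fn select_dot()
{
    return fpu_present ? dot_87 : dot;
}
```

Note that the whole loop lives inside the twice-compiled module, so the
additions as well as the multiplies get the faster code when an 8087 is
present.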

I suspect that most of the time in the posted benchmarks is spent in
the floating point routines, so they are a fine test of the floating
point implementation but not a good test of C++ performance.