[comp.databases] FORCE

awd@dbase.A-T.COM (Alastair Dallas) (04/11/90)

The original version of Clipper produced an .exe file, all right, but
it consisted largely of pseudo-code.  To some extent, you have to use
pseudo-code because of the dBASE language's late binding of identifiers.
What am I talking about?  Consider:

	Name = "Jones"
	Fname = space(8)
	@ 10,10 GET Fname
	READ
	USE (Fname)     && or, USE &Fname

	? Name

Ok, this compiles into:

	<assign literal "Jones" to memvar Name>
	<call space() with arg 8>
	<assign result to memvar Fname>
	<get, read>
	<call use() with arg memvar Fname>
	<call ?() with arg.... with arg.... uh...>

You don't know what you're printing.  Is it the memvar called Name that
you dealt with a minute ago, or is it the field in the file that you
just put into Use?  
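Purely as illustration (this is Python standing in for the runtime, with invented dictionaries for the table and memvar stores, not Clipper's actual mechanism), the ambiguity comes down to a lookup rule that can only run at execution time:

```python
# Hypothetical sketch of dBASE-style late binding: a field in the table
# currently in USE shadows a memory variable of the same name, so the
# meaning of "? Name" depends on what USE has done by the time it runs.

def resolve(name, open_table_fields, memvars):
    """Late binding: prefer a field of the open table, else a memvar."""
    key = name.upper()
    if key in open_table_fields:          # field in the file just USEd
        return open_table_fields[key]
    if key in memvars:                    # fall back to the memory variable
        return memvars[key]
    raise NameError(f"Variable does not exist: {name}")

memvars = {"NAME": "Jones"}               # Name = "Jones"
no_table = {}                             # before USE: no fields visible
print(resolve("Name", no_table, memvars))             # -> Jones (the memvar)

# After USE opens a DBF that happens to contain a NAME field,
# the very same "? Name" statement prints the field instead:
table_fields = {"NAME": "Smith"}
print(resolve("Name", table_fields, memvars))         # -> Smith (the field)
```

No compiler can decide between those two branches at compile time without knowing every DBF the program might open.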

To avoid some sort of pseudo-code in this and similar situations, you
can just say "the user can't open arbitrary DBF files--we have to 'see'
each DBF at compile time."  Since Clipper doesn't have this restriction,
nor many restrictions on macro usage, it seems likely that some binding
is still delayed until runtime.  If you delay binding indefinitely, you
have an interpreter; the more binding you do up front, the more 
compilerish you are.

As I said, the first version of Clipper compiled code that looked
something like:

	CALL	INTERPRET
	DB	0F3h, 02h, 54h, ...

which is a clever way of using pseudo-code and still calling yourself
an .exe compiler.  I understand, however, that subsequent versions
have gotten more sophisticated and granular.  My guess is that today's
Clipper is more like:

	CALL	EVAL_ADDR
	DB	0F3h, 02h
	PUSH 	AX
	CALL	ASSIGN_MVAR

and so forth.  Still late-binding of identifiers, but less interpreted
than the 1.0 version.
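The difference between those two code shapes can be mocked up in a few lines of Python (the opcodes and helper names here are invented for the sketch; they are not Clipper's actual ones):

```python
# Toy contrast between the two compilation styles described above.
# All opcode numbers and helper names are invented for this sketch.

# Style 1 (Clipper 1.0-ish): the "compiled" output is just a byte
# stream; a single interpreter loop dispatches every operation
# through a jump table.
def interpret(pcode, env, stack):
    jump_table = {
        0x01: lambda arg: stack.append(arg),                  # push literal
        0x02: lambda arg: stack.append(env[arg]),             # push variable
        0x03: lambda arg: env.__setitem__(arg, stack.pop()),  # assign memvar
    }
    for opcode, arg in pcode:
        jump_table[opcode](arg)

env = {}
interpret([(0x01, "Jones"), (0x03, "Name")], env, [])  # Name = "Jones"
print(env["Name"])  # -> Jones

# Style 2 ("more compilerish"): the same program emitted as direct
# calls into runtime helpers; only identifier lookup stays late-bound.
def eval_literal(value):
    return value

def assign_mvar(env, name, value):
    env[name] = value        # name still resolved at run time, by string

env2 = {}
assign_mvar(env2, "Name", eval_literal("Jones"))
print(env2["Name"])  # -> Jones
```

Both styles end in the same place; style 2 just replaces the dispatch loop with inline calls, which is exactly the "less interpreted" middle ground.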

For the record, I'm told A-T's Professional Compiler (with which I
am not at all involved) will be even "closer to the machine," using
a variety of code-global optimizations.  I heard a rumor at one point
that you'd be able to tell it "these are the only DBFs I expect to 
deal with" and it would optimize to a fare-thee-well, or you could
say "stay loose" and it wouldn't.  If I knew anything, I couldn't
talk about it, so don't listen to me...

Oh, yes, and back to the subject: What I've heard about FORCE is
that it punts on a lot of the dBASE language compatibility issues,
such as late binding, and in exchange gives you the highly optimized
code that A-T's compiler will produce when the right switches are
thrown.  Only with FORCE, there's no compatibility switch.  But this
is all rumor and hearsay--please correct me if I'm wrong.

/alastair/

Disclaimer> I'm speaking for myself, as usual, not my employer.

jbrown@herron.uucp (Jordan Brown) (04/11/90)

In article <23925@usc.edu>, atieu@skat.usc.edu (Anthony Tieu) writes:
> Does Clipper produce native code or pseudo code which is executed
> by a runtime interpreter?

My understanding is that it produces pseudocode.  However, the pseudocode
is bundled with the interpreter into a single .EXE, so it's transparent
to the programmer and the user.

> I have always thought that Clipper produces .exe files.

Yes it does.


I don't consider it to be a failing that it uses pseudocode.  I think
FORCE is the only dBASE-language product that produces native code,
or claims to.

Most dBASE statements *can't* be turned into native code, at least not
without all kinds of high-tech compiler technology.  Even a simple statement
like
	c = a+b
can't.  Why not?

1)  What types are a and b?  If they're numbers, then you need to do an add.
If they're strings, a concatenation.  If they're a date and a number, you
might have to do a different kind of add depending on how dates are
represented.  (For that matter, they may be different types the next time
through...)

2a)  If you're using BCD for numbers (as Vulcan and, I believe, dBASE IV do)
you're going to have to do a subroutine call anyway; who cares whether the
subroutine call is done directly or out of a pseudocode interpreter's
jump table?

2b)  If they're strings, you have to do non-trivial memory handling to
build up the result, and are going to be making several subroutine calls.
See 2a.
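A minimal sketch of point 1, with Python standing in for the runtime (the date rule modeled here, date + number = date offset in days, is the dBASE convention; the function name is invented):

```python
# Why "c = a + b" can't compile down to one machine ADD: the operation
# is selected at run time from the operands' current types.

from datetime import date, timedelta

def dbase_add(a, b):
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a + b                      # numeric add
    if isinstance(a, str) and isinstance(b, str):
        return a + b                      # string concatenation
    if isinstance(a, date) and isinstance(b, (int, float)):
        return a + timedelta(days=b)      # date arithmetic, in days
    raise TypeError("operands of incompatible type")

print(dbase_add(2, 3))                    # -> 5
print(dbase_add("Jo", "nes"))             # -> Jones
print(dbase_add(date(1990, 4, 11), 30))   # -> 1990-05-11
```

Every "compiled" add is really a call into a routine like this, which is why the native-versus-pseudocode distinction buys less than you'd think.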


A native code implementation can certainly be done; it just has to do
most things using subroutine calls.  The question then becomes how to make
the tradeoffs...

- native code is probably faster.
- native code may well be bigger.  (note that for macros and dBASE indexes
you still need a full expression interpreter...)
- native code works only on one processor type.
- it's harder to implement a virtual memory scheme when using native code.
- native code *may* make it easier to integrate routines in other languages.


The dBASE language is just not well suited to compilation.

An aside, vaguely related...

Alastair, here's a thought problem for you.  Consider the following sequence
and how it works internally.  Does it produce the same result in dBASE III
and IV?  (I know the Vulcan interpreter and compiler will get different
answers and I think that Clipper will be different from dBASE III...
I just checked FoxBase and it doesn't match dBASE III...)

use somedbf		&& assume it has a field "name"
myname="brown"
set filter to name=myname
private myname
myname="dallas"
list

Now, will you get "brown" records, or "dallas" records?

I suspect that III (and the Vulcan interpreter) will give "brown" records,
and IV, Fox, Clipper, and the Vulcan compiler will give "dallas" records.

I can't see any reasonable way to get the compiled version to dup the
behavior of the interpreted version.  Sigh.  Another example of the
inherent non-compilability of the language.
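The two candidate semantics can be modeled side by side in Python (the scope stack below stands in for dBASE's dynamically scoped PRIVATE variables; which real product does which is only my suspicion above):

```python
# The thought problem, modeled two ways. A PRIVATE variable pushes a
# new binding that shadows the old one for the rest of the call.

records = [{"name": "brown"}, {"name": "dallas"}]

scopes = [{"myname": "brown"}]            # myname = "brown"

def lookup(var):                          # dynamic scoping: newest wins
    for scope in reversed(scopes):
        if var in scope:
            return scope[var]
    raise NameError(var)

# SET FILTER TO name = myname, kept as an expression to re-evaluate:
filter_expr = lambda rec: rec["name"] == lookup("myname")

frozen = lookup("myname")                 # alternative: bind the value now

scopes.append({"myname": "dallas"})       # PRIVATE myname; myname = "dallas"

# LIST, re-resolving "myname" per record (what I'd expect of code that
# looks identifiers up by name at run time):
print([r["name"] for r in records if filter_expr(r)])       # -> ['dallas']

# LIST, with myname's value bound when the filter was set:
print([r["name"] for r in records if r["name"] == frozen])  # -> ['brown']
```

A compiler that emits run-time name lookups gets the first answer for free; reproducing the second would mean snapshotting every memvar a filter expression mentions at SET FILTER time.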
-- 
Jordan Brown
jbrown@jato.jpl.nasa.gov