[comp.lang.forth] Why is Postscript not Forth?

dwp@willett.UUCP (Doug Philips) (02/10/90)

In trying to understand the essence of what 'Forth' is, I've been
trying to understand why many Forth supporters don't want to call
PostScript Forth.  Let's set aside two 'easy' arguments:

	1) PostScript is too S-L-O-W.
I'm *not* concerned with particular implementations, I'm interested in
philosophical reasons.  [Unless you can show that PostScript is
*always* *necessarily* going to *have* to be slower than Forth.]

	2) PostScript is BIG.
My response to this is twofold:
	a) There are packaged Forth systems available now that are
	   BIG (FPC and BBL/Abundance spring immediately to mind).
	b) PostScript comes with a lot of the words predefined for
	   doing graphics/page-layout.  This is not part of the
	   language per se, but more the result of starting with
	   something and growing/planning it towards a particular
	   application.

Why do you think that PostScript should/shouldn't be called Forth?

		-Doug

---
Preferred: willett!dwp@gateway.sei.cmu.edu OR ...!sei!willett!dwp
Daily: ...!{uunet,nfsun}!willett!dwp   [in a pinch: dwp@vega.fac.cs.cmu.edu]

sdh@flash.bellcore.com (Stephen D Hawley) (02/11/90)

In article <431.UUL1.3#5129@willett.UUCP> dwp@willett.UUCP (Doug Philips) writes:
>In trying to understand the essence of what 'Forth' is, I've been
>trying to understand why many Forth supporters don't want to call
>PostScript Forth.

Speed of execution of a language is not what is important.  Yes, most
implementations of postscript are pretty slow, and for good reason.  PS
is heavily dependent on floating point arithmetic.  It doesn't have to be,
but the language was deliberately designed not to be constrained for typesetting.
How high will your resolution be?  72 dpi? (NeXT screen, NEWS screen?)
300 dpi? (Laserwriters, QMS Colorscript)  400 dpi? (NeXT printer) 600 dpi
(Varityper)?  900 dpi (Linotronics)  etc etc.  Did you know that the Apple
laserwriter carries enough floating point precision to print text the size
of Cleveland?  Carrying all this extra baggage is what slows PS down, but
that doesn't have to be so.

So is PS Forth?  Well, that's unclear.  Forth is a bizarro language.  It's
not so much a language as a set of loose semantics.  I can put a layer on top
of forth that will make it look like BASIC.  Is BASIC Forth?  People have
mentioned implementations of Scheme layered on top of Forth.  Is Scheme Forth?

Of course you can carry this much further, since every useful computer
language is equivalent to every other.

I put it that, no, PS is not Forth, but merely similar in semantics.

Steve Hawley
sdh@flash.bellcore.com
A noun's a special kind of word.
It's ev'ry name you ever heard.
I find it quite interesting,
A noun's a person place or thing.

ir230@sdcc6.ucsd.edu (john wavrik) (02/11/90)

In <Article 2202 of comp.lang.forth> Doug Philips asks:

# In trying to understand the essence of what 'Forth' is, I've been
# trying to understand why many Forth supporters don't want to call
# PostScript Forth.

The obvious answer is because PostScript is not Forth -- Forth 
programs do not run on Postscript systems and Postscript programs do 
not run on Forth systems. While the two languages have some 
characteristics in common, they are different languages (and the 
people at Adobe claim that their language was conceived before they 
were aware of Forth).   <see footnote below>

The underlying question of "the essence of Forth" is more interesting.
One thing that has become apparent to me during the past year is that
"Forth" is being applied to a wide variety of things.

I.
At one extreme are those who believe that a language is what you print on 
paper or put in a file (as opposed to what a programmer thinks about 
when he writes programs). They see Forth as being distinguished by the 
use of a stack, Reverse Polish syntax, and an interactive environment. 

The two  "Forth written in 'C'"  products I have examined this year 
are both of this type: they have a Forth "look and feel" but lack the 
power and strength that have led Forth users to select Forth. I 
suspect, therefore, that the "essence of Forth" does not lie in the 
use of a stack, RPN, and interactive nature (although these are 
visible characteristics). [I should mention that these products may 
have some benefits, in spite of their shortcomings as versions of 
Forth: they could be useful in providing 'C' programmers a modest 
interactive environment for running their programs -- they are more 
of interest as extensions of 'C' than as implementations of Forth.] 

II.
At the other extreme are those who see Forth as "assembly language done 
right!".  Two major advantages of programming in assembly language are 
the ability to access hardware and the ability to construct languages -- 
both flowing from a very low level interaction with a computer. The main 
disadvantages of assembly language programming are: 

       1.  Dependence on a particular processor
       2.  Lack of an interactive programming environment
       3.  Programming constantly stays at a low level
           (dealing with memory, registers, etc.)

In this point of view, Charles Moore performed a miracle: he showed 
how to define (in software) a computer which can be easily mapped to 
any real processor; and an assembly language for this (virtual) 
computer which is interactive and provides a uniform blend with a high 
level language. Moore combined the advantages of high level and low 
level languages by creating a software pseudo-processor and designing 
an appropriate language for it. He (and his followers) have also 
provided a mechanism to integrate assembly language (CODE) definitions 
into the language where the access to the hardware or host operating 
system provided by the virtual machine is insufficient (or where 
greater speed is needed). 

Forth is a combination (software-realized) processor and language. It 
is probably this that accounts for most of the uses of the language, 
and for most of the subjective sense of its power. Forth is most 
easily understood if it is thought of as analogous to a real 
processor/assembly-language pair -- rather than a conventional 
computer language. 

Assembly language instructions only make sense in the context of the 
architecture of the underlying "chip". Assembly language is understood 
in terms of semantics (what action is to be performed) rather than 
syntax (how something is said). Thus assembly languages are not 
defined by BNF diagrams but by glossaries of actions. 

I feel that the "essence of Forth" will be found in contemplating the 
idea of a virtual computer, the realization of this computer in 
hardware, the blending of a low level language with a high level 
language, and (as a philosophical note) the importance of simplicity.
 
Forth is a unique language. Manufacturers provide us with a wide 
diversity of hardware and operating systems. Forth has provided a 
viable way to retain the power of low level access without substantial
sacrifice of speed and with a great deal of machine independence. The
main idea is to substitute an imaginary (virtual) machine for the real 
machine -- and devise a suitable language for it. [I recommend a 
study of the design of this machine and its language to anyone -- it
is a stroke of brilliance.]

III.
At this point we interject a major problem to be faced by Forth:

Unfortunately, rather than building on the architecture of a single 
software "chip" (virtual machine), some people have decided that 
redesigning the "chip" itself would better suit their needs. As a 
result, the Forth community is now confronted with a variety of Forth-
like assembly languages for a variety of "chips" -- destroying the 
high degree of portability that Forth once enjoyed. 

It is ironic that the Forth community should choose to artificially 
introduce incompatibilities at a higher level -- turning one of the 
most highly portable languages into a chaos.

The ANSI effort does not really address this problem. The team is 
responding by specifying a greatest common denominator of existing 
implementations. The net result is like trying to provide an assembly 
language in which the user is not allowed to know anything about the 
architecture of the chip. 

FOOTNOTE:
My main exposure to Postscript is as a page-description language built in 
to a laser printer. I have not found a free-standing version of the 
language presented as a general computation language -- and my department
is unwilling to let me tie up the laser printer for language experiments.
So this is a summary based on the production of a Postscript driver for
graphics programs.

Postscript is like a Forth application designed for a special purpose 
-- to arrange for the layout of elements on a printed page. It would 
probably not be very difficult to write such an application in Forth. 
Postscript is like Forth in that it uses a stack and Reverse Polish 
syntax, a dictionary, etc. A Forth programmer who needs to make a 
laser printer do something unusual will not find the language hard to 
learn. It does have interesting features (e.g. the stack can contain 
elements of mixed type). 

                                                  John J Wavrik 
             jjwavrik@ucsd.edu                    Dept of Math  C-012 
                                                  Univ of Calif - San Diego 
                                                  La Jolla, CA  92093

dwp@willett.UUCP (Doug Philips) (02/12/90)

In article <19928@bellcore.bellcore.com> sdh@flash.bellcore.com (Stephen D Hawley) writes:
> In article <431.UUL1.3#5129@willett.UUCP> dwp@willett.UUCP (Doug Philips) writes:
> >In trying to understand the essence of what 'Forth' is, I've been
> >trying to understand why many Forth supporters don't want to call
> >PostScript Forth.
> 
> So is PS Forth?  Well, that's unclear.  Forth is a bizarro language.  It's
> not so much a language as a set of loose semantics.  I can put a layer on top
> of forth that will make it look like BASIC.  Is BASIC Forth?  People have
> mentioned implementations of Scheme layered on top of Forth.  Is Scheme Forth?
...
> I put it that, no, PS is not Forth, but merely similar in semantics.

Aha!  I almost got an answer here!  ;-)  Is PostScript merely a layer on top
of Forth, or is it, at heart, a different language?  If it is a different
language, what is the fundamental difference?  Is PostScript's post-fix-"mania"
more Forth-like than Forth?

		-Doug


dwp@willett.UUCP (Doug Philips) (02/12/90)

In message <7228@sdcc6.ucsd.edu> ir230@sdcc6.ucsd.edu (john wavrik) writes:
> In <Article 2202 of comp.lang.forth> Doug Philips asks:
> # In trying to understand the essence of what 'Forth' is, I've been
> # trying to understand why many Forth supporters don't want to call
> # PostScript Forth.
>
> The obvious answer is because PostScript is not Forth -- Forth 
> programs do not run on Postscript systems and Postscript programs do 
> not run on Forth systems. While the two languages have some 
> characteristics in common, they are different languages (and the 
> people at Adobe claim that their language was conceived before they 
> were aware of Forth).   <see footnote below>

I was hoping to avoid this kind of answer.  It leaves one open to the
question:  What about the incompatibilities of the various 'dialects' of
Forth (fig-Forth, Forth-79, Forth-83, etc.)?  What is it about those dialects
which allows their differences to be dismissed, but the differences between
"Forth" and "PostScript" to be important?

> The underlying question of "the essence of Forth" is more interesting.

Yes!

> At one extreme are those who believe that a language is what you print on 
> paper or put in a file (as opposed to what a programmer thinks about 
> when he writes programs). They see Forth as being distinguished by the 
> use of a stack, Reverse Polish syntax, and an interactive environment. 
> 
> The two  "Forth written in 'C'"  products I have examined this year 
> are both of this type: they have a Forth "look and feel" but lack the 
> power and strength that have led Forth users to select Forth. I 
> suspect, therefore, that the "essence of Forth" does not lie in the 
> use of a stack, RPN, and interactive nature (although these are 
> visible characteristics).

Ok, I'm confused.  If, as you seem to imply here, Forth is 'what a programmer
thinks about when he writes programs', how does the underlying implementation
language make one whit of a difference, so long as it provides access to the
hardware in an equally convenient way?

> II.
[Wavrik states here the view that "Forth is 'assembly language done right!'"
and points out problems with traditional assembly language(s):
Dependence on a particular processor.  Lack of an interactive programming
environment.  Programming constantly stays at a low level (dealing with
memory, registers, etc.)]
> In this point of view, Charles Moore performed a miracle:
> ... Moore combined the advantages of high level and low 
> level languages by creating a software pseudo-processor and designing 
> an appropriate language for it. He (and his followers) have also 
> provided a mechanism to integrate assembly language (CODE) definitions 
> into the language where the access to the hardware or host operating 
> system provided by the virtual machine is insufficient (or where 
> greater speed is needed). 

When you say 'have also provided', are you implying that 'CODE' is not
part of the "essence" or "fundamentality" of Forth, or are you saying
that it is one aspect of that essence?

> Forth is a combination (software-realized) processor and language. It 
> is probably this that accounts for most of the uses of the language, 
> and for most of the subjective sense of its power. Forth is most 
> easily understood if it is thought of as analogous to a real 
> processor/assembly-language pair -- rather than a conventional 
> computer language. 

This seems straightforward, and, also seems *not* to conflict with the
idea that the virtual machine can be written in another high-level language,
so long as the access to the 'real' hardware is convenient.

> I feel that the "essence of Forth" will be found in contemplating the 
> idea of a virtual computer, the realization of this computer in 
> hardware, the blending of a low level language with a high level 
> language, and (as a philosophical note) the importance of simplicity.

Two points:
	a) how do the 'Forth in C' implementations fail at this?
	b) perhaps Forth and PostScript are different only in the
	   ability to get to the real hardware?

> Forth is a unique language. Manufacturers provide us with a wide 
> diversity of hardware and operating systems. Forth has provided a 
> viable way to retain the power of low level access without substantial
> sacrifice of speed and with a great deal of machine independence. The
> main idea is to substitute an imaginary (virtual) machine for the real 
> machine -- and devise a suitable language for it. [I recommend a 
> study of the design of this machine and its language to anyone -- it
> is a stroke of brilliance.]

Again, two points:
	a) how is this (necessarily) more true of Forth than of PostScript?
	b) Forth seems to be losing its unique-ness, by this definition,
	   with the growing encroachment of 'C' into domains previously
	   held strong by Forth.  'C' which meets all those criteria.

> III.
> At this point we interject a major problem to be faced by Forth:
> 
> Unfortunately, rather than building on the architecture of a single 
> software "chip" (virtual machine), some people have decided that 
> redesigning the "chip" itself would better suit their needs. As a 
> result, the Forth community is now confronted with a variety of Forth-
> like assembly languages for a variety of "chips" -- destroying the 
> high degree of portability that Forth once enjoyed. 

Would you argue then that the part of Forth which allows you to
"redesign the 'chip'" is not essential to the nature of Forth?  Is it
just an unfortunate side-effect of Forth's simplicity, or is it part of
what makes Forth Forth?

> It is ironic that the Forth community should choose to artificially 
> introduce incompatibilities at a higher level -- turning one of the 
> most highly portable languages into a chaos.

Is this the "not invented here/my way is better" syndrome or is it
language evolution?  Was Forth already perfect?

> The ANSI effort does not really address this problem. The team is 
> responding by specifying a greatest common denominator of existing 
> implementations. The net result is like trying to provide an assembly 
> language in which the user is not allowed to know anything about the 
> architecture of the chip. 

True to the extent that X3J14 is charged with "codifying existing
practice."  Language design is not effectively done in the ANSI forum.
What you seem to be missing here is that the point is not that 'the user is
not allowed to know anything about the architecture of the chip,' but that
the user is made aware of what aspects of Forth are dependent on the
underlying hardware and *may not* be portable across different
complying implementations.  When you say Forth was 'one of the most highly
portable languages', you are either saying that the virtual machine
itself is highly portable, but not Forth programs, *or* that Forth
programs were highly portable only because the only places you could
port them to, the actual hardware platforms, were so alike.  X3J14 is
not making things 'unportable', all they are doing is enlightening you
as to what is already or is likely to be unportable.  One of the most
amusing things about the 'C' (X3J11) effort was how people would
scream about how ANSI broke 'X', when in fact all ANSI did was point
out that 'X' was already nonportable.

> Postscript is like a Forth application designed for a special purpose 
> -- to arrange for the layout of elements on a printed page. It would 
> probably not be very difficult to write such an application in Forth. 
> Postscript is like Forth in that it uses a stack and Reverse Polish 
> syntax, a dictionary, etc. A Forth programmer who needs to make a 
> laser printer do something unusual will not find the language hard to 
> learn. It does have interesting features (e.g. the stack can contain 
> elements of mixed type). 

I'm close to an answer here! ;-)  I was definitely trying to avoid
PostScript as a page layout language and stick to "essence of
PostScript", because I suspect that it is not the page layout
'extensions' to PostScript which make it unForthlike, but rather some
more fundamental issue which distinguishes the two.

From section 'I' above:
> I suspect, therefore, that the "essence of Forth" does not lie in the 
> use of a stack, RPN, and interactive nature (although these are 
> visible characteristics).

Is the visible characteristic of PostScript's mixed-element stack
indicative of an underlying and fundamental difference between Forth
and PostScript, or is it an inconsequential difference?  What are the
fundamental differences?

		-Doug


wmb@ENG.SUN.COM (Mitch Bradley) (02/12/90)

 Speed

PostScript is inherently slower than Forth for several fundamental reasons:

1) Operator overloading.  Many of PostScript's basic operators operate
   on multiple data types.  The operator (e.g. "add") must test its operands
   at run time and decide how to handle them.  In some cases, it
   may be possible to optimize some of this at compile time, but the
   compiler's ability to do so is somewhat compromised by the default
   late binding (see 3), and the "visibility" of innards of the compiled
   procedure.

2) "objects" on the stack.  Each item on the PostScript stack is a typed
   abstract object, rather than a programmer-visible collection of bits.
   The objects must be decoded at run time; you can't just pop the
   PostScript stack and execute the processor's "add" instruction, like
   you can do with Forth on most machines.

   Items 1 and 2 could be summarized by the phrase "strong typing enforced
   at run time".

3) Late binding.  By default, the association between keywords and their
   actions is determined at run-time, rather than compile time.  This
   causes a dictionary search (usually accelerated by hashing) every
   time that a keyword compiled into a procedure is executed.  The late
   binding can be overridden, but most programs do not choose to do so.
   In contrast, Forth uses early binding by default, and the programmer
   must go to extra effort (like using execution vectors or (non-standard)
   DEFER words) to get late binding.

4) Virtual machine.  Forth proponents argue about whether or not Forth is
   a virtual machine.  In PostScript, you have basically zero direct access
   to the real machine, so it is clear that you are programming a virtual
   machine.  Many or most of Forth's basic operations map very closely
   onto the real hardware.

5) Floating point.  This is a performance hit, but it's not as bad as you
   might think.  First of all, fast floating point hardware is becoming
   more and more "standard".  Secondly, some PostScript implementations
   manage to avoid the use of floating point arithmetic in most cases.
   The PostScript implementation upon which the NeWS window system is
   based uses scaled integer arithmetic (with a fixed binary point)
   in most cases, switching automatically to real floating point only
   when the numbers do not fit comfortably in the scaled integer range.


 Size

If you separate out the graphics functions from PostScript, the language
you have left is still a functionally-complete and useful language.
It is probably not much bigger than a Forth implementation with equivalent
features (such as error handling, file access, and floating point).



Please don't interpret this as "PostScript bashing".  I really like
PostScript; I think that it is a beautiful and elegant language.  It is
significantly more consistent in its reverse-polish orientation than
is Forth.  PostScript addresses many real problems that Forth "sweeps
under the rug" (like what really happens when an error occurs).

On the other hand, Forth is inherently more efficient on today's hardware,
because it is less abstract.

The bottom line: Use Forth for what it is good for and use PostScript for
what it is good for.  Each is valuable in its own domain.

Which brings up the question:  What if we had a language with mostly
PostScript syntax, but Forth "real machine, not overloaded" operators
and "early binding" semantics?  That would be a nice language.  But
does the world need yet another language that will not succeed for lack
of a market?  Nope.


 Aside: Forth: Virtual Machine or not?

There is a hot debate about whether Forth should be a "virtual machine"
or a "high level assembler".  The essence of this debate is: "Should the
bit-level details of how the dictionary is implemented be specified or
not specified".  "virtual machine" advocates answer "yes", "high level
assembler" (a poor name, by the way) advocates answer "no".

(I answer "no"; you may wish to take my further comments with a grain of
salt based upon that admission).

Anyway, the inherent performance advantage of Forth over PostScript
is largely due to its "real machine" orientation.  Specification of
Forth compilation in terms of a virtual machine model can negate some
of those advantages.  For example, implementation of the traditional
Forth virtual machine in many modern system environments requires the
use of logical addressing, where an address on the stack is not a native
machine address.

Many of the advantages cited by "Forth virtual machine" proponents
apply more strongly to PostScript than to Forth.


Mitch

wmb@ENG.SUN.COM (Mitch Bradley) (02/12/90)

> a different language?

They are fundamentally different, but first, we must look at the
similarities so we can understand why they are often compared.

S1) Both are mostly reverse polish (equivalently, arguments are communicated
    through a programmer-visible stack).

S2) Both operate by looking up words in a wordlist and executing them.

S3) The user can add words to the wordlist, and those words are invoked
    with the same syntax as pre-existing words.

S4) They are both interactive; the wordlist is available at all times.
    Programs are developed by extending the interactive core, rather
    than by compiling a program and then throwing away the compiler.

Forth and PostScript are compared because PostScript and Forth are the
only popular reverse-polish (== stack oriented) languages.  All other
mainstream languages are either infix or prefix.  (Note that LISP shares
S2, S3, and S4).


The fundamental differences:

D1) PostScript is strongly typed.  Forth is not typed at all.  This
   difference is profound.  It affects just about everything.

D2) PostScript has one way to make a closure, i.e. { .. } .  All
   PostScript control structures use this one basic mechanism, and
   are otherwise postfix.  Each Forth control structure creates its
   own special flavor of a closure, and the syntax is not postfix.

D3) This is not articulated in either language specification, but in
   practice, it is nearly always true:  Forth gives you pretty much
   direct access to machine hardware data types (addresses, integers,
   logical bit patterns), whereas PostScript doesn't.

D4) Defining words in PostScript are postfix and can be used at any
   place within a definition, either while compiling the definition
   or while executing it.  Defining words in Forth are prefix.  They
   cannot be used while compiling a definition, and only in a restricted
   way while executing a definition (specifically, you cannot, for
   instance, easily write a Forth word which creates a word called
   "foo", unless you can arrange for the name "foo" to appear in
   the input stream at the time your word is executed).

D5) PostScript defaults to run-time binding, with compile-time binding
   as an option; Forth uses compile-time binding, and run-time binding
   requires explicit coding effort to create a DEFER word or its equivalent.
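
The binding difference in D5 can be sketched as follows.  This is Python,
not either language, and every name in it (words, compile_early,
compile_late, greet) is hypothetical; it only shows when the name lookup
happens, not how a real Forth or PostScript system compiles.

```python
def old_greet():
    return "old"


words = {"greet": old_greet}   # stand-in for the wordlist / dictionary


def compile_early(names):
    """Forth-style default: look each name up once, when compiling."""
    body = [words[name] for name in names]
    return lambda: [action() for action in body]


def compile_late(names):
    """PostScript-style default: keep the names, look them up on every run."""
    return lambda: [words[name]() for name in names]


early = compile_early(["greet"])
late = compile_late(["greet"])

# Redefine the word after both procedures have been compiled.
words["greet"] = lambda: "new"

# early() still runs the definition it captured; late() sees the new one.
```

The late-bound procedure pays for a dictionary lookup on each execution,
which is the cost described under "Late binding" in the speed discussion.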


> Is the visible characteristic of PostScript's mixed-element stack
> indicative of an underlying and fundamental difference between Forth
> and PostScript, or is it an inconsequential difference?

As stated above, the visibility of the stack is a similarity, and the
fact that the stack elements are typed abstract objects in PostScript
and untyped bit patterns in Forth is a fundamental difference.


By the way, when James Gosling was first implementing the NeWS PostScript
about 4 years ago, we had a long discussion about Forth implementation
techniques.  He thought that some of the Forth implementation techniques
were clever and interesting, but we both agreed that most of those techniques
are inappropriate for PostScript, because of late binding and abstract
object data types.

Mitch

wmb@ENG.SUN.COM (Mitch Bradley) (02/12/90)

> Is PostScript's post-fix-"mania" more Forth-like than Forth?

No.  It is more consistent than Forth, and more post-fix than Forth,
but not more "Forth-like".

Obviously, this depends on a subjective judgement of what "Forth-like"
means.  I claim that "Forth-like" means "ad-hoc".  Forth is rarely consistent
about anything.  The naming is inconsistent, the syntax is reverse polish
except when it isn't, there are 4 different kinds of strings, there
are some obvious comparison operators that just happen to be missing.

Forth has great ideas at the bottom, and then at some point, implementors
and designers got lazy and punted.  Forth starts out being postfix, but
then rather than deal with the issues of strings, Forth uses look-ahead
syntax to try to make the issue of string allocation and representation
go away (Surprise!  The issue doesn't go away, it just gets deferred to
later, and in the meantime, there are obvious things that you can't do
with string look-ahead defining words).

Then, for the strings that do exist, there are 4 different representations,
each with its own limited set of operators*.

Forth can't even make up its mind about the syntax of a double number.
Even worse, the commonly-used, albeit non-standard, double number syntax
precludes the obvious syntax for floating point numbers.

So, the consistency of PostScript's reverse polish syntax is not
"Forth-like", because consistency is not a hallmark of Forth.


* The 4 kinds of strings, and some of their operators:

1) "adr len"  Address and length of array of bytes.  This is the
   best representation.  Example operator: TYPE

2) "counted string"  Address of an array of from 1 to 256 bytes, the
   first of which is the count of the remaining data bytes.
   Operators: WORD , COUNT , FIND

3) "blank delimited string"  Address of an array of bytes, the end of
   which is denoted by a space character.  Operator: CONVERT

4) "name field" - Sort of like a counted string except that the count
   byte has the (hex) 80 bit set, and the 40 bit and 20 bit are used
   as flags.  Also, the count may not be right; you have to look for
   another 80 bit set in order to find the end of the string.
   Operator: .ID    (Okay, I know this isn't part of the standard, but
   it is widespread).
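
To make representations 1 and 2 concrete, here is a small sketch in Python
that models memory as a flat byte array.  The helper names (place_counted,
count, type_) and the 64-byte memory are assumptions for illustration;
count and type_ only imitate what COUNT and TYPE do, and type_ returns the
text instead of printing it.

```python
memory = bytearray(64)  # stand-in for Forth's flat address space


def place_counted(addr, text):
    """Store text as a counted string: one count byte, then the data bytes."""
    data = text.encode("ascii")
    if len(data) > 255:
        raise ValueError("a single count byte holds at most 255")
    memory[addr] = len(data)
    memory[addr + 1 : addr + 1 + len(data)] = data


def count(addr):
    """Imitates COUNT: counted-string address -> (adr, len) pair."""
    return addr + 1, memory[addr]


def type_(adr, length):
    """Imitates TYPE on an "adr len" pair, returning the characters."""
    return memory[adr : adr + length].decode("ascii")
```

For example, after place_counted(10, "hello"), count(10) gives the pair
(11, 5), and passing that pair to type_ recovers "hello".  The "adr len"
form needs no conversion step and is not limited by the size of a count
byte.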

No wonder Forth doesn't have a good standard string package; nobody
can figure out which string representation to use.  Besides which,
the lack of dynamic memory allocation facilities* makes it pretty hard
to figure out where to put them.

* Memory allocation problem update: my memory allocation proposal passed
at the last ANS Forth meeting!  So Forth will have memory allocation one of
these days when ANS Forth becomes a reality.  More on this later.


Mitch

P.S. Lest you all get the wrong idea, I actually like Forth, and I program
in it almost exclusively, entirely by choice.  I would very much like for
Forth to succeed, and that is why I rail against its flaws.  I want the
language to evolve and for the flaws to be fixed.  The sad truth is that
many Forth practitioners would rather pretend that the flaws are virtues
and justify them or sweep them under the rug, rather than fix them.
(BTW, I am pretty sick of all this "Zen philosophy" nonsense (weakness
is strength, dah-dah, dah-dah) regarding Forth; we are talking about a
programming language that deals with physically real hardware, whose
success or failure is ultimately determined by economics, measured in
real money.  We are not talking about a way in which to live your life
and interact with your fellow man and achieve spiritual salvation.)

peter@ficc.uu.net (Peter da Silva) (02/14/90)

In article <9002121712.AA24871@jade.berkeley.edu> Mitch Bradley <wmb@ENG.SUN.COM> writes:
> * Memory allocation problem update: my memory allocation proposal passed
> at the last ANS Forth meeting!  So Forth will have memory allocation one of
> these days when ANS Forth becomes a reality.  More on this later.

This, unfortunately, isn't likely to completely solve the string issue. C
has memory allocation, but can't handle strings with PS-like ease. The
other thing you need to easily handle strings is garbage collection, so you
can do things like:

	string1 string2 concat

or

	string1 3 5 substr

without having to track your arguments henceforth. Without it you need,
instead, to do something C-like:

	string1 buffer strcpy buffer string2 strcat

or

	string1 3 skip buffer 5 strncpy
-- 
 _--_|\  Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
/      \
\_.--._/ Xenix Support -- it's not just a job, it's an adventure!
      v  "Have you hugged your wolf today?" `-_-'

dwp@willett.UUCP (Doug Philips) (02/14/90)

In <9002121711.AA24864@jade.berkeley.edu>,
	 wmb@ENG.SUN.COM (Mitch Bradley) writes:
> The fundamental differences:
	[between Forth and PostScript --dwp]

> D1) PostScript is strongly typed.  Forth is not typed at all.  This
>    difference is profound.  It affects just about everything.
I agree.  This is a difference I was not originally aware of.

> D2) PostScript has one way to make a closure, i.e. { .. } .  All
>    PostScript control structures use this one basic mechanism, and
>    are otherwise postfix.  Each Forth control structure creates its
>    own special flavor of a closure, and the syntax is not postfix.
I'm not yet willing to concede this point.  If it is fairly simple to
create the other language's closures then I don't see this as fundamental.
I'm not sure how I'd define 'fairly simple' but I'd rather consider the
extremes of the scale, and hope that the actual values are near one of
the extremes.

> D3) This is not articulated in either language specification, but in
>    practice, it is nearly always true:  Forth gives you pretty much
>    direct access to machine hardware data types (addresses, integers,
>    logical bit patterns), whereas PostScript doesn't.
I'm also not sure about this one.  Perhaps it's just a matter of a
different kind of underlying hardware?  (I don't know enough about
this issue except to ask:  How is this different from a chip that directly
executes Forth?  Perhaps a moot point.)

> D4) Defining words in PostScript are postfix and can be used at any
>    place within a definition, either while compiling the definition
>    or while executing it.  Defining words in Forth are prefix.  They
>    cannot be used while compiling a definition, and only in a restricted
>    way while executing a definition (specifically, you cannot, for
>    instance, easily write a Forth word which creates a word called
>    "foo", unless you can arrange for the name "foo" to appear in
>    the input stream at the time your word is executed).
I take it from this that by 'easily' you mean: 'without knowing how the
dictionary structure of the Forth you are using works,' or perhaps, 'in a
portable manner, across more than one type of Forth.'?

> D5) PostScript defaults to run-time binding, with compile-time binding
>    as an option; Forth uses compile-time binding, and run-time binding
>    requires explicit coding effort to create a DEFER word or its equivalent.
I agree that this is a fundamental difference.
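
The binding difference in D5 can be shown with a small model.  What
follows is only a rough Python sketch, not Forth or PostScript: the names
`words`, `define`, `compile_early` and `compile_late` are made up for
illustration.  It assumes a definition is just a list of word names, and
that the only question is when each name gets looked up.

```python
# Hypothetical model: the dictionary maps a word name to a Python callable.
words = {}

def define(name, body):
    words[name] = body

def compile_early(names):
    """Forth-style: resolve every name once, when the definition is compiled."""
    bodies = [words[n] for n in names]
    def run(stack):
        for body in bodies:
            body(stack)
    return run

def compile_late(names):
    """PostScript-style default: look each name up when it is executed."""
    def run(stack):
        for n in names:
            words[n](stack)
    return run

define("greet", lambda s: s.append("hello"))
early = compile_early(["greet"])
late = compile_late(["greet"])

# Redefine the word after both definitions have been built.
define("greet", lambda s: s.append("goodbye"))

s1, s2 = [], []
early(s1)   # still runs the old body
late(s2)    # picks up the new body
```

In this model a DEFER-like word would amount to asking for `compile_late`
behaviour explicitly for one particular name.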

		-Doug

---
Preferred: willett!dwp@gateway.sei.cmu.edu OR ...!sei!willett!dwp
Daily: ...!{uunet,nfsun}!willett!dwp   [in a pinch: dwp@vega.fac.cs.cmu.edu]

dwp@willett.UUCP (Doug Philips) (02/14/90)

In <9002121712.AA24871@jade.berkeley.edu>,
	wmb@ENG.SUN.COM (Mitch Bradley) writes:

> > Is PostScript's post-fix-"mania" more Forth-like than Forth?
> 
> No.  It is more consistent than Forth, and more post-fix than Forth,
> but not more "Forth-like".
> 
> Obviously, this depends on a subjective judgement of what "Forth-like"
> means.  I claim that "Forth-like" means "ad-hoc".  Forth is rarely consistent
> about anything.  The naming is inconsistent, the syntax is reverse polish
> except when it isn't, there are 4 different kinds of strings, there
> are some obvious comparison operators that just happen to be missing.
Ok.  However, one of the things that Forthers sometimes claim as one of
Forth's virtues is its simplicity.  It seems to me that this is one
place where PostScript has a simplicity that has 'out Forth-ed' Forth.
You're right tho', this is a subjective call about 'Forth-like'.

> Forth has great ideas at the bottom, and then at some point, implementors
> and designers got lazy and punted.  Forth starts out being postfix, but
> then rather than deal with the issues of strings, Forth uses look-ahead
> syntax to try to make the issue of string allocation and representation
> go away (Surprise!  The issue doesn't go away, it just gets deferred to
> later, and in the meantime, there are obvious things that you can't do
> with string look-ahead defining words).
And you don't dare define your own way around it. ;-) but see next comment.

> Then, for the strings that do exist, there are 4 different representations,
> each with its own limited set of operators*.
Only four?  ;-)  Actually, I thought this was a result of Forth's deferring
to the programmer to do what is right for a given situation.

> So, the consistency of PostScript's reverse polish syntax is not
> "Forth-like", because consistency is not a hallmark of Forth.
See above remark about simplicity.

> * The 4 kinds of strings, and some of their operators:
> 
> 1) "adr len"  Address and length of array of bytes.  This is the
>    best representation.  Example operator: TYPE
Hmmm, let's not pretend the string issue exists in a vacuum.
There are those of us weaned on 'C' that would claim NULL terminated
strings are best.  I'd rather avoid religious rhetoric and stick to
verifiable/demonstrable claims about real situations.

> No wonder Forth doesn't have a good standard string package; nobody
> can figure out which string representation to use.  Besides which,
> the lack of dynamic memory allocation facilities* makes it pretty hard
> to figure out where to put them.
Again, this is probably either due to an anti-hubris about deciding
the 'one-true way' to do strings, or due to lack of 'obviously right for
all circumstances' way to do strings.  Did I leave out any other
possibilities?

> * Memory allocation problem update: my memory allocation proposal passed
> at the last ANS Forth meeting!  So Forth will have memory allocation one of
> these days when ANS Forth becomes a reality.  More on this later.
Can't wait to get my copy of BASIS11 and check it out!

> P.S. Lest you all get the wrong idea, I actually like Forth, and I program
> in it almost exclusively, entirely by choice.  I would very much like for
> Forth to succeed, and that is why I rail against its flaws.  I want the
> language to evolve and for the flaws to be fixed.  The sad truth is that
> many Forth practitioners would rather pretend that the flaws are virtues
> and justify them or sweep them under the rug, rather than fix them.
I can't resist replying to this... I think perhaps your characterization
of 'many Forth practitioners' may be a bit skewed.  I think there is an
'experts phenomenon' at work here too.  By 'experts phenomenon' I mean that
someone who has been programming in a language, or using any other
mental framework for problem solving, will eventually be able to chunk
large amounts of knowledge about that language/framework into
'second-nature' or 'unconscious' understandings of how to do something.
Once this has happened it is often hard to get back to the process of
how it happened.  This is what makes expert systems hard to write, because
it is hard to extract the 'why' information from the experts since they
no longer 'know' at an easily accessible conscious level.  What I'm
saying is that perhaps at least some of the practitioners you are impugning
here are just looking at Forth with this perspective.  [I'd be
interested in continuing this discussion, but if we do we ought to make it
a different subject line thread.]

> (BTW, I am pretty sick of all this "Zen philosophy" nonsense (weakness
> is strength, dah-dah, dah-dah) regarding Forth; we are talking about a
> programming language that deals with physically real hardware, whose
> success or failure is ultimately determined by economics, measured in
> real money.
Hmmm...  I don't know how you got the idea that Forth's ZEN-ness had
anything to do with 'a way in which to live your life and interact with
your fellow man and achieve spiritual salvation.'?  I also don't see how
you tie 'weakness is strength' in with '...to live your life...'.  I
can't help but think that perhaps you're over-reacting just a bit?  I've
always thought of the ZEN adjective, as applied to Forth, as connoting
simplicity and clarity.  (Although I admit, as you have pointed out,
that Forth often fails at reaching those goals.  I'll refrain from
specific comment until I know more about Forth in general and BASIS11 in
particular).  Perhaps I've just not got the viewpoint to see your
context.  When I hear 'weakness is strength', I understand that to mean
that Forth is traditionally considered 'weak' because it doesn't provide
a lot of the amenities that other languages/systems do, like libraries for
doing strings, or memory allocation, or... and to say that this 'weakness'
is strength is to say that Forth is not cluttered up with someone else's
idea of the best way to do something.  'Not cluttered' meaning both
1) an implementation free of 'junk' words and 2) a conceptual freedom from
'junk' paradigms to overcome.  Then again, maybe I'm just talking out my
hat.

		-Doug

P.S.  I too like Forth and would like to see it grow out of its warts.
I'm just not yet sure what all the warts are or what Forth is.  Hopefully
by the time the *next* ANSI standard is being considered we'll have
enough 'existing practice' to take the next officially condoned
evolutionary step.  In the meantime I'm all for experimenting with what
that kind of Forth would be like, and how far it can be pushed and still
be considered Forth.

---
Preferred: willett!dwp@gateway.sei.cmu.edu OR ...!sei!willett!dwp
Daily: ...!{uunet,nfsun}!willett!dwp   [in a pinch: dwp@vega.fac.cs.cmu.edu]

dwp@willett.UUCP (Doug Philips) (02/14/90)

In article <9002121713.AA24890@jade.berkeley.edu>,
	wmb@ENG.SUN.COM (Mitch Bradley) writes:

> Which brings up the question:  What if we had a language with mostly
> PostScript syntax, but Forth "real machine, not overloaded" operators
> and "early binding" semantics?  That would be a nice language.  But
> does the world need yet another language that will not succeed for lack
> of a market?  Nope.
I've heard that CM himself is not worried about Forth's survivability.
How many PostScript engines are in place right now?  How does that
compare to Forth?  Which is the greater success?  Which would more
likely spawn a successful look-a-like?  (Not quite rhetorical questions,
because I only suspect the answers, I don't know for sure.)  Maybe
this would end up being a direction that Forth should go in?  But then
this gets back to the question of what the essence of Forth is and what
you can/can't change and still have Forth.

>  Aside: Forth: Virtual Machine or not?
> 
> There is a hot debate about whether Forth should be a "virtual machine"
> or a "high level assembler".  The essence of this debate is: "Should the
> bit-level details of how the dictionary is implemented be specified or
> not specified".  "virtual machine" advocates answer "yes", "high level
> assembler" (a poor name, by the way) advocates answer "no".
> 
> (I answer "no"; you may wish to take my further comments with a grain of
> salt based upon that admission).
I answer 'maybe'.  Perhaps what is needed is to avoid having to specify
'bit-level details of how the dictionary is implemented' and to start to
address the question of how to provide a set of words that will achieve
certain semantic-effect-manipulations of the dictionary.  The
definitions of those dictionary-smashing words will most likely be
non-portable, but code that uses them would be portable.  And depending on
how you define those words, they might even be immediate words that 'comma
in' to the word being defined fast code for direct manipulation, thus
avoiding one level of nesting and yet still not sacrifice portability.

		-Doug

---
Preferred: willett!dwp@gateway.sei.cmu.edu OR ...!sei!willett!dwp
Daily: ...!{uunet,nfsun}!willett!dwp   [in a pinch: dwp@vega.fac.cs.cmu.edu]

wmb@ENG.SUN.COM (02/16/90)

> I'm not yet willing to concede [ that PostScript's one closure vs.
> Forth's many closures is a fundamental difference ].  If it is fairly
> simple to create the other language's closures then I don't see this as
> fundamental.

It is not easy to create PostScript's closure on top of a Forth closure,
at least not in a portable way.  It is not so hard to do if you know
the implementation details of a particular Forth (John Wavrik posted
a particular solution a while back), but I tried to come up with a
general solution and failed.  It is certainly possible to implement the
PostScript closure "manually" in Forth (using an explicit array, with
a high-level word to interpret its contents), but it's hard to do it
portably "on top of" an existing Forth closure.
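
To make the "manual" approach concrete, here is a rough sketch in Python
rather than Forth.  It assumes a deferred procedure is nothing more than
an array of tokens, and that one high-level word interprets it.  The names
`execute`, `ps_if`, `ps_ifelse` and `ps_repeat` are illustrative only; they
are not taken from any Forth system or from a PostScript interpreter.

```python
# A "{ ... }" procedure is modelled as a plain list of tokens.
# A token is either a literal (pushed) or a callable (applied to the stack).

def execute(proc, stack):
    """Interpret a procedure array: push literals, call operators."""
    for token in proc:
        if callable(token):
            token(stack)
        else:
            stack.append(token)   # nested lists are pushed, i.e. deferred

def add(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)

def ps_if(stack):
    """bool proc if : run proc only when bool is true."""
    proc, flag = stack.pop(), stack.pop()
    if flag:
        execute(proc, stack)

def ps_ifelse(stack):
    """bool proc1 proc2 ifelse : run proc1 if true, else proc2."""
    proc_false, proc_true, flag = stack.pop(), stack.pop(), stack.pop()
    execute(proc_true if flag else proc_false, stack)

def ps_repeat(stack):
    """int proc repeat : run proc int times."""
    proc, count = stack.pop(), stack.pop()
    for _ in range(count):
        execute(proc, stack)

# 0 5 { 1 add } repeat
loop_stack = []
execute([0, 5, [1, add], ps_repeat], loop_stack)

# true { 10 } { 20 } ifelse
branch_stack = []
execute([True, [10], [20], ps_ifelse], branch_stack)
```

Every control structure here is an ordinary postfix operator that consumes
one or two procedure arrays, which is the uniformity being claimed for the
PostScript side.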

On the other hand, I had no trouble implementing Forth's many closures on
top of the PostScript one.

> [ Forth's direct access to hardware data types, PostScript's lack of
> such access ]

In PostScript, you can't look at the bits in a stack cell, and you also
can't pretend that a number is an address and try to "@" that location.
In nearly all Forth implementations, both of these are possible.

In PostScript, stack objects are typed, and if you try to use e.g. a logical
operator on a number object, you get an error signal and the operation aborts.

> >    way while executing a definition (specifically, you cannot, for
> >    instance, easily write a Forth word which creates a word called
> >    "foo", unless you can arrange for the name "foo" to appear in
> >    the input stream at the time your word is executed).
> I take it from this that by 'easily' you mean: 'without knowing how the
> dictionary structure of the Forth you are using works,' or perhaps, 'in a
> portable manner, across more than one type of Forth.'?

Specifically, I mean that I have tried several times to do this, and have
never come up with a good portable way of doing it, without resorting to
extremely gruesome mucking about with the mechanics of the input stream,
which isn't very portable because most of the Forth systems that I care
about have extended those input stream mechanisms to cope with text files,
and the input stream hacks aren't portable across those systems.

There may be some good ANSI news on this input mechanism front however.
At the last meeting, I led a group that worked out a specification for
dealing with text input files, and the results are pretty encouraging.
The basic proposals of the scheme were passed on the last day, and
several more proposals, completing the scheme, are pending.

> ... Forth's confusion about strings, and the possibility of defining
> your own strings package ...

The argument that "you can define your own" is often cited in defense
of Forth's lack of particular features.  However, many people do not
wish to have to "roll their own" this and that.  Perhaps they do not
have the skill.  Perhaps they do not have the time.  Perhaps they would
rather concentrate on their application without having to build up
the tool base by themselves.  Having well-debugged, optimized, supported
tool packages (e.g. strings) can save application developers time and
money.  Given the choice between "rolling your own" and buying, the
"buy" decision is often economically sound.

> > 1) "adr len"  Address and length of array of bytes.  This is the
> >    best representation.  Example operator: TYPE
> Hmmm, let's not pretend the string issue exists in a vacuum.
> There are those of us weaned on 'C' that would claim NULL terminated
> strings are best.  I'd rather avoid religious rhetoric and stick to
> verifiable/demonstrable claims about real situations.

This isn't rhetoric.  "Adr len" strings are objectively best, in that
a) Any character can appear in a string (null-terminated and tagged
   strings are weak in this respect).
b) Many types of string manipulation can be performed on "adr len" strings
   without copying, without allocation of extra memory, and without
   concerns about "read-only" storage.  (counted strings are weak in
   these respects).
c) An "adr len" string can be arbitrarily long.
d) Any region of memory can be described as an "adr len" string without
   requiring preallocation of space for either a count byte or a delimiter
   byte.
The one weakness of "adr len" strings is an issue of convenience.  There
are 2 things on the stack instead of 1.

I believe that the above claims are both verifiable and demonstrable.

The ANSI committee has settled upon the "adr len" representation for
all new functions with string arguments.
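
Points a), b) and d) can be illustrated with a small model.  This is a
Python sketch only, under the assumption that memory is one flat byte
array and an address is an integer offset into it.  The helper names
(`fetch`, `minus_trailing`, `c_strlen`) are hypothetical and exist just
for this example.

```python
# Memory is modelled as one flat bytearray; an "adr len" string is simply
# a pair of integers: (offset into memory, number of bytes).
memory = bytearray(b"name\x00value   ")

def fetch(adr, length):
    """Return the bytes an adr/len pair describes (copied only for display)."""
    return bytes(memory[adr:adr + length])

def minus_trailing(adr, length):
    """Drop trailing blanks by shrinking the length; memory is not touched."""
    while length > 0 and memory[adr + length - 1] == 0x20:
        length -= 1
    return adr, length

def c_strlen(adr):
    """Length of a NUL-terminated string at adr, for comparison."""
    n = 0
    while memory[adr + n] != 0:
        n += 1
    return n

whole = (0, len(memory))            # the entire region, NUL byte included
trimmed = minus_trailing(*whole)    # same address, shorter length
value = (5, 5)                      # any region works, no count byte needed
```

The NUL-terminated view stops at the first zero byte and sees only
"name", while the adr/len pair describes the full region.  Trimming
changes one number and copies nothing.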


> [ virtual machine vs. high level assembler ]
> I answer 'maybe'.  Perhaps what is needed is to avoid having to specify
> 'bit-level details of how the dictionary is implemented' and to start to
> address the question of how to provide a set of words that will achieve
> certain semantic-effect-manipulations of the dictionary.  The
> definitions of those dictionary-smashing words will most likely be
> non-portable, but code that uses them would be portable.  And depending on
> how you define those words, they might even be immediate words that 'comma
> in' to the word being defined fast code for direct manipulation, thus
> avoiding one level of nesting and yet still not sacrifice portability.

I agree entirely.  I published a paper in one of the FORML proceedings
proposing such a set of "dictionary abstraction" words.  Furthermore,
one such word (COMPILE-TOKEN) is currently on the table at ANSI.  It was
the subject of much fierce debate at the last meeting.

Mitch

toma@tekgvs.LABS.TEK.COM (Tom Almy) (02/17/90)

In article <9002161509.AA19458@jade.berkeley.edu> wmb@ENG.SUN.COM writes:
>> Hmmm, let's not pretend the string issue exists in a vacuum.
>> There are those of us weaned on 'C' that would claim NULL terminated
>> strings are best.  I'd rather avoid religious rhetoric and stick to
>> verifiable/demonstrable claims about real situations.
>
>This isn't rhetoric.  "Adr len" strings are objectively best, in that
>a) Any character can appear in a string (null-terminated and tagged
>   strings are weak in this respect).
>b) Many types of string manipulation can be performed on "adr len" strings
>   without copying, without allocation of extra memory, and without
>   concerns about "read-only" storage.  (counted strings are weak in
>   these respects).
>c) An "adr len" string can be arbitrarily long.
>d) Any region of memory can be described as an "adr len" string without
>   requiring preallocation of space for either a count byte or a delimiter
>   byte.
>The one weakness of "adr len" strings is an issue of convenience.  There
>are 2 things on the stack instead of 1.
>
>I believe that the above claims are both verifiable and demonstrable.

Well, I certainly agree. I wrote a string package in 1981 or 2 for Forth
that used only "adr len". I called these "string descriptors." There 
were constant string descriptors and variable string descriptors (these
left *three* values on the stack-- address, physical length, and logical
length). The string package was very fast, easy to use, and protected
against string overflows. The only string packages I've seen that have
worked better, IMHO, are those associated with languages that do dynamic
memory allocation and garbage collection of strings (examples: Microsoft
BASIC, Lisp, SNOBOL, TRAC).

The really interesting aspect about the string package was that it was
based on the string instructions which I built into a processor I designed
in 1975. The processor was 32 bit words, "0-operand" (i.e. "stack machine"),
had string, queue, and IEEE floating point instructions as well as hardware 
process support (task swap was a machine instruction that saved the entire
process state -- very uncommon for the time).

Naturally the processor would have made a good Forth Engine (I didn't even
know about Forth at the time).

An example instruction, Move Characters and Update, was used to concatenate
strings (Direct quote from manual):

	FORMAT: 0 1 1 0 1 0 0 1			1 byte instruction

	OPERANDS ON STACK:  (Top) 1. NB
				  2. B
				  3. NA
				  4. A

	RESULTS ON STACK:   (Top) 1. NR
				  2. R

	FUNCTIONAL DESCRIPTION:
NA and NB are non-negative integers. A and B are addresses of byte arrays.

L <- MIN(NA,NB);
IF A<B
	THEN FOR I=0 TO L-1 DO A(I) <- B(I);
	ELSE IF A>B
		THEN FOR I=L-1 TO 0 BY -1 DO A(I) <- B(I);
NR <- NA - L;
R <- A + L;

Where the destination is the address and physical length of a string
variable, and the source(s) is/are addresses and logical lengths of string
variables (or addr len of string constants), repeated application of this
instruction will concatenate strings into the destination variable. 
The final operation consists of subtracting the resulting length (NR) from
the physical length, giving the new logical length.
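
For readers who want to trace the concatenation, here is a rough Python
model of the instruction as described above.  It is a sketch only: memory
is assumed to be a flat byte array with integer addresses, the function
name `move_chars_update` and the helper `store` are made up, and the
operands are passed as arguments in the order A, NA, B, NB rather than
taken from a stack.

```python
memory = bytearray(32)

def move_chars_update(a, na, b, nb):
    """Copy min(na, nb) bytes from b to a; return (r, nr), the unused tail of a."""
    l = min(na, nb)
    # Copying through a temporary is overlap-safe, which is what the two
    # loop directions in the manual's description achieve.
    memory[a:a + l] = bytes(memory[b:b + l])
    return a + l, na - l

def store(adr, data):
    """Hypothetical helper: place a constant string in memory, return adr/len."""
    memory[adr:adr + len(data)] = data
    return adr, len(data)

src1 = store(16, b"foo")
src2 = store(20, b"bar!")
src3 = store(26, b"xyz")

dest_adr, dest_physical = 0, 8        # string variable with 8 bytes of room

r, nr = move_chars_update(dest_adr, dest_physical, *src1)
r, nr = move_chars_update(r, nr, *src2)
logical = dest_physical - nr          # new logical length
result = bytes(memory[dest_adr:dest_adr + logical])

# A third append has only one byte of room left, so it is clipped rather
# than overflowing the destination.
r2, nr2 = move_chars_update(r, nr, *src3)
```

After the first two moves `result` holds the seven bytes "foobar!".  The
third move copies a single byte and leaves a remaining length of zero.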

Specifying substrings, for either source or destination, was also trivial.
An instruction, Index, would increment the start address and decrement the
length simultaneously (also making sure the length didn't drop below zero).
This instruction was used to specify the starting position of a substring.
A simple Minimum function was used to specify the length of a substring.
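
A substring built from those two steps might look like the following
Python sketch.  The clamping rule is an assumption: here the advance is
limited to the current length, so the address never moves past the end of
the string.  The original instruction may have handled that case
differently.  The names `index`, `substring` and `fetch` are illustrative.

```python
memory = bytearray(b"string descriptor")

def index(adr, length, n):
    """Advance the start by n and shrink the length, never below zero."""
    n = max(0, min(n, length))   # assumed clamping rule
    return adr + n, length - n

def substring(adr, length, start, count):
    """Index to the start position, then take the minimum for the length."""
    adr, length = index(adr, length, start)
    return adr, min(length, count)

def fetch(adr, length):
    return bytes(memory[adr:adr + length])

sub = substring(0, len(memory), 7, 4)     # 4 bytes starting at offset 7
tail = substring(0, len(memory), 14, 99)  # count larger than what is left
past_end = index(0, len(memory), 40)      # start beyond the end
```

No bytes are moved in any of these calls; only the two numbers change.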

Other instructions:
String Equal, Greater, etc	All 6 comparisons
Move Characters 	Doesn't "update"
Super Scan		Scans for first character of one string in another.
			First string specifies characters as ranges.
Translate and Update	Like move chars and update but goes through translate
			table.
Position		Finds position of one string in another.
Scan While		Scans for first character in second string that is not
			in first string.
Scan Until		Scans for first character in second string that is in
			first string.
Fill			Appends N copies of one string into another
Super Position		Like position, but will return a partial match if the
			first N characters of the first string match the last
			N characters of the second string.
Translated Position	Like position, but a translate table is used as well

Yep, some of these are obscure, but they had their uses.

Tom Almy
toma@tekgvs.labs.tek.com
Standard Disclaimers Apply

dwp@willett.UUCP (Doug Philips) (02/19/90)

In <9002161509.AA19458@jade.berkeley.edu>, wmb@ENG.SUN.COM writes:

> [Doing PostScript's closure is not easy nor portable in Forth, but
>  doing Forth's closures in PostScript is easy.]
Ok, I concede the point here.  I might ask if this doesn't raise the
question of the proper factoring of the Forth closure-making words?

> In PostScript, you can't look at the bits in a stack cell, and you also
> can't pretend that a number is an address and try to "@" that location.
> In nearly all Forth implementations, both of these are possible.
> In PostScript, stack objects are typed, and if you try to use e.g. a
> logical operator on a number object, you get an error signal and the
> operation aborts.
Yes, but can you trivially write a word that converts between types, or
is type-ness inaccessible?

>...[Discussion about the lack of ease of writing a word which will,
>    at runtime, CREATE a word whose name is in the word in which
>    CREATE occurs.]
> Specifically, I mean that I have tried several times to do this, and have
> never come up with a good portable way of doing it, without resorting to
> extremely gruesome mucking about with the mechanics of the input stream,
> which isn't very portable because most of the Forth systems that I care
> about have extended those input stream mechanisms to cope with text files,
> and the input stream hacks aren't portable across those systems.
Again, doesn't this point to a factoring problem with the dictionary
manipulating words?  Having good streams doesn't solve the problem that
there should be a word that creates a dictionary entry from a 'string'
on the stack.  (Define string as addr or addr/len, or what have you).

> This isn't rhetoric.  "Adr len" strings are objectively best, in that
> a) Any character can appear in a string (null-terminated and tagged
>    strings are weak in this respect).
> b) Many types of string manipulation can be performed on "adr len" strings
>    without copying, without allocation of extra memory, and without
>    concerns about "read-only" storage.  (counted strings are weak in
>    these respects).
> c) An "adr len" string can be arbitrarily long.
> d) Any region of memory can be described as an "adr len" string without
>    requiring preallocation of space for either a count byte or a delimiter
>    byte.
> The one weakness of "adr len" strings is an issue of convenience.  There
> are 2 things on the stack instead of 1.
> 
> I believe that the above claims are both verifiable and demonstrable.
Point c works only if you can capture the range of your address space in
a single stack item, unless you have 'long strings' with double
precision integers as their length.  As to points a & d, well, we might
consider a difference between strings and arrays of bytes.  X3J11 did
that.  There are string functions and there are byte array functions.
Not the same things.

> The ANSI committee has settled upon the "adr len" representation for
> all new functions with string arguments.
Considerations of existing practice make this unsurprising.

		-Doug

---
Preferred: willett!dwp@gateway.sei.cmu.edu OR ...!sei!willett!dwp
Daily: ...!{uunet,nfsun}!willett!dwp   [in a pinch: dwp@vega.fac.cs.cmu.edu]

dwp@willett.UUCP (Doug Philips) (02/19/90)

In <6876@tekgvs.LABS.TEK.COM>, toma@tekgvs.LABS.TEK.COM (Tom Almy) writes:

> [Mitch Bradley's description of the advantages of addr/len strings.]
> Well, I certainly agree. I wrote a string package in 1981 or 2 for Forth
> that used only "adr len". I called these "string descriptors." There 
> were constant string descriptors and variable string descriptors (these
> left *three* values on the stack-- address, physical length, and logical
> length). The string package was very fast, easy to use, and protected
> against string overflows. The only string packages I've seen that have
> worked better, IMHO, are those associated with languages that do dynamic
> memory allocation and garbage collection of strings (examples: Microsoft
> BASIC, Lisp, SNOBOL, TRAC).

This sounds almost identical to Simula-67's TEXT type.  I don't have any
Simula ref's on hand anymore, but the idea is almost exactly the same as
your variable string descriptors.  As I recall, in Simula-67, there might
have been two addresses, one which always points to the beginning of the
memory allocated for the string, and one which is the 'current' character
pointer.  Anyone out there have Simula manuals around?

		-Doug

---
Preferred: willett!dwp@gateway.sei.cmu.edu OR ...!sei!willett!dwp
Daily: ...!{uunet,nfsun}!willett!dwp   [in a pinch: dwp@vega.fac.cs.cmu.edu]

wmb@ENG.SUN.COM (02/21/90)

> > [Doing PostScript's closure is not easy nor portable in Forth, but
> >  doing Forth's closures in PostScript is easy.]
> Ok, I concede the point here.  I might ask if this doesn't raise the
> question of the proper factoring of the Forth closure-making words?

It certainly does.  The problem with trying to factor them is that
there are so many different things going on with the different kinds
of closures.  The "branch class" of closure (BEGIN and IF structures)
can be easily factored, but the DO class has some other weird things
going on, the ":" class throws in some more wrinkles, and then DOES>
weirds it out even more.  Serious "ad-hoc" action here.


> > In PostScript, stack objects are typed, and if you try to use e.g. a
> > logical operator on a number object, you get an error signal and the
> > operation aborts.
> Yes, but can you trivially write a word that converts between types, or
> is type-ness inaccessible?

PostScript already provides words to convert between types, but some
pairs of types are not mutually convertible.  The original point was
that Forth is inherently untyped and PostScript is inherently strongly
typed, and that this difference is fundamental.  The above argument
supports that claim.  In Forth, you can ultimately do anything you want
with a stack cell, and in PostScript, you can only do "type consistent"
things.
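
A toy model of the two attitudes, in Python.  This is a sketch under
loud assumptions: Python's own object types stand in for PostScript's
type tags, the Forth side is modelled as bare integers over a flat byte
array, and the names `forth_and`, `forth_c_fetch`, `ps_and` and
`TypeCheck` are invented for the example.

```python
memory = bytearray(b"ABCDEFGH")

# Forth-style: a cell is just a number; each word decides how to read it.
def forth_and(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a & b)

def forth_c_fetch(stack):
    """C@ : treat whatever is on top of the stack as an address."""
    stack.append(memory[stack.pop()])

# PostScript-style: every stack object carries a type, and operators check it.
class TypeCheck(Exception):
    pass

def ps_and(stack):
    a, b = stack[-2], stack[-1]
    # Accept two booleans or two integers; anything else is rejected
    # before the operands are consumed.
    if type(a) is not type(b) or type(a) not in (bool, int):
        raise TypeCheck("and: operands must be two booleans or two integers")
    del stack[-2:]
    stack.append(a & b)

forth_stack = [6, 3]
forth_and(forth_stack)        # bit pattern 2
forth_c_fetch(forth_stack)    # the same cell reused as an address

ps_stack = [6.0, 3]
try:
    ps_and(ps_stack)
    ps_error = None
except TypeCheck as e:
    ps_error = e
```

The Forth-style stack happily turns the result of a logical operation
into an address.  The typed stack refuses the mismatched operands and
leaves them in place.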


> > ... difficulty of creating explicitly-named words within definitions ...
> Again, doesn't this point to a factoring problem with the dictionary
> manipulating words?  Having good streams doesn't solve the problem that
> there should be a word that creates a dictionary entry from a 'string'
> on the stack.

Absolutely right.  The dictionary manipulation words in Forth are factored
VERY poorly.  By the way, once you have a word like $CREATE  (like CREATE
but takes a string from the stack), then you need $: and $DEFER and
$whatever_other_defining_words_you_care_about .  If defining word syntax
were postfix, and the notions of declaring the action class and creating
the name were separate, then this problem wouldn't exist.
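
The factoring being asked for can be sketched in a few lines of Python.
Everything here is hypothetical: `string_create` plays the role of a
$CREATE-like primitive that takes the name and the action as separate
pieces of data, and `create` is the prefix form reduced to a thin wrapper
that pulls the name from a simulated input stream.

```python
dictionary = {}
input_stream = []   # pending tokens, standing in for the text being parsed

def string_create(name, action):
    """Primitive: the name arrives as data, the action is given separately."""
    dictionary[name] = action

def parse_word():
    """Take the next token from the input stream."""
    return input_stream.pop(0)

def create(action):
    """Prefix form: only reads the name, then defers to the primitive."""
    string_create(parse_word(), action)

def make_registers(prefix, n):
    """A word that creates other words whose names are computed at run time."""
    for i in range(n):
        string_create(f"{prefix}{i}", lambda stack, i=i: stack.append(i))

# Prefix use: the name "foo" has to appear in the input stream.
input_stream.extend(["foo"])
create(lambda stack: stack.append(42))

# Postfix-style use: no input stream involved at all.
make_registers("reg", 3)

stack = []
dictionary["foo"](stack)
dictionary["reg2"](stack)
```

With the primitive in place, one wrapper serves every defining word, so
separate string-taking variants of each one are not needed in this model.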


> > c) An "adr len" string can be arbitrarily long.

> Point c works only if you can capture the range of your address space in
> a single stack item

Which is sort of guaranteed, because the standard memory access operators
(e.g. C@) take single-cell addresses.  Extended address space access
only appeared in the ANSI BASIS at the last meeting.  More on that topic
in a later posting.

Mitch

bouma@cs.purdue.EDU (William J. Bouma) (02/22/90)

In article <9002210136.AA20515@jade.berkeley.edu> wmb@ENG.SUN.COM writes:
>PostScript already provides words to convert between types, but some
>pairs of types are not mutually convertible.  The original point was
>that Forth is inherently untyped and PostScript is inherently strongly
>typed, and that this difference is fundamental.  The above argument
>supports that claim.  In Forth, you can ultimately do anything you want
>with a stack cell, and in PostScript, you can only do "type consistent"
>things.

   You lost me here when you say "Forth is inherently untyped". In previous
   posts you seemed to be saying that traditionally forth is not strongly
   typed thus it is different from postscript which is. That is true. But here
   you seem to be saying the structure of Forth is such that it cannot be typed. I
   would agree that postscript would be a mess without types, and that forth
   works without formal types. But I do think that forth can (and should) be
   typed. Forth already allows different types on the stack: integers, chars,
   addresses of arrays and variables.  But after allowing types, forth tries
   to ignore the whole issue. I am not saying type checking should be required
   everywhere, just that type information should be available when wanted.
   For the sake of speed you could have an 'f+' operation to add the top two
   floats which does not check if they are really floats. But if you want an
   operation to add a float and integer in either order, what do you do? Junk
   like '."' is stupid. Have a '.' that checks the type and prints it the
   appropriate way. Give me format statements to read and write different
   types of data. Give me the means to build new types from base primitives
   and allocate space for an instance of some type. Trash this junk that makes
   me calculate explicitly the size in bytes any time I want a block of data.
   Probably the main thing keeping a lot of people from taking forth seriously
   is the lack of a data abstraction system.

-- 
Bill <bouma@cs.purdue.edu>  ||  ...!purdue!bouma

dwp@willett.UUCP (Doug Philips) (02/22/90)

In <9002210136.AA20515@jade.berkeley.edu>, wmb@ENG.SUN.COM writes:

>  [wmb points out that doing Forth's closures in PostScript is easy
>   but that doing PostScript's closure(s) in Forth is not easy.  I
>   concede that this is indicative of an essential difference between
>   the two languages and raise the question: "doesn't this mean that
>   Forth's closure making words are improperly factored?"]
> It certainly does.  The problem with trying to factor them is that
> there are so many different things going on with the different kinds
> of closures.  The "branch class" of closure (BEGIN and IF structures)
> can be easily factored, but the DO class has some other weird things
> going on, the ":" class throws in some more wrinkles, and then DOES>
> weirds it out even more.  Serious "ad-hoc" action here.
Ah, but is there some small/fundamental/underlying set of closure making
words which can be combined to form these existing words such that the
'ad-hoc quirks' of these words are an artifact of the combination of the
more primitive words instead of being inherent 'ad-hoc quirks' of the most
basic building blocks?

> PostScript already provides words to convert between types, but some
> pairs of types are not mutually convertible.  The original point was
> that Forth is inherently untyped and PostScript is inherently strongly
> typed, and that this difference is fundamental.  The above argument
> supports that claim.  In Forth, you can ultimately do anything you want
> with a stack cell, and in PostScript, you can only do "type consistent"
> things.
It doesn't really matter if you can't go immediately from one type to
another.  All that matters is that some sequence of type converting words
can get you from one place to all the other places.  If the type system
has disjoint partitions, then this is true and I'll concede the point.  If
there are no disjoint partitions then it is merely a matter of convenience
(i.e. having to write your own foo->bar word that does the twelve necessary
intermediate type perturbations).
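
Whether such a sequence exists can be checked mechanically as a graph
reachability problem.  A minimal Python sketch, assuming a made-up
conversion table: the type names and edges below are illustrative only
and are not a statement of which conversions PostScript really offers.

```python
from collections import deque

# Hypothetical conversion table: each key converts directly to the
# types listed.  These edges are an assumption made for illustration.
CONVERSIONS = {
    "integer": ["real", "string"],
    "real": ["integer", "string"],
    "string": ["integer", "real", "name"],
    "name": ["string"],
    "dict": [],  # in this toy table nothing converts to or from "dict"
}


def reachable(start, table):
    """Return every type reachable from start by chaining conversions."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in table.get(current, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def has_disjoint_partitions(table):
    """True if at least one type cannot reach every other type."""
    all_types = set(table)
    return any(reachable(t, table) != all_types for t in table)
```

With the table above, has_disjoint_partitions(CONVERSIONS) is true only
because of the isolated "dict" entry; drop that entry and every type
reaches every other, leaving only the "matter of convenience" case.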

> [Streams, even if correctly done, don't solve the problem with
>  improperly factored closure creating words.]
> Absolutely right.  The dictionary manipulation words in Forth are factored
> VERY poorly.  By the way, once you have a word like $CREATE  (like CREATE
> but takes a string from the stack), then you need $: and $DEFER and
> $whatever_other_defining_words_you_care_about .  If defining word syntax
> were postfix, and the notions of declaring the action class and creating
> the name were separate, then this problem wouldn't exist.
See my comment above.  If you have the right fundamental wordset, you can
define the current closure-making words using it (although you'd most
likely want to use CODE words for speed).  BUT, once you have the right
closure making words, you can go beyond Forth's current ad-hoc-ism to
something simpler, cleaner, etc. (conceptually and implementationally).

		-Doug

---
Preferred: willett!dwp@gateway.sei.cmu.edu OR ...!sei!willett!dwp
Daily: ...!{uunet,nfsun}!willett!dwp   [in a pinch: dwp@vega.fac.cs.cmu.edu]

don@brillig.umd.edu (Don Hopkins) (03/02/90)

In article <528.UUL1.3#5129@willett.UUCP> you write:
>> PostScript already provides words to convert between types, but some
>> pairs of types are not mutually convertible.  The original point was
>> that Forth is inherently untyped and PostScript is inherently strongly
>> typed, and that this difference is fundamental.  The above argument
>> supports that claim.  In Forth, you can ultimately do anything you want
>> with a stack cell, and in PostScript, you can only do "type consistent"
>> things.
>It doesn't really matter if you can't go immediately from one type to
>another.  All that matters is that some sequence of type converting words
>can get you from one place to all the other places.  If the type system
>has disjoint partitions, then this is true and I'll concede the point.  If
>there are no disjoint partitions then it is merely a matter of convenience
>(i.e. having to write your own foo->bar word that does the twelve necessary
>intermediate type perturbations).
>

PostScript objects have their type stored with them (be they on the
stack, in arrays, in dictionaries, or anywhere else in memory).  Forth
objects do not. A cell is a cell is a cell, no matter what you think
of it. Your program cannot tell what you were thinking.  You have to
remember what type is stored where, and call the appropriate function
for that type, and forth programs have to assume that they are called
with arguments of the correct type. You can't write a forth word that
does one thing if passed an integer, or another thing if passed a
string, because there is no way it can tell by looking at the
stack. You have to have two different forth words, one for integers
and one for strings, and the programmer has to call the appropriate
one at the appropriate time.

PostScript programs can behave differently according to their type of
arguments. A good example is the ShowThing function in NeWS. It takes
as an argument a Thing. A Thing can be any of several different data
types. If it's a string, ShowThing renders the string in the current
font. If it's a dictionary, it sends a /paint message to it. If it's
the name of an icon, it paints that icon. If it's an array, it breaks
it up and deals with each element individually: a font sets the
current font, a color sets the current color, a pair of numbers
offsets the current point, and it can also contain any of the above
types. 
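
The dispatch described above can be sketched in a few lines.  This is
an illustration in Python, not the NeWS source: Font, Color, and the
action labels written to the log are hypothetical stand-ins, a Python
dict stands in for a PostScript dictionary, and icon names are left out.

```python
from dataclasses import dataclass


@dataclass
class Font:
    name: str


@dataclass
class Color:
    name: str


def show_thing(thing, log):
    """Dispatch on the runtime type of thing, recording each action in log."""
    if isinstance(thing, str):
        log.append(("render", thing))              # string: draw it
    elif isinstance(thing, dict):
        log.append(("paint", thing.get("name")))   # dictionary: send /paint
    elif isinstance(thing, Font):
        log.append(("font", thing.name))           # font: becomes current font
    elif isinstance(thing, Color):
        log.append(("color", thing.name))          # color: becomes current color
    elif isinstance(thing, list):
        if len(thing) == 2 and all(isinstance(n, (int, float)) for n in thing):
            log.append(("offset", thing[0], thing[1]))  # pair of numbers
        else:
            for element in thing:                  # array: handle each element
                show_thing(element, log)
    else:
        raise TypeError(f"don't know how to show {thing!r}")
    return log
```

Calling show_thing([Font("Times"), "hello", [10, 0], "world"], []) walks
the array once and picks the action from each element's own type tag.
The caller never says which kind of thing it is passing.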

Pointy head academic types call this "polymorphism". Each element of
an array can contain any type of object. The objects are tagged with
their type, not the places where the objects are stored. PostScript
code is simply an executable array of PostScript objects.
"Decompiling" PostScript code is trivial because of the type
information stored with each element.  There is really no PostScript
compiler. Just a scanner that translates text to typed objects. You
can extend the language by creating new control structures, etc, (like
"case") by defining new functions, not by changing the syntax of the
language or the format of code stored in memory; any extensions you
make to PostScript (in the sense of defining new functions -- I don't
mean hacking the source code to the interpreter itself) do not
necessitate changes to the PostScript "decompiler". I surround
"decompiler" in double quotes because "==", the operator which prints
a PostScript object, is as much of a decompiler as most people need,
and you could write a nice indenting decompiler in terms of "==",
"print", and a little recursion.
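
To show how little such a decompiler needs, here is a rough Python
sketch of the "little recursion" idea.  Name and Proc are hypothetical
classes standing in for PostScript name objects and executable arrays,
and string escapes are ignored.

```python
from dataclasses import dataclass


@dataclass
class Name:
    text: str
    literal: bool = False   # True prints as /foo, False as foo


@dataclass
class Proc:
    body: list              # stands in for an executable array { ... }


def decompile(obj, depth=0):
    """Return indented source lines for obj, using only its type tag."""
    pad = "  " * depth
    if isinstance(obj, Proc):
        lines = [pad + "{"]
        for item in obj.body:
            lines.extend(decompile(item, depth + 1))
        lines.append(pad + "}")
        return lines
    if isinstance(obj, Name):
        return [pad + ("/" if obj.literal else "") + obj.text]
    if isinstance(obj, str):
        return [pad + "(" + obj + ")"]
    return [pad + repr(obj)]    # numbers print as themselves
```

A new control structure defined as an ordinary function shows up as just
another Name in the body, so nothing in decompile has to change for it.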

Decompiling Forth is not trivial, because the decompiler has to know
about any in-line literals or other tricky things that might be
embedded in the code (like "literal", '(.")', "(;code)", "?branch",
"exit", and the "{{" and "}}" closures that have been mentioned). Any
extensions you make to the language that allot memory in the middle of
code, you have to teach the decompiler about.
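
A toy model of the problem, in Python.  It assumes a thread is a list
whose cells already hold word names (a real thread holds addresses, so
the real job is harder), and "(table)" is a made-up extension word that
stores one cell of data in-line.

```python
# Words whose following cell is in-line data, not a word to execute.
INLINE_OPERAND = {"literal", "?branch"}


def decompile_thread(cells, inline_words=INLINE_OPERAND):
    """List the words in a thread, pairing known words with their operand."""
    out = []
    i = 0
    while i < len(cells):
        word = cells[i]
        if word in inline_words and i + 1 < len(cells):
            out.append(f"{word} {cells[i + 1]}")   # consume the in-line cell
            i += 2
        else:
            out.append(str(word))                  # treated as a plain word
            i += 1
    return out


thread = ["literal", 5, "dup", "?branch", 7, "(table)", 42, "exit"]
untaught = decompile_thread(thread)
taught = decompile_thread(thread, INLINE_OPERAND | {"(table)"})
```

In untaught the 42 is listed as if it were a word of its own.  Only once
"(table)" is added to the set does the decompiler pair it with its data,
which is the "teaching" step the paragraph above complains about.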

The Forth compiler and scanner are written in Forth. There is no
PostScript compiler, but the scanner is not written in PostScript
(every implementation I have seen was in C). 

PostScript is not threaded. The PostScript interpreter "executes"
PostScript data structures, by looking at their type information, and
then deciding what to do (usually with a big switch statement). The
Forth inner interpreter, "next", is *much* simpler than any PostScript
interpreter -- one instruction on some machines.
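
The contrast can be sketched roughly in Python.  Both loops are
simplified assumptions, not any real interpreter: in run() numbers stand
for PostScript number objects and Python strings stand for executable
names, while threaded_run() models "next" as nothing more than calling
whatever the next cell points at.

```python
def run(program, operators):
    """PostScript style: look at each object's type, then decide what to do."""
    stack = []
    for obj in program:
        if isinstance(obj, (int, float)):
            stack.append(obj)            # numbers push themselves
        elif isinstance(obj, str):
            operators[obj](stack)        # names are looked up, then executed
        else:
            raise TypeError(f"cannot execute {obj!r}")
    return stack


def threaded_run(thread):
    """Forth style: no type test, every cell is simply executed."""
    stack = []
    for word in thread:
        word(stack)
    return stack


def lit(n):
    """Build a word that pushes n (a stand-in for a compiled literal)."""
    return lambda stack: stack.append(n)


OPERATORS = {
    "add": lambda s: s.append(s.pop() + s.pop()),
    "mul": lambda s: s.append(s.pop() * s.pop()),
}

typed_result = run([2, 3, "add", 4, "mul"], OPERATORS)
threaded_result = threaded_run(
    [lit(2), lit(3), OPERATORS["add"], lit(4), OPERATORS["mul"]]
)
```

Both compute (2 + 3) * 4, but only run() ever asks an object what it is.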

	-Don