[comp.lang.forth] ANS TC Magnet for Mass Storage

ForthNet@willett.pgh.pa.us (ForthNet articles from GEnie) (05/28/91)

Category 10,  Topic 15
Message 21        Sat May 25, 1991
R.BERKEY [Robert]            at 20:45 PDT
 
 
 To: Mitch Bradley

 mb> A standard ANS Forth system is not required to reject
 mb> non-printable characters in blocks, nor is it required to accept
 mb> them.  The characters whose meanings are precisely defined in the
 mb> context of block source code are the space character and the ASCII
 mb> characters with codes from 33 to 126.

Forth-83 allows word names defined on blocks to contain any character other
than a space.  In Forth-83 there are no restrictions on the data that can be
stored in a block; a block is simply a 1K segment of RAM that interchangeably
resides in some form of mass storage.  This "mass storage" may itself be as
little and as transient as 32K of RAM.

 mb> "127 AND" is often used after KEY to remove junk like parity bits
 mb> and shift bits.  This technique, although quite common, is bogus,
 mb> because throwing away high bits doesn't necessarily result in a
 mb> meaningful 7-bit ASCII character.  Instead it may for example
 mb> transform a code that means the "F7" function key into the letter
 mb> "T", a behavior for which I can think of no justification.

The implementation requirements for the Forth-83 Standard word KEY are an
entitlement to the "Standard Programmer", i.e., a user of a FORTH-83 Standard
System.  Historically this definition arose because of conflicting program
needs involving KEY .  Sometimes the program needs all of the data coming
through the I/O stream, principally here meaning through an RS232 serial port.
It's a serious handicap to professional engineers when the implementation
arbitrarily discards data from the physical level of the I/O stream.  At the
logical level of use, just what that data means may simply be information
relevant to the I/O stream, such as a parity bit, and not interesting in the
logical interpretation of the data.  The point of the system requirements
involving the Forth-83 Standard word KEY is to assure that the application
level on a system-specific basis is able to make relevant decisions regarding
how to interpret physical-level data.  Further, the Forth-83 Standard
guarantees a Standard Program access to logical data, i.e., that all 128 ASCII
characters can be received.  The Forth-83 Standard further specifies the ASCII
standard as a normative appendix.

 mb> The correct phrase should be something like this:

 mb> 126 constant max-graphic    \ Value depends on system character set

 mb>    ...
 mb>     key  dup bl max-graphic  between  if   ( char )
 mb>        <insert character in buffer>
 mb>     else
 mb>        <process as editing character>
 mb>     then

That happens to be an example of a program using the Forth-83 KEY .
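
BETWEEN is not a word required by Forth-83 or ANS Forth, so the quoted
fragment needs a definition before it will load.  A minimal sketch, assuming
BETWEEN is meant as an inclusive range test and that the system provides
WITHIN and CHAR (GRAPHIC? is a hypothetical helper name):

```forth
126 CONSTANT MAX-GRAPHIC    \ value depends on system character set

\ Inclusive range test (assumed meaning): true when lo <= n <= hi.
\ WITHIN tests lo <= n < hi, hence the 1+ on the upper bound.
: BETWEEN ( n lo hi -- flag )  1+ WITHIN ;

\ Hypothetical helper: is char a space or a printable ASCII character?
: GRAPHIC? ( char -- flag )  BL MAX-GRAPHIC BETWEEN ;

\ Masking with 127 turns a code above 127 into an unrelated 7-bit one.
\ 212 is an arbitrary example value, not the code of any particular key.
212 127 AND  CHAR T =  .    \ displays -1 (true): the result is the letter T
```

The range test keeps the original code intact and lets the program decide
what to do with it, which is the point both posters are arguing over.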

Mitch, I don't understand your readiness to discard the heritage with KEY and
BLOCK .  My sense of your view toward blocks is, "someone's not-so-bright idea
of a way to handle source code."  I know that you've cited control-S and
control-Q as implementation problems with KEY .  Yet especially knowing your
prodigious ability as an implementor, I find it difficult to understand that
you couldn't support the KEY in a program written to use DC1 or DC3, Device
Control 1 and 3, ASCII values hex 11 and 13.  (If this issue is a "quibble",
then are excuses for not providing a full implementation of KEY other than
"quibbles"?)

By the way, I consider X3.J14's handling of the definitions of KEY and BLOCK
two of the bigger objections going, either being plenty of reason for
reasonable people to reject the label "ANS Forth Standard".  The deletion of
EXPECT could be rationalized if KEY were conventionally functional.  KEY can't
input a <RETURN>?  In my opinion, the committee decision here lacks serious
commitment to standardization.  Re: disentitling BLOCK .  The nominal
rationale that embedded systems don't necessarily need source code doesn't
wash.  The existence of embedded systems that don't allow source code doesn't
in any way change the existence of programs that _are_ source code.

Robert

-----
This message came from GEnie via willett.  You *cannot* reply to the author
using e-mail.  Please post a follow-up article, or use any instructions
the author may have included (USMail addresses, telephone #, etc.).
Report problems to: dwp@willett.pgh.pa.us _or_ uunet!willett!dwp

Mitch.Bradley@ENG.SUN.COM (05/29/91)

 mb> A standard ANS Forth system is not required to reject
 mb> non-printable characters in blocks, nor is it required to accept
 mb> them.  The characters whose meanings are precisely defined in the
 mb> context of block source code are the space character and the ASCII
 mb> characters with codes from 33 to 126.

 rb> Forth-83 allows word names defined on blocks to contain any character
 rb> other than a space.  In Forth-83 there are no restrictions on the data
 rb> that can be stored in a block; a block is simply a 1K segment of RAM
 rb> that interchangeably resides in some form of mass storage.  This "mass
 rb> storage" may itself be as little and as transient as 32K of RAM.

Note that ANS Forth does not restrict the *data* that can be stored in a block;
it simply says that the *source code* for a Standard Program should only
contain printable characters and blanks.  Regardless of what Forth-83 says,
there is a fair amount of variation in what real systems do when the
interpreter encounters control characters.

Another way of putting this is:

        If you want your program to be really portable, you better figure
        out how to write the source code with printable characters.

From a pragmatic standpoint, this is a good idea for several reasons,
including the ability to print and publish your code, both in paper form
and over the network.  Essentially, this is equivalent to saying that
a Standard Program can be published with real-world human-readable media.
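
One way to keep control codes out of the source text while still using them
in a program is to name them through printable characters.  A minimal sketch;
the word names CTRL and [CTRL] are assumptions here, not part of Forth-83 or
ANS Forth, though some systems provide similar words:

```forth
\ CTRL x   leaves the control code for the printable character x,
\ obtained by keeping the low five bits of its ASCII code.
: CTRL ( "char" -- c )  CHAR 31 AND ;

\ Compile-time variant for use inside colon definitions.
: [CTRL] ( "char" -- )  CHAR 31 AND  POSTPONE LITERAL ; IMMEDIATE

\ DC1 (control-Q, hex 11) and DC3 (control-S, hex 13), written printably:
: XON?  ( c -- flag )  [CTRL] Q = ;
: XOFF? ( c -- flag )  [CTRL] S = ;
```

The source for XON? and XOFF? contains only printable characters, so it can
be printed or mailed without loss.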

> The implementation requirements for the Forth-83 Standard word KEY are
> an entitlement to the "Standard Programmer", i.e., a user of a
> FORTH-83 Standard System.  Historically this definition arose because
> of conflicting program needs involving KEY .  Sometimes the program
> needs all of the data coming through the I/O stream, principally here
> meaning through an RS232 serial port.

Increasingly, KEY is being used not only for data coming in over a serial
port, but also for event codes from extended keyboards and window systems,
and also for international characters.  For those other uses, high order bits
do not necessarily represent "uninteresting information" (uninteresting in
the sense that the presence or absence of a parity bit does not change the
meaning of the rest of the bits).  I would claim that having to deal with
visible parity bits at the Forth program level is extremely rare these days,
especially compared to the other cases (extended keyboards, international
character sets, event codes).

> Mitch, I don't understand your readiness to discard the heritage with KEY
> and BLOCK .  My sense of your view toward blocks is, "someone's not-so-
> bright idea of a way to handle source code."

The issue doesn't revolve around BLOCK, but rather around whether Standard
source code can contain "invisible" characters.  I don't think it's a good
idea to allow arbitrary control characters in standard source code.  Standard
implies publication or transmission or multi-platform use or at least long-
term maintainability, and all my experience with the real world tells me
that embedding arbitrary control codes in source code is contrary to all
those goals.

> I know that you've cited control-S and control-Q as implementation
> problems with KEY .  [compliment deleted] I find it difficult to
> understand that you couldn't support the KEY in a program written to
> use DC1 or DC3, Device Control 1 and 3, ASCII values hex 11 and 13.

The problem is that many communications channels intercept or alter
numerous control codes, in ways that are difficult to detect.  The
only reasonable alternative is to define some standard "escape sequence"
for entering otherwise-unreachable codes, similar to C's "\" convention.
This would be a nice feature to have, but I sort of doubt that the
committee would go for it, and I'm not willing to carry the banner on
this one.  If you propose it, I will support it.

> Re: disentitling BLOCK .  The nominal
> rationale that embedded systems don't necessarily need source code doesn't
> wash.  The existence of embedded systems that don't allow source code
> doesn't in any way change the existence of programs that _are_ source code.

Presumably, embedded systems that do not allow source code will succeed
or fail based on the market acceptance of their overall value.  There are
some people (I am not one of them) who are adamantly opposed to even
saying that if you have files, then you must have blocks too.  For a good
argument, call Jack Brown.  This battle has been fought at least 3 times,
and I doubt that anybody's mind is going to change at this stage.


Mitch.Bradley@Eng.Sun.COM

mikeh@touch.touch.com (Mike Haas) (06/01/91)

>
>> Mitch, I don't understand your readiness to discard the heritage with KEY
>> and BLOCK .  My sense of your view toward blocks is, "someone's not-so-
>> bright idea of a way to handle source code."
>
>The issue doesn't revolve around BLOCK, but rather around whether Standard
>source code can contain "invisible" characters.  I don't think it's a good
>idea to allow arbitrary control characters in standard source code.  Standard

Then don't enter them in source code.  But don't limit the functionality of
the system's tools to handle such data.  KEY and BLOCK (ugh) are used for
many things other than funneling data to the interpreter/compiler.  In fact,
when most forth "applications" are running, doing what they were designed
to do, they often use KEY and BLOCK (no?) and never invoke the interpreter.

Other languages have been adding page formatting control characters
since day 1...right in with their source code.  any forth system worth
its salt should be able to handle TAB characters as generic whitespace,
like BL.  Why the hell not?  I HATE it when someone designs limitations
into the development tools i have to use...especially forth tools, where
I can type in R> at the keyboard!  This is an advantage, this power.  Why
take it away?  Remember, as a forth developer, you're creating MY tools;
you can't possibly know what I'm going to want to do with them, especially
with a language like forth!

Then there are platforms (like the amiga) where you can type in ANSI
escape codes to do VT-100-like forms management (including changing
text & background color).  a 7-bit KEY is braindamaged!  Quit trying
to protect me from the power of forth.  Both the mac & amiga use the
full 8 bits for data...don't strip this out just cause your forth
interpreter can't handle it.  forth needs to ADAPT to these systems...
even 8-bit text is old stuff, the world is now trying to figure out
international language support (16 bits & MORE!)...THIS IS WHAT WE
SHOULD BE FIGURING OUT!...
not whether KEY is gonna pass 8 bits!  ludicrous.  ADAPT...
it may be in Websters, but it sure isn't in many 'forth' dictionaries!

Mitch.Bradley@ENG.SUN.COM (Mitch Bradley) (06/01/91)

> >The issue doesn't revolve around BLOCK, but rather around whether Standard
> >source code can contain "invisible" characters.  I don't think it's a good
> >idea to allow arbitrary control characters in standard source code.

> Then don't enter them in source code.  But don't limit the functionality of
> the system's tools to handle such data.  KEY and BLOCK (ugh) are used for
> many things other than funneling data to the interpreter/compiler.

Oops, misunderstanding time!  ANS Forth does NOT prohibit the use of control
characters in BLOCKs.  It says that the use of control characters in *block
source code* is ambiguous.  In other words, some systems may treat them
as word delimiters, and other systems may treat them as characters of a word.
The BLOCK mechanism still has to deliver the control characters without
modification, but the *interpreter/compiler* is not constrained on how it
must deal with them when they are delivered from a BLOCK.

This is a statement of historical reality.

However, in *text file source code*, ANS Forth specifies that control
characters are word delimiters, i.e. "white space".

Implementors have 3 reasonable options:

1) They can make both BLOCK source and text file source work the same way,
   i.e. control characters are "white space".  One might expect that an
   implementor of a new system might do it this way.

2) They can make BLOCK source treat control characters as "non-white" and
   text file source treat them as "white".  This would be a reasonable
   choice for a vendor with a substantial existing customer base devoted
   to BLOCK source code, in which control characters are "non-white".

3) They can refuse to implement text file source code, and do whatever they
   used to do with block source.  My personal prediction is that such vendors
   will get creamed in the marketplace (even FORTH, Inc. now offers text
   file source, due to popular demand), but the market will make the final
   call on that one.
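
The difference between options 1 and 2 comes down to the delimiter test the
text interpreter applies while scanning for the next word.  A sketch of the
two tests, with hypothetical names, leaving aside how DEL (127) is treated:

```forth
\ Text-file rule: the space and every control code below it delimit words.
: WHITE? ( char -- flag )  BL 1+ U< ;

\ Block rule under option 2: only the space character delimits words.
: BLOCK-WHITE? ( char -- flag )  BL = ;

9 WHITE? .         \ TAB under the text-file rule: displays -1
9 BLOCK-WHITE? .   \ TAB under the option 2 block rule: displays 0
```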

Note that a vendor is not allowed to have the file access wordset without
the BLOCK wordset.  This is an issue that was a matter of much debate, and
the final decision was a compromise.  Note, however, that vendors who really
don't want to support BLOCK can either

a) Supply a "barely adequate" implementation of the BLOCK wordset in source
   form, telling users how to load it if they want it.  (This is what I
   currently do on my Atari Forthmacs system; the one person who has ever
   complained changed his mind a few weeks later, and decided he really
   does prefer text files.)

b) Don't supply BLOCKs at all.  Of course, this means that the vendor is not
   allowed to say he has the file wordset either.  What I would do in that
   case is to say that I have *all the words mentioned in the file wordset*,
   but NOT say that I have "the FILE wordset".  It's a technical quibble,
   but I believe that it's strictly legal.

> A 7-bit KEY is braindamaged!

ANS Forth DOES NOT HAVE a 7-bit KEY.  The person who said that it does was
simply mistaken.

KEY returns characters from the "implementation-defined character set".
If the implementation says it has an 8-bit character set, that's what
KEY returns.

However, a Standard Program without environmental dependencies cannot
*assume* that it can get any particular character outside the range of
printable 7-bit ASCII characters.  This does not mean that implementations
are supposed to throw away characters outside that range.  It means that
implementations are not required to simulate 8-bit characters on systems
that don't have them; consequently, a completely portable program had better not
depend on them.

A program *can* simply declare an environmental dependency on a particular
character set if it needs it.   Such a program would be labeled as:

        ANS Forth Standard Program with environmental dependency on
        8-bit ISO xxx character set.

Here is an example:

        This code fragment has no environmental dependency; it only tests
        for particular characters in the 7-bit printable "common subset":

                BEGIN
                   KEY DUP  [CHAR] q  <>
                WHILE
                   PTR @ C!  PTR @ CHAR+ PTR !
                REPEAT

        Note that the preceding fragment does not prohibit the use of
        8-bit characters; but it also does not depend on them.

        We could give this code an environmental dependency by replacing
        "[CHAR] q" with "305", thus assuming that it is possible to receive
        the character whose numeric code is "305", which is outside the
        "guaranteed range".  Clearly, on a machine which only has US ASCII
        characters, this loop will never terminate.
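
        For anyone wanting to try the first fragment, here is one way to
        make it loadable.  BUF, BUF-SIZE, and the word name COLLECT are
        assumptions added for illustration, as is the bounds check; the
        loop body is the same as above.

```forth
80 CONSTANT BUF-SIZE                 \ arbitrary capacity, in characters
CREATE BUF  BUF-SIZE CHARS ALLOT     \ destination buffer
VARIABLE PTR                         \ address of the next free character

\ Store keys into BUF until "q" arrives or the buffer is full.
: COLLECT ( -- n )                   \ n = number of characters stored
   BUF PTR !
   BEGIN
      PTR @  BUF BUF-SIZE CHARS +  U<    \ room left?
   WHILE
      KEY DUP  [CHAR] q  <>
   WHILE
      PTR @ C!  PTR @ CHAR+ PTR !
   REPEAT
      DROP                           \ discard the terminating "q"
   THEN
   PTR @ BUF -  1 CHARS / ;
```

        The second WHILE exits to the DROP after REPEAT; the first WHILE
        (buffer full) exits past THEN, where no key is on the stack.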

An environmental dependency is not a "mark of shame".  It is simply a
statement about what kind of environment the program is intended for.
There are about a zillion programs in the world that have an "environmental
dependency on DOS", and that doesn't seem to be hurting most of them.

Mitch

ForthNet@willett.pgh.pa.us (ForthNet articles from GEnie) (06/04/91)

Category 10,  Topic 15
Message 23        Sun Jun 02, 1991
F.SERGEANT [Frank]           at 23:07 CDT
 
**FCS**  post to c10t15m1   1 Jun 91  **FCS**
 RB>Sometimes the program needs all of the data coming through the I/O 
 RB>stream, principally here meaning through an RS232 serial port. It's 
 RB>a serious handicap to professional engineers when the 
 RB>implementation arbitrarily discards data from the physical level of 
 RB>the I/O stream.
 Robert, you sure are right!  The serial input routine on the RTX Evaluation
Board's ebforth will not accept a zero byte.  Thus, to download an executable
image I had to work around that limitation (annoying me, of course).  (I
changed all $00 and $01 bytes to an $01 byte followed by an $01 or an $02
byte -- all other bytes were passed through without change as single bytes.)
Not that my comment has anything in particular to do with your and Mitch's
discussion.
  -- Frank
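
The escaping scheme described above can be sketched as follows.  This is not
the code actually used on the RTX board: the word names are hypothetical, the
post does not say which of $00 and $01 maps to which pair (so the mapping in
the comments is an assumption), and one byte per address unit is assumed.

```forth
VARIABLE OUT                          \ next free address in the output area
: OUT, ( byte -- )  OUT @ C!  1 OUT +! ;

\ Assumed mapping:  0 -> 1 1     1 -> 1 2     anything else -> itself
: ENCODE-BYTE ( byte -- )
   DUP 2 U< IF  1 OUT,  1+ OUT,  ELSE  OUT,  THEN ;

: ENCODE ( src len dst -- len' )      \ len' = length of the escaped copy
   DUP >R  OUT !
   OVER + SWAP ?DO  I C@ ENCODE-BYTE  LOOP
   OUT @ R> - ;

: DECODE ( src len dst -- len' )      \ inverse of ENCODE
   DUP >R  OUT !
   OVER + SWAP                        ( end src )
   BEGIN  2DUP U>  WHILE
      DUP C@ 1 = IF  CHAR+ DUP C@ 1-  ELSE  DUP C@  THEN  OUT,
      CHAR+
   REPEAT  2DROP
   OUT @ R> - ;
```

The destination for ENCODE needs room for up to twice the source length, and
DECODE as written does not guard against input that ends in the middle of an
escape pair.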

ForthNet@willett.pgh.pa.us (ForthNet articles from GEnie) (06/04/91)

Category 10,  Topic 15
Message 24        Sun Jun 02, 1991
R.BERKEY [Robert]            at 22:12 PDT
 
  Mitch Bradley writes, 91-05-28,

 > > Re: disentitling BLOCK .  The nominal rationale that embedded
 > > systems don't necessarily need source code doesn't wash.  The
 > > existence of embedded systems that don't allow source code doesn't
 > > in any way change the existence of programs that _are_ source code.
 >
 > Presumably, embedded systems that do not allow source code will
 > succeed or fail based on the market acceptance of their overall
 > value.  There are some people (I am not one of them) who are
 > adamantly opposed to even saying that if you have files, then you
 > must have blocks too.  For a good argument, call Jack Brown.  This
 > battle has been fought at least 3 times, and I doubt that anybody's
 > mind is going to change at this stage.

The issue of blocks versus files is not the issue.

At issue is the rationale for this proposed change.  An existence of internal
committee debates is not a rationale for making changes.

At issue is that X3.J14 proposes a change that a "Standard Program" cannot be
source code.  It's not clear that this proposal has an interpretation.

Robert


ForthNet@willett.pgh.pa.us (ForthNet articles from GEnie) (06/04/91)

Category 10,  Topic 15
Message 25        Mon Jun 03, 1991
ELLIOTT.C                    at 10:59 EDT
 
Frank, I once had to cope with a problem a little like the one you mentioned
in msg #23.  The context was storage from RAM to MSDOS disk files.  What I did
was convert each byte into its hex character code representation (I didn't
care about the uneconomical use of storage).  A little later I realized that
the exercise was for the current purposes unnecessary - I had probably
misunderstood the uses of file pointer manipulations.

Mitch.Bradley@ENG.SUN.COM (Mitch Bradley) (06/04/91)

> At issue is the rationale for this proposed change [making the BLOCK
> wordset optional].

The rationale is:

   A lot of embedded systems get along nicely without BLOCK or any other
   form of directly accessible mass storage.  By splitting the BLOCK words
   out into a separate optional wordset, ANS Forth recognizes this quite-
   common practice, and extends the domain of ANS Forth to such systems.

Mitch