ForthNet@willett.pgh.pa.us (ForthNet articles from GEnie) (05/28/91)
Category 10, Topic 15
Message 21   Sat May 25, 1991
R.BERKEY [Robert] at 20:45 PDT

To: Mitch Bradley

mb> A standard ANS Forth system is not required to reject
mb> non-printable characters in blocks, nor is it required to accept
mb> them. The characters whose meanings are precisely defined in the
mb> context of block source code are the space character and the ASCII
mb> characters with codes from 33 to 126.

Forth-83 allows word names defined on blocks to contain any character
other than a space. In Forth-83 there are no restrictions on the data
that can be stored in a block, a block is simply a 1K segment of RAM
that interchangeably resides in some form of mass storage. This "mass
storage" may itself be as little and as transient as 32K of RAM.

mb> "127 AND" is often used after KEY to remove junk like parity bits
mb> and shift bits. This technique, although quite common, is bogus,
mb> because throwing away high bits doesn't necessarily result in a
mb> meaningful 7-bit ASCII character. Instead it may for example
mb> transform a code that means the "F7" function key into the letter
mb> "T", a behavior for which I can think of no justification.

The implementation requirements for the Forth-83 Standard word KEY are
an entitlement to the "Standard Programmer", i.e., a user of a
FORTH-83 Standard System. Historically this definition arose because
of conflicting program needs involving KEY . Sometimes the program
needs all of the data coming through the I/O stream, principally here
meaning through an RS232 serial port. It's a serious handicap to
professional engineers when the implementation arbitrarily discards
data from the physical level of the I/O stream. At the logical level
of use, just what that data means may simply be information relevant
to the I/O stream, such as a parity bit, and not interesting in the
logical interpretation of the data.

The point of the system requirements involving the Forth-83 Standard
word KEY is to assure that the application level on a system-specific
basis is able to make relevant decisions regarding how to interpret
physical-level data. Further, the Forth-83 Standard guarantees a
Standard Program access to logical data, i.e., that all 128 ASCII
characters can be received. The Forth-83 Standard further specifies
the ASCII standard as a normative appendix.

mb> The correct phrase should be something like this:
mb>
mb>    126 constant max-graphic  \ Value depends on system character set
mb>    ...
mb>    key dup bl max-graphic between  if  ( char )
mb>       <insert character in buffer>
mb>    else
mb>       <process as editing character>
mb>    then

That happens to be an example of a program using the Forth-83 KEY .

Mitch, I don't understand your readiness to discard the heritage with
KEY and BLOCK . My sense of your view toward blocks is, "someone's
not-so-bright idea of a way to handle source code."

I know that you've cited control-S and control-Q as implementation
problems with KEY . Yet especially knowing your prodigious ability as
an implementor, I find it difficult to understand that you couldn't
support the KEY in a program written to use DC1 or DC3, Device Control
1 and 3, ASCII values hex 11 and 13. (If this issue is a "quibble",
then are excuses for not providing a full implementation of KEY other
than "quibbles"?)

By the way, I consider X3.J14's handling of the definitions of KEY and
BLOCK two of the bigger objections going, either being plenty of
reason for reasonable people to reject the label "ANS Forth Standard".
The deletion of EXPECT could be rationalized if KEY were
conventionally functional. KEY can't input a <RETURN>? In my opinion,
the committee decision here lacks serious commitment to
standardization.

Re: disentitling BLOCK . The nominal rationale that embedded systems
don't necessarily need source code doesn't wash. The existence of
embedded systems that don't allow source code doesn't in any way
change the existence of programs that _are_ source code.

Robert

-----
This message came from GEnie via willett. You *cannot* reply to the
author using e-mail. Please post a follow-up article, or use any
instructions the author may have included (USMail addresses,
telephone #, etc.). Report problems to: dwp@willett.pgh.pa.us _or_
uunet!willett!dwp
Mitch.Bradley@ENG.SUN.COM (05/29/91)
mb> A standard ANS Forth system is not required to reject
mb> non-printable characters in blocks, nor is it required to accept
mb> them. The characters whose meanings are precisely defined in the
mb> context of block source code are the space character and the ASCII
mb> characters with codes from 33 to 126.

rb> Forth-83 allows word names defined on blocks to contain any character
rb> other than a space. In Forth-83 there are no restrictions on the data
rb> that can be stored in a block, a block is simply a 1K segment of RAM
rb> that interchangeably resides in some form of mass storage. This "mass
rb> storage" may itself be as little and as transient as 32K of RAM.

Note that ANS Forth does not restrict the *data* that can be stored in
a block; it simply says that the *source code* for a Standard Program
should only contain printable characters and blanks. Regardless of
what Forth-83 says, there is a fair amount of variation in what real
systems do when the interpreter encounters control characters.

Another way of putting this is: If you want your program to be really
portable, you had better figure out how to write the source code with
printable characters. From a pragmatic standpoint, this is a good idea
for several reasons, including the ability to print and publish your
code, both in paper form and over the network.

Essentially, this is equivalent to saying that a Standard Program can
be published with real-world human-readable media.

> The implementation requirements for the Forth-83 Standard word KEY are
> an entitlement to the "Standard Programmer", i.e., a user of a
> FORTH-83 Standard System. Historically this definition arose because
> of conflicting program needs involving KEY . Sometimes the program
> needs all of the data coming through the I/O stream, principally here
> meaning through an RS232 serial port.

Increasingly, KEY is being used not only for data coming in over a
serial port, but also for event codes from extended keyboards and
window systems, and also for international characters. For those other
uses, high order bits do not necessarily represent "uninteresting
information" (uninteresting in the sense that the presence or absence
of a parity bit does not change the meaning of the rest of the bits).

I would claim that having to deal with visible parity bits at the
Forth program level is extremely rare these days, especially compared
to the other cases (extended keyboards, international character sets,
event codes).

> Mitch, I don't understand your readiness to discard the heritage with KEY
> and BLOCK . My sense of your view toward blocks is, "someone's not-so-
> bright idea of a way to handle source code."

The issue doesn't revolve around BLOCK, but rather around whether
Standard source code can contain "invisible" characters. I don't think
it's a good idea to allow arbitrary control characters in standard
source code. Standard implies publication or transmission or
multi-platform use or at least long-term maintainability, and all my
experience with the real world tells me that embedding arbitrary
control codes in source code is contrary to all those goals.

> I know that you've cited control-S and control-Q as implementation
> problems with KEY . [compliment deleted] I find it difficult to
> understand that you couldn't support the KEY in a program written to
> use DC1 or DC3, Device Control 1 and 3, ASCII values hex 11 and 13.

The problem is that many communications channels intercept or alter
numerous control codes, in ways that are difficult to detect. The only
reasonable alternative is to define some standard "escape sequence"
for entering otherwise-unreachable codes, similar to C's "\"
convention. This would be a nice feature to have, but I sort of doubt
that the committee would go for it, and I'm not willing to carry the
banner on this one. If you propose it, I will support it.

> Re: disentitling BLOCK . The nominal
> rationale that embedded systems don't necessarily need source code doesn't
> wash. The existence of embedded systems that don't allow source code
> doesn't in any way change the existence of programs that _are_ source code.

Presumably, embedded systems that do not allow source code will
succeed or fail based on the market acceptance of their overall value.

There are some people (I am not one of them) who are adamantly opposed
to even saying that if you have files, then you must have blocks too.
For a good argument, call Jack Brown. This battle has been fought at
least 3 times, and I doubt that anybody's mind is going to change at
this stage.

Mitch.Bradley@Eng.Sun.COM
mikeh@touch.touch.com (Mike Haas) (06/01/91)
>> Mitch, I don't understand your readiness to discard the heritage with KEY
>> and BLOCK . My sense of your view toward blocks is, "someone's not-so-
>> bright idea of a way to handle source code."
>
>The issue doesn't revolve around BLOCK, but rather around whether Standard
>source code can contain "invisible" characters. I don't think it's a good
>idea to allow arbitrary control characters in standard source code. Standard

Then don't enter them in source code. But don't limit the
functionality of the system's tools to handle such data. KEY and BLOCK
(ugh) are used for many things other than funneling data to the
interpreter/compiler. In fact, when most forth "applications" are
running, doing what they were designed to do, they often use KEY and
BLOCK (no?) and never invoke the interpreter.

Other languages have been adding page formatting control characters
since day 1...right in with their source code. Any forth system worth
its salt should be able to handle TAB characters as generic
whitespace, like BL. Why the hell not?

I HATE it when someone designs limitations into the development tools
I have to use...especially forth tools, where I can type in R> at the
keyboard! This is an advantage, this power. Why take it away?

Remember, as a forth developer, you're creating MY tools; you can't
possibly know what I'm going to want to do with them, especially with
a language like forth!

Then there are platforms (like the Amiga) where you can type in ANSI
escape codes to do VT-100-like forms management (including changing
text & background color).

A 7-bit KEY is braindamaged! Quit trying to protect me from the power
of forth. Both the Mac & Amiga use the full 8 bits for data...don't
strip this out just cause your forth interpreter can't handle it.
forth needs to ADAPT to these systems...

Even 8-bit text is old stuff, the world is now trying to figure out
international language support (16 bits & MORE!)...THIS IS WHAT WE
SHOULD BE FIGURING OUT!... not whether KEY is gonna pass 8 bits!
Ludicrous.

ADAPT... it may be in Webster's, but it sure isn't in many 'forth'
dictionaries!
Mitch.Bradley@ENG.SUN.COM (Mitch Bradley) (06/01/91)
> >The issue doesn't revolve around BLOCK, but rather around whether Standard
> >source code can contain "invisible" characters. I don't think it's a good
> >idea to allow arbitrary control characters in standard source code.

> Then don't enter them in source code. But don't limit the functionality of
> the system's tools to handle such data. KEY and BLOCK (ugh) are used for
> many things other than funneling data to the interpreter/compiler.

Oops, misunderstanding time!

ANS Forth does NOT prohibit the use of control characters in BLOCKs.
It says that the use of control characters in *block source code* is
ambiguous. In other words, some systems may treat them as word
delimiters, and other systems may treat them as characters of a word.
The BLOCK mechanism still has to deliver the control characters
without modification, but the *interpreter/compiler* is not
constrained on how it must deal with them when they are delivered from
a BLOCK. This is a statement of historical reality.

However, in *text file source code*, ANS Forth specifies that control
characters are word delimiters, i.e. "white space".

Implementors have 3 reasonable options:

1) They can make both BLOCK source and text file source work the same
   way, i.e. control characters are "white space". One might expect
   that an implementor of a new system might do it this way.

2) They can make BLOCK source treat control characters as "non-white"
   and text file source treat them as "white". This would be a
   reasonable choice for a vendor with a substantial existing customer
   base devoted to BLOCK source code, in which control characters are
   "non-white".

3) They can refuse to implement text file source code, and do whatever
   they used to do with block source. My personal prediction is that
   such vendors will get creamed in the marketplace (even FORTH, Inc.
   now offers text file source, due to popular demand), but the market
   will make the final call on that one.
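The difference between options 1 and 2 comes down to which characters
the text interpreter accepts as word delimiters. A minimal sketch in
ANS-style Forth, assuming ASCII codes; the names WHITE?, SPACE-ONLY?
and SKIP-WHITE are hypothetical and are not taken from any of the
systems discussed in this thread:

```forth
\ Hypothetical names; a sketch, not any vendor's actual parser.

\ Text-file rule: the space and every control character below it
\ (codes 0..32) delimit words.  DEL (127) is not covered here.
: WHITE?  ( char -- flag )  BL 1+ U< ;

\ Traditional block rule: only the space delimits words, so a
\ control character can be part of a name.
: SPACE-ONLY?  ( char -- flag )  BL = ;

\ Skip leading delimiters under the text-file rule.
: SKIP-WHITE  ( c-addr u -- c-addr' u' )
   BEGIN  DUP IF  OVER C@ WHITE?  ELSE  0  THEN
   WHILE  SWAP CHAR+ SWAP 1-  REPEAT ;
```

Under the first rule a TAB (code 9) separates two words; under the
second it would become part of whichever name it touches, which is
the ambiguity described for block source.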
Note that a vendor is not allowed to have the file access wordset
without the BLOCK wordset. This is an issue that was a matter of much
debate, and the final decision was a compromise. Note, however, that
vendors who really don't want to support BLOCK can either

a) Supply a "barely adequate" implementation of the BLOCK wordset in
   source form, telling users how to load it if they want it. (This
   is what I currently do on my Atari Forthmacs system; the one person
   who has ever complained changed his mind a few weeks later, and
   decided he really does prefer text files.)

b) Don't supply BLOCKs at all. Of course, this means that the vendor
   is not allowed to say he has the file wordset either. What I would
   do in that case is to say that I have *all the words mentioned in
   the file wordset*, but NOT say that I have "the FILE wordset". It's
   a technical quibble, but I believe that it's strictly legal.

> A 7-bit KEY is braindamaged!

ANS Forth DOES NOT HAVE a 7-bit KEY. The person who said that it does
was simply mistaken. KEY returns characters from the
"implementation-defined character set". If the implementation says it
has an 8-bit character set, that's what KEY returns.

However, a Standard Program without environmental dependencies cannot
*assume* that it can get any particular character outside the range of
printable 7-bit ASCII characters. This does not mean that
implementations are supposed to throw away characters outside that
range. It means that implementations are not required to simulate
8-bit characters on systems that don't have them; consequently, a
completely-portable program had better not depend on them.

A program *can* simply declare an environmental dependency on a
particular character set if it needs it. Such a program would be
labeled as:

   ANS Forth Standard Program with environmental dependency on
   8-bit ISO xxx character set.
Here is an example:

This code fragment has no environmental dependency; it only tests for
particular characters in the 7-bit printable "common subset":

   BEGIN  KEY DUP [CHAR] q <>  WHILE
      PTR @ C!  PTR @ CHAR+ PTR !
   REPEAT

Note that the preceding fragment does not prohibit the use of 8-bit
characters; but it also does not depend on them.

We could give this code an environmental dependency by replacing
"[CHAR] q" with "305", thus assuming that it is possible to receive
the character whose numeric code is "305", which is outside the
"guaranteed range". Clearly, on a machine which only has US ASCII
characters, this loop will never terminate.

An environmental dependency is not a "mark of shame". It is simply a
statement about what kind of environment the program is intended for.
There are about a zillion programs in the world that have an
"environmental dependency on DOS", and that doesn't seem to be hurting
most of them.

Mitch
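A program that declares such a dependency can also check it at load
time. The sketch below assumes the ENVIRONMENT? word and the MAX-CHAR
query string as they appear in the final ANS Forth document; the 1991
draft under discussion may differ. 8-BIT-CHARS? and NEED-8-BIT are
hypothetical names. Note that MAX-CHAR only reports the largest code
in the implementation-defined character set; it does not prove that
KEY can actually deliver that code through a given keyboard or
communications channel.

```forth
\ Hypothetical helper names; ENVIRONMENT? and MAX-CHAR are assumed
\ to behave as in the final ANS Forth document.

\ True if the system reports a character set reaching code 255.
\ An unrecognized query counts as "no".
: 8-BIT-CHARS?  ( -- flag )
   S" MAX-CHAR" ENVIRONMENT? IF  255 U< 0=  ELSE  0  THEN ;

\ Refuse to continue where the dependency cannot be met.
: NEED-8-BIT  ( -- )
   8-BIT-CHARS? 0= ABORT" needs an 8-bit character set" ;
```

Calling NEED-8-BIT at the top of a source file turns the label
"environmental dependency on an 8-bit character set" into an early,
explicit failure instead of a loop that never terminates.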
ForthNet@willett.pgh.pa.us (ForthNet articles from GEnie) (06/04/91)
Category 10, Topic 15
Message 23   Sun Jun 02, 1991
F.SERGEANT [Frank] at 23:07 CDT

**FCS** post to c10t15m1 1 Jun 91 **FCS**

RB>Sometimes the program needs all of the data coming through the I/O
RB>stream, principally here meaning through an RS232 serial port. It's
RB>a serious handicap to professional engineers when the
RB>implementation arbitrarily discards data from the physical level of
RB>the I/O stream.

Robert, you sure are right! The serial input routine on the RTX
Evaluation Board's ebforth will not accept a zero byte. Thus, to
download an executable image I had to work around that limitation
(annoying me, of course). (I changed all $00 and $01 bytes to an $01
byte followed by an $01 or an $02 byte -- all other bytes were passed
through without change as single bytes.) Not that my comment has
anything in particular to do with your and Mitch's discussion.

-- Frank
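The workaround Frank describes is a form of byte stuffing: $01 acts
as an escape, so the encoded stream never contains a zero byte. A
minimal sketch of both directions, assuming one address unit per
character, a destination buffer big enough for the worst case (twice
the input length when encoding) and well-formed input when decoding.
The names DST, PUT, STUFF-BYTE, STUFF and UNSTUFF are hypothetical;
they are not ebforth words.

```forth
\ Hypothetical names; a sketch of the scheme, not Frank's code.
\   $00 -> $01 $01     $01 -> $01 $02     other bytes unchanged

VARIABLE DST                          \ next free output address

: PUT  ( byte -- )  DST @ C!  1 DST +! ;

: STUFF-BYTE  ( byte -- )
   DUP 2 U< IF  1 PUT  1+  THEN  PUT ;   \ escape, then byte+1

: STUFF  ( src len dst -- dst len' )
   DUP DST !  >R                      \ remember start of output
   OVER + SWAP ?DO  I C@ STUFF-BYTE  LOOP
   R>  DST @ OVER - ;

: UNSTUFF  ( src len dst -- dst len' )
   DUP DST !  >R
   OVER + SWAP                        ( end src )
   BEGIN  2DUP SWAP U<  WHILE         \ while src < end
      DUP C@  SWAP 1+ SWAP            ( end src' byte )
      DUP 1 = IF                      \ escape: next byte minus one
         DROP  DUP C@ 1-  SWAP 1+ SWAP
      THEN  PUT
   REPEAT  2DROP
   R>  DST @ OVER - ;
```

The cost is one extra byte for every $00 or $01 in the image; every
other byte passes through as a single byte, as in the description.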
ForthNet@willett.pgh.pa.us (ForthNet articles from GEnie) (06/04/91)
Category 10, Topic 15
Message 24   Sun Jun 02, 1991
R.BERKEY [Robert] at 22:12 PDT

Mitch Bradley writes, 91-05-28,

> > Re: disentitling BLOCK . The nominal rationale that embedded
> > systems don't necessarily need source code doesn't wash. The
> > existence of embedded systems that don't allow source code doesn't
> > in any way change the existence of programs that _are_ source code.
>
> Presumably, embedded systems that do not allow source code will
> succeed or fail based on the market acceptance of their overall
> value. There are some people (I am not one of them) who are
> adamantly opposed to even saying that if you have files, then you
> must have blocks too. For a good argument, call Jack Brown. This
> battle has been fought at least 3 times, and I doubt that anybody's
> mind is going to change at this stage.

The issue of blocks versus files is not the issue. At issue is the
rationale for this proposed change. An existence of internal committee
debates is not a rationale for making changes.

At issue is that X3.J14 proposes a change that a "Standard Program"
cannot be source code. It's not clear that this proposal has an
interpretation.

Robert
ForthNet@willett.pgh.pa.us (ForthNet articles from GEnie) (06/04/91)
Category 10, Topic 15
Message 25   Mon Jun 03, 1991
ELLIOTT.C at 10:59 EDT

Frank, I once had to cope with a problem a little like the one you
mentioned in msg #23. The context was storage from RAM to MSDOS disk
files. What I did was convert each byte into its hex character code
representation (I didn't care about the uneconomical use of storage).
A little later I realized that the exercise was, for the purposes at
hand, unnecessary - I had probably misunderstood the uses of file
pointer manipulations.
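The conversion described here, two printable hex digits per byte,
might look like the following sketch. HEX-DIGIT and BYTE>HEX are
hypothetical names, ASCII character codes are assumed, and the input
is assumed to lie in the range 0 to 255.

```forth
\ Hypothetical names; assumes ASCII and a byte value of 0..255.

: HEX-DIGIT  ( n -- char )            \ n in 0..15
   DUP 9 > IF  7 +  THEN  [CHAR] 0 + ;   \ skip the gap from 9 to A

: BYTE>HEX  ( byte -- hi-char lo-char )
   DUP 4 RSHIFT HEX-DIGIT             \ high nibble first
   SWAP 15 AND HEX-DIGIT ;            \ then the low nibble
```

Every byte becomes two characters from the set 0-9 and A-F, so the
stored data doubles in size but contains nothing a channel or file
system is likely to intercept.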
Mitch.Bradley@ENG.SUN.COM (Mitch Bradley) (06/04/91)
> At issue is the rationale for this proposed change [making the BLOCK
> wordset optional].

The rationale is:

A lot of embedded systems get along nicely without BLOCK or any other
form of directly accessible mass storage. By splitting the BLOCK words
out into a separate optional wordset, ANS Forth recognizes this
quite-common practice, and extends the domain of ANS Forth to such
systems.

Mitch