[comp.lang.forth] ComForth Specifications

ZMLEB@SCFVM.BITNET (Lee Brotzman) (07/22/88)

In response to a number of requests for more information about ComForth,
the Forth system developed by the South Belgium Forth Chapter, I have
been given this synopsis of ComForth by Andre Pirard, the author of the
original article entitled "Forth in an operating system".  Enjoy, but
be forewarned that this is a lengthy article (more than 400 lines).

-- Lee Brotzman
-- Moderator of the Forth Interest Group International List
-- (FIGI-L@SCFVM.BITNET)
------------------------------------------------------------------------

Following my article "Forth in an operating system", I have received
questions asking what ComForth is and whether it is available.
Despite its professional look, ComForth is not a commercial product.
It is the work of the Southern Belgium Fig Chapter, to which I belong
and whose every member has contributed to ComForth, at least by being
a critical user.
Being the only member with access to this list, I redistribute FIGI-L
to my Chapter's members. They suggested that I spend some time writing
an article explaining how we feel about Forth.
But we don't give ComForth away either. It helps us cover the hardware
expenses (and also much time) we put into our dreams.
So, to answer the general interest, we are posting the spec sheet to the list.
I hope you will excuse inevitable redundancy with my previous article.
Please, don't send Lee questions any more. Write directly to me.
I'll make the best of my available time to answer them or relay
them to the right person. But please allow for some time for answers,
I have to do all that from home.

Andre'.

                     ComForth specifications

     Interpreters make testing a program a delight,  but are slow
when  executing.  Compilers  produce fast  code  but  compilation
turnaround  time  and reduced interactive test facilities  are  a
nightmare for the programmer.  Forth is a programming system that
offers  the  advantages of both interpreters and  compilers.  Its
interactivity makes it a tremendous tool for experimenting  with
problem solutions and writing applications quickly. Its compiled code
gives very fast execution. It combines the ease of programming in
an expandable high-level strongly structured language with direct
access to machine code for versatility with equal interactivity.

     However,  the  Forth standards make Forth its own  operating
system  and only define a minimal ground onto  which  specialized
systems can be grown. Many Forth implementations do not implement
access to the host system files. They are limited to storing
1024-byte blocks of data on either an unstructured disk or its
simulation  on  a single direct access file.  This provides  high
program portability,  but only for applications that restrict
themselves to the minimal definition. This is why Forth usually excels in
specialized  fields.  To remain standard, applications designed to
interface with a particular operating system must include the
operating system interface themselves; everyone writes it his own
way and it has to be rewritten for each system.  The lack of a host
system interface also explains why these implementations cannot
generate small executable modules, use chaining, load overlays etc... This is a
pity,   because   operating  systems  are  bridges   that   allow
applications   to   share  data  and  manage  them   efficiently.
Furthermore, several features commonly found in other programming
languages are missing,  principally strings,  local data  storage
management, error recovery, screen management etc...

     ComForth  is  a professionally  designed  Forth  development
system  created and used at the "Southern Belgium FIG  chapter".
It  is  written  to combine the  powerful  interactive  debugging
facilities  of Forth and the usefulness of traditional  languages
and  operating  systems.  It uses a very simple file handling
interface to the operating system, identical across a variety  of
common  operating systems on a wide range of machines.  It  makes
possible  the easy generation of very small true modules directly
executable  from  the host system command  level.  It  integrates
several  essential language extensions and particularly  enhances
user  friendliness and the ease of development by providing  many
debugging tools.

     ComForth  has  been loved by many other people and  used  in
several economic,  scientific and industrial applications  such
as  accounting,  IBM  3270 terminal emulation and  reliable  data
transfer  protocol,   botanical  data  acquisition  and  analysis
system,  geological data acquisition and analysis (ComForth found
gold!), Eprom programmer control, thermodynamics data acquisition
and control,  cable layout computation, pasture fattening mixture
computation on handheld computer etc...

     In  summary,  the  ComForth implementation objective  is  to
eliminate  many  inconveniences  of  traditional  Forth   systems
and provide a (let us say it) comfortable programming environment
similar to, say, Turbo-Pascal(TM) but with all the added advantages
of Forth interactivity.

     Note to the Forth novice:  the following text often  uses
the term "word".  A Forth word is what would be  called a
function  or  routine in other languages.  Words make up a sentence
that is either executed or used to define another new word.  They
are organized in vocabularies.  Don't be discouraged by some very
technical remarks in the discussion below. They are meant to give
the Forth specialist full insight into ComForth. Understanding
them is not necessary for everyday use of the system.


Features

- ComForth versions support and allow source portability between:

 - MSDOS (any, version 2.0 or higher, including PCDOS and clones)
 - CP/M 80 (version 2.0 or higher)
 - CP/M 86
 - Commodore 64
 - Commodore 128 (either mode 64, mode 128 or CP/M)
 - Commodore AMIGA
 - Atari ST
 - Sinclair/Timex QL
 - Apple-Dos on Apple II or IIe, 80 column support implemented.

- Forth-83   standard   with  didactic  Forth-79  and   Fig-Forth
compatibility supersets.  Assemblers and the double number word
set are of course included.  Many extensions such as CASE ?LEAVE
?DO RECURSE,  ? .S
DUMP BASE?  ORDER, ASCII HEXA BIN OCT, B! B@, STRING S! SCAN SKIP
MATCH,  WITHIN  UWITHIN ROTATE SHIFT ASHIFT 2* 2/  DU2/,  PERFORM
EVALUATE NEWSTREAM SUSPEND RESUME, ALIGN, REPLY and the like...

- Extremely easy to use,  system independent interface words  to
access  host  system  files in sequential or direct  access  mode
(FILE  OPEN  CLOSE GET PUT READ WRITE GETLINE POINT  NOTE  INFILE
INDATA PUTEOR PUTEOF MUTEIOER IOER DELETE RENAME...).  MSDOS Path
support.

- Source  code can be maintained in and loaded from free  format,
variable length records, operating system ASCII files. No storage
waste, no update nor cataloging nor documentation problems. Files
are generally 1/3 the size of screen files.  This removes the
necessity  to cram as much as possible onto a single line and
yields  a new,  much clearer coding layout style with only related
words  on the same line and comments interspersed where they belong.

- Very convenient and complete fullscreen EDITOR to update  these
system  source  files from within Forth.  It uses general  screen
management functions that easily customize to any system (GOTOXY,
CURSOR,  CLS,  RUBLINE,  HILIGHT, LOLIGHT). Special keyboard keys
are  user customizable for editing functions.  Exiting the editor
can request interpreting the edit buffer and,  in case of  error,
re-editing with the cursor positioned at the point of error.

- Immediate  command  words  to manage system files  from  within
Forth (DIR ERA DEL REN FTYPE FPRINT FCOPY DOS etc...).

- Block   support  is  optional  and  uses  true   LRU   buffer
management.  I/O  is  done on direct access host files using  the
system independent file access interface. A line editor is provided.
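
The LRU buffer management mentioned above can be illustrated with
a minimal sketch. It is written in Python rather than Forth and it
is not ComForth's code: the class BlockBuffers and the read_block
and write_block callables are hypothetical names, and write-back
on eviction is an assumption about how such a scheme is usually
built.

```python
from collections import OrderedDict

BLOCK_SIZE = 1024  # bytes per Forth block


class BlockBuffers:
    """Fixed pool of block buffers with least-recently-used replacement."""

    def __init__(self, count, read_block, write_block):
        self.count = count
        self.read_block = read_block      # callable: block number -> bytes
        self.write_block = write_block    # callable: (block number, bytes)
        self.buffers = OrderedDict()      # block number -> [bytearray, dirty]

    def block(self, n):
        """Return the buffer for block n, loading it if necessary."""
        if n in self.buffers:
            self.buffers.move_to_end(n)   # now the most recently used
        else:
            if len(self.buffers) >= self.count:
                # Evict the least recently used buffer, saving it if modified.
                old, (data, dirty) = self.buffers.popitem(last=False)
                if dirty:
                    self.write_block(old, bytes(data))
            self.buffers[n] = [bytearray(self.read_block(n)), False]
        return self.buffers[n][0]

    def update(self, n):
        """Mark block n as modified so it is written back on eviction."""
        self.buffers[n][1] = True
```

With only two buffers, touching blocks 1, 2, 1, 3 in that order
evicts block 2, not block 1, because block 1 was used more
recently.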

- FORTHGEN:  rewrites  the  current  augmented system  as  a  new
executable file. Useful for customization or for immediate access
to the pre-compiled stable part of a development in progress. Can
also define an autostart word.

- COMGEN: Easy host system application compact modules generator.
One just types: COMGEN source-file main-word module-file map-file
It  automatically  compiles the  application,  selects  only  the
tree  of  required  words and relocates them to  a  minimally sized
headerless  execution file.  Even the defining parts of defining
words are excluded.  However, a user created dictionary can be isolated
and  searched  with  FIND in an  application  module  to  provide
keyword   interpretation  or  scanning.   COMGEN  will  handle   any
dictionary  contents  that contain neither uninitialized data areas
nor relative  references (offsets) from one word to  another.  COMGEN
diagnoses  uninitialized  data and the ComForth assemblers  issue
error messages when they are instructed to generate such offsets.
Any compiling method adhering to these restrictions  can  be
used, however. In particular, the ComForth assemblers can be replaced
with any of the user's  choice,  and any machine code can be compiled
with  very  minimal changes.
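
Selecting "the tree of required words" amounts to collecting
everything reachable from the main word. The following Python
sketch shows the idea only; it is not how COMGEN is implemented,
and the function required_words and the sample dictionary are
hypothetical.

```python
def required_words(definitions, main_word):
    """Return the set of words reachable from main_word.

    definitions maps each word name to the list of words it references.
    """
    needed = set()
    pending = [main_word]
    while pending:
        word = pending.pop()
        if word in needed:
            continue                      # already selected
        needed.add(word)
        pending.extend(definitions.get(word, []))
    return needed


# Hypothetical dictionary: only MAIN's tree ends up in the module.
definitions = {
    "MAIN": ["GREET", "CR"],
    "GREET": ["TYPE"],
    "TYPE": ["EMIT"],
    "EMIT": [],
    "CR": ["EMIT"],
    "DUMP": ["TYPE"],      # debugging aid, never referenced by MAIN
}
module = required_words(definitions, "MAIN")
```

Here DUMP is left out of the module because nothing in MAIN's
tree refers to it.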

- A smart DECOMPILER can be used to examine the components of the
whole  system nucleus to the deepest level,  by name or  address.
This  is the key to fully understanding any detail of the system
and allows any modification for specific needs.  The best  balance
between  source-like and compiled-like display has been chosen to
serve both readability and accuracy.  Each line is prefixed with
its offset in the parameter field,  the arguments of branching words
are converted  to  offsets and every target of a branch is on  a  new
line. Consequently, control structures are easily recognized, but
what they compile is also faithfully shown.

- Thus,  the development system is very easily examined, patched,
augmented  and saved as a new executable module.  Bootup literals
are  referenced using the names of the user variables with  which
they are paired.  Before saving a new system with FORTHGEN or  at
the  end of a COMGEN compilation,  they can be modified by "value
user-variable  PRESET"  for  new  initial  values  of  the   user
variables  (storage  allocation,  type of  vocabulary  structure,
input/output  or other execution vectors redefinition,  numerical
conversion radix etc...). This process does not affect the normal
operation of the current system. The user variables are refreshed
from the bootup literals by COLD,  WARM or QUIT. They are ordered
in  three  groups  so that this refresh is a simple matter  of  a
storage  move.  8 spare positions in each group are free for  the
user (filling from top) or specific system (filling from  bottom)
to  take part in the same refresh method.  Other user variables
are not initialized by the system. The total number is controlled
by one of the storage allocation values which are user  variables
themselves.

- BREAKPOINT  facility  used in conjunction with the  decompiler.
Any word reference in a high-level word can be patched to execute
a   user-defined  word  (action  word)  before   executing   that
reference.  Patches  can  even be installed in action  words.  By
default,  the action is a special interpreter allowing execution
to  stop,  and interactive commands to be issued to  examine  and
modify data. This interpreter uses ONERROR to prevent errors
caused by command line interpretation from terminating the
tested program.  A  single return  keystroke  single-steps
through the tested word while displaying its name, the offset and
name of the reference and the parameter stack.

- Relocatable  OVERLAYS  creation  and  loading.  Speeds  up  the
loading  of  utilities  anywhere in  a  development  system.  The
editor,  assemblers,  decompiler, breakpoint and module generator
are in overlays. The user can easily create his own overlays.

- FLOATING POINT model (slow) in high level source code. Includes
trigonometric  and  hyperbolic functions.  Designed  to  be  data
representation  independent and easily converted to interface any
math coprocessor or ROM floating point routines.

- 8087  math coprocessor (fast!) assembler and floating point for
MSDOS  and CP/M 86.  Floating point system library interface  for
Amiga and ROM interface for CBM64/128.

- System library interface on Amiga.

- CHAINING from one application module to the other.

- Input   command  line  editing  and  previous  lines  retrieval
(EDEXPECT).  Uses  the same keyboard customization as the  source
files editor.

- Friendly user interfaces. For example, long lists can be frozen on the
screen  by the simple use of the return key and canceled with the
C key.

- ONERROR  protects  a program section from ABORT by defining  an
"ONERROR recovery DURING protected NOERROR" structure similar  to
IF ELSE THEN.  If ABORT occurs during "protected", the stacks are
returned  to their initial depths and "recovery" is executed.  It
is used,  for example,  by the breakpoint facility to protect the
tested program against ABORTs produced by interactive debug  mode
commands  and by the compile exit of the editor to catch compilation
errors.  A  delight to use to return error conditions and a  must
for robust applications.
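
The behaviour described for ONERROR (stack depths restored, then
"recovery" executed) can be modelled roughly as follows. This is
a Python sketch under stated assumptions, not ComForth source:
Abort stands in for ABORT, and Stack and on_error are hypothetical
names. A stack is modelled as a fixed array plus a depth pointer
so that restoring the depth works whether the protected section
pushed or popped.

```python
class Abort(Exception):
    """Stands in for Forth's ABORT."""


class Stack:
    """Fixed array of cells plus a depth pointer, as in a Forth stack."""

    def __init__(self, size=64):
        self.cells = [0] * size
        self.depth = 0

    def push(self, x):
        self.cells[self.depth] = x
        self.depth += 1

    def pop(self):
        if self.depth == 0:
            raise Abort("stack empty")
        self.depth -= 1
        return self.cells[self.depth]


def on_error(stacks, protected, recovery):
    """Run protected(); on Abort, restore stack depths and run recovery()."""
    saved = [s.depth for s in stacks]
    try:
        protected()
    except Abort:
        for s, depth in zip(stacks, saved):
            s.depth = depth               # back to the depth at entry
        recovery()
```

If "protected" completes normally, "recovery" is never called and
whatever it left on the stacks stays there.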

- TEMPORARY   and  HEADERLESS  conserve  dictionary  space   (for
example:  assemble  then  remove the assembler or remove  the  no
longer  necessary  names  of  words).  The first  use  starts  an
alternate  dictionary  space  distant  from  HERE  by  the  value
contained  in  LAG  (the  modifiable maximum  size  that  can  be
compiled  before  DISPOSE).  In TEMPORARY  state,  everything  is
compiled  there.  In  HEADERLESS state,  only the name  and  link
fields are. DISPOSE removes the alternate dictionary. Example to load
the assembler: "TEMPORARY   CODE DUMMY END-CODE   ...   DISPOSE".

- Assembler  LABELs  can  be coded anywhere within  machine  code
because they generate TEMPORARY constants.  ENTRY: can be used to
redefine  the entry point anywhere in the body rather than at its
beginning.  Both  features  come in handy to relax the strict
structuring  constraints of Forth assemblers,   for  example  a  common  exit
sequence  coded  ahead  of the body and referenced  from  several
points in it.

- ONLY  ALSO  PREVIOUS vocabulary stack used  for  search  order.
SEALED  or  LINKED  vocabularies (global  option  and  override),
optional  search  of  CURRENT.  These allow  a  wide  variety  of
vocabulary structures, including Fig-Forth like.
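
For readers unfamiliar with the ONLY ALSO PREVIOUS scheme, here
is a rough Python model of a vocabulary stack. It assumes the
semantics commonly associated with the Forth-83 experimental
search order proposal (a vocabulary name replaces the top entry,
ALSO duplicates it, the root vocabulary is always searched last);
ComForth's SEALED, LINKED and CURRENT options are not modelled,
and the class SearchOrder is hypothetical.

```python
class SearchOrder:
    """Vocabulary stack; order[0] is searched first, root always last."""

    def __init__(self, root):
        self.root = root
        self.order = [root]

    def only(self):
        """ONLY: reduce the search order to the root vocabulary."""
        self.order = [self.root]

    def select(self, vocabulary):
        """Executing a vocabulary name replaces the top entry."""
        self.order[0] = vocabulary

    def also(self):
        """ALSO: duplicate the top entry so the next name adds to the order."""
        self.order.insert(0, self.order[0])

    def previous(self):
        """PREVIOUS: drop the top entry."""
        if len(self.order) > 1:
            self.order.pop(0)

    def find(self, name):
        for vocabulary in self.order + [self.root]:
            if name in vocabulary:
                return vocabulary[name]
        return None
```

In this model, the sequence "ONLY FORTH ALSO ASSEMBLER" leaves
ASSEMBLER searched first, then FORTH, then the root vocabulary.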

- Facility  to implement two user defined words to be executed at
startup   and  close  down  of  the  application  module   and/or
development  system.  Useful for setting up and  removing  things
like   a  restart  interrupt  trap,   screen   options,   printer
initialization etc...

- Restart  is  such  that an interrupt (for example caused  by  a
special  key  stroke)  stops  a  looping  program  and  does  a
warmstart.  ComForth  has a single entry point.  The first  entry
causes a coldstart.  Subsequent ones cause warmstarts. Restart is
provided on all systems except generic CP/M.

- An  aggregate  STACK is integrated to easily  store  LIFO  data
structures.  This  conserves  memory  space and is  the  base  to
implement local variables.

- Character STRINGs words making use of the aggregate stack.

- Input/Output redirection to any device and customizable printer
support. Many other key words are vectored.

- Several general and system specific examples and utilities. The
number  of  these is ever  growing.  (Assembler  examples,  Menus
driver,    Sort,   Random   numbers,   Conditional   compilation,
Structures, Graphics, Sound etc...)

- Selectable  automatic or fixed adjustment of the size of memory
allocated to Forth.  Non self-modifying code,  word alignment for
faster 6502, 8086 and 68000 operation. Very fast 8088 operation.

Documentation:  70-page tutorial explaining the ComForth special
features + 65 pages of functionally organized full glossary +  12
pages  of alphabetical  index.  Beginners  will need  an  additional
tutorial  on  Forth-83.  Unlike most  glossaries,  ComForth's  is
organized  by  subject,  so that every definition is learned  in
context. The index serves the alphabetical purpose.

Disk  formats:  usually distributed in single sided format  on
the two sides of one diskette. Specify your machine, system, disk
format and whether you are unable to read single sided diskettes.

MSDOS: IBM PC 5"1/4 or 3"1/2 format.
Atari ST: own 3"1/2 MSDOS-like.
Apple Dos: own format.
Commodore 64: own format.
Commodore 128: own format.
Commodore AMIGA: own format.
Sinclair/Timex QL: own format (tape).
CP/M 80 or 86:
  8" single density IBM 3740 format
  or one of the following 5"1/4 double density:
 IBM PC (CP/M 86)  Zenith Z100  Kaypro II  Epson QX-10  Osborne I

Other formats on arrangement.

Prepaid  with  Eurocheque  or international  postal  money  order
(apparently unknown in the States). For other cheques not drawn on
a Belgian bank,  sorry, add $10 for endorsement. You may send cash
confidently, but we cannot take upon ourselves the risk of postal
loss.

$99 (3500 FB) for the base system.
$30 (1000 FB) for the applications companion diskette.
$3  for shipping to Europe.
$8.5 to the rest of the world.

Order and payment to:

    A. Pirard
    5, Piretfontaine
    B-4052 Dolembreux (Belgium)
    33-(0)41-688069 (19-21h GMT).

ComForth is copyrighted. No part of the system can be transferred
to other parties,  even just to allow the execution of applications.
However,   the   licensee  is  entitled  to   sell   royalty-free
applications  using  any part of ComForth compiled  with  COMGEN,
that is with word identifiers removed. A limited selected set of
words  can  however  be aliased in a sealed dictionary  to  allow
their interpretation in a COMGENed application offering its users
a set of specialized commands.

Although there is no official support service,  the "Southern
Belgium  FIG chapter"  is  keen to improve its product and will  make
the  best  of available time for  documentation  supplements  and
occasional  bug fixes.  These are collated and sent to  purchasers,
whose remarks help us towards this improvement.

Typical  system sizes.  The example is MSDOS.  On other systems the
nucleus size  varies  slightly  according to the complexity  of  the
file system  or  other system requirements.  Overlay sizes  include  a
relocation table (11%).  The other files contain common or system
specific source code.

F83NUC0  COM 13054  Minimal uncustomized system nucleus.
F        COM 17574  Typical customized development system with
                    utilities.
OVLGEN   OVL   885  Overlays generation
ASM8086  OVL  7007  Assembler
COMGEN   OVL   999  Module generation, load phase
COMGEN2  OVL  3870  Module generation, generation phase
SED      OVL  4478  Fullscreen source editor (3964 bytes)
SEDMAC   OVL  6246  Same with macro feature.
DECOMPLR OVL  1805  Decompiler.
BRKPOINT OVL  1983  Breakpoint and trace.

Files only on the companion diskette (*):

QCKSORT  F83  1628  Quicksort.
SWORDS   F83  1324  Example, sorted dictionary display.
CONDINT  F83  2658  Conditional interpretation.
PRETTY   F83  3730  Source code printer, ejects between words
MESSAGES F83  5982  Loading application messages from a file
MENUS    F83  8540  Simple effective menus system
MENUEX   F83  4571  Example
MENUEX   FR   1072  Loaded messages, providing
MENUEX   US    842  language independence.
STRUC    F83  3242  Pascal like structures.
STRUCEX  F83   637  Examples.
SYNMAC   F83  1758  Defining synonyms and macros.
BASIC    F83 10212  BASIC interpreter
IMATH    F83  2894  Extended integer maths.
GO       F83  6860  Game of GO in Forth

(*)  Some  are credited to the original author and only  reworked
for improvement or adaptation to ComForth.