[comp.lang.prolog] PROLOG DIGEST V6 #37

restivo@POLYA.STANFORD.EDU (Chuck Restivo) (07/02/88)
Date: Fri 17 Jun 02:53-PST
From: Chuck Restivo (The Moderator) <PROLOG-REQUEST@POLYA.STANFORD.EDU>
Reply-to: PROLOG@POLYA.STANFORD.EDU>
US-Mail: P.O. Box 4584, Stanford CA  94305
Subject: PROLOG Digest   V6 #37
To: PROLOG@POLYA.STANFORD.EDU


PROLOG Digest           Friday, 17 Jun 1988      Volume 6 : Issue 37

Today's Topics:
		Implementation - Ethics & Morality & end_of_file, 
				& Poplog Compilation

-----------------------------------------------------------------------------------------------------------------------------------

Date: Sat, 6 Feb 88 19:58:02 PST
From: quintus!ok@Sun.COM (Richard A. O'Keefe)
Subject: ripoff

It has recently come to my attention that there is a benchmark
program floating around which contains a collection of operations
on arbitrary precision integers.  These operations form the bulk
of the file, and include predicates with names like
	addn/5
	divq/4
	comz/5
and the data structure used to represent numbers is
	[d0,d1,...dk] ~~ d0 + R*(d1 + R*(... + R*dk) ...)
	where R is 10000 on 32-bit machines or 100000 on DEC-10s
	to represent non-negative integers

	separate sign (+ or -) and [d0,...,dk]
	to represent signed integers

	number(Sign,Numerator,Denominator)
	where gcd(Numerator,Denominator) = 1 or 0
	to represent rationals.

This collection of operations comes from either
	- the freely available DEC-10 Prolog file EVAL.PL, or
	- the Quintus Prolog file library(long).
But the "Author:" comment has been removed.

The DEC-10 Prolog version of the rational arithmetic package is
freely available, and I have no objection to anyone using and
distributing it.  However, I am very upset that my name has been
taken off it.  Taking the author's name off a program that you
got for nothing is a mean and contemptible thing to do, unless
of course you have the author's permission.  (If you make a
small extract from the program for instructional purposes, that's
another matter.  If someone had a file containing one or two of
the operations, I wouldn't expect them to retain my name on that.)

I am not going to name names, because I believe that the people who
distributed the copy I saw are at least two steps away from the culprit.
So if you have a copy of this program, don't think that I am accusing
the person you got it from.

In general:
  - the DEC-10 Prolog files which I sent to {SU-SCORE} may freely be
    used and distributed, even in commercial systems.
  - there is no *legal* requirement on anyone to do anything
    in particular.
  - but it is good manners to leave the name of the original author
    in the source code, and if the code is used in a commercial
    product, it is good manners to credit the original author in
    the documentation of the product.

I was really hurt and outraged by this.

------------------------------

Date: 30 Mar 88 07:34:06 GMT
From: munnari!mulga!lee@uunet.uu.net  (Lee Naish)
Subject: Behavior of read/get0 at end_of_file

In article <243@gould.doc.ic.ac.uk> cdsm@doc.ic.ac.uk (Chris Moss) writes:
>e.g.  go :- repeat,
>	    read(Term),
>            (Term=end_of_file -> true; process(Term), fail).
>
Testing for the end of file term using = is a common error.  If a
variable is read, it succeeds.
A second criticism I would make of this code is that it has a tendency
to loop.  I think that repeat/0 should also have a matching cut.

I posted a nicer way to encapsulate this backtracking style of reading
terms a while back.  It is also possible to move the read back into the
repeat loop, avoiding repeat and the need for cut.  With tro, it is just
as efficient.  Interestingly, it it works whether read fails or succeeds
on eof.

	% returns all terms read by backtracking
	% (should have stream/file arg and close file at eof?)
read_terms(Term) :-
	read(Term1),
	\+ isEof(Term1),	% if you dont want to return end_of_file
	(	Term = Term1
	;
		% \+ isEof(Term1),	% if you do want to
		read_terms(Term)
	).

Richard metioned some subtle differences between eof/1, is_eof/1 etc in
different systems.  There is another one which he missed: in NU-Prolog,
isEof/1 checks if its argument is the eof term (reading variables works)
and eof/1 returns the eof term (which is end_of_file for comatability).

Now for my suggestion of a new predicate which can be used to implement
your favourite version of read/1:

	read_term(Stream, Term)		% change the name if necessary

1) If it succeeds in reading a term T,
		Term = term(T, VarNames)
	where Varnames is some structure which allows mapping between
	variables and their names.  Wrapping a functor around the term
	enables to distinguish between variables, 'end_of_file' and real
	end of file easily.  It also lets us retain variable name
	information.

2) If end of file is encountered for the first time, or if an end of
	file marker occurs next in the stream (like ^Z on a terminal)
		Term = eof_marker

3) If eof has already been read and multiple eof markers are not
	possible
		Term = error(past_eof)
	Whether this is an error is arguable, but by explicitly
	returning something, the programmer has the choice of what
	to do.  Rather than having a proliferation of top level
	functors being returned by read_term, it seems reasonable to
	wrap the error functor around past_eof.

4) If there is a syntax error
		Term = error(syntax(X))
	where X is some indication of the error

5) If Stream is not a valid stream
		Term = error(invalid_stream(X))
	where X is some indication of the error

6) If there has just been a disk head crash
		Term = error(unix(hardware(disk_head_crash(X))))
etc, etc.

Similarly, reading characters could be done as follows

	read_character(Stream, Char)	% change name if necessary

1) If it succeeds in reading a character C,
		Char = C
	where char_to_int(C, I) can map the character to a small integer
	for get0/1.  There is no special functor needed to wrap up Char,
	assuming the other things returned by read_character/2 can be
	distinguished from characters (eg, by is_character(Char)).

2) If end of file is encountered for the first time, or if an end of
	file marker occurs next in the stream (like ^Z on a terminal)
		Char = eof_marker

3) If eof has already been read and multiple eof markers are not
	possible
		Char = error(past_eof)

4) I doubt that there will ever be a need for error(syntax(X)), but
	it should be reserved anyway.

5) If Stream is not a valid stream
		Char = error(invalid_stream(X))
etc, etc.

I think it would be useful for these (with the details fleshed out a bit
more) to be part of the standard.

-- Lee

------------------------------

From: Aaron Sloman <aarons@uk.ac.sussex.cvaxa>
Date: Sat, 30 Jan 88 23:18:17 GMT
Subject: Compilation to Assembler in Poplog

This is a  response to a  discussion in comp.compilers,  but as it  is
potentially of wider interest I'm offering  it to all of you for  your
bulletin boards. There  does not  seem to be  anything comparable  for
Lisp, so I suppose I just have to post it direct to comp.lang.lisp for
people interested in Lisp implementations? Or should I assume any such
people will read comp.compilers?

I hope  it  is of  some  interest, and  I  apologise for  its  length.
Although Sussex University has a commercial interest in Poplog I  have
tried to avoid raising any commercial issues.
                         

               COMPILING TO ASSEMBLY LANGUAGE IN POPLOG

There have  been  discussions  on  the network  about  the  merits  of
compiling to  assembly  language. Readers  may  be interested  in  the
methods used  for implementing  and porting  Poplog, a  multi-language
software  development  system  containing  incremental  compilers  for
Common Lisp, Prolog, ML and POP-11,  a Lisp-like language with a  more
readable Pascal-like syntax. Before I explain how assembly language is
used as output from the compiler during porting and system building, I
need to explain how the running system works. The mechanisms described
below  were  designed  and  implemented  by  John  Gibson,  at  Sussex
University.

All the languages in Poplog compile  to a common virtual machine,  the
Poplog VM which  is then  compiled to  native machine  code. First  an
over-simplified description:

The Poplog system allows different  languages to share a common  store
manager, and common data-types, so that a program in one language  can
call another and share data-structures.  Like most AI environments  it
also allows  incremental  compilation: individual  procedures  can  be
compiled and re-compiled and  are immediately automatically linked  in
to the rest of the system, old versions being garbage collected if  no
longer pointed to. Moreover, commands to run procedures or interrogate
data-structures can be typed in interactively, using exactly the  same
high level language  as the  programs are written  in. The  difference
between this  and  most AI  systems  is  that ALL  the  languages  are
compiled in the same way. E.g.  Prolog is not interpreted by a  POP-11
or Lisp program: they all compile (incrementally) to machine code.

The languages are all implemented using a set of tools for adding  new
incremental compilers. These tools include procedures for breaking  up
a text stream into items, and tools for planting VM instructions  when
procedures are compiled.  They are  used by the  Poplog developers  to
implement the four Poplog languages  but are also available for  users
to implement new  languages suited to  particular applications.  (E.g.
one user claims he  implemented a complete Scheme  in Poplog in  about
three weeks,  in  his spare  time,  getting a  portable  compiler  and
development  environment  for  free  once  he  had  built  the  Scheme
front-end compiler in Poplog.)

All this makes it  possible to build a  range of portable  incremental
compilers for different  sorts of programming  languages. This is  how
POP-11, PROLOG, COMMON LISP and  ML are implemented. They all  compile
to  a  common  internal  representation,  and  share  machine-specific
run-time code generators.  Thus several different  machine-independent
"front ends"  for different  languages  can share  a  machine-specific
"back end" which compiles to native machine code, which runs far  more
quickly than if the new language had been interpreted.

The actual story  is more  complicated: there are  two Poplog  virtual
machines, a high level and a low level one, both of which are language
independent and machine  independent. The high  level VM has  powerful
instructions, which  makes  it convenient  as  a target  language  for
compilers for high level  languages. This includes special  facilities
to  support  Prolog  operations,   dynamic  and  lexical  scoping   of
variables, procedure  definitions,  procedure  calls,  suspending  and
resuming processes, and so on.  Because these are quite  sophisticated
operations, the mapping from the Poplog  VM to native machine code  is
still fairly complex.

So  there   is  a   machine  independent   and  language   independent
intermediate compiler which compiles  from the high  level VM to  to a
low level VM, doing a considerable amount of optimisation on the  way.
A machine-specific back-end then translates the low-level VM to native
machine code, except when  porting or re-building  the system. In  the
latter case the final stage is translation to assembly language. (See
diagram below.)

The bulk of the core Poplog  system is written in an extended  dialect
of POP-11, with provision for C-like addressing modes, for efficiency.
We call it  SYSPOP. The system  sources, written in  SYSPOP, are  also
compiled to  the high-level  VM, and  then to  the low  level VM.  But
instead of  then  being translated  to  machine code,  the  low  level
instructions are automatically translated  to assembly language  files
for the  target machine.  This is  much easier  than producing  object
files, because there is a fairly straight-forward mapping from the low
level  VM  to  assembly  language,  and  the  programs  that  do   the
translation don't have  to worry  about formats for  object files:  we
leave that to the assembler and linker supplied by the manufacturer.

In fact, the system sources need facilities not available to users, so
the two  intermediate  virtual  machines  are  slightly  enhanced  for
SYSPOP. The following diagram summarises the situation.

                {POP-11, COMMON LISP, PROLOG, ML, SYSPOP}
                                    |
                               Compile to
                                    |
                                    V
                             [High level VM]
                          (extended for SYSPOP)
                                    |
                          Optimise & compile to
                                    |
                                    V
                             [Low level VM]
                          (modified for SYSPOP)
                                    |
                         Compile (translate) to
                                    |
                                    V
                      [Native machine instructions]
                       [or assembler - for SYSPOP]

So for ordinary  users compiling or  re-compiling their procedures  in
the system, the machine code generator is used and compilation is very
fast, with no linking required. For rebuilding the whole system we  go
via assembly language for maximum flexibility and it is indeed a  slow
process. But it does not need to be done very often, and not (yet)  by
ordinary users. Later  (1989) they  will have  the option  to use  the
system building route in order to configure the version of Poplog they
want. So we sit on  both sides of the  argument about speed raised  in
comp.compilers.

All the compilers and translators are implemented in Poplog (mostly in
POP-11). Only the last stage is machine specific. The low level VM  is
at a level that makes it possible on the VAX, for example, to generate
approximately one machine instruction per low level VM instruction. So
writing the code  generator for  something like  a VAX  or M68020  was
relatively easy. For a RISC machine  the Clipper the task is a  little
more complicated.

Porting to a new computer requires  the run-time "back end", i.e.  the
low level  VM compiler,  to be  changed and  also the  system-building
tools which output assembly language programs for the target  machine.
There are  also a  few  hand-coded assembly  files  which have  to  be
re-written for each machine. Thereafter  all the high level  languages
have   incremental    compilers   for    the   new    machine.    (The
machine-independent  system  building  tools  perform  rather  complex
tasks, such as  creating a  dictionary of procedure  names and  system
variables that have to be accessible to users at run time. So  besides
translating system source files, the tools create additional assembler
files and  also check  for consistency  between the  different  system
source files.)

I  believe  most  other  interactive   systems  provide  at  most   an
incremental compiler for one language,  and any other language has  to
be interpreted. If  everything is  interpreted, then  porting is  much
easier, but  execution is  much slower.  The advantage  of the  Poplog
approach is that  it is  not necessary to  port different  incremental
compilers to each new machine.

This makes it relatively easy  for the language designer to  implement
complex languages, since the Poplog  VM provides a varied,  extendable
set of  data-types and  operations thereon,  including facilities  for
logic  programming,  list,  record   and  array  processing,   'number
crunching',  sophisticated  control  structures  (e.g.   co-routines),
'active variables' and 'exit  actions', that is instructions  executed
whenever a procedure exits, whether normally or abnormally. Indefinite
precision arithmetic, ratios and complex numbers are accessible to all
the languages  that need  them. Both  dynamic and  lexical scoping  of
variables are provided. A tree-structured "section" mechanism  (partly
like packages)  gives further  support  for modular  design.  External
modules (e.g. programs in C or  Fortran) can be dynamically linked  in
and unlinked. A set of  facilities for accessing the operating  system
is also  provided. Poplog  allows functions  to be  treated as  "first
class" objects, and this is used to great advantage in POP-11 and ML.

The VM facilities are relatively easy to port to a range of  computers
and operating systems because the core system is mostly implemented in
SYSPOP, and is largely machine independent. Only the machine-dependent
portions mentioned above (e.g. run-time code generator, and translator
from low level  VM to  assembler), plus  a small  number of  assembler
files need be changed for a  new machine (unless the operating  system
is also new). Since the translators are all written in a high level AI
language, altering them is relatively easy.

Porting requires compiling all the SYSPOP system sources, to  generate
the corresponding  new  assmbler  files,  then  moving  them  and  the
hand-made assembler files to the new machine, where they are assembled
then linked. The  same process  is used to  rebuild the  system on  an
existing machine when new features are added deep in the system. Much
of the system is in source libraries compiled as needed by users, and
modifying those components does not require re-building.

Using this mechanism an experienced programmer with no prior knowledge
of Poplog or the target  processor was able to  port Poplog to a  RISC
machine in about  7 months.  But for  the usual  crop of  bugs in  the
operating system, assembler, and other software of the new machine the
actual porting time would have been shorter. In general, extra time is
required for user  testing, producing  system specific  documentation,
tidying up loose ends etc.

Thus 7  to  12  months  work  ports  incremental  compilers  for  four
sophisticated languages, a screen editor, and a host of utilities. Any
other languages implemented by users using the compiler-building tools
should also run immediately. So  in principle this mechanism  allows a
fixed  amount  of  work  to  port  an  indefinitely  large  number  of
incremental  compilers.  Additional  work  will  be  required  if  the
operating system  is different  from  Unix or  VMS,  or if  a  machine
specific window  manager  has  to  be provided.  This  should  not  be
necessary for workstations supporting X-windows.

The use of assembler output considerably simplifies the porting  task,
and also aids  testing and  debugging, since  the output  is far  more
intelligible to the programmer than if object files were generated.

Comments welcome.

-- Aaron Sloman,

------------------------------

End of PROLOG Digest
********************