[comp.lang.scheme] On the standard behavior of "load"

brian@granite.jpl.nasa.gov (Brian of ASTD-CP) (05/08/90)

I would like to start a discussion concerning
the Scheme essential procedure "load". The
Scheme definition, R3.99RRS, states that the
return value of "load" is unspecified. However, I
think that "load" should have a specified return
value, and the next few paragraphs explain why
I think so.

With many implementations of Scheme, load returns
#f when the file passed to it is not found, and #t
otherwise. I think this ought to be the standard
behavior of load.  In some newer, more "Graphical
User Interface"-oriented implementations of Scheme,
load presents an interactive dialog, asking the
user to pick a file from a list when the given file
is not found. With this behavior, load never
returns #f, so it cannot be used in unattended
programs when a file might be missing.

However, there is a very good reason why one might
want load to tell a program that it can't find a 
file. 

Imagine that a collection of software is divided
into modules and clients. The modules are reusable
abstract data types, procedures, libraries, etc.
The clients are programs built up from the
modules.

It is highly advantageous to maintain the modules
separately from the clients. In particular, clients
should not require knowledge of the directories
modules reside in.  Clients should simply refer to
modules by name, as in

(load-module "methods.sch")
(load-module "queues.sch")
(load-module "binary-search.sch")

If clients use only load, then they must have
hard-coded directory names, as in:

(load "MyDisk:Scheme Code:Modules:OOP:methods.sch")
(load "MyDisk:Scheme Code:Modules:OOP:classes:queues.sch")
(load "MyDisk:Scheme Code:Modules:Procedures:binary-search.sch")

and so on. The big maintenance problem occurs when
a module is moved. Using load-module, no clients
need to be changed. Using load, however, all
clients of a module must be manually tracked down
and changed.

The implementation of load-module is simple. It
searches a canonical list of module directories,
using the return value of load to decide whether to
continue or terminate the search.

(define (load-module filename)
  ;; Try each directory in turn. Relies on load returning #f when
  ;; the file is not found and a true value otherwise.
  (let iter ((canon-dir-list (list
                "MyDisk:Scheme Code:Modules:"
                "MyDisk:Scheme Code:Modules:OOP:"
                "MyDisk:Scheme Code:Modules:OOP:classes:"
                "MyDisk:Scheme Code:Modules:Procedures:" )))
    (cond
      ( (null? canon-dir-list) #f )
      ( (load (string-append (car canon-dir-list) filename)) )
      ( else (iter (cdr canon-dir-list)) ))))

(My real load-module is a little different because
it avoids loading a module more than once, but the
version above illustrates the use of load in
load-module.)
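
For illustration, a load-once variant might look roughly like the
sketch below. It is not the actual load-module mentioned above. The
names make-module-loader and try-load are hypothetical; try-load
stands for any procedure that returns #t when it loaded a file and #f
when the file was not found, which is the behavior of load argued for
in this post.

```scheme
;; Sketch only. try-load is assumed to return #t on success and #f when
;; the file is not found; make-module-loader is a hypothetical name.
(define (make-module-loader try-load module-dirs)
  (let ((loaded '()))                       ; module names loaded so far
    (lambda (filename)
      (if (member filename loaded)
          #t                                ; already loaded: do nothing
          (let iter ((dirs module-dirs))
            (cond
              ((null? dirs) #f)             ; searched everywhere, not found
              ((try-load (string-append (car dirs) filename))
               (set! loaded (cons filename loaded))
               #t)
              (else (iter (cdr dirs)))))))))
```

On a system where load itself returns #t or #f, load-module could then
be defined as (make-module-loader load canon-dir-list), with
canon-dir-list being the directory list shown earlier.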

With load-module, if a module directory is changed,
only the canonical directory list, which is in one
place in an initialization file, need be changed.
All static data concerning module locations is hidden
in this single list, rather than being distributed
in multiple copies among the clients, which is
clearly a much inferior organization.

It should be clear that load-module relies on 
load's returning #f when a file is not found in a
directory. The ostensibly more user friendly dialog
behavior of load actually forces a developer of
Scheme code to violate separation of module and
client and to pollute client code with extrinsic
information that creates a maintenance headache.

Comments?

 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
 . . Brian Beckman . . . . . . . . . . brian@granite.jpl.nasa.gov. . . .
 . . meta-disclaimer: every statement in this message is false . . . . .
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

kend@tekchips.LABS.TEK.COM (Ken Dickey) (05/08/90)

In article <3601@jato.Jpl.Nasa.Gov> brian@granite.Jpl.Nasa.Gov (Brian of ASTD-CP) writes:
...
>With many implementations of Scheme, load returns
>#f when the file passed to it is not found, and #t
>otherwise. I think this ought to be the standard
>behavior of load.
...

It is certainly desirable to have a portably standard behavior here.
It is also somewhat difficult to achieve.

In some implementations of Scheme, load acts like BEGIN in that it
returns the last value of the storage loaded.  Aside from returning
some possibly useful value directly, I think that there are 2 kinds of
information which could be returned:
  [A] Was the file/object/whatever found or not.
  [B] Was the file/object/whatever successfully loaded, partially
loaded, or not loaded?

Once error conditions are considered, the question of what LOAD
returns becomes more complex--complex enough that there has not yet
evolved a consensus for a standard Scheme error system [I would
certainly like one!].  I believe that it will be difficult to resolve
the question of what LOAD should return in the absence of a standard
interface for failure/error handling.  What if there is no error
handling mechanism?  Also to be considered are behaviors which differ
in loading compiled vs source files.

Possibilities suggested:
  Have [possibly optional] success and failure continuations passed in,
  Call a `system' error handler in a specified way,
  Pass in a function to be called on error, 
  etc.  

Adding parameters, however, gets into parameter order and issues of
first-class environments and whether LOAD takes an environment
parameter.
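
As a rough illustration of the first possibility, success and failure
continuations could be layered on top of an unspecified-returning load
as sketched below. The name load/k and the argument order are
hypothetical (the order is exactly the kind of detail that would have
to be agreed on). exists? stands in for a predicate such as
file-exists?, and loader for the system's load. The sketch covers only
question [A], whether the file was found, and not errors raised while
loading.

```scheme
;; Sketch only. load/k, exists?, and loader are hypothetical names.
(define (load/k filename exists? loader succeed fail)
  (if (exists? filename)
      (succeed (loader filename))   ; hand over whatever loader returned
      (fail filename)))             ; let the caller decide what to do
```

A caller might pass (lambda (value) #t) and (lambda (name) #f) as the
two continuations to recover the #t/#f convention discussed above.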

Given that LOAD is very system dependent, the prudent approach is to
hide it abstractly away somewhere with other system dependent code.

-Ken

net@tub.UUCP (Oliver Laumann) (05/08/90)

In article <3601@jato.Jpl.Nasa.Gov> brian@granite.Jpl.Nasa.Gov (Brian of ASTD-CP) writes:

> I would like to start a discussion concerning the Scheme essential
> procedure "load". The Scheme definition, R3.99RRS, states that the
> return value of "load" is unspecified. However, I think that "load"
> should have a specified return value [...]

I don't think "load" should have a return value which indicates success
or error.  When the file could not be opened or some other error condition
occurred, it should simply "signal an error" (whatever this means) like
all other primitive procedures do in case of an error condition.

After all, "write" doesn't return #f either when a write error occurs
(disk full or whatever).

In addition, Scheme implementations could provide an "(autoload 'foo)"
function which causes a file "foo.scm" or so to be loaded automatically
the next time the symbol foo is evaluated but not bound.  In this case one
wouldn't be able to test the return value of the call to "load" anyway.

I'm sure most Scheme implementations allow you to catch and handle an
error anyway, for instance by "hooking" into the "system error handler"
or by redefining the error handler.  In Elk, for instance, one can
easily write an equivalent to the Lisp "errset" which can be wrapped
around the call to "load" to test whether it failed.
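
A rough sketch of such a wrapper follows. It does not use Elk's own
error hooks. Instead it assumes an R7RS-style guard form, which comes
from a later standard than the one discussed here, and it assumes that
the implementation's "load" raises a condition on failure. The name
succeeds? is hypothetical.

```scheme
;; Sketch only, assuming R7RS guard is available. succeeds? is a
;; hypothetical name, not an Elk or standard procedure.
(import (scheme base))

(define (succeeds? thunk)
  (guard (condition (#t #f))   ; any raised condition counts as failure
    (thunk)
    #t))                       ; reached only if thunk returned normally
```

With this, (succeeds? (lambda () (load "methods.sch"))) would yield #f
when the load signals an error and #t otherwise, under the assumptions
stated above.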

> It is highly advantageous to maintain the modules separately from the
> clients. In particular, clients should not require knowledge of the
> directories modules reside in.  Clients should simply refer to modules
> by name, as in  (load-module "methods.sch")   [...]

> The implementation of load-module is simple. It searches a canonical
> list of module directories, using the return value of load to decide
> whether to continue or terminate the search.

I think this can be handled by the more general concept of a "load
path".  Some (many?) Scheme and Lisp implementations maintain a list
of directories that are searched for the file that is specified in
the call to "load".

In Elk, for instance, "load-path" is a variable bound in the initial
environment; the initial value is a list of directories where the
Scheme system files and extensions can be found plus the current
directory.  Applications can bind the "load-path" to a different
list of directories or add the directories where the applications'
modules reside.
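
The search itself can be sketched as follows. This is not Elk's
implementation. find-on-path and its parameters are hypothetical
names, exists? stands in for a predicate such as file-exists?, and
each directory string is assumed to end with the system's path
separator.

```scheme
;; Sketch only. find-on-path is a hypothetical name. Returns the first
;; full file name on the path for which exists? is true, or #f if none.
(define (find-on-path filename path exists?)
  (if (null? path)
      #f
      (let ((candidate (string-append (car path) filename)))
        (if (exists? candidate)
            candidate
            (find-on-path filename (cdr path) exists?)))))
```

A path-aware load would then call the ordinary "load" on the result,
or signal an error when the result is #f.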

--
    Oliver Laumann, Technical University of Berlin, Germany.
    pyramid!tub!net   net@TUB.BITNET   net@tub.cs.tu-berlin.de

dorai@titan.rice.edu (Dorai Sitaram) (05/08/90)

In article <3601@jato.Jpl.Nasa.Gov> brian@granite.Jpl.Nasa.Gov (Brian of ASTD-CP) writes:
>I would like to start a discussion concerning
>the Scheme essential procedure "load". The
>Scheme definition, R3.99RRS, states that the
>return value of "load" is unspecified. However, I
>[...]
>With many implementations of Scheme, load returns
>#f when the file passed to it is not found, and #t
>otherwise. I think this ought to be the standard
>behavior of load.  [...]

Wouldn't you be satisfied with just `file-exists?'?  Almost every
Scheme I've come across provides this procedure -- though RRRS doesn't
include it.  Obviously, `file-exists?'  and the unspecified-returning
`load' provide the capability you desire (load-module, etc.).

(define load1 
  (lambda (f) (if (file-exists? f) (begin (load f) #t) #f)))

(NB: The #f/#t `load' doesn't make `file-exists?' superfluous.  The
latter provides the ability to say that a file exists without having
to load it too.)

--dorai

shap@thebeach.wpd.sgi.com (Jonathan Shapiro) (05/10/90)

What `file-exists?' *doesn't* do is guarantee that a subsequent load will
succeed, because the file could get deleted.

The underlying issue here is that the scheme designers haven't yet
agreed on an error handling mechanism.  It would be unambiguously wrong
to specify one in the standard if it isn't right.

Jonathan Shapiro
Silicon Graphics, Inc.