daveb@geac.UUCP (David Collier-Brown) (07/28/88)
From article <61251@sun.uucp>, by guy@gorodish.Sun.COM (Guy Harris):
> input data before using it (unless they have good reason to know the data will
> *never* be incorrect); I would prefer to see
>
>	input, line 12: value "12038" is out of range (should be between
>	17 and 137)
>
> than to see
>
>	ILLEGAL ARRAY REFERENCE: "frobozz.q", line 31
>	Subscript was 12038; should be between 17 and 137

  When I look in my Multics standard systems designers' notebook, I
find the following words of wisdom...

	Subroutines should not check the number or type of input
	arguments, but assume they have been called correctly.

	Subroutines should not check the number or type of, nor
	validate the correctness of, their input arguments, unless it
	is part of their intended operation [but see below --dave].
	However, subroutines which accept structure arguments should
	check the input structure version number for validity.

  What the Multicians are saying is exactly what Guy says: input
routines validate input as part of their purpose in life.  Other
routines assume the data is valid, and don't put in checks unless they
have to deal with "versioned" structures.  This is for a machine which
happily passes descriptors of arrays around, and manages to
bounds-check array references in parallel with the fetch.

  Depending on the hardware or compiler to catch invalid data by
trapping on its use has been a known bad practice since well before
Unix...  The manual above is a reprint, circa 1980.

--dave (see .signature below) c-b
-- 
 David Collier-Brown.                 {mnetor yunexus utgpu}!geac!daveb
 Geac Computers Ltd.,                 | Computer science loses its
 350 Steelcase Road,                  | memory, if not its mind,
 Markham, Ontario.                    | every six months.
guy@gorodish.Sun.COM (Guy Harris) (07/29/88)
> This is for a machine which happily passes descriptors of arrays
> around, and manages to bounds-check array references in parallel
> with the fetch.

Umm, err, what machine is that?  Doesn't sound like the GE 645 or the
successors that I knew of; as I remember it, the 645 and the HIS 6180
were fairly "conventional" machines in most regards, with no automatic
bounds-checking for array references.  (Maybe some of the weirdball
"indirect then tag" addressing modes could do this, but I don't think
the PL/I compiler made much use of most of them.)
barmar@think.COM (Barry Margolin) (07/29/88)
In article <3084@geac.UUCP> daveb@geac.UUCP (David Collier-Brown) writes:
> This is for a machine [Multics] which happily passes descriptors of arrays
>around, and manages to bounds-check array references in parallel
>with the fetch.

Sorry, but this is not true.  None of the hardware architectures that
Multics was implemented on had parallel array-bounds checking.  There
was an option to the PL/I compiler that caused it to include
bounds-checking code before all array references.  And array
descriptors were only passed when the receiving procedure was declared
as accepting variable-length arrays or strings.  This is necessary
because PL/I has builtin functions that return the array dimensions,
and operations that can be done on an entire array (e.g. "array(*) = 0;"
will fill the array with zeroes) or on a slice of an array.

Perhaps David is thinking of segment bounds checking.  Multics has a
segmented address space (but it isn't nearly as cumbersome to use as
the 80x86 -- most programmers never really notice it), and it is
possible to set the maximum length of a segment to the length of the
array it contains.  This will cause an error if the application
attempts to reference too far into the segment.  Using this feature
requires explicit use of segments.  Most applications simply allocate
arrays from the heap using the PL/I "allocate" statement (similar to
C's calloc() function), and this does not put each object in its own
segment.  (Multics doesn't really support large numbers of small
segments well, because a single process is used for an entire login
session; every program, dynamically-linked subroutine library, and
directory that you've referenced gets its own segment, and there is a
limit of 12 bits on the segment number in some pointer formats.)  The
only part of the system that I think uses this feature is the
detection of runaway recursion.
By default, the max length of the stack segment is set smaller than
the hardware limit, and when this is exceeded the handler increases
the max length and signals an error (the max length has to be
increased so that the stack frame of the error handler can be pushed).
Also, the default max length of all segments is one page less than the
maximum physically addressable; this catches negative array indexes,
but I've been told that it was done because some versions of the
processor had trouble when you auto-incremented pointers that point to
the last word of a segment.

Barry Margolin
Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar
gwyn@brl-smoke.ARPA (Doug Gwyn) (07/29/88)
In article <3084@geac.UUCP> daveb@geac.UUCP (David Collier-Brown) writes:
>	Subroutines should not check the number or type of input
>	arguments, but assume they have been called correctly.
>
>  What the Multicians are saying is exactly what Guy says: input
>routines validate input as part of their purpose in life.  Other
>routines assume the data is valid, and don't put in checks unless they
>have to deal with "versioned" structures.
>
>  Depending on the hardware or compiler to catch invalid data by
>trapping on its use has been a known bad practice since well before
>Unix...

Yes.  The basic problem is that errors detected at unanticipated
points within the bowels of a program will not be handled
intelligently.  (In theory it would almost be possible to provide
reasonable recovery from every such possible error, but in practice
life is too short.)

On the other hand, unless the code is developed under some formal
verification system, there is a non-negligible chance that a
high-level oversight will permit a low-level routine to be invoked
improperly.  Rather than behaving randomly, a suitable compromise is
to perform simple, CHEAP plausibility tests in the low-level routines.
For example, check that a pointer is non-null before dereferencing it,
or check that a count is nonnegative.  Low-level routines should try
to behave as sanely as is reasonably possible.

I usually code up such verifications under control of assert(), and
turn them all off after the whole system has been thoroughly shaken
out.  Some people recommend leaving the tests enabled forever, as
inexpensive insurance.  Good run-time error handling for a production
release of a system should not rely on recovery from such low-level
interface errors anyway.
jfh@rpp386.UUCP (John F. Haugh II) (07/30/88)
In article <3084@geac.UUCP> daveb@geac.UUCP (David Collier-Brown) writes:
>From article <61251@sun.uucp>, by guy@gorodish.Sun.COM (Guy Harris):
>> input data before using it (unless they have good reason to know the data will
>> *never* be incorrect);
>
>  When I look in my Multics standard systems designers' notebook, I
>find the following words of wisdom...
>
>	Subroutines should not check the number or type of input
>	arguments, but assume they have been called correctly.

in my humble opinion, subroutines on the interface boundary SHOULD check
their arguments for conformity to the interface definition.  once inside
of a system of subroutines, there is no need to check for out of band
data.

a well defined interface will specify the action to be taken on each
type of out of band data.  a very well defined interface will specify
EXACTLY how to deal with out of band values.

>  Depending on the hardware or compiler to catch invalid data by
>trapping on its use has been a known bad practice since well before
>Unix...  The manual above is a reprint, circa 1980.

unix has been around longer than since circa 1980.  DEC would seem to
disagree, since they included the trap on overflow/etc. exceptions to
their double precision machine operands when they created the VAX
family of minicomputers.
-- 
John F. Haugh II                  +--------- Cute Chocolate Quote ---------
HASA, "S" Division                | "USENET should not be confused with
UUCP: killer!rpp386!jfh           |  something that matters, like CHOCOLATE"
DOMAIN: jfh@rpp386.uucp           |         -- with my apologizes
daveb@geac.UUCP (David Collier-Brown) (08/02/88)
> In article <3084@geac.UUCP> daveb@geac.UUCP (David Collier-Brown) writes:
>> This is for a machine [Multics] which happily passes descriptors of arrays
>>around, and manages to bounds-check array references in parallel
>>with the fetch.

From article <24593@think.UUCP>, by barmar@think.COM (Barry Margolin):
> Sorry, but this is not true.  None of the hardware architectures that
> Multics was implemented on had parallel array-bounds checking.  There
> was an option to the PL/I compiler that caused it to include
> bounds-checking code before all array references.

Well, it may not appear to check, but it sure did in practice!  We
lost a (large, scientific-applications) sale because we couldn't get a
benchmark to run due to its addressing a large array out of its
bounds, and therefore could not run the benchmark "as written".  In
fact, it was explained that the array in question was extremely large
and had to be defined as a segment...

> Perhaps David is thinking of segment bounds checking. [...] it is
> possible to set the maximum length of a segment to the length of the
> array it contains.  This will cause an error if the application
> attempts to reference too far into the segment.  Using this feature
> requires explicit use of segments.  Most applications simply allocate
> arrays from the heap using the PL/I "allocate" statement

... for the FORTRAN program in use.  FORTRAN only used a subset of the
standard parameter-passing mechanism, and caused screams of "but it
**can't** be checking the array bounds, FORTRAN doesn't know how to
find that part of the parameter list", which slowed down the
identification of the problem a lot.  Sufficient that they didn't come
up with a work-around in time.

You can do this on GCOS now, by the way, by "shrinking" a descriptor
around an existing, normally allocated, array.  But that's a different
story entirely...
  None of the 'buns will address-check non-array variables without
lots of special incantations, which is what it sounded like I was
claiming.  'Taint so!  And I'm sorry if I made it sound like it was.

--dave
-- 
 David Collier-Brown.                 {mnetor yunexus utgpu}!geac!daveb
 Geac Computers Ltd.,                 | Computer science loses its
 350 Steelcase Road,                  | memory, if not its mind,
 Markham, Ontario.                    | every six months.
daveb@geac.UUCP (David Collier-Brown) (08/02/88)
|From article <61251@sun.uucp>, by guy@gorodish.Sun.COM (Guy Harris):
| input data before using it (unless they have good reason to know the data will
| *never* be incorrect);

|In article <3084@geac.UUCP> daveb@geac.UUCP (David Collier-Brown) writes:
|  When I look in my Multics standard systems designers' notebook, I
| find the following words of wisdom...
|
|	Subroutines should not check the number or type of input
|	arguments, but assume they have been called correctly.

From article <4661@rpp386.UUCP>, by jfh@rpp386.UUCP (John F. Haugh II):
| in my humble opinion, subroutines on the interface boundary SHOULD check
| their arguments for conformity to the interface definition.

Er, the caveat on that was to check IFF that was part of the purpose
of the subroutine.  The hierarchy proposed in the manual was to write
a user interface which did the checking, but not a programmer
interface with the same checks.

As Guy argues, the checking at the programmer interface should be done
with assert() or some other facility, this being normally for
debugging a program during development.  I tend to agree, as I make a
lot of simple typos during development, and assertions catch many of
them.  I will admit I occasionally write a checking interface to some
existing packages for testing purposes...

| ... once inside
| of a system of subroutines, there is no need to check for out of band
| data.
|
| a well defined interface will specify the action to be taken on each
| type of out of band data.  a very well defined interface will specify
| EXACTLY how to deal with out of band values.

I'd suggest reserving the term "out of band data" for valid data sent
along a special channel.  Out of **range**, now...

| unix has been around longer than since circa 1980.

I know.  The manual was so **old** it had to be re-printed.
| ...DEC would seem to
| disagree since they included the trap on overflow/etc exceptions to
| their double precision machine operands when they created the VAX
| family of minicomputers.

The range/precision of our current floating-point implementations
poses some very recalcitrant problems in predicting the input data to
an individual computation.  I still recommend trying to pass correct
operands, but admit a set of suspenders to accompany my belt can be
useful, especially when I can't figure out what the constraints ought
to be to keep the calculation from both going out of range and losing
significant figures (I'm poor at numerical analysis).

Actually, a computer manufacturer has to provide such traps, if only
for languages which define specific processing on over- and underflow
(e.g., COBOL and PL/1), for implementing the IEEE standard, etc.

-----
Probably the best approach is to try to guarantee correct data and
behavior by input-data checking and careful logic-checking,
respectively, while providing parallel checks in hardware where
technically and financially feasible.  Tripping over one of them would
cause the program to dump, and my old boss had a policy that "we don't
ship programs which dump"[1].

| John F. Haugh II                  +--------- Cute Chocolate Quote ---------
| HASA, "S" Division                | "USENET should not be confused with
| UUCP: killer!rpp386!jfh           |  something that matters, like CHOCOLATE"
| DOMAIN: jfh@rpp386.uucp           |         -- with my apologizes

--dave (I like the chocolate quote) c-b
[1] Nels Patterson, Director, Honeywell TSDC.
-- 
 David Collier-Brown.                 {mnetor yunexus utgpu}!geac!daveb
 Geac Computers Ltd.,                 | Computer science loses its
 350 Steelcase Road,                  | memory, if not its mind,
 Markham, Ontario.                    | every six months.
atbowler@watmath.waterloo.edu (Alan T. Bowler [SDG]) (08/15/88)
In article <61866@sun.uucp> guy@gorodish.Sun.COM (Guy Harris) writes:
>> This is for a machine which happily passes descriptors of arrays
>> around, and manages to bounds-check array references in parallel
>> with the fetch.
>
>Umm, err, what machine is that?  Doesn't sound like the GE 645 or the
>successors that I knew of; as I remember it, the 645 and the HIS 6180 were
>fairly "conventional" machines in most regards, with no automatic
>bounds-checking for array references.  (Maybe some of the weirdball "indirect
>then tag" addressing modes could do this, but I don't think the PL/I compiler
>made much use of most of them.)

Guy is right.  I think Dave is getting confused with the segmentation
and capability hardware of the other large Honeywell machines (L66,
DPS-8, DPS-88, DPS-90, DPS-8000, etc.).  The Multics boxes were really
just modified DPS-8's, but they did not have the same capability
features.  The protection mechanisms were done by the same designer,
who basically said "what did I do wrong on Multics?".