[comp.lang.c] data validation

daveb@geac.UUCP (David Collier-Brown) (07/28/88)

From article <61251@sun.uucp>, by guy@gorodish.Sun.COM (Guy Harris):
> input data before using it (unless they have good reason to know the data will
> *never* be incorrect); I would prefer to see
> 	input, line 12: value "12038" is out of range (should be between
> 	  17 and 137)
> 
> than to see
> 	ILLEGAL ARRAY REFERENCE: "frobozz.q", line 31
> 		Subscript was 12038; should be between 17 and 137
> 

  When I look in my Multics standard systems designers' notebook, I
find the following words of wisdom...

	Subroutines should not check the number or type of input
  arguments, but assume they have been called correctly.
  Subroutines should not check the number or type of, nor validate
  the correctness of, their input arguments, unless it is part of
  their intended operation [but see below --dave].  However,
  subroutines which accept structure arguments should check the
  input structure version number for validity.


  What the Multicians are saying is exactly what Guy says: input
routines validate input as part of their purpose in life.  Other
routines assume the data is valid, and don't put in checks unless
they have to deal with "versioned" structures.
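
  For instance (a sketch of my own, with invented names -- not from
the manual): the input routine validates in the user's terms, while
the internal routine trusts its arguments apart from the structure
version:

    #include <stdio.h>
    #include <stdlib.h>

    #define FROB_VERSION 2          /* current structure version */

    struct frob {
        int version;                /* structure version number */
        int value;                  /* must lie in [17, 137] */
    };

    /* Input routine: validation is part of its job. */
    int read_frob(FILE *fp, int lineno, struct frob *fb)
    {
        int v;

        if (fscanf(fp, "%d", &v) != 1)
            return -1;
        if (v < 17 || v > 137) {
            fprintf(stderr,
                "input, line %d: value \"%d\" is out of range (should be between 17 and 137)\n",
                lineno, v);
            return -1;
        }
        fb->version = FROB_VERSION;
        fb->value = v;
        return 0;
    }

    /* Internal routine: trusts its caller, except for the version. */
    void process_frob(struct frob *fb)
    {
        if (fb->version != FROB_VERSION)
            abort();        /* caller passed an obsolete structure */
        /* ... work on fb->value with no further checks ... */
    }
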
  This is for a machine which happily passes descriptors of arrays
around, and manages to bounds-check array references in parallel
with the fetch.

  Depending on the hardware or compiler to catch invalid data by
trapping on its use has been a known bad practice since well before
Unix... The manual above is a reprint, circa 1980.


--dave (see .signature below) c-b
-- 
 David Collier-Brown.  {mnetor yunexus utgpu}!geac!daveb
 Geac Computers Ltd.,  |  Computer science loses its
 350 Steelcase Road,   |  memory, if not its mind,
 Markham, Ontario.     |  every six months.

guy@gorodish.Sun.COM (Guy Harris) (07/29/88)

>   This is for a machine which happily passes descriptors of arrays
> around, and manages to bounds-check array references in parallel
> with the fetch.

Umm, err, what machine is that?  Doesn't sound like the GE 645 or the
successors that I knew of; as I remember it, the 645 and the HIS 6180 were
fairly "conventional" machines in most regards, with no automatic
bounds-checking for array references.  (Maybe some of the weirdball "indirect
then tag" addressing modes could do this, but I don't think the PL/I compiler
made much use of most of them.)

barmar@think.COM (Barry Margolin) (07/29/88)

In article <3084@geac.UUCP> daveb@geac.UUCP (David Collier-Brown) writes:
>  This is for a machine [Multics] which happily passes descriptors of arrays
>around, and manages to bounds-check array references in parallel
>with the fetch.

Sorry, but this is not true.  None of the hardware architectures that
Multics was implemented on had parallel array-bounds checking.  There
was an option to the PL/I compiler that caused it to include
bounds-checking code before all array references.

And array descriptors were only passed when the receiving procedure
was declared as accepting variable-length arrays or strings.  This is
necessary because PL/I has builtin functions that return the array
dimensions and operations that can be done on an entire array (e.g.
"array(*) = 0;" will fill the array with zeroes) or slice of an array.

Perhaps David is thinking of segment bounds checking.  Multics has a
segmented address space (but it isn't nearly as cumbersome to use as
the 80x86's -- most programmers never really notice it), and it is
possible to set the maximum length of a segment to the length of the
array it contains.  This will cause an error if the application
attempts to reference too far into the segment.  Using this feature
requires explicit use of segments.

Most applications simply allocate arrays from the heap using the
PL/I "allocate" statement (similar to C's calloc() function), and
this does not put each object in its own segment.  (Multics doesn't
really support large numbers of small segments well, because a
single process is used for an entire login session; every program,
dynamically-linked subroutine library, and directory that you've
referenced gets its own segment, and there is a limit of 12 bits on
the segment number in some pointer formats.)

The only part of the system that I think uses this feature is the
detection of runaway recursion.  By default, the max length of the
stack segment is set smaller than the hardware limit, and when this
limit is exceeded the handler increases the max length and signals
an error (the max length has to be increased so that the stack frame
of the error handler can be pushed).  Also, the default max length
of all segments is one page less than the maximum physically
addressable; this catches negative array indexes, but I've been told
that it was done because some versions of the processor had trouble
when you auto-incremented pointers that point to the last word of a
segment.


Barry Margolin
Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

gwyn@brl-smoke.ARPA (Doug Gwyn ) (07/29/88)

In article <3084@geac.UUCP> daveb@geac.UUCP (David Collier-Brown) writes:
>	Subroutines should not check the number or type of input
>  arguments, but assume they have been called correctly.
>  What the Multicians are saying is exactly what Guy says: input
>routines validate input as part of their purpose in life.  Other
>routines assume the data is valid, and don't put in checks unless
>they have to deal with "versioned" structures.
>  Depending on the hardware or compiler to catch invalid data by
>trapping on its use has been a known bad practice since well before
>Unix...

Yes.  The basic problem is that errors detected at unanticipated
points within the bowels of a program will not be handled intelligently.
(In theory it would almost be possible to provide reasonable recovery
from every such possible error, but in practice life is too short.)

On the other hand, unless the code is developed under some formal
verification system, there is a non-negligible chance that a high-
level oversight will permit a low-level routine to be invoked
improperly.  Rather than behaving randomly, a suitable compromise
is to perform simple, CHEAP plausibility tests in the low-level
routines.  For example, check that a pointer is non-null before
dereferencing it, or check that a count is nonnegative.  Low-level
routines should try to behave as sanely as is reasonably possible.

I usually code up such verifications under control of assert(),
and turn them all off after the whole system has been thoroughly
shaken out.  Some people recommend leaving the tests enabled
forever, as inexpensive insurance.  Good run-time error handling
for a production release of a system should not rely on recovery
from such low-level interface errors anyway.
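
A sketch of the style I mean (the routine itself is hypothetical):

    #include <assert.h>
    #include <stddef.h>

    /* Low-level routine: simple, cheap plausibility tests only.
       Compiling with -DNDEBUG makes the assertions vanish. */
    void copy_items(char *dst, const char *src, long count)
    {
        assert(dst != NULL);
        assert(src != NULL);
        assert(count >= 0);

        while (count-- > 0)
            *dst++ = *src++;
    }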

jfh@rpp386.UUCP (John F. Haugh II) (07/30/88)

In article <3084@geac.UUCP> daveb@geac.UUCP (David Collier-Brown) writes:
>From article <61251@sun.uucp>, by guy@gorodish.Sun.COM (Guy Harris):
>> input data before using it (unless they have good reason to know the data will
>> *never* be incorrect);
>
>  When I look in my Multics standard systems designers' notebook, I
>find the following words of wisdom...
>
>	Subroutines should not check the number or type of input
>  arguments, but assume they have been called correctly.

in my humble opinion, subroutines on the interface boundary SHOULD check
their arguments for conformity to the interface definition.  once inside
of a system of subroutines, there is no need to check for out of band
data.

a well defined interface will specify the action to be taken on each
type of out of band data.  a very well defined interface will specify
EXACTLY how to deal with out of band values.
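
for example (a sketch only -- this interface is invented), a
boundary routine whose definition says exactly what happens to a
bad value:

    #include <errno.h>

    /* interface definition (hypothetical): set_level(n) accepts
       0..100; anything else is rejected with -1, errno == EINVAL. */
    static int level;

    int set_level(int n)
    {
        if (n < 0 || n > 100) {
            errno = EINVAL;
            return -1;
        }
        level = n;
        return 0;
    }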

>  Depending on the hardware or compiler to catch invalid data by
>trapping on its use has been a known bad practice since well before
>Unix... The manual above is a reprint, circa 1980.

unix has been around longer than circa 1980.  DEC would seem to
disagree, since they included trap-on-overflow and similar
exceptions for their double-precision machine operands when they
created the VAX family of minicomputers.
-- 
John F. Haugh II                 +--------- Cute Chocolate Quote ---------
HASA, "S" Division               | "USENET should not be confused with
UUCP:   killer!rpp386!jfh        |  something that matters, like CHOCOLATE"
DOMAIN: jfh@rpp386.uucp          |             -- with my apologizes

daveb@geac.UUCP (David Collier-Brown) (08/02/88)

> In article <3084@geac.UUCP> daveb@geac.UUCP (David Collier-Brown) writes:
>>  This is for a machine [Multics] which happily passes descriptors of arrays
>>around, and manages to bounds-check array references in parallel
>>with the fetch. 

From article <24593@think.UUCP>, by barmar@think.COM (Barry Margolin):
> Sorry, but this is not true.  None of the hardware architectures that
> Multics was implemented on had parallel array-bounds checking.  There
> was an option to the PL/I compiler that caused it to include
> bounds-checking code before all array references.

  Well, it may not appear to check, but it sure did in practice!  We
lost a (large, scientific-applications) sale because a benchmark
addressed a large array out of its bounds, and we therefore could
not run the benchmark "as written".
  In fact, it was explained that the array in question was
extremely large and had to be defined as a segment...

> Perhaps David is thinking of segment bounds checking. [...]  it is
> possible to set the maximum length of a segment to the length of the
> array it contains.  This will cause an error if the application
> attempts to reference too far into the segment.  Using this feature
> requires explicit use of segments.  Most applications simply allocate
> arrays from the heap using the PL/I "allocate" statement

 ... for the FORTRAN program in use.  FORTRAN only used a subset of
the standard parameter-passing mechanism, and caused screams of "but
it **can't** be checking the array bounds, FORTRAN doesn't know how
to find that part of the parameter list", which slowed down the
identification of the problem a lot.  Enough that they didn't
come up with a workaround in time.



  You can do this on GCOS now, by the way, by "shrinking" a
descriptor around an existing, normally allocated, array.  But that's
a different story entirely...  None of the 'buns will address-check
non-array variables without lots of special incantations -- which is
what it sounded like I was saying.

  'Taint so! And I'm sorry if I made it sound like it was.

--dave 
-- 
 David Collier-Brown.  {mnetor yunexus utgpu}!geac!daveb
 Geac Computers Ltd.,  |  Computer science loses its
 350 Steelcase Road,   |  memory, if not its mind,
 Markham, Ontario.     |  every six months.

daveb@geac.UUCP (David Collier-Brown) (08/02/88)

|From article <61251@sun.uucp>, by guy@gorodish.Sun.COM (Guy Harris):
|  input data before using it (unless they have good reason to know the data will
|  *never* be incorrect);

|In article <3084@geac.UUCP> daveb@geac.UUCP (David Collier-Brown) writes:
|   When I look in my Multics standard systems designers' notebook, I
| find the following words of wisdom...
| 
| 	Subroutines should not check the number or type of input
|   arguments, but assume they have been called correctly.

From article <4661@rpp386.UUCP>, by jfh@rpp386.UUCP (John F. Haugh II):
|  in my humble opinion, subroutines on the interface boundary SHOULD check
|  their arguments for conformity to the interface definition.

   Er, the caveat on that was to check IFF that was part of the
purpose of the subroutine.  The hierarchy proposed in the manual was
to write a user-interface which did the checking, but not a
programmer-interface with the same checks.  As Guy argues, the
checking at the programmer interface should be done with assert() or
some other facility, on the grounds that it is normally for
debugging a program during development.
   I tend to agree, as I make a lot of simple typos during
development, and assertions catch many of them.  I will admit I
occasionally write a checking interface to some existing packages
for testing purposes...
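
   (A sketch of such a checking interface, with invented names:)

    #include <assert.h>

    extern double pkg_lookup(int index);    /* the real interface */

    /* Checking wrapper, used only while testing the callers;
       it validates the argument and then passes straight through. */
    double chk_pkg_lookup(int index)
    {
        assert(index >= 17 && index <= 137);
        return pkg_lookup(index);
    }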

|  ... once inside
|  of a system of subroutines, there is no need to check for out of band
|  data.
|
|  a well defined interface will specify the action to be taken on each
|  type of out of band data.  a very well defined interface will specify
|  EXACTLY how to deal with out of band values.

   I'd suggest reserving the term "out of band data" for valid data
sent along a special channel.  Out of **range**, now...

|  unix has been around longer than circa 1980.
   I know.  The manual was so **old** it had to be re-printed.

|  ...DEC would seem to
|  disagree, since they included trap-on-overflow and similar
|  exceptions for their double-precision machine operands when they
|  created the VAX family of minicomputers.

   The range/precision of our current floating-point implementations
pose some very recalcitrant problems in predicting the input data to
an individual computation.  I still recommend trying to pass correct
operands, but admit a set of suspenders to accompany my belt can be
useful, especially when I can't figure out what the constraints
ought to be to keep the calculation from both going out of range and
losing significant figures (I'm poor at numerical analysis).
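
   (The sort of suspenders I mean -- a crude sketch, and certainly
not real numerical analysis:)

    #include <math.h>
    #include <float.h>

    /* Refuse a multiplication that would overflow, rather than
       trusting a trap to catch it afterwards. */
    int safe_mul(double x, double y, double *result)
    {
        if (y != 0.0 && fabs(x) > DBL_MAX / fabs(y))
            return -1;      /* product would be out of range */
        *result = x * y;
        return 0;
    }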

   Actually, a computer manufacturer has to provide such traps, if
only for languages which define specific processing on over- and
underflow (e.g., COBOL and PL/I), for implementing the IEEE
standard, etc.
 

-----

   Probably the best approach is to try to guarantee correct data
and behavior by input-data checking and careful logic-checking
respectively, while providing parallel checks in hardware where it
is technically and financially feasible.  Tripping over one of them
would cause the program to dump, and my old boss had a policy that
"we don't ship programs which dump"[1].


|  John F. Haugh II                 +--------- Cute Chocolate Quote ---------
|  HASA, "S" Division               | "USENET should not be confused with
|  UUCP:   killer!rpp386!jfh        |  something that matters, like CHOCOLATE"
|  DOMAIN: jfh@rpp386.uucp          |             -- with my apologizes


--dave (I like the chocolate quote) c-b
[1] Nels Patterson, Director, Honeywell TSDC.
-- 
 David Collier-Brown.  {mnetor yunexus utgpu}!geac!daveb
 Geac Computers Ltd.,  |  Computer science loses its
 350 Steelcase Road,   |  memory, if not its mind,
 Markham, Ontario.     |  every six months.

atbowler@watmath.waterloo.edu (Alan T. Bowler [SDG]) (08/15/88)

In article <61866@sun.uucp> guy@gorodish.Sun.COM (Guy Harris) writes:
>>   This is for a machine which happily passes descriptors of arrays
>> around, and manages to bounds-check array references in parallel
>> with the fetch. 
>
>Umm, err, what machine is that?  Doesn't sound like the GE 645 or the
>successors that I knew of; as I remember it, the 645 and the HIS 6180 were
>fairly "conventional" machines in most regards, with no automatic
>bounds-checking for array references.  (Maybe some of the weirdball "indirect
>then tag" addressing modes could do this, but I don't think the PL/I compiler
>made much use of most of them.)

Guy is right.  I think Dave is getting confused with the segmentation
and capability hardware of the other large Honeywell machines
(L66, DPS-8, DPS-88, DPS-90, DPS-8000, etc.).  The Multics boxes
were really just modified DPS-8's, but they did not have the
same capability features.  The protection mechanisms were done
by the same designer, who basically said "what did I do wrong on Multics?".