[comp.unix.wizards] Argument validity checking

ggw@wolves.uucp (Gregory G. Woodbury) (01/19/90)

While playing around with yet another subroutine library to perform interactive
editing of fields under curses(3x) I came face-to-face with a missing feature
of UN*X (and probably most C language environments).

When a subroutine depends on the user to pass addresses (strings, structures,
or functions) that the subroutine is going to use, and the subroutine wants
to be robust about not killing the process if the user makes a mistake,
validity checking the arguments passed is one of the front line defenses.

The problem, however, is that UN*X environments (at least Sys5 and related
ones) do not provide a general means of determining if a given address is
going to generate a memory fault of some kind.  By this I mean that before
using the address (to call a function for example) there is no way to discern
that the address is not available to the process.

Some programs can use signals or other exception trapping mechanisms to
catch bad references after the fact and attempt to fix up -- but this is
not a general method.
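
A minimal sketch of the after-the-fact approach, assuming a POSIX system with sigaction() and sigsetjmp(). The names probe_read and probe_fault are made up for illustration. As noted above, this is not a general method: it uses a single global jump buffer, is not safe with threads, and only shows that one byte was readable at the moment of the test.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

static sigjmp_buf probe_env;   /* hypothetical: one global jump buffer */

/* Fault handler: abandon the faulting read and return to probe_read(). */
static void probe_fault(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);
}

/* Returns 1 if one byte at addr could be read, 0 if the read faulted. */
int probe_read(const void *addr)
{
    struct sigaction sa, old_segv, old_bus;
    volatile int ok = 0;

    sa.sa_handler = probe_fault;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    /* savemask = 1 so the signal mask is restored after the jump */
    if (sigsetjmp(probe_env, 1) == 0) {
        volatile char c = *(const volatile char *)addr;  /* may fault */
        (void)c;
        ok = 1;
    }

    /* Put the caller's handlers back before returning. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return ok;
}
```

A library routine could call probe_read(ptr) on each pointer argument before using it, at the cost of several system calls per check.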

Some architectures provide a machine instruction to "probe" an address
to determine access, but generally such instructions or facilities are
not available at the C interface.

I can easily see some of the complications of implementing such a
facility (handling paged out memory, dealing with shared memory
libraries and such like) but began wondering if other variants of UN*X
have provided such a facility, or how other programmers deal with the
desire to be robust in a non-robust environment like most UN*Xes?

-- 
Gregory G. Woodbury
Sysop/owner Wolves Den UNIX BBS, Durham NC
UUCP: ...dukcds!wolves!ggw   ...dukeac!wolves!ggw           [use the maps!]
Domain: ggw@cds.duke.edu  ggw@ac.duke.edu  ggw%wolves@ac.duke.edu
Phone: +1 919 493 1998 (Home)  +1 919 684 6126 (Work)
[The line eater is a boojum snark! ]           <standard disclaimers apply>

wittig@gmdzi.UUCP (Georg Wittig) (01/22/90)

ggw@wolves.uucp (Gregory G. Woodbury) writes:
>When a subroutine depends on the user to pass addresses (strings, structures,
>or functions) that the subroutine is going to use, and the subroutine wants
>to be robust about not killing the process if the user makes a mistake,
>validity checking the arguments passed is one of the front line defenses.

>The problem, however, is that UN*X environments (at least Sys5 and related
>ones) do not provide a general means of determining if a given address is
>going to generate a memory fault of some kind.

My solution is the following one:

	#define MIN_NON_NIL_PTR ((unsigned long) 1L)
	#define MAX_NON_NIL_PTR ((unsigned long) 0x00ffffffL)

	if ( ! ( ((unsigned long) ptr_in_question) >= MIN_NON_NIL_PTR   &&
		 ((unsigned long) ptr_in_question) <= MAX_NON_NIL_PTR ) )
	{	... get_angry_or_whatever () ...
	}
or, if you allow a nil ptr:

	if (ptr_in_question != 0   &&   (...see above...))

I know, that's not a perfect solution. The values MIN_NON_NIL_PTR and
MAX_NON_NIL_PTR may vary from machine to machine. You know how to use #ifdef :-)
The condition ``MIN <= ptr <= MAX'' may be more complicated, and so on, and so
on ...

BUT it works on a surprising number of machines.

Does someone know if there exists a portable ANSI C conforming solution for that
problem?
-- 
Georg Wittig   GMD-Z1.BI   P.O. Box 1240   D-5205 St. Augustin 1 (West Germany)
email: wittig@gmdzi.uucp   phone: (+49 2241) 14-2294
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"Freedom's just another word for nothing left to lose" (Kris Kristofferson)

michael@stb.uucp (Michael Gersten) (02/02/90)

<Sigh>. Guys, to find out if an address is valid or not, pass it to
access as a filename. It has to check for that being valid and in your
address, and you can see if it gives you EACCESS or not.

So there is a use for access() after all.

		Michael
-- 
		Michael
denwa!stb!michael anes.ucla.edu!stb!michael 
"The 80's: Ten years that came in a row."

arielf@taux01.UUCP (Ariel Faigon) (02/04/90)

In <1990Jan26.003654.6080@NCoast.ORG> Brandon S. Allbery writes:
| As quoted from <1891@gmdzi.UUCP> by wittig@gmdzi.UUCP (Georg Wittig):
| +---------------
| | My solution is the following one:
| | 
| | 	#define MIN_NON_NIL_PTR ((unsigned long) 1L)
| | 	#define MAX_NON_NIL_PTR ((unsigned long) 0x00ffffffL)
| +---------------
|
I liked Brandon's original suggestion (to pass the address to some
system-call which checks for EFAULT).
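
A rough sketch of that idea, assuming a POSIX system whose kernel reports EFAULT when write() is handed a buffer outside the process's address space. The function name addr_readable is invented here, and writing into a pipe is my own choice: some devices may never look at the buffer, so they would not report the error.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <unistd.h>

/*
 * Returns 1 if the kernel could read one byte at addr,
 * 0 if it reported EFAULT, and -1 on any other failure.
 */
int addr_readable(const void *addr)
{
    int fds[2];
    int result;
    ssize_t n;

    if (pipe(fds) != 0)
        return -1;

    /* The kernel must copy the byte into the pipe, so it checks addr. */
    n = write(fds[1], addr, 1);
    if (n == 1)
        result = 1;
    else if (n == -1 && errno == EFAULT)
        result = 0;
    else
        result = -1;

    close(fds[0]);
    close(fds[1]);
    return result;
}
```

This costs three or four system calls per check, and it only tests readability of a single byte, so it is no faster or more complete than the range tests; it just lets the kernel do the looking.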

Anyway, without claiming that the following solution is portable/general/
whatever I'll post my contribution to this thread,
just because on some systems it may be a bit better than Georg's solution
(although basically the same idea).

Quoted from some derivative of a 4.x BSD manual on end(3):

NAME
     end, etext, edata - last locations in program

SYNOPSIS
     extern end;
     extern etext;
     extern edata;

So (I add 'start' which may be defined in your C startup module):

#define IN_MY_TEXT(addr) ((void *) &start <= (addr) < (void *) &etext)
#define IN_MY_DATA(addr) (!(IN_MY_TEXT(addr)) && (addr) < (void *) &end)
#define IN_MY_HEAP(addr) ((void *) &end <= (addr) < (void *) sbrk(0))
#define IN_MY_ADDRESS_SPACE(addr) \
	(IN_MY_TEXT(addr) || IN_MY_DATA(addr) || IN_MY_HEAP(addr))

(disclaimer: this code wasn't tested).

This still doesn't handle gaps, shared memory segments, and stack space.
You can check (again, not bullet-proof) for an address near the top of
your stack by comparing 'addr' to some local variable address.

Just another approximation for the truth :-)
-- 
Ariel Faigon, CTP group, NSTA
National Semiconductor (Israel)
6 Maskit st.  P.O.B. 3007, Herzlia 46104, Israel   Tel. (972)52-522312
arielf%taux01@nsc.com   @{hplabs,pyramid,sun,decwrl} 34 48 E / 32 10 N

arielf@taux01.UUCP (Ariel Faigon) (02/04/90)

Ooops, I just wrote:
#define IN_MY_TEXT(addr) ((void *) &start <= (addr) < (void *) &etext)
                                          ^^^^^^^^^^^
#define IN_MY_HEAP(addr) ((void *) &end <= (addr) < (void *) sbrk(0))
					^^^^^^^^^^^

You need of course separate comparisons here
like in:
	((void *) &start <= (addr) && (addr) < (void *) &etext)

As I said, the code wasn't tested, not even reviewed enough. Sorry.
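
For illustration, here is the same idea written as functions with separate comparisons, assuming a system whose linker supplies etext, edata and end as described in end(3). The helper names in_range, in_my_data and in_my_heap are made up, the text range is left out because 'start' is not a standard symbol, and on systems that randomize the break the heap range may include an unmapped gap.

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <unistd.h>

/* Supplied by the linker on many UNIX systems; see end(3). */
extern char etext, edata, end;

/* Half-open range test lo <= addr < hi, done on integers so that
   unrelated pointers are never compared directly. */
static int in_range(const void *addr, const void *lo, const void *hi)
{
    uintptr_t a = (uintptr_t)addr;
    return (uintptr_t)lo <= a && a < (uintptr_t)hi;
}

/* Initialized and uninitialized data lie between etext and end. */
int in_my_data(const void *addr)
{
    return in_range(addr, &etext, &end);
}

/* The sbrk() heap runs from end up to the current break. */
int in_my_heap(const void *addr)
{
    return in_range(addr, &end, sbrk(0));
}
```

Like the macros, this is only an approximation: it says nothing about the stack, shared memory, or memory obtained by other means.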
-- 
Ariel Faigon, CTP group, NSTA
National Semiconductor (Israel)
6 Maskit st.  P.O.B. 3007, Herzlia 46104, Israel   Tel. (972)52-522312
arielf%taux01@nsc.com   @{hplabs,pyramid,sun,decwrl} 34 48 E / 32 10 N

lehners@uniol.UUCP (Joerg Lehners) (02/05/90)

Hello !

michael@stb.uucp (Michael Gersten) writes:
><Sigh>. Guys, to find out if an address is valid or not, pass it to
>access as a filename. It has to check for that being valid and in your
>address, and you can see if it gives you EACCESS or not.

But that would cause tons of useless disk I/O.
And that routine would be really slow if the buffer (interpreted
as a path by access()) is a valid path to a file on, e.g., a mounted floppy
disk.

>So there is a use for access() after all.

I hope Michael is just joking ...

  Joerg
--
/ UUCP:    lehners@uniol              | Joerg Lehners                  \
|       ...!uunet!unido!uniol!lehners | Fachbereich 10 Informatik ARBI |
| BITNET:  066065 AT DOLUNI1          | Universitaet Oldenburg         |
\ Inhouse: aragorn!joerg              | D-2900 Oldenburg               /

lca@spodv4.UUCP (Lars H Carlsson) (02/05/90)

In article <1990Feb2.070437.2695@stb.uucp>, michael@stb.uucp (Michael Gersten) writes:
> <Sigh>. Guys, to find out if an address is valid or not, pass it to
> access as a filename. It has to check for that being valid and in your
> address, and you can see if it gives you EACCESS or not.
> 
> So there is a use for access() after all.


X/Open page ACCESS(2).1
"
...
	[EACCES]	Permission bits of the file mode do not permit
			the requested access.
...
"

	(there are access and access ;-)

LH

chris@mimsy.umd.edu (Chris Torek) (02/05/90)

This whole discussion has been rather amazing.  In most cases, there is
little difference between a program that, when run, says

	% compute 2 + 2
	Segmentation fault (core dumped)
	% 

and one that says

	% compute 2 + 2
	!*797tKG
	%

where the former used an invalid address, and the latter used a valid but
incorrect address.  Testing whether an address can be read or written does
not tell whether that address *should* be read or written.  Much better
would be, for instance, a program that says:

	% compute 2 + 2
	compute: panic: add_integers: invalid data type code 47!
	compute: This program has discovered itself to be buggy.
		Please notify the vendor, including what you did
		and the exact output from the program.
	Segmentation fault (core dumped)
	% 

Address validity checking is at best a minor part of real validity
checking.  The core dump provides enough information to locate the bad
address, which is as much as the program could have done anyway (since
it must assume, once something has gone wrong, that *anything* could go
wrong).
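
A small sketch of the kind of check meant here, in C. The value structure, the type codes and the panic() helper are all invented for illustration; the point is only that the routine tests what the data *means* before using it, and dumps core with a useful message when the test fails.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical type codes for a tagged value. */
enum value_type { T_INTEGER = 1, T_REAL = 2 };

struct value {
    int    type;   /* one of enum value_type */
    long   ival;   /* used when type == T_INTEGER */
    double rval;   /* used when type == T_REAL */
};

/* Report an internal inconsistency, then dump core for the debugger. */
static void panic(const char *where, const char *what, int code)
{
    fprintf(stderr, "compute: panic: %s: %s %d!\n", where, what, code);
    fprintf(stderr, "compute: This program has discovered itself to be buggy.\n");
    abort();
}

/* Adds two integer values, checking the type codes first. */
long add_integers(const struct value *a, const struct value *b)
{
    if (a->type != T_INTEGER)
        panic("add_integers", "invalid data type code", a->type);
    if (b->type != T_INTEGER)
        panic("add_integers", "invalid data type code", b->type);
    return a->ival + b->ival;
}
```

A pointer to garbage that happens to be readable passes any address test, but it is very unlikely to carry a valid type code, so this check catches it.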

There are a few exceptions to this rule, but they are fairly rare.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris