[comp.unix.wizards] brk's zero-fill behavior on VAXen

mouse@mcgill-vision.UUCP (der Mouse) (11/08/86)

In article <7208@elsie.UUCP>, ado@elsie.UUCP (Arthur David Olson) writes:
[paraphrased]
> "brk" and "sbrk" don't promise the contents of any new memory they
> may create.  But on a VAX it's always zero.
[quoted]
> Can system performance be improved by avoiding zero filling of the new
> memory?

Probably.  But probably not measurably.  The VAX has a pte type known
as "zero-fill on demand" which means that the page is created full of
zeros when it is first referenced.  This, for instance, is how the bss
segment is normally set up (I think; the kernel code is spread over
several routines).
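
A minimal sketch of how one might observe this from user code, assuming
a Unix-like system that still provides sbrk().  The helper name
count_nonzero_after_sbrk is made up for illustration.  A result of 0
only shows that this particular kernel handed back zeroed pages; it is
not something brk/sbrk promise.

```c
#define _DEFAULT_SOURCE   /* assumption: needed on glibc to declare sbrk() */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Grow the data segment by n bytes with sbrk() and count how many of
 * the newly added bytes are nonzero.  Returns -1 if sbrk() fails.
 */
long count_nonzero_after_sbrk(size_t n)
{
    unsigned char *p = sbrk((intptr_t)n);
    long nonzero = 0;

    if (p == (void *)-1)
        return -1;
    for (size_t i = 0; i < n; i++)
        if (p[i] != 0)
            nonzero++;
    return nonzero;
}
```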

					der Mouse

USA: {ihnp4,decvax,akgua,utzoo,etc}!utcsri!mcgill-vision!mouse
     think!mosart!mcgill-vision!mouse
Europe: mcvax!decvax!utcsri!mcgill-vision!mouse
ARPAnet: think!mosart!mcgill-vision!mouse@harvard.harvard.edu

	Aren't you glad you don't shave with Occam's Razor?

rcodi@yabbie.rmit.oz (Ian Donaldson) (11/10/86)

In article <2447@hcr.UUCP>, mike@hcr.UUCP (Mike Tilson) writes:
> I'd like to point out that there is another very good reason to
> set newly allocated memory to a fixed value:  buggy programs are much
> less likely to exhibit non-deterministic behavior, which makes it
> much easier to fix problems.  If newly allocated memory were initialized
> with random values, then tracking down wild pointers, etc., would be much
> harder.

I might point out that initializing such memory with zero is less likely
to reveal bugs in a program than would be initializing with a constant
garbage value (e.g. 0x3e).  Now, if a pointer that lived in such memory
were used, it would be 0x3e3e3e3e, a value that will cause most
CPUs to give a bus error or seg fault, because (1) if the pointer
is a pointer to an int, then it is an odd address, causing many
CPUs such as the 68000 to crap out; and (2) very few programs have
addresses that live up that high in their data or that low in their stack
segments.  Initializing to zero will only cause machines that
disallow references to low memory (e.g. Suns) to show up the error.
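
A small sketch of the idea, assuming it is done in a malloc() wrapper
rather than in the kernel.  The names poison_alloc and first_word are
hypothetical; 0x3e is the example value used above.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define POISON_BYTE 0x3e   /* the example garbage value from the post */

/* malloc() wrapper that fills the new block with POISON_BYTE. */
void *poison_alloc(size_t n)
{
    void *p = malloc(n);

    if (p != NULL)
        memset(p, POISON_BYTE, n);
    return p;
}

/*
 * Read the first four bytes of a block as a 32-bit value, which is
 * what a program would see if it mistook them for a valid pointer.
 */
uint32_t first_word(const void *block)
{
    uint32_t w;

    memcpy(&w, block, sizeof w);
    return w;
}
```

Every byte is the same, so the value read back does not depend on byte
order.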

The CDC Cyber 170 series uses this concept to advantage with most languages;
since it has 60-bits (a silly number, I agree), it sets all 'bss' storage to
0600000000000004nnnnn, where nnnnnn is the address of the storage.  Since
pointers on the Cyber cannot exceed 131071 (0377777), any reference
to the data as a pointer will fail.  The 06 part is used so that the
hardware can trap any arithmetic operations on such data as overflows.
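
A sketch of that word layout, holding the 60-bit word in a uint64_t.
The prefix is taken literally from the pattern above.  Treating the
"4" digit as bit 17, so that the low 18 bits always exceed 0377777, is
an assumption made for this sketch and not something stated above; the
function names are hypothetical.

```c
#include <stdint.h>

/* Octal 0600000000000004 followed by five more octal digits = 60 bits. */
#define UNDEF_BASE (UINT64_C(0600000000000004) << 15)
#define ADDR_MASK  UINT64_C(0377777)   /* 17-bit user address     */
#define PTR_FIELD  UINT64_C(0777777)   /* low 18 bits of the word */

/* Build the "undefined" word for the storage cell at addr. */
uint64_t undef_word(uint32_t addr)
{
    return UNDEF_BASE | ((uint64_t)addr & ADDR_MASK);
}

/* True if w carries the undefined-storage prefix. */
int is_undef_word(uint64_t w)
{
    return (w & ~ADDR_MASK) == UNDEF_BASE;
}

/* Recover the address stored inside an undefined word. */
uint32_t undef_word_address(uint64_t w)
{
    return (uint32_t)(w & ADDR_MASK);
}

/* A word is usable as a pointer only if its low 18 bits fit in 17. */
int usable_as_pointer(uint64_t w)
{
    return (w & PTR_FIELD) <= ADDR_MASK;
}
```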

Fortran, Pascal and several other languages use this to advantage to give
sensible post-mortem dumps, as it is always known with reasonable
probability which variables are undefined, since the address of the
variable is inside its contents.

Minnesota Pascal-4 uses all this to great advantage, as when run-time
tests are on, even stack-frames are initialized this way, making it
very easy to debug programs that use uninitialized variables.

Pointers declared in parts of the program where run-time-checks
are switched on are also physically larger than normal, to accommodate
extra information (the key) so that the pointer can be checked for
validity.  When a new() is done, a unique key is tacked on top of the object
allocated, that must match the key in the pointer referencing it, otherwise
a "pointer-invalid" run-time error occurs.
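
The key check could be sketched in C roughly as follows.  This is not
the Pascal-4 implementation: it assumes a small static pool instead of
a real heap (so a stale pointer can be checked without touching freed
memory), and all of the names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_SLOTS 16

struct slot {
    uint32_t key;      /* key stored with the object; 0 means free */
    int      in_use;
    long     value;    /* the object itself */
};

struct keyed_ptr {
    struct slot *obj;  /* where the object lives */
    uint32_t     key;  /* key the object had when allocated */
};

static struct slot pool[POOL_SLOTS];
static uint32_t next_key = 1;

/* Like new(): allocate an object and stamp it with a fresh key. */
struct keyed_ptr kp_new(void)
{
    struct keyed_ptr p = { NULL, 0 };

    for (size_t i = 0; i < POOL_SLOTS; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            pool[i].key = next_key++;
            pool[i].value = 0;
            p.obj = &pool[i];
            p.key = pool[i].key;
            break;
        }
    }
    return p;
}

/* The pointer is valid only while its key matches the object's key. */
int kp_valid(struct keyed_ptr p)
{
    return p.obj != NULL && p.key != 0 && p.obj->key == p.key;
}

/* Like dispose(): clear the key so stale pointers stop matching. */
void kp_dispose(struct keyed_ptr p)
{
    if (kp_valid(p)) {
        p.obj->in_use = 0;
        p.obj->key = 0;
    }
}

/* Checked dereference; NULL stands in for the run-time error. */
long *kp_deref(struct keyed_ptr p)
{
    return kp_valid(p) ? &p.obj->value : NULL;
}
```

A pointer kept across a dispose fails the check even after its slot
has been handed out again, because the new object gets a new key.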

On the Cyber, this is easy, since there are so many bits available in a word.

Perhaps for the sake of run-time checking available with languages such
as Pascal on a 32-bit machine you could sacrifice one state of
the 4G available to be classified as 'undefined'.  An obvious state is due
to 2's complement machines having an imbalance in the range of
signed numbers.  16-bit numbers go from -32768 to 32767.  You could
probably steal the -32768 for such checking without affecting too many
programs.  Similarly for 32-bit ints (0x80000000 I think?).
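
A rough sketch of what such a check might look like for 32-bit ints,
written in C purely for illustration.  UNDEFINED_I32 and checked_add
are hypothetical names; a compiler doing this would emit the test
inline.

```c
#include <stdint.h>

/* The one value stolen from the range: 0x80000000 as a bit pattern. */
#define UNDEFINED_I32 INT32_MIN

int is_defined(int32_t v)
{
    return v != UNDEFINED_I32;
}

/*
 * Add two ints with run-time checking.  Returns 0 and stores the sum
 * on success; returns -1 if either operand is undefined, or if the
 * sum overflows or would land on the reserved value.
 */
int checked_add(int32_t a, int32_t b, int32_t *out)
{
    int64_t wide;

    if (!is_defined(a) || !is_defined(b))
        return -1;
    wide = (int64_t)a + b;
    if (wide <= INT32_MIN || wide > INT32_MAX)
        return -1;
    *out = (int32_t)wide;
    return 0;
}
```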

Pity you can't do a lot of this checking in C without breaking huge
amounts of code.  Therefore, Pascal++ :-)

Ian Donaldson.

henry@utzoo.UUCP (Henry Spencer) (11/12/86)

> ... When a new() is done, a unique key is tacked on top of the object
> allocated, that must match the key in the pointer referencing it, otherwise
> a "pointer-invalid" run-time error occurs.
> 
> On the Cyber, this is easy, since there are so many bits available in a word.

Yet another similar trick:  in the Algol 68 implementation for the Cyber,
from CDC Netherlands I think it was, the garbage collector uses the extra
bits in the pointers as mark bits, thereby using zero extra storage!
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry

karl@haddock.UUCP (Karl Heuer) (11/12/86)

In article <363@yabbie.rmit.oz> rcodi@yabbie.rmit.oz (Ian Donaldson) writes:
>[Initializing to zero is inferior to] a constant garbage value (eg: 0x3e).
>Now, if a pointer was to be used that lived in such memory, it would be:
>0x3e3e3e3e, a value that will cause most CPUs to give a bus-error or seg
>fault, because (1) ... it is an odd-address,

A minor quibble here; 0x3e is even.

>Perhaps for the sake of run-time checking available with languages such
>as Pascal on a 32-bit machine you could sacrifice one state of
>the 4G available to be classified as 'undefined'.  An obvious state is
>[the most negative number on two's complement machines].

Of course, using this value throws away the benefit of having all bytes in
the "garbage value" be the same.  But anyway...  I don't mind losing this
value for objects used arithmetically (I don't trust operations on MAXNEG
anyway), but you'd have to make an exception for objects used as bit-patterns
("unsigned" in C).  (I had problems with this once, using a language that
enforced such a "garbage value".  I think it was a graphics program, and it
turned out to be impossible to draw a certain one-bit pattern.)

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
P.S. I like the way VAXen do this with floating point (the result of some
operation is NaN, which triggers an FPE on the *next* usage, giving the user
a chance to test for it first).

david@ukma.uky.csnet (David Herron, NPR Lover) (11/12/86)

Hey guys!  Why not just initialize things when you declare/allocate them?
Wouldn't that just make so much more sense?

[I can hear the chant in the back of my mind... "Fixed in C++!"]
-- 
David Herron,  cbosgd!ukma!david, david@UKMA.BITNET, david@ms.uky.csnet
(I'm also "postmaster", "news", "netnews", "uucp", "mmdf", and ...)

"What have I got in my pocketses?" -- I never heard such a stupid damn riddle!

coleman@sdcsvax.UCSD.EDU (Don Coleman) (11/13/86)

The reason VAXen zero-fill brk'ed data is security (it wouldn't be good
for a snoopy user to watch until somebody exits vi after editing a
very private file, and then do lots of brks, and get access to the 'dirty'
pages that the vi was using...).

There are also a lot of other benefits in terms of the predictability of 
programs when they fail, and it makes core images quite a bit cleaner.

But this is *not* something to depend on.  I doubt you'll find it in
the SVID (this of course assumes that you care what's in the SVID).

don
coleman@sdcsvax.ucsd.edu
Newsgroups: net.unix-wizards

tom@hcrvx1.UUCP (Tom Kelly) (11/13/86)

In article <363@yabbie.rmit.oz> rcodi@yabbie.rmit.oz (Ian Donaldson) writes:
>The CDC Cyber 170 series uses this concept to advantage with most languages;
>since it has 60-bits (a silly number, I agree), it sets all 'bss' storage to
>0600000000000004nnnnn, where nnnnnn is the address of the storage.  Since
>pointers on the Cyber cannot exceed 131071 (0377777), any reference
>to the data as a pointer will fail.  The 06 part is used so that the
>hardware can trap any arithmetic operations on such data as overflows.

>Ian Donaldson.

This brings back fond memories of working on CDC 6600s (the predecessor
of the Cyber 170 series).  If the word above is executed
as code, its interpretation is:

		SB0	A0+0	-- has no effect, B0 is hard zero
		PS		-- Program stop

The operating system (at least KRONOS) noticed that the program was
executing a PS and terminated the job with an error message.

I believe that the trap mentioned above only works for floating point;
the bit pattern is a legitimate integer (can't win them all).

The Burroughs B6700 series uses a tagged architecture.  Each memory
word (48 bits) is associated with a 3 bit tag that specifies something
about the associated data (single, double, descriptor, code, ...)

A tag of 6 was reserved for software.  If used as an operand to most
operators, it caused a trap.  A normal store would overwrite a
tag 6, and replace the tag with the correct tag for the
type of data being stored. I worked on a Pascal compiler that
initialized all stack locations with a tag 6.  The compiler would
also put a tag 6 on the word holding the controlled variable in a
for-loop on exit from the loop to enforce the rule that the value
is undefined.  There was some talk of making tag 6 an option for
when memory was initialized by the operating system (normally,
it was set to zero).
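
The tag-6 behaviour can be modelled in software roughly like this.
Only the value 6 comes from the description above; TAG_DATA = 0 and
all of the names are assumptions made for the sketch, and a return
code stands in for the hardware trap.

```c
#include <stdint.h>

enum {
    TAG_DATA   = 0,   /* hypothetical tag for ordinary data    */
    TAG_UNINIT = 6    /* software-reserved tag, as in the post */
};

struct tagged_word {
    unsigned tag;     /* 3-bit tag  */
    uint64_t bits;    /* 48-bit payload */
};

/* Mark a word as never stored to. */
void tw_poison(struct tagged_word *w)
{
    w->tag = TAG_UNINIT;
    w->bits = 0;
}

/* A normal store overwrites the tag along with the data. */
void tw_store(struct tagged_word *w, uint64_t v)
{
    w->tag = TAG_DATA;
    w->bits = v & ((UINT64_C(1) << 48) - 1);
}

/* Using a tag-6 word as an operand "traps": returns -1. */
int tw_load(const struct tagged_word *w, uint64_t *out)
{
    if (w->tag == TAG_UNINIT)
        return -1;
    *out = w->bits;
    return 0;
}
```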

I used a similar technique in the Fortran-77 compiler to distinguish
between when an integer variable contained a number and when it contained
a label (set with the ASSIGN statement).  This resulted in a cheap
check that you weren't trying to GOTO an integer value, or do arithmetic
on a label.

Tom Kelly  (416) 922-1937
Human Computing Resources Corp.
{utzoo, ihnp4, decvax}!hcr!hcrvx1!tom

guido@mcvax.uucp (Guido van Rossum) (11/14/86)

In article <7315@utzoo.UUCP> henry@utzoo.UUCP (Henry Spencer) writes:
>Yet another similar trick:  in the Algol 68 implementation for the Cyber,
>from CDC Netherlands I think it was, the garbage collector uses the extra
				    ^Yes
>bits in the pointers as mark bits, thereby using zero extra storage!

Well, I wouldn't call using 60 bits for an 18-bit quantity (really 17,
since user address spaces were limited to 2**17) "using zero extra
storage".  But I believe they did more than just using it for mark bits:
pointers on the heap would have a special bit pattern in the top 12 bits
which would make them an illegal and impossible floating point number,
so that they could do a linear scan of the heap and find all references!
(The FP unit would help in recognizing these words because it had an
instruction for testing whether any particular value was OK to use.)
This worked because there were no 'packed' structures, ints were
supposed to be limited to 48 bits (as indeed they are on the Cyber if you
want to do multiplies or divides), and the densest packing of characters
had only 4 12-bit characters in a word.
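
The linear scan might be sketched like this, with 60-bit words held in
uint64_t.  The marker value 03777 is invented for the example (the
real bit pattern is not given here), the plain comparison stands in
for the FP unit's test instruction, and the names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define PTR_MARK   UINT64_C(03777)     /* hypothetical 12-bit marker */
#define MARK_SHIFT 48                  /* top 12 bits of a 60-bit word */
#define ADDR_BITS  UINT64_C(0377777)   /* 17-bit address */

/* Build a heap pointer word: marker on top, address at the bottom. */
uint64_t make_heap_pointer(uint32_t addr)
{
    return (PTR_MARK << MARK_SHIFT) | ((uint64_t)addr & ADDR_BITS);
}

/* Ints and packed characters use at most 48 bits, so they never
 * carry the marker in the top 12. */
int is_heap_pointer(uint64_t w)
{
    return (w >> MARK_SHIFT) == PTR_MARK;
}

/*
 * Scan the heap word by word and collect the addresses held in
 * pointer words.  Returns how many were stored in out[].
 */
size_t find_pointers(const uint64_t *heap, size_t nwords,
                     uint32_t *out, size_t max_out)
{
    size_t found = 0;

    for (size_t i = 0; i < nwords && found < max_out; i++)
        if (is_heap_pointer(heap[i]))
            out[found++] = (uint32_t)(heap[i] & ADDR_BITS);
    return found;
}
```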

	Guido van Rossum, CWI, Amsterdam <guido@mcvax.uucp>

In a different world, we would all be doing our systems programming in
Algol-68 instead of C.

henry@utzoo.UUCP (Henry Spencer) (11/15/86)

> But this is *not* something to depend on.  I doubt you'll find it in
> the SVID (this of course assumes that you care what's in the SVID).

You won't find brk and friends in the SVID at all, in fact.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry

mangler@cit-vax.Caltech.Edu (System Mangler) (11/15/86)

In article <160@haddock.UUCP>, karl@haddock.UUCP (Karl Heuer) writes:
> P.S. I like the way VAXen do this with floating point (the result of some
> operation is NaN, which triggers an FPE on the *next* usage, giving the user
> a chance to test for it first).

I inherited an interpreter that uses this trick.  Checking for a NaN was
done by a call-by-reference assembly language routine; when I rewrote it
as a call-by-value C routine, I got floating point exceptions.  Why?  The
MOVD instruction that pushes the value on the stack will raise an FPE if
asked to push a NaN!  You have to be very careful how you look at those...
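
One way to look at such a value without handing it to the floating
point hardware is to go through a pointer and compare raw bits.  The
sketch below assumes a 64-bit IEEE double rather than VAX D-float, and
an arbitrarily chosen marker pattern; the names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Arbitrary quiet-NaN bit pattern used as the "undefined" marker. */
static const uint64_t UNDEF_BITS = UINT64_C(0x7ff8dead00000000);

_Static_assert(sizeof(double) == sizeof(uint64_t),
               "sketch assumes a 64-bit double");

/* Stamp a slot as undefined by copying the raw bits in. */
void mark_undefined(double *slot)
{
    memcpy(slot, &UNDEF_BITS, sizeof *slot);
}

/*
 * Call-by-reference test: the value is only ever copied as bytes and
 * compared as an integer, never passed or loaded as a double here.
 */
int is_undefined(const double *slot)
{
    uint64_t bits;

    memcpy(&bits, slot, sizeof bits);
    return bits == UNDEF_BITS;
}
```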

Don Speck   speck@vlsi.caltech.edu  {seismo,rutgers,ames}!cit-vax!speck

bolosky@wb1.cs.cmu.edu (William Bolosky) (11/19/86)

In the process of putting up the Mach VM implementation here at CMU we
introduced a bug which caused newly allocated memory to not be zeroed in
some esoteric circumstances.  This caused malloc to behave incorrectly
(malloc itself, not things using it), at very infrequent intervals.

It's hard to guess just what else relies on this, but I would wager that
there are more things that do...

BTW, on the IBM RT PC the 0 opcode is a jump instruction, and 0000 means
(guess what) "jump to yourself."  This is annoying.
-------

Bill Bolosky
Mach Kernel Group, Carnegie-Mellon University CSD
ARPA: bolosky@wb1.cs.cmu.edu
BITNET: wb0g@cmcctb.bitnet

edler@cmcl2.UUCP (Jan Edler) (11/20/86)

The NYU Ultracomputer prototype has been running in various stages of
hardware and software development since 1982, and very early on we
decided that the kernel (based mostly on v7) would not clear newly
allocated memory.  It has not been shown to cause any problems.  I
don't recall ever having trouble with this when porting old programs
(dereferencing null pointers and byte-ordering problems are much more
pervasive).

We thought we had a bug once that was being caused by this, so we put
an optional feature in the kernel to spread a given garbage-pattern on
newly allocated memory, and spent some more time tracking down the
problem, only to find that it was caused by something else (hardware).
We kept the optional garbage-spreading feature, although it hasn't been
used in a long time.

This does not alter the fact that an uninitialized-variable bug in a
program can be nondeterministic, but having the kernel set newly
allocated memory to some value doesn't completely eliminate the problem
(e.g. if the uninitialized variable is on the stack, it can still have
an arbitrary value).

Of course, not setting newly allocated memory to some value is clearly
a weakness from a security point of view.

Jan Edler
New York University
edler@nyu
cmcl2!edler

chris@mimsy.UUCP (Chris Torek) (12/08/86)

>In article <7208@elsie.UUCP>, ado@elsie.UUCP (Arthur David Olson) writes:
>>Can system performance be improved by avoiding zero filling ... ?
 
In article <544@mcgill-vision.UUCP> mouse@mcgill-vision.UUCP (der Mouse)
writes:
>Probably.  But probably not measurably.  The VAX has a pte type known
>as "zero-fill on demand" which means that the page is created full of
>zeros when it is first referenced.

Not quite.  There are various bits per page stored in a PTE,
including five bits that are unused in invalid pages.  In 4BSD VAX
kernels, two special values may be stored in those bits to mean
`fill on demand from text inode' and `zero fill on demand'.  The
zero fill is done by code in the page fault handler, not by the
hardware.  But it is still quite fast.
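
A toy model of that arrangement, to show where the zeroing happens.
This is not the 4BSD code: the structure layout, the encodings of the
five-bit field, and the function name are all made up for the sketch.
Only the 512-byte VAX page size is real.

```c
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 512   /* VAX page size in bytes */

/* Hypothetical encodings for the five-bit field. */
enum fill_type { FILL_NONE = 0, FILL_ZERO = 1, FILL_TEXT = 2 };

struct pte {
    unsigned valid : 1;     /* page is resident and usable */
    unsigned fill  : 5;     /* meaningful only while invalid */
    unsigned char *frame;   /* backing memory once valid */
};

/*
 * Software page-fault handler, run on first reference to an invalid
 * page.  The frame is zeroed here, by kernel code, at fault time;
 * nothing is cleared when the page is merely set up.
 * Returns 0 on success, -1 for fill types this sketch does not model.
 */
int page_fault(struct pte *pte, unsigned char *free_frame)
{
    if (pte->valid)
        return 0;
    if (pte->fill != FILL_ZERO)
        return -1;
    memset(free_frame, 0, PAGE_SIZE);
    pte->frame = free_frame;
    pte->fill = FILL_NONE;
    pte->valid = 1;
    return 0;
}
```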

The other 30 possible values for that five-bit field were used to
implement `vread' in 4.1 and 4.2.  The vread system call was dropped
from 4.3BSD, removing a major stumbling block that limited the
maximum number of open files per process to 30.  vread, like vfork,
was an implementation hack.  Unlike vfork, vread proved not terribly
useful.  (vfork is still faster than fork, even in systems implementing
copy-on-write forks.  Sigh.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
UUCP:	seismo!mimsy!chris	ARPA/CSNet:	chris@mimsy.umd.edu