[comp.lang.c] Insecure hardware

lvc@cbnews.ATT.COM (Lawrence V. Cipriani) (11/26/88)

In article <288@ispi.UUCP> jbayer@ispi.UUCP (Jonathan Bayer) writes:
>gets is different in that the input is undefined.  If gets is used in a
>program in which data is piped to, and it is part of a secure system, and
>unsecured data can be piped to it, then it is possible to break it.

(Not picking on you Mr. Bayer!)

All the discussion I have seen so far about the recent virus has focused on
software.  To what extent can hardware be at fault?  Was one of the
reasons the two processor types were attacked because they would allow
code to be executed in data space?  Is this what happened?  Some other
machines will produce a core dump if you pull this.  What else can be done
in hardware to enhance the security of the UNIX(r) operating system?

	Larry Cipriani
--
UNIX was a trademark of Western Electric, Western Electric is a trademark
of AT&T, UNIX is a registered trademark of AT&T

dhesi@bsu-cs.UUCP (Rahul Dhesi) (11/27/88)

In article <2330@cbnews.ATT.COM> lvc@cbnews.ATT.COM (Lawrence V. Cipriani)
writes:
>Was the one of the
>reasons the two processor types were attacked because they would allow
>code to be executed in data space?

The fingerd bug was that sending a long line to it via gets() allowed
you to push anything you wanted on the stack.  Since the stack contains
both data and return addresses, keeping code space and data space
separate would probably not have helped.  (I don't know if hardware
separation of code and data inhibited the worm but it would still leave
the same loophole there for exploitation in some other way.)

However, it is rumored that there are machines that maintain a tag
value with each word.  On such a machine, a return address on the stack
would be tagged as a 'code address'.  Overwriting it with the data read
by gets() would remove the 'code address' tag and replace it with a
'character data' tag, preventing a return to that address.  Such a
machine would not permit you to change a 'data' tag to a 'code address'
tag without either doing a system call or executing a "jump to
subroutine" instruction that would automatically push an address with a
'code address' tag on the stack.
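
To make the idea concrete, here is a rough software model of such a
check.  The tag values, the word layout, and the routine names are all
invented for illustration; they do not correspond to any real tagged
machine.

/* A rough software model of a tagged-word check on return.
 * Tag values and word layout are illustrative only. */
#include <stdio.h>
#include <stdlib.h>

enum tag { TAG_CHAR_DATA, TAG_CODE_ADDR };

struct tagged_word {
	enum tag tag;
	unsigned long value;
};

/* Only a "jump to subroutine" (or the OS) may plant a code-address tag. */
static void push_return_address(struct tagged_word *w, unsigned long addr)
{
	w->tag = TAG_CODE_ADDR;
	w->value = addr;
}

/* Anything gets() stores is tagged as character data. */
static void store_input(struct tagged_word *w, unsigned long data)
{
	w->tag = TAG_CHAR_DATA;
	w->value = data;
}

/* The return instruction traps unless the word still carries a code tag. */
static unsigned long pop_return_address(const struct tagged_word *w)
{
	if (w->tag != TAG_CODE_ADDR) {
		fprintf(stderr, "tag trap: return address overwritten by data\n");
		exit(1);
	}
	return w->value;
}

int main(void)
{
	struct tagged_word ra;

	push_return_address(&ra, 0x1234);	/* as a subroutine call would */
	store_input(&ra, 0x41414141);		/* as an overflowing gets() would */
	(void) pop_return_address(&ra);		/* traps instead of "returning" */
	return 0;
}

The point is simply that the tag travels with the word, so data stored
by gets() can never masquerade as a return address.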

Unfortunately, this might seriously hinder the assembly-language
programmer who wishes to play around with return addresses for code
optimization.
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee}!bsu-cs!dhesi

chris@mimsy.UUCP (Chris Torek) (11/28/88)

>In article <2330@cbnews.ATT.COM> lvc@cbnews.ATT.COM (Lawrence V. Cipriani)
>asks:
>>Was the one of the reasons the two processor types were attacked
>>because they would allow code to be executed in data space?

(It is worth noting that the fingerd attack was applied only to VAXen.)

In article <4869@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>The fingerd bug was that sending a long line to it via gets() allowed
>you to push anything you wanted on the stack.  Since the stack contains
>both data and return addresses, keeping code space and data space
>separate would probably not have helped.  (I don't know if hardware
>separation of code and data inhibited the worm but it would still leave
>the same loophole there for exploitation in some other way.)

It would indeed have helped, although it might not have made a similar
attack impossible.

The attack consisted of sending a 536-byte `line' (with embedded NULs,
so it was not a C string) which contained 400 control-As, some VAX
machine code, numerous NULs, and finally a hand-crafted stack frame
that caused fingerd's return from main()---fingerd did not use
exit()---to return somewhere into the 512-byte buffer (to 0x7fffe9fc,
actually).  I have not gone so far as to analyse the stack of a normal
(inetd-spawned) 4.3BSD or 4.3BSD-tahoe fingerd, but the obvious
probability is that the ^As were used to allow for some slop in
addressing:  ^A, or ASCII 01, is a VAX NOP instruction.
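
The vulnerable read itself was nothing exotic: the familiar fixed-size
automatic buffer handed to gets().  A minimal sketch of the pattern
(not the actual 4.3BSD fingerd source, which also builds an argument
vector and runs finger):

#include <stdio.h>

int main(void)
{
	char line[512];		/* fixed-size buffer in main()'s stack frame */

	/* gets() has no idea the buffer holds only 512 bytes; a longer
	 * "line" keeps writing, over the saved frame above the buffer. */
	if (gets(line) == NULL)
		return 1;
	/* ... hand the request to finger ... */
	return 0;
}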

The stack of a 4BSD VAX Unix process looks like this:

	8000 0000	System space
	7fff ec00	u. area + red zone + kernel stack (UPAGES=0t10)
	7fff xxxx	environment variables
	7fff yyyy	argv strings
	7fff zzzz	stack frame for return from main()
	7fff wwww	local variables for main()

xxxx depends on the number and length of environment variables, and
would typically be ebxx (<256 bytes of environment, e.g., just HOME and
PATH).  yyyy depends on xxxx and on the number and length of the argv
strings; zzzz depends on yyyy and on the number of registers used in
main().  Apparently, for fingerd, wwww is around (7fff)cabc.

The 400 ^As would allow for up to 400 bytes of error in the worm's
assumption about the value of wwww above.  If you had made changes to
/etc/rc.local and/or inetd and/or fingerd, but had not pushed the
address far enough away from the `usual' location, the worm could still
write a frame that looked reasonable; execution would land among the
NOPs, skip over them, and run the code sent after them.  That code
essentially did
an execve("/bin/sh"), but cleverly avoided anything but stack-relative
addressing.

Now, if the VAX hardware had refused to execute data pages---perhaps
by refusing to execute any pages with user-write permission enabled---
the worm could not have run code off the stack.  It could still replace
the old stack frame, and change the argv and environment strings.  I
will not speculate on possible ways to break in using this ability.
I will, however, note that any number of local changes might have
moved the address `wwww' far enough to foil the attack.  One could
argue that, perhaps, each process should have a different view of its
own address space.  It would certainly be easy enough to have the
C startup code move the stack down by a pseudo-random amount....
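
As a rough sketch of the flavor, with a variable-length pad standing in
for real crt0 assembly and with a crude entropy source (all of it
illustrative, including the 0..4095-byte range):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Stand-in for the program proper; prints where a stack local landed. */
static int real_main(void)
{
	char local;

	printf("a stack local lives near %p this run\n", (void *) &local);
	return 0;
}

/*
 * Move the stack down by a pseudo-random amount before the program
 * proper runs, so that addresses like `wwww' differ from process to
 * process.  Real startup code would do this in crt0, before main().
 */
int main(void)
{
	unsigned slop;

	srand((unsigned) time(NULL) ^ (unsigned) getpid());
	slop = (unsigned) (rand() % 4096) + 1;

	{
		volatile char pad[slop];	/* pushes the stack pointer down */

		pad[0] = 0;		/* keep the pad from being optimized away */
		return real_main();
	}
}
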
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

brian@ucsd.EDU (Brian Kantor) (11/28/88)

In article <4869@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>However, it is rumored that there are machines that maintain a tag
>value with each word.  On such a machine, a return address on the stack
>would be tagged as a 'code address'.  Overwriting it with the data read
>by gets() would remove the 'code address' tag and replace it with a
>'character data' tag, preventing a return to that address.  

Such machines do exist; we have one of them here - an ancient beast
known as a Burroughs 7800.  Since the architecture of such a machine
dates from the middle SIXTIES, one wonders why more such equipment
isn't being made, especially in these days of cheap (well,
not-as-cheap-as-two-years-ago) memory.

The Burro uses a three-bit tag on each of its 48-bit data words to
signal whether a word is code, data, array descriptor, pointer,
stack control word, etc.  It also does array bounds checking in
hardware as part of the index calculation, and allocates arrays in
virtual space (segmented, if necessary) with a descriptor on the
stack instead of the whole array on the stack, and has special
stack control words so the firmware can cut the stack back on
procedure exit, etc, etc, etc.  Oh, and the program counter stack
is separate from the data stack, so there is NOTHING you can do that
will bugger the return addresses.
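
For the C-minded, here is a caricature of what the descriptor buys you.
The field names and the explicit check are invented; on the real
machine this happens in the hardware's index calculation, with no such
code in the program at all.

#include <stdio.h>
#include <stdlib.h>

/* Toy model of an array descriptor: the "array" on the stack is only a
 * descriptor; every index is checked against the bound before an
 * address is formed. */
struct descriptor {
	long  length;		/* number of elements */
	long *base;		/* where the data actually lives */
};

static long fetch(struct descriptor d, long index)
{
	if (index < 0 || index >= d.length) {	/* done in hardware on the Burro */
		fprintf(stderr, "bounds trap: index %ld not in [0,%ld)\n",
		    index, d.length);
		exit(1);
	}
	return d.base[index];
}

int main(void)
{
	long data[4] = { 1, 2, 3, 4 };
	struct descriptor d = { 4, data };

	printf("%ld\n", fetch(d, 2));	/* fine */
	printf("%ld\n", fetch(d, 9));	/* traps instead of reading past the end */
	return 0;
}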

Ah well, make something ahead of its time, and nobody remembers....

	Brian Kantor	UCSD Office of Academic Computing
			Academic Network Operations Group  
			UCSD B-028, La Jolla, CA 92093 USA
			brian@ucsd.edu ucsd!brian BRIAN@UCSD

gwyn@smoke.BRL.MIL (Doug Gwyn ) (11/28/88)

In article <4869@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>In article <2330@cbnews.ATT.COM> lvc@cbnews.ATT.COM (Lawrence V. Cipriani) writes:
>>Was the one of the reasons the two processor types were attacked
>>because they would allow code to be executed in data space?

I think one of the reasons was that those machines just happened to be
AVAILABLE for compilation of the source to produce the objects that
were shipped around.  However, the particular form that the fingerd
attack took would not work on a system that supported only split-I&D
spaces.

>The fingerd bug was that sending a long line to it via gets() allowed
>you to push anything you wanted on the stack.  Since the stack contains
>both data and return addresses, keeping code space and data space
>separate would probably not have helped.

No, this is wrong.  On a split-I&D machine, return addresses are in
D space but instructions are in I space.  Therefore, the code being
returned to would have had to have been placed in I space, which is
not normally something an unprivileged process can accomplish (other
than by one of a limited class of system calls, e.g. exec on UNIX).

Tags are not required for split-I&D implementations; just consider
the PDP-11 for an example.  Tagged architectures and other uncommon
architectures could also pose barriers, but split-I&D would have
been sufficient to block that particular virus/worm attack path.
However, there were (and probably are) still plenty of exploitable
security holes for such viruses to enter.

Note that most common systems that support split I&D spaces do
not REQUIRE their use; the space separation would have to be
ENFORCED to serve as a practical security barrier.  There are
also a lot of subtle details that would have to be gotten right;
I have never in my decades of working with computers seen an
absolutely secure architectural implementation (not even Multics
was 100% secure).

rang@cpsin3.cps.msu.edu (Anton Rang) (11/28/88)

One quick note here...

Chris Torek (chris@mimsy.UUCP), in article 14733@mimsy.UUCP, writes:
>>In article <2330@cbnews.ATT.COM> lvc@cbnews.ATT.COM (Lawrence V. Cipriani)
>>asks:
>>>Was the one of the reasons the two processor types were attacked
>>>because they would allow code to be executed in data space?

>(It is worth noting that the fingerd attack was applied only to VAXen.)

> [ stuff deleted ]

>Now, if the VAX hardware had refused to execute data pages---perhaps
>by refusing to execute any pages with user-write permission enabled---
>the worm could not have run code off the stack.

  VAX processors do have separate bits for read, write, and execute on
each page (I seem to vaguely recall one more).  The problem lies with
the implementation of BSD and Ultrix, which leave the stack
executable; I can't see any reason for this offhand.

+---------------------------+------------------------+----------------------+
| Anton Rang (grad student) | "UNIX: Just Say No!"   | "Do worry...be SAD!" |
| Michigan State University | rang@cpswh.cps.msu.edu |                      |
+---------------------------+------------------------+----------------------+

chase@Ozona.orc.olivetti.com (David Chase) (11/29/88)

In article <1189@cps3xx.UUCP> rang@cpswh.cps.msu.edu (Anton Rang) writes:
>  The problem lies with
>the implementation of BSD and Ultrix, which leave the stack
>executable; I can't see any reason for this offhand.

Well, I'm told that there's signal handling code that writes on the
stack and then executes it.  It's also a handy trick for implementing
partial application, which can in turn be used to implement lexical
binding of variables in nested functions for languages that aren't C
but wish to remain compatible with it.  This is VERY useful if your
intermediate/portability language IS C (no, I can't use a global
variable to simulate a lexical or display pointer, because we'll be
using threads.  Yes, I'm aware that C structure-returning conventions
are often non-reentrant, and we've dealt with that.)  Some Lisp
systems may generate code onto the stack in certain situations.

Some new machines have separate instruction and data busses and/or
caches, and this is sometimes given as a good reason for keeping
instruction and data spaces separate (protect us from undefined
results, I assume).  I'd prefer otherwise, since it isn't too hard to
write code to do the right thing (somewhat efficiently, even) if there
is a system call that allows page-by-page changing of protection or
copying of a page of data into I-space.
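
On systems that provide it, mprotect() is exactly that sort of call.  A
sketch, assuming your system has mmap()/mprotect() (far from universal
in 1988) and leaving out the CPU-specific instruction bytes:

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	size_t pagesize = (size_t) sysconf(_SC_PAGESIZE);
	void *page;

	/* Get a writable, page-aligned region to generate code into. */
	page = mmap(NULL, pagesize, PROT_READ | PROT_WRITE,
	    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (page == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* ... emit machine instructions into `page' here (omitted) ... */
	memset(page, 0, pagesize);

	/* Flip the page from writable data to read-and-execute. */
	if (mprotect(page, pagesize, PROT_READ | PROT_EXEC) != 0) {
		perror("mprotect");
		return 1;
	}

	/* The page could now be entered through a function pointer, e.g.
	 *	void (*fn)(void) = (void (*)(void)) page;
	 * which is not done here because the page holds no real code. */
	puts("page is now read+execute");
	return 0;
}

Doing it a page at a time keeps the system-call cost down to one
mprotect() per batch of generated code rather than one per fragment.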

David

henry@utzoo.uucp (Henry Spencer) (11/29/88)

In article <2330@cbnews.ATT.COM> lvc@cbnews.ATT.COM (Lawrence V. Cipriani) writes:
>... To what extent can hardware be at fault?  Was the one of the
>reasons the two processor types were attacked because they would allow
>code to be executed in data space?  Is this what happened?  Some other
>machines will produce a core dump if you pull this...

One should remember that dynamic code generation (necessarily into the
data space) followed by execution of the resulting code can be a very
valuable technique for things like interpreters.  One can finesse that
with a "change data to code" system call, but the system-call overhead
can hurt badly.
-- 
SunOSish, adj:  requiring      |     Henry Spencer at U of Toronto Zoology
32-bit bug numbers.            | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

gregg@ihlpb.ATT.COM (Wonderly) (11/29/88)

From article <8995@smoke.BRL.MIL>, by gwyn@smoke.BRL.MIL (Doug Gwyn ):
> In article <4869@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>>In article <2330@cbnews.ATT.COM> lvc@cbnews.ATT.COM (Lawrence V. Cipriani) writes:
>>>Was the one of the reasons the two processor types were attacked
>>>because they would allow code to be executed in data space?
> 
> ...
>
> However, there were (and probably are) still plenty of exploitable
> security holes for such viruses to enter.

Just to demonstrate how vulnerable we can make ourselves:

Consider my naive implementation of fingerd which, on most architectures
with a descending-address, local-variable stack allocation strategy,
can be compromised after a little investigation.

#include <stdio.h>

main ()
{
	char *vec[10];
	int vecp = 1;
	char buf[512];

	vec[0] = "/usr/ucb/finger";

	/* An overlong line overruns buf and can rewrite vecp and the
	 * vec[] pointers that sit next to it on the stack. */
	if (gets(buf) != NULL)
		vec[vecp++] = buf;

	vec[vecp++] = NULL;

	execvp (vec[0], vec);
	perror (vec[0]);
}

One only needs to write the appropriate strings, a value for vecp,
and values for vec[0-?] to the program's standard input to force
execution of a program of your choice.  Naturally, there is less
freedom (actually none) in the values one might write into the
array of pointers, but all it takes is the source and a wee bit of
knowledge about the STANDARD environment that sooooo many turnkey
UN*X boxes have these days, and you are in.

This is just one method of many...


-- 
It isn't the DREAM that NASA's missing...  DOMAIN: gregg@ihlpb.att.com
It's a direction!                          UUCP:   att!ihlpb!gregg

cmf@cisunx.UUCP (Carl M. Fongheiser) (11/29/88)

In article <1189@cps3xx.UUCP> rang@cpswh.cps.msu.edu (Anton Rang) writes:
>  VAX processors do have separate bits for read, write, and execute on
>each page (I seem to vaguely recall one more).  The problem lies with
>the implementation of BSD and Ultrix, which leave the stack
>executable; I can't see any reason for this offhand.

Oh really?  Are you sure you're talking about a VAX? :-)

The only permissions that can be specified in a VAX PTE are read & write.
And they aren't really encoded in separate bits; instead, you have
values which specify the outermost mode which can write (and read) the
given page.  Note also, there's no such thing as a write-only page.
If you can write the page, you can also read it.

				Carl Fongheiser
				University of Pittsburgh
				...!pitt!cisunx!cmf
				cmf@unix.cis.pittsburgh.edu

gwyn@smoke.BRL.MIL (Doug Gwyn ) (11/29/88)

In article <9119@ihlpb.ATT.COM> gregg@ihlpb.ATT.COM (Wonderly) writes:
>Just to demonstrate how vunerable we can make ourselves, ...

Your example is the same as what started the whole gets() discussion!

terryl@tekcrl.CRL.TEK.COM (11/30/88)

In article <1189@cps3xx.UUCP> rang@cpswh.cps.msu.edu (Anton Rang) writes:
>One quick note here...
>
>Chris Torek (chris@mimsy.UUCP), in article 14733@mimsy.UUCP, writes:
>>Now, if the VAX hardware had refused to execute data pages---perhaps
>>by refusing to execute any pages with user-write permission enabled---
>>the worm could not have run code off the stack.
>
>  VAX processors do have separate bits for read, write, and execute on
>each page (I seem to vaguely recall one more).  The problem lies with
>the implementation of BSD and Ultrix, which leave the stack
>executable; I can't see any reason for this offhand.


     BBBBUUUUUZZZZ!!!!! Wrong answer...

     The VAX only has read/write permissions per page, but it does have
4 different access modes per page (kernel, executive, supervisor, & user),
with each access mode having its own independent permissions per page...


Boy
Do
I
Hate
Inews
!!!!
!!!!

PS:
	Don't tell me about the various different ways around the infamous
"rn" included line count problem; I know all of them. I just like to complain
about fascist software!!!! (-:

chris@mimsy.UUCP (Chris Torek) (11/30/88)

Someone else mentioned the correct answer, but I suppose I had best do
it again.  I have redirected followups to comp.unix.wizards only.

In article <1189@cps3xx.UUCP> rang@cpswh.cps.msu.edu (Anton Rang)
`corrects' me:
>>VAX processors do have separate bits for read, write, and execute on
>>each page (I seem to vaguely recall one more). ...

In article <3335@tekcrl.CRL.TEK.COM> terryl@tekcrl.CRL.TEK.COM writes:
>     BBBBUUUUUZZZZ!!!!! Wrong answer...

So far so good....

>     The VAX only has read/write permissions per page, but it does have
>4 different access modes per page (kernel, executive, supervisor, & user),
>with each access mode having its own independent permissions per page...

Not so.  There is a four bit field for `access control'.  With four CPU
modes (K E S & U as above) and two permissions (R & W), there are only
half as many bits as needed for fully independent permissions.
Instead, the VAX designers made the assumption that if the user can
write the page, all the more privileged modes should also be able to
write; if the user can only read, more bits might allow other modes to
write.  Whatever permissions a less-privileged mode has, a more-
privileged mode has at least those permissions.

4BSD VAX Unix makes use of only the following modes:

#define	PG_NOACC	0
#define	PG_KW		0x10000000
#define	PG_KR		0x18000000
#define	PG_UW		0x20000000
#define	PG_URKW		0x70000000
#define	PG_URKR		0x78000000

Execute permission is implied by read permission.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

levy@ttrdc.UUCP (Daniel R. Levy) (12/02/88)

In article <14733@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> I will, however, note that any number of local changes might have
> moved the address `wwww' far enough to foil the attack.  One could
> argue that, perhaps, each process should have a different view of its
> own address space.  It would certainly be easy enough to have the
> c startup code move the stack down by a pseudo-random amount....

Couldn't this cause problems in using a debugger?  With the stack location
differing from invocation to invocation, pointer values which refer to stack
locations would also differ between otherwise identical runs of a program.

-- 
|------------Dan Levy------------|  THE OPINIONS EXPRESSED HEREIN ARE MINE ONLY
| Bell Labs Area 61 (R.I.P., TTY)|  AND ARE NOT TO BE IMPUTED TO AT&T.
|        Skokie, Illinois        | 
|-----Path:  att!ttbcad!levy-----|

stuart@bms-at.UUCP (Stuart Gathman) (12/03/88)

Although the code space can't be modified, a virus can selectively
execute portions of the code space in any desired sequence by modifying
the stack in a split I&D process.  It doesn't take much imagination
to see what might be done with the library code included with a
typical program.  exec() is only one possibility.
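
A toy model of the point, with an explicit array of function pointers
standing in for the saved return addresses.  Everything here is
invented for illustration; a real attack would work on the machine
stack and the calling sequence, not on an array.

#include <stdio.h>

static void greet_user(void)  { puts("finger: looking up user"); }
static void spawn_shell(void) { puts("library code the program never meant to reach"); }

int main(void)
{
	void (*return_stack[2])(void);
	int i;

	/* What the program intended: "return" into greet_user() only. */
	return_stack[0] = greet_user;
	return_stack[1] = NULL;

	/* What rewriting the saved addresses can do: string together
	 * calls to code that is already present in the code space. */
	return_stack[0] = spawn_shell;

	for (i = 0; return_stack[i] != NULL; i++)
		return_stack[i]();

	return 0;
}
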
-- 
Stuart D. Gathman	<stuart@bms-at.uucp>
			<..!{vrdxhq|daitc}!bms-at!stuart>

allbery@ncoast.UUCP (Brandon S. Allbery) (12/04/88)

As quoted from <14733@mimsy.UUCP> by chris@mimsy.UUCP (Chris Torek):
+---------------
| (It is worth noting that the fingerd attack was applied only to VAXen.)
| 
> (deleted; "wwww" is the "worm address" where the worm had to run)
|
| I will, however, note that any number of local changes might have
| moved the address `wwww' far enough to foil the attack.  One could
+---------------

From what I've read, the fingerd attack was applied to Suns as well -- but
the "wwww" address *was* sufficiently wrong, so an infected fingerd simply
dumped core.

++Brandon
-- 
Brandon S. Allbery, comp.sources.misc moderator and one admin of ncoast PA UN*X
uunet!hal.cwru.edu!ncoast!allbery  <PREFERRED!>	    ncoast!allbery@hal.cwru.edu
allberyb@skybridge.sdi.cwru.edu	      <ALSO>		   allbery@uunet.uu.net
comp.sources.misc is moving off ncoast -- please do NOT send submissions direct
      Send comp.sources.misc submissions to comp-sources-misc@<backbone>.

jik@athena.mit.edu (Jonathan I. Kamens) (12/05/88)

In article <13203@ncoast.UUCP> allbery@ncoast.UUCP (Brandon S. Allbery) writes:

>From what I've read, the fingerd attack was applied to Suns as well -- but
>the "wwww" address *was* sufficiently wrong, so an infected fingerd simply
>dumped core.

This is not correct.  I just checked with one of the members of
the team who disassembled the code here at MIT.  He says that the
problem with the Sun version of the worm was that it was trying to use
the same hex instructions as the VAX code.  This obviously wouldn't
work, since the Sun instruction set is just slightly different from
the VAX's :-).

If the author(s) of the code had bothered to figure out the stack
frame dimensions on the Sun, I'm sure he/she/they would have also
figured out the necessary Sun instructions to make it work, and vice
versa.

  Jonathan Kamens
  MIT Project Athena

barmar@think.COM (Barry Margolin) (12/06/88)

In article <8308@bloom-beacon.MIT.EDU> jik@athena.mit.edu (Jonathan I. Kamens) writes:
[Regarding the fingerd worm:]
>If the author(s) of the code had bothered to figure out the stack
>frame dimensions on the Sun, I'm sure he/she/they would have also
>figured out the necessary Sun instructions to make it work, and vice
>versa.

I don't think so.  I don't think the worm knows the hardware of the
system it is trying to propagate to.  If it's propagating using
machine language instructions, it needs to know the hardware.  The
sendmail worm could go to either system because it transmitted a shell
script that runs on both Suns and Vaxes, which was able to look around
and determine which kind it was running on (in order to transfer and
link the correct stuff).

Barry Margolin
Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar