[comp.arch] Randomised Instruction Set Computer

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (06/02/90)

This is an obviously crazy idea, but is it crazy enough to work?
The problem with viruses that attach to ordinary programs is that
    (a) the virus has the power to write into code files
    (b) the virus can determine in advance _what_ to write

The Burroughs B6700 MCP distinguished between object code files and
ordinary files.  A program could not create an object code file unless
it had a special "I'm a compiler" privilege, and a program could only
acquire that privilege when the operator at a console typed a special
command.  The trouble with that was that once you managed to jemmy that
lock the whole system was wide open.  Also, although I never tried this,
you _could_ restore an object file from a backup tape, and there would
have been no great difficulty in forging backup tapes, so it seems that
it might have been possible to create object programs by writing "raw"
tapes and then restoring from them.  (You couldn't forge a _compiler_
this way, as compiler privilege was _not_ restored.)

Obviously, it would be possible to strengthen this system.

But even then, things like the sendmail bug could still be exploited.

The crazy idea, then, is to ensure that a virus can't know in advance
what bit pattern to generate in order to realise its effect.  What if we
tried to make a computer hard to infect by giving it a randomised
instruction set?

Suppose you had a RISC machine with, say, 64 genuine instructions encoded
in a 7-bit field, the remaining 64 codes being mapped to "illegal opcode",
and suppose that the machine had an on-chip memory of 128x7 bits, so that
instruction decoding went through this memory.  Suppose further that there
were a system call
"trust such&such a code file" which generated two random permutations:
	- one affecting the opcodes in instructions
	- one affecting the order of bytes within a code page
Then, when a program was to be run, the decoding matrix would be set from
the opcode permutation, and whenever a code page was fetched from the
code file it would be unscrambled using the byte-order permutation (but
the opcodes would _not_ be unscrambled).
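
In rough C, what I have in mind is something like this (a sketch only;
the page size and all the names are made up):

    #define PAGE 512                 /* assumed code page size */

    unsigned char  decode_rom[128];  /* the on-chip 128x7 decode memory */
    unsigned short byte_perm[PAGE];  /* this program's byte-order map   */

    /* Set at exec time from the per-file opcode permutation; the 64
       codes outside the genuine set all map to "illegal opcode". */
    void load_decode_rom(const unsigned char opcode_perm[128])
    {
        int i;
        for (i = 0; i < 128; i++)
            decode_rom[i] = opcode_perm[i];
    }

    /* Applied whenever a code page is fetched from the code file;
       only the byte order is unscrambled, not the opcodes. */
    void fetch_code_page(const unsigned char *file_page,
                         unsigned char *core_page)
    {
        int i;
        for (i = 0; i < PAGE; i++)
            core_page[i] = file_page[byte_perm[i]];
    }

    /* Every instruction fetch then goes through the decode memory. */
    unsigned char decode(unsigned char fetched_op)
    {
        return decode_rom[fetched_op & 0x7f];
    }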

The "trust this code file" system call would presumably insert checksums
and such into the file as well.
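
The call itself might amount to something like this (again a sketch;
the file layout and the use of rand() are inventions, and a real
system would want a much stronger source of randomness):

    #include <stdlib.h>

    #define PAGE 512

    struct codefile {
        unsigned short opcode_perm[128]; /* fetched code -> real opcode   */
        unsigned short byte_perm[PAGE];  /* core byte i <- file byte p[i] */
        unsigned long  checksum;
        /* ...scrambled code pages follow... */
    };

    /* Fisher-Yates shuffle: a uniformly random permutation of 0..n-1. */
    static void random_perm(unsigned short p[], int n)
    {
        int i, j;
        unsigned short t;
        for (i = 0; i < n; i++)
            p[i] = i;
        for (i = n - 1; i > 0; i--) {
            j = rand() % (i + 1);
            t = p[i]; p[i] = p[j]; p[j] = t;
        }
    }

    void trust_code_file(struct codefile *cf)
    {
        random_perm(cf->opcode_perm, 128);
        random_perm(cf->byte_perm, PAGE);
        /* ...rewrite each opcode and each code page through the two
           permutations, then record a checksum over the result so
           that later tampering is detectable... */
    }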

Obviously, this device would slow a chip down.  Existing on-chip caches
are bigger, but they're not in quite the same path.  Even slowing down
by a factor of 2 might be worthwhile for battlefield systems.

The problem then comes back at the level of interpretive languages, alas.

Anyway, is this idea crazy enough to work?  _Can_ a virus spread when each
program appears to use a different instruction set?
-- 
"A 7th class of programs, correct in every way, is believed to exist by a
few computer scientists.  However, no example could be found to include here."

amos@taux01.UUCP (Amos Shapir) (06/04/90)

The idea is not so crazy; actually, some copy-protection methods work
that way.  However, using this method to stop viruses seems to carry
too much overhead.

-- 
	Amos Shapir		amos@taux01.nsc.com, amos@nsc.nsc.com
National Semiconductor (Israel) P.O.B. 3007, Herzlia 46104, Israel
Tel. +972 52 522408  TWX: 33691, fax: +972-52-558322 GEO: 34 48 E / 32 10 N

hankd@ee.ecn.purdue.edu (Hank Dietz) (06/04/90)

In article <3131@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
>The crazy idea, then, is to ensure that a virus can't know in advance
>what bit pattern to generate in order to realise its effect.  What if we
>tried to make a computer hard to infect by giving it a randomised
>instruction set?
[stuff about using table-lookup to re-map instruction set for each program]

Can certainly be done, but table lookup isn't exactly the hardest encryption
scheme to break.  Compilers tend to generate recognizable instruction
patterns (unless they have really good optimizers), so the virus would only
need to be a bit more clever to defeat all that expensive & slow extra
hardware....  Of course, this is just my opinion.  Modal instructions might
make things more difficult, however, without the need for a full table....
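
For instance, the opening move might be nothing more than a histogram:
the byte-order scramble moves bytes around but doesn't change their
counts, and compilers lean hard on a few opcodes.  A sketch
(illustrative only; it treats every byte as a candidate opcode):

    #include <stdio.h>

    /* Count byte values in a scrambled code image.  Compilers emit a
       handful of opcodes (loads, stores, branches) far more often
       than the rest, so the top of this histogram already narrows
       the opcode permutation a great deal. */
    void rank_codes(const unsigned char *image, long len)
    {
        long count[128] = {0};
        long i;
        int c;

        for (i = 0; i < len; i++)
            count[image[i] & 0x7f]++;

        for (c = 0; c < 128; c++)
            if (count[c])
                printf("code %3d occurs %ld times\n", c, count[c]);
    }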

						-hankd@ecn.purdue.edu

jc@atcmp.nl (Jan Christiaan van Winkel) (06/05/90)

From article <3131@goanna.cs.rmit.oz.au>, by ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe):
> Suppose you had a RISC machine with, say, 64 genuine instructions encoded
> in a 7-bit field, the remaining 64 codes being mapped to "illegal opcode",
> and suppose that the machine had an on-chip memory of 128x7 bits, so that
> instruction decoding went through this memory.  Suppose further that there
> were a system call

I believe that there were special versions of the Z80 processor that used a
different instruction set, i.e. all opcodes were encrypted using a simple
substitution cipher.  This meant that normal ROMs for the boot process
could not be used, and that every copy of the CPU could have a different
opcode set.
JC
-- 
Jan Christiaan van Winkel              Tel: +31 80 566880     jc@atcmp.nl
AT Computing       P.O. Box 1428       6501 BK Nijmegen       The Netherlands

barry@PRC.Unisys.COM (Barry Traylor) (06/07/90)

In article <3131@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
> ...
>The Burroughs B6700 MCP distinguished between object code files and
>ordinary files.  A program could not create an object code file unless
>it had a special "I'm a compiler" privilege, and a program could only
>acquire that privilege when the operator at a console typed a special
>command.  

I can't thank you enough for mentioning this.  The B6700 lives on as the
Unisys "A Series".

>The trouble with that was that once you managed to jemmy (sic) that
>lock the whole system was wide open.  Also, although I never tried this,
>you _could_ restore an object file from a backup tape, and there would
>have been no great difficulty in forging backup tapes, so it seems that
>it might have been possible to create object programs by writing "raw"
>tapes and then restoring from them.  (You couldn't forge a _compiler_
>this way, as compiler privilege was _not_ restored.)

Both of these require breaking PHYSICAL security.  Significant work has
been done on the security of these systems.  Users may set up their
system in such a way that very special privilege is required to import
"unknown" code files or to change a valid code file into a compiler.

BTW, part of our security has always seemed to rely on the relative
obscurity of our systems, although they have been strengthened to not rely
on this anymore ;->.

>Obviously, it would be possible to strengthen this system.

This has been done.  One thing you don't mention is that there is no way to
modify the in-core image of an executing code file, i.e. any modifications
to the address space cannot affect the in-core code.  I believe this is one
of the methods used by R. Morris (sp?), Jr.  

>But even then, things like the sendmail bug could still be exploited.

I don't think so, unless the perpetrator has broken fundamental physical
security or the security administrator is a spy for IBM (8-)=  (seriously,
I'm just kidding).

>...
>The "trust this code file" system call would presumably insert checksums
>and such into the file as well.

How is this any more secure than the A Series method?  It still requires
adequate security in the operating system.

>...
>The problem then comes back at the level of interpretive languages, alas.

Alas.

>
>Anyway, is this idea crazy enough to work?  _Can_ a virus spread when each
>program appears to use a different instruction set?

For both questions:  uh... I don't think so (my personal opinion).

Barry Traylor
Unisys Large A Series Engineering (love those mainframes!)
barry@prc.unisys.com

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (06/08/90)

In article <14060@burdvax.PRC.Unisys.COM>, barry@PRC.Unisys.COM (Barry Traylor) writes:
> In article <3131@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
> >The trouble with that was that once you managed to jemmy (sic) that
						      ^^^^^^^^^^^
I wrote "jemmy".  Barry Traylor, apparently without checking a dictionary,
added "(sic)".  here we go:
	jemmy (N,Count,^tool,=crowbar) A *jemmy* is a heavy metal
	bar which is curved at one end and which is used as a tool
	especially by criminals for forcing things open;  used in
	British English.  E.g. "Any filing cabinet will yield to
	a jemmy and a bit of brute force."
To jemmy something is to force it with a jemmy.  Clearly the metaphor
in "jemmy that lock" is "use special tools to break something open
with criminal intent".  No "(sic)" needed.

> Both of these require breaking PHYSICAL security.

True.  But _once_ is all it took.  For example, when Burroughs
installed their *lovely* machines in the New Zealand Universities, they
managed to leave behind at Auckland University, as a bookmark in a
publicly accessible manual, a punched card with the magic letters NZUP
on it.  "New Zealand Universities Project", I thought, _surely_ this
can't be the super-user password.  But it _was_.  Compiled myself a
little ESPOL program and the world was _wide_ open ever after.  Only had to
use the password once too.  (Mind you, I was working at the Computer
Centre at the time and told them what I was doing.  Hacker yes, cracker no.)

> BTW, part of our security has always seemed to rely on the relative
> obscurity of our systems, although they have been strengthened to not rely
> on this anymore ;->.

Then there was the really fun bug (which was fixed very quickly; Burroughs
were pretty prompt to warn about things like that, and we had sources)
where if you knew what you were doing you could arrange for a code file
to contain zeros (no privilege needed); the operating system _really_
trusted the compilers, and if you ran such a code file the operators
had all the fun of rebooting (and rereading the card decks which had been
put into queues but not run).

> One thing you don't mention is that there is no way to
> modify the in-core image of an executing code file, i.e. any modifications
> to the address space cannot affect the in-core code.

This turns out not to be the case.  Words in the A series are tagged with
3-bit tags.  Instruction words have tag 3, and no odd tag is writable by
"ordinary" instructions.  But in the B6700 there were special instructions
one could use, and you didn't need to be in a special mode or ring or have
any special privilege to use them.  The system relied on the fact that the
ordinary compilers (Fortran, Algol, Cobol, PL/I, Pascal, WFL, and so on)
didn't generate those instructions.  But ESPOL did.  Several Universities
had load-and-go compilers (Auckland University had a rather nice Fortran)
which worked by creating a DIRECT ARRAY, filling it with code, and then
calling a library routine to set the tags.  After running the program,
one called the library routine again to clear the tags, and Bob's your
uncle.  Obviously the library routine had to check who was calling it
or that would have been a really fatal loophole.
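
A C caricature of that discipline (heavily simplified; only the 3-bit
tag and tag 3 for code are real, everything else is invented):

    /* Words carry a 3-bit tag; tag 3 marks instruction words, and
       odd tags are not writable by ordinary stores. */
    struct word {
        unsigned char tag;    /* 0..7; 3 = code word */
        unsigned long data;   /* stands in for the 48-bit word */
    };

    /* The "set the tags" library routine a load-and-go compiler
       would call on its DIRECT ARRAY full of generated code.  The
       whole scheme stands or falls on the caller check. */
    int set_code_tags(struct word *a, int n, int caller_is_trusted)
    {
        int i;
        if (!caller_is_trusted)
            return -1;        /* refuse: otherwise a fatal loophole */
        for (i = 0; i < n; i++)
            a[i].tag = 3;     /* the array is now executable code */
        return 0;
    }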

There's an important distinction here between what the _hardware_ can
do and what the *complete* hardware/software system will _let_ you do.
This is why it was so important for compiler privilege to be controlled.
And then there were little notices "if you install compiler X with
feature Y enabled [specifically, non-unity increments in DO VECTORMODE
and analogues thereof] all warranties are void".

> >But even then, things like the sendmail bug could still be exploited.

> >The "trust this code file" system call would presumably insert checksums
> >and such into the file as well.

> How is this any more secure than the A Series method?

Sorry, I was thinking about the B6700 and the crash-on-zero-code-files
bug; mixing up security and integrity.

> >Anyway, is this idea crazy enough to work?  _Can_ a virus spread when each
> >program appears to use a different instruction set?

> For both questions:  uh... I don't think so (my personal opinion).

Can't have it both ways.  If a virus can't spread in such a scheme,
then the scheme works.  (If there is a trusted immutable kernel which
enforces a code/data separation so that running programs not only
can't change code, they can't see it either, then the unscrambling can
be done when code pages are brought in rather than at run time, so most
of the overhead vanishes, as does the need for special hardware.  Oh
well.)
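
In sketch form, the pager would do both unscramblings at once (assuming
fixed 4-byte instructions with the opcode in the first byte; all names
invented):

    #define PAGE 512
    #define ILEN 4    /* assumed fixed-width 32-bit instructions */

    /* With the kernel enforcing that programs can neither write nor
       read their own code, both permutations can be undone once, in
       software, as each code page is brought in: no per-fetch cost
       and no special decode hardware. */
    void page_in_code(const unsigned char *file_page,
                      const unsigned short byte_perm[PAGE],
                      const unsigned char  op_map[128], /* scrambled->real */
                      unsigned char *frame)
    {
        int i;
        for (i = 0; i < PAGE; i++)           /* undo byte scramble */
            frame[i] = file_page[byte_perm[i]];
        for (i = 0; i < PAGE; i += ILEN)     /* unmap the opcode field */
            frame[i] = (frame[i] & 0x80) | op_map[frame[i] & 0x7f];
    }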

-- 
"A 7th class of programs, correct in every way, is believed to exist by a
few computer scientists.  However, no example could be found to include here."

byrd@portia.Stanford.EDU (Greg Byrd) (06/08/90)

In article <3199@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
[prev. quote removed]

>I wrote "jemmy".  Barry Traylor, apparently without checking a dictionary,
>added "(sic)".  here we go:
>	jemmy (N,Count,^tool,=crowbar) A *jemmy* is a heavy metal
>	bar which is curved at one end and which is used as a tool
>	especially by criminals for forcing things open;  used in
>	British English.  E.g. "Any filing cabinet will yield to
>	a jemmy and a bit of brute force."
>To jemmy something is to force it with a jemmy.  Clearly the metaphor
>in "jemmy that lock" is "use special tools to break something open
>with criminal intent".  No "(sic)" needed.


In fairness to Mr. Traylor, the American usage is "jimmy," and that's
how it's listed in *my* dictionary.

-- 
-----------------------------------------------------------------------------
Greg Byrd                 byrd@sumex-aim.stanford.edu
Knowledge Systems Lab     701 Welch Rd., Bldg. C, Palo Alto, CA  94304
Stanford University       (415) 725-3849

rpw3@rigden.wpd.sgi.com (Rob Warnock) (06/09/90)

In article <3199@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au
(Richard A. O'Keefe) writes:
+---------------
| In article <14060@burdvax.PRC.Unisys.COM>, barry@PRC.Unisys.COM writes:
| > >The trouble with that was that once you managed to jemmy (sic) that
| 						      ^^^^^^^^^^^
| I wrote "jemmy".  Barry Traylor, apparently without checking a dictionary,
| added "(sic)".  here we go:
| 	jemmy (N,Count,^tool,=crowbar) A *jemmy* is a heavy metal bar...
|	used in British English...
+---------------

Truly an international network... ;-}  ;-}

U.S. dictionaries list *only* the spelling "jimmy" (from the proper name
"Jimmy", slang for a burglar) for "a short crowbar", or "to break open".
So Barry might have sic'd you even if he *did* check a dictionary!

-Rob

-----
Rob Warnock, MS-9U/510		rpw3@sgi.com		rpw3@pei.com
Silicon Graphics, Inc.		(415)335-1673		Protocol Engines, Inc.
2011 N. Shoreline Blvd.
Mountain View, CA  94039-7311

johnv@sequent.UUCP (John VanZandt) (06/09/90)

Having worked with a B6700 computer for many years, I found that
it was not necessary to break physical security to create code files
and compilers.  The system software (circa 1972-1980) had enough security
holes that a non-privileged program could establish itself as having
total privileges, and thus change the bits in the stack which determined
whether it was a compiler, circumventing the need for operator intervention.

Now, these security holes have been fixed, but whether any others exist,
I do not know.