[comp.unix.questions] C bug causes double fault

zarquon@tree.UUCP (Erin Filbert) (03/19/89)

While writing a deceptively simple C program, I managed to crash the entire
system here.  The following code:

main()
{
	float x;

	printf("x = %d", x);
}

will send the system into a double fault.  I am using Microport System V
Release 2 on a 286 machine (AT clone).

Has anyone else encountered this problem?  Are there any patches for it?  
The last time this happened, restarting the system caused severe file
damage.  

Any help would be greatly appreciated.

-- 
-------------------------------------------------------------------------------
I'm sorry.  I think we might have      |           Erin M. Filbert
been better off with a slide rule.     |   
   - Zaphod Beeblebrox                 |   Path: pacbell!sactoh0!tree!zarquon

gwyn@smoke.BRL.MIL (Doug Gwyn ) (03/20/89)

In article <244@tree.UUCP> zarquon@tree.UUCP (Erin Filbert) writes:
>main()
>{
>	float x;
>	printf("x = %d", x);
>}

You'll undoubtedly get a flood of responses correctly pointing out
that conversion of a double (promoted float) argument according to
an int format is incorrect.  Use %g or some such format specifier.

The reason I'm posting this is so I can include a plea not to post
questions like this to comp.unix.wizards.  That's what
comp.unix.questions is for.  UNIX-WIZARDS is for "wizardly"
discussions (not that it gets very many, but that's what it's
intended for).  Thanks.

tvf@cci632.UUCP (Tom Frauenhofer) (03/21/89)

In article <9884@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>You'll undoubtedly get a flood of responses correctly pointing out
>that conversion of a double (promoted float) argument according to
>an int format is incorrect.  Use %g or some such format specifier.

I'll give you the point, Doug, but his point remains valid.  On Microport V/AT,
what he wrote causes a kernel panic.  That doesn't seem to be reasonable
behavior for an OS/library routine/whatever.

(P.S.: Your group suggestion was correct, but I would have limited it to
       comp.unix.microport myself.  I'm not limiting my reply in deference
       to the original posting.)

Thomas V. Frauenhofer	...!rutgers!rochester!cci632!ccird7!tvf
*or* ...!rochester!cci632!ccird7!frau!tvf *or* ...!rochester!rit!anna!ma!tvf1477
BLOOM: You can't shoot the actors!  They're human beings!
BIALYSTOCK: Oh Yeah?  You ever eat with one?

dbell@cup.portal.com (David J Bell) (03/21/89)

>In article <244@tree.UUCP> zarquon@tree.UUCP (Erin Filbert) writes:
>>main()
>>{
>>	float x;
>>	printf("x = %d", x);
>>}
>
>You'll undoubtedly get a flood of responses correctly pointing out
>that conversion of a double (promoted float) argument according to
>an int format is incorrect.  Use %g or some such format specifier.
>
>The reason I'm posting this is so I can include a plea not to post
>questions like this to comp.unix.wizards.  That's what
>comp.unix.questions is for.  UNIX-WIZARDS is for "wizardly"
>discussions (not that it gets very many, but that's what it's
>intended for).  Thanks.

OK, *WIZARDS*, now answer Erin's real question; I'm sure the original
error of printing the float (OK, double...) argument as int was recognized.

Now, why does the fragile compiler bring down the system?

decot@hpisod2.HP.COM (Dave Decot) (03/22/89)

In article <244@tree.UUCP> zarquon@tree.UUCP (Erin Filbert) writes:
> >main()
> >{
> >	float x;
> >	printf("x = %d", x);
> >}

In another article, gwyn@smoke.BRL.MIL (Doug Gwyn ) responds:
> You'll undoubtedly get a flood of responses correctly pointing out
> that conversion of a double (promoted float) argument according to
> an int format is incorrect.  Use %g or some such format specifier.

Well, I "doubt" that Erin will get a flood of responses pointing this
out, since it is obvious to almost any reader (by virtue of the fact that the
code in question is completely isolated) that Erin already knows this
code is incorrect.

I don't have an answer to the question Erin asked, either; sorry.

Dave

asmodeus@tree.UUCP (Jonathan Ballard) (03/22/89)

In article <9884@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn ) writes:
> In article <244@tree.UUCP> zarquon@tree.UUCP (Erin Filbert) writes:
> >main()
> >{
> >	float x;
> >	printf("x = %d", x);
> >}
> 
> You'll undoubtedly get a flood of responses correctly pointing out
> that conversion of a double (promoted float) argument according to
> an int format is incorrect.  Use %g or some such format specifier.

Okay, of course it is a bug.  But the problem is that cc is not detecting
it and just compiles it that way.  Shouldn't there be something in cc so
that a warning could appear?  Something like:

WARNING! Float not cast to the type the format declares.

This is actually a serious error because anybody could do this and then
totally crash the system! 
-- 
----Asmodeus - Jonathan Ballard  ..!csusac!tree!asmodeus
				 ..!pacbell!sactoh0!tree!asmodeus
"I'm going to create the best game ever heard of!
	Might take a few years thou..." -me

gwyn@smoke.BRL.MIL (Doug Gwyn ) (03/22/89)

In article <27245@cci632.UUCP> tvf@ccird7.UUCP (Tom Frauenhofer) writes:
>On Microport V/AT, what he wrote causes a kernel panic.  That doesn't seem
>to be reasonable behavior for an OS/library routine/whatever.

Of course nobody would call it "reasonable", but it's not too surprising.
Incorrect user-mode code on a nonprotected multitasking system (forced by
limitations of the PC/AT architecture) can easily crash the entire system.
For another example, when testing newly written DMD applications downloaded
into my (AT&T 5620 or 630) terminal, some bugs cause the whole terminal
to die and have to be rebooted.  That's just the nature of environments
without hardware memory protection.

To avoid problems in the original example, the C implementation would
have to perform many detailed checks at run time, which would be considered
prohibitively high overhead, or else the compilation environment would have
to detect *printf() format/argument type mismatches.  The latter is feasible
and perhaps by nagging the vendor it will be done in some future release.

dave@viper.Lynx.MN.Org (David Messer) (03/22/89)

In article <9884@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
 >In article <244@tree.UUCP> zarquon@tree.UUCP (Erin Filbert) writes:

   { Mention that this causes the system to crash deleted by Doug Gwyn }

 >>main()
 >>{
 >>	float x;
 >>	printf("x = %d", x);
 >>}
 >
 >You'll undoubtedly get a flood of responses correctly pointing out
 >that conversion of a double (promoted float) argument according to
 >an int format is incorrect.  Use %g or some such format specifier.

And you will probably get a flood of responses correctly pointing out
that what you say is irrelevant.  The original message mentioned that
he "crashed the entire system" by running this program (calling it
a "double fault" rather than "double panic", which may have misled you).
It doesn't matter that the C program has a bug, it still shouldn't
crash the operating system.

 >The reason I'm posting this is so I can include a plea not to post
 >questions like this to comp.unix.wizards.

A true wizard carefully reads the question so that he might answer the
question actually asked, rather than just say the first thing that comes
to mind.

 >  Thanks.

You are welcome.
-- 
This space                           | David Messer       dave@Lynx.MN.Org -or-
for rent.                            | Lynx Data Systems  ...!bungia!viper!dave

rcd@ico.ISC.COM (Dick Dunn) (03/23/89)

In article <9900@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn ) writes:
> In article <27245@cci632.UUCP> tvf@ccird7.UUCP (Tom Frauenhofer) writes:
> >On Microport V/AT, what he wrote causes a kernel panic.  That doesn't seem
> >to be reasonable behavior for an OS/library routine/whatever.
> 
> Of course nobody would call it "reasonable", but it's not too surprising.
> Incorrect user-mode code on a nonprotected multitasking system (forced by
> limitations of the PC/AT architecture) can easily crash the entire system.

What, more specifically, are the "limitations of the PC/AT architecture"?

Microport runs the 286 in protected mode.  Each process has its own memory,
protected via the LDT, and the GDT entries <should> be set so that user-
level code can't get outside its playpen.  The memory protection in the 286
is a fairly serious nuisance to work with, but it is there.

Exceptions occurring in user mode go through a call gate into system mode
where you can straighten out the mess.  You change stacks when you change
privilege levels; in the kernel you're in an environment as safe (and as
insulated from user-code screwups) as you care to make it.

Is there some problem with the 287 interaction with the 286 in protected
mode that can't be made safe?  Specifically, what can a user-mode program
on the AT do, when running in protected mode, that the OS couldn't protect
against?
-- 
Dick Dunn      UUCP: {ncar,nbires}!ico!rcd           (303)449-2870
   ...Never offend with style when you can offend with substance.

dave@micropen (David F. Carlson) (03/23/89)

In article <9900@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn ) writes:
> In article <27245@cci632.UUCP> tvf@ccird7.UUCP (Tom Frauenhofer) writes:
> >On Microport V/AT, what he wrote causes a kernel panic.  
> 
> Of course nobody would call it "reasonable", but it's not too surprising.
> Incorrect user-mode code on a nonprotected multitasking system (forced by
> limitations of the PC/AT architecture) can easily crash the entire system.
> For another example, when testing newly written DMD applications downloaded
> into my (AT&T 5620 or 630) terminal, some bugs cause the whole terminal
> to die and have to be rebooted.  That's just the nature of environments
> without hardware memory protection.
> 

Begging your pardon, but although the 80286 has an odd segmented scheme for
memory management, it is not non-protected when running Unix SV in any way
I am familiar with the term.  Perhaps you are too quick to flame that which
you know not of.

The truth is that Microport early versions had the potential to corrupt the
kernel stack on floating point exceptions, which is what this should be.
This was supposedly fixed several versions ago and I never saw this
again.  (It was a showstopper though for a multi-user development machine:
too insecure to use.)

-- 
David F. Carlson, Micropen, Inc.
micropen!dave@ee.rochester.edu

"The faster I go, the behinder I get." --Lewis Carroll

gwyn@smoke.BRL.MIL (Doug Gwyn ) (03/23/89)

In article <660@micropen> dave@micropen (David F. Carlson) writes:
>Begging your pardon, but although the 80286 has an odd segmented scheme for
>memory management, it is not non-protected when running Unix SV ...

I stand corrected.  I had been informed (apparently erroneously) that
the PC/AT did not have a hardware memory management unit.  I steer
clear of the whole IBM PC family myself.

>The truth is that Microport early versions had the potential to corrupt the
>kernel stack on floating point exceptions, which is what this should be.
>This was supposedly fixed several versions ago and I never saw this again.

Someone else informed me the same, except they neglected to mention that
the problem had been fixed.  Apparently there are at least three forms of
floating-point processors, plus software emulation, available for such PCs.

davidsen@steinmetz.ge.com (Wm. E. Davidsen Jr) (03/25/89)

In article <9900@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
| Of course nobody would call it "reasonable", but it's not too surprising.
| Incorrect user-mode code on a nonprotected multitasking system (forced by
| limitations of the PC/AT architecture) can easily crash the entire system.

  I'm sure dozens of people will tell you that the 286 has full memory
and i/o space control in protected mode, and that all UNIX versions on
the 286 use this. The flaw is in the handling of floating point
exceptions.

  Question: I could see this failing if you used a "%f" format to print
an int, since it would be an unnormalized float, but why was printf
doing anything in f.p. on this??

  To the person who maligned the 286 vs. the PDP-11: 16MB segmented
beats 256k linear every time. Ask anyone who ever tried to shoehorn an
application onto an old 11. Some of the very top of the line 11's had
access to more memory, I believe, but I don't think it was more
convenient than the "huge" model, in that you still only have 16 bit
ints for subscripts. The last 11 I used was a 40 (or 45) and V7.
-- 
	bill davidsen		(wedu@crd.GE.COM)
  {uunet | philabs}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

jbu@sfsup.UUCP (+Urban J.) (03/25/89)

The AT&T PC 6300 PLUS (which was an 80286 machine) also had a similar problem in UNIX System
V Release 2.0 Version 2.

However, all these floating point panics were fixed (on the AT&T 6300 PLUS) in UNIX
System V Release 2.0 Version 2.5.

The basic problem was in the floating point emulation code when a floating point
exception occurred.  On UNIX System V Release 2.0, the floating point emulation code
ran in kernel mode, so when an exception occurred, the system panicked.  In UNIX System
V Release 2.0 Version 2.5 on the 80286, the floating point emulation code was moved
from the kernel stack/area into user space.  Therefore when an exception occurred,
only the user process dumped core and the kernel did not panic.

On UNIX System V/386 Release 3.2 (for the 80386) the software emulation code is also
in the user space so when a floating point exception occurs, only the user's process
dies and not the kernel.

On the AT&T PC 6300 PLUS running UNIX System V Release 2.0, if an 80287 chip is present
the system will not panic (nor hang).

Sincerely,
John Urban

mrm@sceard.UUCP (M.R.Murphy) (03/30/89)

In article <660@micropen> dave@micropen (David F. Carlson) writes:
!
!The truth is that Microport early versions had the potential to corrupt the
!kernel stack on floating point exceptions, which is what this should be.
!This was supposedly fixed several versions ago and I never saw this
!again.  (It was a showstopper though for a multi-user development machine:
!too insecure to use.)
!
The program as written doesn't double panic uPort V/AT 2.2.2 with 80287,
or 2.3, 2.4 without 80287. With other combinations, your mileage may vary:-)
Which, incidentally, points out one of the large problems in getting the
bugs out of an operating system which is expected to run in who knows how
many hardware configurations (CPU, motherboard, disk controller, disk drive,
and on and on...). The hardware used in this exhaustive test was a no-name clone.
--
Mike Murphy  Sceard Systems, Inc.  544 South Pacific St. San Marcos, CA  92069
mrm@sceard.UUCP       {hp-sdd,nosc,ucsd}!sceard!mrm            +1 619 471 0655