[comp.unix.msdos] MSC 6.0 failing under VPIX.

NU013809@NDSUVM1.BITNET (Greg Wettstein) (09/24/90)

A couple of weeks ago I posted a note requesting information from anyone
who has had difficulty with MSC 6.0 producing faulty executables when
the compiler is run under VPIX.  I received one piece of mail from someone
who indicated that they had no difficulty in this environment.

Since that time I have spent some time replicating the problem and would
like to pass the following information on to the net in the hopes that
other users may try to replicate this problem and provide additional
information.  I have spoken with Microsoft Technical Support about the
problem and they have no ideas.  I was advised that
they support the product only under MS-DOS and that they cannot take
responsibility for emulators.  An odd situation since every time I
type ver under MS-DOS running under vpix I get: MS-DOS Version 3.30, but
that is another topic entirely.

The basic problem is this: When a program is compiled with debugging
enabled using Microsoft C Version 6.0 a faulty executable is generated.
If the same program is compiled without the debugging option (/Zi) the
problem does not occur.  The faulty executable produced causes VPIX to
terminate with the following error: Unable to emulate two-byte instruction.

I spent about two days poking around with Codeview and have isolated the
following features about the bug:

The problem lies entirely in the linkage stage.  I set up another machine
running with the same version of MS-DOS (3.3) in native mode and the
object code generated by identical programs with identical compilation
options produces identical object files.  The final executables differ
in size by 522 bytes.  The executable produced under VPIX is faulty
in both environments and causes the native DOS machine to hang requiring
a power-down.
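
For anyone who wants to repeat the comparison of the two executables, here is
a minimal sketch in portable C that reports the offset of the first byte at
which two files differ.  The function name first_difference and its return
conventions are my own invention for illustration; it is not part of any
Microsoft tool.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Return the offset of the first differing byte between two files,
 * -1 if the files are identical, or -2 if either file cannot be opened.
 * If one file is a prefix of the other, the length of the shorter file
 * is returned.  (Hypothetical helper, for illustration only.)
 */
long first_difference(const char *path_a, const char *path_b)
{
    FILE *a, *b;
    long offset = 0;
    int ca, cb;

    assert(path_a != NULL && path_b != NULL);

    a = fopen(path_a, "rb");        /* binary mode matters under DOS */
    if (a == NULL)
        return -2;
    b = fopen(path_b, "rb");
    if (b == NULL) {
        fclose(a);
        return -2;
    }

    for (;;) {
        ca = getc(a);
        cb = getc(b);
        if (ca != cb)
            break;                  /* mismatch, or one file ended first */
        if (ca == EOF) {
            offset = -1;            /* both ended together: identical */
            break;
        }
        offset++;
    }

    fclose(a);
    fclose(b);
    return offset;
}
```

Calling first_difference() on the VPIX-built and native-built executables
would show where the two images first diverge, which may help narrow down
which part of the link output is misplaced.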

The first error that I could find in the executable was a problem with
the null-pointer check code supported by the /qc option.  The startup code
checks the ___qczrinit variable and if this value is non-zero attempts to
execute code pointed to by the ___qczrinit location.  In the VPIX
generated executable this location is non-zero; unfortunately the address
pointed to by this location is filled with junk, causing the emulator to
trap and the native DOS machine to hang.

Using CV I manually set the ___qczrinit variable to zero causing the
startup code to skip running the null-pointer check initialization code.
The program now ran past this fault but generated an emulator trap
(machine hang) later on in the startup code.

I traced program execution to the point where the floating point emulator
code is initialized.  This code can be found in crt0dat.asm which is
in the dos subdirectory of the startup code.  The code in question is
found around line 379 and is the following instruction: call [fpmath]
Unfortunately the value pointed to by fpmath contains junk, and as in the
case with ___qczrinit causes the machine to trap or hang.

At this stage I was convinced that the incremental linker included with
MSC 6.0 was either not loading the needed code into the executable or
was loading it in the wrong place in the executable.  I created a number
of small test programs of varying complexity (size) but in each case the
difference in size of the executable between the VPIX and native DOS
was not consistent.  This led me to believe that the problem was not
in the linker failing to include needed object code from the libraries.

At this stage I contacted Microsoft Support and conveyed all of the
previous information to them.  The support person I spoke with acknowledged
that there were some difficulties with the incremental linker under
certain environments.  At this point he mentioned something about networks
and a light clicked on in my head.

I do all of my MS-DOS support on the UNIX filesystem reserving the pseudo-C
drive to hold my utilities, compilers etc.  The thought crossed my mind
that I hadn't tested the compiler on the C drive.  I copied the program to
the C drive and ran the compiler which produced an error-free executable
which was byte-identical to the one produced in the native DOS environment.

With the problem very well isolated I contacted Microsoft Support but was
advised that they only support the compiler under MS-DOS and could not
be responsible for errors produced by an emulator.  I can understand their
position but the problem does not smell to me like it arises from the
emulator but rather from a networking perspective.

If I understand the environment correctly the UNIX filesystem is mounted
much like a network drive when VPIX is running.  In fact if DOS CHKDSK
is run while in a mounted directory (z: y: h: etc) an error message is
issued stating that CHKDSK cannot be run on a networked drive.  I have
tested a number of Norton-like utilities on the mounted directories and
they issue a similar message.

Based on these experiences I am concluding that the problem appears to be
in either the way the incremental linker deals with networked drives or
perhaps in the quality of the network emulation of the UNIX filesystem
under VPIX.  It probably doesn't matter where the problem lies because
I would suspect that the two software companies in question would blame
each other.  In the meantime the only work-around would seem to be to
do all compilations in the pseudo-C drive.  I am not happy with this
situation due to size limitations on the pseudo-C drive and write-privilege
problems but if MSC 6.0 is to be used it is the only alternative.

I am wondering if anyone has experienced a similar problem or am I alone
in this?  The version of XENIX in use is 2.3.3 and VPIX Update A.  Hardware
does not seem to be an issue since the problem occurs on two entirely
different machines.  Machine 1: ALR 220 with 6 megabytes ram, EGA, 300
megabyte CDC Wren with Compu-Add caching controller (2 meg), Mountain tape,
two serial, parallel.  Machine 2: Gateway 2000 33 Mhz/64K cache, 8 megabytes
ram, 300 megabyte CDC Wren, VGA, mountain tape, two serial etc.

My suspicions center on problems associated with file size detection and
stat'ing of files on the network drive.  I have somewhat vague recollections
of MS-DOS based rm utilities failing to erase all files in a sub-directory
when rm *.* is used.  A second invocation cleans out all files left by the
first pass of rm.
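
The file size suspicion could be checked with something like the following,
a minimal sketch in portable C.  It writes a known number of bytes, closes
the file, reopens it and compares the size found by seeking to the end
against what was written.  The function name reported_size_after_write is
hypothetical, and the sketch only exercises the C library calls; it does
not prove anything about what the linker itself does.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write `count` bytes to `path`, close the file, reopen it and return
 * the size found by seeking to the end.  Returns -1 on any I/O error.
 * (Hypothetical helper, for illustration only.)
 */
long reported_size_after_write(const char *path, long count)
{
    FILE *fp;
    long i, size;

    assert(path != NULL && count >= 0);

    fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    for (i = 0; i < count; i++) {
        if (putc((int)(i & 0xFF), fp) == EOF) {
            fclose(fp);
            return -1;
        }
    }
    if (fclose(fp) != 0)            /* flush errors show up here */
        return -1;

    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    if (fseek(fp, 0L, SEEK_END) != 0) {
        fclose(fp);
        return -1;
    }
    size = ftell(fp);               /* size as seen by the filesystem */
    fclose(fp);
    return size;
}
```

Running this once on the pseudo-C drive and once on a mounted UNIX directory,
with the same count, and comparing each result against the count would show
whether the reported size disagrees with what was written on the mounted
drive.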

My suspicion is that the incremental linker is not seeking to the correct
position in the executable file when it attempts to write things out.  This
would explain why executable code does not seem to be at the correct
location in the file.  This is of course entirely supposition as I have
not had time to figure out exactly what is happening.  Just as a side note
when I was using MSC 5.0 and QC 2.0 on the network mounted UNIX filesystem
I would occasionally get a linker failure with a message that said one
of the object modules was incorrect, invalid or something to that effect.  I
would delete the offending object module, recompile, link and all would be
fine.

I have wasted enough of everyone's time at this point so I will cease.  Any
information or comments would be appreciated.  Hopefully this might save
someone else an afternoon or evening of time at some point.  It is
unfortunate that MSC 6.0 exhibits this behavior; we depend pretty heavily
on VPIX to support our MS-DOS development.  The advantages of this
environment are numerous, not the least of which is the fact that I would
have worn out a power-switch and probably burned out a disk-drive trying to
isolate a similar problem on a native-mode DOS machine....

Replies or questions directed to the address in my sig are preferred.  Thanks
to everyone who has taken the time to read this.

                            As always,
                            Dr. G.W. Wettstein
                            Oncology Research Division Computing Facility
                            Fargo Clinic / MeritCare

                            UUCP: uunet!plains!wind!greg
                            INTERNET: greg%wind.uucp@plains.nodak.edu
                            Phone: 7001-234-2833

`The truest mark of a man's wisdom is his ability to listen to other
 men expound their wisdom.'

eric@femto.mks.com (Eric Gisin) (09/26/90)

In-reply-to: NU013809@NDSUVM1.BITNET's message of 24 Sep 90 13:02:18 GMT
> [the MS linker fails on non-DOS drives in SCO VP/ix]

There is a known problem with VP/ix on Interactive UNIX (not fixed in 2.2).
A write of 0 bytes is supposed to truncate the file at the current file position.
This is documented DOS behaviour and does not work in VP/ix on UNIX drives.

This may not be the bug that causes MS link to fail, but it
is possible that the linker seeks back and forth in the file
re-writing things. It should be easy to write a test program that
checks for bugs in both these operations.
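
A minimal sketch of the seek-and-rewrite half of such a test, in portable C,
might look like the following.  It writes a placeholder header and a body,
seeks back to patch the header (roughly what a linker fixing up an
executable header might do), then reads the file back and verifies every
byte.  The function name seek_rewrite_check, the 4-byte header and the byte
pattern are all assumptions made for illustration.  The zero-byte-write
truncation is DOS-specific behaviour and is not covered here; it would need
a raw DOS write call rather than the C stdio routines.

```c
#include <assert.h>
#include <stdio.h>

#define HEADER_LEN 4

/*
 * Returns 1 if the file reads back exactly as expected after a
 * seek-back-and-patch, 0 on a content mismatch, -1 on an I/O error.
 * (Hypothetical helper, for illustration only.)
 */
int seek_rewrite_check(const char *path, long body_len)
{
    static const char placeholder[HEADER_LEN] = { '?', '?', '?', '?' };
    static const char final_hdr[HEADER_LEN]   = { 'D', 'O', 'N', 'E' };
    FILE *fp;
    long i;
    int ok = 1;

    assert(path != NULL && body_len >= 0);

    fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;

    /* Pass 1: placeholder header followed by a recognisable body. */
    if (fwrite(placeholder, 1, HEADER_LEN, fp) != HEADER_LEN) {
        fclose(fp);
        return -1;
    }
    for (i = 0; i < body_len; i++)
        putc((int)(i % 251), fp);

    /* Pass 2: seek back to the start and patch the header in place. */
    if (fseek(fp, 0L, SEEK_SET) != 0 ||
        fwrite(final_hdr, 1, HEADER_LEN, fp) != HEADER_LEN) {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) != 0)
        return -1;

    /* Verify: patched header, untouched body, nothing extra at the end. */
    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    for (i = 0; i < HEADER_LEN; i++)
        if (getc(fp) != final_hdr[i])
            ok = 0;
    for (i = 0; i < body_len; i++)
        if (getc(fp) != (int)(i % 251))
            ok = 0;
    if (getc(fp) != EOF)
        ok = 0;
    fclose(fp);
    return ok;
}
```

A result of 1 on the DOS drive and 0 on a UNIX-mounted drive, for the same
body length, would point at the seek or rewrite path rather than at the
linker's choice of what to include.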

andy@mks.com (Andy Toy) (09/26/90)

If this happens when linking on the UNIX filesystem from VPIX then this
problem may also exist if linking from a DOS machine with the UNIX
filesystem NFS mounted.  
-- 
Andy Toy, Mortice Kern Systems Inc.,       Internet: andy@mks.com
  35 King Street North, Waterloo,       UUCP: uunet!watmath!mks!andy
      Ontario, CANADA N2J 2W9      Phone: 519-884-2251  FAX: 519-884-8861

src@scuzzy.in-berlin.de (Heiko Blume) (09/28/90)

NU013809@NDSUVM1.BITNET (Greg Wettstein) writes:
>In the meantime the only work-around would seem to be to
>do all compilations in the pseudo-C drive.  I am not happy with this
>situation due to size limitations on the pseudo-C drive and write-privilege
>problems but if MSC 6.0 is to be used it is the only alternative.

i'd suggest to set up a pseudo D: drive. then you can recreate D: every
once in a while to avoid fragmentation etc. and don't have to change C:
(by writing to it).
-- 
Heiko Blume c/o Diakite   blume@scuzzy.in-berlin.de   FAX   (+49 30) 882 50 65
Kottbusser Damm 28        blume@scuzzy.mbx.sub.org    VOICE (+49 30) 691 88 93
D-1000 Berlin 61          blume@netmbx.de             TELEX 184174 intro d
scuzzy Any ACU,e 19200 6919520 ogin:--ogin: nuucp ssword: nuucp