[comp.os.vms] Details on shareable image problems

kvc@minnie.UUCP (Kevin Carosso) (12/02/87)

This is a rather long followup to the discussion that took place in
INFO-VAX in September wrt problems using the VAXCRTL shareable image
or problems building shareable images from VAX C or FORTRAN.  The
discussion ended on a note of agreement that "yes, there is a problem
with the linker or librarian that makes this not work".  Since I
recently ran into the problem, this is a detailed analysis of what
I think is really happening.
-------------------------------------------------------------------

	Notes on building shareable images

		/Kevin Carosso @ NRC, 30-NOV-1987

Due to the way in which the VMS linker and the VMS C compiler work, there
is currently a problem using VAX C modules with shareable images.  These
notes represent what I believe to be an accurate analysis of what the
problems are, why they occur, and how to avoid them.  I also include a
(probably much less accurate) summary of what I think really ought to
be done to solve the problem.

Problems show up when VAX C modules are linked against a shareable image,
such as VAXCRTL.EXE or a user-written shareable image and that shareable
image has been placed in a shareable image library.  Resulting executable
programs are incorrect because references from within the modules to
"extern" variables defined by the shareable image do not match and the
two sections of code end up using different locations in memory.  They
never see each other's values.

To understand how the problem arises, we need to understand

	- how VAX C defines global variables
	- how the linker determines its order of processing modules
	- how the linker resolves references.

We'll start with VAX C:

When the C compiler encounters a definition outside a routine, such
as the following:

	int zz;

the compiler will create a PSECT (Program SECTion) called "ZZ", with the
OVR (overlay) attribute.  Every definition of "ZZ" like the above will
also define such a PSECT.  Because the PSECT has the OVR attribute, all
such PSECT definitions encountered by the linker under normal circumstances
will be overlaid, one atop the other, causing all references to "ZZ" to
end up referencing the same storage cell.  Interestingly enough, the C
compiler generates precisely the same PSECT definition for a C "extern"
declaration, such as:

	extern int zz;

This means that you could write a set of VAX C modules all using a global
variable, such as "ZZ", and you could say "extern" for all of them or for
none of them; the result is the same.  At least I didn't see a difference
in the object code emitted by the C compiler for the two declarations
(using VAX C V2.2).
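
As a small illustration of the source-level view, here is a sketch in
standard C with both spellings in one file.  It only shows that the two
declarations name the same object; it says nothing about PSECTs, which
are specific to the object code VAX C emits.  The function names are
hypothetical and simply stand in for separate modules.

```c
#include <assert.h>

int zz;          /* definition without "extern", as in the first form */
extern int zz;   /* declaration with "extern", as in the second form  */

/* Stands in for a module that writes the global. */
void producer(int value)
{
    zz = value;
}

/* Stands in for a module that reads the global. */
int consumer(void)
{
    return zz;
}

/* Both spellings name one object, so there is only one address. */
int *address_of_zz(void)
{
    int *p = &zz;
    assert(p != 0);
    return p;
}
```

Calling producer(42) and then consumer() yields 42, because both
declarations refer to a single storage cell.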

It should be noted here that the VAX C extensions to declarations, "globaldef"
and "globalref", do NOT create PSECT definitions and instead define
global symbols.  This stuff is documented in the "Guide to VAX C"
for V2.3.  Though they may both be used to create a global storage
cell and resolve references to that cell, there are some very important
differences between PSECT definition/reference and global symbol
definition/reference.  To understand these we need to talk a little bit
about how the linker works.  From this point on, we won't be specifically
addressing VAX C.  FORTRAN, for example, uses overlaid PSECTs for
COMMON storage and those are subject to exactly the same difficulties.
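
For comparison, here is a rough sketch of the "globaldef"/"globalref"
style.  I am assuming that VAX C predefines the macro VAXC; on any other
compiler the sketch falls back to ordinary standard C so that it still
compiles.  The macro names GLOBAL_DEF and GLOBAL_REF and the variable
request_count are made up for the example.

```c
#include <assert.h>

/* Assumption: VAX C predefines VAXC.  Elsewhere, use plain C linkage. */
#ifdef VAXC
#define GLOBAL_DEF globaldef   /* defines a global symbol            */
#define GLOBAL_REF globalref   /* references a symbol defined elsewhere */
#else
#define GLOBAL_DEF             /* ordinary external definition       */
#define GLOBAL_REF extern      /* ordinary external reference        */
#endif

/* Exactly one module would contain the definition: */
GLOBAL_DEF int request_count = 0;

/*
 * Every other module would contain only the reference, which belongs
 * in a different source file and so is shown here as a comment:
 *
 *     GLOBAL_REF int request_count;
 */

/* Increment the shared counter and return its new value. */
int bump_request_count(void)
{
    request_count++;
    assert(request_count > 0);
    return request_count;
}
```

The point of the sketch is that the definition and the reference are
spelled differently, which is what lets the linker tell them apart.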

The linker understands the difference between a definition of a global
symbol and a reference to a global symbol and can check and complain
about multiply defined symbols or references to undefined symbols.
On the other hand, there is only one kind of PSECT declaration.
Multiple declarations of a PSECT are merely overlaid or concatenated
depending on their attributes.  You can never reference a PSECT that
is not defined -- the first reference is the definition.

The linker processes entities called "clusters".  As the linker parses
its DCL command line and any options files you specified, it creates a
list of clusters to be processed.  Object modules, object libraries, and
shareable image libraries (remember this!) are all placed into a cluster
called "DEFAULT_CLUSTER" unless an options file specifically creates a
new cluster and names those modules to be put into it.  Shareable images
encountered in an options file are each placed into their own cluster.

The linker now has a list of clusters, each of which may have a list of
one or more files.  The order in which the linker encounters and resolves
symbols depends entirely on the order in which it processes clusters and
the files listed in those clusters.  The linker processes clusters in the
order it encountered them, except for the default cluster, which is always
put on the end of the cluster list and thus processed last.

When the cluster list has been completely processed, the linker determines
whether it has any unresolved symbols.  If so, it then processes default
user libraries, the default system shareable image library (IMAGELIB) and the
default system object library (STARLET) in that order, unless these have
been suppressed with appropriate LINK command qualifiers.
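
The ordering described above can be written down as a toy model in C.
This is only a sketch of my description, not of how LINK is actually
implemented: explicitly created clusters come first in the order they
were encountered, the default cluster comes next, and the default
libraries are consulted only if something is still unresolved.  The
function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Default libraries, in the order the text says they are searched. */
static const char *const DEFAULTS[] = {
    "default user libraries", "IMAGELIB", "STARLET"
};

/*
 * Fill out[] with the processing order and return the number of steps.
 * named[] holds explicitly created clusters in the order encountered.
 * cap must leave room for the default cluster and three libraries.
 */
size_t processing_order(const char *const named[], size_t n_named,
                        int unresolved, const char *out[], size_t cap)
{
    size_t n = 0, i;

    assert(out != NULL && cap >= n_named + 4);
    for (i = 0; i < n_named; i++)
        out[n++] = named[i];
    out[n++] = "DEFAULT_CLUSTER";        /* always last of the clusters */
    if (unresolved)
        for (i = 0; i < 3; i++)
            out[n++] = DEFAULTS[i];
    return n;
}

/* Return the position of name in order[], or -1 if it is absent. */
int position_of(const char *const order[], size_t n, const char *name)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(order[i], name) == 0)
            return (int)i;
    return -1;
}
```

With a single named cluster "VAXCRTL", the model places it ahead of
DEFAULT_CLUSTER, which in turn comes ahead of IMAGELIB and STARLET.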

The key thing to note here is that the linker will process the VAXCRTL
shareable image before user written modules if VAXCRTL/SHARE is specified
in an options file.  If VAXCRTL is placed in a shareable image library or
in IMAGELIB, it will be seen after the user written modules.  Apparently,
in order to properly overlay all references to a PSECT to the same PSECT,
a shareable image containing a reference to the PSECT has to be seen first.
I assume this requirement has something to do with the fact that all
references within the shareable image to the PSECT have already been
resolved by the link of the shareable image itself and in order for
other modules to see the same resolution of the PSECT, the linker must
use the storage defined for the PSECT in the shareable image.  To guarantee
this, the linker must see the shareable image definition first.
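
To make my guess concrete, here is a toy model of it in C.  It encodes
only the assumption stated above: a shareable image's PSECT storage is
fixed by its own link and cannot move, while an object module simply
takes whatever storage has already been chosen for the PSECT, or gets
fresh storage if none has.  The addresses and all names are hypothetical,
and this is not a description of LINK internals.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum kind { OBJECT_MODULE, SHAREABLE_IMAGE };

struct contribution {
    enum kind   kind;
    const char *psect;          /* PSECT name, e.g. "ZZ"               */
    int         image_address;  /* shareable images only: fixed address */
};

#define NEW_STORAGE 0x200       /* hypothetical fresh allocation */

/*
 * Walk the contributions in processing order.  Return 1 if every
 * contributor to psect ends up using the same storage, 0 otherwise.
 */
int shares_storage(const struct contribution *list, size_t n,
                   const char *psect)
{
    int    bound = -1;          /* storage chosen so far, -1 if none */
    int    agree = 1;
    size_t i;

    assert(list != NULL && psect != NULL);
    for (i = 0; i < n; i++) {
        int resolved;

        if (strcmp(list[i].psect, psect) != 0)
            continue;
        if (list[i].kind == SHAREABLE_IMAGE) {
            resolved = list[i].image_address;   /* already fixed */
            if (bound == -1)
                bound = resolved;
        } else {
            if (bound == -1)
                bound = NEW_STORAGE;            /* module seen first */
            resolved = bound;
        }
        if (resolved != bound)
            agree = 0;
    }
    return agree;
}
```

In this model, listing the shareable image before the object module
gives agreement, and listing it after gives two separate storage cells,
which matches the symptom described at the start of these notes.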

So, keeping these things in mind, here are some suggestions for those
developers trying to use and/or create shareable images which contain
PSECT definitions to be seen by code outside that shareable image
(e.g. C extern variables and FORTRAN common blocks).

  1 When linking to such a shareable image, make sure the linker will
    process it before any other modules.  By referencing the image with
    an options file this happens automatically.  If you so choose, you
    can use an options file to impose any ordering you like, just make
    sure the shareable image in question is processed before the modules
    with which it shares PSECTs.

  2 You can create a shareable image that references such PSECTs
    in another shareable image as long as you include the other shareable
    image in the link of the new shareable image and in the link
    of the final executable.  You won't get errors about the missing PSECTs
    if you don't -- though you may get other errors if you reference
    symbols too.  For example, you put your own C subroutines in a
    shareable image and they reference "errno".  In this case, you would
    be sure to include VAXCRTL/SHARE in the link of both the new shareable
    image and the final executable according to rule #1.  Technically,
    this applies even if you didn't reference anything else in VAXCRTL and
    hence would get no errors leaving it out (although it probably doesn't
    matter since you wouldn't be calling the routines with the different
    references).

  3 Remember that nice linker errors such as "multiply defined symbol" and
    "undefined symbol" won't happen with PSECTs.  Beware.

  4 If you wish to replace a symbol in a shareable image with your own
    definition (e.g. replace "getenv" with something of your own) the
    linker has to see your definition first.  Unfortunately, by rule #1
    you can't do that easily and still have PSECTs resolve properly.
    You would have to play games with options files and even then it may
    not be doable if your replacement routine references a PSECT that
    needs to be shared with the shareable image containing the symbol
    you are replacing. [got that?]  Note that replacing something like
    LIB$GET_VM is easy, barring a MULDEF complaint, because link order
    is not important for images that don't try to share PSECTs.

  5 If your code never references the PSECTs (e.g. errno, stdin, etc...)
    directly (i.e. you have no declaration of them in your C module) then
    you will probably not run into these problems since the references within
    the shareable image agree on what they are referencing.
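
As an illustration of the kind of module these rules are about, here is
a sketch of a C routine that references "errno" directly.  It is written
in standard C, the routine name is hypothetical, and MYPROG in the
comment is a placeholder.  The comment shows the usual way of naming
VAXCRTL in an options file so that LINK processes it first, as in rule 1.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Under VAX C, as described above, "errno" is shared with VAXCRTL
 * through an overlaid PSECT, so a module like this one should be
 * linked with the shareable image named in an options file:
 *
 *     $ LINK MYPROG, SYS$INPUT/OPTIONS
 *     SYS$SHARE:VAXCRTL/SHARE
 */

/*
 * Try to open path for reading.  Return 0 on success, otherwise the
 * errno value left behind by the failed fopen().
 */
int open_error_code(const char *path)
{
    FILE *fp;

    assert(path != NULL);
    errno = 0;                  /* direct reference from user code */
    fp = fopen(path, "r");
    if (fp != NULL) {
        fclose(fp);
        return 0;
    }
    return errno;               /* set by the run-time library */
}
```

If the module and the run-time library disagree about where "errno"
lives, a routine like this returns a stale value rather than the error
code the library stored.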

How can this problem be fixed?  Well, one quite clear fix would be to have
LINK be a little smarter and figure out for itself how to resolve several
overlaid PSECTs into the same storage area and complain if for some reason
this cannot be done.  I don't know a thing about the internals of LINK,
so I certainly cannot say whether this is really doable.

It seems to me that, in the case of VAX C anyway, the wrong mechanism is
being used.  If VAX C used global symbols rather than PSECTs to reference
"extern" variables then it would be using the mechanism designed for the
purpose.  Why did they use PSECTs by default anyway?  Dump "globaldef"
and "globalref" and allow the use of an extension to the language for
people who need to create PSECTs (the extension is already there).

Well, congratulations are in order if you made it this far.  Can anyone
out there shed some more light on the subject?  Especially in regard to
why it works this way and/or why things aren't a little more flexible.