[comp.unix.internals] PIC and shared libraries

mohta@necom830.cc.titech.ac.jp (Masataka Ohta) (05/27/91)

Sorry for my misunderstanding about PIC and shared libraries.

In article <8029@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:

>>As I already said, PIC (Position Independent Code) imposes several
>>restrictions to hardware, which many architectures can't obey.
>
>Which architectures?

If the code must be strictly Position Independent, MIPS, HP-PA, RS/6000, and
many other architectures cannot support sharing of PIC, because of virtual
address aliasing and inverted page tables. I am correct here.

The point, however, is that strict PIC is not required to support shared
libraries.

Instead, all that is required is that the same code run when relocated by
a multiple of some constant. To support many libraries at once, the
constant should be small compared to the size of the address space.
For example, on most segmented architectures, Segment number Independent
Code is enough, though reloading segment registers adds extra overhead
to inter-object-module calls.

So, the remaining problem is the performance.

							Masataka Ohta

guy@auspex.auspex.com (Guy Harris) (05/28/91)

>If the code must be strictly Position Independent, MIPS, HP-PA, RS/6000, and
>many other architectures cannot support sharing of PIC, because of virtual
>address aliasing and inverted page tables. I am correct here.

Would you please show some proof for your so-far-unsupported assertion
that you are correct when you say that architectures with virtual
address caches, or inverted page tables, cannot support having the same
physical page mapped to multiple different virtual addresses in
different processes?

I've indicated *several times* how that not only *can* be done, but how
it *is* done on Suns, in the case of virtual address caches:

	ensure that all the virtual addresses get mapped to the same
	cache line by aligning the mappings properly (turn off caching
	in the rare case where that's not possible);

	if you get a cache miss, check whether the cache line actually
	has the same physical address as the virtual address
	that missed and, if so and the process has permission to access that
	virtual address, re-tag the cache line if it's tagged with virtual
	addresses (no need if the cache tags are physical addresses);

and how it is, as I remember, done on RS/6000s in the case of inverted
page tables:

	have a global virtual address space composed of a segment number
	and an offset-within-segment and a per-process virtual address
	space composed of a segment register number and an
	offset-within-segment;

	have the virtual addresses in the inverted page table be
	addresses in the *global* virtual address space, because a given
	page has only one virtual address in *that* space;

	load up the segment registers on a per-process basis to give a
	segment with a given global virtual address different
	per-process virtual addresses in different processes;

but you haven't demonstrated why that doesn't work (probably because, as
proven by the existence of various Sun machines and IBM machines running
OSes that *do* support shared libraries that can be mapped to different
addresses, it's impossible to demonstrate that it doesn't work because
it *does* work!).

lm@slovax.Eng.Sun.COM (Larry McVoy) (05/28/91)

mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
> So, the remaining problem is the performance.

Forgive me for being so dense as to not instantly grasp what it is that
you say.  Perhaps you could enlighten me with a segment of code that
will show a measurable performance difference between a statically
linked and dynamically linked program?  I understand about the extra
startup time, but not about this generic performance problem to which
you elude.

Much thanks,
---
Larry McVoy, Sun Microsystems     (415) 336-7627       ...!sun!lm or lm@sun.com

jim@segue.segue.com (Jim Balter) (05/30/91)

In article <605@appserv.Eng.Sun.COM> lm@slovax.Eng.Sun.COM (Larry McVoy) writes
[in response to Masataka Ohta]:
>I understand about the extra
>startup time, but not about this generic performance problem to which
>you elude.

Elude is right! :-)