[comp.unix.wizards] Data in text segment

jfh@rpp386.Dallas.TX.US (The Beach Bum) (09/26/88)

In article <159@taux02.UUCP> amos@taux02.UUCP (Amos Shapir) writes:
>Thanks to all people who corrected my mistake - separate I/D pdp11 cannot
>support text-segment shared data. It's been a long time since I hacked
>a pdp11 :-(

most modern CPU's can support separate I&D space in some sense by way
of protection bits, etc. in the MMU.  i think Zilog Z8000's support it
more or less directly, like the PDP-11's did.  the rest of the world
has EXECUTE bits in the MMU which (may or may not) inhibit data cycles
to text segments.
-- 
John F. Haugh II (jfh@rpp386.Dallas.TX.US)                   HASA, "S" Division

      "Why waste negative entropy on comments, when you could use the same
                   entropy to create bugs instead?" -- Steve Elias

andrew@frip.gwd.tek.com (Andrew Klossner) (10/05/88)

[]

		"... separate I/D pdp11 cannot support text-segment
		shared data."

	"most modern CPU's can support separate I&D space in some sense
	by way of protection bits, blah blah blah ..."

The point missed by the second poster is that "separate I&D space" is
usually taken to mean two separate but overlapping address spaces.
There's an instruction at text location 1000, and there's a totally
unrelated data item at data location 1000.  In this environment, you
can't put data in the (shared) text segment, because an attempt to
refer to it will refer instead to whatever is at the corresponding
address within the data segment.
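
A minimal sketch of the paragraph above, assuming a hypothetical toy
machine rather than a real PDP-11 emulator.  The array and function
names are invented for illustration.  The two spaces cover the same
range of addresses, and an ordinary load can only ever reach the data
array:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy machine: two 64 KB spaces sharing one address range. */
enum { SPACE_WORDS = 32768 };            /* 64 KB of 16-bit words */

static uint16_t ispace[SPACE_WORDS];     /* instruction (text) space */
static uint16_t dspace[SPACE_WORDS];     /* data space */

/* Only the simulated instruction-fetch path reads ispace. */
uint16_t fetch_instruction(uint16_t addr)
{
    assert((addr & 1) == 0);             /* word accesses are aligned */
    return ispace[addr / 2];
}

/* Every load a program issues lands here, never in ispace. */
uint16_t load_data(uint16_t addr)
{
    assert((addr & 1) == 0);
    return dspace[addr / 2];
}

/* Loader-side helpers (assumed names) that fill the two spaces. */
void poke_text(uint16_t addr, uint16_t value)
{
    assert((addr & 1) == 0);
    ispace[addr / 2] = value;
}

void poke_data(uint16_t addr, uint16_t value)
{
    assert((addr & 1) == 0);
    dspace[addr / 2] = value;
}
```

After poke_text(1000, x) and poke_data(1000, y), load_data(1000) yields
y.  The word stored at text location 1000 is simply not reachable
through a data reference.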

If this were comp.unix.questions, I would say, "If you have never
thought about this concept, take a few moments and do so.  A lot of
interesting lights will dawn."  But of course we're all wizards here ...

  -=- Andrew Klossner   (decvax!tektronix!tekecs!andrew)       [UUCP]
                        (andrew%tekecs.tek.com@relay.cs.net)   [ARPA]

jfh@rpp386.Dallas.TX.US (The Beach Bum) (10/06/88)

In article <10420@tekecs.TEK.COM> andrew@frip.gwd.tek.com (Andrew Klossner) writes:
>		"... separate I/D pdp11 cannot support text-segment
>		shared data."
>
>	"most modern CPU's can support separate I&D space in some sense
>	by way of protection bits, blah blah blah ..."
>
>The point missed by the second poster is that "separate I&D space" is
>usually taken to mean two separate but overlapping address spaces.

Incorrect.  This poster did take that fact into account.  The more
reasonable CPU's have status lines from the CPU which indicate the
nature of the instruction.

For example, the MC68000 supports separate I&D space at the hardware
level by careful use of the FC0, FC1, and FC2 lines.  The truth table
is

	FC2	FC1	FC0	Access Mode
	0	0	1	User Data
	0	1	0	User Program
	1	0	1	Supervisor Data
	1	1	0	Supervisor Program

Thus, the MC68000 can be said to support PDP-11-style separate I&D.
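
As a rough illustration of how external hardware could use the table
above, here is a small decoder in C.  The function names are assumptions
made for this sketch, and codes not listed in the table are reported
as "other" rather than guessed at:

```c
#include <assert.h>

/* fc is the 3-bit value FC2:FC1:FC0, with FC0 as the low bit. */
const char *fc_access_mode(unsigned fc)
{
    assert(fc <= 7u);
    switch (fc) {
    case 1:  return "User Data";
    case 2:  return "User Program";
    case 5:  return "Supervisor Data";
    case 6:  return "Supervisor Program";
    default: return "other (not in the table above)";
    }
}

/* Program reference: FC1 = 1 and FC0 = 0. */
int fc_is_program(unsigned fc)
{
    assert(fc <= 7u);
    return (fc & 3u) == 2u;
}

/* Supervisor reference: FC2 = 1. */
int fc_is_supervisor(unsigned fc)
{
    assert(fc <= 7u);
    return (fc & 4u) != 0;
}
```

An external MMU could, in principle, select between two translation
maps on fc_is_program(fc).  Note that this choice is made by the bus
cycle, not by anything the running program can set.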

>If this were comp.unix.questions, I would say, "If you have never
>thought about this concept, take a few moments and do so.  A lot of
>interesting lights will dawn."  But of course we're all wizards here ...

If this were comp.arch, I would say, go grab a few hardware manuals.
Unfortunately the rest of mine are at the office.  Or packed away in
boxes lying on the floor in the closet. ;-)

It would appear that the 8088/6 supports separate I&D by virtue of
being able to decipher if a reference is in the code segment or not.
I suspect the rest of that family has a similar concept present in
silicon.

Another quick look into the Zilog Z8000 family indicates the information
is present on the ST 0 .. 3 lines.

So it would appear that "overlapping" separate I&D, which is the only
kind I ever thought existed, is the norm for modern CPU's.  Lacking
the desire to flip through my 80386 manual, I suspect that ALL leading
CPU's have some means of deciphering code segment from data segment
references.  And this capability is sufficient to support PDP-11
style separate I&D.

andrew@frip.gwd.tek.com (Andrew Klossner) (10/08/88)

[]

		"The point missed by the second poster is that
		"separate I&D space" is usually taken to mean two
		separate but overlapping address spaces."

	"Incorrect.  This poster did take that fact into account.  The
	more reasonable CPU's have status lines from the CPU which
	indicate the nature of the instruction.  For example, the
	MC68000 supports separate I&D space at the hardware level ..."

The original point was that somebody thought to put unmodified data
into a shared text segment, and it was noted that this wouldn't work on
a machine like the PDP-11/45 using a magic number 411 object file,
"separate I&D".

Yes the hardware has a line to tell the memory whether I or D is being
requested.  But the program has no control over that line!  There's no
general use "load from instruction space" instruction on any
architecture I've looked at, so there's no way for a program to use
data in a separated instruction space.  (Sometimes there's a "load from
user's instruction space," but it's restricted to supervisor mode.)

  -=- Andrew Klossner   (uunet!tektronix!tekecs!frip!andrew)    [UUCP]
                        (andrew%frip.gwd.tek.com@relay.cs.net)  [ARPA]

jerry@olivey.olivetti.com (Jerry Aguirre) (10/20/88)

In article <10440@tekecs.TEK.COM> andrew@frip.gwd.tek.com (Andrew Klossner) writes:
>Yes the hardware has a line to tell the memory whether I or D is being
>requested.  But the program has no control over that line!  There's no
>general use "load from instruction space" instruction on any
>architecture I've looked at, so there's no way for a program to use
>data in a separated instruction space.  (Sometimes there's a "load from
>user's instruction space," but it's restricted to supervisor mode.)

Well, actually there is.  If an instruction references a value that is
PC-relative, then the value is fetched from text space.  This is fairly
easy to do in the hardware and doesn't require adding new instructions
or addressing modes.  I am pretty sure this is the way the PDPs handled
split I/D.

In practice this means that simple constants (values and addresses) were
placed in the text segment.  What you can't do is take or pass the
address of the constants.  There isn't any segment information to go
along with the address to say what segment it is in.  So you can't put
printf strings, the most common shared data hack, in the text area.
(This also made sense in that once you take an address of data you
can't be sure it won't be written to.  So, lacking a keyword to indicate
that, the compiler can't be sure it can be placed in a read only shared
area.)

I saw this demonstrated once in some code I wrote to initialize
something.  (Not on a PDP so I don't know if it would have handled
this.)  I didn't want to use any library calls and I wanted the code to
be small so I wound up writing something like:

	i = 0;
	while (buf[i] = "Initialization string"[i]) i++;

My surprise was that the compiler was smart enough to realize that the
string could never be written to and placed it in the shared read-only
area.  If the hardware had a load PC plus register addressing mode then
the string could be placed in the text area.  Maybe someone with a
PDP11/45 or 70 can compile the above and see what the assembler output
looks like?
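
For anyone who wants to try the experiment, here is a self-contained
version of the loop.  The function name and the size check are
additions for this sketch, not part of the original code.  Where the
literal ends up (text, a read-only data area, or ordinary data) is
decided by the compiler and linker, so the only way to know is to look
at the assembler output, e.g. from cc -S.

```c
#include <assert.h>
#include <stddef.h>

/* Copy the literal into buf one character at a time, terminator
 * included, without any library calls.  Returns the number of
 * characters copied, not counting the terminating NUL. */
size_t init_buffer(char *buf, size_t size)
{
    size_t i = 0;

    /* The destination must hold the string plus its terminator. */
    assert(size >= sizeof "Initialization string");

    /* The literal is only ever indexed, never has its address passed
     * anywhere, so the compiler can see that nothing writes to it. */
    while ((buf[i] = "Initialization string"[i]) != '\0')
        i++;
    return i;
}
```

The explicit comparison against '\0' does the same thing as the
original assignment-as-condition; it just keeps newer compilers from
warning about it.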
			Jerry Aguirre

koll@ernie.NECAM.COM (Michael Goldman) (10/20/88)

This is just to point out that the 8088/8086 does not distinguish
sufficiently between code and data (or stack) segments to make
any real difference in your programming.  I know because I
have overwritten code with data, and my stack has overflowed
into my code and/or data segments at various times (stupid pointers!).

The 80286 and 80386, on the other hand, do have complex but significant
distinctions between code and data spaces (when operating in protected
mode, i.e., not emulating an 8086) and, in fact, have 4 layers of
protection, modeled on UNIX I'm told.  One can designate all kinds
of data/code/stack combos including read-only, read-write, and, for
all I know, write-only.  The layers correspond to driver, kernel,
shell, and user space (my own very rough approximation) with all
sorts of "layer N can call to layer M but only through layer K"
restrictions.  The idea is to put into hardware a lot of UNIX-type
stuff to speed up context switching and enforce system integrity.
OS/2 does not use all the features provided, but I have no knowledge
of whether 286 or 386 UNIX(tm)s do.  It all makes life much more complex
if you like to work in assembly and/or close to the hardware, but
it probably cuts the context-switch timings in half.
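
To make the code/data distinction concrete, here is a sketch that
decodes the access-rights byte of a protected-mode code or data segment
descriptor.  The bit positions are given as I understand them from the
Intel manuals and should be checked there before relying on them; the
struct and function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of one access-rights byte (names are assumptions). */
struct seg_rights {
    int present;     /* bit 7: segment is present in memory */
    int dpl;         /* bits 6-5: privilege level, 0 (most) to 3 (least) */
    int executable;  /* bit 3: 1 = code segment, 0 = data/stack segment */
    int readable;
    int writable;
};

struct seg_rights decode_access_rights(uint8_t ar)
{
    struct seg_rights r;

    /* Bit 4 must be set for ordinary code/data segments.  System
     * descriptors (gates, task state, LDT) are outside this sketch. */
    assert((ar & 0x10) != 0);

    r.present    = (ar >> 7) & 1;
    r.dpl        = (ar >> 5) & 3;
    r.executable = (ar >> 3) & 1;
    if (r.executable) {
        r.readable = (ar >> 1) & 1;  /* code: execute-only or execute/read */
        r.writable = 0;              /* code segments are never writable */
    } else {
        r.readable = 1;              /* data segments are always readable */
        r.writable = (ar >> 1) & 1;  /* data: read-only or read/write */
    }
    return r;
}
```

Under this reading, the combinations are execute-only, execute/read,
read-only and read/write; no encoding yields a write-only segment.  The
two-bit privilege field is where the four layers come from.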

"I have some advice for you, son on going your own way in life.
 Wait, where are you going ?"

Regards,
Michael Goldman

henry@utzoo.uucp (Henry Spencer) (10/22/88)

In article <31029@oliveb.olivetti.com> jerry@olivey.UUCP (Jerry Aguirre) writes:
>... If an instruction references a value that is PC-relative,
>then the value is fetched from text space.  This is fairly easy
>to do in the hardware and doesn't require adding new instructions or
>addressing modes.  I am pretty sure this is the way the PDPs handled
>split I/D.

Nope, sorry, wrong.  Split-space pdp11s fetched from instruction space
only when the thing being fetched was logically part of the instruction
stream.  PC-relative fetches went to data space.  There was absolutely
no way to read data from text space.
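
A toy model of the rule just described, assuming a hypothetical
simulator (this is not real PDP-11 emulation, and all names are
invented for the sketch).  Words that are part of the instruction
stream, such as an immediate operand or the index word of a PC-relative
reference, come from instruction space; the operand that the
PC-relative address finally points at comes from data space:

```c
#include <assert.h>
#include <stdint.h>

enum { SPACE_WORDS = 32768 };            /* 64 KB of 16-bit words */

static uint16_t ispace[SPACE_WORDS];     /* instruction space */
static uint16_t dspace[SPACE_WORDS];     /* data space */

struct cpu { uint16_t pc; };

/* Loader-side helpers (assumed names). */
void store_text(uint16_t addr, uint16_t v) { ispace[addr / 2] = v; }
void store_data(uint16_t addr, uint16_t v) { dspace[addr / 2] = v; }

/* Next word of the instruction stream: always instruction space. */
static uint16_t fetch_word(struct cpu *c)
{
    uint16_t w;

    assert((c->pc & 1) == 0);
    w = ispace[c->pc / 2];
    c->pc = (uint16_t)(c->pc + 2);
    return w;
}

/* Immediate operand: the operand itself is in the instruction stream. */
uint16_t operand_immediate(struct cpu *c)
{
    return fetch_word(c);
}

/* PC-relative operand: the index word is fetched from instruction
 * space, but the operand at (updated PC + index) is read from data
 * space. */
uint16_t operand_pc_relative(struct cpu *c)
{
    uint16_t index = fetch_word(c);
    uint16_t ea = (uint16_t)(c->pc + index);

    assert((ea & 1) == 0);
    return dspace[ea / 2];
}
```

So a constant assembled into the text segment right after the code is
invisible to a PC-relative reference: the effective address is computed
correctly, and then looked up in the wrong space.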

>My surprise was that the compiler was smart enough to realize that the
>string could never be written to and placed it in the shared readonly
>area...

Actually, the compiler was probably assuming this rather than discovering
it.  K&R says that strings are modifiable, but many people disliked this
and felt it should have been otherwise.  Some of them wrote compilers.
-- 
The meek can have the Earth;    |    Henry Spencer at U of Toronto Zoology
the rest of us have other plans.|uunet!attcan!utzoo!henry henry@zoo.toronto.edu