root@hobbes.UUCP (08/31/87)
+---- Herman Rubin writes in <572@l.cc.purdue.edu> ----
| +---- Ed Gould writes ----
| | It's also not legal in the proposed ANSI C standard.  Pointers
| | may be subtracted *only* if they point to members of the same
| | array of elements.
| +----
| You have no way of knowing how I can use the power of the machine; I may
| very well find a new way of doing some things tomorrow that I do not see
| today.  Let us remove unnecessary restrictions from the languages.
+----

*** The following is only valid on Intel 808x architecture machines ***
    Followups are directed to comp.sys.intel

On the Intel chips (and I'm sure on many others) some compilers' malloc()
routines align memory requests on 16-byte boundaries.

So, if you did:                 You might get:
                                         _________
  char *p1, *p2; long p3;               /________/|
  p1 = malloc(20);              p1 -->  |20 bytes||
  p2 = malloc(20);                      +--------+/
  p3 = p2 - p1;                          _________
                                        /________/|
                                filler  |? bytes ||
                                        +--------+/
                                         _________
                                        /________/|
                                p2 -->  |20 bytes||
                                        +--------+/

and p2 - p1 would NOT give you a useful number!  THAT is why ANSI said that
the result was undefined.  Not illegal, just undefined.  This means that
compiler writers can do stuff like this without having to worry about
breaking code.

Iff you know what your compiler does AND iff you don't care about
portability then you can use the info like this:

    printf("On this machine there are %lu bytes of filler between p1 and p2\n",
           ((unsigned long)p2 - (unsigned long)p1) - 20);

or somesuch.  (This code WILL NOT WORK on Intel chips.  See below.)

-- New Subject: pointer manipulation on intel chips --

Note: This DOES NOT pertain to the usual "*(a+3)" or "if (p1 == p2)" stuff
which is called "pointer arithmetic" or "pointer manipulation" in languages
like C.  It instead refers to "dissecting" the value of "&foobar".  This
comes in when you wish to do things like the p3 = p2 - p1; above, where p1
and p2 point to different aggregates.  The C compiler already takes care of
the first cases for you.
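A minimal sketch of the "filler" measurement above, in present-day C and
assuming a flat address space (so NOT the 808x large model).  The helper
name byte_distance is made up for illustration; uintptr_t comes from
<stdint.h>, and what you get from converting a pointer to it is
implementation-defined, so treat the number as a curiosity and not as
something portable code may rely on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Distance in bytes between two objects, computed on integer copies of
 * the pointers instead of with p2 - p1 (which is undefined when the
 * pointers refer to different arrays).
 * ASSUMPTION: flat memory, where the integer value of a pointer is its
 * linear address.  This does not hold for 808x SEGMENT:OFFSET pointers.
 */
uintptr_t byte_distance(const void *a, const void *b)
{
    assert(a != NULL && b != NULL);

    uintptr_t ua = (uintptr_t)a;
    uintptr_t ub = (uintptr_t)b;

    /* Return the magnitude so the order of the arguments does not matter. */
    return ua > ub ? ua - ub : ub - ua;
}
```

On such a machine, byte_distance(p1, p2) - 20 would be the filler between
the two malloc() results from the example, if the allocator happened to
place them back to back.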
If you wish to do pointer manipulation on the Intel 808x chips you need to
recognize how a pointer is constructed:

A pointer has 2 parts, a SEGMENT and an OFFSET, each 16 bits in length.
    e.g.:      1040:3333
            SEGMENT:OFFSET

In the "small" model, the SEGMENT is an unchanging value stored in a register
and the OFFSET is what is used as a "pointer" in C.
In the "large" model, a pointer consists of a 32 bit structure which contains
two 16 bit values, the SEGMENT and the OFFSET.

The SEGMENT and the OFFSET are combined to make a 20 bit address like this:

    SEGMENT [0001|0000|0100|0000]           0x1040
    OFFSET       [0011|0011|0011|0011]      0x3333
            --------------------------
    ADDRESS [0001|0011|0111|0011|0011]      0x13733

    1040:3333 or
    1000:3733 or
    1001:3723 or    Note: a pointer may have many
    1002:3713 or    values and still point to the same thing!
       ...    or
    1373:0003

To convert the pointer 1040:3333 to an unsigned long address we use the
formula (SEGMENT * 16) + OFFSET to get:

    (0x1040 * 16) + 0x3333 = 0x00013733

Note: even though a pointer may have many values, it has only ONE address!
On the 808x chips this is a physical ADDRESS, but NOT a valid POINTER.
Note that in this discussion, pointers are not addresses and addresses are
not pointers!

Two addresses may be subtracted to obtain a valid number which is the
absolute difference (in bytes) of their physical locations.

An address may be converted into a normalized pointer by constructing a
SEGMENT:OFFSET pair where the lower 12 bits of the SEGMENT are ZERO.

    segment = (unsigned short)((address & 0x000F0000) / 16);
    offset  = (unsigned short)(address & 0x0000FFFF);

(The mask-and-divide has to be parenthesized as a whole; casting to
unsigned short before dividing would throw away the very bits being kept.)

Only pointers which A) are normalized, or B) have the same SEGMENT value
can be validly compared for equality.  All addresses can be validly compared
for equality.

Intel bashing flames should go to /dev/null, glaring errors should be
emailed.  Minor errors should be ignored.

-- 
John Plocher uwvax!geowhiz!uwspan!plocher  plocher%uwspan.UUCP@uwvax.CS.WISC.EDU
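
The SEGMENT:OFFSET arithmetic in the post above can be sketched in portable
C by modelling a far pointer as a plain struct.  The names far_ptr,
to_address, normalize and same_target are made up for illustration and are
not any compiler's API; the 20-bit mask assumes an 8086/8088 with 20
address lines.

```c
#include <assert.h>
#include <stdint.h>

/* A real-mode far pointer modelled as two 16-bit halves (hypothetical type). */
struct far_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* (SEGMENT * 16) + OFFSET, wrapped to 20 bits (assumes an 8086/8088). */
uint32_t to_address(struct far_ptr p)
{
    return (((uint32_t)p.segment << 4) + p.offset) & 0xFFFFFu;
}

/* Normalize in the post's sense: the low 12 bits of SEGMENT are zero. */
struct far_ptr normalize(uint32_t address)
{
    struct far_ptr p;

    assert(address <= 0xFFFFFu);   /* only 20-bit addresses are meaningful */
    p.segment = (uint16_t)((address & 0xF0000u) >> 4);
    p.offset  = (uint16_t)(address & 0x0FFFFu);
    return p;
}

/* Different SEGMENT:OFFSET pairs alias iff their addresses are equal. */
int same_target(struct far_ptr a, struct far_ptr b)
{
    return to_address(a) == to_address(b);
}
```

With these, 1040:3333 maps to address 0x13733, normalize(0x13733) yields
1000:3733, and same_target() reports that 1040:3333 and 1373:0003 refer to
the same byte even though the pairs compare unequal field by field.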
gnu@hoptoad.UUCP (09/07/87)
root@hobbes.UUCP (John Plocher) wrote:
> -- New Subject: pointer manipulation on intel chips --
> A pointer has 2 parts, a SEGMENT and an OFFSET, each 16 bits in length.
>     e.g.:      1040:3333
>             SEGMENT:OFFSET
> An address may be converted into a normalized pointer by constructing a
> SEGMENT:OFFSET pair where the lower 12 bits of the SEGMENT are ZERO.
>     segment = (unsigned short)((address & 0x000F0000) / 16);
>     offset  = (unsigned short)(address & 0x0000FFFF);

I think it's great :-) that people are teaching folks how to write
programs for the 8086 that will break when recompiled for the 80386.
There's nothing like software foolishness to break even the
best-implemented hardware compatibility...
-- 
{dasys1,ncoast,well,sun,ihnp4}!hoptoad!gnu       gnu@postgres.berkeley.edu
My name's in the header where it belongs.