[comp.lang.modula2] Pointers and whatnot

BOTCHAIR@UOGUELPH.BITNET (Alex Bewley) (08/22/87)

    (I hope this gets through)...

    I am having a problem with pointers and storage allocation.  When one
usually is using dynamic storage one knows how much space to allocate (as in
the following example, I believe it is syntactically correct).
...
TYPE
    PtrType = POINTER TO DataType;
    DataType = RECORD
                 Text : ARRAY [0..80-1] OF CHAR;
                 next : PtrType;
               END;

VAR
   Ptr : PtrType;

BEGIN
  NEW(Ptr); (* or ALLOCATE(Ptr, TSIZE(DataType); *)
  Assign(Ptr^.Text,"This is some text...");
END;...


   In that above case, Text is 80 chars, but suppose the length of Text varies.
How would I be able to change the type?

   Or suppose I just want 137 bytes of storage of no particular type, can one
have just a POINTER with no TO?  According to my compiler, no (Logitech
Modula-2/86).

   I hope I am not being obscure or stupid, but your help would be appreciated.

Alex Bewley
NetNorth: BOTCHAIR@UOGUELPH

| We don't understand it, let's give it to the programmers... |

paul@vixie.UUCP (Paul Vixie Esq) (08/22/87)

In article <8708220025.AA10833@cayuga.cs.rochester.edu> BOTCHAIR@UOGUELPH.BITNET (Alex Bewley) writes:
#TYPE PtrType = POINTER TO DataType;
#     DataType = RECORD
#                 Text : ARRAY [0..80-1] OF CHAR;
#                 next : PtrType;
#               END;
#
#VAR  Ptr : PtrType;
#
#BEGIN
#  NEW(Ptr); (* or ALLOCATE(Ptr, TSIZE(DataType); *)
#  Assign(Ptr^.Text,"This is some text...");
#END;...
#
#   In that above case, Text is 80 chars, but suppose the length of Text
#varies. How would I be able to change the type?

You could use something like

TYPE DataType = RECORD
	Text : POINTER TO CHAR;
	next : PtrType;
     END;

BEGIN
  NEW(Ptr);  ALLOCATE(Ptr.Text, strlength("what you want to put there"));
  Assign(Ptr^.Text^,...

You would have to deallocate in stages, also.  This is an old trick in C...
It ought to work fine for M2, although since arrays and pointers are not the
same thing in M2, you wound't be able to pass Ptr^.Text^ to a procedure that
was expecting ARRAY OF CHAR.  You could have the procedure expect POINTER TO
CHAR, but I'm unclear on whether you could then pass ARRAY OF CHAR params to
it without some sort of a type-cast.

#   Or suppose I just want 137 bytes of storage of no particular type, can one
#have just a POINTER with no TO?  According to my compiler, no (Logitech
#Modula-2/86).

Wirth recommends POINTER TO WORD for such things; this is supposed to be
type-compatible with all pointers.
-- 
Paul A. Vixie, Esq.		   "A viler evil than to throw a man into a
paul%vixie@uunet.uu.net		    sacrificial furnace, is to demand that he
{uunet,ptsfa,hoptoad}!vixie!paul    leap in, of his own free will, and that he
San Francisco, (415) 647-7023	    build the furnace, besides."  (Ayn Rand)

ronald@csuchico.EDU (Ronald Cole) (08/24/87)

In article <8708220025.AA10833@cayuga.cs.rochester.edu>, BOTCHAIR@UOGUELPH.BITNET (Alex Bewley) writes:
>    Or suppose I just want 137 bytes of storage of no particular type, can one
> have just a POINTER with no TO?  According to my compiler, no (Logitech
> Modula-2/86).

Alex,
	Isn't that the reason for the WORD type (i.e. POINTER TO WORD)?
Then just use an appropriate cast.

-- 
Ronald Cole				| uucp:     ihnp4!csun!csuchic!ronald
AT&T 3B5 System Administrator		| PhoneNet: ronald@csuchico.edu
@ the #_1_ party school in the nation:	| voice     (916) 895-4635
California State University, Chico	"It's O.K." -Hal Landon Jr., Eraserhead

nagler@olsen.UUCP (Robert Nagler) (08/24/87)

    >In that above case, Text is 80 chars, but suppose the length of Text varies.
    >How would I be able to change the type?

    >Or suppose I just want 137 bytes of storage of no particular type, can one
    >have just a POINTER with no TO?  According to my compiler, no (Logitech
    >Modula-2/86).

Sorry, there are no dynamic arrays in Modula-2.  This is a fundamental
problem with the language.  I have seen that the BSI language committee
has been talking about a new type of "String", but this is not a general
solution to supporting dynamically defined bounds for array types.

There are several ways around your problem:
    1) Use another language which supports real dynamic arrays.
    2) Go for the C-style management of arrays as POINTER TO CHAR and
       increment the pointer yourself to get the desired value.
    3) Declare BigArray = POINTER TO ARRAY [ 0 .. MAX( CARDINAL ) ] OF CHAR.
       When you want to use a variable of that type, you just partially
       allocate the array, e.g. ALLOCATE( p, theAmountIWant ).  However,
       you must be careful to do your own range checking.
    4) Allocate an array which is "big enough", but just remember that 
       "big enough" is never "big enough"!
    
After using M2 for several years, I find approach (3) is the best when
coupled with an opaque interface.  Here is an example:
    DEFINITION MODULE Vectors;
    TYPE
	Object;	
    PROCEDURE Create( vector : Object;
		      length : CARDINAL );
    PROCEDURE Destroy( vector : Object );
    PROCEDURE Get( vector : Object;
		   index  : CARDINAL )
			  : REAL;
    PROCEDURE Set( vector   : Object;
		   index    : CARDINAL;
		   newValue : REAL );
    END Vectors.

The implementation of this module does type checking, thus if an
index is out of bounds, the program crashes (just as if it were 
caught by the run-time system).   Here is what a Vectors.Object looks
like:
    IMPLEMENTATION MODULE Vectors; 
    CONST
	(* The following line is system dependent *)
	maxHigh   = ( SysTypes.MAXCARDINAL DIV SysTypes.bitsPerREAL
		      * SysTypes.bitsPerBYTE ) - 2;
    TYPE
	Object    = POINTER TO ObjectRec;
	ObjectRec = RECORD
	    high : CARDINAL;
	    data : POINTER TO ARRAY [ 0 .. maxHigh ] OF REAL;
	END;

The implementation of Get looks like:
    WITH vector^ DO
	IF index > high THEN
	    ProgErrs.Terminate( "Vectors: range error" );
	END;

	RETURN data^[ index ];
    END; (* WITH *)

The only problems you have is in being careful that the allocation of
"data" is done correctly.  It has been our experience that naive users
will often make mistakes implementing such modules.  However, once 
implemented, the module provides a very clean interface to dynamic 
length objects (whatever they may be, in this case, Vectors of REALs).

There are a couple of points here.  First, we have a fairly advanced
debugging environment which includes an "object allocator".  We don't
usually have problems catching improperly allocated memory areas.  
Second, this method is not a panacea, that is, the general "string"
problem will only be solved with a language modification (I believe).
Strings are so critical to most programs that we did not implement
an object oriented Strings interface.  Instead, we have chosen solution
(4) in combination with some much higher level functions which do
use dynamic allocation of strings.

Applications like Vectors turn out to be faster than just plain
ol' arrays when accompanied by higher level functions, e.g. ComputeFx, 
Multiply, Add, Subtract, ...  We have even found a significant speedup 
after introducing this type of abstraction (as one would expect!).

Rob

BOTCHAIR@UOGUELPH.BITNET (Alex Bewley) (08/25/87)

   Many thanks to those who replied, you have been a great help!  But... more
questions...

   In the message Rob sent, he had a module called SysTypes.  What compiler is
he using?  Is this a standard module ('cause I don't have it).  I can see how
useful it would be for portability reasons.

        Alex

dcw@doc.ic.ac.uk (Duncan C White) (08/25/87)

In article <748@vixie.UUCP> paul@vixie.UUCP (Paul Vixie Esq) writes:
>In article <8708220025.AA10833@cayuga.cs.rochester.edu>...
>BOTCHAIR@UOGUELPH.BITNET (Alex Bewley) writes:
>#TYPE PtrType = POINTER TO DataType;
>#     DataType = RECORD
>#                 Text : ARRAY [0..80-1] OF CHAR;
>#                 next : PtrType;
>#               END;
>#
>#VAR  Ptr : PtrType;
>#
>#BEGIN
>#  NEW(Ptr); (* or ALLOCATE(Ptr, TSIZE(DataType); *)
>#  Assign(Ptr^.Text,"This is some text...");
...
>#   In that above case, Text is 80 chars, but suppose the length of Text
>#varies. How would I be able to change the type?
>
>You could use something like
>
>TYPE DataType = RECORD
>	Text : POINTER TO CHAR;
>	next : PtrType;
>     END;
>
>BEGIN
>  NEW(Ptr);  ALLOCATE(Ptr.Text, strlength("what you want to put there"));
>  Assign(Ptr^.Text^,...
>

I'm afraid this is incorrect, Paul: in Modula-2, POINTER TO CHAR is precisely
what it says: a pointer to a SINGLE character.  Not a pointer to an array of
characters, or anthing else...

C, on the other hand, makes no distinction between POINTER TO CHAR,
ARRAY OF CHAR, and [to an extent] POINTER TO ARRAY OF CHAR..

Maybe the confusion arises due to the fact that the ALLOCATE procedure,
being nothing but a library procedure, will happily let you:

	ALLOCATE( Ptr^.Text, 10 );

[ Despite the fact that the size of Ptr^.Text is one byte... not 10 words ]

However, doing such an operation makes little sense, since you can only
use

	Ptr^.Text^

which is the first character of that block of 10 characters... the remaining
nine characters can only be accessed by exceedingly messy type conversions,
which involve type casting the pointer into a pointer to some other type [an
array or record] and then accessing the fields or elements through the
fiddled pointer...

One possible answer to Alex's problem could be:

TYPE
	HugeStrPtr = POINTER TO ARRAY [ 0..MaxInt ] OF CHAR;

	DataType = RECORD
		Text : HugeStrPtr;
		next : PtrType;
	END;

VAR
	Ptr : POINTER TO DataType;

BEGIN
	NEW(Ptr);
	HugeAssign( Ptr^.Text, "what you want to put there" );
...

Unfortunately, you then have to define HugeAssign yourself: perhaps as:

	PROCEDURE HugeAssign( VAR H : HugeStrPtr; S : ARRAY OF CHAR );

	VAR length, posn : INTEGER;

	BEGIN
		length := strlength( S );
		ALLOCATE( H, length );
		FOR posn := 1 TO length DO
			H^[posn] := S[posn-1];
		END;
		H^[ 0 ] := CHR( length );
		(* if you want strings longer than 256, two characters
		 * would have to be used for the length count...
		 *)
	END HugeAssign;

Offhand, I can't think of a better way to represent strings if you really
care about saving space [otherwise, I guess you just make an artibrary upper
limit : eg. define String = ARRAY[ 1..200 ] OF CHAR and ignore the wastage..]

Paul Vixie continues [quoting Alex:]
>#   Or suppose I just want 137 bytes of storage of no particular type, can one
>#have just a POINTER with no TO?  According to my compiler, no (Logitech
>#Modula-2/86).
>
>Wirth recommends POINTER TO WORD for such things; this is supposed to be
>type-compatible with all pointers.

Well, Alex, I guess it depends whether you really want 137 bytes of storage -
a byte is a well-understood concept, so 137 bytes of storage is just:

	ARRAY [ 1..137] OF SYSTEM.BYTE

or whether you truly want a 'blob' of memory of no particular type, in which
case you do:

	VAR X, Y : ADDRESS;	(* same as POINTER TO WORD *)

	BEGIN
		ALLOCATE( X, 137 );	(* note: 137 words, not bytes *)

If you do this, however, you must use expressions like:

	Y := X + 50;
	... Y^ ...

to access the 50th byte [or word on some systems] of X.

>-- 
>Paul A. Vixie, Esq.                  "A viler evil than to throw a man into a
>paul%vixie@uunet.uu.net              sacrificial furnace, is to demand that he
>{uunet,ptsfa,hoptoad}!vixie!paul     leap in, of his own free will, and that he
>San Francisco, (415) 647-7023	      build the furnace, besides."  (Ayn Rand)

-----------------------------------------------------------------------------
JANET address : dcw@uk.ac.ic.doc| Snail Mail :  Duncan White,
--------------------------------|               Dept of Computing,
  This space intentionally      |               Imperial College,
  left blank......              |               180 Queen's Gate,
  (paradoxical excerpt from     |               South Kensington,
  IBM manuals)                  |               London SW7
----------------------------------------------------------------------------
Tel: UK 01-589-5111 x 4982/4991
----------------------------------------------------------------------------

chris@mimsy.UUCP (Chris Torek) (08/25/87)

In article <520@ivax.doc.ic.ac.uk> dcw@doc.ic.ac.uk (Duncan C White) writes:
>in Modula-2, POINTER TO CHAR is precisely what it says: a pointer to
>a SINGLE character.  Not a pointer to an array of characters, or
>anthing else...
>
>C, on the other hand, makes no distinction between POINTER TO CHAR,
>ARRAY OF CHAR, and [to an extent] POINTER TO ARRAY OF CHAR..

This is not quite true.  The key difference between the Modula-2
conception of a pointer to a character and that of C is that C
implicitly states that that character lives somewhere within a
flat address space that may encompass multiple characters.  If
there are such additional characters, they may be found by using
pointer arithmetic on the given pointer.

These characters will exist if and only if the character to which
the pointer points is in fact one element of an array of characters,
or is contained within a region of characters allocated by the
runtime system.

>Maybe the confusion arises due to the fact that the ALLOCATE procedure,
>being nothing but a library procedure, will happily let you:
>
>	ALLOCATE( Ptr^.Text, 10 );
>
>[ Despite the fact that the size of Ptr^.Text is one byte... not 10 words ]

If ALLOCATE will allocate a contiguous group of words, Modula-2
also makes a similar implication, although in this case it is
weaker.  C's pointer types rule out sparse arrays, for example,
while Modula-2's probably do not.  Another thing to consider is
that C's arrays are (or are allowed to be) compile-time entities:
Once the code is generated, C no longer deals in arrays.  All arrays
do is carry sizes around during compilation.  If a language has
both conformant arrays and dynamic array allocation, that language
perforce has run-time arrays.  I am not familiar enough with Modula-2
to know whether this is the case.

[Finally, in his signature:]
> This space intentionally left blank......
> (paradoxical excerpt from IBM manuals)

(We once used `This page accidentally left blank'.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	seismo!mimsy!chris

nagler@olsen.UUCP (Robert Nagler) (08/25/87)

   >In the message Rob sent, he had a module called SysTypes

The module SysTypes is part of the library we are using here at
Olsen & Associates.  As with many other organizations, we found
the Wirth library to be useless.  The library is quite different from
the BSI proposal recently distributed, but we find it useful nonetheless.
If people are interested, I will distribute the module SysTypes
(about a 6K defmod).  Frankly, I am surprised to see the BSI
library does not contain such a module, since it is hardly a new idea.

Rob
olsen!nagler@seismo.css.gov

alan@pdn.UUCP (Alan Lovejoy) (08/27/87)

In article <520@ivax.doc.ic.ac.uk> dcw@doc.ic.ac.uk (Duncan C White) writes:
>I'm afraid this is incorrect, Paul: in Modula-2, POINTER TO CHAR is precisely
>what it says: a pointer to a SINGLE character.  Not a pointer to an array of
>characters, or anthing else...

>[suggests using ALLOCATE(pointerToChar, stringSize)]
>However, doing such an operation makes little sense, since you can only
>use

   [pointerToChar^]	

>which is the first character of that block of 10 characters... the remaining
>nine characters can only be accessed by exceedingly messy type conversions,
>which involve type casting the pointer into a pointer to some other type [an
>array or record] and then accessing the fields or elements through the
>fiddled pointer...

The n'th byte is easily extracted by:

  cp := ADDRESS(pointerToChar) + n - 1;
  cp^;

And for iterating over the string:

  VAR cp: POINTER TO CHAR; endCp: ADDRESS;
  ...
  cp := pointerToChar;
  endCp := ADDRESS(cp) + stringSize;
  WHILE ADDRESS(cp) <= endCp DO
    process(cp^);
    cp := ADDRESS(cp) + 1;
  END;


This should actually be faster than using array indices, because
you'll get machine code such as "MOVE.B (An),<destination>",
instead of "MOVE.B 0(An,Dn)" (using m680x0 code as an example),
and a good optimizer should be able to discover the
"MOVE.B (An)+,<destination>" optimization (combining the access 
with the pointer increment).

Alan "String.Compare, when recoded to use this technique, ran more
      than twice as fast on my system" Lovejoy