[comp.lang.icon] Icon additions

gudeman@ARIZONA.EDU ("David Gudeman") (09/15/89)

[I was going to reply directly, but I had doubts about the return
address.  Frank, you should include a return address in your
.signiture file, or maybe add a Reply-To: to your mail header.]

Var params require a new _internal_ data type, but they don't really
require a new user-accessible one.  If you want to go all the way and
add C-style pointers, you have to consider things like what operations
will be allowed.  For example, in C you can add an integer to a
pointer to get a new pointer.  There is no range check because the C
philosophy puts the responsibility for such things on the programmer
and because C models memory as one huge continuous array.  Icon models
memory as a set of independent locations with variable (changing)
sizes, so if you want pointer addition, you have to come up with some
reasonable Iconesque semantics.

Assume that ^expr returns a pointer to the variable produced by expr.
Then {ls := list(10); ptr := ls[3]} creates a list, and makes ptr a
pointer into the list.  Presumably, ptr + 3 returns a pointer to
ls[6].  What does ptr + 9 do?  It would be terribly non-Iconish for it
to produce a random memory location.  I can think of two
possibilities: (1) pointers wrap around in a list, so ptr + 9 produces
a pointer to ls[2], or  (2) ptr + 9 fails, just like ls[3+9] would.

Here's another consideration: what happens after you do a push(ls, x)?
Does ptr still point to ls[3], or does it now point to ls[4]?  All of
this is relatively simple, when you add pointers to table elements
things get complicated...

Multiple dimensioned arrays would be useful, and I can see why you
want them static-sized (to avoid the semantic complexity of APL) , but
I think you are exaggerating the performance penalty of Icon's
variable-sized lists.  If you create a list with the list() function
and never extend it, there is very little difference between access
time of the list and that of a fixed-size array.  In fact, I would be
surprised if you could measure any performance difference by timing
programs.  And the list() only uses about 10 more words of memory than
a fixed-size array would need.  Unless you have hundreds of very small
arrays, this shouldn't be a problem.

vicc@unix.cie.rpi.edu (VICC Project (Rose)) (09/15/89)

In article <8909142135.AA18591@megaron.arizona.edu>, gudeman@ARIZONA.EDU ("David Gudeman") writes:
> [I was going to reply directly, but I had doubts about the return
> address.  Frank, you should include a return address in your
> ..signiture file, or maybe add a Reply-To: to your mail header.]

I wasn't thinking about that. We have a central news server here which actually
submits the articles. One of the problems is you can not cancel an article.

> 
> Var params require a new _internal_ data type, but they don't really
> require a new user-accessible one.  If you want to go all the way and
> add C-style pointers, you have to consider things like what operations
> will be allowed.  For example, in C you can add an integer to a
> pointer to get a new pointer.  There is no range check because the C
> philosophy puts the responsibility for such things on the programmer
> and because C models memory as one huge continuous array.  Icon models
> memory as a set of independent locations with variable (changing)
> sizes, so if you want pointer addition, you have to come up with some
> reasonable Iconesque semantics.

If I implemented a user-accessible pointer type I would implement it in a more
pascalish way. You can get a pointer to a variable, copy a pointer, or get
a pointer to newly allocated space. I think anything more (such as c type
operations) would stretch Icon's structure a bit too far (at least for my
tastes)
> 
> Assume that ^expr returns a pointer to the variable produced by expr.
> Then {ls := list(10); ptr := ls[3]} creates a list, and makes ptr a
> pointer into the list.  Presumably, ptr + 3 returns a pointer to
> ls[6].  What does ptr + 9 do?  It would be terribly non-Iconish for it
> to produce a random memory location.  I can think of two
> possibilities: (1) pointers wrap around in a list, so ptr + 9 produces
> a pointer to ls[2], or  (2) ptr + 9 fails, just like ls[3+9] would.
> 
> Here's another consideration: what happens after you do a push(ls, x)?
> Does ptr still point to ls[3], or does it now point to ls[4]?  All of
> this is relatively simple, when you add pointers to table elements
> things get complicated...

As far as I can see, it would have to now point to ls[4], it would take
a lot of work to change all of the pointers that might exist to ls[3].
This of course is one place that pointers will get messy with Icon even if
operations are not defined on pointers.

> 
> Multiple dimensioned arrays would be useful, and I can see why you
> want them static-sized (to avoid the semantic complexity of APL) , but
> I think you are exaggerating the performance penalty of Icon's
> variable-sized lists.  If you create a list with the list() function
> and never extend it, there is very little difference between access
> time of the list and that of a fixed-size array.  In fact, I would be
> surprised if you could measure any performance difference by timing
> programs.  And the list() only uses about 10 more words of memory than
> a fixed-size array would need.  Unless you have hundreds of very small
> arrays, this shouldn't be a problem.

I see that for a single dimensioned array, but what about a multi-dimensioned
array with lists, is not that a list of lists, in which case you have that
extra 10 words of overhead for each sub list? Also you have the time of an
extra list lookup as opposed to a multiplication. I think you are right though,
now that I think about it the speed improvement probably wont be much, but
I can see some space improvement by avoiding the list of lists. Of course
the programmer could also do the multiplication himself.

The other thing I hope to be able to do is to allow arbitrary lower bounds
which will improve the readability of some types of algorithms. Another
advantage of the array would be that copy would copy all the elements, not
just the first dimension.

Another reason to play with arrays is that I believe they will be a relatively
simple implementation which should be a good practice start. Pointers on the
other hand obviously have many ramifications which must be examined.

One other question, is it correct that to add an operator one will have to
alter ICONT?

Frank Filz
Center For Integrated Electronics
Rensselaer Polytechnic Institute
vicc@unix.cie.rpi.edu

dscargo@CIM-VAX.HONEYWELL.COM ("DAVE CARGO") (09/15/89)

There are three things of various levels of complexity that I'd like to see
added to Icon.

1) A bit-string type similar in underlying mechanisms to csets.  (In other
words a 256-bit limit is probably acceptable.  The same operations available
on csets should be provided.  Conversions should be available between bit
strings and strings, and possibly between bit strings and csets.  I don't
see any good way of representing bit string constants.  Translating them
to strings could be done in a combination of ways:  to a string of binary,
octal, or hex digits, and padded to full length or truncated (on the left
or right) if the remainder is zero filled.  I'm inclined to use Icon for
some graphics applications and true bit-strings would be very helpful.

2) Formatted translation from bytestrings to integers.  Given a specification
about width and byte order, translate a string of bytes into a list of
integers.  Useful for cracking some arbitrary file internal formats (like
TIFF and PCX).

3) POSIX compatibility.  In particular, if there are screen and keyboard
interfaces specified (or drafted), make them accessible to Icon so that
portable user interfaces for applications can be developed.  In the MS-DOS
world alone, support for ANSI or lower-level access to video memory might
be nice.

dsc

vicc@unix.cie.rpi.edu (VICC Project (Rose)) (09/15/89)

In article <8909151357.AA03203@megaron.arizona.edu>, dscargo@CIM-VAX.HONEYWELL.COM ("DAVE CARGO") writes:
> There are three things of various levels of complexity that I'd like to see
> added to Icon.

> 1) A bit-string type similar in underlying mechanisms to csets.  (In other
> words a 256-bit limit is probably acceptable.  The same operations available

This woud seem easy to do, perhaps without the need for a new data type. A few
functions would go a long way.

> 2) Formatted translation from bytestrings to integers.  Given a specification
> about width and byte order, translate a string of bytes into a list of
> integers.  Useful for cracking some arbitrary file internal formats (like
> TIFF and PCX).

Well there is ord and char, but yes, a few additional functions would be
nice.

> 3) POSIX compatibility.  In particular, if there are screen and keyboard
> interfaces specified (or drafted), make them accessible to Icon so that
> portable user interfaces for applications can be developed.  In the MS-DOS
> world alone, support for ANSI or lower-level access to video memory might
> be nice.

I have some ideas on this that I mentioned a few posts back. Basically the
idea would be to implement a function which has a string parameter specifying
the function call, and a variable number of parameters to be passed. The reason
I would have a single Icon function, is primarily to avoid generating too
many function names. Also searches for use of these I/O routines would be
easy since you would look for 1 function, not 30 or more. A system variable
such as &sysfuncs or so would be a generator like &features. There would be
a single addition to &features list.

On MS-DOS I would map these to DOS and BIOS calls. ANSI calls might be nice
but I think most MS-DOS users these days are PC BIOS compatible. (If I am
wrong let me know) The big problem will be in assigning a reasonably standard
list of I/O call names.

Frank Filz
Center For Integrated Electronics
Rensselaer Polytechnic Institute
vicc@unix.cie.rpi.edu

nevin1@ihlpb.UUCP (Nevin J Liber +1 312 979 4751) (09/21/89)

>If I implemented a user-accessible pointer type I would implement it in a more
>pascalish way. You can get a pointer to a variable, copy a pointer, or get
>a pointer to newly allocated space.

My question to you is:  what do you wish to be able to do with
pointers in Icon?  Pointers are needed in languages like Pascal, C,
etc. to implement entities such as linked lists and hash tables.  But
these are already built into Icon!  Also, trying to implement
user-accessible pointers in conjunction with garbage collection is far
from trivial.  I just can't see what new types of *implementation-independent*
(I know that there basically only one implementation of Icon right now,
but language design should not be restricted to one implementation)
applications that I cannot now write in Icon solely because I don't have
pointers.  What power would pointers give me?
 

>The other thing I hope to be able to do is to allow arbitrary lower bounds
>[on arrays,] which will improve the readability of some types of algorithms.

IMHO, this would not be very Icon-ish.  First off, negative indices
already are defined in Icon; they allow you to work from the end
of a string, list, etc.  instead of the beginning; with arbituary
lower bounds on arrays, this wouldn't work (there are ways to kludge it
in, but code that would take advantage of it would look very messy).

Secondly:  for the current data types, to generate over them in reverse
order the construct "every element := dataType[*dataType to 1 by -1]" works; 
this would not hold true for non-0 lower-bounded arrays.

Thirdly:  how does one pass an array to a procedure?  A procedure would
need some way of determining the lower bound of an array passed to it
(unless you want to write different procedures for each differently
bounded array as in standard Pascal -- blecch!).  Or do you plan on
converting all passed arrays into lists, in which case adding an array
type doesn't gain you very much at all.

NEVIN ":-)" LIBER  AT&T Bell Laboratories  nevin1@ihlpb.ATT.COM  (312) 979-4751

vicc@unix.cie.rpi.edu (VICC Project (Rose)) (09/21/89)

In article <8909202335.AA19395@megaron.arizona.edu>, nevin1@ihlpb.UUCP (Nevin J Liber +1 312 979 4751) writes:
> >If I implemented a user-accessible pointer type I would implement it in a more
> >pascalish way. You can get a pointer to a variable, copy a pointer, or get
> >a pointer to newly allocated space.
> 
> My question to you is:  what do you wish to be able to do with
> pointers in Icon?  Pointers are needed in languages like Pascal, C,
> etc. to implement entities such as linked lists and hash tables.  But
> these are already built into Icon!

Pointers could also be used to access non-Icon memory. This would obviously
be machine and os dependant, but I think is necessary for any language
on a personal computer. The most successeful pc programs are those which
directly use the machines resources. They may or may not do so in a
structured way. I will agree that this reason for pointers is a VERY
touchy subject.

Pointers are also usefull for constructing (more efficiently perhaps, or
more intuitively) tree structures which might have multiple pointers
in each record (and might have data in each record also).

> Also, trying to implement
> user-accessible pointers in conjunction with garbage collection is far
>from trivial.

The pointers I would implement would be required always to point to a
structure. This would simplify garbage collection in that there would be
a block (pointed to by a variable = accessible to garbage collector) which
contains a pointer to a block and the type of that block. I understand that
setting things up for the garbage collector is most important.

> I just can't see what new types of *implementation-independent*
> (I know that there basically only one implementation of Icon right now,
> but language design should not be restricted to one implementation)
> applications that I cannot now write in Icon solely because I don't have
> pointers.  What power would pointers give me?

Pointers are one way to implement pass by reference parameters. Basically, I
see an internal pointer type comming out to implement pass by reference
parameters without changing syntax in a way which would break existing
programs (a very important consideration in any change to Icon). Once these
pointers exist, it would be easy to make that pointer structure available
to the programmer.

> >The other thing I hope to be able to do is to allow arbitrary lower bounds
> >[on arrays,] which will improve the readability of some types of algorithms.
> 
> IMHO, this would not be very Icon-ish.  First off, negative indices
> already are defined in Icon; they allow you to work from the end
> of a string, list, etc.  instead of the beginning; with arbituary
> lower bounds on arrays, this wouldn't work (there are ways to kludge it
> in, but code that would take advantage of it would look very messy).

I don't see how this would be non-Iconish, also note that the suggested
project in the Icon implementation book mentioned lower bounds other than
1. Tables do not follow this indexing. I see arrays as allowing more
efficient means of storing data which most logically should be organized
in a multi-dimensioned manner by scalar indices. Sparse arrays should be
implemented with tables.

> Secondly:  for the current data types, to generate over them in reverse
> order the construct "every element := dataType[*dataType to 1 by -1]" works; 
> this would not hold true for non-0 lower-bounded arrays.

Not for tables. (note also that the "Iconish" way would be a lower bound
of 1 not 0 :-)

> Thirdly:  how does one pass an array to a procedure?  A procedure would
> need some way of determining the lower bound of an array passed to it

The array block will contain the ranges for each dimension. I think image()
would be one way to retrieve these. A record type access to them might also
work. At the worst case, an internal function may be used. Other than that,
arrays will be passed like all other structures (ie effectively a pointer
is passed)

One other item I was thinking of to go with arrays is a range data type
with a constuctor like .. (ie A[1..3]). This could allow reverse order
specification (which could even be used by the interpreter). I see that
this range data type would be converted to a subrange (1:3) if used
anywheres where a subrange is allowed (lists and strings). A[1..3]
might construct a list even.

Frank Filz
Center For Integrated Electronics
Rensselaer Polytechnic Institute
vicc@unix.cie.rpi.edu

kwalker@ARIZONA.EDU ("Kenneth Walker") (09/22/89)

> Date: 21 Sep 89 12:10:32 GMT
> From: gem.mps.ohio-state.edu!rpi!unix.cie.rpi.edu!vicc@tut.cis.ohio-state.edu  (VICC Project (Rose))
> 
> Pointers could also be used to access non-Icon memory. This would obviously
> be machine and os dependant, but I think is necessary for any language
> on a personal computer.

The word "necessary" is too strong. It is true that the feature
significantly extends the application domain of the language, though.

> Pointers are also usefull for constructing (more efficiently perhaps, or
> more intuitively) tree structures which might have multiple pointers
> in each record (and might have data in each record also).

The pointers are already there in the pointer semantics of Icon data
structures. Your suggestion would just make them explicit. It seems
to me that using them will necessarily make the code to construct
trees more verbose, without adding any power.

I am confused as to what your pointers would be allowed to point to.
You make 3 seemingly conflicting statements:

1) > Pointers could also be used to access non-Icon memory.

2) > The pointers I would implement would be required always to point to a
   > structure.

3) > Pointers are one way to implement pass by reference parameters.

Number 1 could be an "unsafe" pointer type. That is, the programmer
would be responsible for avoiding any kind of memory violation. This
is somewhat un-Iconish, but one makes compromises in language design.
You will probably have to avoid pointing into an Icon structure, because
garbage collection won't know how to relocate the pointer (at least
not without changes to the garbage collector).

Number 2 is no problem; it corresponds to the current pointer semantics
of Icon data structures.

Number 3 requires pointers to variables. While this is no problem
for the garbage collector, there is a problem of dangling pointers
to local variables that no longer exist (this is not a problem if you
only use the pointers for pass-by-reference). Ignoring the problem
would be very un-Iconish! One solution is to allocate local variables
in the heap so they don't go away while something is pointing to them.
For programs that do a lot of procedure calls, this can significantly
increase the number of garbage collections. An other approach would
be to keep track of pointers to local variables and change them
to the null value when the procedure terminates. This would also
be expensive.

   Ken Walker / Computer Science Dept / Univ of Arizona / Tucson, AZ 85721
   +1 602 621 2858  kwalker@Arizona.EDU   {uunet|allegra|noao}!arizona!kwalker

sbw@naucse.UUCP (Steve Wampler) (09/22/89)

On Sep 21 at  6:33, VICC Project (Rose) writes:
} 
} 
} Pointers could also be used to access non-Icon memory. This would obviously
} be machine and os dependant, but I think is necessary for any language
} on a personal computer. The most successeful pc programs are those which
} directly use the machines resources. They may or may not do so in a
} structured way. I will agree that this reason for pointers is a VERY
} touchy subject.

I can see this use, though perhaps there is some other way.
 
} Pointers are also usefull for constructing (more efficiently perhaps, or
} more intuitively) tree structures which might have multiple pointers
} in each record (and might have data in each record also).

I guess I don't see this one.  That's already easy to do.  Perhaps it's
more intuitive to one who things in pointers, but you're really using
pointers to represent hierarchies.

} The pointers I would implement would be required always to point to a
} structure. This would simplify garbage collection in that there would be
} a block (pointed to by a variable = accessible to garbage collector) which
} contains a pointer to a block and the type of that block. I understand that
} setting things up for the garbage collector is most important.

If you do this, then...

} Pointers are one way to implement pass by reference parameters. Basically, I

you're being redundant.  Given your constraint above, you already have it in
Icon.  Try the following program:

	procedure main()
   	local a
   	   call(a := [])
   	   every write(!a)
	end

	procedure call(x)
   	   every put(x,1 to 10)
	end

You must mean to have internal pointers as being different than pointers at
the source level?

Pointers are not the only way to do this, though they are perhaps the
simplest to implement.  I do see a lot of redundancy showing up, however,
given that structures provide pointer 'functionality' already.  It only
seems a gain in having pass by reference to simple data types.  A more
'Icon-ish' solution would be to extend the co-expression mechanism to
do it for you - i.e. have co-expressions exist in the environment of
their creation, rather than in a copy of that environment (providing
a scheme for isolating the environment if needed).  That gives you
more flexibility than simple pointers.  Of course, it also requires
some substantial (and potentially costly) changes to the implementation.

-- 
	Steve Wampler
	{....!arizona!naucse!sbw}

vicc@unix.cie.rpi.edu (VICC Project (Rose)) (09/22/89)

In article <8909211711.AA08091@megaron.arizona.edu>, kwalker@ARIZONA.EDU ("Kenneth Walker") writes:
> > Date: 21 Sep 89 12:10:32 GMT
> > From: gem.mps.ohio-state.edu!rpi!unix.cie.rpi.edu!vicc@tut.cis.ohio-state.edu  (VICC Project (Rose))
> > 
> > Pointers could also be used to access non-Icon memory. This would obviously
> > be machine and os dependant, but I think is necessary for any language
> > on a personal computer.
> 
> The word "necessary" is too strong. It is true that the feature
> significantly extends the application domain of the language, though.

Ok, I was too strong here. 

> > Pointers are also usefull for constructing (more efficiently perhaps, or
> > more intuitively) tree structures which might have multiple pointers
> > in each record (and might have data in each record also).
> 
> The pointers are already there in the pointer semantics of Icon data
> structures. Your suggestion would just make them explicit. It seems
> to me that using them will necessarily make the code to construct
> trees more verbose, without adding any power.
> 
> I am confused as to what your pointers would be allowed to point to.
> You make 3 seemingly conflicting statements:
> 
> 1) > Pointers could also be used to access non-Icon memory.
> 
> 2) > The pointers I would implement would be required always to point to a
>    > structure.
> 
> 3) > Pointers are one way to implement pass by reference parameters.
> 
> Number 1 could be an "unsafe" pointer type. That is, the programmer
> would be responsible for avoiding any kind of memory violation. This
> is somewhat un-Iconish, but one makes compromises in language design.
> You will probably have to avoid pointing into an Icon structure, because
> garbage collection won't know how to relocate the pointer (at least
> not without changes to the garbage collector).

I agree absolutely, I should have clarified things a bit. I think two
types of user pointers are implicated. One should only point to non-Icon
memory and the other for pointing to Icon structures (and this type might
also include some type information). One possibility is that if all the uses
of a non-Icon pointer are easy to enumerate (including possible calls to
malloc for non-Icon memory on the heap) Then one could set up a more
bomb proof pointer structure.

> Number 2 is no problem; it corresponds to the current pointer semantics
> of Icon data structures.

My intention is to follow Icon as closely as possible here to ease the work
of the garbage collector.

> Number 3 requires pointers to variables. While this is no problem
> for the garbage collector, there is a problem of dangling pointers
> to local variables that no longer exist (this is not a problem if you
> only use the pointers for pass-by-reference). Ignoring the problem
> would be very un-Iconish! One solution is to allocate local variables
> in the heap so they don't go away while something is pointing to them.
> For programs that do a lot of procedure calls, this can significantly
> increase the number of garbage collections. An other approach would
> be to keep track of pointers to local variables and change them
> to the null value when the procedure terminates. This would also
> be expensive.

This is one I didn't think about too much. Of course this is one advantage
of airing ones ideas in such a forum (of course when I sat down to actually
do the implementation I hope I would have seen this).

Here is an idea that might work:

   When a pointer to a local variable (static variables are fine) is generated,
we move the variable into the heap and replace the local variable with a
construct similar to a trapped variable. Thus if the pointer is left hanging,
the pointer is actually pointing to something in the heap which is then
accesible to the garbage collector and remains after the procedure returns.
I think this would provide protection without too much heap overhead. If the
use of a pointer to a local variable was intentional, and the procedure was
not re-entrant (recursive or otherwise), then a static variable could be used
to avoid the heap allocation.

One advantage that I do see to Icon pointers is that with care, you can never
end up with a dangling pointer. In Pascal or C it is easy to allocate
memory for a structure, set a pointer to it, copy that pointer, release the
memory and still have a pointer to the memory. In Icon, to deallocate
memory pointed to by a pointer, all we need do is set the variable to &null
(or any non-pointer variable). If all pointers to a structure are removed
in this way, the garbage collector will free the memory when called.

One thing in reference to non-Icon pointers is that I think most of the
time one would need them (other than for I/O type activities which can
easily be implemented in Icon so that pointers are not needed for them)
is that Icon can allocate the memory and then when the pointer need be
passed to some C or assembly function, the pointer can just be adjusted to
point to the space in the block. If a block is needed which can't be
moved, one could use malloc or modify the garbage collector in some way.

Frank Filz
Center For Integrated Electronics
Rensselaer Polytechnic Institute
vicc@unix.cie.rpi.edu

vicc@UNIX.CIE.RPI.EDU (VICC Project (Rose)) (09/22/89)

I have often wanted to be able to refer to a variable containing a
simple type. Also, being able to pass a simple type by reference
is more efficient than constucting a list with that element and passing
the list (although admittedly to a small ammount).
 
Frank Filz