[comp.lang.misc] Iterators

scc@cl.cam.ac.uk (Stephen Crawley) (03/19/88)

In article <1139@PT.CS.CMU.EDU> edw@IUS1.CS.CMU.EDU (Eddie Wyatt) writes:
In an earlier article, someone else writes:

>> [... Clu iterators are coroutines]

No they are not.  Most of the things that you can do with coroutines you
can't do with iterators.

>But do you need coroutines to implimented generlized iteration.  
>I think not.  

Spot on.  You CAN implement iterators with coroutines, but in practice
they are implemented more cheaply by turning the loop body into a procedure
like thing that is passed to the iterator as a hidden parameter.  More
details on the implementation of CLU iterators can be found in

  DGS Note 124 "VAX CLU Implementation Notes"  by Sharon E. Perl 29/11/84
  Laboratory for Computer Science, Massachusetts Institute of Technology

> [...]
>Example:
>
>	void iterate_inorder(root,func)
>	    TREE *root;
>	    void (*func)();
>	    {
>	    if (root != NULLPTR(TREE))
>		{
>	    	iterate_inorder(root->left,func);
>	    	func(root->val);
>	    	iterate_inorder(root->right,func);
>		}
>	    }
>
>    I do see one problem though, which is dealing with variable
>    referenced  inside the loop.  They would have to become
>    global.  Comments as to why this method might not work in general?

Apart from being a disgusting hack ... making your "local" variables into
globals will only work if the pseudo-iterator is NEVER called recursively.

To get it to work recursively, there are two approaches.

1)  Declare the relevant local variables as a structure, and pass a
    pointer to the struct as an extra parameter to the iter procedure
    and hence to the "loop body" procedure.  This involves casting /
    loopholing of the pointer, or a generic type parameter in a high 
    level programming language.

2)  Declare the loop body procedure in the same (non-global) scope as
    the local variables for the loop.  The loop locals are then available
    to the loop body procedure as non-locals.  This assumes that your
    language allows this ... C, BCPL, Pascal, Modula-*, Ada etc don't, 
    but Algol-68 and Mesa do, as do any languages which support proper
    procedure closures.

Here's an example of 2)

  ...
  P:  proc (...) =
    x, y: FooType
    loopbody: proc (elem: ElemType) = {
      ...
      x := list_elem.foo
      list_elem.foo := y
      ...
      }

    (* body of P *)
    ...
    iterate_over(list, loopbody)
    ...
    }

-- Steve

jejones@mcrware.UUCP (James Jones) (03/20/88)

There's one thing that bugs me about CLU iterators, which on the whole are
pretty neat: if I have two things I want to iterate through in lockstep,
how do I do it?   (That's a non-rhetorical question--it would be great to
hear about a way to do it.)

		James Jones

dana@gmu90x.UUCP (J Dana Eckart) (03/21/88)

In article <616@mcrware.UUCP> jejones@mcrware.UUCP (James Jones) writes:
>There's one thing that bugs me about CLU iterators, which on the whole are
>pretty neat: if I have two things I want to iterate through in lockstep,
>how do I do it?
>		James Jones

I have been concerned about this same problem.  I wrote a short paper
about this that appeared in SIGPLAN Notices in April 1987.  I would be
curious to know what you (and the rest of the net) think of it.  I have 
submitted a more recent and indepth paper on the subject to a conference 
that more clearly addresses some of the "problems" with the proposal 
that appears in the SIGPLAN Notices paper.  If there is sufficient
interest I might consider posting the "refined" proposal.

J Dana Eckart
	UUCP:	...!(gatech | pyrdc)!gmu90x!dana
	ARPA:   dana@gmu90x.gmu.edu
	SNAIL:	P.O. Box 236/Fairfax, VA  22030-0236

peter@athena.mit.edu (Peter J Desnoyers) (03/24/88)

[]
Since we've spent a lot of time explaining how you can iterate through
multi-element data objects in C or similar languages by using a
'traversing function' that applies a function to each element, I might
as well point out that you can (even though I cringe at the thought)
write a 'real' iterator in C. Contrast the methods used in the
following code fragments:

Iterators:
/* calculate n, sum x, sum x^^2 */
for (leaf = iter( tree, &s); leaf != NULL; leaf = iter( tree, &s)) {
    sum += leaf->size;
    sum_squared += leaf->size * leaf->size; 
    n++; }

/* yield successive elements from a tree */
iter( tree_node * t, stack ** s) {
    tree_node * tmp;

    if (* s == NULL) {
        * s = malloc( lots-o-space);
	push( *s, t); }

    if (* s is empty) {
        free( *s);
	* s = NULL;
	return (NULL); }

    for (tmp = pop( *s); !tmp->leaf; tmp = tmp->left)
      push( *s, tmp->right);
    return( tmp);
}

When done with 'traversing functions' [Is there a proper name for
these things? Please let me know if there is.] the code looks like
this: 

/* define function to be applied */
f( tree_node * t, int * sum, int * sum_squared, int * n) {
    * sum += t->size; ...}

/* apply it */
iter( tree, f, &sum, &sum_squared, &n);

/* traverser function: */
iter( tree, function, varargs stuff...) {
    if (tree->leaf)
      (*function)(leaf, varargs stuff...);
    else {
      iter( tree->right, function, ...);
      iter( tree->left, function, ...); } }


I'm not sure if this proves anything - I think the first makes more
sense, especially if I don't have to write the iterator and it runs on
a fast machine :-).


				Peter Desnoyers

sommar@enea.se (Erland Sommarskog) (03/25/88)

Stephen Crawley (scc@cl.cam.ac.uk) writes:
>2)  Declare the loop body procedure in the same (non-global) scope as
>    the local variables for the loop.  The loop locals are then available
>    to the loop body procedure as non-locals.  This assumes that your
>    language allows this ... C, BCPL, Pascal, Modula-*, Ada etc don't, 
>    but Algol-68 and Mesa do, as do any languages which support proper
>    procedure closures.
> <Example deleted for brevity>

I'm not entirely sure what you mean, so my apologies if my objection
isn't relevant. But looking at your example, it seems something that
is straghtforward to do in Ada. I even did it not long ago. And since
everyone is posting their favourite C iterators, you get some outlines
of it in Ada too.

First I have a generic package for binary trees. The package includes
generic procedures for traversing the tree in different directions.
This looks like:

   Generic
      Type Data_type is limited private;
      With procedure Assign(A : in out Data_type;
                            B : in     Data_type);
      With function "<"(A, B : Data_type) return boolean is <>;
      With function ">"(A, B : Data_type) return boolean is <>;
   -- We need the assignment and relations since we don't know anything   
   -- about what the user want to strore in the tree.
   Package binary_trees is
      Type Tree_type is private;     
      Type Node_type is private;     
      Null_node : constant Node_type; 
   ....      
      Generic
         With Procedure Treat(Node : in     Node_type;  
                              Data : in out Data_type); 
      Procedure Traverse_forward(Tree : Tree_type);
   ....
   End Binary_trees;
   
When I want to traverse the tree, I just instantiate the traversion 
procedure with a local one:

   Package Tree_handler is new Binary_trees(Data_ref); Use Tree_handler;
   ...
   Function Find_node(Criterion : Some_type) return Node_type is
   -- Traverses the (globally declared) tree and returns the first
   -- node that matches the criterion.
   Begin                 
      Declare
         Sought_node : Node_type;
         Found       : exception;
         Procedure Check_this_one(Node : in     Node_type;
                                  Data : in out Data_ref) is
         Begin                     
            If Data.Test_item = Criterion then
               Sought_node := Node;
               Raise Found;
            End if;
         End Check_this_one;
         Procedure Search is new Traverse_forward(Check_this_one); 
      Begin
         Search(Global_tree);
      Exception
         when found => Return Sought_node;
      End;
   End Find_node;

Now, isn't this very similar to the example Stephen posted?
(There are probably some people objecting strongly against the use
of the exception here. And I may well agree it's brute force. 
I included it just to show that there is a way out, ever the author 
of the iterator forgot to put in a stop condition.)  

-- 
Erland Sommarskog       
ENEA Data, Stockholm        
sommar@enea.UUCP           "Si tu crois l'amour tabou...
                            Regarde bien, les yeux d'un fou!!!" -- Ange

scc@cl.cam.ac.uk (Stephen Crawley) (03/26/88)

In article <616@mcrware.UUCP> jejones@mcrware.UUCP (James Jones) writes:
>There's one thing that bugs me about CLU iterators, which on the whole are
>pretty neat: if I have two things I want to iterate through in lockstep,
>how do I do it?   (That's a non-rhetorical question--it would be great to
>hear about a way to do it.)
>
>		James Jones

If I understand your problem correctly, there's a simple answer.  In CLU,
an iterator may be declared to yield a tuple each time it iterates.  So
you can iterate over 2 lists (say) in lock-step as follows.

lock_step = iter (list1, list2: my_l_type) yields (my_v_type, my_v_type)

  my_nil: my_l_type := my_l_type$get_nil ()	% erm ...

  while list1 ~= my_nil & list2 ~= my_nil & 
    yield (list1.value, list2.value)
    list1 := list1.next
    list2 := list2.next
    end
  end lock_step

[The mussing around with my_nil is for real.  There is no universal
 nil value in Clu, so you have to somehow construct one for each 
 recursive abstract data type.  It is usually has to be done using
 a oneof type ... sigh]

And just for the hell of it, here's a generic 2 list iterator!!

lock_step2 = iter [lt, t: type] (list1, list2: lt) yields (t, t)
                  where lt has is_empty: proctype (lt) returns (bool),
                               next:     proctype (lt) returns (lt),
                               value:    proctype (lt) returns (t)
  while ~lt$is_empty (list1) & ~lt$is_empty (list2) do
    yield (lt$value (list1), lt$value (list2))
    list1 := lt$list1.next
    list2 := lt$list2.next
    end
  end lock_step2

It strikes me that the above solution is too obvious.  Perhaps James
had a more tricky case in mind?

-- Steve
Newsgroups: comp.lang.misc
Subject: Re: iterators (was Re: From Modula to Oberon)
References: <2827@enea.se> <1557@pasteur.Berkeley.Edu> <3764@bloom-beacon.MIT.EDU> <3864@bloom-beacon.MIT.EDU> <616@mcrware.UUCP>
From: scc@cl.cam.ac.uk (Stephen Crawley)
Reply-To: scc@cl.cam.ac.uk (Stephen Crawley)
Organization: U of Cambridge Comp Lab, UK
Keywords: lockstep iteration

In article <616@mcrware.UUCP> jejones@mcrware.UUCP (James Jones) writes:
>There's one thing that bugs me about CLU iterators, which on the whole are
>pretty neat: if I have two things I want to iterate through in lockstep,
>how do I do it?   (That's a non-rhetorical question--it would be great to
>hear about a way to do it.)
>
>		James Jones

If I understand your problem correctly, there's a simple answer.  In CLU,
an iterator may be declared to yield a tuple each time it iterates.  So
you can iterate over 2 lists (say) in lock-step as follows.

lock_step = iter (list1, list2: my_l_type) yields (my_v_type, my_v_type)

  my_nil: my_l_type := my_l_type$get_nil ()	% erm ...

  while list1 ~= my_nil & list2 ~= my_nil & 
    yield (list1.value, list2.value)
    list1 := list1.next
    list2 := list2.next
    end
  end lock_step

[The mussing around with my_nil is for real.  There is no universal
 nil value in Clu, so you have to somehow construct one for each 
 recursive abstract data type.  It is usually has to be done using
 a oneof type ... sigh]

And just for the hell of it, here's a generic 2 list iterator!!

lock_step2 = iter [lt, t: type] (list1, list2: lt) yields (t, t)
                  where lt has is_empty: proctype (lt) returns (bool),
                               next:     proctype (lt) returns (lt),
                               value:    proctype (lt) returns (t)
  while ~lt$is_empty (list1) & ~lt$is_empty (list2) do
    yield (lt$value (list1), lt$value (list2))
    list1 := lt$list1.next
    list2 := lt$list2.next
    end
  end lock_step2

It strikes me that the above solution is too obvious.  Perhaps James
had a more tricky case in mind?

-- Steve
Newsgroups: comp.lang.misc
Subject: Re: iterators (was Re: From Modula to Oberon)
References: <2827@enea.se> <1557@pasteur.Berkeley.Edu> <3764@bloom-beacon.MIT.EDU> <3864@bloom-beacon.MIT.EDU> <616@mcrware.UUCP>
From: scc@cl.cam.ac.uk (Stephen Crawley)
Reply-To: scc@cl.cam.ac.uk (Stephen Crawley)
Organization: U of Cambridge Comp Lab, UK
Keywords: lockstep iteration

In article <616@mcrware.UUCP> jejones@mcrware.UUCP (James Jones) writes:
>There's one thing that bugs me about CLU iterators, which on the whole are
>pretty neat: if I have two things I want to iterate through in lockstep,
>how do I do it?   (That's a non-rhetorical question--it would be great to
>hear about a way to do it.)
>
>		James Jones

If I understand your problem correctly, there's a simple answer.  In CLU,
an iterator may be declared to yield a tuple each time it iterates.  So
you can iterate over 2 lists (say) in lock-step as follows.

lock_step = iter (list1, list2: my_l_type) yields (my_v_type, my_v_type)

  my_nil: my_l_type := my_l_type$get_nil ()	% erm ...

  while list1 ~= my_nil & list2 ~= my_nil & 
    yield (list1.value, list2.value)
    list1 := list1.next
    list2 := list2.next
    end
  end lock_step

[The mussing around with my_nil is for real.  There is no universal
 nil value in Clu, so you have to somehow construct one for each 
 recursive abstract data type.  It is usually has to be done using
 a oneof type ... sigh]

And just for the hell of it, here's a generic 2 list iterator!!

lock_step2 = iter [lt, t: type] (list1, list2: lt) yields (t, t)
                  where lt has is_empty: proctype (lt) returns (bool),
                               next:     proctype (lt) returns (lt),
                               value:    proctype (lt) returns (t)
  while ~lt$is_empty (list1) & ~lt$is_empty (list2) do
    yield (lt$value (list1), lt$value (list2))
    list1 := lt$list1.next
    list2 := lt$list2.next
    end
  end lock_step2

It strikes me that the above solution is too obvious.  Perhaps James
had a more tricky case in mind?

-- Steve

scc@jenny (Stephen Crawley) (03/28/88)

In article <2913@enea.se> sommar@enea.UUCP(Erland Sommarskog) writes:
>Stephen Crawley (scc@cl.cam.ac.uk) writes:
>>2)  Declare the loop body procedure in the same (non-global) scope as
>>    the local variables for the loop.  The loop locals are then available
>>    to the loop body procedure as non-locals.  This assumes that your
>>    language allows this ... C, BCPL, Pascal, Modula-*, Ada etc don't, 
>>    but Algol-68 and Mesa do, as do any languages which support proper
>>    procedure closures.
>> <Example deleted for brevity>
>
>I'm not entirely sure what you mean, so my apologies if my objection
>isn't relevant. But looking at your example, it seems something that
>is straghtforward to do in Ada. 

Arland's objection is both relevant and correct.

Since Ada does not allow you to treat procedures as values, the mechanism
I proposed (passing procedure as parameters to other procedures) does not
work.

However, Ada does allow you to use procedures as parameters to a generic
package.  As Erland pointed out, this allows you to construct iterators.
[It is sad that Ada forces you to go to the trouble of constructing a
 generic package for a data structure you want to iterate over.]

Perhaps I shouldn't make statements about languages I've never used ...

>(There are probably some people objecting strongly against the use
>of the exception here. And I may well agree it's brute force. 
>I included it just to show that there is a way out, ever the author 
>of the iterator forgot to put in a stop condition.)  

I'll excuse you! :-)  End conditions on pseudo-iterators are frequently
messy.  Yet another reason for wanting REAL iterators in a language.

[Sorry about the crap on the end of my last article.  People around
 here have made it unnecessarily hard to post news ... and that was 
 the result.]

-- Steve

sommar@enea.se (Erland Sommarskog) (03/30/88)

Stephen Crawley (scc@cl.cam.ac.uk) writes:
>However, Ada does allow you to use procedures as parameters to a generic
>package.  As Erland pointed out, this allows you to construct iterators.
>[It is sad that Ada forces you to go to the trouble of constructing a
> generic package for a data structure you want to iterate over.]

Minor correction: A package like a tree handler should always be generic. 
But if Ada had had procedure parameters, we wouldn't have been forced
to declare the iterator as a generic procedure.
  I don't really want to go into the debate on generic vs. "real"  
procedure parameters. I can imagine that the Ada designers felt 
procedure parameters as redundant aside with procedures as generic
parameters. But I'm not really fond of declarations like
   Generic 
      With procedure A(P1 : T;
                       P2 : T2);
   Procedure B;                       
Together with the instantiation this becomes unnecessarily verbose.
The compiler and library overhead is also greater for generics.

-- 
Erland Sommarskog       
ENEA Data, Stockholm        
sommar@enea.UUCP