[comp.lang.icon] A word/line procedure

KKTK_KOTUS@cc.helsinki.fi (05/16/91)

Dear iconists,

I have the following, obviously simple problem. I have written a nice
Icon procedure that splits a text file into one word per line. It is much easier,
simpler and faster than the one given in the Icon book. It goes like this
(the original idea is, I believe, from some Unix procedure, and it is the same
thing you would do with any editor or text processing program, if you had to
use them for the work):


	procedure main()
	    while line := read() do write(map(line, " ", "\l"))
	end


It works fine and fast, but it also produces empty lines, which I cannot get rid
of in the same procedure, although I have tried to filter them out with something
like


	if *line > 0 then write(line)


But if I make the first procedure write its output to a file and then process
that file with the procedure


	procedure main()
	    while line := read() do if *line > 0 then write(line)
	end



I get rid of the empty lines. It is easy to make a .bat or .com or whatever
from these two procedures, and the whole thing still works at least as fast as,
or even faster than, the procedure in the Icon book. But
I would still like to know how I could actually make it work in one shot. I
have tried different solutions, but they do not work. Can anybody
explain it?



Greetings,

Kimmo Kettunen
KKTK_KOTUS@CC.HELSINKI.FI 






    

goer@ellis.uchicago.edu (Richard L. Goerwitz) (05/16/91)

In article <124E207800E17F06@cc.Helsinki.FI> KKTK_KOTUS@cc.helsinki.fi writes:
>
>I have the following, obviously simple problem. I have made a nice
>Icon procedure, which divides a text file one word/line....:
>
>	procedure main()
>   		while line:=read() do write(map(line," ","\l"))
>	end 
>
>It works fine and fast, but it also produces empty lines, which I cannot get rid
>of in the same procedure, although I try to filter them out with something
>like
>
>	if *line > 0 then write(line)

I don't see how map alone can eliminate blank lines that are due to
adjacent spaces.  If all words are guaranteed to be separated by exactly
one space, then naturally you could say

    ... do write("" ~== map(line, " ", "\l"))

This would eliminate lines with no words on them.  If, though, your files
separate words with more than one blank space, you will surely need to
scan the lines by hand.  I don't know what was in the Icon book, but I'd
probably just use:

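procedure main(a)

    # Separator characters: the first command-line argument if given,
    # otherwise a default set of punctuation and white space.
    separators := \a[1] | ',":<>,._+-=)(*&^%$#!@~`\'?/|\\][} {\t;'

    while read(&input) ? {
        tab(many(separators))              # skip leading separators
        if not pos(0) then {
            # write each word that is followed by a separator
            while write(tab(upto(separators))) do
                tab(many(separators))
        }
        pos(0) | write(tab(0))             # write a final word, if any
    }

end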

If you add in control characters to your separator list, and add a test
for length, you'll have something like the UNIX strings command.
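
As a rough illustration of that variation, here is a minimal sketch.  It
is not a replacement for the real strings command.  The procedure name
longwords, the minimum length of 4, and the particular handful of control
characters added to the separator list are all assumptions made for the
example.

```icon
# Return a list of the words in "line" that are at least "minlen"
# characters long.  The name longwords is hypothetical.
procedure longwords(line, separators, minlen)
    local result, word

    result := []
    line ? {
        tab(many(separators))                     # skip leading separators
        while word := tab(upto(separators) | 0) do {
            if *word >= minlen then put(result, word)
            tab(many(separators)) | break         # stop at end of line
        }
    }
    return result
end

procedure main()
    local separators, line

    # The punctuation list from the program above, plus a few control
    # characters (an assumed subset, not every control character).
    separators := ',":<>._+-=)(*&^%$#!@~`\'?/|\\][} {;' ++ '\b\d\e\f\r\t\v'

    while line := read() do
        every write(!longwords(line, separators, 4))
end
```

The list is built inside the scanning expression and returned only after
scanning has finished, so the procedure does not return or suspend from
inside the scan.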

Quiz time:

    Why must the line "pos(0) | write(tab(0))" stand after the if...then
    expression?

If you are stumped, here is a hint (don't peek):






    Notice how I called "if...then" an "expression."  That means it
    produces a result.  What is that result?

If you are still stumped, here is the (or rather one) answer:





    The read(&input) ? { if...then; junk } expression succeeds whenever
    read(&input) does, because "junk" (i.e. "pos(0) | write(tab(0))")
    always succeeds.  If junk were not present, then we'd have
    read(&input) ? { if...then }.  If the if-condition fails, the
    if...then expression fails.  Moreover, if the condition succeeds,
    the then { expression } is evaluated.  Since "expression" is a while
    loop, it will eventually fail, causing the entire if...then
    expression to fail.  When it fails, the read(&input) ? { etc. }
    expression will fail, which will cause termination of the while
    construct it is part of, and ultimately termination of the program.
    The upshot is that junk must be present, and must always succeed.
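
    The same point can be checked with a small sketch.  The procedure
    names without_tail and with_tail are made up for the illustration,
    and a single space stands in for the separator list.  Each procedure
    returns the outcome of a scanning expression shaped like the one in
    the program, with write left out so that only success and failure
    are reported.

```icon
# Body without the trailing expression: its outcome is that of the
# if...then expression, which fails whether or not the condition holds.
procedure without_tail(s)
    return s ? {
        tab(many(' '))
        if not pos(0) then
            while tab(upto(' ')) do tab(many(' '))
    }
end

# The same body followed by an expression that always succeeds.
procedure with_tail(s)
    return s ? {
        tab(many(' '))
        if not pos(0) then
            while tab(upto(' ')) do tab(many(' '))
        pos(0) | tab(0)
    }
end

procedure main()
    local s

    every s := "two words" | "" do {
        if without_tail(s) then write(image(s), ": without tail succeeds")
        else write(image(s), ": without tail fails")
        if with_tail(s) then write(image(s), ": with tail succeeds")
        else write(image(s), ": with tail fails")
    }
end
```

    For both test strings the version without the trailing expression
    should report failure and the version with it should report success.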

I probably should have tested the above program, but I have to run.  If
there are any errors, then I apologize (though I really don't see where
any could be [famous last words]).

-- 

   -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
   goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer