[comp.sys.amiga.misc] Looking for a new 'join' command

clemon@lemsys.UUCP (Craig Lemon) (05/31/91)

        Hello world.  I'm looking for a newer, better join command designed
towards concatenating large numbers of large files.  My specific
application is UUCP maps if that gives anyone any understanding.  I've seen
references to using

        join #? as map

before but that doesn't work for me (CBM Shell).  What I have to do now is
write a recursive script calling join.  I call this script with 'eo' (Execute
On) with a directory listing of the map dir as input.  For those that don't
know, 'eo' takes a text file as input and runs a DOS command on each
line of this file.  If you list a directory to a file and type

        eo {fileyouused} "delete @"

it will call the 'delete' command and substitute each file entry for @. 
The script I use now does something like

.key <smallfile>
join <smallfile> map as map2
copy map2 map
delete map2

        When 'map' reaches a few megabytes in size, this can easily take
an entire night.  I would like a quick command that accepts wildcards
(or maybe one written only to process entire directories) and is smart
enough to read each input file in alphabetical order and append it to the
output file once, instead of rewriting the output after every small file
as my script does now.  It should use large buffers, because most maps
average 50K and this would increase speed.
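
A rough way to see why the script above slows down as 'map' grows is to
count the bytes it writes.  The sketch below is only an estimate under
stated assumptions: 60 maps of exactly 50K each (about 3 MB in total) is an
illustrative figure, not a measurement, and the function names are made up
for this example.

```python
def bytes_written_incremental(n_files, file_size):
    """Bytes written when each new file triggers a join plus a copy of the whole map."""
    total = 0
    map_size = 0
    for _ in range(n_files):
        map_size += file_size   # join writes the enlarged map to map2
        total += map_size       # bytes written by the join step
        total += map_size       # bytes written by copying map2 back to map
    return total


def bytes_written_single_pass(n_files, file_size):
    """Bytes written when every input is appended to the output exactly once."""
    return n_files * file_size


n, size = 60, 50 * 1024         # assumed: 60 maps of 50K each
slow = bytes_written_incremental(n, size)
fast = bytes_written_single_pass(n, size)
print(slow // fast)             # prints 61, i.e. n + 1
```

Under these assumptions the incremental script writes n + 1 times as much
data as a single pass, so the gap widens with every file added.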

        Any ideas or suggestions on how to accomplish this task?  Any new
scripts or methods?  What other methods do people use out there?  E-Mail
unless there's great interest (yeah right! :-)

--
 Craig Lemon - Kitchener, Ontario. Amiga B2000 UUCPv1.13D.
 clemon@lemsys.UUCP lemsys!clemon@xenitec.on.ca | Please Mail any binaries
 xenitec!lemsys!clemon@watmath.uwaterloo.edu    | to 'files' at this site
 ..!uunet!watmath!xenitec!lemsys!clemon         | instead of 'clemon'

peterk@cbmger.UUCP (Peter Kittel GERMANY) (06/03/91)

In article <clemon.3680@lemsys.UUCP> clemon@lemsys.UUCP (Craig Lemon) writes:
>
>        Hello world.  I'm looking for a newer, better join command designed
>towards concatenating large numbers of large files.  My specific
>application is UUCP maps if that gives anyone any understanding.  I've seen
>references to using
>
>        join #? as map

Didn't try, but perhaps the 2.0 version of join already does that.

>The script I use now does something like
>
>.key <smallfile>
>join <smallfile> map as map2
>copy map2 map
>delete map2

1. It should be faster to change the last two lines to avoid the copy:
      delete map
      rename map2 as map
2. Perhaps you could add a little intelligence so that this script
   always adds more than one file to map at a time. In the
   simplest approach you could always join 6 (or 10) files
   together and fall back to the one-file method only for the very
   last one(s).
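
The batching idea in point 2 can be sketched as follows.  This is only an
illustration: the file names are hypothetical, and each resulting group
stands for one call that joins the whole group onto the map.

```python
def batches(names, size):
    """Split a list of file names into consecutive groups of at most `size`."""
    return [names[i:i + size] for i in range(0, len(names), size)]


# Hypothetical map file names, for illustration only.
files = sorted(f"map.{i:03d}" for i in range(1, 21))
groups = batches(files, 6)
print([len(g) for g in groups])   # prints [6, 6, 6, 2]
```

With 20 files and groups of 6, the map is rewritten 4 times instead of 20,
and the short final group covers the "very last one(s)" case without any
special handling.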

-- 
Best regards, Dr. Peter Kittel  // E-Mail to  \\  Only my personal opinions... 
Commodore Frankfurt, Germany  \X/ {uunet|pyramid|rutgers}!cbmvax!cbmger!peterk

umueller@iiic.ethz.ch (Urban Dominik Mueller) (06/04/91)

In article <1265@cbmger.UUCP> peterk@cbmger.UUCP (Peter Kittel GERMANY) writes:
>In article <clemon.3680@lemsys.UUCP> clemon@lemsys.UUCP (Craig Lemon) writes:
>>
>>        Hello world.  I'm looking for a newer, better join command designed
>>towards concatenating large numbers of large files.
>>
>>        join #? as map
>
>Didn't try, but perhaps the 2.0 version of join already does that.

You can also use CShell (any version), which works under Kick 1.3:
  alias stuff "%a join -r $a map map2; delete map; rename map2 map"
The call would then be:
  stuff *
This will add all articles at once, in alphabetical order. If you
want them in newest-at-the-top order, use:
  stuff $(dir -nt)
This can be made into an additional alias. Again, the job is done
in one pass, i.e. fast.

If this is too esoteric for you, use the ARP join command, it allows
patterns the way you need them.

    -Dominik

clemon@lemsys.UUCP (Craig Lemon) (06/07/91)

In article <29189@neptune.inf.ethz.ch> umueller@iiic.ethz.ch (Urban Dominik Mueller) writes:
>>Didn't try, but perhaps the 2.0 version of join already does that.
>
>You can also use CShell (any version), which works under Kick 1.3:
>  alias stuff "%a join -r $a map map2; delete map; rename map2 map"
>The call would then be:
>  stuff *
>This will add all articles at once, in alphabetical order. If you
>want them in newest-at-the-top order, use:
>  stuff $(dir -nt)
>This can be made into an additional alias. Again, the job is done
>in one pass, i.e. fast.

        Is this not simply removing the need for a script but still doing
the repetitive operation over and over again?

>
>If this is too esoteric for you, use the ARP join command, it allows
>patterns the way you need them.
>
>    -Dominik

        My problem is partially pattern matching.  My main goal, however, is
to eliminate a lot of little steps and use one large step.  This means that
this join command would work from a sorted (alphabetical or whatever) list
of the directory.  The command would open the new file, dump the contents of
a into it, close a, dump b, etc., without closing the new file
between feeder files.  Hopefully this would be in assembler or something
and written to handle medium-sized files (50K) with ease from a buffering
and speed point of view.
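
The behaviour described above (open the output once, then append each
input in sorted order through a large buffer) might look like the sketch
below.  It is a minimal illustration rather than a real Amiga tool: the
function name and the 64K buffer size are assumptions, and `sorted` orders
names by character code, which differs from a case-insensitive
alphabetical listing.

```python
import os
import shutil

BUFFER_SIZE = 64 * 1024  # assumed value; large enough to hold a typical 50K map


def concatenate_directory(src_dir, out_path, buffer_size=BUFFER_SIZE):
    """Append every regular file in src_dir, in sorted name order, to out_path."""
    names = sorted(os.listdir(src_dir))
    with open(out_path, "wb") as out:            # output opened once, kept open
        for name in names:
            path = os.path.join(src_dir, name)
            if not os.path.isfile(path):
                continue                          # skip subdirectories
            if os.path.abspath(path) == os.path.abspath(out_path):
                continue                          # never read the output back in
            with open(path, "rb") as src:         # each input opened and closed in turn
                shutil.copyfileobj(src, out, buffer_size)
    return out_path
```

Each input is read once and the output is written once, so the work grows
with the total size of the maps rather than with the number of join calls.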

        I know a lot of you say "write one yourself".  I am quite computer
literate and I am a programmer-type person but I haven't had the time to
learn Amiga C well enough to do anything useful yet.


umueller@iiic.ethz.ch (Urban Dominik Mueller) (06/11/91)

In article <clemon.3910@lemsys.UUCP> clemon@lemsys.UUCP (Craig Lemon) writes:
>I wrote:
>>You can also use CShell (any version), which works under Kick 1.3:
>>  alias stuff "%a join -r $a map map2; delete map; rename map2 map"
>
>        Is this not simply removing the need for a script but still doing
>the repetitive operation over and over again?

No. It's all done using one single 'join' command, i.e. in one pass.

>        My problem is partially pattern matching.  My main goal, however, is
>to eliminate a lot of little steps and use one large step.  This means that
>this join command would work from a sorted (alphabetical or whatever) list
>of the directory.  The command would open the new file, dump the contents of
>a into it, close a, dump b, etc., without closing the new file
>between feeder files.

Waste of time. This is almost exactly what CShell's join and, I believe,
the other shells' join commands do. Additionally, using assembly for a
task like this would speed up the whole thing by 1% or so, so forget it.

   -Dominik