[comp.sys.apple2] IIgs Unzip thing

svetozar@pnet51.orb.mn.org (Eric C. Anderson) (03/24/91)

Hi folks.  I've been working on my IIgs unzip thing some more; it's now about
10% faster than the version I binscii'd and uploaded to that "binaries" group,
but as I'm unsure of the policy regarding uploading Yet Another version of
one's programs, I'll wait until more speed increases have been achieved.  Some
random numbers: a file (zip07src.zip, the unix ZIP source, v0.7) which the
previous version uncompressed in 55 seconds was uncompressed by the current
(23 March 1991) version in 48.5 seconds.  That's for a 2.8MHz IIgs, to /RAM5,
btw; uncompressing files to a floppy drive or hard disk would be slower.
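
The two timings above can be turned into percentages in more than one way, which is worth spelling out because "about 10% faster" is a rough figure. A minimal sketch of the arithmetic, using only the two numbers quoted in the post (the variable names are illustrative, not from the unzip program itself):

```python
# Timings quoted in the post, in seconds (2.8MHz IIgs, unzipping to /RAM5).
old_seconds = 55.0
new_seconds = 48.5

# Fraction of wall-clock time saved by the newer version.
time_saved = (old_seconds - new_seconds) / old_seconds

# Throughput gain: how much more work per second the newer version does.
throughput_gain = old_seconds / new_seconds - 1.0

print(f"time saved:      {time_saved:.1%}")
print(f"throughput gain: {throughput_gain:.1%}")
```

Either way of counting lands a little above the "about 10%" mentioned above.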

Not exactly on the subject, but I found it interesting:

The aforementioned "zip07src.zip" file is 110047 bytes in length.  After
uncompressing it, I re-archived all the resulting files with GS-ShrinkIt.  The
resulting file was 156303 bytes in length.  I hope everyone won't laugh at my
ignorance, but if a .SHK archive is 42% larger than a .ZIP archive, would it
not be a good idea for someone to write a program for the Apple II family
incorporating PKZIP's superior compression algorithm?  On the other hand,
programs such as ARJ and LZH in the PC world achieve even better compression
than PKZIP, using something called "static LZW"; perhaps this would be a
profitable route for some software author to investigate.
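
The 42% figure follows directly from the two archive sizes. A small sketch of that calculation, again using only numbers given in the post; the total size of the uncompressed files is not stated, so no true compression ratio is computed here:

```python
# Archive sizes quoted in the post, in bytes.
zip_bytes = 110047   # zip07src.zip
shk_bytes = 156303   # same files re-archived with GS-ShrinkIt

# How much larger the .SHK archive is, relative to the .ZIP archive.
shk_growth = shk_bytes / zip_bytes - 1.0

# The same gap seen from the other side: how much smaller the .ZIP is.
zip_saving = 1.0 - zip_bytes / shk_bytes

print(f".SHK is {shk_growth:.1%} larger than .ZIP")
print(f".ZIP is {zip_saving:.1%} smaller than .SHK")
```

Note that the two percentages differ because they use different baselines.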

UUCP: {amdahl!bungia, crash}!orbit!pnet51!svetozar
ARPA: crash!orbit!pnet51!svetozar@nosc.mil
INET: svetozar@pnet51.orb.mn.org

shrinkit@Apple.COM (Andrew Nicholas) (03/25/91)

In article <4388@orbit.cts.com> svetozar@pnet51.orb.mn.org (Eric C. Anderson) writes:

>The aforementioned "zip07src.zip" file is 110047 bytes in length.  After
>uncompressing it, I re-archived all the resulting files with GS-ShrinkIt.  The
>resulting file was 156303 bytes in length.  I hope everyone won't laugh at my
>ignorance, but if a .SHK archive is 42% larger than a .ZIP archive, would it
>not be a good idea for someone to write a program for the Apple II family
>incorporating PKZIP's superior compression algorithm?  On the other hand,
>programs such as ARJ and LZH in the PC world achieve even better compression
>than PKZIP, using something called "static LZW"; perhaps this would be a
>profitable route for some software author to investigate.

Hi, you've been trying to send me email, and I've been getting all of it, and
I've also been writing really long, long replies which evidently aren't getting
to you.  I just sent you email again, so let me know if you don't get it.

Now, about better compression methods -- I have had an "almost working"
assembly version of LZH for 2 years now.  I have never gotten the assembly code
to work correctly... then again, I haven't had the chance lately to GET
it to work because I'm so busy (I am, really).

ARJ?  Did I miss something?  I knew of PKZIP, PAK, LZH, LZB, LZC, LZW, LZJ, and
a bunch of other LZ variants (LZFG, etc), but not ARJ.  Something to do with
arithmetic coding?

Btw, Orca/C 1.1 (and 1.2 I'm assuming, mine hasn't come yet) will correctly
compile the LZH source code and make it work.  It took something like 8 
minutes to compress Finder v1.3 on a stock GS.  It took GS-ShrinkIt something
like 23 seconds.  I remember doing this at KansasFest and came to the
conclusion that the C source was almost useless -- even with all of Orca/C's
optimizations >ON< (which isn't saying much, because the output which was
produced by the compressor didn't match the output produced when optimizations
were left off... hence my distrust of Orca/C 1.1's optimizer)... even with
the optimizations ON, it still took some 7 minutes to compress the
Finder.  The better the compression was going to be, the longer it took.
It wasn't uncommon to see compression times that took 1/2 hour or longer...
although I've never compressed something like the whole SYSTEM 504 disk.  That
would probably take hours to compress... and decompress.
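
A mismatch between the optimized and unoptimized builds does not by itself say which build is wrong; a round-trip check on each build helps narrow that down. The sketch below illustrates the idea under loud assumptions: Python's `zlib` stands in for the LZH code (which is not shown in this thread), the two "builds" are simply two zlib compression levels, and the helper names `round_trips` and `first_mismatch` are hypothetical. With a real deterministic compressor compiled twice, the two outputs would be expected to be identical.

```python
import zlib

def round_trips(compress, decompress, data: bytes) -> bool:
    """Return True if decompress(compress(data)) reproduces data exactly."""
    return decompress(compress(data)) == data

def first_mismatch(a: bytes, b: bytes) -> int:
    """Return the offset of the first differing byte, or -1 if identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # No differing byte in the common prefix: either equal, or one is longer.
    return -1 if len(a) == len(b) else min(len(a), len(b))

sample = b"the quick brown fox " * 200

# Stand-ins for the "optimizer off" and "optimizer on" builds (assumption).
def build_a(data: bytes) -> bytes:
    return zlib.compress(data, 1)

def build_b(data: bytes) -> bytes:
    return zlib.compress(data, 9)

out_a, out_b = build_a(sample), build_b(sample)
print("first differing offset:", first_mismatch(out_a, out_b))
print("build A round-trips:", round_trips(build_a, zlib.decompress, sample))
print("build B round-trips:", round_trips(build_b, zlib.decompress, sample))
```

If both builds round-trip but their outputs differ, the difference may be harmless; if one fails to round-trip, that build is the one to distrust.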

andy

-- 
Andy Nicholas		  GEnie & America-Online: shrinkit
Apple Computer, Inc.  	 	      CompuServe: 70771,2615    
Apple IIGS System Software		InterNET: shrinkit@apple.com

I'm doing this on my own time, so I don't speak for Apple.

toddpw@nntp-server.caltech.edu (Todd P. Whitesel) (03/26/91)

shrinkit@Apple.COM (Andrew Nicholas) writes:

>Btw, Orca/C 1.1 (and 1.2 I'm assuming, mine hasn't come yet) will correctly
>compile the LZH source code and make it work.  It took something like 8 
>minutes to compress Finder v1.3 on a stock GS.  It took GS-ShrinkIt something
>like 23 seconds.  I remember doing this at KansasFest and came to the
>conclusion that the C source was almost useless -- even with all of Orca/C's
>optimizations >ON< (which isn't saying much, because the output which was
>produced by the compressor didn't match the output produced when optimizations
>were left off... hence my distrust of Orca/C 1.1's optimizer)... even with

Andy, learn something about Orca/C. It does not produce efficient code for lots
of simple things that are obvious to us assembly programmers. Also, the
optimizer sometimes exposes bugs in the code generator OR bugs in the program
itself that are usually hidden by 'forgiving' C implementations.

Using compiled languages for time-critical code never makes sense unless the
compiler is unspeakably awesome and efficient.

That C source is extremely useful -- the critical loops can be hand-compiled
into machine code and the non-critical sections can be left in C. I wrote LHG
originally in C, tested it in C, and then recoded the critical sections in
assembly. When I finally get a decent interface on the !@&*$@! thing you can
all see how well this method works. 
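
The workflow described above (write and test everything in the high-level language, then recode only the critical loop and check it against the original) can be sketched in miniature. This is an illustration only, not LHG's code: the checksum routine, the function names, and the use of Python's built-in `sum` as the "hand-tuned" version are all assumptions made for the example.

```python
import time

def checksum_reference(data: bytes) -> int:
    """Plain, obviously correct version (plays the role of the C original)."""
    total = 0
    for byte in data:
        total = (total + byte) & 0xFFFF
    return total

def checksum_tuned(data: bytes) -> int:
    """Tuned replacement for the hot loop (plays the role of the assembly)."""
    return sum(data) & 0xFFFF

def timed(func, data: bytes):
    """Run func(data) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(data)
    return result, time.perf_counter() - start

data = bytes(range(256)) * 4000

ref_result, ref_time = timed(checksum_reference, data)
tuned_result, tuned_time = timed(checksum_tuned, data)

# The reference version doubles as the test oracle for the tuned one.
assert ref_result == tuned_result
print(f"reference: {ref_time:.4f}s   tuned: {tuned_time:.4f}s")
```

The point of keeping the slow version around is the assertion: the rewrite is only trusted once it matches the original on the same input.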

Todd Whitesel
toddpw @ tybalt.caltech.edu

shrinkit@Apple.COM (Andrew Nicholas) (03/26/91)

In article <1991Mar25.222305.6194@nntp-server.caltech.edu> toddpw@nntp-server.caltech.edu (Todd P. Whitesel) writes:

>Andy, learn something about Orca/C.

I've learned lots about Orca/C.  I've learned that it's not always worthy
of my trust in the correctness of the code it is generating.  This is as of
v1.1 -- I really hope v1.2 is better.

>It does not produce efficient code for lots
>of simple things that are obvious to us assembly programmers. Also, the
>optimizer sometimes exposes bugs in the code generator OR bugs in the program
>itself that are usually hidden by 'forgiving' C implementations.

Btw, I wasn't slamming Orca/C, I was slamming the C code which I was using for
the LZH implementation -- the C code just isn't all that efficient in and of
itself.  Yeah, sure, you could go in and tweak routines to use assembly, but in
this particular case, I doubt that much would come of it -- I think you'd need
to rewrite major parts of the algorithm itself to get substantial speed
increases that people could live with.

Which, I suppose, isn't so much an indictment of the C code, but rather of
the algorithm itself -- or, at least this implementation of the LZH algorithm.

Am I making any more sense?

>Using compiled languages for time-critical code never makes sense unless the
>compiler is unspeakably awesome and efficient.

I know that, Todd.  I ain't completely stooopid.  ;-)

Also, you took what I was saying much too generally -- I wasn't saying that
all C code is slow and inefficient, or that all C object produced by Orca/C
is slow and inefficient... but in this case, both appear to be true.

>That C source is extremely useful -- the critical loops can be hand-compiled
>into machine code and the non-critical sections can be left in C.

Yes, and I still think you'll have something which is extremely slow as far as
the LZH source code goes.  I would love to have someone prove me wrong, but
I don't see that happening...

>I wrote LHG
>originally in C, tested it in C, and then recoded the critical sections in
>assembly. When I finally get a decent interface on the !@&*$@! thing you can
>all see how well this method works. 

Want to elaborate a little?  What's LHG?  Some variant of LHA?

>Todd Whitesel

andy


svetozar@pnet51.orb.mn.org (Eric C. Anderson) (03/27/91)

Hi again.  I just sent a letter to you (no, none of your letters ever arrived)
about this unzip stuff.  Just in case it doesn't get to you, I decided to
write a little message here.

Other stuff: I would tend to agree that the Orca/C optimizer is not correct. 
Nearly everyone who has ported a large application to the IIgs using Orca/C
has experienced this for him/herself; perhaps v1.2 will be different.

Speaking of LZH on the gs, has anyone attempted to make use of the source for
Haruyasu Yoshizaki's (sp?) LHA v2.11?  Compression seems to be on a par with
PKZIP, although uncompression seems 2-3 times slower (love that Turbo
Profiler).  One can only hope the current talk of hours to compress System.504
is due to the Orca/C compiler, not the underlying algorithm.  Oh well.

Other thing: as per the suggestion of some helpful folks, I've started looking
for ways to improve my Unzip thing's file i/o.  If/when the changes result in
a working program, I'll probably upload it to that "binaries" thing.

Also, I'm attempting to write some more helpful comments to the source code;
when this has been accomplished, I'll try to figure out how to upload that,
too.

Bye.


gwyn@smoke.brl.mil (Doug Gwyn) (03/28/91)

In article <1991Mar25.222305.6194@nntp-server.caltech.edu> toddpw@nntp-server.caltech.edu (Todd P. Whitesel) writes:
>Using compiled languages for time-critical code never makes sense unless the
>compiler is unspeakably awesome and efficient.

What a crock.  Practically ALL the time-critical code here is written in C
or Fortran, and the compilers aren't all that great.

toddpw@nntp-server.caltech.edu (Todd P. Whitesel) (03/28/91)

gwyn@smoke.brl.mil (Doug Gwyn) writes:

[In response to this bit that I wrote]
>>Using compiled languages for time-critical code never makes sense unless the
>>compiler is unspeakably awesome and efficient.

>What a crock.  Practically ALL the time-critical code here is written in C
>or Fortran, and the compilers aren't all that great.

Well excuse me for living!! What system are you using that's so unspeakably
awesome you use cruddy compilers for time-critical code??

Todd Whitesel
toddpw @ tybalt.caltech.edu

MQUINN@UTCVM.BITNET (03/28/91)

On Wed, 27 Mar 91 22:27:32 GMT Doug Gwyn said:
>In article <1991Mar25.222305.6194@nntp-server.caltech.edu>
> toddpw@nntp-server.caltech.edu (Todd P. Whitesel) writes:
>>Using compiled languages for time-critical code never makes sense unless the
>>compiler is unspeakably awesome and efficient.
>
>What a crock.  Practically ALL the time-critical code here is written in C
>or Fortran, and the compilers aren't all that great.

Just out of curiosity... where is "here"?

----------------------------------------
  BITNET--  mquinn@utcvm    <------------send files here
  pro-line-- mquinn@pro-gsplus.cts.com

shrinkit@Apple.COM (Andrew Nicholas) (03/28/91)

In article <4416@orbit.cts.com> svetozar@pnet51.orb.mn.org (Eric C. Anderson) writes:

>Hi again.  I just sent a letter to you (no, none of your letters ever arrived)
>about this unzip stuff.  Just in case it doesn't get to you, I decided to
>write a little message here.

Awww... I just got another one from you and sent a reply back to pnet51 and
to the marilyn machine, so we'll see if that one ever makes it through.

Anyone else try to mail anything to Eric and have it get through?  Or is it
just Apple.COM's mailer (or someone in between's)?

>Speaking of LZH on the gs, has anyone attempted to make use of the source for
>Haruyasu Yoshizaki's (sp?) LHA v2.11?  Compression seems to be on a par with
>PKZIP, although uncompression seems 2-3 times slower (love that Turbo
>Profiler).  One can only hope the current talk of hours to compress System.504
>is due to the Orca/C compiler, not the underlying algorithm.  Oh well.

I was using the source code to LZH, the first thing that Yoshi did -- I don't
have the source to anything more recent regarding lharc 2.x... is there
someplace that reliable C source can be found? 

andy


bkahn@spud.webo.dg.com (Bruce Kahn) (03/30/91)

In article <9103280259.AA19624@apple.com>, MQUINN@UTCVM.BITNET writes:
|> On Wed, 27 Mar 91 22:27:32 GMT Doug Gwyn said:
|> >In article <1991Mar25.222305.6194@nntp-server.caltech.edu>
|> > toddpw@nntp-server.caltech.edu (Todd P. Whitesel) writes:
|> >>Using compiled languages for time-critical code never makes sense unless the
|> >>compiler is unspeakably awesome and efficient.
|> >
|> >What a crock.  Practically ALL the time-critical code here is written in C
|> >or Fortran, and the compilers aren't all that great.
|> 
|> Just out of curiosity... where is "here"?
|> [Comments from others also deleted...] 

  'Here' in my case is Data General.  We do almost all our coding in C (from
the systems & apps level down to the streams and kernel level).  The only time
we have NOT done this (that I know of) is w/our PC transports (NetBIOS, done
w/assembly, etc).

  The main reason we use C over assembly is twofold:
1) While assembly is faster (excluding an 'unspeakably awesome and efficient'
compiler), it is not as easy to maintain and follow as higher-level
languages like C or Pascal (not that I know of anyone here using
Pascal).  Most people can read C code from different platforms but they don't
always have experience w/the assorted lower level languages found on different
systems (Mac, 88K, 6502/65816, Z80, ECLIPSE, etc).

2) Assembly is nowhere near as 'portable' as the higher level languages so if we
want to use the code over again elsewhere, unless it's on the same h/w platform,
that just won't happen.  Kinda like writing a library full of books but not
using a language that others can easily read and follow...

-- 
Bruce <I-wont-give-my-middle-initial> Kahn   Phone (508) 870-6488
NSDD / OpenLAN                            Internet bkahn@archive.webo.dg.com
Data General Corporation, Westboro MA USA
Standard disclaimers still apply, even where prohibited by law...

rhyde@ucrmath.ucr.edu (randy hyde) (03/30/91)

Gee, Todd don't get so defensive! (:-))

First, let me state that I agree with you.  Time critical code should be
written in assembly language.  Anyone who thinks that's a crock is fooling
themselves.

Now, let me have some fun with you.  Have you ever wondered why the GS isn't
selling as well as you feel it should?  It's because there is better software
available for Macs and PCs.  Why is this?  Well, normal applications like high-
end desktop publishing programs, massive spreadsheets, and power-user databases
are "time-critical" applications.  If they ran too slow, no one would use them.
On a GS, just about any major application would run too slow if written in C
(especially given the quality of the C compilers available for the GS).
However, it is *almost* possible to write reasonably fast software in C or
Pascal (or whatever HLL) on a fast PC or MAC.  Since people (in general) have
so little assembly experience, they insist on writing their code in HLLs.  That's
why there is lots of software for the Mac and PC and almost none for the GS
(Interesting point, back in the days of slow 8088's and lousy 8086 compilers,
back when most people still programmed the PC in assembly language, there was
a lot of development on the Apple II as well).

One thing really amazes me: people's insistence on writing Operating Systems
in HLLs like C.  It is actually *easier* to write most of the OS in assembly
than it is to write it in C.  It's just that modern day OS writers are
incompetent assembly language programmers so they force C to do the job for them.
I know.  In my introductory OS class I've made my students write their code in
assembly (8086).  They bitch like mad at the beginning of the quarter about this,
but by the end of the quarter a majority (certainly not all) of the students
thank me for putting them through it because they were able to master assembly
concepts on a real project like an OS.

Most people who refuse to use assembly language on anything, or even as
little as possible, are displaying their ignorance.  All languages have
their place.  I make a big stink about assembly because someone (a prophet
perhaps) needs to keep yelling in the wilderness supporting this subject.

toddpw@nntp-server.caltech.edu (Todd P. Whitesel) (03/30/91)

rhyde@ucrmath.ucr.edu (randy hyde) writes:

>Gee, Todd don't get so defensive! (:-))

Actually, I was trying to be polite. You should have seen the first response
I wrote but didn't send.

>One thing really amazes me: people's insistence on writing Operating Systems
>in HLLs like C.  It is actually *easier* to write most of the OS in assembly
>than it is to write it in C.  It's just that modern day OS writers are
>incompetent assembly language programmers so they force C to do the job for
>them.

You're ignoring the biggest advantage C has over assembly, Randy. C is a
PORTABLE high-level language whose primitives are geared towards reasonably
efficient object code. I learned assembly first, then C; C's treatment of
pointers and integer math made infinite amounts of sense to me, especially
after many skirmishes with Pascal. To me, C is the best damn assembly macro
set ever invented (and then some).

>Most people who refuse to use assembly language on anything, or even as
>little as possible, are displaying their ignorance.  All languages have
>their place.  I make a big stink about assembly because someone (a prophet
>perhaps) needs to keep yelling in the wilderness supporting this subject.

I cannot agree with this completely. There are many applications for which
portability and development time are slightly higher in priority than raw
performance. I do agree that all languages have their place, but it is
important to realize that portable HLL's have a much more important place
in commercial software today. My guess is that the trend is going to be
towards better compilers and towards CPU's designed to successfully support
those compilers better.

Todd Whitesel
toddpw @ tybalt.caltech.edu

gwyn@smoke.brl.mil (Doug Gwyn) (03/30/91)

In article <13156@ucrmath.ucr.edu> rhyde@ucrmath.ucr.edu (randy hyde) writes:
>Most people who refuse to use assembly language on anything, or even as
>little as possible, are displaying their ignorance.  All languages have
>their place.

Assembly language does have its place, which is quite limited in scope.
Far from being "incompetent", as you called them, software developers
use HLLs because they learned the lessons of the late 1960s and early
1970s.  I have probably written more assembly language code for more
distinct computer architectures than you ever will, but you won't catch
me using it these days unless a PORTABLE language will simply not do the
job.  Consider what would have happened if UNIX had been left coded in
PDP-11 assembly language; it certainly would not have become the near-
universal operating system that it now is.  According to your notions,
it would have been justified to use assembler just for the small increase
in speed that would have been attained (in this case, we actually have
some measured numbers to show how little speed was lost in the rewrite
into C).  According to the rest of us, that small amount of improved
speed would not have been worth the impediment to further development and
especially portability that it would have caused.

declan@remus.rutgers.edu (Declan McCullagh/LZ) (03/31/91)

In article <13156@ucrmath.ucr.edu>, rhyde@ucrmath.ucr.edu (randy hyde) writes:

> However, it is *almost* possible to write reasonably fast software in C or
> Pascal (or whatever HLL) on a fast PC or MAC.

This is rapidly outgrowing comp.sys.apple2; perhaps followups should
be redirected to a more appropriate newsgroup...

There's a substantial trade-off between development time and actual
executing time: to write enough of an application in assembly to make
a critical difference, it will often take you a considerably longer
amount of time.  (And, of course, it will make your application harder to
maintain and _greatly_ hinder portability.)

Now, company X is writing a C++ or Objective-C version of a similar
application.  They'll have it out before you, and people will buy it,
and when yours comes out - six months or a year later - newer versions
of the other application will already be out, with more features and
goodies.  Plus, they can update it more easily and port it to other
platforms _much_ more easily than you can.

Let's look at the NeXT, with an award-winning software development
environment.  Pure Objective-C.  Yep, the performance isn't that
great on a slow computer, but all NeXTs now have 68040s.  And
companies can bring their product to market in an unprecedentedly short amount
of time.  Considerable benefit there: take a look at Improv, Word
Perfect, and TouchType...

Plus, you can recompile a NeXTstep application and run it (if written
correctly) without any modification on an entirely different platform.
That's portability for you.

> One thing really amazes me: people's insistence on writing Operating Systems
> in HLLs like C.  It is actually *easier* to write most of the OS in assembly
> than it is to write it in C.

I'd like to see some references to back up that claim.

In any case, take a look at Mach, for example.  It's available for
anonymous FTP from CMU.  It's been compiled for a variety of machines,
including SPARCstations, 386s, and so on - and they even have it
running on Macs there, too.

It's quite efficient for most tasks, and benefits from being written
in a high level language - can you imagine trying to port a
SPARCstation version to a Mac or 386?

> Most people who refuse to use assembly language on anything, or even as
> little as possible, are displaying their ignorance.  All languages have
> their place.  I make a big stink about assembly because someone (a prophet
> perhaps) needs to keep yelling in the wilderness supporting this subject.

I think it's great that you're advocating learning assembly language;
it's often helpful for computer science students to learn it.  As a
class at a university and as a concept, it's great.  However, for
real-world commercial applications, it's not.

-Declan
 

dvac@druwa.ATT.COM (Daniel Vachon) (04/02/91)

If anyone out there has something that can take care of .LZH or .ZIP files,
on an Apple ][+ or Apple //e (no gs folks), please email me a uuencoded or
binscii'd copy of it.  (or post it to the binaries group).   I have scraped
up programs to kill ZOO and ARC, but I really need ZIP and LZH too.
Shrinkit couldn't do it (as I figured, but I thought I would try since it
seems Andy makes it pretty extensive).

SO, please let me know if you have anything that can nip those two formats
in the bud.

- Dan Vachon   dvac@druwa.ATT.COM