jdeitch@pnet01.cts.com (Jim Deitch) (07/17/89)
ast@cs.vu.nl (Andy Tanenbaum) writes:
>
>I have gone through the net postings and found a number of useful programs.
>I have this fear that the next distribution is going to take 20 disks and cost
>$149. I have also noticed that compress only seems to win a factor of 2.
>I wonder if better compression of C programs is possible. In particular,
>suppose we had a two pass compression program. Pass 1 collected all the
>strings in the program and their frequencies. Pass 2 replaced the most
>important string by ASCII code 128, the next most important one by 129 etc,
>sort of like libpack.c does, only dynamically instead of using fixed strings.
>Most important means that the product of # occurrences times length is
>maximum. The compressed file would include regular ASCII characters 0-127,
>and 120 or so characters denoting strings. The last 8 could be for
>1 tab, 2 tabs, 3 tabs, etc., and a two byte sequence for encoding runs of
>blanks. The dictionary would have to go too. One would also have to think
>carefully about what a string is, i.e., is "while" a string, or "while ("
>a better choice? It is my suspicion that such a program could compress
>better than a factor of 2 on C programs.
>
>Does anyone know of such a program, or want to write one?
>
>Andy Tanenbaum (ast@cs.vu.nl)

Why not use a 16 bit compression? I have a version that will compile on the
XT, AT under dos, and supports 16 bit. Also, has anyone tried to port the
Atari 16 bit that went through the net awhile ago to the IBM?

UUCP: {nosc ucsd hplabs!hp-sdd}!crash!pnet01!jdeitch
ARPA: crash!pnet01!jdeitch@nosc.mil
INET: jdeitch@pnet01.cts.com
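The two-pass scheme proposed in the quoted text can be sketched roughly as follows. This is only an illustration, not anyone's actual program: the function names, the 120-entry limit taken from the post, and the choice to treat a "string" as a C-identifier-like run of two or more characters are all assumptions, and the tab/blank run codes are left out.

```python
import re
from collections import Counter

# Assumption: a "string" is an identifier-like run of at least two characters.
TOKEN = re.compile(r"[A-Za-z_][A-Za-z_0-9]+")

def build_dictionary(text, max_entries=120):
    """Pass 1: rank candidate strings by (occurrences * length)."""
    counts = Counter(TOKEN.findall(text))
    ranked = sorted(counts, key=lambda w: (-counts[w] * len(w), w))
    # A string seen only once cannot pay for its dictionary entry.
    return [w for w in ranked if counts[w] >= 2][:max_entries]

def compress(text, dictionary):
    """Pass 2: replace each dictionary string by one byte 128, 129, ..."""
    code = {w: 128 + i for i, w in enumerate(dictionary)}
    out = bytearray()
    pos = 0
    for m in TOKEN.finditer(text):
        out += text[pos:m.start()].encode("ascii")  # input must be 7-bit ASCII
        word = m.group()
        if word in code:
            out.append(code[word])
        else:
            out += word.encode("ascii")
        pos = m.end()
    out += text[pos:].encode("ascii")
    return bytes(out)

def decompress(data, dictionary):
    """Inverse of compress(); the dictionary has to travel with the data."""
    return "".join(dictionary[b - 128] if b >= 128 else chr(b) for b in data)
```

Because the dictionary must be shipped along with the compressed file, the net gain on a small file can be less than the byte counts suggest; the sketch ignores that cost.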
ast@cs.vu.nl (Andy Tanenbaum) (07/17/89)
In article <4642@crash.cts.com> jdeitch@pnet01.cts.com (Jim Deitch) writes: >Why not use a 16 bit compression? I have a version that will compile on the >XT, AT under dos, and supports 16 bit. Also, has anyone tried to port the >Atari 16 bit that went through the net awhile ago to the IBM? None of the 16 bit compression programs that I have seen fit in 64K + 64K. They all need more table space. MINIX does not support large model and probably never will on 8088/80286 CPUs. If you have one that runs on MINIX and gives appreciably larger compression, great. Andy Tanenbaum (ast@cs.vu.nl)
henry@utzoo.uucp (Henry Spencer) (07/19/89)
In article <2890@ast.cs.vu.nl> ast@cs.vu.nl (Andy Tanenbaum) writes:
>>Why not use a 16 bit compression? ...
>
>None of the 16 bit compression programs that I have seen fit in 64K + 64K.

Also, 16-bit compression really does not work that much better than 12.
It helps, but the difference isn't striking.
-- 
$10 million equals 18 PM       |     Henry Spencer at U of Toronto Zoology
(Pentagon-Minutes). -Tom Neff  | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
jca@pnet01.cts.com (John C. Archambeau) (07/19/89)
ast@cs.vu.nl (Andy Tanenbaum) writes: >None of the 16 bit compression programs that I have seen fit in 64K + 64K. >They all need more table space. MINIX does not support large model and >probably never will on 8088/80286 CPUs. If you have one that runs Large model not being supported in 8086 real mode is understandable, you only have 640K (roughly) of memory for processes if you don't use expanded memory. But with protect mode, you have 16 Mb of physical memory to play with. If SCO Xenix 286, MicroPort Unix/286, and MS-DOS 3.30 have C compilers that generate large model code, then why is it too much to ask to have such a compiler for 286 protect mode? In fact, to fully implement Minix 286 in protect mode, you'll have to rewrite asld to support the 286 instruction set anyway. I know segmentation is a pain, but it's not by any means impossible. I agree with your position that Minix should first and foremost be a teaching tool, but having a crippled C compiler isn't going to help matters any, especially when the expandability factor of going to 286, 386, and 486 protect modes comes into play. The current C compiler is probably sufficient for 8086 real mode, but it's a far cry from what can be done and what is usable under protect mode. GCC will probably suffice for 386 and 486 protect mode, but what about 286 protect mode? This is one of those in the middle cases. You have 16 Mb of physical memory, in just 20% of that space, a lot can be done, you can have a few large programs running. I agree the subject isn't trivial by any means, but it is doable. And I would like to see it done even if it means doing it myself. Unfortunately, I have found little documentation of how to go about writing a smart linker. Another border line case...I've seen a lot of debate on whether or not the linker is a piece of the compiler or a piece of the operating system. Your insight along with anyone else's will be appreciated as always... 
 /*--------------------------------------------------------------------------*
  * Flames: /dev/null (on my Minix partition)
  *--------------------------------------------------------------------------*
  * ARPA : crash!pnet01!jca@nosc.mil
  * INET : jca@pnet01.cts.com
  * UUCP : {nosc ucsd hplabs!hd-sdd}!crash!pnet01!jca
  *--------------------------------------------------------------------------*/
rbthomas@athos.rutgers.edu (Rick Thomas) (07/20/89)
The MS-DOS / UNIX-286(all flavors) concept of "memory size models" for the i286 is an abomination, and I strongly resist any attempt to add such a thing to Minix! For the 80[34]86, and above, there is no problem. All code is one model, because segments can be any "reasonable" size (though some of the Supercomputer users around here are talking seriously of 32 bits not being enough for addresses... (8*) Minix on a Cray-2 anyone?) So we just use GCC and be done with it for these machines. For the 8086 and 80286 there are two solutions that are acceptable. Multiple memory models (a concept which the ANSI C standards committee wisely refrained from including in their language) is not acceptable. The first acceptable solution is the current Minix solution, namely 64K-i/64K-D with 16 bit pointers and 16 bit int's and that's that; the second is to support arbitrary i and d by code-generator kludges with 32-bit pointers and int's. There is too much existing code that assumes sizeof(int) == sizeof (char *) to allow anything with 16-bit int's and 32-bit pointers or vice versa. The required code-generator kludges include things like adding the separate halves of an int sum in separate registers, and dis-assembling pointers when they are loaded into registers. Multiplication of int's becomes a genuine nightmare! (I don't even want to think about division!) The problem with the first solution is that it doesn't support big programs. The problem with the second solution is that it is *SLOW*. (I have talked to some people who have tried it, and they claim a factor of 3 or 4 performance hit over the solution 1 approach.) We simply have to admit that there are a lot of brain-damaged machines out there and accept their limitations. The "teaching" value of Minix would be totally destroyed by having a compiler that required a two week short course to describe its non-standardness, and a lifetime to discover all its pitfalls. Personally, I prefer the first solution (current Minix). 
Small IS beautiful, in this case. Rick -- Rick Thomas uucp: {ames, att, harvard}!rutgers!jove.rutgers.edu!rbthomas internet: rbthomas@JOVE.RUTGERS.EDU bitnet: rbthomas@zodiac.bitnet Phone: (201) 932-4301
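The "code-generator kludges" mentioned in the post above, building 32-bit int arithmetic out of 16-bit register operations, can be illustrated with a small sketch. The function names are made up for illustration, and Python integers stand in for 16-bit registers; a real code generator would emit an add followed by an add-with-carry, and three 16x16 multiplies.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF

def add32_via_16(a, b):
    """Add two 32-bit values using only 16-bit halves and a carry."""
    lo = (a & MASK16) + (b & MASK16)
    carry = lo >> 16                      # 0 or 1, like the CPU carry flag
    hi = ((a >> 16) & MASK16) + ((b >> 16) & MASK16) + carry
    return ((hi & MASK16) << 16) | (lo & MASK16)

def mul32_via_16(a, b):
    """Low 32 bits of a*b from three 16x16 -> 32 bit multiplies."""
    a_lo, a_hi = a & MASK16, (a >> 16) & MASK16
    b_lo, b_hi = b & MASK16, (b >> 16) & MASK16
    # a_hi * b_hi only affects bits 32 and up, so it is never computed.
    cross = (a_lo * b_hi + a_hi * b_lo) & MASK16
    return (a_lo * b_lo + (cross << 16)) & MASK32
```

Every int operation turning into several instructions like this is consistent with the slowdown reported in the post, though the sketch says nothing about the actual factor.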
ast@cs.vu.nl (Andy Tanenbaum) (07/20/89)
In article <4660@crash.cts.com> jca@pnet01.cts.com (John C. Archambeau) writes: >Your insight along with anyone else's will be appreciated as always... I regard the 286 as a transient. It and its weird little segments will eventually pass from view. I don't want to do any work on the compiler or system to cater to it. Too much trouble. Andy Tanenbaum (ast@cs.vu.nl)
jca@pnet01.cts.com (John C. Archambeau) (07/21/89)
The large model compiler doesn't have to be part of the standard
distribution; it can be for people like myself with the 'warped' sense of
what the C compiler for protect mode Minix should be.

My argument is the usability factor. What can you do with the current Minix
C compiler besides recompile the kernel? Nothing really. In fact, things that
will run in the separate I&D memory model won't compile in some cases. Case
in point: Bison (GNU Yacc) will run in a small memory model under Turbo C 1.5,
which is similar to separate I&D, but because of the separate I&D limitations
of each pass of the Minix C compiler, it will not compile (i.e. output.c will
not compile; opt bombs on it and runs out of memory).

If Minix is to be used for ONLY educational purposes, then the current
compiler is quite reasonable, but you and I both know that Minix has gotten
bigger and better than that. Ethernet support is part of the standard
distribution along with RS-232 support. Minix is becoming a REAL operating
system rather than an educational tool. I know that Minix will not stop where
it is, but will keep on growing.

So therefore, I want a REAL C compiler for it that will fully utilize the
idiosyncrasies of the 80286 architecture, and since the problems are
idiosyncratic to the 80286, I believe they can be isolated to a linker that is
smart enough to determine processor type. Of course, the memory manager
task will have to be made 'smarter' when protect mode is used.

The bottom line is that protect mode is being wasted and I don't like waste.
The C compiler can be more than it is...regardless of how 'messy' it will get.
Intel screwed up with the 80286, but there are workarounds.
ast@cs.vu.nl (Andy Tanenbaum) (07/23/89)
In article <4671@crash.cts.com> jca@pnet01.cts.com (John C. Archambeau) writes:
>What can you do with the current Minix C compiler besides recompile
>the kernel? Nothing really,

At last count there were about 150 "standard" MINIX programs you could compile.

>I want a REAL C compiler for it that will fully utilize the idiosyncrasies
>of the 80286 architecture

I, for one, am absolutely, irrevocably, unequivocally, and adamantly
opposed to that mentality, except for people over 65, who grew up with
the ENIAC and MARK I generation. If there is anything we have learned about
computer science in the past 20 years, it is that programs should be
portable. A program written in a high-level language, be it C, Pascal, Ada,
or even FORTRAN, should not be tailored to one specific architecture,
and certainly not one on the way out.

I am familiar with too many examples of people who program in their
manufacturer's extended FORTRAN and find it strange when some years later
they have to move to a new machine and nothing works. If you want to tie
your code to the 80286 architecture, in 5 years when you have a SPARC or
a MIPS or an 88000 or something like that in your low-end PC, you're gonna
be sorry, and you'll get no sympathy from anyone who has any concept of
proper programming practice.

Andy Tanenbaum (ast@cs.vu.nl)
throopw@dg-rtp.dg.com (07/23/89)
> jdeitch@pnet01.cts.com (Jim Deitch)
>> ast@cs.vu.nl (Andy Tanenbaum)
>>suppose we had a two pass compression program. Pass 1 collected all the
>>strings in the program and their frequencies. Pass 2 replaced the most
>>important string by ASCII code 128, the next most important one by 129 etc,
>>[...] [..also, what is a string..], i.e., is "while" a string, or "while ("
>>a better choice? It is my suspicion that such a program could compress
>>better than a factor of 2 on C programs.

I wrote such a program for just such a purpose. However, I abandoned it
when it failed to outperform 16-bit modified Lempel-Ziv compression (as
embodied in the widely distributed "compress" program), at least for
the cases I tried it on.

I also fooled around with quite a few choices of how to break the input
stream into "words" or "tokens". The most effective seemed to be to use
alphabetic to non-alphabetic (or vice versa) transitions for the token
boundaries.

Now, I was trying to compress a news stream, and my compression scheme
came *close* to 16-bit compression, hovering around 1.8-to-1. It might do
better on C programs as opposed to English text interspersed with headers.
After all, C programs have a smaller and more repetitive vocabulary than
does standard English (or even Net-ese).

If anybody is interested, I can send my prototype code along to them
(though I'm not terribly proud of it... it was just an afternoon's
experiment in hopes of getting 3 or 5 -to-1 news compression that didn't
work out).

> Why not use a 16 bit compression? I have a version that will compile on the
> XT, AT under dos, and supports 16 bit. Also, has anyone tried to port the
> Atari 16 bit that went through the net awhile ago to the IBM?

I presume the reason is that it won't fit into a minix address space on
the versions of minix for IBM-PC descended machines. However, as I wrote
my version of token-based compression, it shared the Lempel-Ziv algorithm's
taste for a large working set (I stored the "dictionary" in memory). This
could be improved upon I suppose.
-- 
Wayne Throop <the-known-world>!mcnc!rti!xyzzy!throopw
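The token-boundary rule described in the post above (split wherever the input switches between alphabetic and non-alphabetic characters) is easy to sketch. This is not the prototype mentioned in the post; the function name is hypothetical and the input is assumed to be plain ASCII text.

```python
from itertools import groupby

def split_tokens(text):
    """Split text at every alphabetic <-> non-alphabetic transition."""
    # groupby() starts a new group each time str.isalpha flips.
    return ["".join(group) for _, group in groupby(text, key=str.isalpha)]

# split_tokens("while (i < n)") -> ['while', ' (', 'i', ' < ', 'n', ')']
```

Concatenating the tokens always gives back the original text, so a dictionary coder built on top of this split loses nothing. Note that punctuation runs such as " (" become tokens in their own right, which bears on the "while" versus "while (" question raised earlier in the thread.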
jca@pnet01.cts.com (John C. Archambeau) (07/23/89)
ast@cs.vu.nl (Andy Tanenbaum) writes:
>In article <4671@crash.cts.com> jca@pnet01.cts.com (John C. Archambeau) writes:
>>What can you do with the current Minix C compiler besides recompile
>>the kernel? Nothing really,
>
>At last count there were about 150 "standard" MINIX programs you could compile.
>
>>I want a REAL C compiler for it that will fully utilize the idiosyncrasies
>>of the 80286 architecture
>I, for one, am absolutely, irrevocably, unequivocally, and adamantly
>opposed to that mentality, except for people over 65, who grew up with
>the ENIAC and MARK I generation. If there is anything we have learned about
>computer science in the past 20 years, it is that programs should be
>portable. A program written in a high-level language, be it C, Pascal, Ada,
>or even FORTRAN, should not be tailored to one specific architecture,
>and certainly not one on the way out.

Maybe I should have clarified my position. What I want is a smart
compiler/linker that will utilize the FULL power of the processor. If the
compiler/linker detects an 8086 then the appropriate code will be generated
for that CPU. If there's a 286 then it will compile to 286 protect mode unless
otherwise specified (and this code will also run under a 386 and a 486 because
of Intel's nice design policy of maintaining downward compatibility); the same
goes for 386 and so on down the line until Intel abandons the iAPX 86 family.

You and I both know that the compiler/linker is both OS and hardware
dependent. I want a compiler/linker that will spot what CPU/Minix version is
in use (i.e. stock PC, 286 protect mode, 386, and so on). The ideal is to
have a general purpose compiler/linker for Minix on the iAPX 86 family of
CPUs.
hedrick@geneva.rutgers.edu (Charles Hedrick) (07/24/89)
>>I want a REAL C compiler for it that will fully utilize the idiosyncrasies
>>of the 80286 architecture
>I, for one, am absolutely, irrevocably, unequivocally, and adamantly
>opposed to that mentality, except for people over 65, who grew up with
>the ENIAC and MARK I generation.

I think something's missing here. The whole point of a compiler is to
use the architecture of the target machine. Since when did we consider
it a virtue for a compiler not to fully utilize the idiosyncrasies of
the machine it's compiling code for? It's not the object code you
want to be independent of the architecture, but the source code.

I've got a reasonable amount of experience playing with a System V
port for the 286. This is a pretty good indication of what a Minix
with large model would look like. There are certainly limitations to
what you can do, but there are limitations to what you can do with the
Minix compiler too, and the limitations of large model are much less
serious. I'm not a great fan of the 286 either. But if you write in
a higher level language, you are pretty much isolated from the oddities
of the chip.

I have brought up a lot of random code from the net. There's certainly
code that assumes int's are 32 bits. But that code won't work with the
small model Minix compiler either. All the code I've seen that is written
to allow 16 bit int's at all works with a large model compiler. I admit
that in principle that need not be true. If you store pointers in int's,
you'll lose. But people have learned enough about portable programming
that you don't see that in practice. At least not in programs intended
to be used with 16 bit ints. When people need a generic pointer, they
have enough sense to use char * rather than int.

The real issue is whether it's worth the time to produce a large model
Minix compiler, and whether it's worth the disk space to have two
versions of the libraries. I'm willing to listen to the argument that
it's not. But to consider it wrong for a compiler to compile code that
uses the target architecture seems strange, indeed.
jk0@image.soe.clarkson.edu (Jason Coughlin) (07/24/89)
From article <4679@crash.cts.com>, by jca@pnet01.cts.com (John C. Archambeau): > I want a compiler/linker that will spot what CPU/Minix > version (i.e. stock PC, 286 protect mode, 386, and so on). The ideal is to > have a general purpose compiler/linker for Minix on the iAPX 86 family of > CPUs. Weeeeeelllllll, if you start now with the i286 "Black Book", you should have a preliminary version done for us in, ummmmmmmmmmm, a year maybe? :-) -- Jason Coughlin ( jk0@sun.soe.clarkson.edu , jk0@clutx )
brog@sloth.UUCP (Chris Brougham) (09/21/90)
Can anyone tell me if minix 1.5 for the PC runs a 16 bit compress-- this is important for my newsfeeds? Also, I've heard that b-news runs on minix. I took a look at the Makefile for b-news and it had flags to set for System 7, would these be what I would uncomment to compile b-news 2.11 for minix? Thanks for all your help... ___ cb... van-bc!sloth!brog <<<<<<<<<<<<<<<>>>>>>>>>>>>>>>> brog@sloth.wimsey.bc.ca
eyal@echo.canberra.edu.au (Eyal Lebedinsky) (09/21/90)
In article <8BmTP2w163w@sloth.UUCP> brog@sloth.UUCP (Chris Brougham) writes:
>Can anyone tell me if minix 1.5 for the PC runs a 16 bit compress--
>this is important for my newsfeeds?
>
No. Only small model supported. I think it can do 12 (or 13) bits max.

>van-bc!sloth!brog <<<<<<<<<<<<<<<>>>>>>>>>>>>>>>> brog@sloth.wimsey.bc.ca

Regards
Eyal
-- 
Regards
Eyal
michael@sunham.germany.sun.com (Michael Joswig (Vertriebsunterstuetzung Hamburg)) (09/26/90)
Eyal (???) writes:
>> In article <8BmTP2w163w@sloth.UUCP> brog@sloth.UUCP (Chris Brougham) writes:
>>Can anyone tell me if minix 1.5 for the PC runs a 16 bit compress--
>>this is important for my newsfeeds?
>>
>No. Only small model supported. I think it can do 12 (or 13) bits max.

Sorry to correct you, but I ftp'ed a 16bit compress some days
ago which works fine. If more people are interested, I can
try to remember where I got it (It must be one of those ftp-markets
mentioned in the Info_Sheet)(Or not...).
It was named "comress"; that's all I can say for now.

Bye
Michael.
jds@mimsy.umd.edu (James da Silva) (09/26/90)
In article <31571@nigel.ee.udel.edu> michael@sunham.germany.sun.com (Michael Joswig (Vertriebsunterstuetzung Hamburg)) writes:
>
>Eyal (???) writes:
>>> In article <8BmTP2w163w@sloth.UUCP> brog@sloth.UUCP (Chris Brougham) writes:
>>>Can anyone tell me if minix 1.5 for the PC runs a 16 bit compress--
>>>this is important for my newsfeeds?
>>>
>>No. Only small model supported. I think it can do 12 (or 13) bits max.
>
>Sorry to correct you, but I ftp'ed a 16bit compress some days
>ago which works fine. If more people are interested, I can
>try to remember where I got it (It must be one of those ftp-markets
>mentioned in the Info_Sheet)(Or not...).
>It was named "comress"; that's all I can say for now.
>

Michael,

Are you CERTAIN that this version of compress you mention will uncompress
16bit-compressed files under normal PC Minix? You aren't talking about DOS
or 32-bit Minix?

If so, please do let us know where you got it. Such a beast would be quite
a feat of bit-bumming: the normal compress wants more than 400k of data
space to do 16bit compress/uncompress.

Jaime
...........................................................................
: domain: jds@cs.umd.edu                                   James da Silva
: path: uunet!mimsy!jds                   Systems Design & Analysis Group
michael@sunham.germany.sun.com (Michael Joswig (Vertriebsunterstuetzung Hamburg)) (09/27/90)
>>> Jaime (jds@cs.umd.edu) asked
>Michael,
>
>Are you CERTAIN that this version of compress you mention will uncompress
>16bit-compressed files under normal PC Minix? You aren't talking about DOS
>or 32-bit Minix?
>
>If so, please do let us know where you got it. Such a beast would be quite
>a feat of bit-bumming: the normal compress wants more than 400k of data
>space to do 16bit compress/uncompress.
>
>Jaime

Well, I use this compress on my Atari ST and it seems to work fine. There
is an option "-b n", where n means the maximal bit-number and ranges up
to 16. But I must say that just yesterday (when this discussion started) I
had small problems with BIG compressed files which were compressed on one of
these neat sparcstations. I have no idea why, or whether it's my fault or
the program's...

I have a headache now, because I thought so hard about where I got it!
But I think, it was "dsrgsun.ces.cwru.edu" and this leads to our next
problem: Yesterday I tried to ftp this site. I connected, but it was
*EMPTY*!!!!!!!!!!

If there is somebody from this site on the net, please tell me what
happened. My next session is on Monday (I can read this newsgroup only twice
a week, but I read all!), and if there is no response from the site, I will
post the whole lot (I cannot tell you if there is a Makefile for PC).

Ciao,
Michael.
fortinp@bwdls56.bnr.ca (Pierre Fortin) (09/28/90)
In article <31676@nigel.ee.udel.edu>, michael@sunham.germany.sun.com (Michael Joswig (Vertriebsunterstuetzung Hamburg)) writes:
>
> But I think, it was "dsrgsun.ces.cwru.edu" and this leads to our next
> problem: Yesterday I tried to ftp this site. I connected, but it was
> *EMPTY*!!!!!!!!!!

I just connected to this machine. The *real* problem is that "dir" and "ls"
are not producing output. Here's part of my session:

ftp> dir
200 PORT command successful.
150 ASCII data connection for /bin/ls (x.x.x.x,p) (0 bytes).
226 ASCII Transfer complete.
ftp> ls
200 PORT command successful.
150 ASCII data connection for /bin/ls (x.x.x.x,p) (0 bytes).
226 ASCII Transfer complete.
ftp> cd pub/atari
250 CWD command successful.
ftp> pwd
257 "/pub/atari" is current directory.
ftp> get README ! >more README
200 PORT command successful.
150 ASCII data connection for README (x.x.x.x,p) (3327 bytes).
226 ASCII Transfer complete.
local: test remote: README
3419 bytes received in 9.9 seconds (.34 Kbytes/s)

This directory contains all the binaries/sources/diffs for the gnu C compiler
(v1.37) and associated libraries and utilities:

Please download all files in binary mode. Please use zoo to extract and see
the README/Changelog files in each archive for more details.

When using zoo to extract files make a sub-directory, cd to it and extract
using zoo with paths
        zoo e//. file.zoo
please remember to use the "//." modifiers to the zoo 'e' command to
extract full paths, anchored at the current directory, with zoo creating
sub-directories as needed.

bin.zoo            compiler/utilities binaries
                   bin.zoo is a split binary file.
                   get all the parts bin.0?zoo
                   concatenate them into 1 zoo file
                       cat bin.0?zoo > bin.zoo
                   extract with zoo in appropriate subdir
                       zoo x//. bin.zoo
Documentation.zoo  docs for compiler/utils/library
                   (updated regularly, need TeX to run these off)
compress.zoo       16 bit compress for TOS & MINIX
curses.zoo         curses/termcap/widget library source
gasdiff.zoo        diffs from gnu gas (assembler) V1.36
gassrc.zoo         source for gnu gas 1.36
gccdiff.zoo        diffs from gcc V1.37
gdb.zoo            the Gnu source level debugger for TOS
gemlib.zoo         gem bindings source
gnutar.zoo         gnutar for TOS
include.zoo        include files for library
lib.zoo            library objects
libsrc.zoo         library source
mkptypes.zoo       make ansi C prototypes
patch.zoo          larry walls patch for TOS
pml.zoo            portable math library
util.zoo           source for utilities
gcc-137.??zoo      gcc V1.37.1 source. cat all the parts together
                   after download:
                       cat gcc-137.??zoo > gcc-137.zoo
                   then un'zoo :
                       zoo e//. gcc-137.zoo
                   in appropriate sub-directory

[lots deleted]

Oh well, looks like we have to work blind for a while(?). I think there are
README files in most directories, so try pulling those across and then ask
for the specific files you want...

> Ciao,
> Michael.

Ciao,
Pierre

Pierre Fortin            P.O.Box 3511, Stn C   (613)763-2598
Internet Systems         Ottawa, Ontario       RIP: aptly named protocol
Bell-Northern Research   Canada K1Y 4H7        AppleTalk: Adam&Eve design
jds@mimsy.umd.edu (James da Silva) (09/28/90)
In article <31676@nigel.ee.udel.edu> michael@sunham.germany.sun.com (Michael Joswig (Vertriebsunterstuetzung Hamburg)) writes: > >Jaime (jds@cs.umd.edu) asked >>Are you CERTAIN that this version of compress you mention will uncompress >>16bit-compressed files under normal PC Minix? You aren't talking about DOS >>or 32-bit Minix? >> >>If so, please do let us know where you got it. Such a beast would be quite >>a feat of bit-bumming: the normal compress wants more than 400k of data >>space to do 16bit compress/uncompress. > >Well, I use this compress on my Atari ST and it seems to work fine. There >is an option "-b n", where n means the maximal bit-number and ranges up >to 16. Ah, that explains it. The original person you "corrected" was running on IBM-PC Minix, which *does* have a 64k data space restriction, and *cannot* do 16-bit compression. So your correction was not correct. The fact that Atari-ST Minix is capable of 16-bit compression is not surprising. The stock off-the-net compress program should compile and run fine under ST Minix. At least, it compiles and runs fine under Bruce's compiler on 32-bit 386 Minix. Jaime
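The numbers quoted in this thread (more than 400k for 16-bit compress, 12 or 13 bits at most in a 64k data space) can be checked with a little arithmetic. The sketch below assumes the table layout of the widely distributed compress 4.0 as best recalled: one array of 4-byte longs and one of 2-byte shorts, each with HSIZE entries, with the HSIZE values listed here. Treat both the layout and the values as assumptions to be verified against the source you actually have.

```python
# Assumed hash table sizes per maximum code width (recalled from compress 4.0).
HSIZE = {12: 5003, 13: 9001, 14: 18013, 15: 35023, 16: 69001}

LONG_BYTES = 4    # assumed size of one hash table entry
SHORT_BYTES = 2   # assumed size of one code table entry

def table_bytes(bits):
    """Bytes taken by the two big tables for a given maximum code width."""
    return HSIZE[bits] * (LONG_BYTES + SHORT_BYTES)

def max_bits_that_fit(limit):
    """Largest code width whose tables alone fit within `limit` bytes."""
    return max(b for b in HSIZE if table_bytes(b) <= limit)

for bits in sorted(HSIZE):
    print(bits, table_bytes(bits))
print(max_bits_that_fit(64 * 1024))
```

Under these assumptions the 16-bit tables come to 414006 bytes, and 13 bits is the widest setting whose tables fit under 64k, which matches what the posts report. The rest of the data segment (buffers, stack) also needs room, so the real margin is tighter than the table sizes alone suggest.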