pollarda@physc1.byu.edu (06/15/91)
PKZip, as well as several other file compression utilities I have seen, has the option to make files self-extracting. I understand that the files somehow combine the machine code needed to self-extract with the data. But how exactly does it work? What tells the computer to stop loading the file as machine code and handle the rest as data? What codes are needed to achieve this? And what changes need to be made to the program for it to work on this kind of file? Does anyone have any ideas?

While I am at it, a friend has shown me something else quite curious, and I told him I would try to find out what is going on. Some programs have something embedded in them so that if you "type" them, they display the program name, the material stops scrolling on the screen, and the DOS prompt appears again, e.g.:

    C:> type program.exe
    The Software (c) 19xx The Software Co.
    C:>

If anyone has the answer (or at least a good guess at one) to either of these two questions, I would greatly appreciate it if you could let me know what is going on. Both of these have me stumped...

Thanx a bunch,
Art Pollard
BitNet: PollardA@Xray.Byu.Edu
Uncle Sam's Express: 600 N. 195 E. #31, Provo, UT. 84606
Phone Stuff: (801) 373-0339 (Home --Late!!!) (801) 378-4490 (Work)
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (06/15/91)
What does any of this have to do with comp.compression? Sounds like IBM programming group fodder.

In article <309pollarda@physc1.byu.edu> pollarda@physc1.byu.edu writes:
> What tells the computer to stop loading in the file as machine code
> and handle the rest as data?

It just loads in the whole thing and jumps to a spot near the beginning. Then both the code and the data are in memory. The code uncompresses the data, then jumps to the beginning of the newly uncompressed program.

> Some programs have something embedded in them so that if you "type"
> them, they will display the program name and the material will stop
> scrolling on the screen and the DOS prompt will appear again.

The DOS programs interpret ^Z as an end-of-file marker. They'll stop when they hit ^Z in a file, even if there's more data past that.

---Dan
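The ^Z behaviour described above can be sketched in C. This is only an illustration, not the source of the DOS TYPE command: the names `visible_length` and `type_buffer` are made up for the example, and it assumes the file has already been read into memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define DOS_EOF 0x1A  /* Ctrl-Z, treated as end-of-file by DOS text readers */

/* Returns how many bytes a text-mode reader would show: everything
   before the first Ctrl-Z, or the whole buffer if there is none. */
size_t visible_length(const unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (buf[i] == DOS_EOF)
            return i;
    }
    return len;
}

/* Writes the visible part of buf to out, like a much simplified TYPE. */
void type_buffer(const unsigned char *buf, size_t len, FILE *out)
{
    fwrite(buf, 1, visible_length(buf, len), out);
}
```

For a file that starts with a banner followed by 0x1A, only the banner is printed; the bytes after the marker are still part of the file.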
d88-jwa@byse.nada.kth.se (Jon Wätte) (06/15/91)
In article <309pollarda@physc1.byu.edu> pollarda@physc1.byu.edu writes:
> combined somehow. But how exactly does it work?
> What tells the computer to stop loading in the file as machine code
> and handle the rest as data?

It doesn't. Everything is loaded as code, but the flow of the program never reaches the addresses where the data is. Look at it as a small program like:

    char stuffed_data[] = { 0, 1, .... };

    main()
    {
        while (more_data) {
            unstuff(stuffed_data);
        }
    }

> type foo.exe
> The Software (c) 19xx The Software Co.
> C:>

Just add the text to the beginning with an end-of-file in there. Maybe some magic numbers still need to be there, but they can be erased by backspaces... I'm not so very at home with MS-DOS any longer.

Now, on the Mac, this info is in the "vers" resource and comes up when you "get info" on the icon. A self-extractor would have the code in the usual place (CODE resources) and the data in the data fork, so you could also open it transparently with the original program, which doesn't need to look at the resource fork. A file system to die for! :-)

All of this has little to do with comp.compression, except maybe that archive writers need to be aware that not all file systems are flat...

--
Jon Wätte h+@nada.kth.se - Speed !
davidsen@sixhub.UUCP (Wm E. Davidsen Jr) (06/17/91)
In article <309pollarda@physc1.byu.edu> pollarda@physc1.byu.edu writes:
|
| PKZip as well as several other file compression utilities I have seen
| have the option to have the files self extracting. I understand that
| the files have the machine code to self extract along with the data
| combined somehow. But how exactly does it work?

The expander is the first part of the program, and it is fixed length, so it knows what to skip.

| What tells the computer to stop loading in the file as machine code
| and handle the rest as data?

In some cases it all gets loaded into memory. In others the header info causes only part of the file to be loaded, and the program's full filename (available in DOS 3.x and later) is used to find the real data.

| Some programs have something embedded in them so that if you "type"
| them, they will display the program name and the material will stop
| scrolling on the screen and the DOS prompt will appear again.

The DOS type command stops when it sees a Ctrl-Z (decimal 26) character.

--
bill davidsen - davidsen@sixhub.uucp (uunet!crdgw1!sixhub!davidsen)
sysop *IX BBS and Public Access UNIX
moderator of comp.binaries.ibm.pc and 80386 mailing list
"Stupidity, like virtue, is its own reward" -me
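The "fixed length, so it knows what to skip" idea might look roughly like this in C. It is a sketch under assumptions: `payload_size` is a hypothetical helper, the stub length is taken as a parameter rather than from any real extractor, and the program is assumed to have already opened its own file in binary mode.

```c
#include <assert.h>
#include <stdio.h>

/* Positions 'self' at the first byte after the extractor stub and
   returns the number of payload bytes that follow, or -1 on error.
   stub_size is the (assumed fixed) length of the extractor code. */
long payload_size(FILE *self, long stub_size)
{
    long total;

    assert(self != NULL && stub_size >= 0);
    if (fseek(self, 0L, SEEK_END) != 0)
        return -1L;
    total = ftell(self);
    if (total < stub_size)
        return -1L;
    if (fseek(self, stub_size, SEEK_SET) != 0)
        return -1L;
    return total - stub_size;
}
```

After a successful call, ordinary reads on the stream return the archive data, so the decompressor never sees its own code.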
rennyk@apex.com (Renny K) (06/18/91)
In article <309pollarda@physc1.byu.edu> pollarda@physc1.byu.edu writes:
>
>PKZip as well as several other file compression utilities I have seen
>have the option to have the files self extracting. I understand that
>the files have the machine code to self extract along with the data
>combined somehow. But how exactly does it work?
>What tells the computer to stop loading in the file as machine code
>and handle the rest as data?

There is an EXE header, as it is called, at the beginning of EVERY EXE file. The DOS loader looks at this information before loading the file to find the length, where to load, relocation, etc. It's possible to change the amount of information loaded by modifying this header.

>What codes are needed to be able to achieve this? And what changes
>are needed to be made to the program in order for it to work on
>this kind of file?

You don't need any special codes. It's done by DOS (and the linker).

>Some programs have something embedded in them so that if you "type"
>them, they will display the program name and the material will stop
>scrolling on the screen and the DOS prompt will appear again.
>If there is anyone with the answer (or at least a good guess at one)
>to either of these two questions, I would greatly appreciate it
>if you could let me know what is going on.

This is also easy to do. It's much easier in a .COM program than in an .EXE program. In a COM program, start off the program with a jump instruction, followed by the message, and end the message with the hex character 01Ah. This is the DOS end-of-file character, which will cause the type command to stop. For example:

            JMP  START
            DB   'My Program',13,10
            DB   'Version 1.0',13,10
            DB   'Copyright 1991, Renny Koshy',13,10,10,10,01Ah
    START:                      ; program code continues here

In an EXE program it's harder because of the header at the beginning, and I don't know how to do it without showing SOME garbage (i.e. the header).
-- ------------------------------------------------------------------------------- Renny Koshy rennyk@apex.com Apex Computer, Redmond, WA.
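As a rough illustration of the header fields mentioned above, the following C sketch computes how much of an EXE file the header describes. It assumes the standard MZ layout (a 16-bit "bytes in last page" value at offset 2 and a 16-bit count of 512-byte pages at offset 4); the function name `exe_image_size` is invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Given the first bytes of an EXE file, returns the size in bytes of
   the part described by the MZ header, or -1 if it does not look like
   an EXE header. Data appended past this size is not part of the
   program image. */
long exe_image_size(const unsigned char *hdr, size_t len)
{
    unsigned last, pages;

    assert(hdr != NULL);
    if (len < 6 || hdr[0] != 'M' || hdr[1] != 'Z')
        return -1L;
    last  = hdr[2] | (hdr[3] << 8);   /* bytes used in the last page */
    pages = hdr[4] | (hdr[5] << 8);   /* number of 512-byte pages    */
    if (pages == 0)
        return -1L;
    if (last == 0)                    /* 0 means the last page is full */
        return (long)pages * 512L;
    return (long)(pages - 1) * 512L + (long)last;
}
```

A self-extractor built this way can seek to the returned offset in its own file to find whatever was appended after the executable part.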
aaron@backyard.bae.bellcore.com (Aaron Akman) (06/19/91)
In article <1991Jun17.205910.17669@apex.com>, rennyk@apex.com (Renny K) writes:
>In article <309pollarda@physc1.byu.edu> pollarda@physc1.byu.edu writes:
>
>There is a EXE Header as it is called, at the beginning of EVERY EXE file.
>The DOS loader looks at this information before loading the file for length,
>where to load, relocation etc. It's possible to change the amount of information

This goes back a bit, but I appended data to an EXE this way: compiled the EXE completely, then created a little utility program to open(file.exe), seek(file.exe, EOF), and write(file.exe, extra stuff). As I recall, I tried 2 methods for ``finding'' the data from within the executing EXE:

(1) Have the utility modify the initial value of one of the EXE's global variables with an exact offset. You could figure out how to find a global variable by compiling a little program like:

    char *cgoff = "STUFF NUMBER HERE AND DO ATOI IN YOUR PROGRAM";

Probably the utility just has to search for that string in the file (get the size with an lseek and make sure not to alter the size). After executing the utility, cgoff might look like this, if you could look at it in the compiled EXE:

    char *cgoff = "27435\0NUMBER HERE AND DO ATOI IN YOUR PROGRAM";

(2) Have the utility program write an identifiable marker where the data begins, and search for that when your executing EXE opens itself.

___________________________
Aaron Akman
aaron@backyard.bellcore.com
908-699-8019
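Method (1) could be sketched in C along these lines. This is an illustration, not the utility described in the post: `find_bytes`, `patch_offset` and `read_offset` are hypothetical names, and the EXE image is assumed to be held in a memory buffer that is written back to disk afterwards.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns the index of the first occurrence of pat in buf, or -1. */
long find_bytes(const char *buf, size_t len, const char *pat, size_t plen)
{
    size_t i;

    if (plen == 0 || plen > len)
        return -1L;
    for (i = 0; i + plen <= len; i++) {
        if (memcmp(buf + i, pat, plen) == 0)
            return (long)i;
    }
    return -1L;
}

/* Overwrites the start of the placeholder string inside image with
   the decimal digits of value plus a terminating '\0'. The image
   length is never changed. Returns 0 on success, -1 on failure. */
int patch_offset(char *image, size_t len, const char *placeholder, long value)
{
    char digits[24];
    size_t plen;
    long at;
    int n;

    assert(image != NULL && placeholder != NULL);
    plen = strlen(placeholder);
    at = find_bytes(image, len, placeholder, plen);
    n = sprintf(digits, "%ld", value);
    if (at < 0 || n < 0 || (size_t)n + 1 > plen)
        return -1;
    memcpy(image + at, digits, (size_t)n + 1);
    return 0;
}

/* What the running program does with its patched string. */
long read_offset(const char *slot)
{
    return atol(slot);
}
```

Method (2) only needs the search half: the running program would call something like `find_bytes` on its own file contents to locate the marker.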
mycroft@kropotki.gnu.ai.mit.edu (Charles Hannum) (06/20/91)
In article <1991Jun19.140345.18650@bellcore.bellcore.com> aaron@backyard.bae.bellcore.com (Aaron Akman) writes:
> In article <1991Jun17.205910.17669@apex.com>, rennyk@apex.com (Renny K) writes:
> >In article <309pollarda@physc1.byu.edu> pollarda@physc1.byu.edu writes:
> >
> >There is a EXE Header as it is called, at the beginning of EVERY EXE file.
> >The DOS loader looks at this information before loading the file for length,
> >where to load, relocation etc. It's possible to change the amount of information
>
> This goes back a bit, but I appended data to an EXE this way:
> Compiled the EXE completely, created a little utility program to
> open(file.exe), seek(file.exe, EOF), and write(file.exe, extra stuff).
> As I recall, I tried 2 methods for ``finding'' the data from within
> the executing EXE:
>
> [stuff deleted]

This is unnecessary and unreliable. The .EXE header tells you exactly how large the executable portion is. Most programs also put their own header on the data, so they know exactly how large it is.

If you choose to do this, you should make your header compatible with a .EXE header, but with a different magic value. (It only requires 16 bytes -- a small price to pay for the flexibility it adds.) This also allows you to add more than one data segment to your program, each with a different magic value, and simply have your loader search the headers for the right one. This is very fast.
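One way the chained-header scheme could work is sketched below in C. The layout is an assumption made for the example, not something specified in the post: each segment starts with a 16-byte header whose first two bytes are a magic value and whose next two 16-bit fields give the segment's total size in the same "last page bytes / 512-byte pages" form as the EXE header. `find_segment` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

#define HDR_SIZE 16  /* assumed size of each segment header */

/* Total size of a segment (header included), decoded the same way
   as the size fields of an EXE header. Returns -1 if invalid. */
static long segment_size(const unsigned char *h)
{
    unsigned last  = h[2] | (h[3] << 8);
    unsigned pages = h[4] | (h[5] << 8);

    if (pages == 0)
        return -1L;
    return last ? (long)(pages - 1) * 512L + (long)last
                : (long)pages * 512L;
}

/* Walks the headers from the start of the file and returns the
   offset of the first one whose magic is m0,m1, or -1 if none. */
long find_segment(const unsigned char *file, size_t len, char m0, char m1)
{
    size_t off = 0;

    assert(file != NULL);
    while (off + HDR_SIZE <= len) {
        long size = segment_size(file + off);

        if (file[off] == (unsigned char)m0 &&
            file[off + 1] == (unsigned char)m1)
            return (long)off;
        if (size < HDR_SIZE)      /* corrupt header: stop, don't loop */
            return -1L;
        off += (size_t)size;
    }
    return -1L;
}
```

Because each header says how far to skip, the search touches one header per segment instead of scanning every byte for a marker.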
rosenkra@convex.com (William Rosencranz) (06/20/91)
i am surprised nobody brought another issue up, so i will. i think it is relevant, too...

i am extremely reluctant to advocate self extracting archives for 2 reasons: 1) in order to get at the stuff, u have to execute something (meaning more chance of viral infection), and 2) you can only extract on a particular system.

the latter may make sense if what u archived will only work on that particular system, tho it still makes it impossible to read an included text file like docs without the target system. however, the former is impossible to overlook. self extracting files are probably most useful for third party software distribution where these issues are probably moot (tho i know of at least one commercial package which was distributed with a virus, non-intentionally).

-bill
rosenkra@convex.com
--
Bill Rosenkranz |UUCP: {uunet,texsun}!convex!c1yankee!rosenkra
Convex Computer Corp. |ARPA: rosenkra%c1yankee@convex.com
brad@looking.on.ca (Brad Templeton) (06/20/91)
Without getting too doshish here, how do you find the .exe file? DOS doesn't tell you the name of the command you are executing. I assumed most of these self-extractors just took the compressed codes as data that was loaded in with the program. Is there a file descriptor sitting around for your program or something?

Even on Unix the first argument is not assured to be the name of the program you ran.

--
Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473
ts@uwasa.fi (Timo Salmi) (06/20/91)
In article <1991Jun20.034508.17792@convex.com> rosenkra@convex.com (William Rosencranz) writes:
>
>i am surprised nobody brought another issue up, so i will. i think
>it is relevant, too...

Perhaps because in the course of time they have been pointed out quite frequently. (But maybe not in this newsgroup.)

>i am extremely reluctant to advocate self extracting archives for 2
>reasons: 1) in order to get at the stuff, u have to execute something
>(meaning more chance of viral infection), and 2) you can only extract
>on a particular system. the latter may make sense if what u archived
>:

Yes, these are good and relevant points. In fact, they are the ones that are usually deemed problematic. They certainly are why we try to avoid having self-extracting files on our FTP site, with the natural exception of (un)archiving software, for obvious logical reasons.

...................................................................
Prof. Timo Salmi
Moderating at garbo.uwasa.fi anonymous ftp archives 128.214.12.37
School of Business Studies, University of Vaasa, SF-65101, Finland
Internet: ts@chyde.uwasa.fi Funet: gado::salmi Bitnet: salmi@finfun
bowling@ucunix.san.uc.edu (Brian D. Bowling) (06/20/91)
In article <1991Jun20.040437.11896@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
|Without getting too doshish here, how do you find the .exe file? DOS
|doesn't tell you the name of the command you are executing. I assumed
|most of these self-extractors just took the compressed codes as data that
|was loaded in with the program. Is there a file descriptor sitting around
|to your program or something?
|
|Even on Unix the first argument is not assured to be the name of the program
|you ran.
|--
|Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473

Look up in the PSP. The info should be stored there by DOS.

Isn't DOS fun???

Brian
sander@cwi.nl (Sander Plomp) (06/20/91)
>i am surprised nobody brought another issue up, so i will. i think
>it is relevant, too...
>i am extremely reluctant to advocate self extracting archives for 2
>reasons: 1) in order to get at the stuff, u have to execute something
>(meaning more chance of viral infection), and 2) you can only extract
>on a particular system. the latter may make sense if what u archived
>will only work on that particular system tho it still makes it impossible
>to read an included text file like docs without the target system. however,
>the former is impossible to overlook. self extracting files are probably
>most useful for third party software distribution where these issues
>are probably moot (tho i know of at least one commercial package which
>was distributed with a virus, non-intentionally).
>-bill
>rosenkra@convex.com

I, too, hate self-extracting archives, for exactly the same reasons. Yes, they are:

(1) The ideal virus carrier.
(2) Exceptionally non-portable. Nearly always MS-DOS and nothing but MS-DOS.

and,

(3) Since we all have the archiver around anyway, what's the use of sending the uncompression routines along every time? Remember, we were talking about compression. (I know it's not that much, but since it's useless and dangerous anyway..)

The reason that self-extracting archives were invented was simple. When SEA distributed ARC as shareware via BBS systems, people often got incomplete versions. Of course you cannot use ARC to distribute ARC, so the self-extracting archive was invented as a way to make sure everyone got both the programs and the manual. For doing this, self-extracting archives are very useful. For nearly any other purpose they are a pain in the ass.

\footnote{ I wonder if anybody patented self-extracting archives or something like that. These days every neat or not so neat trick seems to get patented. Anybody know? }

--
Sander Plomp
Internet: sander@cwi.nl Fidonet: 2:283/500.4
churchh@ut-emx.uucp (Henry Churchyard) (06/21/91)
In article <1991Jun20.060401.20338@ucunix.san.uc.edu>, bowling@ucunix.san.uc.edu (Brian D. Bowling) writes:
> In article <1991Jun20.040437.11896@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
> |Without getting too doshish here, how do you find the .exe file? DOS
> |doesn't tell you the name of the command you are executing. I assumed
>
> Look up in the PSP. The info should be stored there by DOS.

Actually, the info is recorded at the end of a program's copy of the DOS environment in DOS versions 3.0 and above. Environment variables are stored as null-terminated strings, there's an extra null after the last environment variable, and then the name of the currently executing program follows. You find the segment location of your copy of the environment at a specified place in the PSP, but I don't think the name of the program itself is there.

Followups redirected to comp.os.msdos.programmer.

--
--Henry Churchyard churchh@emx.cc.utexas.edu
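The environment layout described above can be walked with a few lines of C. This sketch works on a plain byte buffer so that it stays compiler-independent; on real DOS the block would be reached through the environment segment stored in the PSP (offset 2Ch, as far as I know), and building that far pointer is compiler-specific. It also assumes a 16-bit string count sits between the terminating null and the program name, a detail the post does not mention. `program_name_from_env` is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns a pointer to the program name stored after the environment
   strings, or NULL if the block is too short or malformed. */
const char *program_name_from_env(const char *env, size_t len)
{
    size_t i = 0;

    assert(env != NULL);
    /* Skip each NAME=value string; an empty string ends the list. */
    while (i < len && env[i] != '\0') {
        while (i < len && env[i] != '\0')
            i++;
        i++;                      /* step over this string's null */
    }
    i++;                          /* step over the extra null     */
    i += 2;                       /* step over the assumed 16-bit count */
    if (i >= len)
        return NULL;
    if (memchr(env + i, '\0', len - i) == NULL)
        return NULL;              /* name is not terminated */
    return env + i;
}
```

With the full path in hand, a self-extractor can reopen its own file and read the appended data, which answers the "how do you find the .exe file" question for DOS 3.0 and later.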
frankb@sbsvax.cs.uni-sb.de (Frank Bauernoeppel) (06/21/91)
In article <3745@charon.cwi.nl>, sander@cwi.nl (Sander Plomp) writes:
> I, too, hate the selfextracting archives, for exactly the same reasons.
> Yes, they are:
> (1) The ideal virus carrier.
> (2) Exceptionally non portable. Nearly always MS-DOS and nothing but
> MS-DOS.

There is no problem. I can "unzip" self-extracting DOS .exe archives on a SUN, a VAX or even on a PC without executing them. Since unzip comes in C, only the native C compiler/linker has the chance to infect the program. Source code is available at simtel20.army.mil (among others). Thanks to the people at INFO-ZIP!

BTW, unzip is just one example.

Greetings
Frank
frankb@cs.uni-sb.de