rich@rexago1.UUCP (K. Richard Magill) (11/25/85)
Two questions. I presume they are related. Answers or pointers to doc would be appreciated.

1) How does the shell (exec?) know whether the command I just typed is a shell script or one of several possible types of executable?

2) Presuming the answer to #1 above has something to do with magic numbers, who issues them? Is there a common (definitive) base of them or does each manufacturer/environment make up their own set?

K. Richard Magill
(someplace between an advanced user & a guru)
pdg@ihdev.UUCP (P. D. Guthrie) (11/26/85)
In article <124@rexago1.UUCP> rich@rexago1.UUCP (K. Richard Magill) writes:
>Two questions. I presume they are related. Answers or pointers to doc would
>be appreciated.
>
> 1) How does the shell (exec?) know whether the command I just typed
>    is a shell script or one of several possible types of
>    executable?
>

The shell doesn't know. The shell merely tells the kernel to exec the file, after doing a fork. The kernel determines if a file is a binary executable by the magic number, which is obtained by reading an a.out.h structure (4.1, 4.2) or filehdr.h (sys 5) and comparing it against hardcoded numbers in the kernel. In 4.1, for instance, only 407, 410 and 413 (octal) are legal. This also tells the kernel the specific type of executable, and in some cases can set emulation modes. The kernel also recognizes
    #! /your/shellname
at the beginning of a file and execs off the appropriate shell instead.

> 2) Presuming the answer to #1 above has something to do with
>    magic numbers, who issues them? is there a common
>    (definitive) base of them or does each
>    manufacturer/environment make up their own set?

The magic number is issued by the linker/loader. Pretty much the magic number is decided by the manufacturer, but from what I have seen, is kept constant over machines. Forgive me if this is wrong, as I do not have any method of checking, but the magic numbers for, say, plain executable 4.x Vax and plain executable SysV.x Vax are the same, while SysV.x Vax and SysV.x 3B20 are different. Could someone confirm this?

>
>K. Richard Magill
>(someplace between an advanced user & a guru)

Paul Guthrie
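The check described above can be sketched roughly as follows. This is only an illustration, not kernel code: the function name classify_header is made up, the descriptions are paraphrased from the comments in the BSD a.out.h, and the sketch assumes the magic number is a little-endian 32-bit word at the start of the file, as on a VAX.

```python
import struct

# Octal a.out magic numbers named in the post (4.1BSD). The descriptions
# follow the comments in BSD's a.out.h and are not part of the post itself.
AOUT_MAGICS = {
    0o407: "OMAGIC (old impure format)",
    0o410: "NMAGIC (read-only text)",
    0o413: "ZMAGIC (demand load format)",
}

def classify_header(header: bytes) -> str:
    """Return a rough guess at what exec would make of these leading bytes."""
    if header[:2] == b"#!":
        # Interpreter line: the text up to the newline names the program.
        line = header[2:].split(b"\n", 1)[0].strip()
        return "script for " + line.decode("ascii", "replace")
    if len(header) >= 4:
        # Assumption: the magic is a little-endian 32-bit word (VAX layout).
        (magic,) = struct.unpack("<I", header[:4])
        if magic in AOUT_MAGICS:
            return AOUT_MAGICS[magic]
    # Nothing recognized: a real kernel would fail the exec with ENOEXEC.
    return "not executable (ENOEXEC)"
```

Feeding it the first few bytes of a file is enough; anything that is neither a known magic number nor a "#!" line falls through to the last case, which is where the invoking shell takes over.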
gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (11/27/85)
There is no central authority who issues "official" magic numbers for UNIX. System V keeps a list in /etc/magic (perhaps /usr/lib/magic or /usr/5lib/magic, depending on your system) and the "file" command uses this (along with some heuristic tests) to identify types of files. a.out types are hard-wired into the kernel (they're usually described in /usr/include/a.out.h). The same type of a.out is supposed to have the same magic number across UNIX implementations, but this scheme has broken down over the years and it is likely that different CPUs and even different UNIX vendors use different numbers for the same type.
geoff@ISM780B.UUCP (11/27/85)
> Two questions. I presume they are related.
> 1) How does the shell (exec?) know whether the command I just typed
>    is a shell script or one of several possible types of executable?

Magic numbers are indeed the answer. If the file mode allows execution, the first few bytes are read (probably sizeof (struct filehdr)), and if they contain a recognized magic number, the file is exec'd appropriately; otherwise, a shell script is assumed. One drawback of this approach is that if an executable has an unknown magic number (say you got it off a different system), the shell will try to interpret it as a script, causing various syntax errors (the most common is "unexpected `('").

> 2) Presuming the answer to #1 above has something to do with
>    magic numbers, who issues them? is there a common (definitive)
>    base of them or does each manufacturer/environment make up their
>    own set?

On Sys5 (at least), there is an ASCII file called /etc/magic, which contains magic numbers and information on what they mean (the `file' command uses this). I _think_ the numbers are made up by the implementors of the particular a.out, but are standardized as much as possible by enlightened self-interest. (And by AT&T; one would have to be crazy not to use whatever they use to describe, say, a 5.0 binary.)

For further info, see file(1), /etc/magic (on any sys5), the UNIX* System 5 Support Tools Guide (various sections), and all the shell documentation; don't assume this is a complete list of references. Hope this helps.

Geoffrey Kimbrough
INTERACTIVE Systems Corporation, Santa Monica California.
{decvax!vortex || ihnp4!allegra!ima}!ism780!geoff
Don't hold your nose up so high, it blocks the light.
Standard Disclaimers apply.
* UNIX is a trademark of AT&T.
* Shell is a trademark of Shell Oil Company. (8^)
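The "else, a shell script is assumed" step can be sketched like this. It is a hedged illustration of the general idea rather than any particular shell's code: the function names and the default /bin/sh path are assumptions, and the sketch relies on the kernel reporting an unrecognized format as ENOEXEC.

```python
import errno
import os

def script_argv(path, argv, shell="/bin/sh"):
    """Build the argument list for running `path` as a script under `shell`."""
    return [shell, path] + list(argv[1:])

def exec_with_fallback(path, argv, shell="/bin/sh"):
    """Replace the current process with `path`; if the kernel rejects the
    file format, hand the file to a shell instead. Meant to be called in a
    forked child, since a successful exec never returns."""
    try:
        os.execv(path, argv)
    except OSError as err:
        if err.errno != errno.ENOEXEC:
            raise  # missing file, no permission, etc.: a real failure
        # Executable bit set but format not recognized: treat it as a script.
        os.execv(shell, script_argv(path, argv, shell))
```

This also shows where the "unexpected `('" errors come from: a foreign binary takes the fallback path, and the shell then tries to parse machine code as commands.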
jsdy@hadron.UUCP (Joseph S. D. Yao) (11/28/85)
In article <124@rexago1.UUCP> rich@rexago1.UUCP (K. Richard Magill) writes:
> 1) How does the shell (exec?) know whether the command I just typed
>    is a shell script or one of several possible types of
>    executable?

Rarely is magic number testing done in the shell. Usually it is done in the kernel, at exec time. Exec() will test the first word for magicity (?), and directly execute it or not. Under 4BSD, the kernel also checks for "#!", and if it finds this "magic number," executes the program named on the rest of the line with that file as input.

If the kernel doesn't execute a file, but the file is executable, the shell will try to execute a sub-shell with the file as input. Whether it does this first or goes down the PATH list looking for another executable file is shell-dependent -- I prefer the latter, personally. Note that in executing sub-shells, the C shell on non-4BSD(-ish) systems will emulate the 4BSD kernel behaviour for files starting with "#!".

> 2) Presuming the answer to #1 above has something to do with
>    magic numbers, who issues them? is there a common
>    (definitive) base of them or does each
>    manufacturer/environment make up their own set?

Yes. Both. Both AT&T and Microsoft, and I believe others such as Intel and ISC (?), have issued definitive statements on what the common executable files should look like and what the magic numbers are. All, of course, are different. Look in /usr/include/a.out.h et al for information, and don't believe it until you've verified it, especially if you are unfortunate enough to have a binary-only system. On SysV, look in /etc/magic [I think] -- notice that several different numbers have the same labels; I changed them to read "old ...", "new ...", "COFF ...". If you have 'file.c', and the numbers are hard-coded, that is still a good guide. Provided your file.c compiles into the same as your /bin/file ...

The very first original magic number (the old man said, stroking his long white beard) was 0407. This is equivalent to a 'br .+16' instruction in PDP-11 machine language -- useful if you want to execute a file stand-alone, since headers were 16 bytes long. Then came 410, 411, 413, 405, etc. -- many of which were not available on ordinary machines, but were lusted after by the more acquisitive members of the community, and in particular by the recipient of the coveted Golden Chicken award ... but I digress. ;-)
--
Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}
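The 0407 arithmetic can be checked with a small decoder. This is a sketch with a made-up function name; it relies on the PDP-11 BR encoding (opcode 000400 octal plus a signed 8-bit word offset, applied after the program counter has moved past the instruction).

```python
def decode_pdp11_branch(word):
    """Decode a PDP-11 BR instruction. Return the byte distance from the
    instruction's own address to the branch target, or None if the word
    is not an unconditional branch."""
    if word & 0o177400 != 0o000400:   # high byte 0o001 is the BR opcode
        return None
    offset = word & 0o377             # signed 8-bit offset, counted in words
    if offset >= 0o200:
        offset -= 0o400
    # The PC has already advanced past the 2-byte instruction when the
    # offset (scaled to bytes) is added.
    return 2 + 2 * offset
```

For 0407 the offset field is 7 words, so the target is 2 + 14 = 16 bytes past the start of the file, which is exactly the first byte after a 16-byte header.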
guy@sun.uucp (Guy Harris) (11/29/85)
> > 1) How does the shell (exec?) know whether the command I just typed
> >    is a shell script or one of several possible types of
> >    executable?
>
> The shell doesn't know. The shell merely tells the kernel to exec the
> file, after doing a fork. The kernel determines if a file is a binary
> executable by the magic number, which is obtained by reading an a.out.h
> structure (4.1,4.2)

And V7 and System III, and maybe Version 8.

> or filehdr.h (sys 5) and comparing it against hardcoded numbers in the
> kernel. The kernel also recognizes
>     #! /your/shellname
> at the beginning of a file and execs off the appropriate shell instead.

The *4.1 or 4.2 or Version 8* kernel recognizes "#! shellname"; this isn't in V7 or S3/S5 (although it could be added). In the case of other systems, or shell scripts which *don't* have a "#!" line at the beginning, the kernel sees that the file isn't an executable, and returns to the (forked) shell with an error indicating this. The shell then sees that the file had execute permission but wasn't an executable, and runs it as a script.

In the C shell, and the 4.1/4.2 Bourne shells, there is a convention that if the first character of the script is a "#" it's a C shell script, and if it's a ":" it's a Bourne shell script. (This is because ":" used to be the only form of comment in the V6 and Bourne shells, and "#" was a comment character in the C shell; however, the 4.1/4.2 and S3/S5 Bourne shells accept "#" as a comment, so this convention is now a crock.) They would "exec" the other shell if the first character of the script so indicated; otherwise, they'd interpret it themselves.

> > 2) Presuming the answer to #1 above has something to do with
> >    magic numbers, who issues them? is there a common
> >    (definitive) base of them or does each
> >    manufacturer/environment make up their own set?
>
> The magic number is issued by the linker/loader.

I think that by "issued" he meant "who decides what the magic number is"; as you point out, this is done by the manufacturer. Some of them are specified for System V as part of the Common Object File Format (does anybody know why there seem to be four(!) different magic numbers for 68000-family executables?). I don't think AT&T acts as a clearinghouse for them, though, so if you've just ported UNIX to your new 27-bit machine I guess you get to choose your own.

	Guy Harris
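The first-character convention amounts to a three-way choice, sketched below. The function name and the /bin/csh and /bin/sh paths are assumptions for illustration; the rule itself is the one described in the post for scripts the kernel has already declined to exec.

```python
CSH = "/bin/csh"   # assumed path to the C shell
SH = "/bin/sh"     # assumed path to the Bourne shell

def interpreter_for(first_char, running_shell):
    """Pick the shell that should run a script, by its first character."""
    if first_char == "#":
        return CSH            # "#" was a comment only in the C shell
    if first_char == ":":
        return SH             # ":" was the Bourne shell's only comment form
    return running_shell      # no hint: the current shell runs it itself
```

The weakness noted in the post is visible here: once the Bourne shell also accepts "#" comments, a leading "#" no longer says anything reliable about which shell the script was written for.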
rb@istbt.UUCP (Bob Bishop) (11/29/85)
In article <416@ihdev.UUCP>, pdg@ihdev.UUCP (P. D. Guthrie) writes:
> Forgive me if this is
> wrong, but I do not have any method of checking, but
> the magic numbers for say plain executable 4.x Vax and
> plain executable SysV.x Vax are the same, but SysV.x
> Vax and SysV.x 3B20 are different. Could someone
> confirm this?

This is true, BUT it doesn't mean you can execute BSD binaries on a SysV VAX (nor, presumably, vice versa). The file formats are actually different. Running a BSD binary on a SysV VAX results in behavior almost, but not entirely, unlike what the programmer intended.
--
Bob Bishop
"What's so unpleasant about being drunk? You ask a glass of water!"
spw2562@ritcv.UUCP (Fishhook) (12/04/85)
In article <3044@sun.uucp> guy@sun.uucp (Guy Harris) writes:
> does anybody know why
>there seem to be four(!) different magic numbers for 68000-family
>executables?
>
>	Guy Harris

There are four(!8-) distinct forms of executables for the M68000 family of processors. I took a course last year in systems programming, and one of the projects was a primitive object module editor. We had to be able to recognize the four formats by the magic number, and determine where the text and data areas were. It's been a while, so I don't exactly remember the formats, but I can look 'em up if you're really interested.

==============================================================================
Steve Wall @ Rochester Institute of Technology
USnail: 6675 Crosby Rd, Lockport, NY 14094, USA
Usenet: ..!rochester!ritcv!spw2562 (Fishhook) Unix 4.2 BSD
BITNET: SPW2562@RITVAXC (Snoopy) VAX/VMS 4.2
Voice:  Yell "Hey Steve!"
Disclaimer: What I just said may or may not have anything to do with what I was actually thinking...
guy@sun.uucp (Guy Harris) (12/07/85)
> > does anybody know why there seem to be four(!) different magic numbers
> > for 68000-family executables?
> There are four(!8-) distinct forms of executables for the M68000 family
> of processors.

Well, I looked on an S5 machine we have here, and there are now seven(!) different magic numbers in its "filehdr.h" - eight, if you count the fact that it claims that the UNIX PC 7300 uses the same magic number as iAPX286 large-model code(!!!). I don't know what the "four distinct forms of executables for the M68000 family" are, but I suspect there's no correlation between those forms and the various magic numbers. The formats listed are:

    five formats with "MC68K" in their #define names:
        one with no comment whatsoever on the #define line,
        one for "writable text segment" (like shared text, only with writable text?),
        one with "TV" in its name ("Transfer Vector", I presume),
        one for read-only text ("410" executables, I presume), and
        one for read-only demand paged text ("413" 4.xBSD executables, I presume).

    two formats with "M68" in their names.

Why an "M68" is different from an "MC68K" is beyond my simple mind; maybe somebody from the group that did this can explain it to us mere mortals. Why the PC 7300 would use an iAPX286 magic number is totally beyond my comprehension. Why one simple chip family would need all these magic numbers (by the way, I thought the magic number in the "UNIX header" - as in "aouthdr.h" - was supposed to indicate whether the file was shared text, split I&D, paged shared text, and all that) is also a bit incomprehensible. A Common Object File Format may be nice, but this is slipping past baroque into rococo...

(We won't even discuss the fact that another header uses "#if mc68000" - it's nice that they agree with the #define we use, but somebody from Motorola assured me that "m68k" was the proper predefined constant, not "mc68000".)

	Guy Harris
mats@fortune.UUCP (Mats Wichmann) (12/10/85)
68000 magic numbers? Well... This is what appears in the header file from Motorola (and AT&T) these days:

#define MC68MAGIC	0520
#define MC68TVMAGIC	0521
#define M68MAGIC	0210
#define M68TVMAGIC	0211

The M68 stuff and all the TV (transfer vector) things have to do with AT&T-internal development - the multiprocessor switches and those things, if I am not mistaken. So there is really only one magic number for the 68000 family that gets used (MC68MAGIC).

As far as #defines go, the "party line" is:

    m68k is for the family
    M68000, M68010, M68020, M68881 and such identify a particular chip.

Variations: as many as you can think of. For example, Unisoft at one time used mc68000, but later switched to m68000 (and may now be using m68k). And So On. I think MIT used mc68000, so people who started with their code as a base probably used mc68000 at least for a while....

Mats Wichmann
Fortune Systems
{ihnp4,hplabs,dual}!fortune!mats

"Quality. Comfort. Style. And at prices jou can afford!" - Izzy Moreno
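To show how the four constants above would be matched against a file, here is a small lookup sketch. The function name is hypothetical, and it assumes the magic number is the first 16-bit field of the COFF file header, stored big-endian as on a 68000; the octal values are the ones quoted from the header file.

```python
import struct

# Octal values as quoted from the header file in the post.
M68K_MAGICS = {
    0o520: "MC68MAGIC",
    0o521: "MC68TVMAGIC",
    0o210: "M68MAGIC",
    0o211: "M68TVMAGIC",
}

def name_m68k_magic(header: bytes):
    """Return the #define name for a 68000-family COFF magic, or None."""
    if len(header) < 2:
        return None
    # Assumption: the magic is the first unsigned short, big-endian.
    (magic,) = struct.unpack(">H", header[:2])
    return M68K_MAGICS.get(magic)
```

Octal 0520 is 0x0150, so under these assumptions an MC68MAGIC file would begin with the bytes 01 50.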
dave@ecrcvax.UUCP (David Morton) (12/22/85)
References: <124@rexago1.UUCP> <416@ihdev.UUCP> <3044@sun.uucp> <9107@ritcv.UUCP> <sun.3059>
Reply-To: dave@ecrcvax.UUCP (David Morton)
Distribution: net
Organization: European Computer-Industry Research Centre, Munchen, W. Germany

In article <sun.3059> guy@sun.uucp (Guy Harris) writes:
>Well, I looked on an S5 machine we have here, and there are now seven(!)
>different magic numbers in its "filehdr.h" - eight, if you count the fact
>that it claims that the UNIX PC 7300 uses the same magic number as iAPX286
>large-model code(!!!). I don't know what the "four distinct forms of
>executables for the M68000 family" are, but I suspect there's no correlation
>between those forms and the various magic numbers.

There are probably x (x > 8) different magic numbers on unix machines by now. What's to stop a manufacturer making the kernel recognise yet another one (perhaps because he's developed his own mmu for some purpose or other), then hacking the assembler, loader & the includes? I know of one company here in Germany that did this. So much for binary compatibility.

>
>Why an "M68" is different from an "MC68K" is beyond my simple mind; maybe
>somebody from the group that did this can explain it to us mere mortals.

Yes please! This was really confusing. Apart from that, the Motorola 5.0 SGS was nice to work with.
--
Dave Morton
Tel. + (49) 89 - 92699 - 139
CSNET: dave%ecrcvax.uucp@germany.csnet
UUCP: seismo!mcvax!unido!ecrcvax!dave