news@investor.UUCP (Bob Peirce) (11/15/88)
In article <2643@nuchat.UUCP> steve@nuchat.UUCP (Steve Nuchia) writes:
>Posting-number: Volume 5, Issue 28
>Submitted-by: "Steve Nuchia" <steve@nuchat.UUCP>
>Archive-name: compress.magic
>
>If you have a file(1) compatible with the one in sysVr3
>adding the following lines to /etc/magic will make it
>recognize compressed files.
>
>0 short 40223 compressed data
>>2 byte <0200 - %d bits
>>2 byte 0214 - 12 bits
>>2 byte 0215 - 13 bits
>>2 byte 0220 - 16 bits
>

We have SysVr2.2 and the 40223 needed to be changed to 8093 (1F9D hex).
Use "od -x" on a compressed file to see if your first word is different.
We also found a space in front of the '-' would be included in the
output leading to "compressed data - 16 bits" instead of "compressed
data- 16 bits".

Much thanks for the idea.
--
Bob Peirce, Pittsburgh, PA 412-471-5320
uucp: ...!{allegra, bellcore, cadre, idis, psuvax1}!pitt!investor!rbp
NOTE: Mail must be < 30K bytes/message
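As a rough illustration of the "od -x" check above (not part of the original post), the first word of a file can be read under both byte orders. This is a minimal sketch that assumes only that a compressed file starts with the two bytes 1F 9D; `first_word` is a hypothetical helper name.

```python
import struct


def first_word(data: bytes) -> dict:
    """Return the first 16-bit word of data under both byte orders."""
    if len(data) < 2:
        raise ValueError("need at least two bytes")
    (little,) = struct.unpack_from("<H", data, 0)  # low-order byte first
    (big,) = struct.unpack_from(">H", data, 0)     # high-order byte first
    return {"little": little, "big": big}


# The two leading bytes of a compress(1) file, written in octal.
header = bytes([0o037, 0o235])
words = first_word(header)
print(words["little"], words["big"])  # prints: 40223 8093
```

The same two bytes give 40223 or 8093 depending only on the order in which they are combined, which matches the two values reported in the posts.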
guy@auspex.UUCP (Guy Harris) (11/16/88)
>>If you have a file(1) compatible with the one in sysVr3
>>adding the following lines to /etc/magic will make it
>>recognize compressed files.
>>
>>0 short 40223 compressed data
>>
>We have SysVr2.2 and the 40223 needed to be changed to 8093 (1F9D hex).

Person "A" had a little-endian machine (40223 is 9D1F hex) and person
"B" had a big-endian machine; the S5 release had nothing to do with it.

Unfortunately, "strings" in "/etc/magic" cannot, in the standard S5
version, contain C-language escapes, so you can't do this:

    0 string \037\235 compressed data

which the SunOS version of the S5 "file" supports; this obviates the need
for byte-order-dependent versions of "/etc/magic". The SunOS version
also supports

    >2 byte&0x80 >0 block compressed
    >2 byte&0x1f x %d bits

with the "&<mask>" stuff obviating the need for individual entries for
different numbers of bits.

I hope this gets into S5R4 (especially since "compress" presumably will
get into S5R4), but I have no idea if it will.
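The effect of the two masked entries can be sketched as follows. This is an illustration only, not the SunOS code: `describe_compress` is a hypothetical name, and the way the description pieces are joined is an assumption rather than file(1)'s exact output format.

```python
COMPRESS_MAGIC = b"\x1f\x9d"  # the string \037\235


def describe_compress(data: bytes) -> str:
    """Mimic the string entry plus the two masked byte entries at offset 2."""
    if not data.startswith(COMPRESS_MAGIC):
        return "data"
    parts = ["compressed data"]
    if len(data) > 2:
        flags = data[2]
        if flags & 0x80:                      # >2 byte&0x80 >0 block compressed
            parts.append("block compressed")
        parts.append("%d bits" % (flags & 0x1F))  # >2 byte&0x1f x %d bits
    return " ".join(parts)


print(describe_compress(bytes([0o037, 0o235, 0o220])))
```

With the mask, the flag bytes 0214, 0215 and 0220 from the original posting all decode through one rule (12, 13 and 16 bits) instead of needing one entry each.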
dik@cwi.nl (Dik T. Winter) (11/16/88)
In article <454@auspex.UUCP> guy@auspex.UUCP (Guy Harris) writes:
> >>If you have a file(1) compatible with the one in sysVr3
> >>adding the following lines to /etc/magic will make it
> >>recognize compressed files.
> >>
> >>0 short 40223 compressed data
> >>
> >We have SysVr2.2 and the 40223 needed to be changed to 8093 (1F9D hex).
>
> Person "A" had a little-endian machine (40223 is 9D1F hex) and person
> "B" had a big-endian machine; the S5 release had nothing to do with it.
>
> Unfortunately, "strings" in "/etc/magic" cannot, in the standard S5
> version, contain C-language escapes, so you can't do this:
>
> 0 string \037\235 compressed data
>
> which the SunOS version of the S5 "file" supports; this obviates the need
> for byte-order-dependent versions of "/etc/magic".

It is clear that byte order dependencies are a pain, on the other hand
if for some system some format is specified as 0177767 it appears to be
backward to have to specify it in the form of a string.

Wouldn't it be better to allow specification like AR16W, AR32W and
AR32WR as happens in so many places in SysV (COFF comes to mind)?
Then each system can specify in its favourite order its own /etc/magic
part. Of course file(1) would have to compose the words and longs
itself, but that is only minor. (Note: this could also make a truly
portable file(1)).
--
dik t. winter, cwi, amsterdam, nederland
INTERNET : dik@cwi.nl
BITNET/EARN: dik@mcvax
guy@auspex.UUCP (Guy Harris) (11/17/88)
>It is clear that byte order dependencies are a pain, on the other hand
>if for some system some format is specified as 0177767 it appears to be
>backward to have to specify it in the form of a string.

So who says you *have* to specify it in the form of a string? From the
same SunOS "/etc/magic":

    0 short 0177545 old archive

We didn't *take away* any capability, we just *added* some. In the case
of compressed files, the format *is* properly specified as a string - in
"compress.c", the header is

    char_type magic_header[] = { "\037\235" }; /* 1F 9D */

which sure looks like a string to me....

>Wouldn't it be better to allow specification like AR16W, AR32W and
>AR32WR as happens in so many places in SysV (COFF comes to mind)?

In this particular example, no, it wouldn't be better; since the magic
number for compressed files *is* a string, the right way to specify it
is as a string. The same applies for packed data:

    0 string \037\036 packed data

Although PACKED is #defined as 017436 in "pack.c", that #define is not,
in fact, used; the code to generate the magic number is

    outbuff[0] = 037;
    outbuff[1] = 036;

which is, again, a string "\037\036".

The right tool for the right job; telling "file" about a machine's byte
order is, in this case, entirely the wrong tool.
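To illustrate why a string entry needs no byte-order knowledge, here is a minimal sketch of matching the two string entries quoted above. It is not the SunOS implementation: `unescape_octal`, `identify` and the `STRING_MAGIC` table are hypothetical names, and only octal escapes of one to three digits are handled.

```python
import re

_OCTAL = re.compile(r"\\([0-7]{1,3})")


def unescape_octal(text: str) -> bytes:
    """Expand C-style \\ooo escapes; all other characters pass through."""
    pieces = []
    pos = 0
    for match in _OCTAL.finditer(text):
        pieces.append(text[pos:match.start()].encode("latin-1"))
        pieces.append(bytes([int(match.group(1), 8) & 0xFF]))
        pos = match.end()
    pieces.append(text[pos:].encode("latin-1"))
    return b"".join(pieces)


# The two string entries from the post, as (pattern, description) pairs.
STRING_MAGIC = [
    (r"\037\235", "compressed data"),
    (r"\037\036", "packed data"),
]


def identify(data: bytes) -> str:
    """Compare the leading bytes of data against each string pattern."""
    for pattern, description in STRING_MAGIC:
        if data.startswith(unescape_octal(pattern)):
            return description
    return "data"


print(identify(bytes([0o037, 0o036, 0, 0])))
```

The comparison is byte by byte in file order, so the same table gives the same answer on little-endian and big-endian hosts.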
dupuy@douglass.columbia.edu (Alexander Dupuy) (11/17/88)
In article <454@auspex.UUCP> guy@auspex.UUCP (Guy Harris) writes:
> Unfortunately, "strings" in "/etc/magic" cannot, in the standard S5
> version, contain C-language escapes, so you can't do this:
>
> 0 string \037\235 compressed data
>
> which the SunOS version of the S5 "file" supports; this obviates the need
> for byte-order-dependent versions of "/etc/magic".

In article <7714@boring.cwi.nl> dik@cwi.nl (Dik T. Winter) replies:
> Wouldn't it be better to allow specification like AR16W, AR32W and
> AR32WR as happens in so many places in SysV (COFF comes to mind)?
> Then each system can specify in its favourite order its own /etc/magic
> part.

Not only is this better, but the "strings" in even the SunOS version of
magic can't contain null bytes (well actually, they can, but it doesn't
do what you want), and can't be masked in the way which SunOS allows you
to mask numbers.

While I find AR32WR pretty incomprehensible (is it network order, or
reversed?) it should be possible to write a file(1) which converts the
numbers and masks specified in /etc/magic into network order before
doing any comparisons, and converts values read from the file into local
byte order before performing %d substitutions on the descriptions in the
/etc/magic file. If needed, special printf macros could be used to
indicate substitutions for which the bytes needed to be swapped before
the ntohl conversion was done.

This should provide a portable file(1).

@alex
--
inet: dupuy@columbia.edu
uucp: ...!rutgers!columbia!dupuy
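One way to read the network-order proposal is sketched below: numeric values in the magic table are taken to be written in network (big-endian) order, and the file's bytes are always decoded in that order regardless of the host. This is an interpretation, not code from any file(1); `matches` and its parameters are hypothetical.

```python
import struct

# Network-order (big-endian) formats for the numeric types named in the thread.
_FORMATS = {"short": ">H", "long": ">L"}


def matches(data: bytes, offset: int, kind: str, expected: int, mask=None) -> bool:
    """Decode the value at offset in network order, mask it, and compare."""
    fmt = _FORMATS[kind]
    size = struct.calcsize(fmt)
    if offset < 0 or offset + size > len(data):
        return False  # not enough bytes in the file for this entry
    (value,) = struct.unpack_from(fmt, data, offset)
    if mask is not None:
        value &= mask
    return value == expected


sample = bytes([0o037, 0o235, 0o220])
print(matches(sample, 0, "short", 0x1F9D))  # prints: True
```

Because the decoding order is fixed, the table entry 0x1F9D would match on any host, and a mask can be applied to the decoded number in the same way the SunOS "&<mask>" syntax does for bytes.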
guy@auspex.UUCP (Guy Harris) (11/18/88)
>Not only is this better,
This assertion has now been made twice. I have yet to see any evidence
for its truth. The strings in "/etc/magic" correspond, in at least the
two cases I cited (compressed files and packed files), to C-language
strings; could somebody please explain why expressing them as numbers in
a standard byte order, rather than as the strings they are, is somehow
"better"?
dupuy@douglass.columbia.edu (Alexander Dupuy) (11/18/88)
Just to follow up on the problems with /etc/magic formats: it is
currently possible to cause file(1) to crash on a Sparc (or other strict
alignment machine) by having magic entries which are of type "short" or
"long", at misaligned offsets. If you have a Sun-2 or Sun-4, you can
repeat this by placing the following entry into /etc/magic:

    1 long 0 doesn't matter if it matches or not

This should crash file with a Bus Error, when you apply it to any file
for which it doesn't have a builtin rule, such as an ordinary ASCII text
file.

@alex
--
inet: dupuy@columbia.edu
uucp: ...!rutgers!columbia!dupuy
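A common way to avoid this kind of crash is to build the value from individual bytes instead of reading a word directly at the given offset (in C, typically by copying the bytes into a local variable with memcpy). Python has no alignment restrictions, so the sketch below only models the byte-wise approach; `read_long` is a hypothetical helper, and the default of big-endian order is an assumption.

```python
def read_long(data: bytes, offset: int, byteorder: str = "big"):
    """Return the unsigned 4-byte value at any offset, or None if out of range."""
    if offset < 0 or offset + 4 > len(data):
        return None
    # Slicing copies the four bytes, so the offset never needs to be aligned.
    return int.from_bytes(data[offset:offset + 4], byteorder)


text = b"plain ASCII text\n"
# Models the entry "1 long 0": compare the long at odd offset 1 with zero.
print(read_long(text, 1) == 0)  # prints: False
```

The entry simply fails to match on a text file, and a file too short to hold the value yields None rather than a fault.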