childers@avsd.UUCP (Richard Childers) (10/12/89)
I'm trying to create an MS-DOS equivalent of the UNIX utility 'strings' - ultimately part of a public-domain package of UNIX utilities that will compile under both MS-DOS and UNIX - and I've run into a problem.

Basically, no matter how I look at the resulting executable - whether with my 'strings', or a quick-and-dirty 'od' - I can't find the ASCII characters corresponding to the strings I thought I had compiled into the executable.

As far as I know, in UNIX, char is stored as individual allocated bytes, perfectly accessible, perfectly in accord with ASCII specifications.

I've tried explicitly defining char arrays, i.e.

    #define vers[CMDBUFSIZ] = "v1.00 891010 richard childers" ;

... as well as trying to find strings built into fprintf() calls, to no avail. What am I missing ?

One of the possibilities I've considered is that, while I've defined this array, I've never referenced it, and thus the compiler might have decided to optimize it out of existence.

Another possibility is that strings found in printf() or fprintf() are compressed. If so, I haven't seen any reference to how this might be turned off, although there are three manuals to peruse.

Am I going to have to write a decompression algorithm ? That's going to have to be applied to every byte ? And if I want to bury ID strings in my code, am I going to have to initialize strings on a byte-by-byte basis ?

Ay, caramba !!

I'm using Microsoft C v4.00 on a Wyse PC with about 128 KB on board ...

-- richard

-- 
* A CITIZEN: "Who might you be ? Samson ? --" *
* CYRANO: "Precisely. Would you kindly lend me your jawbone ?" *
* from _Cyrano de Bergerac_, by Edmond Rostand *
* ..{amdahl|decwrl|octopus|pyramid|ucbvax}!avsd.UUCP!childers *
cpcahil@virtech.UUCP (Conor P. Cahill) (10/12/89)
In article <2141@avsd.UUCP>, childers@avsd.UUCP (Richard Childers) writes:
> Basically, no matter how I look at the resulting executable - whether
> with my 'strings', or a quick-and-dirty 'od' - I can't find the ASCII
> characters corresponding to the strings I thought I had compiled into
> the executable.

How are you opening the input file? In MSC you must specify something like O_BINARY in order to read a complete non-text file.

The strings are stored the same way in MSC as they are stored in UNIX executables: uncompressed sequences of characters followed by a null byte.

Long ago I wrote a strings for DOS which worked correctly under MSC 3.0. I don't know where it is now.

>     #define vers[CMDBUFSIZ] = "v1.00 891010 richard childers" ;

What is this supposed to do? First of all, the only way to use it is to have a global variable as follows:

    char string_you_want vers;

which is totally unreadable. What were you attempting to do? There is nothing that you can do through the preprocessor that you couldn't do directly in the code.

> One of the possibilities I've considered includes the fact that, while I've
> defined this array, I've never referenced it, and thus the compiler might
> have decided to optimize it out of existence.

If this were true, none of the sccsid and rcsid strings would ever appear in the object files (which they do).

-- 
+-----------------------------------------------------------------------+
| Conor P. Cahill uunet!virtech!cpcahil 703-430-9247 !
| Virtual Technologies Inc., P. O. Box 876, Sterling, VA 22170 |
+-----------------------------------------------------------------------+
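To illustrate the O_BINARY point, here is a minimal sketch. It assumes a DOS-family compiler that supplies O_BINARY in <fcntl.h> and the low-level calls in <io.h>; on UNIX the flag does not exist, so the sketch defines it as 0. The helper name count_bytes is made up for illustration. In text mode the DOS runtime translates CR/LF pairs and may treat Ctrl-Z as end of file, so a text-mode read of an executable can come up short.

```c
#include <fcntl.h>
#if defined(_WIN32) || defined(__MSDOS__) || defined(MSDOS)
#include <io.h>        /* open, read, close on DOS-family compilers (assumed) */
#else
#include <unistd.h>    /* the same calls on UNIX */
#endif

#ifndef O_BINARY
#define O_BINARY 0     /* UNIX makes no text/binary distinction */
#endif

/* Hypothetical helper: count the bytes that read() actually delivers
   when the file is opened in binary mode. Returns -1 on error. */
long count_bytes(const char *path)
{
    char buf[512];
    long total = 0;
    int n;
    int fd = open(path, O_RDONLY | O_BINARY);

    if (fd < 0)
        return -1L;
    while ((n = (int)read(fd, buf, sizeof buf)) > 0)
        total += n;
    close(fd);
    return n < 0 ? -1L : total;
}
```

If the count matches the size shown in the directory listing, the whole executable is being read; if it is smaller, the file is probably still being opened in text mode.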
davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (10/12/89)
In article <2141@avsd.UUCP>, childers@avsd.UUCP (Richard Childers) writes:
| As far as I know, in UNIX, char is stored as individual allocated bytes,
| perfectly accessible, perfectly in accord with ASCII specifications.
|
| I've tried explicitly defining char arrays, IE
|
|     #define vers[CMDBUFSIZ] = "v1.00 891010 richard childers" ;
|
| ... as well as trying to find strings built into fprintf() calls, to no
| avail. What am I missing ?

You may have two problems here. One is that something defined to the preprocessor via #define never makes it into the program unless you use it. One way to define your string is to do something like:

    char *my_id = "The string you want, like copyright";

I make mine static, but I think it would be legal for a compiler to optimize out an unreferenced static.

If you can't find strings which are formats of printfs you may have a broken "strings." I have used the one I have and it works for MSC and TC at least.

-- 
bill davidsen (davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
"The world is filled with fools. They blindly follow their so-called 'reason' in the face of the church and common sense. Any fool can see that the world is flat!" - anon
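A minimal sketch of the ID-string idea, under stated assumptions: CMDBUFSIZ is given an arbitrary value of 64 because the thread never states it, and the function names version_string and print_version are invented for illustration. The array is a real object rather than a #define, and it is referenced from code that can be called, which gives the compiler and linker a reason to keep it.

```c
#include <stdio.h>

#define CMDBUFSIZ 64   /* assumed size; the thread never gives the value */

/* A file-scope array is an object with storage: its initializer bytes
   are placed in the program's data, followed by zero padding. */
static const char vers[CMDBUFSIZ] = "v1.00 891010 richard childers";

/* Referencing the array from reachable code keeps it from being
   treated as dead data. */
const char *version_string(void)
{
    return vers;
}

/* Example use: print the ID string, e.g. for a -V option. */
void print_version(FILE *out)
{
    fprintf(out, "%s\n", version_string());
}
```

Calling print_version(stdout) from main() is enough to make the string both visible to the user and findable by 'strings'.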
childers@avsd.UUCP (Richard Childers) (10/14/89)
I recently said ...

>I've tried explicitly defining char arrays, IE
>
>     #define vers[CMDBUFSIZ] = "v1.00 891010 richard childers" ;

I actually meant to say ...

    char vers[CMDBUFSIZ] = "v1.00 891010 richard childers" ;

... which changes the problem somewhat.

A wide variety of people have replied, and, much to my surprise, nobody felt it necessary to call me 'hosehead' or tell me to go to a different newsgroup, such as alt.msdos.programmer, for which I am thankful.

The best help I've gotten to date suggested that I use 'static' storage classes for SCCS-type buried ID strings, and another individual at UC Santa Cruz suggested I try opening the file using "binary" mode, which doesn't seem to be documented in my version of MSC.

One possibility that's occurred to me is that, in a PC environment, the designers of the compiler might have decided that string compression was a win, much as ( according to many contributors ) Lattice's compiler tries to identify and eliminate redundant strings from the resulting image, given the significant decrease in space available in an MS-DOS environment.

I've been informed that if this is true, it would be useful information to know, and I'll keep everyone posted on what I find out ...

-- richard
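For the stdio route, ANSI C spells binary mode as a "b" in the fopen() mode string. Whether MSC 4.00 accepts "rb" is an assumption here, not something the thread confirms. The sketch below is a bare-bones 'strings' loop built on it; the names scan_strings and strings_file, the 256-byte run buffer, and the choice of isprint() as the test for a string character are all illustrative choices.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: copy every run of at least min_len printable
   characters from in to out, one run per line. Returns the number of
   runs printed. */
static int scan_strings(FILE *in, FILE *out, int min_len)
{
    char run[256];
    int len = 0, count = 0, c;

    assert(min_len > 0 && min_len < (int)sizeof run);
    for (;;) {
        c = getc(in);
        if (c != EOF && isprint(c)) {
            if (len < (int)sizeof run - 1)
                run[len++] = (char)c;   /* overlong runs are truncated */
            continue;
        }
        /* A non-printable byte or end of file closes the current run. */
        if (len >= min_len) {
            run[len] = '\0';
            fprintf(out, "%s\n", run);
            count++;
        }
        len = 0;
        if (c == EOF)
            break;
    }
    return count;
}

/* Open the file in binary mode so that no CR/LF translation or
   Ctrl-Z handling gets between the program and the bytes.
   Returns the number of runs printed, or -1 if the open fails. */
static int strings_file(const char *path, FILE *out, int min_len)
{
    FILE *in = fopen(path, "rb");
    int count;

    if (in == NULL)
        return -1;
    count = scan_strings(in, out, min_len);
    fclose(in);
    return count;
}
```

A main() would only need to call strings_file(argv[1], stdout, 4) and report a negative return value as an open failure.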
mustard@sdrc.UUCP (Sandy Mustard) (10/17/89)
In article <2157@avsd.UUCP>, childers@avsd.UUCP (Richard Childers) writes:
> I recently said ...
>
>     char vers[CMDBUFSIZ] = "v1.00 891010 richard childers" ;
>
> The best help I've gotten to date suggested that I use 'static' storage
> classes for SCCS-type buried ID strings
>
> One possibility that's occurred to me is that, in a PC environment, the
> designers of the compiler might have decided that string compression was
> a win, much as ( according to many contributors ) Lattice's compiler tries
> to identify and eliminate redundant strings from the resulting image,
> given the significant decrease in space available in an MS-DOS
> environment.

You may also want to use

    static const char vers.....
           ^^^^^

This may help avoid the redundant string elimination.

Would not the following be true?

    static char string1[] = "ABCD";
    static char string2[] = "ABCD";

The compiler could eliminate the redundant strings (when appropriate), whereas

    static const char string1[] = "ABCD";
    static char string2[] = "ABCD";

should force the compiler to store two separate strings.

(I hope someone will correct me if I'm wrong. :-))

Sandy
cpcahil@virtech.UUCP (Conor P. Cahill) (10/18/89)
In article <914@sdrc.UUCP>, mustard@sdrc.UUCP (Sandy Mustard) writes:
> You may also want to use
>
>     static const char vers.....
>            ^^^^^
> This may help avoid the redundant string elimination.

The const should be a giant flag to the compiler that this data is the perfect choice for redundant data elimination since it won't be changed.
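One way to look at the const question is through the ANSI C rules rather than any particular compiler; what MSC 4.00 or Lattice actually do is not something this sketch establishes. Named arrays are distinct objects, so their addresses must compare unequal whenever the program looks at them, const or not. String literals reached through pointers are different: the compiler is allowed, but not required, to store identical literals once. The function names below are invented for illustration.

```c
#include <stdio.h>

/* Two named arrays: distinct objects, so their addresses must differ
   whenever the program can observe them, with or without const. */
static const char string1[] = "ABCD";
static char string2[] = "ABCD";

/* Two pointers to string literals: the compiler may point both at a
   single stored copy, or may keep two copies. */
static const char *literal1 = "ABCD";
static const char *literal2 = "ABCD";

/* Always 1 on a conforming compiler. */
int arrays_are_distinct(void)
{
    return string1 != string2;
}

/* 1 if the compiler merged the two literals, 0 if it did not;
   either answer is permitted. */
int literals_are_shared(void)
{
    return literal1 == literal2;
}

/* Print what this particular compiler did. */
void report(FILE *out)
{
    fprintf(out, "arrays distinct: %d\n", arrays_are_distinct());
    fprintf(out, "literals shared: %d\n", literals_are_shared());
}
```

Running report(stdout) under different compilers and optimization settings shows whether literal merging happens in practice. Data whose address is never examined and that is never referenced may still be merged or dropped by an optimizer, which is the earlier concern about unreferenced ID strings.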