v069qqqc@ubvmsd.cc.buffalo.edu (Michael Carrato) (10/29/90)
I am writing a general purpose database in C. I'm developing it for the company I work for, so I'm doing it with the IBM XTs they have in mind. However, I would like to eventually Amiga-tize it and release it as (probably) shareware. I would therefore prefer to make the program (as well as the database format) as portable as possible.

The problem is (correct me if I'm wrong) that Intel machines are "little endian" whereas Motorolas are "big endian". I assume C adopts one of these standards to maintain portability of files between machines. My question is: which is the C standard, and how do I ensure that a database created on the IBM version will not be garbage on the Amiga version? (To add to the confusion, I'm doing the development on a Sun... I have no idea what the SPARC standard is!)

As it stands now, I'm using fread() and fwrite() for my file i/o. I KNOW this must be wrong, since it's simply a binary image transfer. Is there any general way to read/write a block of data so that this mess can be resolved, or do I have to write code to output the data byte-by-byte?

Forgive me if I'm making a mountain out of a molehill, but I want to be sure before I go ahead with this thing.

As a side note, is there any demand for this type of program in the Amiga community? I have seen few general purpose PD/Freeware/Shareware database programs out there, and I've been tempted to go out and buy SuperBase or some such commercial program. (Why haven't I? I'm broke! :-(..). I would think there are others in my position... looking for a decent, cheap database.

Also, if you are interested, what features would you like to see? I plan to intuitionalize it, and add an AREXX port.. (I don't have AREXX yet but I'm working on it.. poor college student :-}). The program so far is built to handle a large number of records fairly quickly, and I hope to eventually include a report generator and some other goodies.

Thanks for enduring all of this babble...
Mike Carrato SUNY at Buffalo, senior in computer engineering.
mcmahan@netcom.UUCP (Dave Mc Mahan) (10/30/90)
In a previous article, v069qqqc@ubvmsd.cc.buffalo.edu writes:
>
>I am writing a general purpose database in C. I'm developing it for
>the company I work for so I'm doing it with the ibm XT's they have in mind.
>However, I would like to eventually Amiga-tize it and release it as
>(probably) shareware. I would therefore prefer to make the program (as
>well as the database format) as portable as possible.
>
>The problem is, (correct me if I'm wrong) Intel machines are "little
>endian" whereas Motorolas are "big endian". I assume C adopts one of
>these standards to maintain portability of files between machines.
>My question is, which is the C standard, and how do I ensure that a database
>created on the ibm version will not be garbage on the Amiga version.
>(To add to the confusion, I'm doing the development on a sun... I have
>no idea what the sparc standard is!)

You are correct in your observation. Alas, there is no standard for the order of bytes written to a file (that I have ever heard of or seen). There really aren't any guarantees that you will even be using the same number of bytes for an integer on one machine that you use on another. Some compilers like to use 16 bit integers, some prefer 32 bit integers. About all K&R will let you count on is that a long integer is at least as big as a 'normal' integer.

There isn't really even a definite standard as to the number of bits per character, although most machines that are capable use 8 bits per character. Some older mainframes use 6 bits and some use 9. 'C' is able to accept code that uses the ASCII character sequence as well as EBCDIC (although functions like strcmp() have special versions for EBCDIC machines).

About the only thing you can REALLY count on is that you can read a byte and write a byte. If you want to be really portable, you have to pack and un-pack bytes as they are read or written to disk. There really isn't any absolute guarantee that writing a binary 16 bit value on a 68000 CPU will write the most significant byte first.

>As it stands now, I'm using fread() and fwrite() for my file i/o.
>I KNOW this must be wrong, since it's simply a binary image transfer.
>Is there any general way to read/write a block of data so that this
>mess can be resolved, or do I have to write code to output the data
>byte-by-byte?

To absolutely guarantee portability, you have to read and write on a byte by byte basis. It's a pain, but it's the only way I know of. Of course, you can always make some assumptions.......

The other solution is to keep all your info as ASCII characters. It tends to make your data files much bigger and slow down access times, but last time I checked, everyone still reads English from left to right, right? :-)

>Mike Carrato
>SUNY at Buffalo, senior in computer engineering.

-dave
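The pack and un-pack approach described in this post can be sketched roughly as follows. This is only an illustration under stated assumptions: bytes are 8 bits wide, the file format is chosen to be most-significant-byte first, and the function names (pack_u16, unpack_u16, pack_u32, unpack_u32) are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers: store integers most significant byte first,
 * one byte at a time, so the layout does not depend on the CPU.
 * Assumes 8-bit bytes. */

/* Store the low 16 bits of v into buf[0..1]. */
void pack_u16(unsigned char *buf, unsigned int v)
{
    assert(buf != NULL);
    buf[0] = (unsigned char)((v >> 8) & 0xFF);
    buf[1] = (unsigned char)(v & 0xFF);
}

/* Rebuild a 16-bit value from buf[0..1]. */
unsigned int unpack_u16(const unsigned char *buf)
{
    assert(buf != NULL);
    return ((unsigned int)buf[0] << 8) | (unsigned int)buf[1];
}

/* Store the low 32 bits of v into buf[0..3]. */
void pack_u32(unsigned char *buf, unsigned long v)
{
    assert(buf != NULL);
    buf[0] = (unsigned char)((v >> 24) & 0xFF);
    buf[1] = (unsigned char)((v >> 16) & 0xFF);
    buf[2] = (unsigned char)((v >> 8) & 0xFF);
    buf[3] = (unsigned char)(v & 0xFF);
}

/* Rebuild a 32-bit value from buf[0..3]. */
unsigned long unpack_u32(const unsigned char *buf)
{
    assert(buf != NULL);
    return ((unsigned long)buf[0] << 24) |
           ((unsigned long)buf[1] << 16) |
           ((unsigned long)buf[2] << 8)  |
            (unsigned long)buf[3];
}
```

A record would be packed field by field into an unsigned char buffer and that buffer handed to fwrite(); reading reverses the steps after fread(). Because only shifts and masks are used, the same source should give the same file bytes on an XT, an Amiga or a Sun.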
etxtomp@eos.ericsson.se (Tommy Petersson) (10/30/90)
> In a previous article, v069qqqc@ubvmsd.cc.buffalo.edu writes:
>>
>>I am writing a general purpose database in C. I'm developing it for
>>the company I work for so I'm doing it with the ibm XT's they have in mind.
>>However, I would like to eventually Amiga-tize it and release it as
>>(probably) shareware. I would therefore prefer to make the program (as
>>well as the database format) as portable as possible.
>>
>>The problem is, (correct me if I'm wrong) Intel machines are "little
>>endian" whereas Motorolas are "big endian". I assume C adopts one of
>>these standards to maintain portability of files between machines.
>>My question is, which is the C standard, and how do I ensure that a database
>>created on the ibm version will not be garbage on the Amiga version.
>>(To add to the confusion, I'm doing the development on a sun... I have
>>no idea what the sparc standard is!)

For portability of data files, NEVER use int, long... in your program. Have a global header file, included in all your programs, that defines portable types like BIT16, BIT32 and so on.

The byte-swap issue could be solved by something like this: have a 16-bit word at the beginning of a data file contain the number 1. If you read this data file on another machine and the word contains 256, the bytes are swapped. You could have your read/write routines do different things depending on this initial test (have a table where you install pointers to function A or B...).

However, I don't think byte-swapping is the only problem. Some machines order the most and least significant bits differently, so this number one could probably turn up as 1, 128, 256 or 32768...

ASCII strings will not get swapped, but integers do, so the read routines will have to know what they are reading.

Tommy Petersson
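A rough sketch of the marker-word test suggested in this post. The type name BIT16 comes from the post; everything else (ORDER_MARK, swap16, check_order, choose_fix16) is a hypothetical name, and the typedefs assume a compiler where short is exactly 16 bits and long is at least 32.

```c
#include <assert.h>

/* Assumption: short is 16 bits and long is at least 32 bits on this
 * compiler. A different compiler would need different typedefs here. */
typedef unsigned short BIT16;
typedef unsigned long  BIT32;

/* Value written as the first 16-bit word of every data file. */
#define ORDER_MARK ((BIT16)1)

/* Exchange the two bytes of a 16-bit value. */
BIT16 swap16(BIT16 v)
{
    return (BIT16)((((unsigned)v & 0xFFu) << 8) |
                   (((unsigned)v >> 8) & 0xFFu));
}

/* Leave a 16-bit value unchanged (used when the file is in native order). */
BIT16 keep16(BIT16 v)
{
    return v;
}

/* Classify the marker word read from a file:
 * 0 = same byte order as this machine, 1 = swapped, -1 = not recognised. */
int check_order(BIT16 mark_from_file)
{
    if (mark_from_file == ORDER_MARK)
        return 0;
    if (mark_from_file == swap16(ORDER_MARK))
        return 1;
    return -1;
}

/* Pick the fix-up routine once, after the marker test, so the read
 * routines can call through the pointer without re-testing. */
typedef BIT16 (*fix16_fn)(BIT16);

fix16_fn choose_fix16(int swapped)
{
    assert(swapped == 0 || swapped == 1);
    return swapped ? swap16 : keep16;
}
```

The reader would call check_order() on the first word of the file, refuse the file on -1, and otherwise pass every 16-bit field through the function returned by choose_fix16(). Strings would bypass the fix-up entirely, which is why the read routines need to know the type of each field.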
peter@sugar.hackercorp.com (Peter da Silva) (10/30/90)
You have three choices:

	o Choose one end of the word, and write it out byte-by-byte.
	o Flip your words in memory, then use fwrite.
	o Use an intermediate transfer format.

I would suggest the third choice. And I'd suggest that if there's an existing standard for whatever you're doing (bitmaps, say) use it. There is a whole family of standards for things like video, audio, and music on the Amiga called Interchange File Format.

If there is no standard, use a text format unless you're dealing with huge databases. This way you can edit your files to patch up problems and for debugging using a regular text editor.

(Oh, I wish .info files were text!)
--
Peter da Silva. `-_-'
<peter@sugar.hackercorp.com>.
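One possible shape for the text-format option, as a sketch only. The record layout, the '|' separator and the function names are all invented for the example, and snprintf() assumes a C99 library. Names containing '|' or a newline, and empty names, are not handled.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record layout, for illustration only. */
struct record {
    long id;
    int  qty;
    char name[32];
};

/* Format one record as the text line "id|qty|name\n".
 * Returns the number of characters written, or -1 if it did not fit. */
int record_to_text(const struct record *r, char *line, size_t size)
{
    int n;

    assert(r != NULL && line != NULL);
    n = snprintf(line, size, "%ld|%d|%s\n", r->id, r->qty, r->name);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Parse a line produced by record_to_text().
 * Returns 0 on success, -1 if the line does not match the layout. */
int record_from_text(const char *line, struct record *r)
{
    assert(line != NULL && r != NULL);
    /* %31[^|\n] reads at most 31 characters, leaving room for the NUL. */
    return sscanf(line, "%ld|%d|%31[^|\n]",
                  &r->id, &r->qty, r->name) == 3 ? 0 : -1;
}
```

Because every field goes through printf-style conversion, byte order and integer width never reach the file, and a damaged record can be repaired in any text editor.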
sheley@convex.com (John "Dumptruck" Sheley) (11/01/90)
In <43072@eerie.acsu.Buffalo.EDU> v069qqqc@ubvmsd.cc.buffalo.edu (Michael Carrato) writes:

>I am writing a general purpose database in C. I'm developing it for
>the company I work for so I'm doing it with the ibm XT's they have in mind.
>However, I would like to eventually Amiga-tize it and release it as
>(probably) shareware. I would therefore prefer to make the program (as
>well as the database format) as portable as possible.
>
>The problem is, (correct me if I'm wrong) Intel machines are "little
>endian" whereas Motorolas are "big endian". I assume C adopts one of
>these standards to maintain portability of files between machines.
>My question is, which is the C standard, and how do I ensure that a database
>created on the ibm version will not be garbage on the Amiga version.
>(To add to the confusion, I'm doing the development on a sun... I have
>no idea what the sparc standard is!)

There is no C standard which declares that integer types will be stored in memory `big endian' or `little endian' - that is left completely up to the architecture you're running on. C treats the basic types (char, int, short, long, float, double) atomically, and cares not a bit how the machine stores data as long as it retrieves the data the same way it stored the data. This causes nightmares (and rightly so) to people who do indiscriminate `union'ing and try to port their code to different cpus.

>As it stands now, I'm using fread() and fwrite() for my file i/o.
>I KNOW this must be wrong, since it's simply a binary image transfer.
>Is there any general way to read/write a block of data so that this
>mess can be resolved, or do I have to write code to output the data
>byte-by-byte?

This isn't really wrong - you're just screwed if you try to use the same data file on an I*M and an Amiga. There's really nothing you're missing. The suggestions others have offered are about all you can do if you want portability. I could be mistaken, but I believe dBASE stores its numbers as ASCII. Not a great example, but there you are.

Something else to consider, though, is how you are going to run indexes to your database tables. If you allow mixed composite keys (1 or more ASCII fields combined with 1 or more numeric fields) to be used to build indexes, then it might be better to store your numbers as ASCII. The collating sequence of a group of keys composed of ASCII data and pure binary numbers together will not be what you want unless the numbers are unsigned.

By the way, I'm thinking of doing something along those same lines, and I'm having trouble deciding what mechanism to use for indexing. Do you plan to use ISAM, B-tree/balanced B-tree, or hashing? I'm thinking about ISAM for sorting speed, combined with extra pointers for B-tree kind of searches. I'd appreciate any thoughts.

>Mike Carrato
>SUNY at Buffalo, senior in computer engineering.

John Sheley
Convex Computer Corp.
sheley@convex.com
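To make the composite-key point concrete, here is one way an ASCII key might be built so that a plain string comparison gives a sensible order. This is a sketch under assumptions not stated in the post: the numeric part is non-negative and is zero-padded to a fixed width (without padding, "9" would sort after "10"), the text part is space-padded to a fixed width, and all names and widths are hypothetical. snprintf() assumes a C99 library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_WIDTH 8                            /* hypothetical field widths */
#define NUM_WIDTH  10
#define KEY_SIZE   (NAME_WIDTH + NUM_WIDTH + 1) /* +1 for the NUL */

/* Build a composite key: name left-justified and space-padded (or
 * truncated) to NAME_WIDTH, then num as zero-padded decimal digits.
 * key must have room for KEY_SIZE chars.
 * Returns 0 on success, -1 if num needs more than NUM_WIDTH digits. */
int make_key(char *key, const char *name, unsigned long num)
{
    int n;

    assert(key != NULL && name != NULL);
    n = snprintf(key, KEY_SIZE, "%-*.*s%0*lu",
                 NAME_WIDTH, NAME_WIDTH, name, NUM_WIDTH, num);
    return (n == KEY_SIZE - 1) ? 0 : -1;
}

/* Keys built this way can be ordered with an ordinary string compare. */
int compare_keys(const char *a, const char *b)
{
    return strcmp(a, b);
}
```

With fixed widths, every key has the same length, so the index code can treat keys as opaque byte strings regardless of which machine wrote them.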
david@twg.com (David S. Herron) (11/03/90)
In article <43072@eerie.acsu.Buffalo.EDU> v069qqqc@ubvmsd.cc.buffalo.edu writes:
>I am writing a general purpose database in C. ...
>... I would therefore prefer to make the program (as
>well as the database format) as portable as possible.
>
>The problem is, (correct me if I'm wrong) Intel machines are "little
>endian" whereas Motorolas are "big endian". I assume C adopts one of
>these standards to maintain portability of files between machines. ...

No, C does not adopt a standard -- whatever the byte order is on the local system is what it is.

The networking (TCP/IP) people came across the same problem loooong ago and came up with a Network Standard Byte Order. Grep around in /usr/include on your sun for "ntohl" to see how they're implemented. I forget where this ordering is defined, probably in an RFC somewhere. Probably it's that one which reprints the part of Gulliver's Travels detailing his experiences with the Endians..?

>As it stands now, I'm using fread() and fwrite() for my file i/o.
>I KNOW this must be wrong, since it's simply a binary image transfer.
>Is there any general way to read/write a block of data so that this
>mess can be resolved, or do I have to write code to output the data
>byte-by-byte?

No.. you can use f{read,write}() but before you fwrite() and after you fread() you should go through the buffer & swap all the bytes around.

Usually this is implemented by having a layer of code through which all accesses to the object (the database in this case) are done. In this layer you translate between internal & external representation as data passes through.
--
<- David Herron, an MMDF & WIN/MHS guy, <david@twg.com>
<- Formerly: David Herron -- NonResident E-Mail Hack <david@ms.uky.edu>
<-
<- Use the force Wes!
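A minimal sketch of such a translation layer built on htonl() and ntohl(). It assumes a POSIX system where these are declared in <arpa/inet.h> (older BSD-derived systems may declare them in <netinet/in.h>) and a compiler that provides <stdint.h>. The record layout and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htonl(), ntohl(); assumed POSIX location */

/* Hypothetical in-memory record, in the host's own byte order. */
struct rec {
    uint32_t id;
    uint32_t balance;
};

/* Size of one record in its external (on-disk) form. */
#define REC_DISK_SIZE 8

/* Internal -> external: convert each field to network byte order
 * and copy it into the buffer that will be given to fwrite(). */
void rec_to_disk(const struct rec *r, unsigned char *buf)
{
    uint32_t id, balance;

    assert(r != NULL && buf != NULL);
    id = htonl(r->id);
    balance = htonl(r->balance);
    memcpy(buf, &id, 4);
    memcpy(buf + 4, &balance, 4);
}

/* External -> internal: copy each field out of the buffer filled by
 * fread() and convert it back to the host's byte order. */
void rec_from_disk(const unsigned char *buf, struct rec *r)
{
    uint32_t id, balance;

    assert(buf != NULL && r != NULL);
    memcpy(&id, buf, 4);
    memcpy(&balance, buf + 4, 4);
    r->id = ntohl(id);
    r->balance = ntohl(balance);
}
```

The rest of the program would only ever see struct rec; the byte buffers exist solely inside this layer. On a big-endian machine the conversions do nothing, and on a little-endian one they swap, so the file bytes come out the same either way. Compilers for the XT or the Amiga of that era may not supply these functions, in which case hand-written equivalents would be needed.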