flak@slovax.UUCP (Dan Flak) (08/17/87)
In article 923 Prentiss Riddle presents: >I've inherited responsibility for a medium-large and growing database (say >Currently the data is stored in a single large Informix file in which each >record consists of a key and 52 floating point lab values. The key is a >composite of a study group number, a subject number and a time. >+-----------+-------------+------+-------+-------+- -+--------+ >| group no. | subject no. | time | val 1 | val 2 | ... | val 52 | >+-----------+-------------+------+-------+-------+- -+--------+ (not all values are measured every time, and some of the value fields have different meanings to different groups). >PLAN A: Instead of using very wide records with 52 lab values, why not use >short records each containing a key and a single lab value? Each record >might look like this: > >+-----------+-------------+------+-------+-------+ >| group no. | subject no. | time | label | value | >+-----------+-------------+------+-------+-------+ > >The "label" would be a short (3-4 byte) string explaining the meaning of the >"value" field. Obviously most of our storage here goes for keys (and I like your plan A, but have additional questions. Are subject numbers unique to groups? That is, does an individual group study several subjects, but no subject is studied by more than one group? If that's the case, you could always maintain a group no. - subject no. file "on the side". You can join the group name back in at report run time. This would make the file look like (with sample data): +-------------------------+ | FILE ONE | +-----------+-------------+ | group no. | subject no. | |-----------+-------------| | 1 | 870001 | | 1 | 870002 | | 2 | 870003 | | 2 | 870004 | | 2 | 870005 | | 3 | 870006 | | . | . | | . | . | | . | . | +-----------+-------------+ +-----------------------------------+ | FILE 2 | |------------+------+-------+-------| | subject no | time | label | value | |------------+------+-------+-------| | 870001 | 1000 | TEMP | 98.6 | | 870001 | 1000 | SYS | 120 | | 870001 | 1000 | DIA | 80 | | 870001 | 1100 | TEMP | 98.7 | | . | . | . | . | | . | . | . | . | | . | . | . | . | | 870006 | 1000 | PH | 7.05 | +------------+------+-------+-------+ If not, no big deal, just more typing. Now, if there's only going to be one or two readings per subject number, this scheme may not be worth while. However, if there's going to be many readings per subject number, then this is the way to go. The main problem with this set up is that you have to retype the subject number (and possibly the group number) for each observation. It's far easier to "fill in the blanks" in the "flat" file system you have set up now. (I.e. type the group and subject once, and then associate a string of numbers). It may be easier on input, but it's more difficult to extract the data especially since corresponding fields don't always mean the same thing. I don't see how you can distinguish between legitimate zero values and "no data (reading)". Fortunately, Informix offers a nice compromise known as the Master-Detail screen Perform. 4GL is also supposed to have some "whiz-bang" entry method for doing something similar. BTW: When (if?) you decide to go with plan A. Don't overlook using unload ascii in dbstaus followed by some awk scirpts to parse out your data for reload into the new schema. Of course, there's always the possibility of a compromise. That is, have one (or several) "standard files". I'm judging by the organization you work for, that some of these readings are medical tests. I know that one standard would be basic vital signs (pulse, blood pressure, temperature, respiration) . You can maintain these in a file system such as you have now (only much reduced), and join them with "specific" tests or measurements for fluid intake/output, blood work results (gasses, electrolytes, etc. - possibly another "standard" group), etc. Without knowing the specifics, it's hard for me to generalize. I won't bore the rest of the net with specifics. If I am anywhere near the target, and if you can put up with more of my rambling send me E-mail. -- {psivax,ism780}!logico!slovax!flak : {hplsla,uw-beaver}!tikal!slovax!flak Dan Flak-R & D Associates,3625 Perkins Lane SW,Tacoma,Wa 98499,206-581-1322