martin@csd4.csd.uwm.edu (Martin A Miller) (07/18/90)
Greetings.

Our site (University of Wisconsin-Milwaukee) has recently purchased a Convex 220 (BSD 4.3) to replace our ancient UNISYS 1100/81 (Exec-8). There is some concern here that some of the [valuable] utilities on the 1100 might not have UNIX analogues. One particular utility which has proved to be indispensable is called the "Unified Data Handler" (UDH). The following is a list of some of the functions of this software:

1) Append files.

2) Combine adjacent records based on either every n records, or a break in a particular variable's value, or type of record. That is, combine two 128-character lines into one 256-character line based on any of the above options, for example.

3) Convert tapes (EBCDIC, packed decimal, etc.) into ASCII files.

4) Create new variables, based on mathematical and/or conditional statements (i.e., recode).

5) Match records from two files based on multiple keys, with the capability to include or exclude multiple records with the same keys from the two files; to do this differently for each of the two files; to output the matched records from both files separately and the unmatched records from both separately; and/or to output a merged file.

6) Merge files based on multiple keys.

7) Print files, regardless of record length and character type (e.g., binary, packed decimal).

8) Redescribe data, if necessary, to string manipulation procedures.

9) Reformat data.

10) Select records based on multiple keys.

11) Sequence checking of data; the capability to output the highest or lowest record with the same value on a key variable.

12) Sort data based on multiple keys, alpha or numeric, in ascending or descending order.

13) Update records in one file based on new and/or additional data in another file: the ability to change existing data to new values, and to add new data to records with the same key fields.
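To make item 5 concrete, here is a rough sketch in Python of the matching step on fixed-width records. It is only an illustration of the kind of operation meant, not a description of how UDH works internally. The function names (key_of, match_records), the column positions, and the sample records are all hypothetical, and the sketch keeps every key in memory, which would need rethinking (a sort followed by a sequential merge, say) for files of several million records.

```python
def key_of(record, key_fields):
    """Build a composite key from fixed-width column slices.

    key_fields is a list of (start, end) column offsets; this layout
    is an assumption made for the example.
    """
    return tuple(record[start:end] for start, end in key_fields)


def match_records(file_a, file_b, key_fields_a, key_fields_b):
    """Split two lists of records into matched and unmatched groups.

    Returns (matched_a, unmatched_a, matched_b, unmatched_b), each in
    the original order of its input. Records sharing a key are all kept.
    """
    keys_a = {key_of(rec, key_fields_a) for rec in file_a}
    keys_b = {key_of(rec, key_fields_b) for rec in file_b}

    matched_a, unmatched_a = [], []
    for rec in file_a:
        target = matched_a if key_of(rec, key_fields_a) in keys_b else unmatched_a
        target.append(rec)

    matched_b, unmatched_b = [], []
    for rec in file_b:
        target = matched_b if key_of(rec, key_fields_b) in keys_a else unmatched_b
        target.append(rec)

    return matched_a, unmatched_a, matched_b, unmatched_b


# Hypothetical sample data: columns 0-2 hold an id, columns 3-6 a year.
KEYS = [(0, 3), (3, 7)]
people = ["0012024ALICE", "0022024BOB  ", "0032023CAROL"]
scores = ["0012024X", "0032024Y", "0022024Z", "0022024W"]

matched_people, unmatched_people, matched_scores, unmatched_scores = match_records(
    people, scores, KEYS, KEYS
)
```

With the sample data, the record for id 003 fails to match in both files because the year differs (2023 against 2024), while both score records for id 002 are kept as matches. Taking the key columns separately for each file allows the two files to have different layouts.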
Note: the above procedures have been performed on data sets of more than 4,000,000 records with substantial record length (500 characters per record, for example) using UDH.

I realize that there *are* UNIX utilities, etc., which will perform the above data manipulation routines, but I am not aware of an integrated package (perhaps even third party software) to do *all* these things. I am also aware of the formidable capabilities of sed or awk to manipulate data, but it may require a considerable investment for previously non-UNIX personnel to write sed or awk scripts.

Are there any data handling packages which might fill the bill? Please email me in reply - if the mail doesn't get through, please follow up to comp.unix.questions.

Thank you,
-mm

Martin A. Miller
Programmer/Consultant
Social Science Research Facility
University of Wisconsin-Milwaukee
Internet: martin@csd4.csd.uwm.edu
Bitnet  : martin%csd4.csd.uwm.edu@INTERBIT
UUCP    : uunet!martin@csd4.csd.uwm.edu