Kling%UCI-20B@UCI-750A.ARPA@sri-unix.UUCP (06/09/84)
From: Rob-Kling <Kling%UCI-20B@UCI-750A.ARPA>

In reply to the query about statistical packages:

1. I have had similar preferences for SPSS or SAS on a micro. Because of all sorts of sizing and programming problems, micro packages can't compete for very large files or in the sheer variety of statistical programs. Most of the packages lack good data management facilities and handle few variables. (See the major review in Byte, April 1984.)

2. I searched for a medium-priced package for an IBM-PC which would handle medium-sized data sets and lots of variables, and which would have good data management facilities and good non-parametric and parametric statistical facilities. Three packages stood out from the rest and two are plausible. They all cost $400-$500 and also take great pains to demonstrate statistical accuracy in multiple regression on a set of standard test data called the Longley data set (highly intercorrelated predictor variables). (See Byte, Nov. 1983 for a description of the Longley data.)

a. CRISP - An SPSS-like package with terrific non-parametric statistics and good regression analysis (both linear and stepwise regressions). There were good implementations of multiple comparisons tests for ANOVAs (e.g., Newman-Keuls and several others). No factor analysis or reliability analyses at this time. Good interactive interface, but it is menu driven, with no way of bypassing the menus. (Could be clumsy and frustrating for experienced users.) Also, they seemed to expect good add-on prices for new modules such as factor analysis (maybe $100). As CRISP develops, a complete package could reach $700-$800. (Initial cost $500.) The package was just coming out in the late Fall.

I rejected CRISP because its developers were trying to sell the product to a national vendor, and its future was shaky regarding support for current purchasers and total pricing for the complete package as it would evolve. With different pricing and a more stable support arrangement, I would re-consider CRISP.
[CRISP Software is based in San Francisco.]

b. SYSTAT - This package implements a general linear model and is written in Fortran for fast computation. It can make use of an 8087 (optional). (The co-processor is said to speed up the package by 5-10 times.) The package is weak in non-parametric statistics. It includes factor analysis. Its real strength is in providing a good implementation of the general linear model, which can be used for regression analysis and fairly complex ANOVAs. It includes log-linear analysis and multi-dimensional scaling. This is the only micro package I know of that will run these analyses. It appears to have good data management facilities, including a BASIC-like programming language for transforming and selecting data. One of my colleagues bought SYSTAT and swears by it.

The number of cases is limited by the storage medium (and can be very large with a hard disk), but the number of variables is limited by core: 50 on a 64K Z-80-based machine and 75 on a 256K MS-DOS machine. Costs $495.

My colleague who bought SYSTAT uses it on an IBM PC/XT with an 8087 co-processor. The programs come on 5 diskettes and take up about 1.4 megabytes of hard disk. I envisioned a good deal of diskette swapping in using this program. (Also, I've heard that some people who have 8087's have heating problems and add fans or move extra boards to expansion boxes. My colleague has, and I don't know how general this problem is.) I use a PC with 2 floppy drives and 576K (for RAM disks). I suspected that SYSTAT would require more machine resources for convenient use. It is a high-powered package that warrants investigation.

SYSTAT will run on PC's, PC-clones, and a variety of Z-80 based machines including the Kaypro, Morrow Micro-Decision, Radio Shack's, etc.

SYSTAT Inc.
1127 Asbury Ave.
Evanston, Ill. 60202
312-864-5670

c. Walonick Associates StatPac.
This package supports some non-parametric measures of association, t-tests, multi-value survey responses, and some multivariate statistics such as linear regression and 1- and 2-way ANOVAs. It is modelled on SPSS and requires that the user set up a codebook for a data set. The data management is fairly good and uses SPSS-style Select-If's, RECODE-IF's, etc. It handles Subfiles. (It will also read data in SPSS format if it is written out with a WRITE CASES command.) The codebook allows useful variable labels and value labels, which help document the output. StatPac can be run interactively or from a batch file of commands (useful for repetitive analyses). [It can print to disk or printer.] It runs several times faster if the data file is stored on RAM disk.

It will handle up to 5000 cases and 255 columns of information per case. The number of variables depends upon the size of the data per variable. However, there is a merge utility which enables one to manage larger data sets. I have a survey with several hundred variables (some derived) and have segmented it so that it is straightforward to handle with StatPac.

StatPac does not currently support factor analysis, but a factor analysis package is being developed. They seem to release a new version with additional facilities each quarter. Upgrades cost $50 for current users. They are included in the $400 price of the package for new buyers. Phone support for problems has been very good.

StatPac will offer a sharp discount for groups of users in the same organizational unit (e.g., academic department, research lab). Purchase 2 copies at $400 each. Additional copies are $20 each. For 10 users, this brings the cost down to $96 each (2 x $400 + 8 x $20 = $960 total)!

Walonick Associates
5624 Girard Ave.
Minneapolis, Minn. 55419
612-866-9022

----------------------------------------------------

I purchased StatPac and have found it quite useful for my purposes. The interface is good and has gotten better with a recent upgrade.
It is missing certain non-parametric and parametric analyses I would like. SYSTAT includes some of those analyses, but handles fewer variables. If I were buying now, and in less of a rush than I was then, I would look much more carefully at SYSTAT.

I think these are the best micro-statistics packages available under $500. One could hope for better. Each of the packages has serious limitations in interfaces, range of tests, size of data sets easily accommodated, etc. when compared with MAINFRAME packages. Still, each of these packages has been thoughtfully designed and reflects sensible choices in scaling down mainframe concepts for a micro environment.

In a year these packages will improve and there may be some new competitors. SPSS is releasing a micro version very soon. It will *require* a hard disk (and an 8087?) and cost about $700-$800.

SYSTAT, CRISP, and StatPac are all relatively new packages. They are all a quantum jump in sophistication above the competition in their data management and in tests (SYSTAT). I suspect that a new generation of micro-based statistical software is emerging.

Rob Kling
University of California, Irvine