[net.micro] Statistical Packages for IBM-PC's

Kling%UCI-20B@UCI-750A.ARPA@sri-unix.UUCP (06/09/84)

From:  Rob-Kling <Kling%UCI-20B@UCI-750A.ARPA>

In reply to the query about statistical packages:

1. I have had similar preferences for SPSS or SAS on a micro.
Because of all sorts of sizing and programming problems, micro
packages can't compete on very large files or in the sheer variety
of statistical programs. Most of the packages lack good data
management facilities and handle few variables. (See the major review
in Byte, April 1984.)

2. I searched for a medium priced package for an IBM-PC which would
handle medium sized data sets, lots of variables, have good data
management facilities, and have good non-parametric and parametric
statistical facilities. Three packages stood out from the rest and two
are plausible. They all cost $400-$500 and also take great pains to
demonstrate statistical accuracy in multiple regression on a set of
standard test data called the Longley data set (highly intercorrelated
predictor variables). (See Byte, Nov. 1983 for a description of the
Longley data.)
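
To make the accuracy test concrete, here is a minimal sketch in Python of
the same kind of check. It does NOT use the Longley data. It builds a
small made-up data set with two nearly identical predictors and known
coefficients (1, 2, 3), fits it once in exact rational arithmetic and
once in floating point, and reports how far the floating-point answer
drifts. The function name fit_ols and the data are my own illustration,
not anything taken from the packages reviewed here.

```python
from fractions import Fraction

def fit_ols(rows, ys):
    """Least squares via the normal equations (X'X) b = X'y, solved by
    Gaussian elimination with partial pivoting. Works for floats or Fractions."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * y for r, y in zip(rows, ys))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, k):
            factor = a[r][col] / a[col][col]
            for c in range(col, k + 1):
                a[r][c] -= factor * a[col][c]
    # Back substitution
    b = [0] * k
    for i in reversed(range(k)):
        s = a[i][k] - sum(a[i][j] * b[j] for j in range(i + 1, k))
        b[i] = s / a[i][i]
    return b

# Hypothetical test data: intercept column, x1, and x2 = x1 +/- 0.001,
# so the two predictors are almost perfectly correlated.
TRUE_COEFFS = [1, 2, 3]
exact_rows = [[Fraction(1), Fraction(i), Fraction(i) + Fraction((-1) ** i, 1000)]
              for i in range(1, 11)]
exact_ys = [sum(c * x for c, x in zip(TRUE_COEFFS, row)) for row in exact_rows]

float_rows = [[float(x) for x in row] for row in exact_rows]
float_ys = [float(y) for y in exact_ys]

exact_fit = fit_ols(exact_rows, exact_ys)   # exact rational arithmetic
float_fit = fit_ols(float_rows, float_ys)   # ordinary double precision
worst_error = max(abs(b - t) for b, t in zip(float_fit, TRUE_COEFFS))

print("exact fit:", [str(b) for b in exact_fit])
print("float fit:", float_fit)
print("worst absolute error in float fit:", worst_error)
```

The closer the two predictors are to each other, the more digits the
floating-point fit loses, which is why a data set with highly
intercorrelated predictors makes a good stress test for a regression
routine.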

a. CRISP - An SPSS-like package with terrific non-parametric
statistics and good regression analysis (both linear and stepwise
regressions). There were good implementations of multiple comparisons
tests for ANOVAs (e.g., Newman-Keuls and several others). No factor
analysis or reliability analyses at this time. Good interactive
interface, but it is menu driven, with no way of bypassing the menus.
(Could be clumsy and frustrating for experienced users.) Also, they
seemed to expect good add-on prices for new modules such as factor
analysis (maybe $100).

	 As CRISP develops, a complete package could reach $700-$800.
(Initial cost $500.) The package was just coming out in the late Fall.
I rejected CRISP because its developers were trying to sell the
product to a national vendor and its future was shaky regarding
support for current purchasers and total pricing for the complete
package as it would evolve.

With different pricing and a more stable support arrangement, I would
re-consider CRISP.

[CRISP Software is based in San Francisco.]

b. SYSTAT - This package implements a general linear model and is
written in Fortran for fast computation. It can make use of an 8087
(optional). (The co-processor is said to speed up the package by 5-10
times.) The package is weak in non-parametric statistics. It includes
factor analysis. Its real strength is in providing a good
implementation of the general linear model which can be used for
regression analysis and fairly complex ANOVAs. It includes log-linear
analysis and multi-dimensional scaling. This is the only micro-package
I know of that will run these analyses. It appears to have good data
management facilities, including a BASIC-like programming language for
transforming and selecting data.
	
	 One of my colleagues bought SYSTAT and swears by it. The
number of cases is limited by the medium (and can be very large with a
hard disk), but the number of variables is limited by core - 50 on a
64K Z-80-based machine and 75 on a 256K MS-DOS machine. Costs $495.

	My colleague who bought SYSTAT uses it on an IBM PC/XT with an
8087 co-processor. The programs come on 5 diskettes and take up about
1.4 megabytes of hard disk. I envisioned a good deal of diskette
swapping in using this program. (Also, I've heard that some people who
have 8087's have heating problems and add fans or move extra boards to
expansion boxes. My colleague has, and I don't know how general this
problem is.) I use a PC with 2 floppy drives and 576K (for RAM disks).
I suspected that SYSTAT would require more machine resources for
convenient use. It is a high-powered package that warrants
investigation.

SYSTAT will run on PC's, PC-clones, and a variety of Z-80 based
machines including the Kaypro, Morrow Micro-Decision, Radio Shack's, etc.

SYSTAT Inc.
1127 Asbury Ave.
Evanston, Ill. 60202
312-864-5670

c. Walonick Associates StatPac - This package supports some
non-parametric measures of association, t-tests, multi-value survey
responses, and some multivariate statistics such as linear regression
and 1- and 2-way ANOVAs. It is modelled on SPSS and requires that the
user set up a codebook for a data set. The data management is fairly
good and uses SPSS-style Select-If's, RECODE-IF's, etc. It handles
Subfiles. (It will also read data in SPSS format if it is written out
with a WRITE CASES command). The codebook allows useful variable
labels, and value labels which help document the output.

	StatPac can be run interactively or be run from a batch file
of commands (useful for repetitive analyses). [It can print to disk or
printer.]  It runs several times faster if the data file is stored on
RAM disk. It will handle up to 5000 cases and 255 columns of
information per case. The number of variables depends upon the size of
the data per variable. However, there is a merge utility which enables
one to manage larger data sets. I have a survey with several
hundred variables (some derived) and have segmented it so that it is
straightforward to handle with StatPac.
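
For anyone planning a similar split, here is a rough sketch in Python of
the bookkeeping involved. It is only an illustration under my own
assumptions: the 255-column limit is the figure quoted above, but the
function name, the idea of reserving a few columns in every segment for
a case ID (so the segments can be matched up again), and the example
widths are hypothetical and are not taken from StatPac or its merge
utility.

```python
MAX_COLUMNS = 255  # per-case column limit quoted above

def segment_variables(widths, id_width=4, limit=MAX_COLUMNS):
    """Greedily pack (name, width) pairs into segments of at most `limit`
    columns. Each segment reserves `id_width` columns for a case ID
    (an assumption made for this sketch)."""
    segments, current, used = [], [], id_width
    for name, width in widths:
        if id_width + width > limit:
            raise ValueError("variable %s is too wide for one segment" % name)
        if used + width > limit:
            # Current segment is full: close it and start a new one.
            segments.append(current)
            current, used = [], id_width
        current.append(name)
        used += width
    if current:
        segments.append(current)
    return segments

# Hypothetical survey: 300 variables, each coded in 2 columns.
survey = [("V%03d" % n, 2) for n in range(1, 301)]
segments = segment_variables(survey)
sizes = [len(s) for s in segments]
print(len(segments), "segments with", sizes, "variables each")
```

With 2-column variables and a 4-column ID, each segment holds 125
variables, so the 300-variable example needs three segments.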

	StatPac does not currently support factor analysis, but a
factor analysis package is being developed. They seem to release a new
version with additional facilities each quarter. Upgrades cost $50 for
current users. They are included in the $400 price of the package for
new buyers. Phone support for problems has been very good.

	StatPac will offer a sharp discount for groups of users in the
same organizational unit (e.g., academic department, research lab).
Purchase 2 copies at $400 each. Additional copies are $20 each. For
10 users, this brings the cost down to $96 each!
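
The arithmetic behind that figure, as a small Python sketch (the
function name is mine; the prices are the ones quoted above, with the
first 2 copies at $400 and every further copy at $20):

```python
FULL_PRICE = 400       # price of each of the first two copies
FULL_PRICE_COPIES = 2  # copies that must be bought at full price
EXTRA_PRICE = 20       # price of each additional copy

def cost_per_user(users):
    """Average cost per copy for a group buying `users` copies."""
    full = min(users, FULL_PRICE_COPIES)
    extra = max(users - FULL_PRICE_COPIES, 0)
    total = full * FULL_PRICE + extra * EXTRA_PRICE
    return total / users

for n in (1, 2, 5, 10):
    print(n, "users:", cost_per_user(n), "dollars each")
```

For 10 users the total is 2 x $400 + 8 x $20 = $960, or $96 each.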


Walonick Associates
5624 Girard Ave.
Minneapolis, Minn. 55419
612-866-9022

----------------------------------------------------

	I purchased StatPac and have found it quite useful for my
purposes. The interface is good and has gotten better with a recent
upgrade. It is missing certain non-parametric and parametric analyses
I would like. SYSTAT includes some of those analyses, but handles
fewer variables. 

	 If I were buying now and in less of a rush than then, I would
look much more carefully at SYSTAT. I think these are the best
micro-statistics packages available under $500. One could hope for
better. Each of the packages has serious limitations in interfaces,
range of tests, size of data sets easily accommodated, etc. when
compared with MAINFRAME packages. Still, each of these packages has
been thoughtfully designed and reflects sensible choices in scaling
down mainframe concepts for a micro environment.

	 In a year these packages will improve and there may be some
new competitors. SPSS is releasing a micro version very soon. It will
*require* a hard disk (and an 8087?) and cost about $700-$800. SYSTAT,
CRISP, and StatPac are all relatively new packages. They are all a
quantum jump in sophistication above the competition in their data
management and in tests (SYSTAT). I suspect that a new generation of
micro-based statistical software is emerging. 


Rob Kling
University of California, Irvine