[ut.dcs.hci] /local/bin/unixstat

alee (Alison Lee) (02/27/90)

There has been interest expressed within the group in statistical tools.
Here is one (UNIX/STAT) that we have had for some time and for which CSRI
recently purchased an upgrade (version 5.4).

This message is also intended to solicit comments about what people want in
a statistical package.  If you have any views or preferences, or are aware
of any statistical tool that you think is really good and that we should
consider getting, VOICE your opinion.

+++++++++++++++++++++++++++++++++++++++++++++

For those unfamiliar with this package, it is a set of statistical tools
and data manipulation tools written by Gary Perlman.  Our previous version
was something like 5.1/5.2.  These are provided AS IS with no warranty
expressed or implied.

This new version is now installed in /local/bin/unixstat.  Here is a
synopsis of the package.

Data Manipulation Programs:

     abut      join data files beside each other
     colex     column extraction/formatting
     dm        conditional data extraction/transformation
     dsort     multiple key data sorting filter
     linex     line extraction
     maketrix  create matrix format file from free-format input
     perm      permute line order randomly, numerically, alphabetically
     probdist  probability distribution functions
     ranksort  convert data to ranks
     repeat    repeat strings or lines in files
     reverse   reverse lines, columns, or characters
     series    generate an additive series of numbers
     transpose transpose matrix format input
     validata  verify data file consistency

Data Analysis Programs:

     anova     multi-factor analysis of variance
     calc      interactive algebraic modeling calculator
     contab    contingency tables and chi-square
     desc      descriptions, histograms, frequency tables
     dprime    signal detection d' and beta calculations
     features  tabulate features of items
     oneway    one-way anova/t-test with error-bar plots
     pair      paired data statistics, regression, scatterplots
     rankind   rank order analysis for independent conditions
     rankrel   rank order analysis for related conditions
     regress   multiple linear regression and correlation
     stats     simple summary statistics
     ts        time series analysis and plots

Alison

doc@dgp.toronto.edu (Blaine Price) (02/27/90)

Since this was posted to the HCI group, I'll express my vote for a Macintosh
stats package with a reasonable interface, rather than yet another UN*X
program with unclear documentation requiring years of use to perform the
simplest task and an on-site guru to do anything remotely interesting.
I'm sure that all of those fascinating commands in Alison's synopsis are
meaningful to someone, but I'd rather spend my time using a package than
reading mindless documentation and getting frustrated.

elf@dgp.toronto.edu (Eugene Fiume) (02/27/90)

In article <1990Feb26.173656.17187@jarvis.csri.toronto.edu> doc@dgp.toronto.edu (Blaine Price) writes:
>
>I'm sure that all of those fascinating commands in Alison's synopsis are
>meaningful to someone, but I'd rather spend my time using a package than
>reading mindless documentation and getting frustrated.

Funny, it seemed to me that many of the functions provided were precisely
the sorts of technical tools you would need to do decent statistics.  If
you want a Mac interface to give you a warm soft feeling, that's one
thing.  If you want to do proper data analysis, you have to know the cold
hard facts.  All of the terms mentioned can be found in a good stats book.

By the way, I strongly suggest you have a look at Maple's stats package.

doc@dgp.toronto.edu (Blaine Price) (02/27/90)

elf@dgp.toronto.edu (Eugene Fiume) writes:

>Funny, it seemed to me that many of the functions provided were precisely
>the sorts of technical tools you would need to do decent statistics.  If
>you want a Mac interface to give you a warm soft feeling, that's one
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	    [ooooh, let's get the interface people up in arms, eh? :-)]
>thing.  If you want to do proper data analysis, you have to know the cold
>hard facts.  All of the terms mentioned can be found in a good stats book.

Yes, even my clearly impoverished undergraduate education introduced me
to most of the "terms" used, and I even had to apply them in another 
science (when you study population genetics they kind of want some 
interesting results after you've drugged and counted a few thousand 
fruit flies).  

We can argue interfaces until the Drosophila melanogaster come home,
but given the choice of packages with equal power and the choice of a 
point and click "idiot" interface or a unix command line interface and
a stack of man pages, I'll take the mouse.  Is it intuitive what the
arguments are for the "features" command? (tabulate features of items)
Or better yet, the aptly named "dm" command?

My point:  let's put "ease of use" high on the shopping list, maybe right
after "powerful enough for our purposes."  And we can assume that our users
passed their required second-year stats course and can handle ANalysis Of
VAriance or CHI-square tests without hurting their brains too much...  :-)

						_doc

elf@dgp.toronto.edu (Eugene Fiume) (02/27/90)

In article <1990Feb27.004330.18285@jarvis.csri.toronto.edu> doc@dgp.toronto.edu (Blaine Price) writes:
>
>My point:  let's put "ease of use" high on the shopping list, maybe right
>after "powerful enough for our purposes."

My point is that this package is already here (as is Maple), and it's worth
trying it before dismissing it.  The fact that I find Mac interfaces insulting
(with others saying the same for Unix) is not the issue.

moraes@cs.toronto.edu (Mark Moraes) (02/27/90)

unixstat is installed on CSRI (and all tracking domains) on Suns, and
has been for many years.  (In future, maybe on Irises.)  Alison's
announcement merely pointed out we have a new version for the benefit
of those who were asking about stats tools.  (The Dept. of Statistics
has various other stats tools running on their Unix systems - S, for
instance)

The few times I've used it to munge large chunks of data, I found
unixstat useful and easy to use and interface with report generators
like awk and grap.  But then, I have no real objection to the Unix
command line.

Point and click visual interfaces are useful for many things, pipes
are useful for many others.  If someone has a nice visual interface
for tying together lots of different operations on data sets, plotting
them, etc, on some window system available here, by all means, get a
copy and let's try it, discuss it, and flame about it...  In the
meantime, why denigrate useful tools?

We return you to your regularly scheduled rational discussion on HCI...

	Mark, trying to head off yet another Mac/Unix war...