[net.math.stat] How robust is S?

daven@lll-crg.ARPA (Dave Nelson) (06/06/85)
*** REPLACE THIS LINE WITH YOUR MISSILE ***

I have just begun to look into S for a friend, and have been
unpleasantly surprised by its handling of arithmetic.
Here's the scenario:

I am trying to create a multiple logistic regression facility
similar to that available with SAS's LOGIST.  The guts of the
facility is an MLE computation of the following form:

?fmin(beta,					     # parameters in beta
	sum(log(1+exp(X %* beta))) - y %* X %* beta, # -log(L)
	-t(X) %* (y - 1/(1 + exp(-(X %* beta)))),    # grad(-log(L))
	-->put your starting vector here<---)	     # starting vector

The data I used is the dataset in the documentation for the SAS LOGIST
procedure (SAS Supplemental Library manual p. 83ff.), and I tried to run
the the model in step three of SAS's stepwise multlog example.
So what happened???

To make a long story short,

1)	the behavior of ?fmin is wildly dependent on the value of the
	parameter gtol:  .002 succeeds; .001 aborts with a floating
	point exception as the norm of the gradient approaches the limit.

2)	if I use the results of the SAS run as my STARTING GUESS,
	?fmin returns with beta = c(0, 0, 0, 0) after one iteration!
	(Moral: don't get TOO close with your starting guess??).

3)	all the arithmetic seems to be single-precision:  when ?fmin
	succeeds (as it does with the models in regression steps 1 and 2),
	the results agree only to about 4 significant digits.  I expect more.
	Who's at fault?  SAS or S?  I suspect S.

I could go on and on about the way it loops on a memory allocation error
if you interrupt it at the wrong time, then look at certain data
vectors, etc., etc.  And this is after TWO DAYS of looking into it.

Perhaps I am just exceeding its limitations and expecting too much.
Could someone verify that I am either:

*	justly upset at its behavior;
*	abusing S and deserve what I get;
*	too naive for my own good and crazy for even CONSIDERING
	such a problem in an interpreted system;
*	all of the above; or
*	none of the above.

Cheers,

Dave Nelson (daven@lll-crg.arpa)