[comp.ai.neural-nets] Efficiency

bph@buengc.BU.EDU (Blair P. Houghton) (03/12/89)

In article <37300001@m.cs.uiuc.edu> kadie@m.cs.uiuc.edu writes:
>
>In article joe@amos.ling.ucsd.edu (Fellow Sufferer) writes:
>> 
>> Hecht-Nielsen Corp of San Diego, Ca is doing just such research.
>> Their real problem was explaining why an applicant was refused [...]
>> That's not quite as easy as it sounds.
>> 
> There is another potential problem, even if an explanation is found, it may
> be illegal.
>
> For example, the ANN may be very sensitive to zipcode. This is called
> redlining; in many places it is illegal.

This is cured easily by doing what humans should (in effect) do:

Don't allow the net to process irrelevant information.  ZIP code has
nil to do with whether an applicant will repay.  There can be no
_causal_ relationship between ZIP and credit rating.  (There is a large
body of evidence supporting a positive correlation, but it has
nothing to do with the number itself.)  To use it as input for a NN is
to make the job _more_ difficult and _less_ accurate, whether it
results in a "better class" of clientele or not.
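
A minimal sketch of what "don't allow the net to process it" could look
like in practice, assuming applicant records arrive as simple
dictionaries.  The field names (income, years_employed, zip_code) are
hypothetical, made up for illustration; only ZIP code comes from the
discussion above.

```python
# Fields the net is never allowed to see. Only ZIP code is named in the
# discussion; the field name itself is an assumption.
EXCLUDED_FIELDS = {"zip_code"}

def strip_excluded(applicant):
    """Return a copy of the applicant record without the excluded fields."""
    return {k: v for k, v in applicant.items() if k not in EXCLUDED_FIELDS}

# Hypothetical applicant record, for illustration only.
applicant = {"income": 42000, "years_employed": 6, "zip_code": "02215"}
net_input = strip_excluded(applicant)  # what actually gets fed to the net
```

The filtering happens before the net sees anything, so there is nothing
for it to become sensitive to.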

This raises the question of efficiency metrics for Neural Networks.
In our example, it is bad business to lend money to deadbeats, and it
is worse business to label potentially profitable debtors as deadbeats
for erroneous reasons.  There are only so many Donald Trumps out there,
thank Napoleon.  The network used to make this decision would have to
be tuned to optimize the return on the lent money.
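
One rough way to picture "tuned to optimize the return": treat the net's
output as a score, approve everyone at or above a cutoff, and pick the
cutoff that leaves the most money on a set of loans whose outcomes are
already known.  This is only a sketch under assumptions.  The gain and
loss figures, the scores, and the function names are all invented for
illustration, and it tunes a cutoff rather than the net's weights.

```python
def portfolio_return(scores, repaid, threshold, gain=0.15, loss=1.0):
    """Net return, per unit lent, from approving every score >= threshold.

    gain: profit on a loan that is repaid (assumed figure).
    loss: amount lost on a loan that defaults (assumed figure).
    """
    total = 0.0
    for score, was_repaid in zip(scores, repaid):
        if score >= threshold:
            total += gain if was_repaid else -loss
    return total

def best_threshold(scores, repaid, candidates):
    """Pick the candidate cutoff with the highest net return."""
    return max(candidates, key=lambda t: portfolio_return(scores, repaid, t))

# Made-up scores from a hypothetical net, with known outcomes.
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
repaid = [True, True, True, False, False]
candidates = [0.1, 0.3, 0.5, 0.7, 0.9]

cutoff = best_threshold(scores, repaid, candidates)
```

With these numbers, a cutoff that is too low lends to the two defaulters,
and one that is too high turns away applicants who would have repaid;
both end up with less money than the cutoff in between.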

So, like, how do you tell beforehand that it's doing its job, and that
it's not _missing_ some people who were just never allowed to have
a loan before?  How do you know if a neural net is being overselective?
How do you even define the point of overselectiveness?  It's easy in
dollar-based problems: the net with the most at the end of the game wins.
What do you use for dollars in other situations?
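
For what it's worth, one candidate yardstick for overselectiveness is
the fraction of people who would have repaid but got turned down anyway.
The sketch below assumes you somehow know the true outcome for everyone,
rejected applicants included, which is exactly the information that is
missing for people who were never allowed a loan.  The function name and
the data are hypothetical.

```python
def false_rejection_rate(approved, would_repay):
    """Fraction of would-be repayers that the net turned down."""
    decisions_for_good = [a for a, r in zip(approved, would_repay) if r]
    if not decisions_for_good:
        return 0.0  # no good applicants at all, so nobody was missed
    rejected_good = sum(1 for a in decisions_for_good if not a)
    return rejected_good / len(decisions_for_good)

# Made-up decisions and (normally unobservable) true outcomes.
approved = [True, False, False, True, False]
would_repay = [True, True, False, True, True]

rate = false_rejection_rate(approved, would_repay)
```

This gives a number, but not the "point" at which the number becomes too
high; that still needs something to play the role of dollars.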


				--Blair