[comp.sys.ibm.pc] cheap text scanners

paula@bcsaic.UUCP (Paul Allen) (06/05/90)

I've been looking for an inexpensive way to get stock and mutual fund
data from the newspaper into my computer.  A hand-scanner/OCR software
package recently went on sale at our local Egghead Software store, so
I rushed on down to pick one up.  The scanner was the Logitech
Scanman Plus.  The OCR software was also from Logitech and included
"Omni-Font technology" said to enable it to read any font.

To cut to the chase, the guys at Egghead told me that it wasn't
possible to reliably scan newspaper text with a 400dpi scanner and
that I'd probably be unhappy with the results.  I ended up leaving
the store empty-handed.  (But let's hear it for the up-front candor
of the store personnel, hey?  I was pre-sold, and they talked me out
of it!  :-) )

Now, I realize that hand scanners are primarily used for images,
rather than text.  But I've been hearing about character recognition
software for a couple of years now.  Surely it hasn't all been hype?
(The Egghead guy pointed to the 20-point type on the scanner box and
said it would read that just fine!)

A back-of-the-envelope calculation shows that a 7-point numeral
scanned at 400dpi will stand ~39 pixels high.  That would seem like
adequate resolution for quite accurate recognition.  Lower-case
alpha would be a bit harder at ~20 vertical pixels.  Seems within
the realm of possibility to me, but I haven't tried to actually
write OCR software.  :-)
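
For anyone who wants to check the arithmetic, here is a minimal sketch
of the estimate above.  It assumes a point is 1/72 inch, that the
numeral fills the full 7-point body height, and that lower-case letters
are about half that tall.  The function name and the 0.5 fraction are
illustrative guesses, not figures from any scanner or OCR vendor.

```python
POINTS_PER_INCH = 72  # assumption: 1 point = 1/72 inch


def pixels_high(point_size, dpi, fraction_of_body=1.0):
    """Approximate glyph height in pixels when scanned at `dpi`.

    fraction_of_body is the share of the point size the glyph
    actually occupies (1.0 = full body height, a generous upper bound).
    """
    return point_size / POINTS_PER_INCH * dpi * fraction_of_body


# 7-point type scanned at 400dpi, as in the estimate above.
numeral = pixels_high(7, 400)
# Assumed: lower-case letters are roughly half the body height.
lower_case = pixels_high(7, 400, fraction_of_body=0.5)

print(round(numeral), round(lower_case))
```

This gives about 39 pixels for the numeral and about 19 for lower-case,
in line with the rough figures quoted.  Real glyphs occupy less than
the full body height, so treat both numbers as upper bounds.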

Is anybody out there actually scanning newspaper text with a 
hand-scanner?  How about with a flat-bed scanner?  What software are 
you using?  Which scanner?  What sort of error rate do you consider 
acceptable?  Do you have trouble with specks of dust being recognized
as characters?  If I tried to scan a full printed page out of the Investors 
Daily, how many days of compute time would it take on a 386/25?  :-)

If you have some useful intelligence to share, please reply by
email to the address below.  While I would love to be able to read
comp.sys.ibm.pc regularly, I simply lack the time.  Since this is 
probably of interest to more than just myself, I'll summarize whatever 
I learn.

Thanks,

Paul Allen

-- 
------------------------------------------------------------------------
Paul L. Allen                       | pallen@atc.boeing.com
Boeing Advanced Technology Center   | ...!uw-beaver!bcsaic!pallen

aesop@milton.u.washington.edu (Jeff Boscole) (07/05/90)

------------------------
Seeking any product review information on the following OCR software.
  Dest Retrieve, OCRON Perceive, Omnipage, Olduvai.

Seeking reliable OCR that is (1) fast, (2) accurate, and (3) easy to use.
  Helpful if it offers both -trainable- and -omnifont- capability.

Software should -learn- from the first page and speed up for subsequent
pages of text (same fonts).

Must run on IBM PC 286 or 386.  Should be compatible with a panoply
of flat-bed scanner/fax offerings.

Post -and- send E-mail.  Price/performance is a consideration,
but performance wins over price.

:=: