paula@bcsaic.UUCP (Paul Allen) (06/05/90)
I've been looking for an inexpensive way to get stock and mutual fund data from the newspaper into my computer. A hand-scanner/OCR software package recently went on sale at our local Egghead Software store, so I rushed on down to pick one up. The scanner was the Logitech Scanman Plus. The OCR software was also from Logitech and included "Omni-Font technology" said to enable it to read any font.

To cut to the chase, the guys at Egghead told me that it wasn't possible to reliably scan newspaper text with a 400dpi scanner and that I'd probably be unhappy with the results. I ended up leaving the store empty-handed. (But let's hear it for the up-front candor of the store personnel, hey? I was pre-sold, and they talked me out of it! :-) )

Now, I realize that hand scanners are primarily used for images, rather than text. But I've been hearing about character recognition software for a couple of years now. Surely it hasn't all been hype? (The Egghead guy pointed to the 20-point type on the scanner box and said it would read that just fine!)

A back-of-the-envelope calculation shows that a 7-point numeral scanned at 400dpi will stand ~39 pixels high. That would seem like adequate resolution for quite accurate recognition. Lower-case alpha would be a bit harder at ~20 vertical pixels. Seems within the realm of possibility to me, but I haven't tried to actually write OCR software. :-)

Is anybody out there actually scanning newspaper text with a hand-scanner? How about with a flat-bed scanner? What software are you using? Which scanner? What sort of error rate do you consider acceptable? Do you have trouble with specks of dust being recognized as characters? If I tried to scan a full printed page out of the Investors Daily, how many days of compute time would it take on a 386/25? :-)

If you have some useful intelligence to share, please reply by email to the address below. While I would love to be able to read comp.sys.ibm.pc regularly, I simply lack the time.
Since this is probably of interest to more than just myself, I'll summarize whatever I learn.

Thanks,

Paul Allen

--
------------------------------------------------------------------------
Paul L. Allen                       | pallen@atc.boeing.com
Boeing Advanced Technology Center   | ...!uw-beaver!bcsaic!pallen
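
For anyone who wants to check the back-of-the-envelope figures in the post above, here is a minimal sketch of the arithmetic. It assumes the common convention of 72 points to the inch, and it treats lower-case letters as roughly half the full type height; that 0.5 fraction is an illustrative guess, not a measured property of any newspaper font. The function name `pixels_high` and its parameters are made up for this example.

```python
# Rough estimate of how many scanner pixels a printed character spans.
# Assumption: 1 point = 1/72 inch (the usual desktop-publishing point).
POINTS_PER_INCH = 72


def pixels_high(point_size, dpi, fraction_of_body=1.0):
    """Return the approximate height in pixels of type scanned at `dpi`.

    `fraction_of_body` is a hypothetical scaling factor for glyphs that
    do not fill the whole type height (e.g. lower-case letters).
    """
    inches = point_size / POINTS_PER_INCH
    return inches * dpi * fraction_of_body


if __name__ == "__main__":
    full = pixels_high(7, 400)          # full 7-point height at 400dpi
    lower = pixels_high(7, 400, 0.5)    # assumed half-height lower case
    print(f"7pt at 400dpi: about {full:.1f} pixels")
    print(f"lower case (assumed half height): about {lower:.1f} pixels")
```

This reproduces the ~39 pixel figure for the full 7-point height and lands near the ~20 pixel figure for lower case. Note that printed numerals usually occupy somewhat less than the full point size, so the real pixel count for a digit would likely be lower than 39.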
aesop@milton.u.washington.edu (Jeff Boscole) (07/05/90)
------------------------
Seeking any product review information on the following OCR software: Dest Retrieve, OCRON Perceive, Omnipage, Olduvai.

Seeking reliable OCR that is (1) fast, (2) accurate, (3) easy to use. It would be helpful to have both -trainable- and -omnifont- capability. Software should -learn- from the first page and speed up for subsequent pages of text (same fonts).

Must run on IBM PC 286 or 386. Should be compatible with a panoply of flat-bed scanner/fax offerings.

Post -and- send E-mail. Price/performance is a consideration, but performance wins over price.

:=: