[comp.ai] Optical Character Recognition, how?

00lhramer@bsu-ucs.uucp (Leslie Ramer) (12/03/90)

I was curious about the methods OCR scanners use to translate
bitmapped images into ASCII codes.  I've put together some thoughts
on a methodology that could be used to accomplish this.

The problem, according to my reasoning, follows a kind of
flow.

1.  Some form of logical reduction needs to be made to make
    the problem more tractable.

2.  Some form of mathematical reduction must be done, producing
    a list of statements or data.

3.  A set of rules or a functional relationship of some kind
    must exist to make sense of the data.

These are elements of the "HOW" of my solution method.
Here's how I imagine an OCR system would apply them.

1 -> A bitmapped image is bound to have hundreds, if not
thousands of dots.  It is really quite obvious that not all of these
dots are necessary for the recognition of the characters.  Would
some kind of pattern-reduction method lead toward a successful
reduction of the problem?
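
Here's a rough sketch (in Python, for concreteness) of the kind of
pattern reduction I have in mind, assuming each glyph has already
been isolated as a small bitmap of 0s and 1s; the names and the 8x8
grid size are just made up for illustration:

    # Crop an isolated glyph to its bounding box and shrink it onto
    # a fixed 8x8 grid, throwing away most of the individual dots.
    def crop_to_bounding_box(bitmap):
        rows = [i for i, row in enumerate(bitmap) if any(row)]
        cols = [j for j in range(len(bitmap[0]))
                if any(row[j] for row in bitmap)]
        return [row[cols[0]:cols[-1] + 1]
                for row in bitmap[rows[0]:rows[-1] + 1]]

    def downsample(bitmap, size=8):
        h, w = len(bitmap), len(bitmap[0])
        grid = []
        for i in range(size):
            grid_row = []
            for j in range(size):
                # Mark a cell if any source dot in its patch is set.
                r0 = i * h // size
                r1 = max((i + 1) * h // size, r0 + 1)
                c0 = j * w // size
                c1 = max((j + 1) * w // size, c0 + 1)
                patch = [bitmap[r][c]
                         for r in range(r0, r1) for c in range(c0, c1)]
                grid_row.append(1 if any(patch) else 0)
            grid.append(grid_row)
        return grid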

2 -> Once we have the image reduced a little, some mathematics is
needed to reduce the image to another data form.  I would imagine
that a set of vectors needs to be generated that best approximates
the image.  Once this has been accomplished, the vector mathematics
that are available could be used to recognize the characters in
virtually any rotation.
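
One guess at what such a data form could look like: a small feature
vector whose entries (dot count, mean and spread of the distance
from the centroid) don't change much under rotation.  This is my own
invention, not a standard OCR recipe:

    import math

    def feature_vector(grid):
        # Collect the coordinates of every set dot in the reduced grid.
        points = [(i, j) for i, row in enumerate(grid)
                         for j, dot in enumerate(row) if dot]
        n = len(points)
        ci = sum(i for i, _ in points) / n      # centroid row
        cj = sum(j for _, j in points) / n      # centroid column
        # Distances from the centroid stay the same no matter how the
        # character is rotated, which is the property we want here.
        dists = [math.hypot(i - ci, j - cj) for i, j in points]
        mean = sum(dists) / n
        spread = sum((d - mean) ** 2 for d in dists) / n
        return [n, mean, spread]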

3 ->  The mathematical data created in step 2 needs to be
interpreted.  Some reference vectors (dot products, lengths,
approximate locations, etc.) should be matched against some sort of
"FUNCTION" table, and the approximate ASCII characters output.
Maybe the output could even include a probability that each
character is the correct one, in case no exact match can be made.
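
Continuing the sketch, the "FUNCTION" table could be a set of stored
reference vectors, with each candidate character scored by how close
it lies to the measured vector.  The reference values and the
distance-to-probability conversion below are purely illustrative:

    import math

    REFERENCE = {                    # hypothetical reference vectors
        'A': [14, 2.1, 0.8],
        'B': [18, 1.9, 0.6],
        '1': [8, 2.4, 1.5],
        'l': [8, 2.5, 1.6],
    }

    def rank_candidates(features):
        def distance(ref):
            return math.sqrt(sum((a - b) ** 2
                                 for a, b in zip(features, ref)))
        # Turn distances into rough "probabilities" that sum to one.
        scores = {ch: math.exp(-distance(ref))
                  for ch, ref in REFERENCE.items()}
        total = sum(scores.values())
        return sorted(((ch, s / total) for ch, s in scores.items()),
                      key=lambda pair: -pair[1])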

This method is very undeveloped in my mind.  I realize that there
are other issues within this problem that pose further difficulties,
e.g. what is the difference between a lowercase L and the number 1?
Either context or a direct difference needs to be noted.
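
A toy illustration of the context idea for `l' versus `1': lean on
the characters next to the ambiguous one.  The rule here is
deliberately crude and only meant to show the shape of it:

    def resolve_l_vs_1(prev_char, next_char):
        neighbours = prev_char + next_char
        if any(c.isdigit() for c in neighbours):
            return '1'   # e.g. "1990": digits nearby, pick the digit
        if any(c.isalpha() for c in neighbours):
            return 'l'   # e.g. "hello": letters nearby, pick the letter
        return 'l'       # no evidence either way; default arbitrarily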

Again, this is an interesting problem to me.  I'm interested in it
because it's fascinating that a machine can read as well as, if not
faster than, I do.  (The blind reader in the library reads (through
a speech synth) at a rate of up to 425 wpm.  I don't normally read
anywhere near that.)

It would seem that there are a great number of operations that would need
to be done to recognize the characters no matter what typeface is used.
I like to see things get done, and get done fast.  (I'm very different
from the machines in that sense; I don't always think fast.)

:-)

Thanks in advance,

====         "No one runs so fast as he that is chased."
                ._
|    .-. .-.    | \ .-, .-.-. .-. .-.  00LHRAMER@bsu-ucs.bsu.edu
|    +-' `-,    |_/ .-+ | | | +-' |    00LHRAMER@bsu-ucs.UUCP
|___ `-' `-'    | \ `.| | | | `-' |    00LHRAMER@bsuvax1.bitnet

maf@gtenmc.UUCP (Mary Ann Finnerty) (12/07/90)

This is another curious question... I use scanners frequently,
since we don't have a LAN at work and I sometimes need
documentation from one of the other platforms here.

Since the scanner just recognizes typewritten or computer
print-out for input, the characters it must recognize are (I think)
easy to categorize.

My question is this: are there scanners out there that improve
as they go along?  For example, ones that get input somehow
confirming correct character recognition, so that they add to
their repertoire of fonts that they can recognize?

I don't know if I expressed this very well, but it seems like a
good application for a self-learning program.  I'm not sure, though,
how the responses could be confirmed or rejected... Curious.

maf

alex@dutirt2.tudelft.nl (Alexander Vonk) (12/10/90)

It has been some three or four years since I studied the field of
OCR, but I guess what I know of it is still partly applicable.

In <985@gtenmc.UUCP> maf@gtenmc.UUCP (Mary Ann Finnerty) writes:
>Since the scanner just recognizes typewritten or computer
>print-out for input, the characters it must recognize are (I think)
>easy to categorize.
Well, it depends.  It's much more difficult with a printer or
typewriter that does not align its characters very well.  Skewed
characters are much harder to recognize if you expect `straight'
characters.  One time, I attended a demo at Canon Nederland (in the
Netherlands, that is).  I was very curious (too) about the scanner
and OCR software being demonstrated and asked the sales rep to scan
a letter somewhat rotated, so it would be hard to recognize.  I
don't remember the exact figures, but the error rate went up by a
factor of five or ten: from 2 or 3 errors in the well-scanned letter
to 20 or so in the poorly scanned one.  Of course, this letter was
printed on their Canon laser printer with one standard font.

>My question is this: are there scanners out there that improve
>as they go along?  For example, ones that get input somehow
>confirming correct character recognition, so that they add to
>their repertoire of fonts that they can recognize?
According to the Canon sales rep, their OCR software could improve
its recognition abilities by repeatedly recognizing text printed in
the same fonts (much like you can learn to read even the most
horrible handwriting).
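
A guess at how such "improving as it goes" might be arranged: keep a
per-character template and fold in the feature vector of every
recognition that gets confirmed (by a person, or by a spelling
checker as suggested below).  This is only a sketch in Python, not
how Canon's software actually works:

    class AdaptiveTemplates:
        def __init__(self):
            self.sums = {}     # character -> summed feature vectors
            self.counts = {}   # character -> number of confirmed samples

        def confirm(self, char, features):
            # Fold one confirmed sample into the running average.
            total = self.sums.setdefault(char, [0.0] * len(features))
            for k, f in enumerate(features):
                total[k] += f
            self.counts[char] = self.counts.get(char, 0) + 1

        def reference(self, char):
            n = self.counts[char]
            return [s / n for s in self.sums[char]]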

>I don't know if I expressed this very well, but it seems like a
>good application for a self-learning program.  I'm not sure, though,
>how the responses could be confirmed or rejected... Curious.
Personally, I think this would be a good opportunity to couple a
spelling checker with the OCR software.  Of course, you should be
very careful about spelling errors in the letter itself, but an OCR
program will calculate an estimate of how well it recognized a
certain character.  Maybe such a program could `learn' by just
scanning the complete printed character set one time.
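
A small sketch of the coupling I mean: if the best guess for a word
is not in the word list, retry the character the recognizer was
least sure about with its second choice.  The word list and the
per-character candidate lists are invented for the example:

    WORDS = {'hello', 'held'}  # stand-in for a real spelling checker

    def correct_word(candidates):
        # candidates: one ranked (char, probability) list per position.
        best = ''.join(ranked[0][0] for ranked in candidates)
        if best in WORDS:
            return best
        # Find the position the recognizer was least certain about...
        shaky = min(range(len(candidates)),
                    key=lambda i: candidates[i][0][1])
        # ...and see whether its second choice gives a known word.
        if len(candidates[shaky]) > 1:
            retry = (best[:shaky] + candidates[shaky][1][0]
                     + best[shaky + 1:])
            if retry in WORDS:
                return retry
        return best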

Alexander Vonk.

+++	Alexander Vonk - Technical Univ. Delft, Netherlands	+++
+++Phone:	(NL) 015 - 78 64 12	(world) 31 15 78 64 12	+++
+++Mail:	alex@dutirt2.tudelft.nl	or alex@dutirt2.UUCP	+++