[comp.ai.neural-nets] Help: separate bitmap characters

brian@ucselx.sdsu.edu (Brian Ho) (08/28/90)

Hello out there,
  I have a problem that you may find interesting, and maybe
  you can give me a hand or a hint.

  I am currently working on an OCR (optical character recognition) project.
  I am now at the stage where I need to scan a page of a document and separate
  each character that appears in it.  The image of the document from the
  scanner is converted into a (binary) bitmap format, e.g.

0000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000
0001111111100000000000000111000000000000000000000000000000000000000000
0001110000000000000000000111000000000110000001101110000000000000000000
0001100000000111111100001111111000011111100001111110000000000000000000
0001100000000111101110001111000000110000110001110000000000000000000111
0001111110000111000110000111000001111111110001100000000000000000011110
0001100000000111000110000110000001111111110001100000000000000000001111
0001100000000111000110000111000001110000000001100000000000000000000011
0001100000000111000110000111000000111001110001100000000000000000000000
0001111111100111000110000011111000011111100001100000000000000000000000
0001111111100010000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000

  
  I have a function that can separate each character from the document.
  My function works fine when two characters are separated by one or more
  blank columns, as in the example shown above.
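
  A minimal sketch of the blank-column idea described above, in Python.
  This is an illustration only, not the actual function from the project:
  the name split_on_blank_columns and the representation of the bitmap as
  a list of equal-length '0'/'1' strings are assumptions.

```python
def split_on_blank_columns(rows):
    """Return (start, end) column spans, end exclusive, that contain ink.

    `rows` is assumed to be a list of equal-length strings of '0' and '1'.
    """
    width = len(rows[0])
    # Vertical projection: number of '1' pixels in each column.
    profile = [sum(row[x] == "1" for row in rows) for x in range(width)]

    spans = []
    start = None  # first column of the span currently being scanned
    for x, count in enumerate(profile):
        if count and start is None:
            start = x                  # ink begins: open a new span
        elif not count and start is not None:
            spans.append((start, x))   # blank column: close the span
            start = None
    if start is not None:
        spans.append((start, width))   # span runs to the right edge
    return spans
```

  With the toy bitmap ["0110010", "0100110", "0110010"] this returns
  [(1, 3), (4, 6)]. Any two shapes with no all-zero column between them
  come back as a single span.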

  My problem is that when two characters are not separated by at least one
  blank column, I cannot distinguish/separate them. (P.S. the characters
  are of unknown size.) e.g.


000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000
000111111110000000000001110000000000000000000000000000000000000000
000111000000000000000001110000000110000001101110000000000000000000
000110000001111111000011111110011111100001111110000000000000000000
000110000001111011100011110000110000110001110000000000000000000111
000111111001110001100001110001111111110001100000000000000000011110
000110000001110001100001100001111111110001100000000000000000001111
000110000001110001100001110001110000000001100000000000000000000011
000110000001110001100001110000111001110001100000000000000000000000
000111111111110001100000111110011111100001100000000000000000000000
000111111110100000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000


In this example the pairs "En" and "te" end up side by side, with no blank
column between the two characters of each pair.
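
One possible direction for pairs like these, given only as a hedged sketch:
instead of looking for blank columns, group the '1' pixels into 8-connected
components and take one bounding box per component. This is a swapped-in
standard technique (connected-component labelling), not something from the
project itself; the function name connected_components and the list-of-strings
bitmap format are assumptions.

```python
from collections import deque


def connected_components(rows):
    """Return one bounding box per 8-connected group of '1' pixels.

    Each box is (left, top, right, bottom), with right and bottom exclusive.
    `rows` is assumed to be a list of equal-length strings of '0' and '1'.
    """
    height, width = len(rows), len(rows[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if rows[y][x] != "1" or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited ink pixel.
            seen[y][x] = True
            queue = deque([(x, y)])
            left, top, right, bottom = x, y, x + 1, y + 1
            while queue:
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx + 1)
                top, bottom = min(top, cy), max(bottom, cy + 1)
                # Visit the 3x3 neighbourhood, clipped to the bitmap edges.
                for ny in range(max(cy - 1, 0), min(cy + 2, height)):
                    for nx in range(max(cx - 1, 0), min(cx + 2, width)):
                        if rows[ny][nx] == "1" and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    # Sort by left edge so boxes follow reading order within one text line.
    return sorted(boxes)
```

On a toy bitmap such as ["1111000", "0100000", "0100111", "0100100", "0100111"],
where the two shapes have no blank column between them but never touch, this
returns two boxes: [(0, 0, 4, 5), (4, 2, 7, 5)]. It has clear limits: two
characters that actually share ink pixels still come out as one component, and
a character made of separate parts (the dot of an "i", for instance) comes out
as two, so some extra rule would be needed for those cases.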


I am wondering if anybody out there can give me some advice on how to solve
this problem.  Or if someone is facing the same type of problem, I'd like
to hear about it. Thank you.


Thanks in advance.

Brian Ho

Contact me at:

brian@yucatec.sdsu.edu
brian@ucselx.sdsu.edu