amit@cs.tamu.edu (Amitabha Mukerjee) (01/29/91)
Archive-name: ai/bibliography/tamu-ai-bib/1991-01-27
Archive-directory: csseq.tamu.edu:/bib/ [128.194.2.20]
Original-posting-by: amit@cs.tamu.edu (Amitabha Mukerjee)
Original-subject: Re: Document Recognition - Information needed
Reposted-by: emv@ox.com (Edward Vielmetti)

Some results of a bibliography search with the keyword
document-recognition. This bibliography is maintained at Texas A&M
University and has about 2000 papers in AI, robotics, and geometric
modeling. You can retrieve it by anonymous ftp from csseq.tamu.edu
(directory bib).

amit mukerjee (amit@cs.tamu.edu)
===========================================================================

Antonacci, F., M. Russo, M.T. Pazienza, P. Velardi; 1989
AI::NATURAL-LANGUAGE DOCUMENT-RECOGNITION 2IBM Italy/RomeUniv./Ancona U.
A system for text analysis and lexical knowledge acquisition,
Data and Knowledge Engineering, July 1989, v.4(1):1-20.

Dengel, Andreas; 1989
VISION::AI::DOCUMENT-RECOGNITION RECTANGLE U.Stuttgart-CS
Automatic visual classification of documents,
Proceedings of Intl Workshop on Industrial Applications of Machine
Intelligence and Vision (MIV-89), Tokyo, Japan, April 1989, p.276-281.
{
{ First, align the document by determining the "dominant skew angle".
{ Next, divide up the document into block segments (rectangles). These
{ are then analyzed using a rule-based system. Results show the system
{ to be extremely robust for the class of business letters. -AM 7/89
{ **** Possible project for implementation with the spatial relations
{ algebra.

Ejiri, Masakazu; 1989
IMAGE-PROC::DOCUMENT-RECOGNITION MAP INSPECTION SPATIAL-REASONING RECTANGLE Hitachi CRL,Tokyo
Knowledge-based approaches to practical image processing,
Proceedings of Intl Workshop on Industrial Applications of Machine
Intelligence and Vision (MIV-89), Tokyo, Japan, April 1989, p.1-8.
{
{ Divide the document surface into different rectangular regions (title
{ area, author-name area etc.) using their own language, FDL (Form
{ Definition Language).
{ This model is then used as input to the vision system; it was
{ used to set up a system for Japanese birth documents. Also some
{ examples of tying maps to views from map locations etc.

Govindaraju, Venu, Stephen W. Lam, Debashish Niyogi, David B. Sher,
Rohini Srihari, Sargur N. Srihari, and Dacheng Wang; 1989
KBS::VISION DOCUMENT-RECOGNITION NATURAL-LANGUAGE SPATIAL SUNY-Buffalo
Newspaper image understanding,
Knowledge Based Computer Systems, Narosa Publishing House, Bombay,
India, Proceedings of the KBCS '89 conference, Bombay, December 1989,
p.375-384.
{
{ Very powerful paper. First, a block segmentation of the newspaper
{ to determine what part of the paper corresponds to what - news,
{ photo, title, dateline, etc. All are _rectangular blocks_, and
{ this analysis is done without reading any of the contents in the
{ block - based on the characteristics of the document itself. Next,
{ within the appropriate blocks, the characters are recognized using
{ a set of features, such as the strokes, a concavity, a hole, etc.
{
{ The most interesting part is the caption-based picture
{ understanding. Based on a machine parsing of the figure caption
{ and a block segmentation of the image itself, the program labels
{ the portions of the image corresponding to interesting objects.
{ For example, faces are recognized by characteristics of the frontal
{ shape - downwardly converging lines, etc. Sample outputs display
{ the face portions of two persons in an image with the caption
{ "Wearing their new Celtics sunglasses are Joseph Crowley, standing
{ with the pennant, and seated from the left, Paul Cotter, John Webb
{ and David Buck." This work is reported in "Extracting visual
{ information from text: using caption to label human faces in
{ newspaper photographs", in CVPR '89. The reference list points to
{ a bunch of earlier stuff from Srihari's group. - AM 2/90

Kasturi, Rangachar; Sing T. Bow; Wassim El-Masri; Jayesh Shah;
James R. Gattiker; and Umesh B.
Mokate; 1990
VISION::RECTANGLE DOCUMENT-RECOGNITION OCR SHAPE 2D SPATIAL-RELATIONS CURVED PennStateU/++
A system for interpretation of line drawings,
IEEE PAMI, v.12(10):978-992.
{
{ "An automatic graphics recognition system which can generate a
{ succinct description of various graphical objects and their spatial
{ relationships has many applications." The premise is that artificial
{ images, made up of blocks, text, and geometrical shapes, can be
{ analyzed automatically and symbolic descriptors generated. The first
{ step is to create smallest enclosing rectangles covering intensity
{ changes. Aspect ratios of rectangles are used to identify text vs
{ graphics areas, but this is a blurred area, so histograms do not work
{ very well (**** FUZZY).
{
{ "Collinear component grouping" is performed next (**** tangency and
{ alignment) in the Hough transform domain with multi-scale resolution.
{ A significant part of the effort is in determining which parts of the
{ image are text, and which parts not, with the eventual objective of
{ removing all text portions from the image, leaving only the line
{ drawings. Gradually various parts of the image are removed using
{ "known shape" models such as trapezoid (model based on vertex P, L1,
{ L2, H, theta1, theta2), quasi-hexagon etc.
{
{ Also does flowchart analysis. - AM 12/90

Koons, David B.; 1988
VISION::AI::HYPERMEDIA::DOCUMENT-RECOGNITION SPATIAL-REASONING TAMU-CS
A model for the representation and extraction of visual knowledge from
illustrated texts,
Master's thesis, also Technical report TAMU-88-010, Computer Science
Dept, TAMU, August 1988, 99 pages.
{
{ Relating illustrative diagrams to text portions referring to the
{ diagram; based on a neuroanatomy text with diagrams and text on
{ facing pages.
{ Constructs a dictionary for natural language phrases
{ such as "emerges from", "above", "attaches to"; uses these together
{ with partial models of the objects to construct predicate logic
{ representations; at this stage the figure-analysis was mostly
{ manual. A powerful concept, but one whose time is surely coming.
{ Can apply some of the ideas from [Mukerjee & Joe 89]. -AM 7/89

Srihari, Sargur N.; 1986
VISION::DOCUMENT-RECOGNITION SUNY Buffalo-CS
Document image understanding,
FJCC 1986, p.87-96.

Srihari, Sargur N.; Ching-Huei Wang; Paul W. Palumbo; and Jonathan J.
Hull; 1987
AI::VISION::DOCUMENT-RECOGNITION SHAPE RECTANGLE SUNY-Buff
Recognizing address blocks on mail pieces: Specialized tools and
problem-solving architecture,
AI Magazine, v.8(4):25-40, Winter 1987.
{
{ Divides up the initial image into 3x3 grid, and identifies the address
{ block area based on a set of five heuristics, which are attenuated
{ through segmentation and thresholding. Some of the rules relate to
{ interpreting block types, e.g.
{
{ Rule MSEGR1:
{ If block A's aspect ratio, length, and height, and the number of
{ lines in the block, are within the acceptable range for
{ machine-generated address labels, then increase the evidence fraction
{ that this is a machine-generated destination address label (by .4 for
{ destination address, .3 for return address, and .2 for advertising
{ text).
{
{ Precursor to the much more thorough [Wang and Srihari 89]. - AM 12/90

Wang, Dacheng; and Sargur N. Srihari; 1989
AI::VISION::IMAGE-PROC DOCUMENT-RECOGNITION TEXTURE FILTER RECTANGLE SUNY-Buf
Classification of newspaper image blocks using texture analysis,
Computer Vision, Graphics, and Image Processing, v.47:327-352, 1989.
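
To make Rule MSEGR1 from the [Srihari et al. 87] annotation above more
concrete, here is a minimal Python sketch. Only the increments (.4, .3,
.2) come from the annotation. The Block fields, the ACCEPTABLE ranges,
and the plain additive update capped at 1.0 are hypothetical stand-ins,
not the paper's actual values or its evidence-combination scheme.

```python
from dataclasses import dataclass

# Increments quoted in the annotation for Rule MSEGR1.
MSEGR1_INCREMENTS = {
    "destination_address": 0.4,
    "return_address": 0.3,
    "advertising_text": 0.2,
}

# Hypothetical acceptable ranges for machine-generated labels
# (illustrative numbers only; the annotation does not give them).
ACCEPTABLE = {
    "aspect_ratio": (1.5, 5.0),
    "length": (200, 900),   # pixels
    "height": (60, 300),    # pixels
    "num_lines": (3, 6),
}


@dataclass
class Block:
    aspect_ratio: float
    length: int
    height: int
    num_lines: int


def in_range(value, bounds):
    low, high = bounds
    return low <= value <= high


def apply_msegr1(block, evidence):
    """Return a new evidence dict after applying the rule to one block.

    Assumption: evidence is combined by plain addition, capped at 1.0.
    """
    fits = all(in_range(getattr(block, name), bounds)
               for name, bounds in ACCEPTABLE.items())
    updated = dict(evidence)
    if not fits:
        return updated  # rule does not fire; evidence unchanged
    for label, increment in MSEGR1_INCREMENTS.items():
        updated[label] = min(1.0, updated.get(label, 0.0) + increment)
    return updated


if __name__ == "__main__":
    label_like = Block(aspect_ratio=3.0, length=500, height=150, num_lines=4)
    print(apply_msegr1(label_like, {}))
```

A block whose features all fall inside the ranges raises all three
hypotheses at once, with the destination-address hypothesis gaining the
most; a block with any feature out of range leaves the evidence as it
was.
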

Yashiro, Hiroshi, Tatsuya Murakami, Yoshihiro Shima, Yashiki Nakano,
and Hiromichi Fujisawa; 1989
VISION::AI::DOCUMENT-RECOGNITION RECTANGLE Hitachi-CRL,Tokyo
A new method for document structure extraction using generic layout
knowledge,
Proceedings of Intl Workshop on Industrial Applications of Machine
Intelligence and Vision (MIV-89), Tokyo, Japan, April 1989, p.282-287.
{
{ Uses the Form Definition Language (FDL) as in [Ejiri 89] to define
{ document structures.