chuq%plaid@Sun.COM (Chuq Von Rospach) (06/01/87)
Date: Fri, 29 May 87 09:56:04 PDT
From: mlwh@sphinx (Martin Hall)

I would be interested in finding out about document analysis. I mean this in as general a sense as you want to take it. Any pointers would be appreciated.

Some of the areas I would be particularly interested in are:

    Effects of different typestyles, page layouts, etc. on the reader

    Analysis of textual content -- how to analyze the content of a document -- length of words/sentences/paragraphs and the effect on readability

Basically, I would like any information that concerns the readability and understanding of a document.

----Martin L. W. Hall----
Sun Microsystems
HASA member in good standing
{allegra | hplabs}!sun!mlwh@sphinx or mlwh@sun.COM

----------------------------------------
Submissions to: desktop%plaid@sun.com -OR- sun!plaid!desktop
Administrivia to: desktop-request%plaid@sun.com -OR- sun!plaid!desktop-request
Paths: {ihnp4,decwrl,hplabs,seismo,ucbvax}!sun
Chuq Von Rospach chuq@sun.COM Delphi: CHUQ
Now, where did my ex-wife put my Fairy Dust?
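The length measurements asked about above (words, sentences, paragraphs) can be approximated with a few lines of code. What follows is only a rough sketch, not an established readability formula: the function name `surface_stats` is made up for illustration, paragraphs are assumed to be separated by blank lines, and sentences are assumed to end in '.', '!' or '?', so abbreviations will be miscounted.

```python
import re


def surface_stats(text):
    """Return simple length statistics for a plain-text document."""
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Assumption: a sentence ends at '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters and apostrophes; digits are ignored.
    words = re.findall(r"[A-Za-z']+", text)
    return {
        "paragraphs": len(paragraphs),
        "sentences": len(sentences),
        "words": len(words),
        "avg_word_length": sum(len(w) for w in words) / len(words) if words else 0.0,
        "avg_sentence_length": len(words) / len(sentences) if sentences else 0.0,
    }


if __name__ == "__main__":
    sample = "The cat sat. It was happy!\n\nDogs bark."
    for name, value in surface_stats(sample).items():
        print(name, value)
```

Numbers like these say nothing about typestyle or layout, and whether they predict how well a reader understands a document is a separate question; they are just the cheapest place to start.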
chuq@plaid.UUCP (06/04/87)
Date: Thu, 4 Jun 87 13:38:38+0300
From: nsc!nsta!nsta.UUCP!iddo (Iddo Carmon /NSTA (052)-522-267)
Organization: National Semiconductor (Israel) Ltd.

>From: mlwh@sphinx (Martin Hall)
>I would be interested in finding out about document analysis. I mean
>this in as general a sense as you want to take it. Any pointers
>would be appreciated.

My view is that this kind of activity is best handled by a proper mix of human/machine interaction. Consider the news system as an example: you have a massive amount of information to choose from, yet you can handle it efficiently and select what is relevant to you by means of software utilities of varying degrees of sophistication. However, these utilities all rely on a set of conventions for what goes in the header lines, which later enable the system to locate articles in the newsgroup hierarchy, and on the intelligence of a human poster who selects the proper newsgroups. The structure of the newsgroup hierarchy is also developed by humans according to their interests, and it is a key factor in the ease of selecting specific information.

Instead of treating a document as a 1-dimensional stream of characters and trying to extract meaning from that, I'd like to see some common general-purpose high-level 'document-programming' language evolve, one that would be used to annotate the text and would then enable automatic parsing of the document into sections and threads of reasoning, selection of pieces by going down a subject menu tree, etc. Such a convention may make it possible to scan/archive documents according to their contents in numerous ways, without requiring a "natural language understanding superexpert system".

--
Iddo Carmon, Architecture Dept.          Tel: +972-52-522-267
National Semiconductor (Israel) Ltd.     uucp: ...!nsc!nsta!iddo
P.O.B. 3007, Herzlia B. 46104, Israel          {hplabs,pyramid,sun,decwrl}
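To make the annotation idea above concrete, here is a minimal sketch of how a parser might split annotated text into sections and select them by subject path. The `.section` and `.subject` directives, and the function names `parse_annotated` and `select`, are invented purely for illustration; they are not taken from any existing markup language or from the post itself.

```python
def parse_annotated(text):
    """Split annotated text into sections.

    Hypothetical convention (an assumption, not a standard):
      .section TITLE    starts a new section
      .subject A/B/C    files the current section under a subject path
    Every other line is body text of the current section.
    """
    sections = []
    current = None
    for line in text.splitlines():
        if line.startswith(".section "):
            current = {
                "title": line[len(".section "):].strip(),
                "subjects": [],
                "body": [],
            }
            sections.append(current)
        elif line.startswith(".subject ") and current is not None:
            path = line[len(".subject "):].strip()
            current["subjects"].append(tuple(path.split("/")))
        elif current is not None:
            current["body"].append(line)
        # Lines before the first .section directive are ignored.
    return sections


def select(sections, prefix):
    """Return titles of sections filed under a subject path starting with prefix."""
    wanted = tuple(prefix.split("/"))
    return [
        s["title"]
        for s in sections
        if any(subj[:len(wanted)] == wanted for subj in s["subjects"])
    ]


if __name__ == "__main__":
    doc = "\n".join([
        ".section Fonts",
        ".subject layout/type",
        "Notes on typefaces.",
        ".section Margins",
        ".subject layout/page",
        "Notes on margins.",
    ])
    print(select(parse_annotated(doc), "layout"))
```

Walking down a subject menu tree then amounts to calling `select` with a progressively longer path. As with news headers, the program does no understanding of its own: the human who writes the annotations supplies all of it.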
chuq%plaid@Sun.COM (Chuq Von Rospach) (06/11/87)
From: ames!hoptoad!localhost!killer!robm
Date: Mon, 8 Jun 87 19:41:31 CDT

> Date: Thu, 4 Jun 87 13:38:38+0300
> From: nsc!nsta!nsta.UUCP!iddo (Iddo Carmon /NSTA (052)-522-267)
> Organization: National Semiconductor (Israel) Ltd.
>
> >From: mlwh@sphinx (Martin Hall)
> >I would be interested in finding out about document analysis. I mean
> >this in as general a sense as you want to take it. Any pointers
> >would be appreciated.
>
> .......
>
> Instead of treating a document as a 1-dimensional stream of characters and
> trying to extract meaning from that, I'd like to see some common
> general-purpose high-level 'document-programming' language evolving, that
> will be used to annotate the text and will then enable automatic parsing of
> the document into sections, threads of reasoning, selection of pieces by
> going down a subject menu tree, etc. Such a convention may make it possible
> to scan/archive documents according to their contents in numerous ways,
> without a prerequisite for a "natural language understanding superexpert
> system".
> ........

You might take a look at the _Chicago Guide to Preparing Electronic Manuscripts_, University of Chicago Press, 1987. It contains a generic page markup language.

Rob Moser --- ihnp4!killer!robm
chuq%plaid@Sun.COM (Chuq Von Rospach) (06/17/87)
From: David Boyes <dboyes@uoregon%tektronix.tek.com>
Date: 17 Jun 87 02:51:05 GMT
Organization: University of Oregon, Computer Science, Eugene OR

>> Instead of treating a document as a 1-dimensional stream of characters
>> ... I'd like to see some common general-purpose high-level
>> 'document-programming' language evolving, ...
>
> There is the ANSI Standard Generalized Markup Language (SGML), which allows
>you to describe a document in a way which relates to its content rather

You also might want to check out the University of Waterloo's SCRIPT/GML or IBM's DCF/GML implementations. Both are fairly good, if a bit artificial. The facility that they define could easily be ported to just about any text formatter -- I'm going to see what can be done about making a TeX-GML on that basis...someday when I have time...8-).

--
David Boyes                              ARPA: 556%OREGON1.BITNET@WISCVM.WISC.EDU
Systems Division                         BITNET: 556@OREGON1
University of Oregon Computing Center    UUCP: dboyes@uoregon.UUCP
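As a rough illustration of what porting a GML-style facility to another formatter could involve, the sketch below translates two tags, modelled loosely on the GML starter-set forms `:h1.` and `:p.`, into LaTeX-style output. It is not the TeX-GML mentioned above, which is only a stated intention; the function name `gml_to_tex`, the choice of `\section` as the target, and the restriction to two tags are all assumptions, and TeX special characters are not escaped.

```python
def gml_to_tex(source):
    """Translate a tiny GML-like subset to LaTeX-style text.

    Assumed subset (illustrative only):
      :h1.Title   becomes \\section{Title}
      :p.Text     starts a new paragraph (a blank line, then Text)
    Any other line is passed through unchanged.
    """
    out = []
    for line in source.splitlines():
        if line.startswith(":h1."):
            out.append("\\section{" + line[len(":h1."):].strip() + "}")
        elif line.startswith(":p."):
            out.append("")  # a blank line separates paragraphs in TeX
            out.append(line[len(":p."):].strip())
        else:
            out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    print(gml_to_tex(":h1.Introduction\n:p.Markup describes structure,\nnot appearance."))
```

Because the tags name what a piece of text is rather than how it should look, the same source could be fed to a different back end by swapping out only this translation step.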