[comp.text] poetry analysis, pattern recogn.

cap4r@boole.acc.virginia.edu (Chris Pohlig) (02/19/88)

I have a project that involves determining variations between different 
versions of a very long poem.  Unfortunately, simple file comparison 
programs are inappropriate since not all differences between the versions 
are important.  For example, many (but not all) spelling variations are 
insignificant.  Some versions of the poem have extra or missing lines. 
Some corresponding lines (between different versions) are of unequal  
length as well.  The real need (I think) is to be able to specify (in a 
separate "rule" file) which kinds of differences count as significant. 
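
As a rough illustration of the kind of rule-driven comparison described
above, here is a minimal sketch in Python. It is only one possible
approach, not an existing product: the rule list, the function names
(normalize, significant_differences) and the sample spellings are all
hypothetical. Each rule rewrites a spelling variant to a canonical form,
so lines that differ only by those variants compare as equal, while
extra, missing, or genuinely changed lines are still reported. The line
alignment uses the standard library's difflib.SequenceMatcher.

```python
import difflib
import re

# Hypothetical rules, standing in for a separate "rule" file: each pair is
# (pattern, canonical replacement). Spellings that normalize to the same
# text are treated as insignificant variations.
RULES = [
    (re.compile(r"\bcolour\b"), "color"),
    (re.compile(r"\bgrey\b"), "gray"),
]


def normalize(line, rules=RULES):
    """Lowercase, collapse whitespace, then apply each spelling rule."""
    text = " ".join(line.lower().split())
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


def significant_differences(version_a, version_b, rules=RULES):
    """Align two lists of lines and return only the non-matching spans.

    Each result is (tag, lines_from_a, lines_from_b), where tag is one of
    difflib's opcode tags: "replace", "delete", or "insert".
    """
    norm_a = [normalize(line, rules) for line in version_a]
    norm_b = [normalize(line, rules) for line in version_b]
    matcher = difflib.SequenceMatcher(None, norm_a, norm_b, autojunk=False)
    diffs = []
    for tag, a1, a2, b1, b2 in matcher.get_opcodes():
        if tag != "equal":
            # Report the original (un-normalized) text of each span.
            diffs.append((tag, version_a[a1:a2], version_b[b1:b2]))
    return diffs


if __name__ == "__main__":
    a = ["The colour of the sky", "was grey at dawn", "and all was still"]
    b = ["The color of the sky", "and all was still", "until the bell"]
    for tag, old, new in significant_differences(a, b):
        print(tag, old, new)
```

Because the comparison runs on normalized text but reports the original
lines, adding or removing a rule changes what counts as a difference
without touching the alignment code.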

Are there any relevant software products?  Are there any relevant
journals? Does anyone have any suggestions?

Please reply to:  cap4r@virginia.edu  (internet)
             or:  cap4r@virginia      (bitnet)

Many thanks.

farris@marlin.NOSC.MIL (Russell H. Farris) (02/20/88)

In article <419@boole.acc.virginia.edu>cap4r@boole.acc.virginia.edu
(Chris Pohlig) writes:
>I have a project that involves determining variations between different
>versions of a very long poem.  Unfortunately, simple file comparison 
>programs are inappropriate since not all differences between the versions 
>are important. . . .

Look into using SNOBOL (on a PC) or SPITBOL.  A PD version called
Vanilla SNOBOL4 is available--with many sample programs--from
Simtel20.  The full-blown version, called SNOBOL4+, is available
for $95 from Catspaw, Inc., P.O.  Box 1123, Salida, Colorado
81201, (305) 539-3884.

                   Russ (just a happy customer) Farris

poser@csli.STANFORD.EDU (Bill Poser) (02/20/88)

	Instead of using SNOBOL or SPITBOL I suggest using ICON.
ICON is a descendant of SNOBOL that combines the string manipulation and
pattern matching facilities of SNOBOL with modern control structures.
ICON is good for the same sorts of tasks that SNOBOL is good for, but is
a much nicer language. It was developed by a group headed by Ralph Griswold,
one of the designers of SNOBOL, at the University of Arizona.
ICON runs on lots of machines (I have run it on a VAX 11/750 running 4.2BSD
UNIX and on an IBM PC running DOS 2.1 and DOS 3.3) and is available for
the cost of the distribution media from the ICON Project in the
Computer Science Department at the University of Arizona (Tucson, AZ 85721).
ICON is described in a book by Ralph Griswold and Madge Griswold entitled
_The ICON Programming Language_. There is also a book on the implementation
of ICON called _The Implementation of the ICON Programming Language_ by
the same authors, which should be of interest to anyone concerned
with the implementation of systems with extensive run-time support
requirements.