[comp.ai] AI Project Information Request

neighorn@qiclab.UUCP (03/21/87)

At Portland Public Schools we are using a Writing Assessment guide to
examine certain writing assignments. Normally, writing experts are used
to evaluate the text. This is a slow and laborious process. The idea
of computerizing some or all of the assessment was brought up at a recent
meeting. A visiting Artificial Intelligence expert thought the assessment
presented many interesting problems, and suggested presenting it to a
wider audience.

Writer's Work Bench and similar programs are useful for checking sentence
structure, but what we are interested in is something that can examine a
paper for organization, presentation, word usage, and content.

The assessment is divided up into five areas. Each area has a possible
score of 1, 3, or 5. A perfect paper would receive a score of 25.

The five scored areas for Writing Assessment are: Ideas and Content,
Organization, Voice, Effective Word Choice, and Sentence Structure.
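
To make the arithmetic concrete, here is a minimal sketch of the scoring
scheme just described: five areas, each scored 1, 3, or 5, summed to a
total out of 25. The names AREAS, ALLOWED_SCORES, and total_score are
illustrative only and are not part of the assessment guide.

```python
# Illustrative sketch only: these names are hypothetical, not from the guide.
AREAS = (
    "Ideas and Content",
    "Organization",
    "Voice",
    "Effective Word Choice",
    "Sentence Structure",
)
ALLOWED_SCORES = (1, 3, 5)


def total_score(scores):
    """Sum the five area scores after checking that each is 1, 3, or 5."""
    missing = [area for area in AREAS if area not in scores]
    if missing:
        raise ValueError(f"missing areas: {missing}")
    for area in AREAS:
        if scores[area] not in ALLOWED_SCORES:
            raise ValueError(f"{area}: score must be 1, 3, or 5")
    return sum(scores[area] for area in AREAS)


perfect = {area: 5 for area in AREAS}
print(total_score(perfect))  # 25
```

The hard part of the problem is, of course, choosing each area's score, not
adding them up.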

An example of one of the areas is as follows:

                           Analytical Rating Guide

                             IDEAS AND CONTENT

5.  This paper is clear in purpose and conveys ideas in an interesting,
original manner that holds the reader's attention.  Clear, relevant examples,
anecdotes or details develop and enrich the central idea or ideas.

    o   The writer seems to be writing what he or she knows, often from
        experience.
    o   The writer shows insight--a good sense of the world, people, 
        situations.
    o   The writer selects supportive, relevant details that keep the main
        idea(s) in focus.
    o   Primary and secondary ideas are developed in proportion to their
        significance; the writing has a sense of balance.
    o   The writer seems in control of the topic and its development
        throughout.

3.  The writer's purpose is reasonably clear; however, the overall result
may not be especially captivating.  Support is less than adequate to fully
develop the main idea(s).

    o   The reader may not be convinced of the writer's knowledge of the
        topic.
    o   The writer seems to have considered ideas, but not thought things
        through all the way.
    o   Ideas, though reasonably clear and comprehensible, may tend toward the
        mundane; the reader is not sorry to see the paper end.
    o   Supporting details tend to be skimpy, general, predictable, or
        repetitive.  Some details seem included by chance, not selected
        through careful discrimination.
    o   Writing sometimes lacks balance: e.g., too much attention to minor
        details, insufficient development of main ideas, information gaps.
    o   The writer's control of the topic seems inconsistent or uncertain.

1.  This paper lacks a central idea or purpose--or the central idea can be
inferred by the reader only because he or she knows the topic (question asked).

    o   Information is very limited (e.g.,  restatement of the prompt, heavy
        reliance on repetition) or simply unclear altogether.
    o   Insight is limited or lacking (e.g., details that do not ring true;
        dependence on platitudes or stereotypes).
    o   Paper lacks balance; development of ideas is minimal, or there may be
        a list of random thoughts from which no central theme emerges.
    o   Writing tends to read like a rote response--merely an effort to get
        something down on paper.
    o   The writer does not seem in control of the topic; shorter papers tend
        to go nowhere, longer papers to wander aimlessly.

I would be very interested in hearing from anyone in netlandia who is working/
has worked/will be working on similar projects. Please follow up, send email,
or call via landline. Comments are more than welcome. Thank you for your
consideration.

-- 
Steven C. Neighorn                tektronix!{psu-cs,reed}!qiclab!neighorn
Portland Public Schools      "Where we train young Star Fighters to defend the
(503) 249-2000 ext 337           frontier against Xur and the Ko-dan Armada"
QUOTE OF THE DAY ->                'Dr. Ruth is no stranger to friction.'

dmc@videovax.UUCP (03/23/87)

Well, I'm probably overreacting to what will end up being
nothing more than a spelling checker, but I find the thought
of having creative writing graded by a computer program appalling.
It's particularly pernicious in the public school system,
where penalties for failure to conform to some computer
program's judgement of style and content are brought to bear.

The best and most universal writing is about the human condition.
What does a computer program (or indeed its artificially
intelligent author) know about that?  What would it do with...
James Joyce? William S. Burroughs? Anthony Burgess? Ogden Nash? 

What would happen to literary experiment?
Would there be an image processing version that graded Picasso?

It's bad enough that some smartass robot comes up to me at
trade shows peddling product, or some auto-dialer phones
me while I'm in the shower to sell carpet cleaner, but
these uppity machines I can be rude to and ignore.  The one
that's marking my school essays I cannot.

In law I have the right to be judged by a jury of my peers.
In school I demand that same right.  I will NOT be judged by
a machine.

Yours for a better tomorrow,

Don Craig
Whose opinions are his own.
-- 
Don Craig			dmc@videovax.Tek.COM
Tektronix Television Systems	... tektronix!videovax!dmc

kadie@uiucdcsb.UUCP (03/25/87)

Automatic checking and automatic grading are different things. I think
                                        <<* 3. WEAK: I think  *>>^
automatic computer checking is a good thing, especially for spelling
and simpler grammar. 

But there is no reason to grade automatically, just let the students
   ^<<* 23. SENTENCE BEGINS WITH BUT *>>
work on their papers (with the automatic checker) until they are satisfied. 
                        <<* 21. PASSIVE VOICE: are satisfied. *>>^
                             <<* 17. LONG SENTENCE:  24 WORDS *>>^
Then have them turn in their work and the final computer critique to a human
grader.

The situation is similar to programming, where the compiler 
automatically checks the syntax. It would be unthinkable to make people turn
in programs without letting them compile the programs first. On
the other hand it would unthinkable to leave a syntax error in
when the compiler tells you right were it is.
  

                        <<** SUMMARY **>>

     READABILITY INDEX: 10.42
 Readers need a  10th grade level of education to understand.
  
     STRENGTH INDEX:  0.41
 The writing can be made more direct by using:
               - the active voice
               - shorter sentences

     DESCRIPTIVE INDEX: 0.65
 The use of adjectives and adverbs is within the normal range.

     JARGON INDEX: 0.00

  SENTENCE STRUCTURE RECOMMENDATIONS:
       1. Most sentences contain multiple clauses.
          Try to use more simple sentences.

                    << UNCOMMON WORD LIST >>
The following words are not widely understood.
Will any of these words confuse the intended audience?
        CRITIQUE   1          SYNTAX   2     UNTHINKABLE   2 
                 << END OF UNCOMMON WORD LIST >>
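
The summary above does not say how the readability index is computed. As a
rough illustration of how a grade-level index of this kind can be derived
from word, sentence, and syllable counts, here is a sketch using the
well-known Flesch-Kincaid grade-level formula as a stand-in. It is an
assumption that the commercial checker does anything similar, and the
syllable counter is a crude heuristic, so the numbers will not match the
report.

```python
import re


def count_syllables(word):
    """Crude heuristic: one syllable per run of vowels, at least one per word."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def grade_level(text):
    """Flesch-Kincaid grade level (a stand-in, not the checker's own index)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("need at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


print(round(grade_level("The cat sat on the mat."), 2))
```

Longer sentences and longer words both push the index up, which is
consistent with the advice in the summary to use shorter sentences.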


Carl Kadie
University of Illinois at Urbana-Champaign
UUCP: {ihnp4,pur-ee,convex}!uiucdcs!kadie
CSNET: kadie@UIUC.CSNET
ARPA: kadie@M.CS.UIUC.EDU (kadie@UIUC.ARPA)

kadie@uiucdcsb.UUCP (03/30/87)

Several people have asked whether the grammar checker I used was real. It
is. It is a commercial product for the IBM PC. Here is some more
information and an example.


I own a spelling checker that I always use and a grammar and style checker
that I sometimes use. I have a lot of confidence in the spelling checker; I
take virtually all of its advice. The style checker is not as good. I always
consider its suggestions, but I know that it has missed many grammar
and style errors and that not everything it flags is really wrong.
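
To illustrate why flags of this kind can be both incomplete and overeager,
here is a small sketch of a purely pattern-based checker. It imitates three
of the messages seen in the earlier critique. The rules, the function name
check_style, and the 22-word threshold are all assumptions made for
illustration; nothing here is taken from the commercial product.

```python
import re

# Hypothetical threshold; the real checker's cutoff is not documented here.
LONG_SENTENCE_WORDS = 22


def check_style(text):
    """Return (sentence, message) pairs for a few surface patterns."""
    flags = []
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    for sentence in sentences:
        words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", sentence)
        if words and words[0].lower() == "but":
            flags.append((sentence, "SENTENCE BEGINS WITH BUT"))
        if re.search(r"\bI think\b", sentence):
            flags.append((sentence, "WEAK: I think"))
        if len(words) > LONG_SENTENCE_WORDS:
            flags.append((sentence, f"LONG SENTENCE: {len(words)} WORDS"))
    return flags


sample = "But there is no reason to grade automatically. I think checking is good."
for sentence, message in check_style(sample):
    print(message, "->", sentence)
```

Rules like these look only at surface form, so they flag a deliberate
"But" just as readily as a careless one, and they say nothing about errors
they have no pattern for.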

Enclosed find its critique of a draft report.
This gives a pretty good indication of how well the program works.

The program is RIGHTWRITER version 2.0, a Right Soft product by
Decisionware, Inc. of 2033 Wood Street, Suite 218, Sarasota, Florida 33577.
It runs on IBM PCs and compatible computers. It costs about $100.00.

Carl Kadie
University of Illinois at Urbana-Champaign
UUCP: {ihnp4,pur-ee,convex}!uiucdcs!kadie
CSNET: kadie@UIUC.CSNET
ARPA: kadie@M.CS.UIUC.EDU (kadie@UIUC.ARPA)
(I disclaim any ulterior relationship to Decisionware.)
-------------------------------
.+c "A Program To Compute Moore's Stable Expansions"
.pp
Moore has recently proposed a possible-world semantics for autoepistemic logic.
His method has the intriguing property of producing multiple expansions, that
                                    <<* 16. UNNECESSARY COMMA *>>^
is it list the (finite) theories of what you believe about the world, given
the axioms.
          ^<<* 17. LONG SENTENCE:  27 WORDS *>>
For example, if your unbelief in proposition $P$ implies $Q$, and your unbelief
in proposition $Q$ implies $P$, then we can theorize that either
$P$ is true or alternatively $Q$ true.
 <<* 17. LONG SENTENCE:  31 WORDS *>>^
         <<* 31. COMPLEX SENTENCE *>>^
.pp
In Lisp notation the axioms are expressed:
.(L
(and (imp (not (l 'p)) q) (imp (not (l 'q)) p))
.)L
and the conclusion is expressed:
.(L
(Q) (P)
.)L
.pp
I have written a program that finds the stable expansions of
formula in Moore's autoepistemic logic. As might be expected
                     <<* 21. PASSIVE VOICE: be expected  *>>^
the program run in time exponential to the number of variables.
              <<* 32. INCOMPLETE SENTENCE OR MISSING COMMA *>>^
Let's look at some runs:
.(L
A non-autoepistemic sentence:

   (expand
      '(and p (imp p (not q)) (imp (not q) r)) ;; axioms
      '(p q r)  ;; propositions
       0) ;; trace level

returns:
   ((P (NOT Q) R))
.)L
In other words, the axioms entail that $P$ is true, $Q$ is false, and $R$ is true.
This is of course just what we expect for this propositional sentence.
.pp
Here is a trace of the run of the example we saw before:
.(L
[Figure goes here. -- CMK]
.)L
.pp
The program also identifies cases where no stable expansion exists:
.(L
[Figure goes here. -- CMK]
.)L
.pp
At higher trace levels, the program provides counter-models
to non-grounded theories. For example:
.(L
(expand
  '(and (imp (not (l 'p1)) p2)
        (imp (not (l 'p2)) p3)
        (imp (not (l 'p3)) p4)
        (imp (not (l 'p4)) p1))
   '(p1 p2 p3 p4)
    2)

 ...

(P1 P2 P3 P4) in theory is stable w.r.t. the axioms.
S5 is ((P1 P2 P3 P4))
(s5:((P1 P2 P3 P4)) , V:((NOT P1) P2 (NOT P3) P4)) is a model of A
Counter-model: (s5:((P1 P2 P3 P4)), V:((NOT P1) P2 (NOT P3) P4))
Theory (P1 P2 P3 P4) is NOT a stable expansion of the axioms
 ...
((P2 P4) (P1 P3))
.)L
.pp
In fact it is just this test of groundness that makes Moore's logic
different from the logic of Shoham that we will see later.
                     <<* 17. LONG SENTENCE:  24 WORDS *>>^
For example when we give Shoham's gun
example to the program it replies that there are no stable
   <<* 1. REPLACE: that there  BY there  *>>^
expansions. This is because it does not have Shoham's
chronological ignorance criteria with which to choose ungrounded
theories. Here is the trace:
.(L
[Figure goes here. -- CMK]
.)L
.pp
Having no stable expansion and believing nothing are two separate case.
Here is a case where the only stable expansion is the theory
where nothing is believed.
                         ^<<* 21. PASSIVE VOICE: is believed. *>>
.(L
[Figure goes here. -- CMK]
.)L
.pp
The program works by enumerating every theory, then constructing
the corresponding S5 structure. Next, it tests every world
of the S5, if any world fails to support the axioms then
it is unstable and the theory is removed from consideration.
  <<* 21. PASSIVE VOICE: is removed  *>>^
                       <<* 17. LONG SENTENCE:  27 WORDS *>>^
Stable theories are next tested for groundness. This is done
           <<* 21. PASSIVE VOICE: are next tested  *>>
                         <<* 21. PASSIVE VOICE: is done  *>>^
by trying every variable assignment $V$. If an assignment
makes the axioms true then $V$ must correspond to a world
in the S5, or else the theory is not grounded. A theory
  <<* 21. PASSIVE VOICE: is not grounded. *>>^
         <<* 17. LONG SENTENCE:  33 WORDS *>>^
that is both stable and grounded is added to the stable
     <<* 21. PASSIVE VOICE: is added  *>>^
expansion list to be returned at the end of the program.
             <<* 21. PASSIVE VOICE: be returned  *>>
                   <<* 17. LONG SENTENCE:  24 WORDS *>>^
.pp
Overall, the program works very well on small problems (four
variable problems take only seconds on a SUN). The program
accepts any formula that Lisp can evaluate;
so very complex formula may be input. However, since
the program relies on enumeration, it can not be expanded
                  <<* 21. PASSIVE VOICE: be expanded  *>>^
to first-order logic, nor can it be considered practical
     <<* 21. PASSIVE VOICE: be considered  *>>^
unless the problems can be guaranteed to be small.
            <<* 21. PASSIVE VOICE: be guaranteed  *>>
             <<* 17. LONG SENTENCE:  31 WORDS *>>^
                     <<* 31. COMPLEX SENTENCE *>>^
  

                        <<** SUMMARY **>>

     READABILITY INDEX:  7.63
 Readers need an   8th grade level of education to understand.
  
     STRENGTH INDEX:  0.19
 The writing can be made more direct by using:
               - the active voice
               - shorter sentences
               - more common words
               - fewer abbreviations

     DESCRIPTIVE INDEX: 0.74
 The use of adjectives and adverbs is within the normal range.

     JARGON INDEX: 0.25

  SENTENCE STRUCTURE RECOMMENDATIONS:
      15. No Recommendations. 
 

                    << UNCOMMON WORD LIST >>
The following words are not widely understood.
Will any of these words confuse the intended audience?
   AUTOEPISTEMIC   3           AXIOM   2          AXIOMS  40 
   CHRONOLOGICAL   1        CRITERIA   1         DRIBBLE   1 
          ENTAIL   1     ENUMERATING   1     ENUMERATION   1 
     EXPONENTIAL   1          FINITE   1           FIRE4  20 
      GROUNDNESS   2             IMP  24      INTRIGUING   1 
            LISP   2           LOAD1  20           MOORE   1 
         MOORE'S   3             NIL   5          NOISE6  17 
               P   4     PROPOSITION   2   PROPOSITIONAL   1 
    PROPOSITIONS   2               Q   4               R   1 
       SEMANTICS   1          SHOHAM   1        SHOHAM'S   2 
        THEORIZE   1        UNBELIEF   2      UNGROUNDED   1 
               V   2         VACUUM5  17             WRT  19 
                 << END OF UNCOMMON WORD LIST >>
   
                 <<** WORD FREQUENCY LIST **>>
               A  31           ABOUT   1         ACCEPTS   1 
             ADD   1            ALSO   1   ALTERNATIVELY   1 
              AN   1             AND  18             ANY   2 
             ARE   4              AS   1      ASSIGNMENT   3 
              AT   3   AUTOEPISTEMIC   3           AXIOM   2 
          AXIOMS  40              BE   7         BECAUSE   1 
          BEFORE   1         BELIEVE   3            BOTH   1 
				 [Rest of word frequency list goes here -- CMK]
                 <<END OF WORD FREQUENCY LIST>>
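
The draft report quoted above describes its method in prose: enumerate
every theory, build the corresponding S5 structure, reject the theory if
some world of the S5 fails the axioms (not stable), and reject it if some
assignment satisfying the axioms is missing from the S5 (not grounded). The
following is a minimal sketch of that procedure, written from the prose
description alone. It is not the author's Lisp program. It assumes that a
theory marks each proposition as believed true, believed false, or
undecided, and that the belief operator is applied only to single
propositions. The names stable_expansions, imp, and example are
illustrative.

```python
from itertools import product


def stable_expansions(axioms, props):
    """Enumerate theories; keep those that are both stable and grounded.

    axioms(world, believed) -> bool, where world maps each proposition to
    a truth value and believed(p) plays the role of (l 'p) in the draft.
    """
    worlds = [dict(zip(props, vals))
              for vals in product((True, False), repeat=len(props))]
    expansions = []
    # Assumption: a theory marks each proposition True, False, or None.
    for marks in product((True, False, None), repeat=len(props)):
        theory = dict(zip(props, marks))
        # The S5 structure holds every world compatible with the theory.
        s5 = [w for w in worlds
              if all(m is None or w[p] == m for p, m in theory.items())]

        def believed(p, s5=s5):
            # p is believed when it is true in every world of the S5.
            return all(w[p] for w in s5)

        stable = all(axioms(w, believed) for w in s5)
        grounded = all(w in s5 for w in worlds if axioms(w, believed))
        if stable and grounded:
            expansions.append([p if m else f"(not {p})"
                               for p, m in theory.items() if m is not None])
    return expansions


def imp(a, b):
    """Material implication."""
    return (not a) or b


def example(w, l):
    # (and (imp (not (l 'p)) q) (imp (not (l 'q)) p))
    return imp(not l("p"), w["q"]) and imp(not l("q"), w["p"])


print(stable_expansions(example, ["p", "q"]))  # [['p'], ['q']]
```

The two results correspond to the (Q) (P) conclusion in the draft, listed
in a different order. With n propositions the loop visits 3^n theories and
2^n worlds for each, which matches the draft's remark that the approach is
exponential and practical only for small problems.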