neighorn@qiclab.UUCP (03/21/87)
At Portland Public Schools we are using a Writing Assessment guide to examine certain writing assignments. Normally, writing experts are used to evaluate the text. This is a slow and laborious process. The idea of computerizing some or all of the assessment was brought up at a recent meeting. A visiting Artificial Intelligence expert thought the assessment presented many interesting problems, and suggested presenting it to a wider audience.

Writer's Work Bench and similar programs are useful for checking sentence structure, but what we are interested in is something that can examine a paper for organization, presentation, word usage, and content.

The assessment is divided up into five areas. Each area has a possible score of 1, 3, or 5. A perfect paper would receive a score of 25. The five scored areas for Writing Assessment are: Ideas and Content, Organization, Voice, Effective Word Choice, and Sentence Structure. An example of one of the areas is as follows:

Analytical Rating Guide

IDEAS AND CONTENT

5. This paper is clear in purpose and conveys ideas in an interesting, original manner that holds the reader's attention. Clear, relevant examples, anecdotes or details develop and enrich the central idea or ideas.
   o The writer seems to be writing what he or she knows, often from experience.
   o The writer shows insight--a good sense of the world, people, situations.
   o The writer selects supportive, relevant details that keep the main idea(s) in focus.
   o Primary and secondary ideas are developed in proportion to their significance; the writing has a sense of balance.
   o The writer seems in control of the topic and its development throughout.

3. The writer's purpose is reasonably clear; however, the overall result may not be especially captivating. Support is less than adequate to fully develop the main idea(s).
   o The reader may not be convinced of the writer's knowledge of the topic.
   o The writer seems to have considered ideas, but not thought things through all the way.
   o Ideas, though reasonably clear and comprehensible, may tend toward the mundane; the reader is not sorry to see the paper end.
   o Supporting details tend to be skimpy, general, predictable, or repetitive. Some details seem included by chance, not selected through careful discrimination.
   o Writing sometimes lacks balance: e.g., too much attention to minor details, insufficient development of main ideas, information gaps.
   o The writer's control of the topic seems inconsistent or uncertain.

1. This paper lacks a central idea or purpose--or the central idea can be inferred by the reader only because he or she knows the topic (question asked).
   o Information is very limited (e.g., restatement of the prompt, heavy reliance on repetition) or simply unclear altogether.
   o Insight is limited or lacking (e.g., details that do not ring true; dependence on platitudes or stereotypes).
   o Paper lacks balance; development of ideas is minimal, or there may be a list of random thoughts from which no central theme emerges.
   o Writing tends to read like a rote response--merely an effort to get something down on paper.
   o The writer does not seem in control of the topic; shorter papers tend to go nowhere, longer papers to wander aimlessly.

I would be very interested in hearing from anyone in netlandia who is working/has worked/will be working on similar projects. Please follow up, send email, or call via landline. Comments are more than welcome. Thank you for your consideration.

--
Steven C. Neighorn           tektronix!{psu-cs,reed}!qiclab!neighorn
Portland Public Schools      "Where we train young Star Fighters to defend the
(503) 249-2000 ext 337        frontier against Xur and the Ko-dan Armada"
QUOTE OF THE DAY -> 'Dr. Ruth is no stranger to friction.'
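
The scoring arithmetic in the post above (five areas, each rated 1, 3, or 5, for a maximum of 25) can be written down as a small Python sketch. The names AREAS, ALLOWED_SCORES, and total_score are hypothetical and chosen only for illustration; nothing here attempts the hard part, which is deciding the rating for each area.

```python
# Sketch of the rubric arithmetic only; assigning the ratings is the open problem.
AREAS = (
    "Ideas and Content",
    "Organization",
    "Voice",
    "Effective Word Choice",
    "Sentence Structure",
)
ALLOWED_SCORES = (1, 3, 5)  # the post allows only these three ratings per area


def total_score(ratings):
    """Sum one rating per area, rejecting missing areas and invalid ratings."""
    missing = [area for area in AREAS if area not in ratings]
    if missing:
        raise ValueError("no rating for: " + ", ".join(missing))
    for area in AREAS:
        if ratings[area] not in ALLOWED_SCORES:
            raise ValueError("rating for %s must be 1, 3, or 5" % area)
    return sum(ratings[area] for area in AREAS)


perfect_paper = {area: 5 for area in AREAS}
print(total_score(perfect_paper))  # 25, the perfect score named in the post
```

A paper rated 1 in every area would total 5, so under this scheme totals always fall between 5 and 25.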
dmc@videovax.UUCP (03/23/87)
Well, I'm probably overreacting to what will end up being nothing more than a spelling checker, but I find the thought of having creative writing graded by a computer program appalling. It's particularly pernicious in the public school system, where penalties for failure to conform to some computer program's judgement of style and content are brought to bear.

The best and most universal writing is about the human condition. What does a computer program (or indeed its artificially intelligent author) know about that? What would it do with...

James Joyce? William S. Burroughs? Anthony Burgess? Ogden Nash?

What would happen to literary experiment? Would there be an image processing version that graded Picasso?

It's bad enough that some smartass robot comes up to me at trade shows peddling product, or some auto-dialer phones me while I'm in the shower to sell carpet cleaner, but these uppity machines I can be rude to and ignore. The one that's marking my school essays I cannot.

In law I have the right to be judged by a jury of my peers. In school I demand that same right. I will NOT be judged by a machine.

Yours for a better tomorrow,
Don Craig
Whose opinions are his own.
--
Don Craig                       dmc@videovax.Tek.COM
Tektronix Television Systems    ... tektronix!videovax!dmc
kadie@uiucdcsb.UUCP (03/25/87)
Automatic checking and automatic grading are different things. I think
<<* 3. WEAK: I think *>>^
automatic computer checking is a good thing, especially for spelling and simpler grammar. But there is no reason to grade automatically, just let the students
^<<* 23. SENTENCE BEGINS WITH BUT *>>
work on their papers (with the automatic checker) until they are satisfied.
<<* 21. PASSIVE VOICE: are satisfied. *>>^
<<* 17. LONG SENTENCE: 24 WORDS *>>^
Then have them turn in their work and the final computer critique to a human grader.

The situation is similar to programming, where the compiler automatically checks the syntax. It would be unthinkable to make people turn in programs without letting them compile the programs first. On the other hand it would be unthinkable to leave a syntax error in when the compiler tells you right where it is.

<<** SUMMARY **>>
READABILITY INDEX: 10.42
Readers need a 10th grade level of education to understand.
STRENGTH INDEX: 0.41
The writing can be made more direct by using:
 - the active voice
 - shorter sentences
DESCRIPTIVE INDEX: 0.65
The use of adjectives and adverbs is within the normal range.
JARGON INDEX: 0.00
SENTENCE STRUCTURE RECOMMENDATIONS:
 1. Most sentences contain multiple clauses. Try to use more simple sentences.

<< UNCOMMON WORD LIST >>
The following words are not widely understood. Will any of these words confuse the intended audience?
CRITIQUE 1   SYNTAX 2   UNTHINKABLE 2
<< END OF UNCOMMON WORD LIST >>

Carl Kadie
University of Illinois at Urbana-Champaign
UUCP: {ihnp4,pur-ee,convex}!uiucdcs!kadie
CSNET: kadie@UIUC.CSNET
ARPA: kadie@M.CS.UIUC.EDU (kadie@UIUC.ARPA)
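
For readers wondering how surface checks like the ones quoted above might work, here is a minimal Python sketch of three of them: the weak phrase, the sentence beginning with "But", and the long sentence. It is not the checker's actual method, which the post does not describe. The function names, the one-entry weak phrase list, and the 24-word threshold are all assumptions; the threshold is only a guess based on the shortest sentence the critique flagged as long. Passive voice detection is left out because it needs more than pattern matching on words.

```python
import re

# Assumption: only the phrase flagged in the post; a real checker has a longer list.
WEAK_PHRASES = ("I think",)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def flag_sentence(sentence, long_threshold=24):
    """Return the flags for one sentence. long_threshold is a guess, not a known value."""
    flags = []
    words = re.findall(r"[A-Za-z0-9']+", sentence)
    for phrase in WEAK_PHRASES:
        if re.search(r"\b" + re.escape(phrase) + r"\b", sentence, re.IGNORECASE):
            flags.append("WEAK: " + phrase)
    if words and words[0].lower() == "but":
        flags.append("SENTENCE BEGINS WITH BUT")
    if len(words) >= long_threshold:
        flags.append("LONG SENTENCE: %d WORDS" % len(words))
    return flags


def critique(text, long_threshold=24):
    """Pair every sentence in text with its list of flags."""
    return [(s, flag_sentence(s, long_threshold)) for s in split_sentences(text)]


sample = ("But there is no reason to grade automatically, just let the students "
          "work on their papers (with the automatic checker) until they are satisfied.")
for sentence, flags in critique(sample):
    print(flags)
```

Checks of this kind look only at word counts and word patterns, which is why they say nothing about ideas, organization, or voice.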
kadie@uiucdcsb.UUCP (03/30/87)
Several people have asked if the grammar checker I used was real. It is. It is a commercial product for the IBM PC. Here is some more information and an example.

I own a spelling checker that I always use, and a grammar and style checker that I sometimes use. I have a lot of confidence in the spelling checker; I take virtually all of its advice. The style checker is not as good. I always consider its suggestions, but I know that it has missed many grammar and style errors and that not everything it flags is really wrong.

Enclosed find its critique of a draft report. This gives a pretty good indication of how well the program works.

The program is RIGHTWRITER version 2.0, a Right Soft product by Decisionware, Inc. of 2033 Wood Street, Suite 218, Sarasota, Florida 33577. It runs on IBM PCs and compatible computers. It costs about $100.00.

Carl Kadie
University of Illinois at Urbana-Champaign
UUCP: {ihnp4,pur-ee,convex}!uiucdcs!kadie
CSNET: kadie@UIUC.CSNET
ARPA: kadie@M.CS.UIUC.EDU (kadie@UIUC.ARPA)

(I disclaim any ulterior relationship to Decisionware.)

-------------------------------
.+c "A Program To Compute Moore's Stable Expansions"
.pp
Moore has recently proposed a possible-world semantics for autoepistemic logic. His method has the intriguing property of producing multiple expansions, that
<<* 16. UNNECESSARY COMMA *>>^
is it list the (finite) theories of what you believe about the world, given the axioms.
^<<* 17. LONG SENTENCE: 27 WORDS *>>
For example, if your unbelief in proposition $P$ implies $Q$, and your unbelief in proposition $Q$ implies $P$, then we can theorize that either $P$ is true or alternatively $Q$ true.
<<* 17. LONG SENTENCE: 31 WORDS *>>^
<<* 31. COMPLEX SENTENCE *>>^
.pp
In Lisp notation the axioms are expressed:
.(L
(and (imp (not (l 'p)) q)
     (imp (not (l 'q)) p))
.)L
and the conclusion is expressed:
.(L
(Q) (P)
.)L
.pp
I have written a program that finds the stable expansions of formula in Moore's autoepistemic logic.
As might be expected
<<* 21. PASSIVE VOICE: be expected *>>^
the program run in time exponential to the number of variables.
<<* 32. INCOMPLETE SENTENCE OR MISSING COMMA *>>^
Let's look at some runs:
.(L
A non-autoepistemic sentence:
(expand '(and p (imp p (not q)) (imp (not q) r)) ;; axioms
        '(p q r)                                 ;; propositions
        0)                                       ;; trace level
returns:
((P (NOT Q) R))
.)L
In other words, the axioms entail that $P$ is true, $Q$ is false, and $R$ is true. This is of course just what we expect for this propositional sentence.
.pp
Here is a trace of the run of the example we saw before:
.(L
[Figure goes here. -- CMK]
.)L
.pp
The program also identifies cases where no stable expansion exists:
.(L
[Figure goes here. -- CMK]
.)L
.pp
At higher trace levels, the program provides counter-models to non-grounded theories. For example:
.(L
(expand '(and (imp (not (l 'p1)) p2)
              (imp (not (l 'p2)) p3)
              (imp (not (l 'p3)) p4)
              (imp (not (l 'p4)) p1))
        '(p1 p2 p3 p4)
        2)
...
(P1 P2 P3 P4) in theory is stable w.r.t. the axioms.
S5 is ((P1 P2 P3 P4))
(s5:((P1 P2 P3 P4)) , V:((NOT P1) P2 (NOT P3) P4)) is a model of A
Counter-model: (s5:((P1 P2 P3 P4)), V:((NOT P1) P2 (NOT P3) P4))
Theory (P1 P2 P3 P4) is NOT a stable expansion of the axioms
...
((P2 P4) (P1 P3))
.)L
.pp
In fact it is just this test of groundness that makes Moore's logic different from the logic of Shoham that we will see later.
<<* 17. LONG SENTENCE: 24 WORDS *>>^
For example when we give Shoham's gun example to the program it replies that there are no stable
<<* 1. REPLACE: that there BY there *>>^
expansions. This is because it does not have Shoham's chronological ignorance criteria with which to choose ungrounded theories. Here is the trace:
.(L
[Figure goes here. -- CMK]
.)L
.pp
Having no stable expansion and believing nothing are two separate case. Here is a case where the only stable expansion is the theory where nothing is believed.
^<<* 21. PASSIVE VOICE: is believed. *>>
.(L
[Figure goes here. -- CMK]
.)L
.pp
The program works by enumerating every theory, then constructing the corresponding S5 structure. Next, it tests every world of the S5, if any world fails to support the axioms then it is unstable and the theory is removed from consideration.
<<* 21. PASSIVE VOICE: is removed *>>^
<<* 17. LONG SENTENCE: 27 WORDS *>>^
Stable theories are next tested for groundness. This is done
<<* 21. PASSIVE VOICE: are next tested *>>
<<* 21. PASSIVE VOICE: is done *>>^
by trying every variable assignment $V$. If an assignment makes the axioms true then $V$ must correspond to a world in the S5, or else the theory is not grounded. A theory
<<* 21. PASSIVE VOICE: is not grounded. *>>^
<<* 17. LONG SENTENCE: 33 WORDS *>>^
that is both stable and grounded is added to the stable
<<* 21. PASSIVE VOICE: is added *>>^
expansion list to be returned at the end of the program.
<<* 21. PASSIVE VOICE: be returned *>>
<<* 17. LONG SENTENCE: 24 WORDS *>>^
.pp
Overall, the program works very well on small problems (four variable problems take only seconds on a SUN). The program accepts any formula that Lisp can evaluate; so very complex formula may be input. However, since the program relies on enumeration, it can not be expanded
<<* 21. PASSIVE VOICE: be expanded *>>^
to first-order logic, nor can it be considered practical
<<* 21. PASSIVE VOICE: be considered *>>^
unless the problems can be guaranteed to be small.
<<* 21. PASSIVE VOICE: be guaranteed *>>
<<* 17. LONG SENTENCE: 31 WORDS *>>^
<<* 31. COMPLEX SENTENCE *>>^

<<** SUMMARY **>>
READABILITY INDEX: 7.63
Readers need an 8th grade level of education to understand.
STRENGTH INDEX: 0.19
The writing can be made more direct by using:
 - the active voice
 - shorter sentences
 - more common words
 - fewer abbreviations
DESCRIPTIVE INDEX: 0.74
The use of adjectives and adverbs is within the normal range.
JARGON INDEX: 0.25
SENTENCE STRUCTURE RECOMMENDATIONS:
 15. No Recommendations.

<< UNCOMMON WORD LIST >>
The following words are not widely understood. Will any of these words confuse the intended audience?
AUTOEPISTEMIC 3   AXIOM 2           AXIOMS 40        CHRONOLOGICAL 1
CRITERIA 1        DRIBBLE 1         ENTAIL 1         ENUMERATING 1
ENUMERATION 1     EXPONENTIAL 1     FINITE 1         FIRE4 20
GROUNDNESS 2      IMP 24            INTRIGUING 1     LISP 2
LOAD1 20          MOORE 1           MOORE'S 3        NIL 5
NOISE6 17         P 4               PROPOSITION 2    PROPOSITIONAL 1
PROPOSITIONS 2    Q 4               R 1              SEMANTICS 1
SHOHAM 1          SHOHAM'S 2        THEORIZE 1       UNBELIEF 2
UNGROUNDED 1      V 2               VACUUM5 17       WRT 19
<< END OF UNCOMMON WORD LIST >>

<<** WORD FREQUENCY LIST **>>
A 31              ABOUT 1           ACCEPTS 1        ADD 1
ALSO 1            ALTERNATIVELY 1   AN 1             AND 18
ANY 2             ARE 4             AS 1             ASSIGNMENT 3
AT 3              AUTOEPISTEMIC 3   AXIOM 2          AXIOMS 40
BE 7              BECAUSE 1         BEFORE 1         BELIEVE 3
BOTH 1
[Rest of word frequency list goes here -- CMK]
<<END OF WORD FREQUENCY LIST>>
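
The draft report above describes its procedure in words: build an S5 structure, check that every world in it supports the axioms (stability), then check that every variable assignment satisfying the axioms is already a world in the S5 (groundedness). The following Python sketch is one way to write those two tests; it is not the author's Lisp program. It enumerates non-empty sets of worlds directly rather than "theories", which is an assumption made here for simplicity and is much slower on more than a few variables. The names stable_expansions, believed_literals, imp, and the believes callback are hypothetical.

```python
from itertools import combinations, product


def imp(a, b):
    """Material implication, standing in for the draft's (imp a b)."""
    return (not a) or b


def stable_expansions(axioms, atoms):
    """Return the world sets that pass both the stability and groundedness tests.

    axioms(world, believes) -> bool, where world maps atom -> truth value and
    believes(f) is True when f holds in every world of the candidate S5.
    Assumption: only non-empty world sets (consistent theories) are tried.
    """
    worlds = [dict(zip(atoms, values))
              for values in product([True, False], repeat=len(atoms))]
    indices = range(len(worlds))
    found = []
    for size in range(1, len(worlds) + 1):
        for chosen in combinations(indices, size):
            s5 = [worlds[i] for i in chosen]

            def believes(formula, s5=s5):
                return all(formula(w) for w in s5)

            models = {i for i in indices if axioms(worlds[i], believes)}
            stable = set(chosen) <= models    # every world supports the axioms
            grounded = models <= set(chosen)  # every satisfying assignment is a world
            if stable and grounded:
                found.append(s5)
    return found


def believed_literals(s5, atoms):
    """List the atoms (or negated atoms) that hold in every world of s5."""
    literals = []
    for atom in atoms:
        if all(w[atom] for w in s5):
            literals.append(atom)
        elif not any(w[atom] for w in s5):
            literals.append("not " + atom)
    return literals


def mutual(world, believes):
    """The draft's first example: unbelief in P implies Q, and vice versa."""
    not_l_p = not believes(lambda w: w["p"])
    not_l_q = not believes(lambda w: w["q"])
    return imp(not_l_p, world["q"]) and imp(not_l_q, world["p"])


for s5 in stable_expansions(mutual, ["p", "q"]):
    print(believed_literals(s5, ["p", "q"]))
```

On the two-variable example this yields one expansion in which only p is believed and one in which only q is believed, in line with the conclusion "(Q) (P)" quoted in the draft. With n atoms the sketch tries every non-empty subset of the 2^n assignments, so it is practical only for two or three variables.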