mccarty@nash.rutgers.edu (Thorne McCarty) (07/13/90)
Was your paper rejected by AAAI this year? Were the reviews, in your opinion, incompetent and irresponsible? I was so outraged by one review of our paper that I sent a long message of protest to Tom Dietterich, the Program Co-Chair. Dietterich apologized for the poor quality review, but he insisted that this was an aberration. Most reviews are of high quality, he claims.

I do not think this review was an aberration. From the anecdotal evidence, there are many AAAI reviews each year that are thoroughly unprofessional. Worse, according to the anecdotal evidence, the selection process itself is biased. It is difficult to prove these statements, of course, since most rejected authors simply grumble to their friends and then drop the matter.

I would suggest the following: If you have a serious complaint about the quality of a AAAI review, send a critique to the Program Committee. (This year's Co-Chairs are Tom Dietterich: tgd@turing.cs.orst.edu, and Bill Swartout: swartout@vaxa.isi.edu.) Better yet, publicize your discontent. If enough people protest, perhaps AAAI will improve its act.

For the record, here is my response to the reviews. To set the context, I have included an abstract of our paper. I would be happy to send a copy of the full paper to anyone who is interested in reading it.

L. Thorne McCarty
Computer Science Department
Rutgers University
_____________________________________________________________________________

\documentstyle[12pt]{article}
\setlength{\parskip}{\medskipamount}
\newcommand{\Px}{\mbox{$P({\bf x})$}}
\newcommand{\Gix}{\mbox{$G_{i}({\bf x})$}}
\newcommand{\Djx}{\mbox{$D_{j}({\bf x})$}}
\let\If=\Leftarrow
\let\And=\wedge
\let\Or=\vee
\let\Not=\neg
\mathchardef\Fail="0218
\begin{document}

\begin{center}
\Large\bf The Case for Explicit Exceptions
\end{center}

\begin{center}
\large L. Thorne McCarty\\
William W. Cohen
\end{center}

\begin{quotation}
{\bf Abstract:} Most of the work on inheritance hierarchies in recent years has had as its goal the design of general purpose algorithms that depend only on the topology of the inheritance network. This research has produced some important observations about the various strategies used in human common sense reasoning, but it has also produced a proliferation of incompatible systems. In this paper, we resurrect the alternative technique, originally proposed by Etherington and Reiter, of explicitly encoding exceptions to default rules. The main technical innovation is the use of a different logical framework: a logic programming language based on {\em intuitionistic logic.} Using a combination of full intuitionistic negation plus negation-as-failure to encode default rules, we obtain analogues of the normal, seminormal and nonnormal defaults of Reiter's default logic. The advantage of our approach is that, whereas there is no adequate proof theory in classical logic for seminormal defaults, the analogous queries to an intuitionistic default rulebase can be answered by a simple top-down goal-directed interpreter. The claim that a default rulebase with explicit exceptions is easy to write and debug has been substantiated by encoding more than 40 standard examples from the literature.
\end{quotation}
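[{\em Editorial illustration, not part of the paper:\/} the default rules discussed below take the form
\begin{displaymath}
\Px \;\If\; \bigwedge_{i}\Gix \;\And\; \bigwedge_{j}\Fail\Not\Djx
\end{displaymath}
which reads: conclude \Px\ from the conditions \Gix, provided that none of the $\Not\Djx$ is provable. Taking $P$ and $D$ to be {\it Flies\/} and $G$ to be {\it Bird\/} gives a plausible analogue of Reiter's normal default ``birds fly, unless provably otherwise'':
\begin{displaymath}
{\it Flies}({\bf x}) \;\If\; {\it Bird}({\bf x}) \;\And\; \Fail\Not{\it Flies}({\bf x})
\end{displaymath}
Additional $\Fail\Not\Djx$ conjuncts name explicit exceptions, which is how the seminormal case arises.]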
\subsection*{First Review (Handwritten):}

This is not a serious review, but a diatribe against logic programming. The reviewer simply refuses to evaluate the paper because the work was done in PROLOG! Some sample comments:

\begin{quotation}
This paper belongs in a conference on logic programming, rather than AI.
\end{quotation}

\noindent The paper certainly belongs in a conference on AI. It addresses problems that have been dominant in AI for many years. If the logic programming community can offer solutions to these problems, then the (American!) AI community ought to pay attention. This sort of insularity has seriously weakened the field in the past.

\begin{quotation}
The author labors under the misconception that PROLOG is a good language for communicating to other humans.
\end{quotation}

\noindent One of the main points of the paper is that our proof procedure for default rules can be encoded in a very simple PROLOG interpreter. To make this point, naturally, we describe the interpreter. It is very simple to understand. At Rutgers, we teach our first-year graduate students both LISP and PROLOG, and they should all be able to follow this discussion. If the reviewer is unable to do so, then he (or she) is not qualified to review the paper.
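[{\em Editorial sketch:\/} for readers without the paper at hand, the following conveys the flavor of such an interpreter. It is {\em not\/} the McCarty--Cohen code: the predicates {\tt rule/2}, {\tt naf/1}, and {\tt neg/1} are invented for illustration, and the intuitionistic treatment of negation and embedded implications (the paper's actual contribution) is omitted, leaving only ordinary negation-as-failure.
\begin{verbatim}
% A vanilla meta-interpreter over rules stored as rule(Head, Body).
prove(true).
prove((A, B)) :- prove(A), prove(B).        % conjunctive goals
prove(naf(G)) :- \+ prove(G).               % negation-as-failure: G must fail
prove(G)      :- rule(G, Body), prove(Body).

% A hypothetical normal default in the schema's shape:
% "birds fly unless provably flightless".
rule(flies(X), (bird(X), naf(neg(flies(X))))).
rule(bird(tweety), true).

% ?- prove(flies(tweety)).   succeeds, since neg(flies(tweety)) fails.
\end{verbatim}
The last clause is the top-down, goal-directed control regime: a goal is proved by selecting a rule whose head matches it and proving the body.]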
Another claim of the paper is that it is very simple to encode the standard benchmark problems in the literature in our system, and run them through our interpreter. The reviewer addresses this claim as follows:

\begin{quotation}
The paper does substantiate his [sic] claims with benchmark examples, but unfortunately, most of these are contained in an appendix that exceeded the legal page limit.
\end{quotation}

\noindent This statement is false. Our paper conforms exactly to the page limits listed in the Call for Papers for {\it AAAI-90\/}: eleven pages of text including appendices, but not including the bibliography. This last comment is a transparent device to avoid addressing the substantive claims of the paper. When the reviewer gets the facts wrong, however, he (or she) reveals the full extent of his (or her) bias. It is obvious that the reviewer decided to reject the paper from the start, without ever reading it carefully, and is simply grasping at straws to justify this decision. This person should never be allowed to review a {\it AAAI\/} paper again.

\subsection*{Second Review (Typed):}

On the surface, this is a responsible review. The reviewer appreciates the main claims of the paper:

\begin{quotation}
The most appealing part of this work is the exceedingly simple proof procedure which implements the intuitionistic theorem proving. The authors are justifiably proud of this aspect of their approach! \ldots The proof procedure is especially elegant for interacting defaults.
\end{quotation}

\noindent From these comments alone, I would have anticipated an acceptance rather than a rejection. The only complaint about the paper is the alleged ``obscurity of its writing style.'' It does appear that the reviewer had some difficulty following portions of the paper. But let us see if we can identify some of the reasons for this difficulty. Here is one specific point that the reviewer claims is unclear:

\begin{quotation}
Also, (4) and (10) can't possibly be equivalent under the classical (= Boolean) interpretation of $\If$ and $\Fail$, although perhaps under classical (= negation-as-failure) of $\Fail$. This needs to be clarified.
\end{quotation}

\noindent But is it true that (4) and (10) cannot possibly be equivalent under the classical interpretation of $\If$ and $\Fail$? Note the following:
\begin{displaymath}
\Px \;\If\; \bigwedge_{i}\Gix \;\And\; \bigwedge_{j}\Fail\Not\Djx
\end{displaymath}
\begin{displaymath}
\left[\Px \;\If\; \bigwedge_{i}\Gix\right] \;\If\; \bigwedge_{j}\Fail\Not\Djx
\end{displaymath}
\begin{displaymath}
\left[\Px \;\If\; \bigwedge_{i}\Gix\right] \;\Or\; \Fail\bigwedge_{j}\Fail\Not\Djx
\end{displaymath}
\begin{displaymath}
\left[\Px \;\If\; \bigwedge_{i}\Gix\right] \;\Or\; \bigvee_{j}\Fail\Fail\Not\Djx
\end{displaymath}
\begin{displaymath}
\left[\Px \;\If\; \bigwedge_{i}\Gix\right] \;\Or\; \bigvee_{j}\Not\Djx
\end{displaymath}
Each of these steps is an equivalence in classical logic! (Reading $\Fail$ as classical negation, the steps are, in order: exportation of a conjunct from the antecedent, rewriting the outer implication as a disjunction, De Morgan's law, and double-negation elimination.) So what needs to be clarified? We could have included this derivation in our paper, of course, but we assumed that this equivalence could be verified by any first-year graduate student in AI!

The reviewer also complains about the citation of various results in intuitionistic logic, as needed in the paper. There is nothing particularly obscure about intuitionistic logic. It has been thoroughly studied since the 1930's, and we have included references to some of the main sources [12, 21]. We have also included references to the major papers on intuitionistic logic programming, by myself and others [2, 14, 15, 23, 24, 26]. All the stated results can be found in these sources. There is a comment by the reviewer that the main facts about intuitionistic logic should be collected in a single place, ``up front.'' I think this is a reasonable editorial suggestion, although I think an appendix would be a more appropriate location for such material. However, given the problem with (4) and (10) above, should we also include an appendix collecting the main facts about {\it classical\/} logic?

The other specific examples of ``obscure writing'' are difficult to track down. I have searched for ``inconsistent uses of the symbol script R,'' and I cannot find them. The following statement is itself very obscure:

\begin{quotation}
\ldots it is not at all clear without repeated scrutiny that the authors are primarily {\it reinterpreting\/} the workings of the proof procedure. At first it really seems like they're really recoding the default rules.
\end{quotation}

\noindent In a sense, the point of giving a model-theoretic semantics is precisely to ``reinterpret the workings of the proof procedure,'' so it is not clear why the reviewer has a problem here. And what does he (or she) mean by the phrase ``recoding the default rules''? This needs to be clarified!

In general, it is very difficult to evaluate a claim that a paper is obscurely written. What is obscure to one reader may be a model of clarity to another. I have shown this paper to several people, and they have all found it comprehensible. I sent the paper to one person (a specialist in natural language processing) prior to receiving the {\it AAAI\/} reviews, and she volunteered the following comment: ``One thing that I would like to add, is that I think you write very clearly, so that even novices in nonmon like me have a chance.''

\subsection*{Overall Assessment:}

When the Program Committee Chair (or the Topic/Subtopic Chair) received these reviews, what should he (or she) have done? The first review should have been discarded. It should have been clear from a superficial examination of the paper that these were irresponsible comments. That would have left only one review, which is positive on the substance of the paper and negative on the writing style.
At this point, it should have been mandatory that the Committee Chair solicit an additional review. Moreover, given the very subjective nature of an opinion about writing style, two additional reviews should have been required in this case. The Committee did not take these steps. I stand by my previous statement that the quality of reviewing at {\it AAAI\/} is very poor.

\end{document}
byland@iris.cis.ohio-state.edu (Tom Bylander) (07/16/90)
>Was your paper rejected by AAAI this year? Were the reviews, in your
>opinion, incompetent and irresponsible?

Is there an AI conference with competent reviewers? Anecdotally, I have found AAAI to be much more of a closed conference than other AI conferences. My impression is that the AAAI program committee wields more power than those of other conferences and is very partial to "mainstream" research. (What counts as "mainstream" depends on the powers-that-be in each individual subarea.) In contrast, I have not found IJCAI reviewers (a category to which I belong) to be any more or less incompetent than AAAI reviewers. However, my impression is that IJCAI reviewers are more methodologically mixed than AAAI reviewers, giving non-mainstream work a better chance of being accepted.

In general, many reviewers (including myself) have a narrow concept of what an AI result should be. However, when I review a paper, I try to judge the paper within the framework of its methodology, rather than my own. I accept the paper if I think it is reasonably rigorous and will be interesting to the methodological subgroup.

Oh, to answer your questions---yes, my paper was rejected, and IMHO one of the two reviewers was incompetent and irresponsible. Apparently, I will have to use a lot more precious space to explain why my problem is interesting and to explain my methodology. I had hoped to do much of this work by referring to previous papers (in arcane proceedings like IJCAI and AAAI) that already explain these issues in detail. Unfortunately, the reviewer had not read them and consequently misunderstood what I was doing.

Tom
forbus@m.cs.uiuc.edu (07/17/90)
Speaking as a former program chair for AAAI, let me stick in the following $0.02. [These opinions are, of course, my own, and should not be construed as an official statement of the AAAI.]

0. If you think you have received an irresponsible review, PLEASE get in touch with the program chair(s). Such feedback is viewed as extremely important. While it may or may not affect the outcome in your particular case, there is deep concern about quality control in the reviewing process. For example, reviewers who screw up often end up getting dropped from future program committees.

1. When I was program chair, my very own paper was rejected. If it is a "closed shop" or an "old boys' network", it certainly didn't work very well for me! (And, yes, I thought it was a great paper. Furthermore, it had been flushed the year before, and I carefully re-wrote it along the lines the referees had suggested. Still lost. See point 4 below.)

2. Writing conference papers is a special skill, akin to writing haiku. A good conference paper, due to space limitations, is typically about one idea and some of its ramifications. System descriptions rarely work. (The paper mentioned above finally saw the light of day as a journal article. There was just no way it could fit within the constraints of a conference paper.) Empirical papers where the experiments don't all lead to some coherent point often don't work either. And space limitations aren't the only problem. The time constraints mean that, unlike a journal, if it isn't pretty much all there the first time, there won't be time to fix it. Now, add to these constraints the fact that people tend to finish their papers at the last minute (50% of them arrive the day of the deadline, on average), and you see why writing for this forum is tough. Empirically, many AAAI (and IJCAI) submissions are poorly written. One may say this is a "subjective matter", but that's dodging the issue. In point of fact, conference papers really need to be understandable by the reviewers, who after all are part of the community one hopes to address.

3. In my experience, the reviewing provided by AAAI usually results in more accurate appraisals than the more distributed IJCAI method. The face-to-face approach really helps reviewers converge more accurately. Many conflicts get resolved by finding another expert in the area and having them read the paper and either provide extra advice or write an additional review. (The most controversial case garnered five reviews. I don't remember if the paper got in or not.) The methodological-bias issue, to me, is a red herring. I'm sure there are biases. I'm also sure that everyone who reviews works very hard to make sure that their biases are kept under control. Interestingly, I've seen many examples of people from radically different "camps" in a subarea of AI agree, right down the line, on accept/reject decisions on papers in their area. So, personally, I'm much less likely to believe a paper was rejected for dogmatic reasons, and more inclined to believe that the referees just honestly thought it wasn't over threshold for that conference. I often get papers rejected. It doesn't feel good. Sometimes in retrospect I see what the reviewers meant. Sometimes I don't. While I have seen cases of irresponsible reviews, in my experience they are extremely rare.

4. In any filtering system some noise is inevitable. Ultimately, you have to realize that there are simply honest differences of opinion.
People differ on when a point is well-enough established, on what constitutes sufficient rigor, on what kinds of discussion of potential impact and literature survey suffice, etc. Usually the really great papers all get accepted, the really bad papers all get rejected, and the stuff in between gets sorted out more or less correctly. That's really the best one can hope for, I think.

Ken Forbus
ether@tzero.usa (David Etherington) (07/19/90)
There are lots of problems with the AAAI reviewing process, but perhaps things aren't as bad as they may seem anecdotally. First of all, all the anecdotes you hear are bad--nobody tells you how good the reviews they got were! Secondly, each reviewer meets with the other reviewer(s) face to face and discusses the paper (at least in cases where they didn't already agree on a cut-and-dried accept or reject) to hash out a decision. Thus, given a paper where one reviewer thinks there is no content and the other thinks it is acceptable, some negotiation goes on. Thus, you have to have 2 irresponsible reviewers to get an irresponsible JUDGEMENT.

Unfortunately, the process doesn't then FORCE the reviewers to go back and write a good review. So you end up with reviews that don't adequately reflect the final judgement. The area chairs are supposed to try to check the reviews to filter out the inflammatory ones ("Why do you keep sending in this kind of crap?"), but it doesn't always work (there were about 350 reviews done in 2 days in the KR area!). However, there were at least a couple of irresponsible reviewers (IMHO) this year, and they may have shared a couple of papers. There should be a mechanism to review reviewers (publicly? :-) so that the flakes get weeded out. Don't know how you could do it in such a way that you'd be free of lawsuits, though.

All in all, the AAAI review process seems much better than the IJCAI process, taken in the large, since there is confrontation on the reviews. I rarely see what I'd consider a well-written paper with good content rejected. (If anything, I see papers accepted that need lots of work.) Unfortunately, less-well-written but significant papers sometimes suffer. Maybe that should encourage us to write better, rather than trying to get the paper together just in time to make the Monday FedEx deadline!
hendler@dormouse.cs.umd.edu (Jim Hendler) (07/21/90)
I've sent quiet messages to several people about the AAAI reviews letter, but I do want to address an issue that I think is important. Both Etherington and Forbus rightly point out that the technique used at AAAI, with face-to-face discussion and a small program committee, probably leads to a better quality of reviewing than the distributed IJCAI approach. I've been involved in several conference program committees (including IJCAI reviewing, but not AAAI yet) and agree with this.

However, one of the criticisms often addressed to AAAI is not of poor reviewing (although every PC hears this), but rather of having a point of view and being a relatively closed shop. Papers at IJCAI may not always be as good as those at AAAI (although that is arguable), but the many people involved in the reviewing keep the conference from being biased toward any one approach. At AAAI the Program Chair(s) choose the subtopic area "chairs," and these folks in turn, working with the Program Chair(s), choose the reviewers. Unfortunately, some of these folks do not make the effort to be sure that the PC they pick is distributed across the approaches to the area (and there are understandably many reasons this may happen - geographic proximity, professional collegiality, etc.). Since one of the goals each year is to have continuity, in the past this has sometimes led AAAI to have some of its areas become somewhat narrow.

I think that AAAI is aware of this, and the choice of Program Chairs, PC members, etc. shows that an effort is being made by AAAI to avoid this. In its early years, however, I think less effort was made, and the conference gained a negative reputation among some sectors of the AI community. As a member of the AI community, I'd argue that we should encourage AAAI in these efforts and stay involved in the conference. On the other hand, I think AAAI must remain aware of its reputation and keep making sure that PC members are chosen to span areas, that new people are brought into the process each year, that some folks who typically do NOT publish at AAAI are brought into the process, etc.

Jim Hendler
Computer Science Dept.
UMCP
College Park, Md. 20742
hendler@cs.umd.edu