jkpachl@watdaisy.UUCP (Jan Pachl) (11/05/86)
Question:

--- How many people read an average research paper? ---

Has anyone considered (or even investigated) this question?

Obviously, the question is ill-posed; one would have to define "research paper", "average", and "read".  Has anyone tried to answer the question for _any_ reasonable definition of those terms?  Perhaps a good definition of a reader for this purpose would be "someone who has spent enough time on the paper to learn more than what could be found from a short abstract".

Other related questions (e.g. "how many research papers quote an average research paper?") are much easier to formulate precisely (and to answer), but they are not as interesting.

Jan Pachl, University of Waterloo
bogstad@brl-smoke.ARPA (William Bogstad ) (11/06/86)
In article <7966@watdaisy.UUCP> jkpachl@watdaisy.UUCP (Jan Pachl) writes:
>Question:
>--- How many people read an average research paper? ---

	I had a conversation a few days ago in which someone mentioned a study that had supposedly been done once.  (I'm afraid he either didn't know the details or I have forgotten them.)  In any case, papers that had recently been published in a journal were sent out again to the journal's current reviewers.  In only one instance did someone notice that the paper had already been published.  (The text remained the same; only the title and the author's name were changed.)  Note that this may be an apocryphal story.

Bill Bogstad
bogstad@hopkins-eecs-bravo.arpa
roy@phri.UUCP (Roy Smith) (11/06/86)
In article <7966@watdaisy.UUCP> jkpachl@watdaisy.UUCP (Jan Pachl) writes:
>	Other related questions (e.g. "how many research papers quote
> an average research paper?") are much easier to formulate
> precisely (and to answer), but they are not as interesting.

	Actually, this question has in fact been answered.  Science Citation Index (put out by ISI Press, I believe; the same people who bring you Current Contents) lists papers according to citations.  This is usually an excellent way to do library research -- start with a paper that you're interested in and trace out a chain of people who have cited that paper.  ISI lists each year the papers which get cited the most often.  That's really the only way to say "this was an important piece of work".  If more people have cited your paper than any other paper, it's probably the most important.
--
Roy Smith, {allegra,philabs}!phri!roy
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016
"you can't spell unix without deoxyribonucleic!"
gh@utai.UUCP (11/09/86)
In article <7966@watdaisy.UUCP> jkpachl@watdaisy.UUCP (Jan Pachl) writes:
>> Other related questions (e.g. "how many research papers quote
>> an average research paper?") are much easier to formulate
>> precisely (and to answer), but they are not as interesting.

In article <2483@phri.UUCP> roy@phri.UUCP (Roy Smith) writes:
> Actually, this question has in fact been answered.
> [blurb on Science Citation Index, but not The Answer!]

ISI found long ago that in any given year, the ratio of the number of citations they processed to the number of articles they processed was between 1.65 and 1.7.  Fewer than 25% of all articles published are cited more than 10 times by other articles.  See Eugene Garfield's article in /Current Contents/, 9 Feb 1976 (reprinted in his /Essays of an Information Scientist/, vol. 2).

Garfield has also reported on the question that started this discussion: how many people read the average article (excluding the writer, editor, referees, typesetter, etc.)?  The number was amazingly small; unfortunately, I have been unable to find the reference.  Perhaps with this clue someone else may succeed.

Finally, let me commend Roy for his blurb on the Science Citation Index.  Although it is well known and used in the hard sciences, fewer people in the computing and mathematical sciences (who, I assume, are a large slab of the readers of these groups) seem to know of it or use it; a great pity!
--
\\\\ Graeme Hirst    University of Toronto    Computer Science Department
//// utcsri!utai!gh / gh@ai.toronto.edu / 416-978-8747
larsen@brahms (Michael Larsen) (11/10/86)
In article <2483@phri.UUCP> roy@phri.UUCP (Roy Smith) writes:
>Science Citation Index (put out by ISI Press, I believe; the same people
>who bring you Current Contents)... lists each year the papers which get
>cited the most often.  That's really the only way to say "this was an
>important piece of work".  If more people have cited your paper than any
>other paper, it's probably the most important.
>--
>Roy Smith, {allegra,philabs}!phri!roy
>System Administrator, Public Health Research Institute
>455 First Avenue, New York, NY 10016

	This is an interesting theory.  Let's see how it stands up against a short trip through the math citation index (CMCI).  The following observations can be duplicated by anyone with access to the 1976-80 edition.

1. Most people would consider Newton's _Principia_ to be a work of some importance.  CMCI gives 2 references.

2. O.K., that example was unfair because the paper in question is fairly old.  It is hard to choose a contemporary mathematical work whose title most people will recognize.  Nevertheless, we can use Math Reviews as an indication of stature.  This usually staid index fairly gushed over Deligne's 1972 paper "La Conjecture de Weil pour les surfaces K3."  I quote:

	This paper is awe-inspiring.  Its powerful technique and arithmetic
	insight should recommend it to a larger audience, although its
	sophistication will cause trouble for almost all readers.

How many references did this masterpiece garner from 1976 to 1980?  Four.  And that includes one by the author himself.

3. How about Durbin and Watson, "Testing for Serial Correlation in Least Squares Regression"?  This paper, in which a theorem of von Neumann is rederived, has a reputation for being often cited.  It lives up to it in CMCI: 52 citations.

4. A random search through the columns of CMCI turned up a 1964 paper by one J. B. Kruskal which has 116 citations.  It is quite possible that I am merely exposing my ignorance, but I confess to having heard of neither the mathematician in question nor the work.

The idea that a reference count is an accurate indication of the quality of a scholar must have a strong appeal to the bureaucratic mind.  Unfortunately, the real world doesn't seem to work that way.

-larsen @ berkeley.edu.brahms
jin@hropus.UUCP (Jerry Natowitz) (11/11/86)
One thing that amazes my oldest sister, now an environmental mutagenicist, is how often her earliest research (1960s) is still quoted.  I guess it helps to do your dissertation on interferon ...
--
Jerry Natowitz (HASA - J division)
Bell Labs - HR 2A-214
201-615-5178 (no CORNET yet)
ihnp4!houxm!hropus!jin (official)
ihnp4!opus!jin (better)
bzs@bu-cs.UUCP (Barry Shein) (11/11/86)
When I was working in medical research at Harvard, I remember getting into a big argument with someone about the chances that, in a random sample of papers, some number of them were wrong and would later be disproved.  I threatened to run a t-test on the overturned findings in his group's papers and publish that I had rejected the null hypothesis and proven, with p << .001, that everything they had ever said or would say was wrong.

I don't speak to that person any more...

	-Barry Shein, Boston University
shor@sphinx.UChicago.UUCP (Melinda Shore) (11/11/86)
[]
Citation analysis is a popular research area among library science Ph.D.s.  Much of what they've found has been consistent with what you'd intuitively expect.  Researchers in the sciences don't cite as heavily as researchers in the social sciences and humanities, and are less likely to make obligatory citations to standard works.  One thing we're seeing more of is people citing themselves heavily, as citation count is beginning to be considered in tenure decisions.
--
Melinda Shore                              ..!ihnp4!gargoyle!sphinx!shor
University of Chicago Computation Center   XASSHOR@UCHIMVS1.Bitnet
dickey@ssc-vax.UUCP (Frederick J Dickey) (11/11/86)
> In article <2483@phri.UUCP> roy@phri.UUCP (Roy Smith) writes:
> >Science Citation Index (put out by ISI Press, I believe; the same people
> >who bring you Current Contents)... lists each year the papers which get
> >cited the most often.  That's really the only way to say "this was an
> >important piece of work".  If more people have cited your paper than any
> >other paper, it's probably the most important.
> >--
> >Roy Smith, {allegra,philabs}!phri!roy
> >System Administrator, Public Health Research Institute
> >455 First Avenue, New York, NY 10016

I read an interesting article (in Science, I think) a few years ago that is somewhat relevant to this discussion.  My recollection of it follows.

It dealt with the subject of LPUs: LPU = Least Publishable Unit.  Years ago someone suggested that the number of citations of a paper might be a way of measuring its significance.  At the time, it might have been.  However, many researchers said to themselves, "WOW!  I can increase the significance of my paper if it is cited a lot.  I can increase my own significance if I get lots of my papers cited."  So these people started splitting up their papers into atomic units (LPUs) so that they could get lots of citations.  They also insisted on being listed as a co-author if they made any contribution at all to a paper, however minute.  This is why you see papers with a zillion authors.  To ensure the papers got cited, they worked out deals: "I'll cite you if you cite me."  The upshot seems to be that the number of citations may reflect political rather than technical acumen.

---f.j. dickey
berman@psuvax1.UUCP (Piotr Berman) (11/12/86)
In article <236@cartan.Berkeley.EDU> larsen@brahms (Michael Larsen) writes:
>In article <2483@phri.UUCP> roy@phri.UUCP (Roy Smith) writes:
>
>>Science Citation Index (put out by ISI Press, I believe; the same people
>>who bring you Current Contents)... lists each year the papers which get
>>cited the most often.  That's really the only way to say "this was an
>>important piece of work".  If more people have cited your paper than any
>>other paper, it's probably the most important.
>
>	This is an interesting theory.  Let's see how it stands up against
>a short trip through the math citation index (CMCI).  The following
>observations can be duplicated by anyone with access to the 1976-80 edition.
>
>1. Most people would consider Newton's _Principia_ to be a work of some
>importance.  CMCI gives 2 references.
>
>2. O.K., that example was unfair because the paper in question is fairly old.
>It is hard to choose a contemporary mathematical work whose title most
>people will recognize.  Nevertheless, we can use Math Reviews as an
>indication of stature.  This usually staid index fairly gushed over
>Deligne's 1972 paper "La Conjecture de Weil pour les surfaces K3."  I quote:
>
>	This paper is awe-inspiring.  Its powerful technique and arithmetic
>	insight should recommend it to a larger audience, although its
>	sophistication will cause trouble for almost all readers.
>
>How many references did this masterpiece garner from 1976 to 1980?  Four.
>And that includes one by the author himself.
>
>3. How about Durbin and Watson, "Testing for Serial Correlation in Least
>Squares Regression"?  This paper, in which a theorem of von Neumann is
>rederived, has a reputation for being often cited.  It lives up to it in
>CMCI: 52 citations.
>
>4. A random search through the columns of CMCI turned up a 1964 paper by one
>J. B. Kruskal which has 116 citations.  It is quite possible that I am
>merely exposing my ignorance, but I confess to having heard of neither
>the mathematician in question nor the work.

I do not know the paper either, but the paper

	J.B. Kruskal, On the shortest spanning subtree of a graph and
	the travelling salesman problem

is cited by any textbook on data structures and algorithms.

In general, if someone has a very deep and difficult theorem which 'closes' a certain topic, it will not be cited very much.  On the other hand, even a weak paper which 'opens' an area of research that becomes very popular will be cited very often (often without being read, I suspect; many people simply copy citations from others).

>The idea that a reference count is an accurate indication of the quality of
>a scholar must have a strong appeal to the bureaucratic mind.  Unfortunately,
>the real world doesn't seem to work that way.
>
>-larsen @ berkeley.edu.brahms

Here is the catch: there is no such thing as a precise indication of quality or importance.  But imprecise indicators have their value, if used with care.

Piotr Berman
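For readers who, like Larsen, do not recognize the 1956 paper mentioned above: what the data-structures textbooks cite it for is the minimum-spanning-tree method now known as Kruskal's algorithm.  A minimal sketch in Python follows; the presentation is the modern textbook one rather than the paper's own, and the function name and example graph are purely illustrative.

    def kruskal_mst(n, edges):
        """Minimum spanning tree of an undirected graph.

        n     -- number of vertices, labeled 0..n-1
        edges -- iterable of (weight, u, v) tuples
        Returns the list of edges chosen for the tree.
        """
        parent = list(range(n))

        def find(x):                      # union-find with path compression
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        tree = []
        for w, u, v in sorted(edges):     # consider edges in order of weight
            ru, rv = find(u), find(v)
            if ru != rv:                  # keep the edge if it joins two components
                parent[ru] = rv
                tree.append((w, u, v))
        return tree

    # Example: a 4-vertex graph; the tree found has total weight 1 + 2 + 3 = 6.
    print(kruskal_mst(4, [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 0, 3), (5, 0, 2)]))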
mae@weitek.UUCP (Mike Ekberg) (11/13/86)
In article <236@cartan.Berkeley.EDU> larsen@brahms (Michael Larsen) writes:
>The idea that a reference count is an accurate indication of the quality of
>a scholar must have a strong appeal to the bureaucratic mind.  Unfortunately,
>the real world doesn't seem to work that way.
>
>-larsen @ berkeley.edu.brahms

I think the reference count does not necessarily indicate the quality of a scholar.  But reference counts may be used to determine the areas of current work in a given field.  As an example, you could generate a 'citation' index for usenet.  You might find that article <236@cartan.Berkeley.EDU> has been cited several times in the last two weeks.  Was it a good article?  Who knows?  But I do know that in this newsgroup there are several people pursuing the topic of citation indices.  Maybe I'll unsubscribe if this topic continues :-}.

mike - {cae780,turtlevax}/weitek/mae
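Ekberg's usenet 'citation' index is easy to prototype: follow-up articles cite earlier ones with attribution lines of the form "In article <message-id> ...", so tallying those message-IDs gives a rough citation count.  A hedged sketch in Python, assuming the articles have been saved as plain text files (the file handling and regular expression here are illustrative, not part of any actual news software):

    import re
    import sys
    from collections import Counter

    # Follow-ups reference earlier articles with attribution lines such as
    # "In article <236@cartan.Berkeley.EDU> larsen@brahms (Michael Larsen) writes:"
    CITATION = re.compile(r"In article\s+<([^>]+)>")

    def citation_index(paths):
        """Count how often each message-ID is referenced across the given articles."""
        counts = Counter()
        for path in paths:
            with open(path, errors="replace") as f:
                counts.update(CITATION.findall(f.read()))
        return counts

    if __name__ == "__main__":
        # Usage: python citation_index.py article1.txt article2.txt ...
        for msg_id, n in citation_index(sys.argv[1:]).most_common(10):
            print(f"{n:4d}  <{msg_id}>")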
braner@batcomputer.tn.cornell.edu (braner) (11/13/86)
[]
Another interesting twist to the glorified "citation index" is that if you write a provocative enough paper, you're bound to have it cited (as a bad example).  So a high index does not prove it was a GOOD paper!  Sort of like well-known political candidates having to respond to negative claims about them made by unknown candidates, thus giving the latter much-needed free publicity.  Of course, they (both) may deserve it!

- Moshe Braner
roy@phri.UUCP (Roy Smith) (11/13/86)
In article <937@ssc-vax.UUCP> dickey@ssc-vax.UUCP (Frederick J Dickey) writes:
> I read an interesting article (in Science, I think) a few years ago that
> is somewhat relevant to this discussion.  [...]  It dealt with the subject
> of LPUs: LPU = Least Publishable Unit.

	Quoting from a recent issue of a computer science journal (the names have been changed to protect the innocent and to protect me from lawsuits):

	J.P. Foobar received the Ph.D. degree in computer science from
	Random University in 1975. [...] Dr. Foobar has published over
	100 papers.

How does that strike you?  My initial impression was "Hmm, over 100 papers in 11 years?  That's like 1 every 6 weeks!  Something's fishy here."  Maybe I'm wrong (I haven't read most of Dr. Foobar's papers), but I just can't believe *anybody* can do something worth publishing every 6 weeks.

A common (and, in my opinion, disreputable) practice in biology is to get your name on a paper by providing some technical service, trumped up as a collaborative effort: "Sure, I'll give your sample to my technician and tell him to run it through my Amino Acid Sequenator if you make me a co-author on your paper."
--
Roy Smith, {allegra,cmcl2,philabs}!phri!roy
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016
"you can't spell deoxyribonucleic without unix!"
jbk@alice.UUCP (11/21/86)
In article <236@cartan.Berkeley.EDU> larsen@brahms.berkeley.edu (Michael Larsen) writes:
>4. A random search through the columns of CMCI turned up a 1964 paper by one
>J. B. Kruskal which has 116 citations.  It is quite possible that I am
>merely exposing my ignorance, but I confess to having heard of neither
>the mathematician in question nor the work.

In article <2326@psuvax1.UUCP> berman@psuvax1.UUCP (Piotr Berman) writes:
>I do not know the paper either, but the paper
>
>	J.B. Kruskal, On the shortest spanning subtree of a graph and
>	the travelling salesman problem
>
>is cited by any textbook on data structures and algorithms.
>In general, if someone has a very deep and difficult theorem which 'closes'
>a certain topic, it will not be cited very much.  On the other hand, even
>a weak paper which 'opens' an area of research that becomes very popular
>will be cited very often (often without being read, I suspect; many people
>simply copy citations from others).

To Michael Larsen: The 1964 paper you cite concerns non-metric multidimensional scaling.  This method has achieved routine use in psychology, marketing, and some other fields.  (And I haven't heard of you either, darling!)

To Piotr Berman: The 1956 "shortest spanning subtree" paper you cite was written while I was a graduate student at Princeton, and was only my second published paper.  (May you write many papers as weakly popular as this one.)

A finitized form of a theorem from my Ph.D. thesis was the first proposition of genuine mathematical interest to be demonstrated as undecidable in a formal system (work by Harvey Friedman, using the first new method for demonstrating undecidability since Goedel introduced the concept).

Sic transit gloria mundi.

Joseph B Kruskal
larsen@brahms (Michael Larsen) (11/21/86)
>In article <236@cartan.Berkeley.EDU>
>larsen@brahms.berkeley.edu (Michael Larsen) writes:
>
>>4. A random search through the columns of CMCI turned up a 1964 paper by one
>>J. B. Kruskal which has 116 citations.
>>  (*) It is quite possible that I am merely exposing my ignorance,
>>but I confess to having heard of neither the mathematician in question
>>nor the work.
>
>To Michael Larsen: The 1964 paper you cite concerns non-metric
>multidimensional scaling.  This method has achieved routine use in
>psychology, marketing, and some other fields.  (And I haven't heard of
>you either, darling!)

	My apologies to Dr. Kruskal for gratuitously bringing his name into a discussion of the merits of citation counting.  I can only attribute the selection to the immense popularity of his paper (his fault) and to the accident of my not recognizing his name (my fault).  I imagine he would be gratified by the number of sci.math subscribers who brought to my attention the validity of (*).

>Sic transit gloria mundi.
>Joseph B Kruskal

	Indeed.  Happy for Isaac Newton that he was long dead on the day that his 1687 treatise on the laws of motion (which has achieved routine use in physics, engineering, and some other fields) fell ingloriously before a paper on non-metric multidimensional scaling.

Michael J. Larsen @ berkeley.brahms.edu
berman@psuvax1.UUCP (Piotr Berman) (11/25/86)
>In article <236@cartan.Berkeley.EDU>
>larsen@brahms.berkeley.edu (Michael Larsen) writes:
>
>>4. A random search through the columns of CMCI turned up a 1964 paper by one
>>J. B. Kruskal which has 116 citations.  It is quite possible that I am
>>merely exposing my ignorance, but I confess to having heard of neither
>>the mathematician in question nor the work.
>
>In article <2326@psuvax1.UUCP> berman@psuvax1.UUCP (Piotr Berman) writes:
>
>>I do not know the paper either, but the paper
>>
>>	J.B. Kruskal, On the shortest spanning subtree of a graph and
>>	the travelling salesman problem
>>
>>is cited by any textbook on data structures and algorithms.
>>In general, if someone has a very deep and difficult theorem which 'closes'
>>a certain topic, it will not be cited very much.  On the other hand, even
>>a weak paper which 'opens' an area of research that becomes very popular
>>will be cited very often (often without being read, I suspect; many people
>>simply copy citations from others).
>
>To Michael Larsen: The 1964 paper you cite concerns non-metric
>multidimensional scaling.  This method has achieved routine use in
>psychology, marketing, and some other fields.  (And I haven't heard of
>you either, darling!)
>
>To Piotr Berman: The 1956 "shortest spanning subtree" paper you cite was
>written while I was a graduate student at Princeton, and was only my second
>published paper.  (May you write many papers as weakly popular as this one.)
>
>A finitized form of a theorem from my Ph.D. thesis was the first proposition
>of genuine mathematical interest to be demonstrated as undecidable in a
>formal system (work by Harvey Friedman, using the first new method for
>demonstrating undecidability since Goedel introduced the concept).
>
>Sic transit gloria mundi.
>
>Joseph B Kruskal

Sorry for a clumsy formulation.  I LIKE KRUSKAL'S ALGORITHM.  Any former student of Comp. Sc. must know it, so I was surprised that your name was unfamiliar to someone here.  But you must admit that it was not the most difficult of your results.  And if I were to cite you, I would do it by copying the reference from a textbook, without reading the paper.  I would conjecture that many people writing on applications of your 1964 paper read about the result and then requoted the reference.

This perhaps indicates that the question should be "how many people learn an average mathematical result" rather than "how many people read an average paper".  Very few people quote Pythagoras, for example.

Sorry that I am ignorant of your thesis; maybe I should read it over Christmas as a penance.

Piotr Berman