dupuy@cs.columbia.edu (Alexander Dupuy) (08/08/90)
I don't see much problem with duplicate bibliographies - people can decide for
themselves which is more useful/accurate/whatever - but more of a problem would
be the question of the database formats.

The first issue would be whether any format would be used at all. Given the
experience of comp.archives, I would expect that trying to define a new format
would almost certainly be a failure, and even using an existing one, at least
half the postings wouldn't use it (or any other standard format).

There are currently three major bibliography formats out there (that I know of,
anyhow) not counting library software systems. One is Unix refer(1) format,
documented in addbib(1), and the other two are Scribe and BibTeX. BibTeX
format is pretty much a subset of Scribe's with one or two minor exceptions.
Both refer and Scribe/BibTeX format have their own advantages and
disadvantages.

Unix refer format is more fixed in structure, and thus more amenable to
database-style operations (e.g. sortbib, indxbib, lookbib). It has the
advantage that it comes pretty much standard with Unix. Although the defined
fields are somewhat more regular than Scribe/BibTeX format, they aren't quite
as extensive.

Scribe/BibTeX format is more freeform, but requires classification of the
document type (e.g. article, book, proceedings, unpublished, etc.). It has the
advantage that Scribe and BibTeX can both understand a common subset format,
and both provide support for generating bibliographies and references in a
number of styles (e.g. CACM, IEEE, etc.).

A sample refer format bibliography entry might look like this:

%K Miscellaneous
%A David P. Anderson
%A Robert Wahbe
%T A Framework for Multimedia Communication in a General-Purpose Distributed System
%R Technical Report 89/498
%I UC Berkeley CS Division
%D March 1989
%X \fBAbstract:\fP
Motivates the design given in TR 88/462, gives some comparisons, and discusses
implications for protocol and local system design.
Description of channel parameters supersedes TR 88/462.

The same bibliography entry in Scribe/BibTeX common subset might look like:

@TechReport(UCBTR-89-498,
Author = "David P. Anderson and Robert Wahbe",
Title = "A Framework for Multimedia Communication in a General-Purpose Distributed System",
Institution = "UC Berkeley CS Division",
Number = "89/498",
Month = "March",
Year = "1989",
Abstract = {
Motivates the design given in TR 88/462, gives some comparisons, and discusses
implications for protocol and local system design.
Description of channel parameters supersedes TR 88/462.}
)

It is more or less feasible to convert from one format to the other (easier, I
think, when going from refer to Scribe/BibTeX, which is why I prefer refer).

I'll follow this article with a posting describing each format in more detail,
and some notes I've made on conversions between them.

@alex
--
--
inet: dupuy@cs.columbia.edu
uucp: ...!rutgers!cs.columbia.edu!dupuy
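
As a rough illustration of why refer's line-oriented structure lends itself to
database-style processing, here is a minimal Python sketch of a reader for it.
The name parse_refer and the dict-of-lists representation are assumptions made
for this example, not part of refer or its companion tools. It assumes entries
are separated by blank lines and that a line not starting with % continues the
preceding field.

```python
from collections import defaultdict


def parse_refer(text):
    """Parse refer-format text into a list of entries.

    Each entry maps a key letter (e.g. "A") to a list of values, because
    fields such as %A may repeat. A blank line ends an entry; a line that
    does not start with '%' is appended to the previous field.
    """
    entries = []
    current = defaultdict(list)
    last_key = None
    for line in text.splitlines():
        if not line.strip():
            # Blank line: close the entry in progress, if any.
            if current:
                entries.append(dict(current))
            current = defaultdict(list)
            last_key = None
        elif line.startswith("%") and len(line) >= 2:
            last_key = line[1]
            current[last_key].append(line[2:].strip())
        elif last_key is not None:
            # Continuation of the previous field.
            current[last_key][-1] += " " + line.strip()
    if current:
        entries.append(dict(current))
    return entries


SAMPLE = """\
%K Miscellaneous
%A David P. Anderson
%A Robert Wahbe
%T A Framework for Multimedia Communication in a
General-Purpose Distributed System
%R Technical Report 89/498
%I UC Berkeley CS Division
%D March 1989
"""

entry = parse_refer(SAMPLE)[0]
print(entry["A"])
print(entry["T"][0])
```

Keeping every field as a list, even single-valued ones, avoids special-casing
repeated keys such as %A when the entry is later converted.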
dupuy@cs.columbia.edu (Alexander Dupuy) (08/08/90)
The following are some notes I've made on the various bibliography formats - at
the end is an outline of a heuristic method for determining a Scribe/BibTeX
classification for refer style bibliography entries.

@alex
_______________________________________________________________________________

References are kept in Unix refer format, described below:

Bibliography Key Letters

The most common key-letters and their meanings are given below.

    %A      Author's name
    %B      Book (or Proceedings) containing article referenced
    %C      City (place of publication)
    %D      Date of publication
    %E      Editor of book containing article referenced
    %F      Footnote number or label (supplied by refer)
    %G      Government order number
    %H      Header commentary, printed before reference
    %I      Issuer (publisher)
    %J      Journal containing article referenced
    %K      Keywords to use in locating reference
    %L      Label field used by -k option of refer
    %M      Bell Labs Memorandum (undefined)
    %N      Number within volume
    %O      Other commentary, printed at end of reference
    %P      Page number(s)
    %Q      Corporate or Foreign Author (unreversed)
    %R      Report, paper, or thesis (unpublished)
    %S      Series title
    %T      Title of article or book
    %V      Volume number
    %X      Abstract - used by roffbib, not by refer
    %Y,Z    Ignored by refer

In practice, some of these conventions are ignored:

    %J is often used for Conference Proceedings, where %B is correct
    %J and %N are used for Technical Reports, where %R (or %M?) is correct
    %A is used in cases where %Q should be used for organizational authors

In order to encode certain information needed by Scribe and BibTeX, we will use
the %Y and %Z fields:

    %Y      Classification category (see below for list)
    %Z      Additional fields: FIELD = "val", FIELD = "val" ...
Valid classification categories are:

    ARTICLE BOOK BOOKLET CONFERENCE MANUAL MASTERSTHESIS MISC PHDTHESIS
    PROCEEDINGS TECHREPORT UNPUBLISHED
_______________________________________________________________________________

Another format is bibtex format, described below

Bibliography File

The bibliography file (.bib) format is just about a subset of that allowed in
Scribe bibliographies. Only the delimiter pairs {...} and "..." are allowed
inside entries. Entries themselves can be delimited by (...) also. The = sign
between field names and field values is not optional.

There are a number of conventions that should be followed when writing .bib
files. These are not requirements of bibtex, but standard bibliography style
files will typically expect these conventions to be followed.

References should be categorized as in Scribe into one of the categories:
article, book, booklet, inbook, incollection, inproceedings, manual,
mastersthesis, misc, phdthesis, proceedings, techreport, and unpublished. See
the Scribe manual for the fields that must/can appear in each type of
reference.

The title field should be entered in uppers-and-lowers format, where everything
is capitalized except articles and unstressed conjunctions and prepositions,
and even those are capitalized if they are the first word or the first word
after a colon. Some style files will convert all words except the first to all
lowercase. This is a mistake for things like proper nouns, so you have to tell
bibtex not to touch such capital letters by enclosing them in braces, as in
"Dogs of {A}merica". It is unlikely that any style file would attempt to
convert book titles to lowercase, so perhaps you can omit braces in such
titles.

The author and editor fields should conform to a particular format, so that the
style file can parse them into parts. A name can have four parts: first, von,
last, junior, each of which can consist of more than one word. For example,
"John Paul von Braun, Jr." has "John Paul" as the first part, "von" as the von
part, "Braun" as the last part, and "Jr." as the junior part. Use one of these
formats for a name:

    First von Last
    von Last, First
    von Last, Junior, First

The last part is assumed to be one word, or all the words after the von part.
Bibtex will treat anything in braces as one word, so use braces to surround
last names that contain more than one word. The von part is recognized by
looking for words that begin with lowercase letters. When possible, enter the
full first name(s); style files may abbreviate by taking the first letter.
Actually, the rules for isolating the name parts are a bit more complicated, so
they do the right thing for names like "de la Grand Round, Chuck". There is no
need for a field like Scribe's fullauthor field.

If there are multiple authors or editors, they should all be separated by the
word and. Scribe's editors field should not be used, since bibtex style files
can count how many names are in an editor field.
_______________________________________________________________________________

And from the Scribe manual page:

BIBLIOGRAPHIES

Scribe contains a mechanism for automatically assembling a Bibliography for a
document by selecting entries from a larger bibliographic database. Scribe
expects to find the information for Bibliography entries in a bibliography
database file (.BIB) in a specific data format. Each entry in an .BIB file
must have the structure:

    @classification(codeword, list-of-fields)

The list of valid "classifications" appears in the "Classes" subtopic. The
list of valid "fields" appears in the "Fields" subtopic. The formatting of the
cited bibliographic reference is controlled by the reference format chosen via
the @Style command. Available formats can be found in the "Reference_Formats"
subtopic.

CLASSES

Scribe's bibliography classifications are listed below. Not all
classifications are used with each reference format.
Refer to the "Scribe User Manual" or the "Scribe Advanced User Manual" for
details on which classifications are required and optional for each reference
format.

ARTICLE
    An article from an academic journal or a magazine.
BOOK
    Something published on its own, usually by a publishing house that is not
    the same as the author.
BOOKLET
    Something published and bound, but having neither an explicitly-named
    publisher nor a sponsoring institution.
CONFERENCE
    The conference name.
INBOOK
    For a reference to a part of a book rather than to the entire book.
INCOLLECTION
    Something composed of papers or chapters previously published elsewhere.
INPROCEEDINGS
    A reference to a paper in a conference proceedings or the like.
MANUAL
    An instruction manual or piece of technical documentation.
MASTERSTHESIS
    A Masters' thesis.
MISC
    Any category not mentioned in this list.
PHDTHESIS
    A Ph.D. thesis.
PROCEEDINGS
    The proceedings of a conference or some similar document. The identifying
    characteristics of the classification are that its publisher and author
    are identical, and often no editor's name appears.
TECHREPORT
    A technical report. Similar to a book, except that it is published by a
    research institution instead of by a publisher and that it usually has an
    assigned "report number".
UNPUBLISHED
    Some paper that is in preparation or that has been printed but not
    published.

FIELDS

The following field names are used in defining bibliography database entries.
All take a delimited string or an abbreviation code as a value. Not all field
names apply to each classification; some are required while others are not.
Check the "Scribe User Manual" and "Scribe Advanced User Manual" for details.

ADDRESS
    The address of the publisher or printer or organization.
AUTHOR
    The name(s) of the author or authors, in the format in which they should
    be printed.
ANNOTE
    Any annotation text. Not actually printed in most bibliography formats.
BOOKTITLE
    The title of a book or proceedings of which this reference is a chapter or
    paper or article.
CHAPTER
    If a reference is being made to part of a book and not the entire book,
    specify either chapter or pages.
DATE
    Can be used instead of MONTH and YEAR in some reference formats.
EDITION
    Manuals often have an edition name or number that is not part of the
    actual title of the manual.
EDITOR
    The name of the editor. If more than one, use Editors.
EDITORS
    The name of the editors. If only one, use Editor.
FULLAUTHOR
    The full name of the author or authors, written out without commas.
FULLORGANIZATION
    The "full" name of the organization for mailing purposes.
HOWPUBLISHED
    For unusual manuscripts; how it came into your possession ("personal
    note", etc.).
INSTITUTION
    The organization or institution backing or publishing a technical report
    or a proceedings.
JOURNAL
    The title of the journal.
KEY
    The sort key. This field is used for alphabetization.
MEETING
    Used with the value of the SOCIETY field name.
MONTH
    January, February, etc.
NOTE
    Any comment, usually used to clarify the reference or to suggest alternate
    sources. Differs from Annote in that Note will always be printed, but
    Annote will be printed only in those bibliography types that specify
    annotation.
NUMBER
    Issue number of a journal or series number in a book series or serial
    number of a technical report.
ORGANIZATION
    The name of the organization holding a conference that published a
    proceedings.
PAGES
    The page numbers within a journal, proceedings, or book that contain the
    material actually cited.
PUBLISHER
    The name of the publishing company.
SCHOOL
    For theses, the name of the school granting the degree.
SERIES
    When books are published in a series, the series has a name.
TITLE
    The title of the book, article, thesis, or other document that is being
    cited. Do not italicize or underline; that detail will be handled by the
    selected reference format.
TYPE
    Some technical reports are called by other names, such as "Research
    Report", etc. If this is not a "Technical Report", put its true name in
    this field.
VOLUME
    The volume number of a journal or a series book.
YEAR
    The year of publication; four digits: 1979.

REFERENCE_FORMATS

Bibliography format definitions in the Database are used to control the style
and sequencing of the list of references and the citations. Select one with
the References @Style parameter.

1APA
    Similar to the APA format except that it contains an Annote field that is
    treated as a Comment.
1APADRAFT
    Similar to the 1APA format except that it is double-spaced.
ANNAPA
    Similar to the 1APA format except that the Annote field is treated as
    text.
ANNAPADRAFT
    Similar to the 1APADraft format except that the Annote field is treated as
    text.
ANNOTEDSTDALPHABETIC
    Same as StdAlphabetic, but includes annotations and has filled lines.
ANNOTEDSTDIDENTIFIER
    Similar to the STDIdentifier format except it includes annotations and has
    filled lines.
ANNOTEDSTDNUMERIC
    Same as STDNumeric, but includes annotations (i.e. the contents of the
    Annote field) in the Bibliography and has filled lines.
ANNSTDALPHABETIC
    Similar to the STDAlphabetic format except it includes annotations and has
    unfilled lines.
ANNSTDNUMERIC
    Similar to the STDNumeric format except it includes annotations and has
    unfilled lines.
APA
    (American Psychological Association). Spelled-out citations (Knuth,
    1978), outdented closed reference list, alphabetical ordering of
    references.
APADRAFT
    Draft version of APA format. Same as regular version, but triple-spaces
    the Bibliography.
CACM
    Numeric citations [5], closed format, alphabetical ordering of references.
CLOSEDALPHABETIC
    Similar to the STDAlphabetic format.
CLOSEDNUMERIC
    Similar to the STDNumeric format.
IEEE
    Superscripted numeric citations, closed format, citation sequence ordering
    of references.
IPL
    (Information Processing Letters). The format required by IPL. This format
    is incomplete; it does not have all standard Scribe types yet (April 1984)
    and is being included for convenience only.
NEWAPA
    (American Psychological Association). The new APA format with the Year
    following the Author. Spelled-out citations (Knuth, 1978), outdented
    closed reference list, alphabetical ordering of references.
SIAM
    (Society for Industrial and Applied Mathematics). The format required by
    SIAM journals. This format is incomplete; it does not have all standard
    Scribe types yet (April 1984) and is being included for convenience only.
STDALPHABETIC
    Alphabetic citations [Knuth 78], open format, alphabetical ordering of
    references.
STDIDENTIFIER
    Open format, reference identifier for citations rather than a generated
    label.
STDNUMERIC
    Numeric citations [5], open format, alphabetical ordering of references.

COMMANDS

BIBFORM
    Defines a Bibliography classification, such as "Book", for a particular
    Bibliography reference format. May only be used in .REF files and the
    subtopic in the "Bibliographies" entry for available bibliography
    classifications.)

    Format: @Bibform(Classification=delimited-definition-string)

EXAMPLES

1.  @BibForm(UnPublished=<
    @begin(BibEntry)
    @parm(tag).@@parm(Author), @~
    "@parm(Title)"@~
    @Imbed(Note,def ', @Parm(Note)', undef '.')
    @end(BibEntry)
    >)

    (Note: Taken from the IEEE.Ref database file.)

2.  @BibForm(Misc=<
    @begin(BibEntry)
    @l1{[@parm(tag)]@@imbed(Author,def '@parm(Author).')}
    @imbed(Title,def '@l2{@parm(Title).}')
    @imbed(HowPublished,def '@l2{@parm(HowPublished).}')
    @imbed(Year, def '@l2{@imbed"Month, def {@Parm(Month), }"@~
    @parm(Year)}')
    @imbed(Note,def '@l2{@parm(Note).}')
    @end(BibEntry)
    >)

    (Note: Taken from the Standa.Lib database file.)
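
The @classification(codeword, list-of-fields) entry structure given in the
Scribe manual page is simple enough to generate mechanically. The Python sketch
below shows one way to do it. format_entry is a hypothetical helper, and its
quoting rule (double quotes, or braces when the value itself contains a double
quote) is an assumption of this example; it makes no attempt to handle
unbalanced braces.

```python
def format_entry(classification, codeword, fields):
    """Render one entry as @Classification(codeword, Field = "value", ...).

    `fields` is a list of (name, value) pairs so that the output order is
    predictable. Values go in double quotes unless they contain a double
    quote themselves, in which case braces are used instead.
    """
    rendered = []
    for name, value in fields:
        if '"' in value:
            rendered.append("    %s = {%s}" % (name, value))
        else:
            rendered.append('    %s = "%s"' % (name, value))
    lines = ["@%s(%s," % (classification, codeword)]
    lines.append(",\n".join(rendered))
    lines.append(")")
    return "\n".join(lines)


out = format_entry(
    "TechReport",
    "UCBTR-89-498",
    [
        ("Author", "David P. Anderson and Robert Wahbe"),
        ("Institution", "UC Berkeley CS Division"),
        ("Number", "89/498"),
        ("Year", "1989"),
    ],
)
print(out)
```

Parentheses are used as the outer delimiters because both the Scribe structure
and the bibtex notes above allow them.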
_______________________________________________________________________________

And finally, a mapping from refer keywords to Scribe/BibTeX fields:

    %A      AUTHOR (use last word before comma for KEY)
    %B      BOOKTITLE
    %C      ADDRESS
    %D      [MONTH] YEAR (or DATE, if more than two words)
    %E      EDITOR (or EDITORS, for Scribe)
    %F      ignored
    %G      NUMBER
    %H      NOTE (see also %O)
    %I      INSTITUTION (TECHREPORT)
            PUBLISHER ([IN]BOOK, BOOKLET, INCOLLECTION)
            ORGANIZATION (CONFERENCE, [IN]PROCEEDINGS, MANUAL)
            SCHOOL (MASTERSTHESIS or PHDTHESIS)
    %J      JOURNAL
    %K      ignored
    %L      KEY (also citation name of reference)
    %M      NUMBER, TYPE="Bell Labs Memorandum"
    %N      NUMBER
    %O      NOTE (see also %H)
    %P      PAGES
    %Q      AUTHOR (use first word for KEY)
    %R      parse into TYPE and NUMBER, very messy
    %S      SERIES
    %T      TITLE
    %V      VOLUME
    %X      ANNOTE

The following Scribe/BibTeX fields have to be encoded in %Z:

    CHAPTER
    EDITION
    FULLAUTHOR
    FULLORGANIZATION
    HOWPUBLISHED
    MEETING
    PUBLISHER (INPROCEEDINGS, PROCEEDINGS)
    SOCIETY

A heuristic for determining the classification type from the refer data:
(evaluate from top to bottom, observing nesting conditionals)

%J present
    %N present and %J contains the string "report"
        TECHREPORT (and convert %J into TYPE)
    %I present and %I contains any of the strings "univ.", "university",
    "dept.", "department", "labs", "laboratory", "center", "institut",
    "division"
        TECHREPORT (and convert %J into TYPE)
    %J contains any of the strings "proc.", "proceedings", "conf.",
    "conference", "symp.", "symposium", "congress", "intl.", "workshop"
        INPROCEEDINGS
    else
        ARTICLE
%A missing
    %E present
        BOOK
    (%B or (%I and %T)) and %D present
        PROCEEDINGS
    %T and %D present
        MANUAL
    else
        MISC
%T missing
    MISC
%R or %M present
    %R present and %R contains the string "thesis"
        %R contains the string "masters"
            MASTERSTHESIS
        else
            PHDTHESIS
    else
        TECHREPORT
%B present
    if %E missing or %B contains any of the strings "proc.", "proceedings",
    "conf.", "conference", "symp.", "symposium", "congress", "intl.",
    "workshop"
        INPROCEEDINGS
    else
        INCOLLECTION
%E present
    if %T contains any of the strings "proc.", "proceedings", "conf.",
    "conference", "symp.", "symposium", "congress", "intl.", "workshop"
        if %P present
            INPROCEEDINGS
        else
            PROCEEDINGS
    else
        if %P present
            INCOLLECTION
        else
            BOOK
%P present
    INBOOK
%I present and %I does not contain the string "press"
    %I contains any of the strings "univ.", "university", "dept.",
    "department", "institut"
        PHDTHESIS
%I or %C present
    BOOK
%D present
    BOOKLET
else
    MISC
_______________________________________________________________________________
--
--
inet: dupuy@cs.columbia.edu
uucp: ...!rutgers!cs.columbia.edu!dupuy
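
The classification heuristic above can be sketched in Python roughly as
follows. This is one reading of the outline's nesting, not a tested converter:
classify, the word-list constants, and the dict-of-lists entry layout are all
assumptions of this example, matching is case-insensitive substring search, and
the "convert %J into TYPE" step is left out.

```python
PROC_WORDS = ("proc.", "proceedings", "conf.", "conference", "symp.",
              "symposium", "congress", "intl.", "workshop")
INST_WORDS = ("univ.", "university", "dept.", "department", "labs",
              "laboratory", "center", "institut", "division")
SCHOOL_WORDS = ("univ.", "university", "dept.", "department", "institut")


def classify(entry):
    """Guess a Scribe/BibTeX classification for one refer entry.

    `entry` maps key letters to lists of values, e.g. {"A": [...], "T": [...]}.
    Rules are tried from top to bottom; the first match wins.
    """
    def text(key):
        return " ".join(entry.get(key, [])).lower()

    def has(key):
        return bool(entry.get(key))

    def any_in(key, words):
        return any(w in text(key) for w in words)

    if has("J"):
        if has("N") and "report" in text("J"):
            return "TECHREPORT"
        if has("I") and any_in("I", INST_WORDS):
            return "TECHREPORT"
        if any_in("J", PROC_WORDS):
            return "INPROCEEDINGS"
        return "ARTICLE"
    if not has("A"):
        if has("E"):
            return "BOOK"
        if (has("B") or (has("I") and has("T"))) and has("D"):
            return "PROCEEDINGS"
        if has("T") and has("D"):
            return "MANUAL"
        return "MISC"
    if not has("T"):
        return "MISC"
    if has("R") or has("M"):
        if has("R") and "thesis" in text("R"):
            return "MASTERSTHESIS" if "masters" in text("R") else "PHDTHESIS"
        return "TECHREPORT"
    if has("B"):
        if not has("E") or any_in("B", PROC_WORDS):
            return "INPROCEEDINGS"
        return "INCOLLECTION"
    if has("E"):
        if any_in("T", PROC_WORDS):
            return "INPROCEEDINGS" if has("P") else "PROCEEDINGS"
        return "INCOLLECTION" if has("P") else "BOOK"
    if has("P"):
        return "INBOOK"
    if has("I") and "press" not in text("I") and any_in("I", SCHOOL_WORDS):
        return "PHDTHESIS"
    if has("I") or has("C"):
        return "BOOK"
    if has("D"):
        return "BOOKLET"
    return "MISC"


report = {
    "A": ["David P. Anderson", "Robert Wahbe"],
    "T": ["A Framework for Multimedia Communication in a "
          "General-Purpose Distributed System"],
    "R": ["Technical Report 89/498"],
    "I": ["UC Berkeley CS Division"],
    "D": ["March 1989"],
}
print(classify(report))
```

Because each rule returns as soon as it matches, the order of the if blocks
carries the "evaluate from top to bottom" requirement; reordering them would
change the results.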
emv@math.lsa.umich.edu (Edward Vielmetti) (08/09/90)
In article <DUPUY.90Aug7212148@hudson.cs.columbia.edu> dupuy@cs.columbia.edu (Alexander Dupuy) writes:
There are currently three major bibliography formats out there (that I know of,
anyhow) not counting library software systems. One is Unix refer(1) format,
documented in addbib(1), and the other two are Scribe and BibTeX. BibTeX
format is pretty much a subset of Scribe's with one or two minor exceptions.
Both refer and Scribe/BibTeX format have their own advantages and
disadvantages.
There's also "tib" format, which is a slight mutation of refer(1)
format but usable with TeX. And the 10th edition unix manuals have a
further bibliography format (don't recall the name) that uses
refer-ish style except the tags are multicharacter (%title instead of
%T).
Here's a cite (from sgml.math.lsa.umich.edu:/pub/sgml/bibliography) on
tools that go from the SGML format to BibTeX; I haven't seen this thing
yet.
Cover, Robin; Duncan, Nicholas; Barnard, David. "A Bibliography
on Structured Text." Technical Report, 1990. This is the
preliminary print version of a bibliographic and information
database (compiled by Robin Cover), structured in SGML-database
and formatted with SGML ->> BibTeX utilities developed at Queen's
University by Nick Duncan and David Barnard. Contact: Department
of Computing and Information Science; Queen's University;
Kingston, Ontario, Canada K7L 3N6; Tel: (613) 545-6056.
I think there's also an ANSI bibliographic standard, though I don't know
how it addresses storage representation vis-a-vis appearance on the page.
--Ed
Edward Vielmetti, U of Michigan math dept <emv@math.lsa.umich.edu>
comp.text.sgml ISO 8879 SGML, structured documents, markup languages
yes votes to sgml-yes@math.lsa.umich.edu
no votes to sgml-no@math.lsa.umich.edu