pgil%histone@LANL.GOV (Paul Gilna) (08/10/90)
Over the past two years, the primary thrust of the GenBank project has been to improve the timeliness and completeness of the database. Endeavours such as the interaction with journals, sequence submission policies, and new submission software tools have brought us to the point where we now receive 80% of our data in electronic form directly from the scientific community and where our average turnaround is now measured in weeks rather than months. This progress in soliciting direct and automated data submission, and in the RDBMS conversion, now frees us to deal in greater detail with one of the most important components of the database, the biology represented within the annotation. In addition to our work to enrich the quality of the annotation using our own annotation resources, we now wish to seek the direct involvement of the members of the scientific community.

The following announcement represents the beginning of a program to help us enhance the quality and integrity of the data represented in the GenBank database. This announcement will only be distributed via e-mail for the pilot phase; however, recipients are free to redistribute this notice. This notice is being posted to both the GENBANK-BB and BIONEWS bulletin boards, and we apologize in advance for any redundancy across the two newsgroups.

Paul Gilna
GenBank Biology Domain Leader
Los Alamos National Laboratory
Los Alamos, NM 87545
pgil%histone@lanl.gov
Tel: (505) 665-2177
Fax: (505) 665-3493

GENBANK CURATOR PROGRAM

GenBank announces the pilot phase of the GenBank Curator Program. We are seeking suggestions for work to be done on the database in the form of informal proposals. Authors of successful proposals will travel to Los Alamos and work with the annotation or computation staff to carry out their proposed project.
Although GenBank has had some curators in the past, the advent of the GenBank RDBMS restructuring and its attendant interface, the Annotator's Workbench, allows us to implement an expanded program using a unified, intuitive annotation tool that provides the capability of remote use. The current program seeks to identify domains within the database that are in need of overhaul either at the sequence or at the annotation level. In addition, as part of ongoing development of the Sequence Validation Suite (SVS), a suite of software programs that will be used to check the validity of submitted sequence and annotation data, we have expanded the program to include software development associated with the SVS.

We are looking to the readership of the molecular biology-oriented Bulletin Boards for proposals for curation on GenBank. If you are familiar with a domain or family of sequences represented within the database and with the existing annotation, and have some ideas on how the annotation could be improved (for example to reflect similarities in features across entries, to improve existing nomenclature, or to point out sequence merges), or on software that could be developed to aid data integrity and validation, then we would like to hear from you.

In this pilot study, about six proposals will be selected to be implemented before the end of September, 1990. Based on the results of the study, we hope to take on about 30 or so more projects over the course of the next two years. The capability exists for continued interaction with the data bank staff on a consultant basis, using remote access facilities to the annotation software.

The work will be carried out on site at Los Alamos. Travel (within the US for the pilot study), hotel costs, and subsistence will be covered. Project proposals will be reviewed by GenBank and NIH staff.

Proposals should be submitted to Dr. Paul Gilna via e-mail (pgil%histone@lanl.gov) and should cover the following topics:

o Detailed description of work proposed, citing examples from the database, where relevant, and of the scope of the proposed work
o Justification of work in terms of benefit to community and data bank
o Estimation of time needed to conduct work at LANL
o Abbreviated CV including representative publications.
roy@phri.nyu.edu (Roy Smith) (08/11/90)
pgil%histone@LANL.GOV (Paul Gilna) writes:
> Authors of successful proposals will travel to Los Alamos and work with
> the annotation or computation staff to carry out their proposed project.

I made an attempt to respond to this earlier today, over my morning cup of tea. Apparently, enough caffeine had not yet entered my system, since no trace of my article now exists. So, let me try again.

I wonder if it should really be necessary to travel to Los Alamos to do the work. The whole idea of building NSFNet, NREN, etc, is to bring data and computing resources to people, not the other way around. Private email with Paul (between the first abortive posting and this one) has caused me to mellow my original position, to the point where I agree that an introductory in-person get-together is A Good Thing, but I still feel that it should be possible to do most of the work remotely. Of course, I understand the scenery in New Mexico is pretty nice, and you can't really get that through a T1 wire.

Aha! I just figured out why my earlier posting got lost. The version of rn I'm using automagically turned the newsgroups line in my followup of a bionet.general article into bionet.followup, a holdover from what I think is long-obsolete usenet policy.
--
Roy Smith, Public Health Research Institute
455 First Avenue, New York, NY 10016
roy@alanine.phri.nyu.edu -OR- {att,cmcl2,rutgers,hombre}!phri!roy
"Arcane? Did you say arcane? It wouldn't be Unix if it wasn't arcane!"
kristoff@genbank.BIO.NET (David Kristofferson) (08/11/90)
> Aha! I just figured out why my earlier posting got lost. The
> version of rn I'm using automagically turned the newsgroups line in my
> followup of a bionet.general article into bionet.followup, a holdover from
> what I think is long-obsolete usenet policy.

We encountered that annoying problem with our vnews USENET software too when we first put it up, but got rid of this troublesome "feature." Systems managers, beware!
--
Sincerely,
Dave Kristofferson
GenBank On-line Service Manager
kristoff@genbank.bio.net
pgil%histone@LANL.GOV (Paul Gilna) (08/13/90)
Roy Smith (roy@phri.nyu.edu) writes:
> I wonder if it should really be necessary to travel to Los Alamos
> to do the work. The whole idea of building NSFNet, NREN, etc, is to bring
> data and computing resources to people, not the other way around. Private
> email with Paul (between the first abortive posting and this one) has
> caused me to mellow my original position, to the point where I agree that
> an introductory in-person get together is A Good Thing, but I still feel
> that it should be possible to do most of the work remotely. Of course, I
> understand the scenery in New Mexico is pretty nice, and you can't really
> get that through a T1 wire.

The goal of the curator program is to enable exactly this--remote access to the database by a curatorial team of scientists, using system-independent annotation tools running either on a local hardware platform, or remotely on the GenBank database host. I would emphasize that we are in the pilot phase of this program, and as such are treading carefully, so that we can stay flexible in how the program is implemented.

For those involved in biological curation, there is a fair amount of training in the annotation tools (the Annotator's Workbench, our interface to the RDBMS), and in our editorial standards and policies. For those involved in software module development for the SVS (the Sequence Validation Suite), there is a need to familiarize oneself with the design features of the RDBMS, which cannot (at this stage) be accomplished remotely.

Early feedback in the program suggested that scientists might be more comfortable with performing the work in a discrete "chunk" of their time, rather than drawn out over time, where more conflicts were likely to occur; hence the emphasis on on-site work. We do not see this policy as dogma, however, and recognize that in the full program, a family of interaction modes will likely prevail over any single design.
We have already had some favourable reaction from the community, and I would encourage continued comments (public or private) on this program. We are very excited about the possibilities and impact on the database that will come from this endeavour.

Finally, I cannot but concur with the perception that New Mexico is "pretty nice" (masterly use of the understatement here, Roy!); what more could one ask than for good science, good scenery, and good food?

Regards,
--paul