toms@fcs260c2.ncifcrf.gov (Tom Schneider) (05/08/91)
Archive-name: bionet/molbio/delila/1991-04-30
Archive-directory: ncifcrf.gov:/pub/delila/ [129.43.1.11]
Original-posting-by: toms@fcs260c2.ncifcrf.gov (Tom Schneider)
Original-subject: Re: Software for automated subseqence extraction
Reposted-by: emv@msen.com (Edward Vielmetti, MSEN)

In article <eesnyder.672776972@beagle> eesnyder@boulder.Colorado.EDU (Eric E. Snyder) writes:
>I am looking for some software that will allow me to extract subsequences
>from genbank or PIR.

The Delila system, old and senile as it is, was designed to extract large sets of subsequences (DNA only).

>For example, I would like to be able to provide a keyword such as 'splice
>site' and have the program search genbank and return with a list of sequence
>names and the subsequence from each entry corresponding to my keyword.

Because Delila was designed before GenBank, and GenBank structure is STILL not up to snuff, one must convert from GenBank to Delila format. This is a simple program called dbbk (written by Matt Yarus, son of Mike Yarus, you may be interested to know!).

The Delila viewpoint is that the database consists of a set of organisms and their chromosomes. You must specify these, and then the piece of DNA you are interested in. The piece corresponds roughly to a GenBank entry.

The idea is that Delila is a 'librarian' and you give 'her' instructions that define the fragments you want. She reaches into the library and pulls out -- what else? -- a book.
Instructions might look like:

   title 'Demonstration of Delila instructions';
   (* the title is required to name the resulting book *)
   (* this is a comment, just as in the computer language Pascal *)
   organism H.sapiens; (* define the organism *)
   chromosome 3; (* I made this name up; unfortunately GenBank hasn't
                    stored this information consistently *)
   piece x253; (* I made this name up also *)
   get from 536 -24 to 536 +30;

The last instruction, 'get', says to Delila that you want the fragment that starts 24 bases before coordinate 536 and ends 30 bases after. By having the instructions written in a file, one can handle many of them.

There is now a program that automatically creates Delila instructions from the GenBank features. This has allowed us to create hundreds to thousands of fragments for statistical analysis.

Parts of the Delila system are available by anonymous ftp from ncifcrf.gov in pub/delila. See the README files. I will place more programs in the archive if you request them.

  Tom Schneider
  National Cancer Institute
  Laboratory of Mathematical Biology
  Frederick, Maryland 21702-1201
  toms@ncifcrf.gov

-- comp.archives file verification
ncifcrf.gov
total 1111
-rw-r--r-- 1 140 42225 May 1 11:16 sites.c.Z
-rw-r--r-- 1 140 36808 May 1 11:15 sites.p.Z
-rw-r--r-- 1 140 5478 Apr 29 17:45 ribo.logo.Z
-rw-r--r-- 1 140 14006 Apr 29 13:55 siva.p.Z
-rw-r--r-- 1 140 128 Apr 29 12:37 sivap.Z
-rw-r--r-- 1 140 227 Apr 29 12:36 README.delila.Z
-rw-r--r-- 1 140 2800 Apr 29 11:04 bez.ps.Z
-rw-r--r-- 1 140 271 Apr 29 10:52 README.misc.Z
-rw-r--r-- 1 140 1789 Apr 29 10:51 dops.demo.Z
-rw-r--r-- 1 140 30365 Apr 29 10:47 dops.p.Z
-rw-r--r-- 1 140 1789 Apr 29 10:47 demo.Z
-rw-r--r-- 1 140 2692 Apr 17 17:38 standard.t7.Z
-rw-r--r-- 1 140 1875 Apr 17 17:35 database.t7.Z
-rw-r--r-- 1 140 3311 Apr 9 09:56 vaxmod.p.Z
-rw-r--r-- 1 140 1589 Apr 2 17:17 cell.sty.Z
-rw-r--r-- 1 140 13642 Apr 2 17:17 cell.bst.Z
-rw-r--r-- 1 140 3649 Apr 2 12:36 decat.c.Z
-rw-r--r-- 1 140 1864 Apr 2 12:36 decat.p.Z
-rw-r--r-- 1 140 15223 Mar 26 13:58 calc.c.Z
-rw-r--r-- 1 140 13686 Mar 26 13:58 calc.p.Z
-rw-r--r-- 1 140 15 Mar 23 06:21 catalp
-rw-r--r-- 1 140 12886 Mar 21 17:57 worcha.p.Z
-rw-r--r-- 1 140 2660 Mar 20 14:18 titer.verbose.Z
-rw-r--r-- 1 140 625 Mar 20 14:17 titer.result.Z
-rw-r--r-- 1 140 12384 Mar 20 14:17 titer.c.Z
-rw-r--r-- 1 140 10694 Mar 20 14:17 titer.p.Z
-rw-r--r-- 1 140 4661 Mar 20 14:12 titer.plates.Z
-rw-r--r-- 1 140 2722 Mar 20 12:34 tkod.p.Z
-rw-r--r-- 1 140 87 Mar 20 10:31 dirtyp.Z
-rw-r--r-- 1 140 5008 Mar 20 10:31 dirty.p.Z
-rw-r--r-- 1 140 1517 Mar 19 18:17 ww.c.Z
-rw-r--r-- 1 140 1271 Mar 19 18:16 ww.p.Z
-rw-r--r-- 1 140 34749 Mar 19 14:11 search.p.Z
-rw-r--r-- 1 140 15296 Mar 15 14:11 README
-rw-r--r-- 1 140 3319 Mar 8 18:01 nulldate.p.Z
-rwxr--r-- 1 140 10147 Mar 8 17:38 nulldate.Z
-rw-r--r-- 1 140 13765 Mar 6 12:35 module.p.Z
-rw-r--r-- 1 140 3986 Mar 6 12:34 moddef.Z
-rw-r--r-- 1 140 46524 Feb 15 12:40 xyplo.c.Z
-rw-r--r-- 1 140 7438 Feb 15 12:31 dalvec.c.Z
-rw-r--r-- 1 140 213 Feb 15 12:31 README.logo.Z
-rwxr--r-- 1 140 196 Feb 15 12:29 2c.Z
-rw-r--r-- 1 140 5865 Feb 15 12:23 alpro.c.Z
-rw-r--r-- 1 140 25779 Feb 15 12:21 makelogo.c.Z
-rw-r--r-- 1 140 29 Feb 15 12:14 marks.Z
-rw-r--r-- 1 140 37858 Feb 14 15:14 libdef.Z
-rw-r--r-- 1 140 10 Feb 11 11:21 genhisp
-rw-r--r-- 1 140 10587 Feb 11 11:21 genhis.p.Z
-rw-r--r-- 1 140 22198 Jan 18 14:56 makelogo.p.Z
-rw-r--r-- 1 140 803 Jan 16 20:17 makelogop.dna.Z
-rw-r--r-- 1 140 808 Jan 16 20:17 makelogop.ribo.Z
-rw-r--r-- 1 140 945 Jan 16 20:17 makelogop.protein.Z
-rw-r--r-- 1 140 803 Jan 16 20:17 makelogop.lambcro.Z
-rw-r--r-- 1 140 838 Jan 16 20:17 makelogop.demo.Z
-rw-r--r-- 1 140 790 Jan 16 20:16 makelogop.alphabet.Z
-rw-r--r-- 1 140 8094 Jan 16 14:40 nat.bst.Z
-rw-r--r-- 1 140 18717 Jan 11 16:39 encode.p.Z
-rw-r--r-- 1 140 15571 Jan 11 16:39 alist.p.Z
-rw-r--r-- 1 140 3895 Jan 4 10:01 ver.p.Z
-rw-r--r-- 1 140 20030 Dec 19 17:00 rsgra.p.Z
-rw-r--r-- 1 140 214 Dec 14 13:27 listerp.Z
-rw-r--r-- 1 140 8349 Dec 14 13:27 count.p.Z
-rw-r--r-- 1 140 19010 Dec 14 13:27 lister.p.Z
-rw-r--r-- 1 140 5404 Dec 10 12:12 p2c.h.Z
-rw-r--r-- 1 140 7233 Dec 9 16:47 calhnb.p.Z
-rw-r--r-- 1 140 7339 Dec 9 15:59 t7.logo.Z
-rw-r--r-- 1 140 6172 Dec 9 15:57 dalvec.p.Z
-rw-r--r-- 1 140 871 Dec 9 15:29 rf.p.Z
-rw-r--r-- 1 140 4982 Dec 9 14:05 alpro.p.Z
-rw-r--r-- 1 140 4018 Dec 8 20:18 lambcro.logo.Z
-rw-r--r-- 1 140 5101 Dec 8 20:16 globin.logo.Z
-rw-r--r-- 1 140 806 Dec 8 19:07 makelogopo.ribo.Z
-rw-r--r-- 1 140 156 Dec 8 18:11 colors.dna.Z
-rw-r--r-- 1 140 854 Dec 5 23:05 xyplop.test.Z
-rw-r--r-- 1 140 1404 Dec 5 23:05 xyplop.mul.Z
-rw-r--r-- 1 140 1448 Dec 5 23:05 xyplop.demo.Z
-rw-r--r-- 1 140 1448 Dec 5 23:05 xyplop.Z
-rw-r--r-- 1 140 41792 Dec 5 23:05 xyplo.p.Z
-rw-r--r-- 1 140 553 Dec 5 23:05 xyin.test.Z
-rw-r--r-- 1 140 410 Dec 5 23:05 xyin.mul.Z
-rw-r--r-- 1 140 655 Dec 5 23:05 xyin.demo.Z
-rw-r--r-- 1 140 655 Dec 5 23:05 xyin.Z
-rw-r--r-- 1 140 4607 Dec 5 23:05 sortbibtex.p.Z
-rw-r--r-- 1 140 6918 Dec 5 23:05 ref2bib.p.Z
-rw-r--r-- 1 140 4139 Dec 5 23:05 verbop.p.Z
-rw-r--r-- 1 140 1554 Dec 5 23:05 jmb.sty.Z
-rw-r--r-- 1 140 13280 Dec 5 23:04 jmb.bst.Z
-rw-r--r-- 1 140 4333 Dec 5 23:04 nar.sty.Z
-rw-r--r-- 1 140 8013 Dec 5 23:04 nar.bst.Z
-rw-r--r-- 1 140 2040 Dec 5 23:04 rembla.p.Z
-rw-r--r-- 1 140 1322 Dec 5 23:04 symvec.dna.Z
-rw-r--r-- 1 140 385 Dec 5 23:04 symvec.demo.Z
-rw-r--r-- 1 140 2154 Dec 5 23:04 rsdata.dna.Z
-rw-r--r-- 1 140 7670 Dec 5 23:04 protseq.globin.Z
-rw-r--r-- 1 140 8405 Dec 5 23:04 logo.tex.Z
-rw-r--r-- 1 140 1821 Dec 5 23:04 logo.bbl.Z
-rw-r--r-- 1 140 158 Dec 5 23:04 colors.two.Z
-rw-r--r-- 1 140 305 Dec 5 23:04 colors.protein.Z
-rw-r--r-- 1 140 474 Dec 5 23:04 colors.jm.Z
-rw-r--r-- 1 140 416 Dec 5 23:04 colors.dg.Z
-rw-r--r-- 1 140 197 Dec 5 23:04 colors.demo.Z
-rw-r--r-- 1 140 142 Dec 5 23:04 colors.alphabet.Z
-rw-r--r-- 1 140 186 Dec 5 23:04 colors.Z
-rw-r--r-- 1 140 1216 Dec 5 23:04 w71.Z
-rw-r--r-- 1 140 1085 Dec 5 23:04 w51.Z
-rw-r--r-- 1 140 1457 Dec 5 23:04 w101.Z
-rw-r--r-- 1 140 21444 Dec 5 23:04 rseq.p.Z
-rw-r--r-- 1 140 9313 Dec 5 23:04 rawbk.p.Z
-rw-r--r-- 1 140 11759 Dec 5 23:04 patval.p.Z
-rw-r--r-- 1 140 14519 Dec 5 23:04 patser.p.Z
-rw-r--r-- 1 140 319 Dec 5 23:04 patli.Z
-rw-r--r-- 1 140 122 Dec 5 23:04 patin.Z
-rw-r--r-- 1 140 291 Dec 5 23:04 patbk.Z
-rw-r--r-- 1 140 24009 Dec 5 23:04 makebk.p.Z
-rw-r--r-- 1 140 1378 Dec 5 23:04 loocat.p.Z
-rw-r--r-- 1 140 188 Dec 5 23:04 ex8in.Z
-rw-r--r-- 1 140 144 Dec 5 23:03 ex7in.Z
-rw-r--r-- 1 140 325 Dec 5 23:03 ex6in.Z
-rw-r--r-- 1 140 172 Dec 5 23:03 ex5in.Z
-rw-r--r-- 1 140 173 Dec 5 23:03 ex4in.Z
-rw-r--r-- 1 140 114 Dec 5 23:03 ex3in.Z
-rw-r--r-- 1 140 235 Dec 5 23:03 ex2in.Z
-rw-r--r-- 1 140 96 Dec 5 23:03 ex1in.Z
-rw-r--r-- 1 140 137765 Dec 5 23:03 delman.Z
-rw-r--r-- 1 140 49972 Dec 5 23:03 delila.p.Z
-rw-r--r-- 1 140 12186 Dec 5 23:03 dbbk.p.Z
-rw-r--r-- 1 140 27183 Dec 5 23:03 catal.p.Z
found delila ok
ncifcrf.gov:/pub/delila/
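
A note on the 'get' instruction from the post above: the arithmetic behind "get from 536 -24 to 536 +30;" can be sketched in a few lines of Python. This is only an illustration, not part of the Delila system. The helper names fragment_bounds and make_get are hypothetical, the pattern handles only the single form of 'get' shown in the post, and it assumes plain inclusive integer coordinates on a linear sequence; the real Delila language and its coordinate rules are described in the manual (delman) in the archive.

```python
import re

# Hypothetical helpers, not part of Delila. The pattern covers only the
# form quoted in the post: get from <coord> -<n> to <coord> +<m>;
GET_PATTERN = re.compile(
    r"get\s+from\s+(\d+)\s*([+-]\s*\d+)?\s+to\s+(\d+)\s*([+-]\s*\d+)?\s*;"
)


def _offset(text):
    """Turn an optional signed offset such as '-24' or '+30' into an int."""
    if text is None:
        return 0
    return int(text.replace(" ", ""))


def fragment_bounds(instruction):
    """Return (first, last) coordinates requested by one 'get' instruction.

    Assumption: coordinates are plain integers and both ends are inclusive.
    """
    match = GET_PATTERN.fullmatch(instruction.strip())
    if match is None:
        raise ValueError("unsupported instruction: " + instruction)
    start_base, start_off, end_base, end_off = match.groups()
    first = int(start_base) + _offset(start_off)
    last = int(end_base) + _offset(end_off)
    return first, last


def make_get(coordinate, before, after):
    """Write a 'get' instruction for a window around one coordinate."""
    return f"get from {coordinate} -{before} to {coordinate} +{after};"


if __name__ == "__main__":
    first, last = fragment_bounds("get from 536 -24 to 536 +30;")
    print(first, last, last - first + 1)
    # One instruction per site of interest, written out as lines of text.
    for site in (536, 1200, 4075):  # made-up coordinates
        print(make_get(site, 24, 30))
```

Under those assumptions the example from the post covers coordinates 512 through 566, a fragment of 55 bases. Generating one such line per site is the same idea, in miniature, as keeping many instructions in a file.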
toms@fcs260c2.ncifcrf.gov (Tom Schneider) (05/15/91)
Archive-name: bionet/molbio/delila/1991-05-09
Archive-directory: ncifcrf.gov:/pub/delila/ [129.43.1.11]
Original-posting-by: toms@fcs260c2.ncifcrf.gov (Tom Schneider)
Original-subject: Re: Software for automated subseqence extraction
Reposted-by: emv@msen.com (Edward Vielmetti, MSEN)

In article <12911@uhccux.uhcc.Hawaii.Edu> jlong@uhunix1.uhcc.Hawaii.Edu (John Long) writes:
>In article <1991May1.114219.25483@phri.nyu.edu> roy@phri.nyu.edu (Roy Smith) writes:
>>toms@fcs260c2.ncifcrf.gov (Tom Schneider) writes:
>>Her? Why is a librarian automatically assumed to be female?

>With a name like 'Delila' I think it's safe to assume that he/she/ye/it is a
>female. Maybe the creator named it after herself. Call it artistic license.
>BFD.

>Besides, doesn't it just make sense that software would be female and hardware
>be male?

I was designing a computer language with which one can extract portions of a DNA sequence. I needed a name, and one morning woke up and wrote down:

   DEoxyribonucleic acid LIbrary LAnguage
   DELILA

hence the name. See

@article{Schneider1982,
author = "T. D. Schneider and G. D. Stormo and J. S. Haemer and L. Gold",
title = "A design for computer nucleic-acid sequence storage, retrieval and manipulation",
journal = "Nucl. Acids Res.",
volume = "10",
pages = "3013-3024",
year = "1982"}

"She"'s available by anonymous ftp from ncifcrf.gov in pub/delila.

>Aloha,
>-LongJohn

  Tom Schneider
  National Cancer Institute
  Laboratory of Mathematical Biology
  Frederick, Maryland 21702-1201
  toms@ncifcrf.gov