mnr@daisy.learning.cs.cmu.edu (Marc Ringuette) (12/08/90)
It would be extremely useful to have access to an archive of source code
for common AI problems. Such an archive could contain simple planners,
parsers, frame-based representations, and commonly used algorithms. This
would encourage sharing and discourage reinventing the wheel.

A second emphasis of such an archive could be as a research resource. It
could contain implementations of published work, experimental results and
challenge problems, and domains for testing (for instance) robot agents.
I would put a version of my Tileworld domain in such an archive, if I knew
of one.

Does such a repository exist? If not, I'm sure the AAAI would be willing
to sponsor such an effort. Do you think it would be worthwhile, and if so,
do you have any ideas for additional material it should contain?

Please share your comments.

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
\\\ Marc Ringuette \\\ Carnegie Mellon University, Comp. Sci. Dept. \\\
\\\ mnr@cs.cmu.edu \\\ Pittsburgh, PA 15213. Phone 412-268-3728(w)  \\\
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
forbus@ils.nwu.edu (Kenneth Forbus) (12/09/90)
AAAI is indeed sponsoring such an effort. There are two important purposes
for such a library.

First, as a field we have done a terrible job at record-keeping. Programs
die, due to bit-decay (i.e., the language they are written in evolving out
from under them) and due to their authors simply not keeping copies
around. The existence of Common Lisp makes bit-decay easier to prevent;
keeping copies around, however, should be made easier. In other fields,
not being able to easily duplicate one's experiments is considered very
shoddy. I'm told that in psychology, for example, some journals require
that authors maintain the data on which articles are based for at least X
years, where X varies with the journal.

The second purpose is communication and education. Programs are our main
experimental apparatus, and sharing programs can help us make faster
progress. How many times have you read about some interesting technique
and really wanted to try it on some example, but been stymied by the
effort it would take to re-implement the technique? Having a set of
well-developed, portable, well-documented programs, with examples, would
help overcome such problems.

Clearly, these two goals conflict: asking someone to produce a
high-quality, bullet-proof program before archiving it would simply mean
that few would archive their programs. So, the idea is that the Program
Library will have two kinds of programs:

1. Archival systems, such as thesis programs, which are being deposited
   purely for purposes of scientific replication and inspection.

2. "Vetted" systems, which have passed the inspection of an Editorial
   Board, to make sure they are adequately documented, run on the
   supplied examples, are reasonably portable, etc. Included with each
   system will be reviews of it.

We are still investigating the right way to handle the legalities, so
that AAAI doesn't get sued if someone misuses programs, or tries to
deposit their company's trade secrets.
The model we are looking at right now is the Free Software Foundation's
Copyleft. Access will be via anonymous ftp and other media; details are
still being worked out. While a lot has been settled, many things remain
open. Progress has been somewhat slowed by my recent move, but I hope to
have an initial version of the Library up and running by the middle of
next year. I'll be posting more details as soon as things are better
worked out. In the meantime, I'd be happy to hear any comments,
questions, or suggestions anyone has.

Ken Forbus
The Institute for the Learning Sciences
Northwestern University
1890 Maple Avenue
Evanston, IL 60201, USA

P.S. Please be forewarned that my email response time varies wildly with
my other duties, so patience may be required :-)
joshua@athertn.Atherton.COM (Flame Bait) (12/09/90)
mnr@daisy.learning.cs.cmu.edu (Marc Ringuette) writes:
>It would be extremely useful to have access to an archive of source code
>for common AI problems. Such an archive could contain simple planners,
>parsers, frame-based representations, and commonly used algorithms. This
>would encourage sharing and discourage reinventing the wheel.

Instead of a central archive, have a central index. That means that one
machine does not have to store all the source; it just needs to store an
index and "how to get" instructions for all the various sites listed in
the index.

This has been successfully done on rec.games.frp, where someone keeps a
list of electronic resources and how to use them. It is posted every
month, I think.

Joshua Levy    joshua@atherton.com    (408) 734-9822
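[Editor's sketch: the "central index" idea above amounts to a small table
of pointers plus retrieval instructions, rather than a store of sources.
A minimal illustration in modern terms; every entry, field name, site,
and path below is invented for the example, not a real resource.]

```python
# Hypothetical sketch of a "central index": each entry records where a
# program lives and how to retrieve it, instead of storing the source.
# All names, sites, and directories below are invented illustrations.

index = [
    {"name": "tileworld",
     "topic": "testbed domain for robot agents",
     "site": "ftp.example.edu",
     "how_to_get": "anonymous ftp, directory pub/ai/tileworld"},
    {"name": "chart-parser",
     "topic": "natural language parsing tool",
     "site": "ai.example.ac.uk",
     "how_to_get": "email archive-server: send 'get chart-parser'"},
]

def lookup(keyword):
    """Return the entries whose name or topic mentions the keyword."""
    kw = keyword.lower()
    return [e for e in index if kw in e["name"] or kw in e["topic"]]

for entry in lookup("robot"):
    print(f'{entry["name"]}: {entry["how_to_get"]} (site {entry["site"]})')
```

The point of the design is exactly as stated above: the indexing machine
stores only a few lines per program, and the sources stay at the
contributing sites.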
theo@cs.fau.edu (Theo Heavey) (12/09/90)
> >It would be extremely useful to have access to an archive of source code
> >for common AI problems. Such an archive could contain simple planners,
> >parsers, frame-based representations, and commonly used algorithms. This
> >would encourage sharing and discourage reinventing the wheel.
>
> Instead of a central archive, have a central index. That means that
> one machine does not have to store all the source, etc. It just needs
> to store an index and "how to get" instructions from all the various
> sites listed in the index.
>
> This has been successfully done on rec.games.frp, where someone keeps a
> list of electronic resources and how to use them. It is posted every month,

Wouldn't it be more efficient to keep this "index" at an anonymous ftp
site? This would reduce the number of repostings of the entire index. If
the moderator (if there is one) on rec.games.frp just listed the new
sites for information via anonymous ftp as they are introduced, I think
it would be a lot more helpful.

Theo Heavey
Florida Atlantic University
Dept. of Computer Science
Boca Raton, FL
Internet: theo@cs.fau.edu
pat@cs.strath.ac.uk (Pat Prosser) (12/13/90)
In article <11331@pt.cs.cmu.edu> mnr@daisy.learning.cs.cmu.edu (Marc
Ringuette) writes:
>It would be extremely useful to have access to an archive of source code
>for common AI problems. Such an archive could contain simple planners,
>parsers, frame-based representations, and commonly used algorithms. This
>would encourage sharing and discourage reinventing the wheel.

Extremely.

>Do you have any ideas for additional material it should contain?

Recently there was such a request posted to this group for public domain
(PD) algorithms for the constraint satisfaction problem (CSP). I would
like to see this, along with a set of standard CSP problems. Also, for
those among us who are interested in scheduling, a number of scheduling
problems could be made available. These algorithms/problems would not
only discourage reinvention, but would also allow us to compare
algorithms on given data sets. This would be progress!
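[Editor's sketch: as one concrete example of the kind of "commonly used
algorithm" such an archive might carry, here is a minimal chronological-
backtracking solver for binary CSPs. It is an illustrative sketch, not
any particular published implementation; the 2-colour path-graph instance
at the bottom is an invented toy problem.]

```python
# Minimal chronological-backtracking solver for binary CSPs -- a sketch
# of a "commonly used algorithm" an archive might hold, not a particular
# published system.

def consistent(var, value, assignment, constraints):
    """Check value against every binary constraint touching var."""
    for (x, y), pred in constraints.items():
        if x == var and y in assignment and not pred(value, assignment[y]):
            return False
        if y == var and x in assignment and not pred(assignment[x], value):
            return False
    return True

def backtrack(variables, domains, constraints, assignment=None):
    """Depth-first search with chronological backtracking."""
    if assignment is None:
        assignment = {}
    if len(assignment) == len(variables):
        return dict(assignment)
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        if consistent(var, value, assignment, constraints):
            assignment[var] = value
            result = backtrack(variables, domains, constraints, assignment)
            if result is not None:
                return result
            del assignment[var]   # undo and try the next value
    return None                   # no value works: backtrack

# Toy instance: colour a 3-node path graph A-B-C with 2 colours.
ne = lambda a, b: a != b
solution = backtrack(["A", "B", "C"],
                     {v: ["red", "green"] for v in ["A", "B", "C"]},
                     {("A", "B"): ne, ("B", "C"): ne})
print(solution)
```

A shared archive of such solvers, plus standard problem instances, is
exactly what would make the comparisons Pat asks for possible.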
POPX@vax.oxford.ac.uk (Jocelyn Paine) (12/14/90)
In article <11331@pt.cs.cmu.edu> mnr@daisy.learning.cs.cmu.edu (Marc
Ringuette) writes:
>It would be extremely useful to have access to an archive of source code
>for common AI problems. Such an archive could contain simple planners,
>parsers, frame-based representations, and commonly used algorithms. This
>would encourage sharing and discourage reinventing the wheel.

In 1987 I set up such a library for Prolog, for the very reasons you
describe. I'd be willing to extend it to cover AI software in general. In
fact, why don't I decide to do so now? So here's what I have to offer:

I teach AI to psychology undergraduates, using Pop-11 and Prolog (the
course used to be entirely Prolog, but I'm moving towards Pop-11). During
the course, I talk about topics like scripts, mathematical creativity,
planning, natural language analysis, and expert systems; I exemplify them
by mentioning well-known programs like GPS, Sam, and AM. I hope before
too long (May 1991) to integrate these into a computer-simulated animal.

I would like my students to be able to run existing AI programs, from GPS
to Mycin and up, and to investigate their mechanisms and limitations. For
students to incorporate into their own programs, I'd also like to provide
a library of tools such as chart parsers, inference engines, search
routines, and planners.

Unfortunately, published descriptions of the famous programs give much
less information than is necessary to re-implement them (it would be
easier to re-implement cold fusion than the average AI program). As for
the tools: some are reproduced in textbooks. But the published code has
to be kept small to satisfy publishers, and it is often not available in
machine-readable form.
I therefore decided in 1987 to start up a library of Prolog code. I shall
now extend it to cover any AI language in which people want to send
programs.

Sending contributions.
----------------------
Please E-mail them to user POPX at Janet address UK.AC.OX.VAX (the
Vax-Cluster at Oxford University Computing Service). Only send text, not
object or binary files (I will not accept programs in any form other than
source text). If a file occupies more than a megabyte, please E-mail me
about it first, but don't send the big file itself until I reply to
request it. This will avoid the problem we sometimes have where our
mailer rejects big files because there isn't room for them.

I accept all entries on the understanding that they will be distributed
to anyone who asks for them. I intend that the contents of the library be
treated in the same way as proofs in the maths literature and algorithms
in computer science textbooks: publicly available ideas which everyone
can experiment with, criticise, and improve. I'll try to put entries into
the library within two weeks of arrival, and to test those entries for
which I have a suitable language implementation.

Catalogue.
----------
I keep a catalogue of entries. It contains, for each entry: the name and
geographical address of the entry's contributor (to prevent contributors
receiving unwanted E-mail, I don't include their E-mail addresses unless
they ask me to); a description of the entry, usually with examples of
use; and an approximate size in kilobytes (to help those whose mailers
can't receive large files easily). For those entries which I can run, I
also include my evaluations of ease of use, portability, standardness,
and documentation.

Quality of entries.
-------------------
Any contribution may be useful to someone out there, so I'll accept
anything. I'm not just looking for elegant code and declarative
respectability.
However, it would be nice if entries were adequately documented (with
literature references if appropriate, plus respectable documentation for
both users and programmers).

Requesting entries.
-------------------
I prefer to send by E-mail, and can do so into any network that's
connected cost-free to the UK academic network Janet. I can also send
files as DOS text on IBM-PC discs, or on VAX tapes. In this case, I will
ask you to send either media or payment for media in advance. We hope
eventually to get a mail server running.

You may request the catalogue, or a particular entry in it, or (for
example) "all the expert system shells written in LISP you have". I'll
try to answer all requests within two weeks. If you get no reply, please
send a message by paper mail to my address. Give full details of where
your E-mail was sent from, the time, etc.; this may help us trace lost
messages.

Jocelyn Paine,
Experimental Psychology,
South Parks Road,
Oxford OX1 3UD.
POPX @ UK.AC.OX.VAX
joshua@athertn.Atherton.COM (Flame Bait) (12/14/90)
This is the current context:

>> >It would be extremely useful to have access to an archive of source code
>> >for common AI problems. Such an archive could contain simple planners,
>> >parsers, frame-based representations, and commonly used algorithms. This
>> >would encourage sharing and discourage reinventing the wheel.
>>
>> Instead of a central archive, have a central index. That means that
>> one machine does not have to store all the source, etc. It just needs
>> to store an index and "how to get" instructions from all the various
>> sites listed in the index.
>>
>> This has been successfully done on rec.games.frp, where someone keeps a
>> list of electronic resources and how to use them. It is posted every month,

To which theo@cs.fau.edu (Theo Heavey) replied:

>Wouldn't it be more efficient to keep this "index" at an anon ftp site?
>This would reduce the amount of repostings of the entire "index".
>If the moderator (if there is one) on rec.games.frp just listed the
>new sites for information via anon ftp as they are introduced I think
>it would be a lot more helpful.

In theory you're right, but in practice, posting the list is better. The
index can be posted automatically every month, so no one has to worry
about it. Also, the people most likely to use it are new to the
newsgroup, and may not even know of the index's existence. Posting it
regularly also serves to remind the "old timers" of its existence and of
the various sources of sources.

At the minimum, you should post the location of the index every month (or
every two weeks). If the index is not posted regularly, then it should be
available via an email archive-server (not just FTP). Remember, most of
the people who get newsgroups cannot FTP things. They have only UUCP
connections to the net, and can use only email.

Joshua Levy (joshua@atherton.com)
reece@enuxha.eas.asu.edu (Glen A. Reece) (12/15/90)
In article <5282@baird.cs.strath.ac.uk>, pat@cs.strath.ac.uk (Pat
Prosser) writes:
> Recently there was such a request posted to this group for
> public domain (PD) algorithms for the constraint satisfaction
> problem. I would like to see this, along with a set of standard
> csp problems. Also, for those among us that are interested in scheduling,
> a number of scheduling problems could be made available.
> These algorithms/problems would not only discourage reinvention, but
> would also allow us to compare algorithms on given data sets. This would
> be progress!

This is an excellent idea, and it seems that many people are interested
in doing such a thing (e.g., Ken Forbus). I would like to second Pat's
call for including scheduling problems and algorithms/techniques. I'm
currently working in the area of job shop scheduling for my thesis in AI,
and I'm running into the very problem of reinventing work that I know for
a fact was done in the past. In fact, I'm working with Karl Kempf from
the Intel AI Lab in Santa Clara, California, and his position is that the
results of the work must be made available so people don't keep bumping
their heads against the same walls.

- Glen

------------------------------------------------------------------------
= Glen A. Reece            = Arizona State University                  =
= Industrial Fellow        = Artificial Intelligence Lab.              =
=                          = Dept. of Computer Science & Engineering   =
= (602) 965-2735           = Tempe, Arizona 85287-5406                 =
= reece@enuxha.eas.asu.edu = What's another word for Thesaurus?        =
------------------------------------------------------------------------
dmocsny@minerva.che.uc.edu (Daniel Mocsny) (12/25/90)
In article <1933@enuxha.eas.asu.edu> reece@enuxha.eas.asu.edu (Glen A.
Reece) writes:
>[...] I'm
>currently working in the area of job shop scheduling for my thesis in
>AI and I'm running into the very problem of reinventing work that I
>know for a fact was done in the past. In fact, I'm working with
>Karl Kempf from the Intel AI Lab in Santa Clara, California, and his
>position is that the results of the work must be made available so
>people don't keep bumping their heads against the same walls.

This is a problem endemic to most areas of science and engineering.
Science and engineering advance only when communities of investigators
share their findings with each other and build on the results of previous
work.

The traditional vehicle for sharing findings is, of course, the printed
literature. This vehicle was adequate in earlier times, when most
scientists and engineers worked on comparatively simple problems. When
your results consisted of a few concise equations, maybe a few plots and
nomographs, or a manageable table of data, your paper was a complete
summary of your work. Any of your peers with comparable skills could read
your paper and immediately begin building on your results.

Today, computer technology has enabled scientists and engineers to embark
on complex research that tends to defy concise verbal explanation. Most
significant results today can't be *functionally* expressed in words,
since their real expression is now in computer code. That doesn't render
words obsolete---we still need those high-level descriptions to organize
our approach to the low-level details. However, merely reading a
high-level description no longer enables the reader to reproduce the
original results, nor to build on them productively and efficiently.

The traditional literature is now faltering in its mission as a vehicle
for sharing ideas. Technical readers once expected to read a paper and
find something immediately useful.
Today, many technical papers read more like advertisements, their
functional content emasculated, and the reader no more capable after
finishing the paper than before. This sad trend appears to be the result
of two forces: (1) traditionalism, and (2) hucksterism.

Historically, science developed as a hobby of the idle rich. Scientific
results also tended to be too simple to have much commercial potential.
Since hucksterism was not a necessary or practical choice for scientists
most of the time, they had the luxury of establishing a rather lofty
tradition of excluding it. However, scientists still had a scarce
commodity to ration---peer recognition. Instead of competing
economically, they competed on the basis of the quantity and quality of
their contributions to the literature. However, the exact nature of those
"contributions" became intimately entwined with the particular
technological basis for that literature: the printing press. When science
was simple, this was not a problem. Today, science is no longer simple,
but the definition of "contribution" still follows from the technology of
Gutenberg.

The massive expansion of science and technology after the Second World
War overwhelmed the breeding capacity of the idle rich. The only way to
sustain such expansion has been to recruit people from the middle class,
by turning science and technology into a set of professions. For most
scientists and engineers, peer recognition is more than something to feel
good about while relaxing in the den. It is the key to sustaining and
advancing careers.

At the same time, scientific results have become much more complex and
economically valuable. The scientist today, upon discovering something
useful, must consider its commercial potential before reporting it. A
useful result can now become the basis for a major new industry in just a
few years. This is a profound temptation for a salaried employee.

What is the answer? I don't know.
Reward systems and productivity in science and technology today need some
serious investigation. Scientists and engineers need a vehicle for
publishing their *complete* results, not just advertisements about their
results. They also need incentives for doing so. We need some sort of
"productivity index" to attach to scientific publications (of all types):
does the publication increase the capability of the reader in any
measurable way?

--
Dan Mocsny
Dept. of Chemical Engng. M.L. 171    Internet: dmocsny@minerva.che.uc.edu
University of Cincinnati                       dmocsny@uceng.uc.edu
Cincinnati, Ohio 45221-0171
513/751-6824 (home)  513/556-2007 (lab)