wales@CS.UCLA.EDU (02/06/88)
I recently had an idea for a new kind of User Agent mail program. I'd like to see if I can develop it somehow into a research project that could form the basis for my Ph.D. dissertation. (The bare idea, all by itself, doesn't appear to be substantive enough.)

Although this topic doesn't specifically concern mail headers or transport protocols, I am including the "comp.mail.headers" newsgroup because I want to be particularly sure of reaching people on the Internet (via the HEADER-PEOPLE mailing list) who are familiar with non-UNIX computing environments.

First, some background.

Many existing systems (such as Berkeley Mail and MH) work along a "filing cabinet" model. That is, the user selects a "folder" of messages and can then examine the contents of that folder. (Whether the "folders" are implemented as single files, as in Berkeley Mail, or as directories, as in MH, is not really crucial to the concept.) Incoming mail goes into a special "in-box" folder, which is generally the folder the user selects by default. Mail can be kept in the folder, moved to another folder, or deleted entirely.

The "filing cabinet" model, I feel, starts to break down when one deals with large amounts of mail. The main problem is that there is usually no good way to locate a given piece of mail, unless the user can remember which folder he filed it in.

The idea I had was to take all of one's incoming mail and put it into an information retrieval system. Messages could then be searched for by any of a number of criteria (dates, addresses, subject, user-assigned keywords, etc.). The kinds of searches available would be limited only by the resources available to do the indexing and the searching.

This concept is similar in some respects to the "keyword-based news" system proposed by Brad Templeton some years ago (though I'm not saying that it is the same as Brad's idea).
Especially if one considers recent developments in personal filing systems (e.g., Hypercard), it seems like an "information-retrieval-model" mail management system should be feasible. When I proposed the idea to a graduate seminar here at UCLA a couple of weeks ago, one participant commented that my idea could probably be implemented in a couple of days using Hypercard or other similar tools (and would, as a result, not be very interesting as original research).

Yet, as far as I've been able to discover so far in the course of my reading, no one has done this.

Is anyone out there on the net aware of a mail system that does the kinds of things I am suggesting here? If in fact it hasn't been done, is there some major "show-stopper" problem that has kept it from being done? Although I wish the answer were simply that no one had ever thought of doing this kind of thing before me, I doubt this is the case.

I freely confess to a certain amount of "UNIX myopia", and would particularly like to hear about e-mail management tools that are radically different from those customarily used on most UNIX systems.

-- Rich Wales // UCLA Computer Science Department // +1 (213) 825-5683
3531 Boelter Hall // Los Angeles, California 90024-1596 // USA
wales@CS.UCLA.EDU ...!(ucbvax,rutgers)!ucla-cs!wales
"Sir, there is a multilegged creature crawling on your shoulder."
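
The retrieval idea in the post above (search by dates, addresses, subject, and user-assigned keywords) can be illustrated with a small inverted index. This is only a sketch under assumptions, not a description of any existing mailer: the class name `MailIndex`, its methods, and the choice of indexed fields are all hypothetical, and messages are assumed to carry RFC 822 style `From`, `Subject`, and `Date` headers.

```python
from collections import defaultdict
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime


class MailIndex:
    """Hypothetical toy index: (field, term) -> set of message ids."""

    def __init__(self):
        self.messages = {}                 # message id -> raw text
        self.index = defaultdict(set)      # (field, term) -> message ids

    def add(self, msg_id, raw_text, keywords=()):
        msg = message_from_string(raw_text)
        self.messages[msg_id] = raw_text
        # Index the sender's address, lowercased.
        sender = parseaddr(msg.get("From", ""))[1].lower()
        if sender:
            self.index[("from", sender)].add(msg_id)
        # Index each word of the subject line.
        for word in msg.get("Subject", "").lower().split():
            self.index[("subject", word)].add(msg_id)
        # Index the calendar day (YYYY-MM-DD) when the Date header parses.
        try:
            day = parsedate_to_datetime(msg["Date"]).date().isoformat()
            self.index[("date", day)].add(msg_id)
        except (TypeError, ValueError):
            pass
        # Index the user-assigned keywords.
        for kw in keywords:
            self.index[("keyword", kw.lower())].add(msg_id)

    def search(self, **criteria):
        """Return the ids of messages matching ALL given criteria."""
        sets = [self.index.get((field, value.lower()), set())
                for field, value in criteria.items()]
        return set.intersection(*sets) if sets else set(self.messages)
```

A query such as `idx.search(subject="mail", keyword="thesis")` intersects the per-field sets, so the user never has to remember where a message was filed. Body-text search and persistent storage are left out to keep the sketch short.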
blarson@skat.usc.edu (Bob Larson) (02/06/88)
In article <11120@shemp.UCLA.EDU> wales@CS.UCLA.EDU (Rich Wales) writes:
>The idea I had was to take all of one's incoming mail and put it into a
>information retrieval system. Messages could then be searched for by
>any of a number of criteria (dates, addresses, subject, user-assigned
>keywords, etc.). The kinds of searches available would be limited only
>by the resources available to do the indexing and the searching.

Tops-20 Mm allows message selection by all of the above-mentioned criteria plus some others (message body contents, unseen, flagged, etc.). It also supports the multiple folder model. It does not use a general-purpose database. There are at least two copies of Mm for Unix; the one I have used has major problems, and I consider it "almost usable". I also have an Mm subset for Primos.

>Is anyone out there on the net aware of a mail system that does the
>kinds of things I am suggesting here? If in fact it hasn't been done,
>is there some major "show-stopper" problem that has kept it from being
>done?

No major show-stopper that I know of. Implementation problems such as limited mailbox size do exist. (Startup is also slower on large mailboxes.) Most of the problems with keywords have been rediscovered by Mm users, although since the same person assigns and uses the keywords, they are less severe than in a general keyword system.

-- Bob Larson
Arpa: Blarson@Ecla.Usc.Edu blarson@skat.usc.edu
Uucp: {sdcrdcf,cit-vax}!oberon!skat!blarson
Prime mailing list: info-prime-request%fns1@ecla.usc.edu oberon!fns1!info-prime-request
sa@ttidca.TTI.COM (Steve Alter) (02/12/88)
In article <11120@shemp.UCLA.EDU> wales@CS.UCLA.EDU (Rich Wales) writes:
} ...
} Many existing systems (such as Berkeley Mail and MH) work along a "fil-
} ing cabinet" model. That is, the user selects a "folder" of messages
} and can then examine the contents of that folder.
} ...
}
} The "filing cabinet" model, I feel, starts to break down when one deals
} with large amounts of mail. The main problem is that there is usually
} no good way to locate a given piece of mail, unless the user can remem-
} ber which folder he filed it in.

Sorry to rain on your parade, Rich, but there is a widely available user-agent mailer that provides at least a portion of the facilities you describe.

Although this system does not automatically handle addresses and dates (as far as I know) in its "message-perusal" functions, it does handle user-defined keywords, and each message can be assigned more than one of them. The user can then peruse all messages that carry a selected keyword. This capability replaces and augments the concept of folders.

Furthermore, if you forget the exact form or spelling of a keyword that you chose months ago, this system provides auto-completion and choice listings just like the Tenex C shell (tcsh).

The disadvantage of this specific mailer is that it is quite large and won't fit on small machines such as PDP-11s.

I am referring to GNU Emacs and its "rmail" package.

In retrospect, it shouldn't be that difficult to augment the MH system to support "cross-filing" between folders, and use hard links so that different message numbers in different folders point to the same message. All it would really need is a clean specification for the command-line syntax. Hey RAND, are you listening?

-- Steve Alter
...!{csun,rdlvax,trwrb,psivax}!ttidca!alter or alter@tti.com
Citicorp/TTI, Santa Monica CA (213) 452-9191 x2541
gillies@uiucdcsp.cs.uiuc.edu (02/15/88)
Re: The problem of losing mail somewhere within a myriad of folders

I know of at least one mail system that solves this problem. It's called Babar, and runs at Xerox on Smalltalk machines. It has some pretty nice database functions, but mainly it provides hierarchical mail folders and the ability to enter a message under multiple categories.

Many Smalltalk users manage 10 megabytes of mail (2000+ messages) with no problem. They are running on Dorados, 68020-class workstations. Every message goes into one huge file, and folders are implemented as sets of pointers into this file. Therefore, the text of a message is only stored once, even if it appears in 25 categories.

There are some standard categories, like "deleteable", "sent-by-me", "everything", etc., that the system maintains. When you delete a message, it goes into the deleteable category. When you expunge, everything in this category is zapped. Whenever you send a message, a copy is saved in the sent-by-me category. The everything category references all the messages. In particular, you can do a text search through everything, and the results are stored in a new category. This makes it very easy to relocate a message containing a unique keyword.

The system has a built-in mail file scavenger. It also modifies the mail database using atomic actions. The atomic actions are specialized, so you can even mount a remote mail database on an IFS file server and access it transparently.

Don Gillies {ihnp4!uiucdcs!gillies} U of Illinois {gillies@p.cs.uiuc.edu}
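
The storage scheme described above (one big file, with folders as sets of pointers into it) can be sketched as follows. This is an illustration only, not Babar's actual code or data format: the class `MailStore`, its method names, and the use of `(offset, length)` pairs as pointers are assumptions, and an in-memory `bytearray` stands in for the single mail file.

```python
class MailStore:
    """Hypothetical sketch: one buffer of message text; categories hold pointers."""

    def __init__(self):
        self.data = bytearray()            # stands in for the one huge file
        self.categories = {"everything": set(), "deleteable": set()}

    def add(self, text, *categories):
        raw = text.encode("utf-8")
        pointer = (len(self.data), len(raw))   # (offset, length)
        self.data += raw                        # the text is stored once
        for name in ("everything", *categories):
            self.categories.setdefault(name, set()).add(pointer)
        return pointer

    def read(self, pointer):
        offset, length = pointer
        return self.data[offset:offset + length].decode("utf-8")

    def delete(self, pointer):
        # Deleting only files the message under "deleteable".
        self.categories["deleteable"].add(pointer)

    def search(self, word, result_category):
        # Scan "everything" and store the hits as a new category.
        hits = {p for p in self.categories["everything"]
                if word in self.read(p)}
        self.categories[result_category] = hits
        return hits

    def expunge(self):
        # Remove doomed pointers from every category. A real store would
        # also compact the file; this sketch leaves the bytes in place.
        doomed = set(self.categories["deleteable"])
        for members in self.categories.values():
            members -= doomed
```

Because a category is just a set of small tuples, filing one message under 25 categories costs 25 pointers rather than 25 copies of the text.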
marvit@hpcea.CE.HP.COM (Peter Marvit) (02/18/88)
> In retrospect, it shouldn't be that difficult to augment the MH system
> to support "cross-filing" between folders, and use hard-links so that
> different message-numbers in different folders point to the same
> message. All it would really need is a clean specification for the
> command-line syntax. Hey RAND, are you listening?

Well, I'm not RAND, but MH (6.5) already has this feature. In fact, the original poster may wish to consider using MH as a "back-end" to his system -- at least for the prototype stage. In production, he may wish to rewrite some parts of the system for efficiency's sake, since MH can be slow for some functions.

Back to the quoted posting above,

    refile <msg> +<folder1> +<folder2> +<folder3> -link

will hard link the <msg> into a number of folders simultaneously. The main problem is going the other way -- deciphering which folders a particular message belongs to.

-Peter Marvit
HP Labs
<marvit@hplabs.hp.com> or <{any biggie}!hplabs!marvit>
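
The "going the other way" problem mentioned above can be approached by brute force: since hard-linked messages share one underlying file, a scan of every folder can report which ones contain a link to a given message. The sketch below is an assumption-laden illustration, not part of MH: the function name `folders_containing` is hypothetical, and it assumes a flat mail directory with one subdirectory per folder (nested folders are not handled).

```python
import os


def folders_containing(msg_path, mail_root):
    """Return names of folders under mail_root that hold a hard link to msg_path."""
    found = []
    for folder in sorted(os.listdir(mail_root)):
        folder_path = os.path.join(mail_root, folder)
        if not os.path.isdir(folder_path):
            continue
        for entry in os.scandir(folder_path):
            # samefile() compares device and inode numbers, so it is true
            # for any hard link to the same message, whatever its number.
            if entry.is_file() and os.path.samefile(entry.path, msg_path):
                found.append(folder)
                break
    return found
```

The scan touches every message in every folder, so its cost grows with the size of the whole mail directory; that is the price of keeping the associations only in the file system.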
scott@tekcrl.TEK.COM (Scott Huddleston) (02/22/88)
>In retrospect, it shouldn't be that difficult to augment the MH system
>to support "cross-filing" between folders, and use hard-links ...

"refile -link ..." already does this. The Unix file system is hardly a candidate for serious database capabilities, however. Its limitations include:

a) keywords are limited to Unix filenames.
b) MH folders give records (mail msgs) per keyword, but not keywords per record (i.e., associations are only one-way).
c) hard links can't be made to other file-system partitions (ruling out netnews from this "database" mechanism).
d) the space and performance costs of using one inode per relation are substantial.