sarge@metapsy.UUCP (Sarge Gerbode) (10/31/88)
In article <5790@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes:
>Let's see, if it takes about two minutes to scan and convert a page,
>and the average book has 250 pages, then that's 500 minutes or over 8
>hours per book -- let's say ten hours to be conservative. So it would
>take 6710 hours or about three and a third work years to scan in 671
>books. And I think my two minutes a page estimate may be optimistic,
>not to mention extra costs for indexing and mastering. Not a basement
>project, I'm afraid.

Well, this must be somewhat ameliorated by the fact that many publishers surely have most or all of their books in electronic form; and there are fairly decent full-text retrieval and indexing programs that would make a normal index obsolete. One in particular is a product called "Elexir", from ThirdEye Software in Palo Alto, that is currently in alpha testing, which allows one to do context searches and a variety of other actions that an index alone cannot accomplish.
--
--------------------
Sarge Gerbode -- UUCP: pyramid!thirdi!metapsy!sarge
Institute for Research in Metapsychology
950 Guinda St.  Palo Alto, CA 94301
bill@bilver.UUCP (Bill Vermillion) (10/31/88)
In article <5790@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes:
>In article <3447@pt.cs.cmu.edu> ns@cat.cmu.edu (Nicholas Spies) writes:
>>In article <5772@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes:
>>>...
>>>And according to this estimate, a Next disk will hold 671 books at 256M.
>>
>>At $40/book that's $26,840.00 + $50.00 for the disc itself. Just the
>>author's royalties, figured at 15%, would make the disc cost $4,026 (after
>>all, why should the authors take a loss?). Therein lies the problem of very
>>dense media.
> .... stuff deleted ....
>entirely. Even with public domain books, the costs of scanning and
>character-recognizing are pretty large.
> ....... ........
>Let's see, if it takes about two minutes to scan and convert a page,
>and the average book has 250 pages, then that's 500 minutes or over 8
>hours per book -- let's say ten hours to be conservative. So it would
>take 6710 hours or about three and a third work years to scan in 671
>books. And I think my two minutes a page estimate may be optimistic,
>not to mention extra costs for indexing and mastering. Not a basement
>project, I'm afraid.

Let's take a step back and look at this again. If a book is on disk we don't necessarily need to be able to read it on a character basis. The idea is to be able to READ Shakespeare, not to re-edit, re-create, re-print, etc. I would suspect that it would be a bit difficult to get publishers to agree to that form of distribution.

However - if we go to image storage we can still see the book on the screen, we could have images from the book, and we would be able to search through the book (providing it was indexed - more in a later paragraph). We would be able to do almost anything except re-edit, re-(etc.)....

So from 8 hours per book at 2 minutes per page, we can go to 12.5 minutes per book at 3 seconds per page. Now before you say that can't be done - let me tell you I saw it. I forget the company that makes it, but the system was a document storage and retrieval system using high speed scanners, fast photo-copy type printers, and 12" laser disk media. One of the options was a 12" video-disc jukebox. I don't recall the exact capacity, but it was large.

Let's just map this onto existing video technology. In CAV mode a 12" disk can store approximately 55,000 frames per side. When these disks are used for data they hold about 1.2 gigabytes. That is about 4.7 times more than the 256 meg disk, which means we should be able to get about 11,750 (rounded) pages per 256 meg disk, or 47 books per disk. Media cost then is approximately $1.00 per book, which puts it just above paperback printing costs but below hard-bound. And I would estimate it would cost you under $2.00 to ship the disk first class, as opposed to $$$$ to ship 47 books that way.

The document storage/retrieval system also had software so that you would index the document as you stored it. Then anytime you needed the document you would go to the index and get it. On a large jukebox that could take 20 to 30 seconds to find the disk, place it, search, and then display. But on a large jukebox that was finding 1 document out of FIVE MILLION. Then at a touch of a button you had a full hard copy of the original, and the company had information on the legal acceptability of such documents. Quite impressive.

So instead of 671 books taking 3 years, we get 50 books taking 10 hours. This seems a more reasonable route.

An aside - that relates to the above.
Before Sony and Philips cross-licensed their CD technology, Sony had developed a "digital audio disk". They could see no market for the disk. Why? Well, they had this disk, 12" in diameter, and they could not conceive of being able to market a record that played for 20 HOURS per side. Philips had a 4" (approx.) disk. Playing time was under 1 hour. One of the favorite works of a Sony exec was 73 minutes long, so the disk was designed for that. That is where the 12 cm disk came from.

It is probably better to waste space and have a marketable item than to achieve maximum capability and have no market at all. Who - except a library - would want 671 books on one disk? And what about accessibility to the 670 other books when someone has the disk out to read 1 volume?
--
Bill Vermillion - UUCP: {uiucuxc,hoptoad,petsd}!peora!rtmvax!bilver!bill
                      : bill@bilver.UUCP
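A quick back-of-the-envelope check of the page-image arithmetic above, as a minimal Python sketch. Every constant is taken from the figures quoted in the post (55,000 CAV frames per side, roughly 1.2 GB per 12" data disc, a 256 MB target disc, 250 pages per book, $50 per disc, 3 seconds per scanned page); none of it comes from a datasheet, so treat the results as rough estimates only.

# Back-of-the-envelope check of the page-image numbers quoted above.
# All constants come from the post itself, so the results are rough estimates.

FRAMES_PER_SIDE = 55_000          # one page image per CAV frame
VIDEODISC_BYTES = 1.2e9           # ~1.2 GB when the 12" disc is used for data
TARGET_DISC_BYTES = 256e6         # the 256 MB disc under discussion
PAGES_PER_BOOK = 250
DISC_PRICE = 50.0                 # dollars
SECONDS_PER_PAGE_SCAN = 3

ratio = VIDEODISC_BYTES / TARGET_DISC_BYTES          # ~4.7x
pages_per_disc = FRAMES_PER_SIDE / ratio             # ~11,700 page images
books_per_disc = pages_per_disc / PAGES_PER_BOOK     # ~47 books
cost_per_book = DISC_PRICE / books_per_disc          # ~$1.07 media cost
minutes_per_book = PAGES_PER_BOOK * SECONDS_PER_PAGE_SCAN / 60   # 12.5 min

print(f"capacity ratio:  {ratio:.1f}x")
print(f"pages per disc:  {pages_per_disc:,.0f}")
print(f"books per disc:  {books_per_disc:.0f}")
print(f"media cost/book: ${cost_per_book:.2f}")
print(f"scan time/book:  {minutes_per_book:.1f} minutes")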
postmaster@mailcom.FIDONET.ORG (Bernard Aboba) (10/31/88)
Not to mention the copyright problems, which many publishing firms have already concluded to be insurmountable.

It is important to keep in mind that "higher technology" does not necessarily imply "higher profit." In fact, it can be argued that the single largest force pushing the adoption of high technology is the desire to remain competitive -- i.e. if I don't develop the technology, someone else will. This force is NOT operative in publishing -- if I own the copyright to information X, I'm the only one who can publish it, in whatever medium. At that point the question becomes "which medium will generate the most profit?" The answer, most assuredly, is NOT optical or CD-ROM, and may NEVER be.

Right now, the cost of time and materials for copying a $40 textbook of, say, 500 pages makes the project barely worthwhile at $0.05 per page. However, the economics of ripping off an entire Encyclopaedia Britannica or two are much better if the encyclopedia is on an optical disk. Plus, the deed could be done in a fraction of the time. Is it so strange for publishers to conclude that the major beneficiary of optical publishing would be pirates? The record industry has already concluded the same thing, which is why they have vehemently opposed DAT.

My own guess is that floptical drives may well sound the death knell not only for CD-ROM, but for much of the optical publishing industry, which right now exists almost exclusively to serve vertical markets. In these markets, where you sell a few copies at high prices, piracy has devastating effects. Imagine the damage that could be done if, say, a volume of legal references were copied by virtually every student at a law school, who then took the pirated copies with them into their practices. You'd not only kill immediate sales, but sales of the product down the line.

The advent of erasable optical media therefore shifts development away from REFERENCE materials such as encyclopedias, to information with a TIME VALUE, such as stock price data.
--
------------------------------------------------------------------------------
FidoNet: 1:204/444  UUCP: ...!sun!sunncal!mailcom!bernard
INTERNET: f444.n204.z1.Fidonet.org
US MAIL: Bernard Aboba, 101 First St. #224, Los Altos, CA 94022
tim@hoptoad.uucp (Tim Maroney) (11/01/88)
In article <5790@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes:
>And I think my two minutes a page estimate may be optimistic,
>not to mention extra costs for indexing and mastering.

In article <557@metapsy.UUCP> sarge@metapsy.UUCP (Sarge Gerbode) writes:
>There are fairly decent full-text retrieval and indexing programs
>that would make a normal index obsolete.

I was referring to an automatically generated inverted index, not an ordinary book index, which would be silly on a high-density optical medium. It would still require human checking in any case, just as optical character recognition does, so the time would be noticeable.

Because of the slow seeks and large amounts of data, it is necessary to set up an index on an optical read-only medium at publication time; run-time search algorithms are way too slow.
--
Tim Maroney, Consultant, Eclectic Software, sun!hoptoad!tim
"What's bad? What's the use of turning? In Hell I'll be there a-burning!
Meanwhile, think of what I'm earning! All on account of my name."
    - Bill Sykes, "Oliver"
tim@hoptoad.uucp (Tim Maroney) (11/01/88)
In article <282@bilver.UUCP> bill@bilver.UUCP (Bill Vermillion) writes:
>If a book is on disk we don't
>necessarily need to be able to read it on a character basis. The idea is to
>be able to READ Shakespeare, not to re-edit, re-create, re-print, etc.

Wrong. The idea is to be able to read Shakespeare, to copy and paste relevant sections for critical essays, to print sections for reading at leisure when away from the computer, to do word-frequency analyses, to follow cross-reference chains among related keywords and topics, and so on. Computers are a terrible medium for leisure reading -- less text shows on a screen than on a printed page, and the screen luminescence leads to eye fatigue, not to mention the lack of physical portability. If all you can do is read, what you have is far worse than a printed book. And I have yet to see a stage show where the director didn't do some editing of the script!

>However - if we go to image storage we can still see the book on the screen,
>we could have images from the book, we would be able to search through the
>book (providing it was indexed - more in a later paragraph), we would be able
>to do almost anything except re-edit, re-(etc.)....

Almost anything; except everything you would expect to be able to do with computer text, such as copy and paste it, do keyword searches, etc. You'd be able to read it and print it out. What an awesome improvement over the printed page.

>So from 8 hours per book at 2 minutes per page, we can go to 12.5 minutes per
>book at 3 seconds per page.

3 seconds a page? Is that using clairvoyance or what? Visualize the process of positioning a book on a flat-bed scanner for a moment. It takes anywhere from five to twenty seconds. Now add the scanning time, which is at the minimum 3 seconds a page.

>Now before you say that can't be done - let me tell you I saw it. I forget
>the company that makes it, but the system was a document storage and retrieval
>system using high speed scanners, fast photo-copy type printers, and 12" laser
>disk media. One of the options was a 12" video-disc jukebox. I don't recall the
>exact capacity, but it was large.

Perhaps you're referring to the Wang system that has gotten so much publicity. I don't see how it is well suited to mass distribution of books; it is meant for keeping copies of receipts and so forth.

>The document storage/retrieval system also had software so that you would
>index the document as you stored it. Then anytime you needed the document you
>would go to the index and get it. On a large juke-box that could take 20 to
>30 seconds to find the disk, place it, search and then display. But on a
>large juke-box that was finding 1 document out of FIVE MILLION.

That's a great approach for receipts. For books, you're talking at least two extra minutes per page, with a high error rate and an extremely inconvenient interface requiring that you "lasso" the words being indexed. You also have to type them out.

>Then at a touch of a button you had a full hard copy of the original, and the
>company had information on the legal acceptability of such documents. Quite
>impressive.

And quite irrelevant.

>So instead of 671 books taking 3 years, we get 50 books taking 10 hours.
>This seems a more reasonable route.

How about a trillion books for no money at all? That's much more attractive. Coming soon to your Isuzu dealer.
--
Tim Maroney, Consultant, Eclectic Software, sun!hoptoad!tim
"Because there is something in you that I respect, and that makes me desire to have you for my enemy."
"Thats well said. On those terms, sir, I will accept your enmity or any man's." - Shaw, "The Devil's Disciple"
sarge@metapsy.UUCP (Sarge Gerbode) (11/01/88)
In article <5799@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes:
>>There are fairly decent full-text retrieval and indexing programs
>>that would make a normal index obsolete.
>
>I was referring to an automatically generated inverted index, not an
>ordinary book index, which would be silly on a high-density optical
>medium. It would still require human checking in any case, just as
>optical character recognition does, so the time would be noticeable.
>
>Because of the slow seeks and large amounts of data, it is necessary
>to set up an index on an optical read-only medium at publication time;
>run-time search algorithms are way too slow.

I'm really out of my depth on this topic, but I believe one can improve considerably on a mere inverted index. Furthermore, all the indexing could be resident on the disk (estimated at about 1/3 to 1/2 the space of the text itself), and one would not have to *create* the index at run time, merely *use* it, a process which would take very little time (less than a second, probably, for a fairly hefty search).
--
--------------------
Sarge Gerbode -- UUCP: pyramid!thirdi!metapsy!sarge
Institute for Research in Metapsychology
950 Guinda St.  Palo Alto, CA 94301
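To make the inverted-index idea concrete, here is a minimal Python sketch of the general technique (it is not a description of Elexir or of any other product): the index is built once, at publication or mastering time, and consulting it afterward is a dictionary probe rather than a scan of the full text.

from collections import defaultdict

def build_inverted_index(documents):
    """Build a word -> list of (doc_id, position) postings map.

    In the scenario discussed above, this step would be done once at
    publication time and the resulting index written onto the disc
    alongside the text."""
    index = defaultdict(list)
    for doc_id, text in enumerate(documents):
        for pos, word in enumerate(text.lower().split()):
            index[word].append((doc_id, pos))
    return dict(index)

def lookup(index, word):
    """Run-time use of the prebuilt index: a table probe, not a text scan."""
    return index.get(word.lower(), [])

if __name__ == "__main__":
    books = [
        "to be or not to be that is the question",
        "now is the winter of our discontent",
    ]
    idx = build_inverted_index(books)
    print(lookup(idx, "is"))   # [(0, 7), (1, 1)]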
nujohnso@ndsuvax.UUCP (Ceej) (11/01/88)
In article <5790@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes: > >Let's see, [...] > So it would >take 6710 hours or about three and a third work years to scan in 671 >books. And I think my two minutes a page estimate may be optimistic, >not to mention extra costs for indexing and mastering. I would say that if you automated the process, it would cut that time down to around 2500 hours. By automating, I mean setting the process up so that the pages are fed into the process continually and ~24 hours a day. Note that this estimate is certainly not conservative, and the time required to set up this system is not included. Actual requirements may vary. Please consult your CD-ROM handbook for details. -- nujohnso@ndsuvax.bitnet nujohnso@plains.NoDak.edu ...!uunet!ndsuvax!nujohnso i want a shoehorn with teeth
olsen@XN.LL.MIT.EDU (Jim Olsen) (11/02/88)
In article <300.236DAA95@mailcom.FIDONET.ORG>, Bernard Aboba writes: >Not to mention the copyright problems, which many publishing firms have >already concluded to be insurmountable. But there is a wealth of important works in the public domain, such as government documents and works of pre-20th-century authors. While one would still have to recover scanning costs, these are small compared to the costs of producing an original work, and will continue to decrease. Much of the more recent material is already in digital form. >Imagine the damage that could be done, say if a volume of legal >references were copied by virtually every student at a law school, who >then took the pirated copies with them into their practices? Imagine the value to those law students of having, for modest cost, the entire United States Code, Code of Federal Regulations, or United States Reports (Supreme Court decisions) in their shirt pockets!
dmocsny@uceng.UC.EDU (daniel mocsny) (11/02/88)
In article <300.236DAA95@mailcom.FIDONET.ORG>, postmaster@mailcom.FIDONET.ORG (Bernard Aboba) writes:
> Not to mention the copyright problems, which many publishing firms have
> already concluded to be insurmountable.

As electronic publishing methods mature and provide convenience and capability far beyond printed media, we find our concepts of intellectual property preventing us from taking advantage of these benefits. My main quarrel is not with the publishers of the Britannica, but with the firms that profit from the sale and distribution of scholarly journals. The authors of these works do not usually derive any royalty from them. Furthermore, most of the work is publicly funded, and the authors want to obtain the widest possible exposure.

The system we have now, that of relying on private companies to typeset, print, and disseminate the journals, has worked well enough in the past. However, these companies exist to serve the technical community, and not vice versa. If electronic publishing can help the members of the technical community share results with each other more effectively, then we must remove legal barriers that interfere with it.

That does not have to mean bankruptcy for the technical publishers. If they took the lead in organizing the infrastructure for electronic dissemination of the research literature, they could provide better service for the same price that the average institution pays now for its journal subscriptions. Their costs would be lower and their profits higher. Instead, they will probably sit on the fence and continue to render our information less available via paper, until we take matters into our own hands, adopt markup-language standards, and distribute our own literature free of charge over our own networks.

Mass-market publishers have a different sort of problem, because they do not serve a community of peers. I.e., a real distinction exists between producers (writers) and consumers. Since the writers are profit-motivated, they need paper to defend their intellectual property rights. At some point, however, the utility of printed information must become so much lower than the utility of electronic information that paper will lose its advantage.

Dan Mocsny
cramer@optilink.UUCP (Clayton Cramer) (11/02/88)
In article <5800@hoptoad.uucp>, tim@hoptoad.uucp (Tim Maroney) writes:
> In article <282@bilver.UUCP> bill@bilver.UUCP (Bill Vermillion) writes:
> Wrong. The idea is to be able to read Shakespeare, to copy and paste
> relevant sections for critical essays, to print sections for reading at
> leisure when away from the computer, to do word-frequency analyses, to
> follow cross-reference chains among related keywords and topics, and so
> on. Computers are a terrible medium for leisure reading -- less text
> shows on a screen than on a printed page, and the screen luminescence
> leads to eye fatigue, not to mention the lack of physical portability.
> If all you can do is read, what you have is far worse than a printed book.

Isaac Asimov wrote a marvelous parody of _The_Double_Helix_ about these wild, womanizing scientists at Oxford, a century or two from now, reinventing the book for exactly these reasons.

If you doubt it, consider how many people curl up with a good machine-readable book and a computer at the end of a long, busy day. Or the number of people who bring along a laptop to sit in an open field and read for the pleasure of it. Anyone who wants to spend more time reading in front of a computer, instead of a printed page, isn't working hard enough!
--
Clayton E. Cramer ..!ames!pyramid!kontron!optilin!cramer
bzs@encore.com (Barry Shein) (11/02/88)
On-line publishing will require new economics and new ways of doing business; the old ways may have to wither. What will drive it is watching competitors make profits. If they don't, then it was a failure; if they do, one will change one's way of looking at one's own business and adapt (or die).

Current "problems" in the economics of paper publishing cannot be viewed as insurmountable obstacles to on-line publishing, only as just what they are: the old order. I have little doubt someone suggested automobiles would never catch on due to the large investments buggy builders had in horse farms.

-Barry Shein, ||Encore||
geb@cadre.dsl.PITTSBURGH.EDU (Gordon E. Banks) (11/02/88)
In article <5790@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes:
>
>Yep. All I was talking about was how many would fit. Whether it could
>ever be economically feasible to publish such a disk is another matter
>entirely. Even with public domain books, the costs of scanning and
>character-recognizing are pretty large.

Not really. You can estimate it by the cost of getting books from University Microfilms. They have to photocopy each page. A normal-sized book is around $50. This covers retrieving the book from whatever library has it, and the labor of copying it. Of course, they expect to sell more copies of the microfilm later, but this would apply in spades to optical disk versions. OCR programs will soon be sophisticated enough that it won't add much to the cost of simply photocopying the book. Compared to conventional publication (typesetting) this cost is trivial.

If all books worth reading in the public domain were done, it would be a wonderful thing. I suspect people will start doing this as soon as the market is large enough. The real hang-up is going to be with current books where royalties will have to be paid.
vnend@ms.uky.edu (D. W. James -- Staff Account) (11/03/88)
In article <282@bilver.UUCP> bill@bilver.UUCP (Bill Vermillion) writes:
)Now before you say that can't be done - let me tell you I saw it. I forgot
)the company that makes it, but the system was a document storage and retreival
)system using high speed scanners, fast photo-copy type printers, and 12" laser
)disk media. One of the options was a 12 video juke box. I don't recall the
)exact capacity, but it was large.
)Bill Vermillion - UUCP: {uiucuxc,hoptoad,petsd}!peora!rtmvax!bilver!bill
If it was the same system that I saw written up in (I think) PC_WEEK
last year, its capacity at the limit was 1.2 TERABYTES. Not a trivial
amount of storage...
--
Vnend, posting from his other account, on a machine about 100 yards
horizontally, and 40 yards vertically, from the other one.
vnend@ms.uky.edu or vnend@ukma.bitnet or vnend@engr.uky.edu
"A few days later, I got a letter... advising me to forsake my sordid lifestyle and give all my hickies to the living Terim." The Countess, CEREBUS #54
tim@hoptoad.uucp (Tim Maroney) (11/03/88)
In article <5790@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) wrote:
>Even with public domain books, the costs of scanning and
>character-recognizing are pretty large.

In article <1676@cadre.dsl.PITTSBURGH.EDU> geb@cadre.dsl.pittsburgh.edu (Gordon E. Banks) has been writing:
>Not really. You can estimate it by the cost of getting books from
>University Microfilms. They have to photocopy each page. A normal
>sized book is around $50. This covers retrieving the book from
>whatever library has it, and the labor of copying it.

So $50*671 = $33,550. Not a trivial investment. This is the cost to the publisher of making the disk, though it would be spread out among the individual copies. And that's still not factoring in the OCR running and proofreading, not to mention pre-mastering and mastering and duplication. And promotion and....

>OCR programs will
>soon be sophisticated enough that it won't add much to the cost
>of simply photocopying the book.

Disagree. It'll always take proofreading, and for 671 books that's quite a lot of skilled labor to pay for.

>Compared to conventional publication (typesetting) this cost is trivial.

Agree provisionally; per book it's relatively trivial; for hundreds of books it far exceeds the production cost of a single typeset book.

>If all books worth reading in
>the public domain were done, it would be a wonderful thing. I suspect
>people will start doing this as soon as the market is large enough.
>The real hang-up is going to be with current books where royalties
>will have to be paid.

Completely agree! I hope it happens, but as someone who did a minor feasibility study on doing it himself, I have to say it seems a long way off. The barriers are formidable.
--
Tim Maroney, Consultant, Eclectic Software, sun!hoptoad!tim
"The time is gone, the song is over. Thought I'd something more to say."
    - Roger Waters, Time
prem@andante.UUCP (Swami Devanbu) (11/03/88)
Bah humbug. When I read a book, I want to curl up in a comfy chair, with a blanket around me, a bowl of curried popcorn, and a pot of tea. A computer is a computer and a book is a book. Prem Devanbu Artificial Intelligence Principles Research Dept., (W) 201 582 2062 (H) 201 757 3748 MH 3C-438 AT&T Bell Laboratories 600 Mountain Ave, Murray Hill NJ 07974, USA prem%allegra@research.att.com {ihnp4,ucbvax,vax135,decvax,....}!allegra!prem
bob@allosaur.cis.ohio-state.edu (Bob Sutterfield) (11/04/88)
In article <13203@andante.UUCP> prem@andante.UUCP (Swami Devanbu) writes: >A computer is a computer and a book is a book. One of my pet peeves is the blatant misapplication of technology, in the rush to make everything high-tech to sell to the American MTV culture. I could buy a refrigerator that has a microprocessor- controlled front panel to dispense ice to me, and a microwave that's preprogrammed for all kinds of things I never cook, and a Datsun that talks to me. Technology should be applied to advantage where appropriate, and engineers should have enough restraint not to put a microprocessor in every toaster oven they sell. Similarly, there are many current products of the publishing industry that are completely inappropriate for mass electronic distribution. This will be decided in the market place, where it should be. That said, I am in favor of making the computer a useful reference and mind-amplification tool, again only where appropriate. For example, I use a semi-automated concordance for bible studies, and I use a spelling checker for almost everything of consequence I write. I look forward to the information management capabilities that the NeXT machine may someday put on my desk, but my wife would object if I curled up with the cube for a little bedtime reading :-) -=- Zippy sez, --Bob Here I am in the POSTERIOR OLFACTORY LOBULE but I don't see CARL SAGAN anywhere!!
henry@utzoo.uucp (Henry Spencer) (11/04/88)
In article <300.236DAA95@mailcom.FIDONET.ORG> postmaster@mailcom.FIDONET.ORG (Bernard Aboba) writes: >...The advent of eraseable optic media therefore shifts the development away >from REFERENCE materials such as encyclopedias, to information with a >TIME VALUE, such as stock price data. Or to material that has a mass market. A CD-ROM encyclopedia that cost $100, rather than thousands, would probably sell quite briskly and not be too troubled by piracy. Bulk copying of digital media becomes a problem in the same situation where photocopying of books becomes a problem: when the price far exceeds copying costs, i.e. when the publisher has decided to gouge a small market rather than try for modest profits from a large one. For some types of material, the publisher doesn't have a choice, since the market simply *is* small. For things like encyclopedias, though, simply dropping the price will expand the market. -- The Earth is our mother. | Henry Spencer at U of Toronto Zoology Our nine months are up. |uunet!attcan!utzoo!henry henry@zoo.toronto.edu
dmocsny@uceng.UC.EDU (daniel mocsny) (11/04/88)
...then let's build a world our machines can work in. 100 years ago people were trying to replace the horse with internal combustion engine-driven vehicles. Now, the obvious approach would have been to build some sort of mechanical analog of the horse, strap an engine on it, and keep everything transparent to the users. Since that was not possible, the next easiest thing was to change the world to accommodate the strength/weakness mix of the best way to run engines: on wheeled chassis. So we put $ billions into paving over some of the best real estate in the country. Now we have a world that accommodates motor vehicles, to some extent.

In article <5821@hoptoad.uucp>, tim@hoptoad.uucp (Tim Maroney) writes:
> So $50*671 = $33,550. Not a trivial investment. This is the cost to the
> publisher of making the disk, though it would be spread out among the
> individual copies. And that's still not factoring in the OCR running and
> proofreading, not to mention pre-mastering and mastering and duplication.
> And promotion and....
>
> It'll always take proofreading, and for 671 books that's quite
> a lot of skilled labor to pay for.

Let's not forget that virtually every book that makes it into print these days passes through a computer at some stage in its production. Most authors use word processors (either directly or through secretaries), most publishers use electronic typesetting, and some of us authors dabble in both. So most of the work the CD-ROM publishers have to do has already been done somewhere.

Printing books degrades the utility that was present when that information was originally in electronic form. From the standpoint of the CD-ROM vendors and potential users, publishers and authors who release information in printed form exclusively are destroying wealth. By refusing to establish and adhere to electronic document standards, we are reducing the amount of information we can exploit and pass on to our progeny. In other words, we are shooting ourselves in the foot.

A world optimized for horses was no good for automobiles. The latter were useless until a new world was built. Similarly, a world optimized for paper is no good for computers. To get the most benefit out of our new technology, we need to change the way we do things. Obviously the existing stock of printed information will not benefit from re-designing our world to match the strengths and weaknesses of computers. But I would hesitate to say that OCR will _always_ require proofreading. OCR is a hard problem, but certainly not an impossible problem. It is only a mapping from the (very large) vector space of possible letter bitmaps to the smaller space of letter codes and font descriptions. The structure of that mapping is complex, but not infinitely so, else we could not read. Connectionist approaches to OCR are already showing great promise. In ten years it might be essentially a solved problem.

A harder problem will be to have a computer make sense of arbitrary figures and diagrams. But that won't be necessary; the OCR machine can simply vectorize or bitmap anything it can't otherwise interpret. Given a smart OCR device, we could ``mine'' libraries for their information content. Just load the hopper with books, press the button, and take the information out of those mouldering tomes and put it in the hands of people who can go out and create wealth with it.

Dan Mocsny
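As a toy illustration of that bitmap-to-letter-code mapping, the Python sketch below trains a single-layer perceptron to tell two hand-made 5x5 glyphs apart from examples rather than hand-coded rules. The glyphs, the learning rate, and the noise test are all invented for this sketch; it is nowhere near a real OCR engine, only a hint of the connectionist idea.

# Toy illustration of OCR as a learned mapping from bitmaps to letter codes:
# a single-layer perceptron learns to tell two hand-made 5x5 glyphs apart.
# This is a sketch of the connectionist idea only, not a usable OCR engine.

GLYPHS = {
    "I": "..#.."
         "..#.."
         "..#.."
         "..#.."
         "..#..",
    "O": ".###."
         "#...#"
         "#...#"
         "#...#"
         ".###.",
}

def pixels(glyph):
    return [1.0 if c == "#" else 0.0 for c in glyph]

def train(samples, epochs=20, lr=0.1):
    """Perceptron rule: nudge the weights whenever the prediction is wrong."""
    w = [0.0] * 25
    b = 0.0
    for _ in range(epochs):
        for x, target in samples:          # target: +1 for "I", -1 for "O"
            out = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if out != target:
                w = [wi + lr * target * xi for wi, xi in zip(w, x)]
                b += lr * target
    return w, b

def classify(w, b, x):
    return "I" if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else "O"

samples = [(pixels(GLYPHS["I"]), +1), (pixels(GLYPHS["O"]), -1)]
w, b = train(samples)

noisy_i = pixels(GLYPHS["I"])
noisy_i[0] = 1.0                 # flip one corner pixel and classify anyway
print(classify(w, b, noisy_i))   # -> "I"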
danscott@atari.UUCP (Dan Scott) (11/04/88)
in article <1147@xn.LL.MIT.EDU>, olsen@XN.LL.MIT.EDU (Jim Olsen) says:
> Imagine the value to those law students of having, for modest cost,
> the entire United States Code, Code of Federal Regulations, or United
> States Reports (Supreme Court decisions) in their shirt pockets!

I would have to agree. I usually take what a lawyer tells me with a grain of salt, but perhaps I would have more faith if I knew he had access to all cases that set precedent for what I am dealing with.

Dan
lange@lanai.cs.ucla.edu (Trent Lange) (11/04/88)
In article <13203@andante.UUCP> prem@andante.UUCP (Swami Devanbu) writes:
>
>Bah humbug.
>
>When I read a book, I want to curl up in a comfy chair, with a blanket
>around me, a bowl of curried popcorn, and a pot of tea.
>
>A computer is a computer and a book is a book.
>
>Prem Devanbu

Well, then, what you *really* want is a flat screen about the size of a piece of notebook paper, and maybe about half an inch deep. With perhaps a touch screen, for point-and-click operations. Maybe an infrared or radio connection, a *really* fast one, to the main cube on your desk.

*Then* you could curl up in your comfy chair and blanket, and have access to the complete works of Shakespeare. And, of course, your newspaper would be electronic, so you wouldn't have to worry about getting grimy newsprint on your curried popcorn!

Someday...

Trent Lange
**********************************************************************
*       "UCLA: The fifth best country in the Olympics"              *
**********************************************************************
sac@well.UUCP (Steve Cisler) (11/04/88)
Bernard Aboba is quite right; the print publishers are quite reluctant to see their libraries moved to electronic format. The Library of Congress has a number of projects, including a new one called AMERICAN MEMORY, where they hope to make available some of their 85,000,000 items to the American people through some optical medium. While they house the stuff, they don't have the rights and permissions.

The Faxon Company recently did a survey of periodical publishers, and most had little or no interest in seeing their print pubs in any electronic format. A few did, but most of those were unsure if it would be profitable.

UMI has an experiment at Northwestern Univ. in Evanston where the images of periodical pages are on CD-ROM. Evidently the publishers would not allow just the ASCII to be put on. The only pointers are to articles, not to fragments of text (keyword in context). But at least you get to see the ads too!

Steve Cisler
Apple Library
408 974 3258
geb@cadre.dsl.PITTSBURGH.EDU (Gordon E. Banks) (11/05/88)
In article <5821@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes: >Completely agree! I hope it happens, but as someone who did a minor >feasibility study on doing it himself, I have to say it seems a long >way off. The barriers are formidable. >-- I think you will find that libraries, including the Library of Congress will be doing this for us. Book preservation is very expensive and putting them all on CD while the actual copies get stored in CO2 or such is one answer to this problem. It may be a lot cheaper for a library to give you electronic access to its collection than actual access. The only thing is, I like to read in bed, and even a laptop gets heavy on my chest.
desnoyer@Apple.COM (Peter Desnoyers) (11/05/88)
>In article <5790@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes:
>> So it would
>>take 6710 hours or about three and a third work years to scan in 671
>>books. And I think my two minutes a page estimate may be optimistic,
>>not to mention extra costs for indexing and mastering.

Unbind the book first, then put it through a sheet feeder. I'm sure there's a high-tech way to unbind a book, but zipping the binding off on a good circular saw works fine. (I've seen it done to Inside Mac, to loose-leaf bind it.) Should be ~5 min per book, plus <5 sec. per page for per-sheet paper handling. (Use the guts of a good copy machine.)

Peter Desnoyers
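The two throughput estimates in this thread differ by more than an order of magnitude, so it may help to put them side by side. The sketch below uses only figures quoted in the posts (671 books of 250 pages, 2 minutes per page for scan-and-OCR, versus about 5 minutes to unbind a book plus 5 seconds per page of sheet-fed image scanning) and assumes a 2000-hour work year.

# Compare the per-book throughput estimates quoted in this thread.
# Book count, page count, and per-page times all come from the posts above.

BOOKS = 671
PAGES = 250

ocr_hours_per_book = PAGES * 2 / 60                      # 2 min/page with OCR
sheetfed_hours_per_book = (5 * 60 + PAGES * 5) / 3600    # unbind + 5 s/page

for label, per_book in [("OCR at 2 min/page", ocr_hours_per_book),
                        ("sheet-fed images", sheetfed_hours_per_book)]:
    total = per_book * BOOKS
    print(f"{label:18s}: {per_book:5.2f} h/book, "
          f"{total:7.0f} h for {BOOKS} books "
          f"(~{total / 2000:.1f} work-years)")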
bzs@encore.com (Barry Shein) (11/06/88)
Once again a thought virus...people's minds are running amuck. The point of putting books on-line is not to ensure you never read in bed again, it's to make them accessible to new generations of tools, for the researchers, writers and curious of the world.

Here's a good one. A long time ago on a list far far away someone claimed that men could breast-feed under certain conditions, citing various second-hand accounts (e.g. talks from La Leche League). The claim was that if the infant suckled a male breast long enough (I suppose supplementary feeding is necessary) it would begin to produce milk. Further, they claimed this was indeed not uncommon in those apocryphal primitive tribes who are always doing these amazing things.

I had no idea, so I went to the University library to spend a few hours seeing if I could track down something authoritative on the subject. Try it, it's nearly impossible (although, perhaps too literal-mindedly, I finally found some references in the Nursing library). I didn't quite get satisfaction, but it appears to be untrue, bordering on urban legend (I'd be glad to hear about anything *authoritative* about it still, something more than you once heard it was true or read it in a pamphlet somewhere).

That's the kind of thing on-line libraries should get you, and it's invaluable. Let's not extrapolate ad nauseam just to appear to have poked a hole in an idea.

-Barry Shein, ||Encore||

P.S. This is not meant to criticize La Leche; in fact, I doubt they ever claimed the above. It may have just been one of those things that "goes around", using their name for seeming authority.
ns@cat.cmu.edu (Nicholas Spies) (11/06/88)
In article <1218@atari.UUCP> danscott@atari.UUCP (Dan Scott) writes:
>in article <1147@xn.LL.MIT.EDU>, olsen@XN.LL.MIT.EDU (Jim Olsen) says:
>
>> Imagine the value to those law students of having, for modest cost,
>> the entire United States Code, Code of Federal Regulations, or United
>> States Reports (Supreme Court decisions) in their shirt pockets!
>
>I would have to agree. I usually take what a lawyer tells me with a
>grain of salt, but perhaps I would have more faith if I knew he had access to
>all cases that set precedent for what I am dealing with.
>
>Dan

More fun yet would be an expert system that would scarf through the legal database looking for precedents relating to your current case, tie them together into an easily-understood argument for your review, and not cost an arm and a leg for each user. If laws are in any sense rational and analogies between earlier and current cases hold any water, this should be possible in principle.

The extremely important thing to note is that this technology, once developed, should not be privately owned but freely available to prosecutors and defendants alike, as each citizen should be entitled to equal access to the body of information that constitutes "the law", in practice as well as theory. As this is most definitely _not_ the case now, and a great many lawyers profit mightily under the present situation, chances are better than even that legal AI would be made illegal--except for use by "qualified professionals"...
--
Nicholas Spies			ns@cat.cmu.edu.arpa
Center for Design of Educational Computing
Carnegie Mellon University
barry@confusion.ads.com (Barry Lustig) (11/06/88)
In article <26543@tut.cis.ohio-state.edu> bob@allosaur.cis.ohio-state.edu (Bob Sutterfield) writes:
...
Technology should be applied to advantage where appropriate, and
engineers should have enough restraint not to put a microprocessor in
every toaster oven they sell.
...
I was in Macy's the other day. In the kitchen gadget department I saw
the following label on one of the toasters:
Microchip toasting technology. Microprocessor temperature
controlled.
Barry Lustig
Advanced Decision Systems barry@ADS.COM
wetter@cit-vax.Caltech.Edu (Pierce T. Wetter) (11/08/88)
> I think you will find that libraries, including the Library of
> Congress will be doing this for us. Book preservation is very
> expensive and putting them all on CD while the actual copies get
> stored in CO2 or such is one answer to this problem. It may be
> a lot cheaper for a library to give you electronic access to
> its collection than actual access. The only thing is, I like
> to read in bed, and even a laptop gets heavy on my chest.

The last time I was in the Library of Congress, they were scanning the books at 300 dpi and displaying them on special terminals. Clearly not the most efficient way of doing this. What really needs to be done is to make a standard for electronic books. Here's my quick draft of a storage method:

Every book is composed of a series of records. Each record consists of a header followed by some data. There are three major types of records: formatting, text and pictures. A format record contains formatting information for a following record of text or pictures. (Formatting codes could be either TeX or RichTextFormat or Postscript or something special.) Pictures are stored in Postscript, GIF or TIFF format depending on their origin (line art or pictures).

Pierce
____________________________________________________________________________
You can flame or laud me at: wetter@tybalt.caltech.edu or
wetter@csvax.caltech.edu or pwetter@caltech.bitnet
Caution: All my postings are 100% accurate from my point of view. However, my
point of view rarely translates into English. Therefore any errors in my
posting are your fault for not interpreting it correctly.
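One way to flesh out the draft above is sketched below in Python. The record types, tag values, and header layout are invented for illustration (no such standard existed); the sketch only shows the essence of the proposal: a self-describing header giving record type, data-format tag, and payload length, followed by the data itself.

import struct

# Illustrative record layout for the "electronic book" draft above.
# Record types, tag values and header layout are made up for this sketch;
# nothing here is an existing standard.
FORMAT_RECORD, TEXT_RECORD, PICTURE_RECORD = 1, 2, 3

# Header: record type (1 byte), data-format tag (1 byte), payload length (4 bytes)
HEADER = struct.Struct(">BBI")

def pack_record(rec_type, fmt_tag, payload: bytes) -> bytes:
    return HEADER.pack(rec_type, fmt_tag, len(payload)) + payload

def read_records(blob: bytes):
    """Walk a byte string and yield (type, format_tag, payload) tuples."""
    offset = 0
    while offset < len(blob):
        rec_type, fmt_tag, length = HEADER.unpack_from(blob, offset)
        offset += HEADER.size
        yield rec_type, fmt_tag, blob[offset:offset + length]
        offset += length

book = (
    pack_record(FORMAT_RECORD, 0, b"\\chapter{Hamlet}")       # e.g. TeX markup
    + pack_record(TEXT_RECORD, 0, b"To be, or not to be...")
    + pack_record(PICTURE_RECORD, 1, b"GIF87a...")             # raster image stub
)

for rec_type, fmt_tag, payload in read_records(book):
    print(rec_type, fmt_tag, payload[:20])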
dmocsny@uceng.UC.EDU (daniel mocsny) (11/09/88)
In article <13203@andante.UUCP>, prem@andante.UUCP (Swami Devanbu) writes: > When I read a book, I want to curl up in a comfy chair, with a blanket > around me, a bowl of curried popcorn, and a pot of tea. > > A computer is a computer and a book is a book. An amateur historian is completing a study on the origins of networked civilizations. She is working diligently in her favorite location -- in a small rowboat floating in a farmpond. Being something of a traditionalist, she has so far resisted the temptation to get the brain stem implants so many of her friends are raving about. She's still sticking with the fast-obsoleting virtual workstation hardware. She wears a translucent pair of wrap-around goggles. These display a pair of binocular images to her, each with pixel and color resolution matching her visual acuity. The goggles provide a field of view as wide as her visual arc. Head-motion sensors in the goggles send information to her pocket-sized 100 GIPS computer. The computer uses this information to pan the displays to cancel her head motion, so she has the convincing impression of being inside a virtual environment. She can interact with objects in the environment by moving her hands in a natural way. She is wearing a thin pair of gloves that report her hand motion to the computer. The computer projects an image of her hands in the virtual environment and adjusts virtual objects as she manipulates them. The goggles also track her eye motion, so she can point to objects simply by looking at them and speaking commands (the computer recognizes her speech). To perform her study, she whispers to her computer, ``historical archives.'' The computer creates an animated representation of sailing over a city and landing before a large building. She floats inside and settles at a wide mahogany table. She starts naming off topics of interest, the corresponding virtual books float out of their virtual shelves, glide to her, open themselves to the pages of interest, and float before her. With a practiced flurry of glances and gestures, she arranges them to her liking, scans a few documents, and begins to dictate her thoughts. As her essay ranges to other topics, her computer suggests additional reference material. At one point she is reviewing an archived discussion from the historically significant Usenet. She stumbles upon a thread relating to the early efforts to place printed materials on optical disks. She reads a few quotes and marvels at how quaint they sound in retrospect. Imagine, real paper books! She recalls seeing a few at a museum, carefully stored under nitrogen beneath thick glass. How her predecessors must have struggled with them...they looked so heavy, so bulky, so clumsy, and above all, so inflexible! Having data in a static form, how could one search it, extract portions for comment, analysis, or elaboration? What if a book contained errors? How was one to locate all the copies and notify their owners? How could one simultaneously view a hundred of them? How could one possibly have enough on hand to do any serious work? How to write anything at all, never having assurance that one's readers would have immediate access to all the necessary background material? She speculates that the hapless writers of the past either had to speak hopelessly above most reader's heads or else painstakingly repeat information already available elsewhere. No wonder progress had been so slow! With hordes of people duplicating each other's efforts, that progress had occurred at all was amazing. 
And how was anyone to read comfortably? Fumbling with turning pages, struggling to get the correct lighting...could those people have read anything while lying in bed? She struggles with the idea momentarily, then gives up.

Wearying of her thoughts and labors, she tells her computer to save her work environment. She will return to it later. She pulls off her goggles and gloves, and slides them into a case on her belt. She seizes the oars, and slowly makes for shore.

Dan Mocsny
fenwick@garth.UUCP (Stephen Fenwick) (11/09/88)
In article <17555@shemp.CS.UCLA.EDU> lange@cs.ucla.edu (Trent Lange) writes:
>In article <13203@andante.UUCP> prem@andante.UUCP (Swami Devanbu) writes:
>>Bah humbug.
>>When I read a book, I want to curl up in a comfy chair, with a blanket
>>around me, a bowl of curried popcorn, and a pot of tea.
>>A computer is a computer and a book is a book.
>>Prem Devanbu
>Well, then, what you *really* want is a flat screen about the size of
>a piece of notebook paper, and maybe about half an inch deep. With
>perhaps a touch screen, for point-and-click operations. Maybe
>an infrared or radio connection, a *really* fast one, to the main
>cube on your desk.
>
>*Then* you could curl up in your comfy chair and blanket, and have
>access to the complete works of Shakespeare. And, of course, your
>newspaper would be electronic, so you wouldn't have to worry about
>getting grimy newsprint on your curried popcorn!
>
>Someday...
>
>Trent Lange

Bah Humbug, Trent. Prem was right. When I want to turn a page, I want to turn a real page, made of paper (preferably acid-free).

What's the rated lifetime of a CD-ROM (before oxygen starts to corrode the aluminum surface)? What are the environmental limits (temperature, humidity, etc.)? I have some books that are well over 100 years old, and still in excellent condition (no data dropouts). Some have been exposed to excessive humidity (dropped in water, 'way back when); some have been heat-stressed (airplane cargo bay to hot car). None have lost any information.

One of my primary tools is a 14th Ed., 1929 Encyclopaedia Britannica. If it is true that the life of a CD-ROM is less than 50 years, I would now be seeing data loss. This is unacceptable.

Steve Fenwick
--
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\/////////////////////////////////////////
My company is not responsible for what I say.  I might be...
E-Mail route: ...!{ sun | sri-unix }!pyramid!garth!fenwick
USPS: Intergraph APD, 2400 Geng Road, Palo Alto, California
AT&Tnet: (415) 852-2325
//////////////////////////////////////\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
fenwick@garth.UUCP (Stephen Fenwick) (11/09/88)
In article <4105@encore.UUCP> bzs@encore.com (Barry Shein) writes:
>The point of putting books on-line is not to ensure you never read in
>bed again, it's to make them accessible to new generations of tools,
>for the researchers, writers and curious of the world.
>
>[ very sound example of the need for automated library search capabilities ]

The only problem with this is keeping everything on file in a manner that allows users to find what they need. This is non-trivial, as the information content of a work may not be limited by the author's conception of its content. Watch the PBS series "Connections" to see what I mean. Machines are currently very good at fast data retrieval, but decidedly bad at making inferences about the data that they store.

Steve Fenwick
--
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\/////////////////////////////////////////
My company is not responsible for what I say.  I might be...
E-Mail route: ...!{ sun | sri-unix }!pyramid!garth!fenwick
USPS: Intergraph APD, 2400 Geng Road, Palo Alto, California
AT&Tnet: (415) 852-2325
//////////////////////////////////////\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
dmocsny@uceng.UC.EDU (daniel mocsny) (11/10/88)
In article <1803@garth.UUCP>, fenwick@garth.UUCP (Stephen Fenwick) writes: > One of my primary tools is a 14th Ed., 1929 Encyclopaedia Britannica. > If it is true that the life of a CD-ROM is less than 50 years, I would > now be seeing data loss. This is unacceptable. This is also incomprehensible. Do we imagine that in fifty years we will be unable to create arbitrarily many backups of important information? I suppose some proponents of copyright law would like to jeopardize your data security, but this will not be a technological problem. Indeed, it would not have to be a problem right now if the WORM people could get their standards together. Unfortunately, many recent books were not printed on acid-free paper. And few books are of sufficient quality to stand up to serious use. Many libraries' collections are crumbling away. We must archive this knowledge to electronic form soon or lose it forever. Books certainly work well when you don't need many of them, or when you refer to passages so frequently that you might as well leave them open on your desk. Nobody is trying to do away with books altogether (yet). When you need occasional access to information stuck in a huge collection (system docs, parts catalogs, technical literature) CD-ROM makes sense. As display and storage technologies mature, electronic publishing will spread. Dan Mocsny
dmocsny@uceng.UC.EDU (daniel mocsny) (11/10/88)
In article <1804@garth.UUCP>, fenwick@garth.UUCP (Stephen Fenwick) writes:
> The only problem with this is keeping everything on file in a manner that
> allows users to find what they need. This is non-trivial, as the information
> content of a work may not be limited by the author's conception of its
> content. Watch the PBS series "Connections" to see what I mean.

This is exactly why we need to store information in a form that retains the maximum flexibility, because the author cannot predict all the uses it might find. Suppose we just store all the books and articles as fully-indexed files, and follow the present card catalog system. Is this going to make information less accessible than it now is in printed form? How much human effort goes into re-typing printed information?

Look at almost any scholarly paper out there. Up to half of it is literature survey. Most of the survey is there because the author can't count on readers having ready access to all the previous papers. Sometimes the survey adds value, by putting previous work in perspective, but a lot of it simply gives researchers useless work to do.

> Machines
> are currently very good at fast data retrieval, but decidedly bad at making
> inferences about the data that they store.

True enough, but I'll be happy to make the inferences about what I need. First I've got to get at the information. A machine that did no more than automatically retrieve all the citations in a given paper would be an enormous help. (You know how frustrating being stumped by a missing citation is -- the author skips some important steps because they're in paper X, your library doesn't have the journal, so off you go, wasting valuable time and money trying to track it down.) I could also make real progress with a few boolean expressions and short phrases, provided that I could search the abstracts and/or text of papers and books.

Perhaps someday we will have machines that ``look over your shoulder'' and spot analogies between problem X that's stumping you and problem Z that appeared in some obscure East-bloc journal. If we could do that today, our 50-year technological diffusion patterns would speed up to weeks and days.

Dan Mocsny
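As a small illustration of the kind of boolean search over abstracts described above, assuming the text is already in machine-readable form: the abstracts and query terms below are invented, and the only point is that AND/OR/NOT queries reduce to set operations once the text has been indexed.

# Sketch of a boolean keyword query over a small set of abstracts.
# The abstracts are invented; the point is only that AND/OR/NOT queries
# reduce to set operations once the text is indexed.

abstracts = {
    "paper-1": "adsorption equilibria of binary gas mixtures on zeolites",
    "paper-2": "connectionist methods for optical character recognition",
    "paper-3": "optical storage media for archival document retrieval",
}

# word -> set of document ids
index = {}
for doc_id, text in abstracts.items():
    for word in set(text.split()):
        index.setdefault(word, set()).add(doc_id)

def term(word):
    return index.get(word, set())

print(term("optical") & term("retrieval"))       # AND:     {'paper-3'}
print(term("optical") - term("recognition"))     # AND NOT: {'paper-3'}
print(term("optical") | term("adsorption"))      # OR:      all three papers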
vitale@hpcupt1.HP.COM (Phil Vitale) (11/11/88)
> > One of my primary tools is a 14th Ed., 1929 Encyclopaedia Britannica.
> > If it is true that the life of a CD-ROM is less than 50 years, I would
> > now be seeing data loss. This is unacceptable.
> ... Do we imagine that in fifty years we will
> be unable to create arbitrarily many backups of important information?

If there is ultimately no hardcopy -- relying on arbitrary numbers of backups gets scary. "Who" decides what books are *important* enough to backup? The number of works will not decrease, making the backup process an ever growing problem.

In addition, CD-ROM is probably the first major print distribution medium where the act of copying and the act of modifying are of equal ease. "Who" will insure that the copy of the book in front of you is really a copy of the original, or one that was modified along the way by a "concerned" individual/party/government when it was "backed-up"? (Orwell and 1984.) (Not that these concerns are new to CD-ROMs, just that the potential for abuse seems greater.)

Electronic form is not the only way to preserve knowledge. Books have been remarkably successful at preserving information across the ages. (Are we really going to have a CD-ROM reader capable of reading the disks we make today say 300 years from now?) There are serious efforts underway to come up with methods to de-acidify large (room-sized) numbers of books at a time. Also from what I gather, there are no longer major price or technology obstacles in producing acid or non-acid paper. Rather it is an issue of investments in existing processing equipment. (Can someone with closer experience of the industry comment?)

> Nobody is trying to do away with books altogether (yet).

I would hope not ever, not completely. The methods used for long-term preservation and rapid information access need not preclude each other. I really enjoyed pouring over some of the original papers of da Vinci, Darwin, and Bach. (Handling drafts, pencil-written by Tolkien, was quite a thrill for a young undergrad.) Somehow the same information loses its impact when it is displayed on a CRT screen.

Then again, it would have been nice to have some backup disks of the library at Alexandria before the fire ...

> Dan Mocsny

Phil Vitale
ekwok@cadev4.intel.com (Edward C. Kwok) (11/11/88)
In article <1218@atari.UUCP> danscott@atari.UUCP (Dan Scott) writes:
>in article <1147@xn.LL.MIT.EDU>, olsen@XN.LL.MIT.EDU (Jim Olsen) says:
>
>> Imagine the value to those law students of having, for modest cost,
>> the entire United States Code, Code of Federal Regulations, or United
>> States Reports (Supreme Court decisions) in their shirt pockets!
>
>I would have to agree. I usually take what a lawyer tells me with a
>grain of salt, but perhaps I would have more faith if I knew he had access to
>all cases that set precedent for what I am dealing with.

Not very likely: each volume of the United States Reports contains, on the average, 1500 pages, and each page contains roughly 8000 characters. That makes about 12 Mbytes per volume. There are more than 470 volumes, the last time I looked.
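Worked out with the figures quoted in the post, the collection comes to a few gigabytes of raw text -- a couple dozen 256 MB discs, or a handful of CD-ROMs, which is indeed not shirt-pocket territory. A quick check:

# Rough size of the United States Reports, using only the figures above.
pages_per_volume = 1500
chars_per_page = 8000
volumes = 470

bytes_per_volume = pages_per_volume * chars_per_page      # ~12 MB
total_bytes = bytes_per_volume * volumes                   # ~5.6 GB

print(f"per volume: {bytes_per_volume / 1e6:.0f} MB")
print(f"total:      {total_bytes / 1e9:.1f} GB "
      f"(~{total_bytes / 256e6:.0f} discs of 256 MB, "
      f"or ~{total_bytes / 650e6:.0f} CD-ROMs of 650 MB)")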
dmocsny@uceng.UC.EDU (daniel mocsny) (11/12/88)
In article <-290109999@hpcupt1.HP.COM>, vitale@hpcupt1.HP.COM (Phil Vitale) writes:
> If there is ultimately no hardcopy -- relying on arbitrary numbers
> of backups gets scary.

I observe a fair number of people using computers. I have no statistics to back me up, but I think I see a general trend: computer users with less experience and sophistication are often quicker to hit the printers for rough drafts, etc. That is, they tend to generate more hardcopy per unit of work done. I say this not to disparage them. Present-day computers are still brittle, expensive, and lacking adequate displays. Learning to work with less paper takes practice and motivation. We are still a long way from eliminating paper.

I do not expect this to always be true. Eventually, computers will be cheap and reliable to the point of transparency. We will not fear relying on them over paper any more than we currently fear relying on telephones over couriers. (What?!? Your office has X telephones that you use constantly, and you don't retain a comparable number of messenger boys on your staff? What if something went...wrong????)

> "Who" decides what books are *important* enough to backup? The number of
> works will not decrease, making the backup process an ever growing problem.

The only reason we have to ration our information recording is because we have not yet mastered the art of doing it cheaply. This will change. Have you seen the projected pricing on digital paper? A one GB diskette for $5. 660 GB tape reels, at $0.005/MB. The number of works will increase, but storage technologies are advancing faster. I expect to live to see the day when the average person can afford storage capacity sufficient for today's Library of Congress.

> "Who" will insure that the copy of the book in front of you is really a
> copy of the original, or one that was modified along the way by a "concerned"
> individual/party/government when it was "backed-up." (Orwell and 1984.)
>
> (Not that these concerns are new to CD-ROMs, just that the potential
> for abuse seems greater.)

How do you know that a paper book is legitimate? I think this has more to do with the number of copies than anything else, not to mention the ease with which two copies may be compared. Which would you rather do, verify that two paper books were identical, or type diff? You are right, though, we need a central repository to maintain archived masters, since electronic information invites editing.

> Electronic form is not the only way to preserve knowledge. Books have
> been remarkably successful at preserving information across the ages.
> (Are we really going to have a CD-ROM reader capable of reading the
> disks we make today say 300 years from now?)

Books have been amazingly good, with maddening exceptions, of course. If we were really into leaving a heritage, we would go back to clay tablets. They resist burning better. In 300 years our information technologies could be so ridiculously advanced that we should have machines that could start with almost any chunk of matter and tell whether or not it was an artifact. They should be able to tell whether said artifact contains recorded information, and then extract as much of it as remains. A CD-ROM would be easy pickings compared to digging up clay tablets written in totally forgotten languages. However, the beauty of electronic media is that they make information readily available.
Unless we suffer a breakdown in civilization (at which point reading
CD-ROMs will be the least of our worries), we can easily copy our
accumulated information to each succeeding storage technology that appears.
With paper you're stuck with what you've got. Time hates a static medium.

> I really enjoyed pouring over some of the original papers of da Vinci,
> Darwin, and Bach. (Handling drafts, pencil-written by Tolkien, was quite
> a thrill for a young undergrad.)

I hope the manuscripts were not badly damaged by whatever you poured over
them. :-)

> Somehow the same information loses its impact
> when it is displayed on a
> CRT screen.

You'll never hear me claiming today's display technology to be anywhere
near adequate. It is improving, albeit much too slowly. I see no reason to
doubt that computer displays will eventually match our visual acuity. I do
doubt I will see that happen soon.

> Then again, it would have been nice to have some backup disks of the
> library at Alexandria before the fire ...

My sentiments exactly. Putting all your information on hard-to-copy media
and stacking them in one place is thumbing your nose at reality. Then
again, pity the poor despots who will be robbed of their ability to make a
shocking public spectacle. Somehow, typing rm * just doesn't have the same
impact as a big, roaring bonfire.

> Phil Vitale

Dan Mocsny
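As a minimal sketch of that machine comparison -- more or less what cmp(1)
already does -- here is a short C program that checks two files byte for
byte. The file names on the command line are only placeholders.

#include <stdio.h>

/* Compare two files byte for byte and report the first difference.
   Example usage: bookcmp master-copy.txt suspect-copy.txt           */
int main(int argc, char *argv[])
{
    FILE *a, *b;
    int ca, cb;
    long pos = 0;

    if (argc != 3) {
        fprintf(stderr, "usage: %s file1 file2\n", argv[0]);
        return 2;
    }
    a = fopen(argv[1], "rb");
    b = fopen(argv[2], "rb");
    if (a == NULL || b == NULL) {
        fprintf(stderr, "cannot open input files\n");
        return 2;
    }
    for (;;) {
        ca = getc(a);
        cb = getc(b);
        if (ca != cb) {
            printf("copies differ at byte %ld\n", pos);
            return 1;
        }
        if (ca == EOF) {          /* both files ended together */
            printf("copies are identical\n");
            return 0;
        }
        pos++;
    }
}

Running something like this over a whole shelf of electronic texts is an
overnight batch job; doing the same for paper books is a scholarly career.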
tiedeman@acf3.NYU.EDU (Eric S. Tiedemann) (11/12/88)
In article <-290109999@hpcupt1.HP.COM> vitale@hpcupt1.HP.COM (Phil Vitale) writes:
>"Who" decides what books are *important* enough to backup? The number of
>works will not decrease, making the backup process an ever growing problem.

To a greater extent than in the past, individuals will be able to afford to
make backups. This is the good we see.

>In addition, CD-ROM is probably the first major print distribution medium
>where the act of copying and the act of modifying are of equal ease.
>
>"Who" will insure that the copy of the book in front of you is really a
>copy of the original, or one that was modified along the way by a "concerned"
>individual/party/government when it was "backed-up." (Orwell and 1984.)
>
>(Not that these concerns are new to CD-ROMs, just that the potential
>for abuse seems greater.)

Ultimately, the reader will have to ensure this for himself. One way is to
use public-key authentication. It's a lot easier to be sure you have the
right key than that you have the right text. (A rough sketch of the easy
half of that follows at the end of this post.)

>Electronic form is not the only way to preserve knowledge. Books have
>been remarkably successful at preserving information across the ages.

OK. The prudent among us may use codex form. Again, this will be cheaper to
do if you have the text in machine-readable form to begin with.

>There are serious efforts underway to come up with methods to de-acidify
>large (room-sized) numbers of books at a time. Also from what I gather,

References?

>Somehow the same information loses its impact
>when it is displayed on a
>CRT screen.

Somehow information loses its impact when it's gone, as you go on to note.

>Then again, it would have been nice to have some backup disks of the
>library at Alexandria before the fire ...

Eric
tiedeman@acf3.nyu.edu
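Here is a rough C sketch of the easy half of that scheme: recompute a
text's digest and compare it with the digest the publisher distributed.
The part that actually ties the digest to the publisher -- a public-key
signature over it -- is omitted. The file name and the expected digest are
placeholders, and the program assumes OpenSSL's SHA-256 routines (link
with -lcrypto).

#include <stdio.h>
#include <string.h>
#include <openssl/sha.h>

int main(void)
{
    const char *path = "moby-dick.txt";              /* hypothetical text */
    const char *expected =                           /* publisher's digest */
        "0000000000000000000000000000000000000000000000000000000000000000";
    unsigned char buf[8192], md[SHA256_DIGEST_LENGTH];
    char hex[2 * SHA256_DIGEST_LENGTH + 1];
    SHA256_CTX ctx;
    size_t n;
    int i;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL) { perror(path); return 2; }

    /* Recompute the digest of the local copy. */
    SHA256_Init(&ctx);
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        SHA256_Update(&ctx, buf, n);
    SHA256_Final(md, &ctx);
    fclose(fp);

    for (i = 0; i < SHA256_DIGEST_LENGTH; i++)
        sprintf(hex + 2 * i, "%02x", md[i]);

    /* Compare against the value published (and, ideally, signed). */
    if (strcmp(hex, expected) == 0)
        printf("digest matches the published value\n");
    else
        printf("digest mismatch: this copy has been altered\n");
    return 0;
}

The reader still has to trust that the expected digest really came from the
publisher, which is exactly where the public-key signature comes in.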
news@littlei.UUCP (11/12/88)
In article <1804@garth.UUCP> fenwick@garth.UUCP (Stephen Fenwick) writes:
|In article <4105@encore.UUCP> bzs@encore.com (Barry Shein) writes:
|>The point of putting books on-line is [...] to make them accessible to new
|>generations of tools, [...]
|
|The only problem with this is keeping everything on file in a manner that
|allows users to find what they need. This is non-trivial, as the information
|content of a work may not be limited by the author's conception of its
|content. Watch the PBS series "Connections" to see what I mean. Machines
|are currently very good at fast data retrieval, but decidedly bad at making
|inferences about the data that they store.

With a general-purpose hypertext system, humans would make the machine
record the connections as they were discovered (a tiny sketch of such a
link record follows at the end of this post). See alt.hypertext.

From what I've read here in comp.sys.next, the "Digital Librarian" is not a
general-purpose hypertext system. I'm not even sure it's hypertext.

Scott Peterson -- OMSO Software Engineering -- Intel, Hillsboro OR
uunet!littlei\ tektronix!reed!foobar >!sdp!sdp -- or -- sdp@sdp.hf.intel.com psu-cs!foobar/
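For what it's worth, the record such a system keeps can be tiny. A
hypothetical sketch in C; the field names, sizes, and the example link are
all made up.

#include <stdio.h>

/* One reader-created connection between two passages. */
struct link {
    char from_doc[64];     /* identifier of the source document  */
    long from_offset;      /* character offset of the source anchor */
    char to_doc[64];       /* identifier of the target document  */
    long to_offset;        /* character offset of the target anchor */
    char note[128];        /* why the reader thought these were connected */
};

int main(void)
{
    struct link l = {
        "shakespeare/hamlet", 10234L,
        "freud/interpretation-of-dreams", 88210L,
        "Hamlet's delay read as repression"
    };
    printf("%s@%ld -> %s@%ld: %s\n",
           l.from_doc, l.from_offset, l.to_doc, l.to_offset, l.note);
    return 0;
}

The hard part is not storing such records; it is getting readers to create
them and building retrieval tools that exploit them.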
ebh@argon.UUCP (Ed Horch) (11/12/88)
In article <1803@garth.UUCP> fenwick@garth.UUCP (Stephen Fenwick) writes:
>One of my primary tools is a 14th Ed., 1929 Encyclopaedia Britannica.
>If it is true that the life of a CD-ROM is less than 50 years, I would
>now be seeing data loss. This is unacceptable.

Well, you've got two alternatives. Either the technology will have
advanced, and you can copy the data onto whatever the latest nifty storage
medium is (terabit EPROMs? :-), or technology will have stagnated, and
you'll have to settle for copying it onto another CD-ROM. Suppose the life
of a CD-ROM is only ten years -- recopying your data every *nine* years
doesn't exactly sound like a full-time job. Haven't you ever copied
archival data from an old floppy or tape to a fresh one?

On the other hand, what are you going to do with your Britannica when the
pages get too brittle to turn, and you're forced to put the books in
climate-controlled storage to keep them from decaying further?

BTW, what's the life expectancy of microfilm and microfiche?

-Ed

This has strayed from anything NeXT-specific, so I've redirected followups
to comp.periphs.
sac@well.UUCP (Steve Cisler) (11/12/88)
I'm impressed with the longevity of the discussion about optical
publishing. As a librarian, I tend to think that the trails blazed by NeXT
(and other companies) to move information from one medium to another will
have more effect on society than the speed or choice of CPU or DSP. Of
course, the total package will make users more or less apt to use these
electronic libraries.

As an example, William Arms of Carnegie-Mellon U. spoke at the October 88
Educom Conference about Project Mercury--an electronic library on campus,
serving the computer science and engineering departments (initially). What
struck me was his choice of a Sun workstation or equivalent as the minimum
quality interface for the user (i.e. no AT's or Mac SE's, etc). Evidently,
the large screen is extremely important to Arms; he wants people to read on
screen.

Even the current displays for the Mac, NeXT and Sun just don't have the
same amount of information for the eye as does a paperback. Consequently,
displays have to really improve before you will get an English professor to
read online. I think, though, the types of text retrieval software that
NeXT is bundling will help get people to use the digital library instead of
the print version of the same works.

Does anyone have thoughts about traditional publishers' willingness to go
into a new medium (and distribution method)? I think most are afraid of
cannibalizing their print market. What is going to woo them away? CD-ROM
has done it to a very limited extent.

Steve Cisler
Connect: Libraries and Telecommunications
Box 992
Cupertino, CA 95015
wald-david@CS.YALE.EDU (david wald) (11/14/88)
In article <398@uceng.UC.EDU> dmocsny@uceng.UC.EDU (daniel mocsny) writes:
>Wearying with her thoughts and labors, she tells her computer to
>save her work environment. She will return to it later. She pulls
>off her goggles and gloves, and slides them into a case on her belt.
>She seizes the oars, and slowly makes for shore.

Explicitly saving? What's wrong with autosave?

============================================================================
David Wald        wald-david@yale.UUCP        waldave@yalevm.bitnet
============================================================================
dmocsny@uceng.UC.EDU (daniel mocsny) (11/15/88)
In article <42955@yale-celray.yale.UUCP>, wald-david@CS.YALE.EDU (david wald) writes:
> In article <398@uceng.UC.EDU> dmocsny@uceng.UC.EDU (daniel mocsny) writes:
> >Wearying with her thoughts and labors, she tells her computer to
> >save her work environment.
> Explicitly saving? What's wrong with autosave?

The advantage of fiction is that I make the rules up as I go along. So I
can easily say:

1. Of course her computer maintains a triply-redundant audit trail of
everything she has ever done. That way it can perform statistical studies
of her usage patterns, and automatically optimize its command and file
structures to suit her. She tells the computer to save simply out of habit,
the same way one's boss tells one to do the things one was hired to do and
is consequently doing already.

2. Autosave wasn't in Virtual Workstation Release 2.1a. Release 2.1b is
out, but she hasn't bothered to upgrade yet.

Dan Mocsny

P.S. To the poster who wanted references to this sort of technology, you
can start with ``NASA's Virtual Workstation: Using Computers to Alter
Reality,'' NASA Tech Briefs, July/August 1988. Also see Scientific
American's article on advanced user interfaces, published in their special
issue on computing sometime in the Fall of 1987.
tim@hoptoad.uucp (Tim Maroney) (11/16/88)
In article <1147@xn.LL.MIT.EDU>, olsen@XN.LL.MIT.EDU (Jim Olsen) says:
> Imagine the value to those law students of having, for modest cost,
> the entire United States Code, Code of Federal Regulations, or United
> States Reports (Supreme Court decisions) in their shirt pockets!

In article <3177@mipos3.intel.com> ekwok@cadev4.UUCP (Edward C. Kwok) writes:
>Not very likely. Each volume of the United States Reports contains, on
>average, about 1,500 pages, and each page contains roughly 8,000
>characters. That makes about 12 Mbytes per volume. There are more than
>470 volumes, the last time I looked.

That's less than 6 Gigs (rough arithmetic below). I don't think that's an
unrealistic expectation for optical disks in 1998. Of course, by then,
there will be more volumes.

A set of 10 or so current 660Mb disks is still going to be a lot easier to
deal with than a wall full of large books, especially with indexing.
There'd be a main index on one disk that also contained the latest
volume(s) in progress; that one disk would be periodically updated.
Unfortunately, the Next computer will turn out to require two floptical
drives to be useful for this kind of heavy-duty archiving.

Can anyone give us realistic compression and indexing estimates? The
assumption that the two balance out is beginning to bother me.
--
Tim Maroney, Consultant, Eclectic Software, sun!hoptoad!tim
"Next prefers its X and T capitalized.  We'd prefer our name in lights in
Vegas." -- Louis Trager, San Francisco Examiner
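Some rough arithmetic on that, in C. The compression ratios are only
guesses, and indexing overhead is ignored, since that is exactly the
unknown in question.

#include <stdio.h>
#include <math.h>

/* Back-of-the-envelope figures from the posts above: 470 volumes of
 * roughly 1,500 pages x 8,000 characters each, spread over 660-Mbyte
 * discs. Compression ratios are guesses; indexing overhead is ignored.
 * Build with -lm.                                                      */
int main(void)
{
    const double volumes = 470.0;
    const double mbytes_per_volume = 1500.0 * 8000.0 / 1.0e6;   /* ~12  */
    const double disc_mbytes = 660.0;
    const double ratios[] = { 1.0, 2.0, 3.0 };
    const double total = volumes * mbytes_per_volume;           /* ~5640 */
    int i;

    printf("raw text: about %.0f Mbytes (%.2f Gbytes)\n",
           total, total / 1000.0);
    for (i = 0; i < 3; i++)
        printf("at %.0f:1 compression: about %.0f discs\n",
               ratios[i], ceil(total / ratios[i] / disc_mbytes));
    return 0;
}

Even uncompressed, the text alone fits on nine current discs; whether a
full-text index doubles that or merely fills the slack is the number
somebody with a real indexer would have to supply.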