saja@ujocs.joensuu.fi (Jorma Sajaniemi) (10/09/90)
I asked in comp.graphics about electronic document archiving systems that enable one to store documents as scanned images and to retrieve and display them on the screen. The main assumptions were that the documents will be, e.g., pictures and hand-written texts, and that the number of documents will be v e r y large.

Here is a summary of the answers I got. Thanks to everybody who replied.

Jorma Sajaniemi
University of Joensuu, Finland
Department of Computer Science
saja@ujocs.joensuu.fi

======================================================================
I know Kodak Australia sells such a system. If you are interested I could get you some contact names at Kodak.

Hiren Patel                      Phone ISD: +61 3 587 1444
Design Engineer                  Fax: +61 3 580 5581
Labtam Information Systems P/L   Telex: LABTAM AA33550
43 Malcolm Road                  Internet: hiren@labtam.oz.au
Braeside                         ACSNET/CSNET: hiren@labtam.oz
Victoria 3195                    ARPA: hiren%labtam.oz@uunet.uu.net
Australia                        JANET: labtam.oz!hiren@ukc
                                 UUCP: ...!uunet!munnari!labtam.oz!hiren

======================================================================
Sorry if this sounds like a product plug; however, this is an area in which Intergraph has spent a lot of time developing products. We currently sell hardware configurations based around servers, workstations, scanners and large optical disk jukeboxes. The software to manage large amounts of data is provided by NFM (Network File Manager) and DMANDS (Document Management And Distribution System).

Obviously the UK office cannot provide you with sales information; however, here is the phone number of the Finnish office: 804-554744

Nik Simpson                UUCP: uunet!ingr!swindon!st_nik!nik
Senior Systems Engineer.
Intergraph UK Ltd.

======================================================================
I spent a long time "pre-mastering" CD-ROMs. Typical issues were crammed to the full capacity of 650 Megabytes. Text was all shipped to the orient for formatted input.
Images were scanned with a monster scanner (something like 15 pages per minute of 2 bit graphics), and then stored on an "optical juke box". This beast stored 2 terabytes on 12 inch optical platters, and was controlled by its own node of a VAX cluster. This seems like a significant volume.

During a consultation with Diners Club, I got to know their optical system, which is a somewhat smaller version (to store the thousands of charge tickets that flow in). Both these systems are available commercially, but the reality is that a fast system can be assembled off the shelf.

Before beginning your project make certain that you have very realistic ideas about growth and acceptable speed of response. Both systems suffered mightily, and were upgraded in million dollar increments regularly (always a bit behind the actual need, however).

Please feel free to contact me about such systems.

Mark Richard-Fogg, principal designer
Fogg Design & Manufacturing Groups
Woodside House
1644 Emerson Street
Denver, CO 80218
(303) 839-9296 fax

======================================================================
I don't know about unix-based systems, but for Macintosh you could try Micro Dynamics (301) 589-6300 in Maryland (they are working on Sun versions) and for PC ViewStar (415) 841-8565.

Both of these systems use the Sony WORM JukeBox to hold 50*6.4 GigaBytes, and provide for keyword searching, OCR conversion, as well as storage of the scanned images.

Gary White
gwhite@inetg1.arco.com

======================================================================
The company I work for builds and sells a product that does something very similar to what you describe. We can store the images in an optical disk jukebox, with caching to ordinary magnetic hard disks. The image and indexing server system is Unix based.

What other information could I provide you?

...Chris Johnson            chris@c2s.mn.org   ..uunet!bungia!com50!chris
Com Squared Systems, Inc.   St. Paul, MN USA   +1 612 452 9522

======================================================================
Please contact Bill Turner at wrt@cornellc.cit.cornell.edu

The library system at Cornell is experimenting with digital preservation of deteriorating books. Xerox is providing equipment to digitize 1000 books. They are going to keep the pages as images, rather than turning them into text via optical character recognition.

Mike Oltz
MYK@cornella.bitnet

======================================================================
I just talked to somebody here at Stanford about this very subject. Talk to Andy Cargile at the Imaging Project in the Data Center. His phone number is (415) 725-0613, and email is gq.ajc@forsythe.stanford.edu. The Imaging Project is doing an evaluation of a system that is just going commercial, produced by Image Business Systems.

Currently, the IBS system is IBM-PC-based, using 3 main system elements: a server to hold images in compressed form (an IBM RT), a scan station (a PC hooked up to a scanner), and a print/FAX station (also a PC, hooked up to an HP ink jet).

From what I understand, the scan station has a software implementation of three compression algorithms: CCITT Group 3 and Group 4, and an IBM algorithm, MMR. The CCITT algorithms are what FAX machines use, hence the capability of the print station to send FAXes. According to Andy, the CCITT algorithms can reach best-case compression of a 1Meg image (one-bit) into 50-100K. (That must explain why FAXes can work so fast, even with a 9600 baud modem.) The server stores the images in compressed form, and a print station equipped with the proper decompression algorithm can view and print the image. The system currently runs under Microsoft Windows 2.86.

In addition, an image information database is under construction on Forsythe, using the Spires/PRISM database. Keywords and information about the images will be stored, with links to the images themselves on the server.
Also according to Andy, a Mac front-end is on the horizon.

I hope this helped!

Jesse Ellenbogen
elbow@jessica.stanford.edu

======================================================================
My company has a product which sounds e x a c t l y like what you want. Our distributor in Scandinavia is

Capture Technology AB
Box 81017
Hammarbyvagen 27 B
104 81 Stockholm
Sweden
Telephone 08-702 96 50
Fax 08-41 11 21

Contact name is Hans-Gosta Ricknell (sales director). I have passed your name to him.

Graham Underwood
graham@advent.co.uk

======================================================================
Jorma, we don't have any experience but we are moving in this direction. We produce about 30 million pages of material a year and this has become an incredible hassle when it comes to storage etc. Plus our design and manipulation costs are outrageous. Thus, we are attempting to move our entire design/development/storage and production systems into the electronic age.

We now use Sun platforms to do the visual design, and we have just been told that we'll be getting two Xerox 5090 reprographic systems that allow us to take the Sun output and get it all the way to the negative or plate stage; these machines will then also do the printing.

From what I understand of your problem, you want to get the input stage cleaned up better. We don't have anything at the moment, but we are trying to design and implement a system that would permit multimedia input capture and storage so that a variety of people can manipulate the information as and when they want to.

Kevin "auric" Crocker
Athabasca University
UUCP: ...!{alberta,ncc}!atha!kevinc
Inet: kevinc@cs.AthabascaU.CA

======================================================================
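Two figures quoted in the replies above lend themselves to a quick sanity check: the CCITT best case of a 1Meg one-bit image shrinking to 50-100K (sent over a 9600 baud modem), and the 50*6.4 GigaByte jukebox. The following is only a back-of-envelope sketch, not anything taken from the systems described. The unit conventions and the function names are assumptions made for illustration: "1Meg" is read as 2**20 bytes, "K" as 1024 bytes, "GigaByte" as 10**9 bytes, and the modem is treated as carrying 9600 bits per second with no protocol overhead.

```python
# Back-of-envelope arithmetic for figures quoted in the replies.
# Assumptions (not stated in the replies): "1Meg" = 2**20 bytes,
# "K" = 1024 bytes, "GigaByte" = 10**9 bytes, and a 9600 baud modem
# carries 9600 bits per second with no protocol overhead.

KIB = 1024
MIB = 1024 * KIB
GB = 10**9


def transfer_seconds(size_bytes, bits_per_second=9600):
    """Time to send size_bytes down the line, ignoring all overhead."""
    return size_bytes * 8 / bits_per_second


def pages_that_fit(store_bytes, page_bytes):
    """Whole compressed pages that fit in a store of store_bytes."""
    return store_bytes // page_bytes


raw_page = 1 * MIB                   # uncompressed one-bit image
best_case = (50 * KIB, 100 * KIB)    # quoted CCITT best-case sizes
jukebox = 320 * GB                   # 50 * 6.4 GigaBytes

print(f"uncompressed: {transfer_seconds(raw_page) / 60:.1f} min")
for size in best_case:
    print(f"{size // KIB}K page: {transfer_seconds(size):.0f} s, "
          f"{pages_that_fit(jukebox, size):,} pages per jukebox")
```

Under these assumptions an uncompressed page would tie up the line for roughly a quarter of an hour, against about 43 to 85 seconds for a best-case compressed page, and a full jukebox would hold on the order of three to six million such pages. Real pages compress less well than the best case, so these should be read as upper bounds.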