roy@phri.UUCP (Roy Smith) (10/25/87)
In article <641@zen.UUCP> vic@zen.UUCP (Victor Gavin) writes:
> I have been asked to write some software which can (given an image
> produced by the scanner) reproduce the original text of the paper in a
> machine readable form.

I don't know much about it, but a company called DEST markets a 300-dpi scanner for the Macintosh (and, I think, IBM-PC) for about $2k, including character recognition software. Unless your application has some special requirements, I would imagine getting one of these jobs would be a lot more cost-effective than writing your own software.

I've added comp.sys.mac to the Newsgroups line to see if anybody there has any experience with the DEST they could share. While I'm at it, can somebody compare and contrast the O($2k) scanners with the el-cheapo Thunderscan for me? What do the "real" scanners have going for them that I can't do with a Thunderscan?

--
Roy Smith, {allegra,cmcl2,philabs}!phri!roy
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016
oster@dewey.soe.berkeley.edu (David Phillip Oster) (10/25/87)
In article <2984@phri.UUCP> roy@phri.UUCP (Roy Smith) writes:
>In article <641@zen.UUCP> vic@zen.UUCP (Victor Gavin) writes:
>> from a scanner image reproduce the original text of the paper in a
>> machine readable form.
>can somebody compare and contrast the O($2k) scanners with the el-cheapo
>Thunderscan for me. What do the "real" scanners have going for them that I
>can't do with a Thunderscan?

Thunderscan offers very high quality scanning, at resolutions up to 300 dpi and up to 5 bits per pixel (32 grays). It can handle originals up to 15" wide (in a wide-carriage ImageWriter) and at least 32767 scan lines long. (I haven't actually tried anything longer than 11", but when it finishes, the "continue scan" button is still waiting to be pressed.)

However, it is slow (5 to 40 minutes, depending on resolution and size of original) and only works on single-sheet, thin, bendable material. (The material has to fit in the ImageWriter printer.) That means you'd do well to have a xerographic copier handy.

The expensive scanners are flatbed, copier-style machines, and do their work faster. (They can't be too much faster, though. It takes 15 minutes to send an 8"x10" page at 1 bit per pixel, 300 dpi, over a 9600 baud line if you do not use a compressing transfer protocol.)

Olduvai Software makes a line of software that parses scanned pages back into text. Either the current issue of MacUser has a review, or I saw it in a recent copy of MacWeek, but for < $200.00 you get a software package to do syntactic pattern recognition of letter features, to determine the ASCII for the scanned page.

It is still cheaper to hire a human typist, but soon the cost balance will flip the other way. (I expect that copy shops will offer a service: bring in your books and blank disks, and for a few cents a page, get them digitized to ASCII. (And won't that boost our needs for on-line storage (What, only 300 Gigabytes! How do you get by with such a small library?)))

(Note: I've directed followups to just comp.misc. If people want to continue this discussion, they can read it there.)

--- David Phillip Oster            --A Sun 3/60 makes a poor Macintosh II.
Arpa: oster@dewey.soe.berkeley.edu --A Macintosh II makes a poor Sun 3/60.
Uucp: {uwvax,decvax,ihnp4}!ucbvax!oster%dewey.soe.berkeley.edu
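The 15-minute figure quoted in the post above can be checked with a little arithmetic. The following is a minimal sketch in Python, not anything from the thread: the function names are made up for illustration, and the 10 line bits per byte is an assumption (8 data bits plus one start and one stop bit, a common asynchronous serial framing) that the post does not state.

```python
def page_bits(width_in, height_in, dpi, bits_per_pixel=1):
    """Raw, uncompressed size of a scanned page in bits."""
    return int(width_in * dpi) * int(height_in * dpi) * bits_per_pixel


def transfer_seconds(bits, baud, line_bits_per_byte=10):
    """Seconds needed to push `bits` of image data down a serial line.

    line_bits_per_byte=10 is an assumption: 8 data bits plus one start
    bit and one stop bit per byte sent.
    """
    data_bytes = bits / 8
    return data_bytes * line_bits_per_byte / baud


if __name__ == "__main__":
    bits = page_bits(8, 10, 300)         # 8"x10" page, 300 dpi, 1 bit per pixel
    secs = transfer_seconds(bits, 9600)  # 9600 baud, no compression
    print(f"{bits} bits, about {secs / 60:.1f} minutes")
```

Under these assumptions the page is 7,200,000 bits and takes roughly 15.6 minutes to send, in line with the estimate in the post. Without the framing overhead (8 line bits per byte) the same page would take 12.5 minutes.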
korn@apple.UUCP (Peter "Arrgh" Korn) (10/26/87)
In <21433@ucbvax.BERKELEY.EDU>, oster@dewey.soe.berkeley.edu.UUCP (David Phillip Oster) said:
>>In article <641@zen.UUCP> vic@zen.UUCP (Victor Gavin) writes:
>>> from a scanner image reproduce the original text of the paper in a
>>> machine readable form.
>
>...[discussion of the ThunderScan scanner]...
>
>The expensive scanners are flat bed, copier style machines, and do
>their work faster (can't be too much faster, though. It takes
>15 minutes to send an 8"x10" page at 1-bit per pixel 300dpi, over a
>9600 baud line if you do not use a compressing transfer protocol.)

That holds only if you assume that 9600 baud is the fastest they can transmit data. The Macintosh can accept data over its serial port at a rate that is quite a bit faster than that (56K baud easily, and AppleTalk is another 8 times faster than that). Also, most of the newer 'professional' scanners are using the SCSI port, which can get you a full page scanned and transmitted to the Mac's RAM, displayed on the screen eagerly awaiting the deftest commands of the user, in as little as 14 seconds (and perhaps even a second or two faster than that).

>Olduvai Software makes a line of software that parses scanned pages
>back into text. Either the current issue of MacUser has a review, or I
>saw it in a recent copy of MacWeek, but for < $200.00 you get a
>software package to do syntactic pattern recognition of letter
>features, to determine the ASCII for the scanned page.

Unfortunately their advertisements seemed to be a little ahead of their ability to deliver when I spoke with them about a month ago. I recall their saying something about it being at least Christmas before they would actually be shipping product--don't quote me on this last one, as the event happened fully 30 days ago. Nonetheless, after at least two months of advertising in MacUser their product wasn't anywhere near shipping when I called them.

>It is still cheaper to hire a human typist, but soon the cost balance
>will flip the other way.

I hope this happens soon. However, from my experience with character recognition, it won't happen for a little while yet. *If* all that you are scanning is 10 or 12 pitch mono-spaced Courier, Letter Gothic, or one of a small set of other fonts, then computer character recognition is a viable option for you that may well save you a lot of $$ vs. paying a typist to do it. However, to my knowledge, there exists no scanner anywhere that can properly deal with all types of proportionally spaced fonts at anything near acceptable accuracy (remember that 99.5% accuracy works out to 3 errors every typewritten page), let alone handle typeset text that is kerned (such as you find in the newspapers and books that you read).

Having spent the better part of 6 months selling these beasties, and going to school at a University that had one of the more expensive Kurzweil machines, I've become somewhat jaded by their promise. They seem to be much like expert systems--very good in a tightly controlled environment, but not very good beyond that.

>...
>
>(note, I've directed followups to just comp.misc. If people want to continue
>this discussion, they can read it there.)

Normally I would have respected this, and I have redirected all followups to this posting to comp.misc. But I felt that there's been enough interest, at least in comp.sys.mac, to correct some of the statements made about scanning speed and character recognition software in the forum in which they were made.

Peter
--
Peter "Arrgh" Korn
korn@apple.com
!hplabs!amdahl!apple!korn
"hi mom!"
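Two of the numbers in the post above lend themselves to a quick back-of-the-envelope check: the transfer time at a faster serial rate, and what a given recognition accuracy means in errors per page. This is a rough sketch in Python with hypothetical function names. The page size is the 8"x10", 300 dpi, 1 bit per pixel page discussed in the thread; the 10 line bits per byte and the characters-per-page counts are assumptions, not figures from the posts.

```python
# 8"x10" page at 300 dpi, 1 bit per pixel (the page discussed in the thread).
PAGE_BITS = (8 * 300) * (10 * 300)


def serial_seconds(bits, baud, line_bits_per_byte=10):
    """Seconds to send `bits` over a serial line.

    Assumes 10 line bits per data byte (8 data, 1 start, 1 stop).
    """
    return (bits / 8) * line_bits_per_byte / baud


def implied_bytes_per_second(bits, seconds):
    """Effective throughput needed to deliver `bits` in `seconds`."""
    return (bits / 8) / seconds


def expected_errors(chars_per_page, accuracy):
    """Average number of misread characters on one page."""
    return chars_per_page * (1.0 - accuracy)


if __name__ == "__main__":
    print(f"56K baud: {serial_seconds(PAGE_BITS, 56000) / 60:.1f} minutes")
    print(f"14 s page: {implied_bytes_per_second(PAGE_BITS, 14):.0f} bytes/s")
    # Characters per page are assumed values for illustration only.
    for chars in (600, 1500, 3000):
        print(chars, "chars:", round(expected_errors(chars, 0.995), 1), "errors")
```

At 99.5% accuracy, 3 errors per page corresponds to a page of about 600 characters; a denser page would see proportionally more errors, which only strengthens the point being made about accuracy.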
wew@naucse.UUCP (Bill Wilson) (10/28/87)
Flagstaff Engineering (602-523-6461) is currently writing character recognition software for scanners and PC's. You may want to give them a call.
kevinc@auvax.UUCP (Kevin Barry Crocker) (10/28/87)
In article <2984@phri.UUCP>, roy@phri.UUCP (Roy Smith) writes:
> In article <641@zen.UUCP> vic@zen.UUCP (Victor Gavin) writes:
> > I have been asked to write some software which can (given an image
> > produced by the scanner) reproduce the original text of the paper in a
> > machine readable form.
>
> I don't know much about it, but a company called DEST markets a
> 300-dpi scanner for the Macintosh (and, I think, IBM-PC) for about $2k,

This may not be relevant to all, but a recent issue of PC Magazine reviews both desktop publishing and scanners for the PC market. The issue is Volume 6 Number 17, October 13, 1987.

Now, I realize that for Mac users this may not be totally relevant, but some of these companies may make suitable software to make their product usable on the Mac, especially those that link to PageMaker. In fact, I seem to remember some vendors' products being touted for both markets.

ihnp4!alberta!auvax!kevinc (Kevin Crocker, Athabasca University)
Do our employers have opinions or is that what we get paid for!
cem@ihlpa.ATT.COM (45261-Malloy) (11/03/87)
In article <477@naucse.UUCP>, wew@naucse.UUCP (Bill Wilson) writes:
>
> Flagstaff Engineering (602-523-6461) is currently writing
> character recognition software for scanners and PC's.
> You may want to give them a call.

I called them a while back and they sent me a demo of their OCR software. The demo does everything that I wanted. I guess the $600 is a little redundant. Can anyone confirm this?

Clancy Malloy
ihlpj!cem