ben@hpcvlx.cv.hp.com (Benjamin Ellsworth) (10/24/90)
Index Number: 11247

[Note from Bill McGarry: This was from issue 10.52 of the RISKS digest.]

> The system must have used some kind of voice-recognition algorithm,
> because no human typist that I know could have kept up with the
> speaker at times.

I very strongly doubt this. I would bet a substantial sum of money that there was a stenographer and not a computer capturing the words.

> The weakness of the voice-recognition system was made painfully
> obvious...

There is a RISK of assuming all failures are technologically induced. It could very well be that the stenographer hired was simply not very good. The good ones are expensive, and to do "real-time" stenography takes a good stenographer.

There is a plausible explanation involving computer RISKs, however. The translation from the steno notation to full English words was in all likelihood automated. In stenography there are a number of dialects (usually called theories). Some dialects, especially the older ones, are not particularly suitable for machine translation. There are also more than a few translation programs. Between stenographic dialects and computer translators there can be a significant compatibility problem. It could be that the stenographer was extremely capable in the courtroom (where the translations are done off-line by a human), while at the same time using a style/dialect/theory which was incompatible with the machine translator.

There has been an interesting interaction between technology and court recording in the last couple of decades. My mother, for instance, is in the process of re-learning her stenography in a computer-compatible dialect. It reminds me of pilots who have to learn to fly in a computer-compatible way (training around system weaknesses).

Benjamin Ellsworth
ben@cv.hp.com
All relevant disclaimers apply.
mfidelma@BBN.COM (Miles R. Fidelman) (10/24/90)
Index Number: 11248

[Note from Bill McGarry: This is from issue 10.53 of the RISKS digest.]

I've seen a talk where real-time transcription was provided by court stenographers. They used a version of a stenotype machine coupled to display software. Stenotype machines have phonetic keyboards, and their raw output looks very much like what is described here. In courtroom practice, a clean transcript is made later. In the talk I saw, some software provided partial on-the-fly cleanup, but nowhere near perfect.

Another reader comments that an ASL translator would be preferable. My own take is that for technical talks this real-time transcription seems better able to catch technical vocabulary.
rudy@mtqua.att.com (Avram R Vener) (10/26/90)
Index Number: 11272

In article <15124@bunker.UUCP>, ben@hpcvlx.cv.hp.com (Benjamin Ellsworth) writes:
> Index Number: 11247
>
> > The system must have used some kind of voice-recognition algorithm,
> > because no human typist that I know could have kept up with the
> > speaker at times.
>
> I very strongly doubt this. I would bet a substantial sum of money
> that there was a stenographer and not a computer capturing the words.

You would be right. At work I use a court reporter operating a StenoRAM (TM) which is attached to an Xscribe (TM) computer to give me real-time speech to text during meetings. Accuracy of translation is typically very good, usually better than 99.5 percent. However, this accuracy is dependent upon individualized dictionaries which each stenographer must create through practice while using the system. The computer program uses the dictionaries when converting the output of the StenoRAM into English.

The problem that often occurs in real-time steno to English translation is that a word or phrase is encountered which the computer 'almost' recognizes, and it uses the nearest match. This can result in often ludicrous translations. The trick is to have the court reporter learn any new vocabulary beforehand and incorporate it into the dictionary. This is not always possible in the case of captioned news programs. In my situation, such words only cause trouble once; then they are incorporated into the dictionaries and are correctly translated thereafter.

Rudy Vener
AT&T BTL MT 2D-509
uucp: att!mtqua!rudy
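The dictionary lookup and nearest-match behavior described in this post can be sketched roughly as follows. This is only a minimal illustration, not how the Xscribe software actually works: the steno outlines, the contents of `DICTIONARY`, the function names `translate` and `add_entry`, and the use of Python's `difflib` for nearest matching are all assumptions made for the example.

```python
import difflib

# Hypothetical personal dictionary: steno outline -> English word.
# These outlines are invented for illustration and do not follow
# any real stenographic theory.
DICTIONARY = {
    "RUB/BL": "rubble",
    "KAN/Z": "cans",
    "SEN/TOR": "senator",
}


def translate(outlines, dictionary, cutoff=0.6):
    """Translate a list of steno outlines into English text.

    An exact dictionary hit is used when available. Otherwise the
    closest known outline (by difflib similarity) is substituted,
    which is where a plausible-looking but wrong word can appear.
    If nothing is close enough, the raw outline is passed through.
    """
    words = []
    for outline in outlines:
        if outline in dictionary:
            words.append(dictionary[outline])
            continue
        near = difflib.get_close_matches(
            outline, list(dictionary), n=1, cutoff=cutoff
        )
        if near:
            # "Almost recognized": use the nearest entry's word.
            words.append(dictionary[near[0]])
        else:
            # Unknown outline: show the raw steno.
            words.append(outline)
    return " ".join(words)


def add_entry(dictionary, outline, word):
    """Teach the dictionary a new outline so it translates exactly."""
    dictionary[outline] = word


if __name__ == "__main__":
    personal = dict(DICTIONARY)
    # A new word whose outline is close to an existing entry.
    print(translate(["SEN/TAUR"], personal))
    add_entry(personal, "SEN/TAUR", "centaur")
    print(translate(["SEN/TAUR"], personal))
```

In this sketch the unfamiliar outline first comes out as the nearest existing word, and comes out correctly once the entry has been added, which mirrors the "only cause trouble once" experience described above.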
Jack.O'keeffe@f26.n129.z1.fidonet.org (Jack O'keeffe) (10/30/90)
Index Number: 11372

[This is from the Silent Talk Conference]

MR> I've seen a talk where real-time transcription was provided by
MR> court stenographers. They used a version of a stenotype machine
MR> coupled to display software. . . . . . . . . In the talk I saw,
MR> some software provided partial on-the-fly cleanup, but nowhere
MR> near perfect.

Real-time captioning of several sessions at the SHHH Little Rock convention was done by American Data Captioning (CaptionAmerica) of Pittsburgh. They also do captioning for NBC and others. The arrangement was the best I've seen. A TV camera at the rear of the room videographed the speaker's face, and this was projected on a large screen at the front to facilitate speechreading. Captions were keyed on a stenotype machine by Joe Karlovits (one of the partners in CaptionAmerica), translated into something very close to English on a small computer, and projected across the bottom of the screen.

The translation gaffes occurred when encountering words that were not in the computer's translation dictionary. Proper names and place names are frequently garbled, but this can easily be overcome if the caption recorder is given a list beforehand to add to the dictionary. Technical terms are another likely source of error.

There have been a few really classic errors that rank right up there with the "REPUBLICANS / RUBBLE CANS" from the Jimmy Carter talk. These happen regularly, even on the networks. Our visually impaired friends should appreciate one I saw where "RUMBLINGS" became "RUM BLINKS". But currently in first place in the gaffe hall of fame is one I viewed within the past week. The word "ABHORRENT" was transmuted to caption as "AB WHORE RENT".

MR> Another reader comments that an ASL translator would be
MR> preferable.

I cannot agree with that, since such a minuscule segment of the population, even of the deaf population, is fluent in ASL.

MR> My own take is that for technical talks this real-time
MR> transcription seems better able to catch technical vocabulary.

There is one other mildly disconcerting aspect, at least for the speaker. During my talk at Little Rock, I tried as always to establish eye contact with members of the audience, which included many speechreaders. Try as I might, they were all looking up and off to the left, not directly at me. Eventually I realized they were not watching me, but were watching my video image on the big screen with the captions underneath.

... Jack.

--
Uucp: ..!{decvax,oliveb}!bunker!hcap!hnews!129!26!Jack.O'keeffe
Internet: Jack.O'keeffe@f26.n129.z1.fidonet.org