sdyer@bbncca.ARPA (Steve Dyer) (02/17/84)
A friend of mine is beginning to look at computer-aided analysis of versification in Russian poets, and one of the first issues to arise is whether there is any standard way to represent the Cyrillic alphabet in 7- or 8-bit bytes. (Would this be called RuSKII?) It would be easy to assume some arbitrary mapping, but it seems preferable to hew to a standard, if one exists.

Thanks,
--
/Steve Dyer
{decvax,linus,ima}!bbncca!sdyer
sdyer@bbncca.ARPA
colonel@sunybcs.UUCP (George Sicherman) (02/20/84)
I don't know of a Russian alphabet mapping for ASCII, but IBM designed one for EBCDIC. If you don't mind forgoing lower-case, you can compose theirs with one of the accepted ASCII-EBCDIC maps. Each entry below gives the code, the ordinary graphic, and the Russian letter (transliterated):

00  space  space     10  &      K         20  -      -         30  0      0
01  A      A         11  J      L         21  /      /         31  1      1
02  B      B         12  K      M         22  S      F         32  2      2
03  C      V         13  L      N         23  T      KH        33  3      3
04  D      G         14  M      O         24  U      TS        34  4      4
05  E      D         15  N      P         25  V      CH        35  5      5
06  F      E         16  O      R         26  W      SH        36  6      6
07  G      ZH        17  P      S         27  X      SHCH      37  7      7
08  H      Z         18  Q      T         28  Y      Y         38  8      8
09  I      I         19  R      U         29  Z      M.Z.      39  9      9
0B  .      .         1B  $      LOZ.      2B  ,      ,         3B  #      YU
0C  <      I KR.     1C  *      *         2C  %      E OBO.    3C  @      YA
0E  +      +

Abbreviations: I KR. is short I (I with a breve); E OBO. is backwards E; M.Z. is Soft Sign; LOZ. is the lozenge symbol from the old IBM commercial BCD character set (sort of a concave-sided square). No Hard Sign or pre-revolutionary letters.

Alternative: adapt Russian Morse Code, which uses a more natural correspondence (with International Morse).

Col. G. L. Sicherman
...seismo!rochester!rocksvax!sunybcs!colonel
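For anyone who wants to experiment with the table above, here is a minimal sketch of it as a lookup in Python. It pairs each ordinary graphic with a Cyrillic letter; the specific Cyrillic characters are my reading of the transliterations (for example Y as yery and M.Z. as the soft sign), and the names `LATIN_TO_CYRILLIC`, `to_cyrillic`, and `to_latin` are made up for illustration. Digits and shared punctuation pass through unchanged, as do letters the table lacks (Hard Sign, lower case is folded to upper).

```python
# Ordinary graphic -> Cyrillic letter, following the table's second and
# third columns. The Cyrillic code points are an interpretation of the
# transliterations, not something the table itself specifies.
LATIN_TO_CYRILLIC = {
    "A": "А", "B": "Б", "C": "В", "D": "Г", "E": "Д", "F": "Е",
    "G": "Ж", "H": "З", "I": "И", "<": "Й",
    "&": "К", "J": "Л", "K": "М", "L": "Н", "M": "О", "N": "П",
    "O": "Р", "P": "С", "Q": "Т", "R": "У", "$": "\u25ca",  # lozenge
    "S": "Ф", "T": "Х", "U": "Ц", "V": "Ч", "W": "Ш", "X": "Щ",
    "Y": "Ы", "Z": "Ь", "%": "Э", "#": "Ю", "@": "Я",
}

# Reverse direction, for keying Russian text in on a Latin-only device.
CYRILLIC_TO_LATIN = {cyr: lat for lat, cyr in LATIN_TO_CYRILLIC.items()}


def to_cyrillic(text: str) -> str:
    """Reinterpret Latin-graphic text as Russian; unmapped characters pass through."""
    return "".join(LATIN_TO_CYRILLIC.get(ch, ch) for ch in text)


def to_latin(text: str) -> str:
    """Fold Russian text to upper case and replace each letter by its Latin graphic."""
    return "".join(CYRILLIC_TO_LATIN.get(ch, ch) for ch in text.upper())


if __name__ == "__main__":
    print(to_cyrillic("KIO"))
    print(to_latin("мир"))
```

Note that the correspondence follows code order rather than sound, so the Latin form of a Russian word is not readable as a transliteration; that is presumably why a Morse-based mapping would feel more natural.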
russell@cmcl2.UUCP (02/22/84)
#R:bbncca:-58900:cmcl2:4300002:000:837
cmcl2!russell Feb 21 17:49:00 1984

I have a very old chart from Honeywell that lists, as a sidebar, the following USSR standard for the representation of character data in 8 bits:

GOST 13052-67 defines the USSR set, shown in the lower row entry position, of columns 12-15. Actually, the standard defines the characters for columns 4-7 of a 7-bit set (SO=Russian register, SI=Latin register). Columns 8-11 are identical to 0-3.

The chart has a little box that reads:

    Reprints of this chart are available from the Honeywell Computer Journal (P.O. Box 6000, Phoenix, AZ 85005) at $1 each postpaid.

I imagine that the price (if the chart is available at all) has gone up. The chart has no date on it, but the latest ANSI standard mentioned is dated 1970. I would guess that you could write a letter to the Russian Trade Mission in NY or Washington and get the appropriate documents.

--
Bill Russell
UUCP: ...!floyd!cmcl2!russell
(212) 460-7292
InterNet: Russell@NYU.ARPA
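The SO/SI register scheme described above can be sketched in a few lines of Python. This is only an illustration of the shifting mechanism: SO (0x0E) selects the Russian register, SI (0x0F) returns to Latin, and while shifted, bytes in columns 4-7 are looked up as if they sat in columns 12-15 of an 8-bit set (that is, with the high bit set). I do not have the GOST table to hand, so the sketch borrows Python's built-in `koi8_r` codec as a stand-in for the actual letter layout; whether that matches GOST 13052-67 character for character is an assumption. The function name `decode_shifted` is hypothetical.

```python
SO = 0x0E  # shift out: switch to the Russian register
SI = 0x0F  # shift in: switch back to the Latin register


def decode_shifted(data: bytes, cyrillic_codec: str = "koi8_r") -> str:
    """Decode a 7-bit stream that uses SO/SI to switch registers.

    cyrillic_codec supplies the letter layout for columns 12-15; koi8_r is
    a stand-in here, not a verified copy of the GOST 13052-67 table.
    """
    out = []
    russian = False
    for b in data:
        if b == SO:
            russian = True
        elif b == SI:
            russian = False
        elif russian and 0x40 <= b <= 0x7F:
            # Columns 4-7 in the Russian register correspond to
            # columns 12-15 of the 8-bit set: same code with the high bit on.
            out.append(bytes([b | 0x80]).decode(cyrillic_codec))
        else:
            # Columns 0-3 (controls, digits, punctuation) are shared.
            out.append(chr(b))
    return "".join(out)


if __name__ == "__main__":
    print(decode_shifted(b"ab \x0eMIR\x0f cd"))
```

With the stand-in layout, the shifted bytes `MIR` come out as the Russian word with the same sound, which suggests why a phonetic arrangement is convenient on equipment that ignores the shift codes.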
ljdickey@watmath.UUCP (Lee Dickey) (03/19/84)
ECMA (European Computer Manufacturers Association) acts as the registrar for the International Standards Organization for the purpose of keeping the registry of character sets. There is a Cyrillic set and a Cyrillic Extension.

--
Lee Dickey, University of Waterloo. (ljdickey@watmath.UUCP)
...!allegra!watmath!ljdickey
...!ucbvax!decvax!watmath!ljdickey