kateveni@ariadne.UUCP (11/09/85)
.TH LCG L "August 1984"
.UC 4
.SH NAME
LCG \- Latin-Coded Greek notation and filters
.br
*lcg* \- filters for conversion to and from LCG format
.SH SYNOPSIS
[\fB lcg2qtroff, lcg2vtroff, lcg2itroff, lcg2pc, lcg2pchex \fR] [ files ] ...
.br
[\fB pc2lcg, pchex2lcg \fR] [ files ] ...
.br
[\fB bin2hex, hex2bin \fR]
.SH DESCRIPTION
.PP
.B LCG
(``Latin-Coded Greek'') is a notation for writing Greek text
using Latin characters, in a ``phonetic'' fashion.
The definition of LCG is biased towards its use as \fBtroff\fR input.
.PP
\fBLcg2qtroff\fR, \fBlcg2vtroff\fR and \fBlcg2itroff\fR are
pre-\fBtroff\fR filters that take LCG text and convert it into the
corresponding escape-character sequences for the Greek letters
on the ``special'' font of troff.
.PP
Thus, for example, the input:
    English
    .G
    ElliinikA
.br
generates the output:
    English
    E\\(*l\\(*l\\(*y\\(*n\\(*i\\(*k
    \\v'-0.1m'\\h'0.32m'\\z\\'\\h'-0.32m'\\v'0.1m'\\(*a
.br
(where the last line generates an alpha with accent, and is, in reality,
a continuation of the previous line with no new-line in between).
.PP
.B LCG
follows the ``monotoniko'' (single-accent) system.
.PP
When invoked with no arguments, the
.B lcg2*
filters read from standard input.
When invoked with arguments, they treat them as file names
and read those files as input, in the order in which they are given.
All programs mentioned here send their output to the standard output.
Thus, typical uses may be as follows:
    lcg2vtroff textfile1 textfile2 | vtroff -ms
    lcg2vtroff textfile1 f2 f3 xyz | tbl | eqn | vtroff -ms
.PP
.B Lcg2qtroff
has been optimized for the QMS "LASERGRAFIX" printer (using \fBqtroff\fR).
.B Lcg2vtroff
is the same filter, except that it is adjusted for
the Varian Electrostatic Plotter (using \fBvtroff\fR)
(the accent marks must be adjusted differently
due to the different character heights and widths).
And
.B lcg2itroff
is again the same filter, adjusted for the Imagen Laser Printer
(using \fBitroff\fR) (that printer has no terminal-sigma character!).
.PP
.B Lcg2pc
is a filter that converts LCG text into ``extended-ASCII'' text for the
IBM personal computers (pc) that are being sold in Greece
(the ones that are available in the Cretan Research Center).
.B Pc2lcg
is the inverse filter.
.PP
The source of these filters is organized in such a way that it is easy
to define new codes and to compile the corresponding filters:
use the files
.B code.*.h
in the source-directory, and in particular the file
.B code.guide.h .
The inverse filters only work when the code for each Greek or Latin letter
is just a single byte (may be a full-8-bit byte).
.PP
Two additional filters are provided for the communication
between a VAX-UNIX and an IBM-pc.
Because that communication uses 7-bit bytes, the filters
.B bin2hex
and
.B hex2bin
can be used to convert between a full-8-bit-byte representation (bin)
and a hexadecimal representation (hex) where each original byte
is represented as a two-digit (two-byte) hexadecimal number.
The filters
.B lcg2pchex
and
.B pchex2lcg
are simple shell-scripts that specify pipe connections between
lcg2pc and bin2hex on one hand, and hex2bin and pc2lcg on the other hand.
.SH "GREEK/LATIN (CONVERT/NO-CONVERT) MODES"
.PP
During its operation, the
.B LCG
scanner can be in one of two possible modes:
    L    Latin-mode    copy input to output
    G    Greek-mode    convert input to output
.br
When in Latin mode, it copies its input -- unchanged -- to the standard output.
When in Greek mode, it treats its input as Greek text written with
Latin characters, parses it according to the lexical rules given below,
and sends the corresponding troff escape-sequences to the standard output.
The only exceptions are:
.br
(1) The \fBlcg\fR commands for mode/font change (see below).
.br
(2) Other lines that begin with a dot (period, ``.'') as their first character
(troff commands) are copied unchanged to the standard output,
regardless of the mode in which \fBlcg\fR is.
.PP
The
.B LCG
scanner starts executing in the \fILatin\fR mode.
Some specific character sequences in the input stream are recognized
as commands to the \fBlcg\fR scanner, for it to change mode.
When \fBlcg2*\fR read their input from multiple files, the mode
that is in effect at the end of a file is the mode in which
the next file starts being read.
The commands to change mode are shown below, together with
their effect as well as the output which they generate.
    INPUT     .ft G    .G    \\fG
    EFFECT    change to Greek-mode
    OUTPUT    none
    INPUT     .ft L    .L    \\fL
    EFFECT    change to Latin-mode
    OUTPUT    none
    INPUT     .ft R    .R    \\fR
              .ft B    .B    \\fB
              .ft I    .I    \\fI
    EFFECT    change to Latin-mode
    OUTPUT    echo input to output
    INPUT     .ft P    .ft   \\fP
    EFFECT and OUTPUT:
.RS
Restore the previous mode/font:
If the current mode is Greek, and if the last mode
(until the last mode/font change) was Latin,
then change to Latin mode and give no output.
If the current mode is Latin, then echo the input to the output
(i.e. change to previous R/B/I font), and, in addition,
if the last mode (until the last mode/font change) was Greek
then change to Greek mode.
.RE
These commands are patterned after the font-change commands of troff.
The ones that begin with a period must appear on a line by themselves,
while the ones that begin with a back-slash can appear ``in-line'',
just like in troff.
.PP
When in Greek mode, the
.B LCG
scanner does not recognize any ``in-line'' troff commands
other than the mode/font-change ones listed above.
If you need to use such commands, you should ``insulate'' them.
Example:
    kAti \\fL\\s+2\\fG spoudaIo \\fL\\s-2\\fG
.br
See the section ``BUGS'', for some more limitations of the
.B LCG
scanner.
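.PP
As a rough illustration of the mode rules in this section (it is not taken
from the source of the filters), the Python sketch below tracks the mode for
the commands that stand on a line by themselves.
The names ModeTracker, dot_command_font and tag_lines are hypothetical.
The sketch only tags each line as "copy" or "convert" instead of doing the
conversion, and it ignores the in-line forms (\\fG and so on).
One case is left open by the rules above: a restore command seen in Greek mode
when the previous mode was also Greek. The sketch assumes that nothing changes
and nothing is output in that case.

```python
LATIN, GREEK = "L", "G"


class ModeTracker:
    """Tracks the current scanner mode and the mode before the last change."""

    def __init__(self):
        self.mode = LATIN  # the scanner starts in Latin mode
        self.prev = LATIN  # mode in effect before the last mode/font change

    def apply(self, font):
        """Apply font letter G, L, R, B, I or P; return True if the command is echoed."""
        was_latin = self.mode == LATIN
        if font == "P":
            # Restore the previous mode; echoed only when currently in Latin mode.
            # Assumption: Greek with previous Greek stays Greek, with no output.
            target, echo = self.prev, was_latin
        elif font == GREEK:
            target, echo = GREEK, False
        elif font == LATIN:
            target, echo = LATIN, False
        else:
            # R, B, I: change to Latin mode and echo the command.
            target, echo = LATIN, True
        self.prev, self.mode = self.mode, target
        return echo


def dot_command_font(line):
    """Return the font letter of a mode/font command line, or None."""
    parts = line.split()
    if parts == [".ft"]:
        return "P"  # a bare .ft restores the previous font
    if len(parts) == 2 and parts[0] == ".ft" and parts[1] in ("G", "L", "R", "B", "I", "P"):
        return parts[1]
    if len(parts) == 1 and parts[0] in (".G", ".L", ".R", ".B", ".I"):
        return parts[0][1]
    return None


def tag_lines(lines):
    """Tag each emitted line as 'copy' (unchanged) or 'convert' (Greek text)."""
    tracker = ModeTracker()
    out = []
    for line in lines:
        font = dot_command_font(line)
        if font is not None:
            if tracker.apply(font):
                out.append(("copy", line))
        elif line.startswith(".") or tracker.mode == LATIN:
            # Other troff command lines are copied regardless of the mode.
            out.append(("copy", line))
        else:
            out.append(("convert", line))
    return out
```

For instance, tag_lines(["English", ".G", "ElliinikA"]) drops the .G line,
copies the first line, and marks the last one for conversion.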
.SH "LEXICAL RULES"
.PP
When in Greek mode, the
.B LCG
scanner parses its input into groups of 1, 2, 3, or 4 characters,
according to the list of recognized patterns that is given below.
The \fIlongest\fR pattern that matches the input at the current position
is chosen and converted into the corresponding output pattern.
Thus, for example, even though a ``t'' produces a ``tau''
and an ``h'' produces an ``eta'' when by themselves,
a ``th'' produces a ``theta''.
.B LCG
uses some context sensitivity in the cases of sigmas and accents --
see the table below.
.PP
The table with the recognized input patterns
(and the alternatives that some of them have)
and the corresponding interpretation follows:
    INPUT  (OR)      MEANING
 lower-case letters:
    a                alfa (atono -- no accent)
    v      b         biita
    g                gama
    d                delta
    e                epsilon (atono)
    z                ziita
    ii     h         iita (atono)
    th               thiita
    i                iwta (atono)
    k                kapa
    l                lamda
    m                mi
    n                ni
    x                xi (ksi, opws: xydi)
    o                omikron (atono)
    p                pi
    r                rw
    s  [ followed by a,...,z,A,E,H,I,O,Y,U,W or ' -- but not '' ]    sigma
    s  [ followed by anything else, including '' ]    terminal-sigma
    t                tau
    y      u         ypsilon (atono)
    f                fi
    ch               chi (opws: chioni)
    ps               psi (opws: psari)
    w                wmega (atono)
 upper-case letters (except for accents -- see below):
    A                A (ATONO)
    B      V         BIITA
    G                GAMA
    D                DELTA
    E                E (ATONO)
    Z                Z
    II     Ii  H     H (ATONO)
    TH     Th        THIITA
    I                IWTA (ATONO)
    K                K
    L                LAMDA
    M                M
    N                N
    X                XI (KSI, OPWS: XYDI)
    O                O (ATONO)
    P                PI
    R                RW
    S                SIGMA
    T                T
    Y      U         YPSILON (ATONO)
    F                FI
    CH     Ch        CHI (OPWS: CHIONI)
    PS     Ps        PSI (OPWS: PSARI)
    W                WMEGA (ATONO)
 When immediately preceded by a lower-case letter:
    A                alfa tonos (accent)
    E                epsilon tonos
    II     Ii  H     iita tonos
    I                iwta tonos
    O                omikron tonos
    Y      U         ypsilon tonos
    W                wmega tonos
 Other accents:
    'a               alfa tonos (accent)
    'e               epsilon tonos
    'ii    'h        iita tonos
    'i               iwta tonos
    'o               omikron tonos
    'y     'u        ypsilon tonos
    'w               wmega tonos
    'A               ALFA TONOS
    'E               EPSILON TONOS
    'II    'Ii  'H   IITA TONOS
    'I               IWTA TONOS
    'O               OMIKRON TONOS
    'Y     'U        YPSILON TONOS
    'W               WMEGA TONOS
 Dialytika:
    :i:              iwta dialytika
    :y:    :u:       ypsilon dialytika
    :'i:             iwta tonos dialytika
    :'y:   :'u:      ypsilon tonos dialytika
    :I:              IWTA DIALYTIKA
    :Y:    :U:       YPSILON DIALYTIKA
.SH "EXAMPLE"
    .LP
    This is an example of \\fBlcg\\fR text.
    .G
    .LP
    AutO eInai 'ena parAdeigma keimEnou \\fBlcg\\fP.
    .sp 3
    .ce 3
    SKOPOS TOY INSTITOYTOY PLIIROFORIKIIS
    TOY EREYNIITIKOY KENTROY KRIITIIS
    (apO to ProedrikO DiAtagma 'IdrysIIs tou)
    .PP
    SkopOs tou EreuniitikoU K'entrou KrIItiis eInai ('arthro 2)
    ``ii diexagwgII basikIIs, efarmosmEniis, kai technologikIIs 'ereunas,
    kai ii anAptyxii efarmogWn stous exIIs tomeIs technologiWn aichmIIs:....''
    .PP
    GiA to InstitoUto PliiroforikIIs ('arthro 3):
    ``... skopOs tou InstitoUtou autoU eInai ii 'ereuna, ii melEtii,
    kai ii ylopoIhsii systiimAtwn pliiroforikIIs
    pros 'ofelos tiis EthnikIIs OikonomIas kai tiis DiimOsias DioIkiisiis.''
    .L
    .sp 2
    .ce
    \\l'6i'
    .sp 2
    .TS
    center,box;
    c s
    l|l.
    .G
    TechnikII OrologIa:
    _
    mikroepexergastIIs	\\fLmicroprocessor\\fG
    olokliirwmEno kYklwma	\\fLintegrated circuit\\fG
    .TE
    .sp 3
    .LP
    EdW, s' autO to parAdeigma, 'echoume 'ena sIgma m' apOstrofo,
    enW edW: ``autOs'' 'echoume 'ena sIgma amEsws prin apO eisagwgikA
    pou kleInoun.
    O sarwtIIs (\\fLscanner\\fG) giA \\fBlcg\\fP katalabaInei mOnos tou,
    schedOn pAnta, pOte to sIgma eInai ``mesaIo'' kai pOte eInai ``telikO''.
    .LP
    K'ati 'allo pou thElei eidikII prosochII:
    oi lExeis pistopoi\\fGiitikO, no\\fGiimosYnii
    thEloun mEsa tous 'ena \\fL\\f\\fLG\\fG, an den thEloume
    na tis grApsoume: "pistopoihtikO", "nohmosYnii".
    .sp
    T'elos tou paradeIgmatos.
    .L
    .br
    End of the example.
.SH "SEE ALSO"
lcg, troff, qtroff, vtroff, itroff, tbl, eqn
.SH FILES
/usr/src/local/lcg/*              sources
.br
/usr/src/local/lcg/code.*.h       definitions of codes
.br
/usr/src/local/lcg/code.guide.h   guide for new codes
.br
/usr/local/                       objects
.SH AUTHOR
Manolis G.H. Katevenis, Institute of Computer Science,
Research Center of Crete, August 1984.
.SH BUGS
.PP
When in Greek mode, it does not recognize in-line troff commands
(troff commands that begin with back-slash):
it will convert them to Greek, i.e. it will destroy them.
Exception: the mode/font-change commands.
.PP
It does not recognize input-file diversions with the command:
.ce
    .so filename
.PP
Also, it does not recognize text intended for processing by EQN,
nor the table-formatting instructions to TBL.
Again, it will convert them to Greek, thus destroying them.
.PP
It does not recognize the arguments of troff commands, like, for example:
    .ds LF "InstitoYto PliiroforikIIs KrIItiis"
.br
and thus, it will not transform them into Greek.
.PP
The commands which ``restore the previous mode/font'' try to do
what you would expect them to do, and also to leave Latin text that uses them
(and was written ignoring \fIlcg\fR) as unmodified as possible.
However, it is not clear that they succeed in doing so.
Also, they are not completely tested.
.PP
It chooses the wrong kind of sigma (terminal instead of "mesaio", i.e. medial)
in the case of words that are truncated,
where a period is used to indicate the truncation.
Example: "To mAthiima Fys. IV ascholeItai me..."
(instead of "FysikII IV").
.PP
Send other bugs to: ariadne!kateveni
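.PP
The longest-match rule of the section ``LEXICAL RULES'' and the sigma
limitation listed above can be reproduced with a small Python sketch.
This is an illustration under stated assumptions, not the scanner's actual
code: the names PATTERNS, sigma_kind and scan are hypothetical, the pattern
table is only a small subset of the full table, the output is the name of
each letter instead of a troff escape sequence, and accents are not handled.

```python
# Small, illustrative subset of the pattern table (letter names as output).
PATTERNS = {
    "a": "alfa", "e": "epsilon", "i": "iwta", "k": "kapa", "o": "omikron",
    "t": "tau", "h": "iita", "ii": "iita", "th": "thiita",
    "y": "ypsilon", "u": "ypsilon", "f": "fi", "F": "FI",
}
MAX_LEN = max(len(p) for p in PATTERNS)

# Characters after which an "s" is a medial sigma, per the lexical rules.
SIGMA_FOLLOWERS = set("abcdefghijklmnopqrstuvwxyz") | set("AEHIOYUW")


def sigma_kind(text, pos):
    """Classify the 's' at text[pos] by looking at what follows it."""
    nxt = text[pos + 1:pos + 2]
    if nxt in SIGMA_FOLLOWERS:
        return "sigma"
    # A single quote keeps the sigma medial, but '' (closing quotes) does not.
    if nxt == "'" and text[pos + 2:pos + 3] != "'":
        return "sigma"
    return "terminal-sigma"


def scan(text):
    """Convert Greek-mode text using the longest pattern at each position."""
    out = []
    pos = 0
    while pos < len(text):
        if text[pos] == "s":
            out.append(sigma_kind(text, pos))
            pos += 1
            continue
        # Try the longest candidate first, then shorter ones.
        for length in range(MAX_LEN, 0, -1):
            chunk = text[pos:pos + length]
            if chunk in PATTERNS:
                out.append(PATTERNS[chunk])
                pos += len(chunk)
                break
        else:
            # Not in this subset: copy the character through unchanged.
            out.append(text[pos])
            pos += 1
    return out
```

With this sketch, scan("th") yields a single thiita while scan("t") and
scan("h") yield tau and iita, and scan("Fys.") ends in a terminal sigma
because the period is not one of the characters that keep the sigma medial.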