vandome@imag.UUCP (Gerard Vandome) (05/01/86)
I would like to clarify the definition of an "international character".

First, be careful with words such as: character, char, byte, integer, letter, string ... For example, SVID 1 states in GETC(BA_LIB) that "the function getc returns the next character (i.e., byte) ... " although its definition is:

	int getc(stream)

Secondly, with ISO 646 (US ASCII) no problems arise, because bytes and characters correspond one to one. The fact that the result is an integer and not an unsigned integer (as one might expect) allows the test for EOF (generally -1).

Consider the following problem in CONV(BA_LIB):

	int toupper(c)		(with int c)

which, when called in ISO 8859/1 with the character c = ll, for example, must return Ll in Spanish.

In an international version of UNIX, what should a "character" be?

	- with the ISO 8859/1 (Latin 1) code
	- with the CCITT (teletext) code, where a character may consist of a diacritical sign followed by a letter
	- with the JIS 6226 (Japanese) code, where a character occupies 2 bytes

QUESTIONS:

	- What is the size in bytes of a character?
	- Is that question a real question?
	- Are double letters such as "ij" in Dutch or "'e" in teletext code considered as one character?
	- Is an international character a signed or an unsigned character?

I will be pleased to receive your comments on this topic.

Pascal BEYLS
BULL France
EUNET : mcvax!vmucnam!echbull!xopen
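
To make the byte/character distinction above concrete, here is a minimal sketch in C. Both function names are hypothetical. The two-byte rule used by count_chars_2byte (a byte with the high bit set starts a 2-byte character) is an assumption made only for illustration and is not claimed to be the actual JIS 6226 encoding. The second function assumes EOF is -1 and that narrowing 0xFF to a signed char gives -1, which is common but not guaranteed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: count characters, not bytes, in a buffer.
 * Assumed convention (illustration only): a byte with the high bit
 * set starts a 2-byte character; any other byte is a 1-byte character.
 * A trailing high-bit byte with no partner is counted as one character. */
size_t count_chars_2byte(const unsigned char *buf, size_t nbytes)
{
    size_t i = 0, nchars = 0;

    assert(buf != NULL || nbytes == 0);
    while (i < nbytes) {
        if ((buf[i] & 0x80) && i + 1 < nbytes)
            i += 2;             /* lead byte plus its trailing byte */
        else
            i += 1;             /* single-byte character */
        nchars++;
    }
    return nchars;
}

/* Hypothetical helper: shows why getc returns int.
 * Returns 1 if a byte value, once stored in a signed char, compares
 * equal to EOF. With EOF == -1 (assumed), byte 0xFF collides. */
int narrowed_equals_eof(int byte)
{
    signed char narrowed = (signed char)byte;

    return narrowed == EOF;
}
```

Under these assumptions, a 4-byte buffer holding 'a', one 2-byte character and 'b' counts as 3 characters, and the 8-bit code 0xFF is indistinguishable from EOF once it is held in a signed char, whereas the int returned by getc keeps the two apart.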