R1TMARG%AKRONVM.BITNET@cornellc.cit.cornell.edu (Tim Margush) (11/18/89)
I know that the question about converting strings to upper case was asked from the Turbo Pascal perspective. Here at the Univ. of Akron, our introductory Pascal class uses an IBM mainframe. This means that the collating sequence for characters is a bit different from that on most other machines (EBCDIC vs. ASCII).

The conversion routines posted both relied on the codes for the characters a..z and A..Z being contiguous. On some EBCDIC systems, there are valid characters within these ranges that should not be converted in an upcase/lowcase operation.

This is something to consider for those writing programs that might be used in both environments. After all, isn't Pascal code completely portable?

---------------------------------------------------------------------
Tim Margush                          R1TMARG@AKRONVM.BITNET
Department of Mathematical Sciences  R1TMARG@VM1.CC.UAKRON.EDU
University Of Akron                  R1TMARG@AKRONVM.UAKRON.EDU
Akron, OH 44325                      (216) 375-7109
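
For illustration, here is a minimal sketch of the kind of range test the post warns about. The name NaiveUpper is hypothetical and is not taken from the routines posted earlier in the thread. The comment about '~' assumes the common EBCDIC layout (code page 037, as far as I know), where '~' has a code between 'r' and 's'.

```pascal
program NaiveUpcase;

{ Hypothetical example: treats every code from 'a' through 'z'
  as a lowercase letter. That holds in ASCII. In EBCDIC the
  letters are split into several runs, so other characters
  (assumed here to include '~') fall inside the range and
  would be altered by mistake. }
function NaiveUpper(ch: char): char;
begin
  if (ch >= 'a') and (ch <= 'z') then
    NaiveUpper := chr(ord(ch) - ord('a') + ord('A'))
  else
    NaiveUpper := ch
end;

begin
  writeln(NaiveUpper('q'));
  { Left alone on ASCII; inside the tested range on EBCDIC. }
  writeln(NaiveUpper('~'))
end.
```

On an ASCII system both calls behave as expected, which is why the problem tends to go unnoticed until the program is moved to a mainframe.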
balcer@jaguar (Marc J Balcer) (11/18/89)
R1TMARG%AKRONVM.BITNET@cornellc.cit.cornell.edu (Tim Margush) writes:
>...
>This is something to consider for those writing programs that might be used
>in both environments. After all, isn't Pascal code completely portable?

Not only is portability important, but why memorize the ASCII (or EBCDIC) tables? Here's a conversion that's rather character-set independent:

    function uppercase (ch: char) : char;
    { Returns the uppercase equivalent of the given character.
      (If ch is already uppercase or is not a letter, it returns
      the value of ch unchanged.) }
    begin
      if ch in ['a','b','c','d','e','f','g','h','i','j','k','l','m',
                'n','o','p','q','r','s','t','u','v','w','x','y','z']
        then uppercase := chr (ord(ch) + ord('A') - ord('a'))
        else uppercase := ch
    end;

The only assumption that this function makes is that the distance between every capital letter and its lowercase equivalent is the same. In other words,

    (ord('a')-ord('A')) = (ord('b')-ord('B')) = (ord('c')-ord('C')) = ...

I don't know of any character set (that has both capitals and lowercase) in which this is not true.

The set expression is ugly because EBCDIC has "holes" in its alphabetic range: there are non-alphabetic characters in between some of the alphabetic characters. (If you knew exactly where they are, you could probably shorten the expression.)

---------------------------------------------------------------------------
Marc J. Balcer [balcer@cadillac.siemens.com]
Siemens Research Center, 755 College Road East, Princeton, NJ 08540
(609) 734-6531
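
As a sketch of how the set expression might be shortened: in the EBCDIC layouts I am aware of, the lowercase letters sit in three unbroken runs (a..i, j..r, s..z), with the holes falling between the runs. Assuming that holds on the target system, and assuming the compiler supports sets over the full char type (Turbo Pascal does), three subranges cover the alphabet without picking up the characters in the holes. The same subranges are also valid in ASCII, where the whole alphabet is contiguous.

```pascal
program UpcaseDemo;

{ Same logic as the function above, with the 26-element set
  replaced by three subranges. Assumption: each subrange is
  contiguous in the character set in use, which is the case
  for ASCII and for the usual EBCDIC letter layout. }
function uppercase(ch: char): char;
begin
  if ch in ['a'..'i', 'j'..'r', 's'..'z'] then
    uppercase := chr(ord(ch) + ord('A') - ord('a'))
  else
    uppercase := ch
end;

begin
  { Letters on both sides of a hole, a non-letter, and a capital. }
  writeln(uppercase('i'), uppercase('j'), uppercase('~'), uppercase('Z'))
end.
```

This should print IJ~Z under either character set. The constant-distance assumption from the original function still applies unchanged.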