karl@haddock.ISC.COM (Karl Heuer) (10/21/87)
Several posters have remarked that certain algorithms would be much simpler if the natural language being processed by them had been designed sensibly. This remark usually draws a reply like "The software must change to meet the needs of the users. Computers are the servant of Man, not the other way around. It is absurd to suggest that a society should change its alphabet.".

It's true that a computer is a tool. What nobody seems to have noticed is that *natural language is also a tool*. The alphabet is the servant of Man, not the other way around; thus it is appropriate to suggest that it should evolve to meet Man's changing needs.

I learned from the textbooks that English has certain rules concerning whether punctuation goes inside or outside of quotes. As a computer user, I regularly break these rules and instead apply a more sensible one: the punctuation goes inside if and only if it is part of the text being quoted. If the text being quoted is input to a computer, this can be critical; but I do this even with straight English. We who follow this convention are figuratively rewriting the textbooks.

If it is painful to adapt the software to handle the peculiarities of certain languages/alphabets (I have in mind Chinese, Japanese, and to a lesser extent the accented letters of some European languages, and to some extent English), then it is reasonable to consider the possibility that the language/alphabet should change instead of the software. I am not saying that the former *must* be the one to change, only that it should be considered. I recognize that there's a lot of inertia to overcome, but might not the benefits be worth it?

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
Disclaimer: The word "Man" in the above denotes the entire species.
---> Followup cautiously; this article was cross-posted! <---
dik@cwi.nl (Dik T. Winter) (10/21/87)
In article <1446@haddock.ISC.COM> karl@haddock.isc.com (Karl Heuer) writes:
> The alphabet is the servant of Man,
> not the other way around; thus it is appropriate to suggest that it should
> evolve to meet Man's changing needs.
> If it is painful to adapt the software to handle the peculiarities of certain
> languages/alphabets (I have in mind Chinese, Japanese, and to a lesser extent
> the accented letters of some European languages, and to some extent English),
> then it is reasonable to consider the possibility that the language/alphabet
> should change instead of the software. I am not saying that the former *must*
> be the one to change, only that it should be considered. I recognize that
> there's a lot of inertia to overcome, but might not the benefits be worth it?

Oh yes, pi is about 3.1103.
I do not understand you ask?
Well it is clear, we use base 8.
Oh, you ask, why not base 16?
Mm, are our computers using different alphabets?
--
dik t. winter, cwi, amsterdam, nederland
INTERNET : dik@cwi.nl
BITNET/EARN: dik@mcvax
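The figure quoted above can be checked with a few lines of arithmetic. This is only a sketch: `to_base` is a hypothetical helper, not a library function, and it truncates rather than rounds.

```python
import math

DIGITS = "0123456789abcdef"

def to_base(x: float, base: int, places: int) -> str:
    """Truncated expansion of x in the given base (assumes 0 <= x < base)."""
    whole = int(x)
    frac = x - whole
    out = []
    for _ in range(places):
        frac *= base          # shift the next digit in front of the point
        digit = int(frac)
        out.append(DIGITS[digit])
        frac -= digit
    return DIGITS[whole] + "." + "".join(out)

print(to_base(math.pi, 8, 4))   # 3.1103
print(to_base(math.pi, 16, 4))  # 3.243f
```

The same quantity gets a different spelling in each base, which is the analogy the post is drawing.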
gls@odyssey.ATT.COM (g.l.sicherman) (10/22/87)
> It's true that a computer is a tool. What nobody seems to have noticed is
> that *natural language is also a tool*. The alphabet is the servant of Man,
> not the other way around; thus it is appropriate to suggest that it should
> evolve to meet Man's changing needs.

What Karl says is true, but it might be well to distinguish the alphabet from natural language. Alphabetic writing is highly unnatural. Historically, when alphabets were introduced they proved as revolutionary as computers are proving now. Remember the story of Cadmus and the dragon's teeth?

> I learned from the textbooks that English has certain rules concerning whether
> punctuation goes inside or outside of quotes. As a computer user, I regularly
> break these rules and instead apply a more sensible one: ...

Naturally. Programmers cannot afford to position their "punctuation marks" wherever they will look best. Move the semicolon outside the right brace and you have a syntax error.

With truly "natural" language--that is, speech--the problem does not arise. As McLuhan says, Shakespeare never heard a grammatical error. (There were none!)
---
"No more `mutiny'!" --A. Razaf, "Christopher Columbus"
--
Col. G. L. Sicherman
...!ihnp4!odyssey!gls
jim@hpiacla.HP.COM (Jim Rogers) (10/23/87)
The concept that "natural languages" are tools has merit. The concept that these languages should be standardized to simplify the life of computer programmers is ludicrous.

Each local language has local customs, history, and even thought patterns deeply embedded in its fabric. Every time a local language is replaced by another language, an irreplaceable piece of human creativity and knowledge is lost. Language is more than just the way governments control their populace. Language is the fundamental basis for all human communication.

The basenote made reference to the "inertia" involved in scrapping all "natural languages" in favor of a single standard language. I say that this is not only impractical but also undesirable. Would we re-write all literature for the simple purpose of making life easier for computer programmers? When this is done, how many different versions of the "natural language" parsing tools would be built? I would guess there would be at least one for every version of every computer language in use now or in the future.

Why not first invent a standardized computer environment which is used by all hardware? This would include operating systems, file systems, networking, and, of course, computer languages. After all, only a small percentage of the residents of this planet are computer programmers. It should be much easier to create a standard acceptable to the smaller group than to create one acceptable to the general population of the planet.

The answer to that question is obvious. There are many different computer environments for many reasons. Most of those reasons reduce to the fact that a given computer environment has been designed and developed to meet a specific set of needs. I have not yet met the genius who has designed a computer environment which meets all the critical needs of all users.

Am I saying that standards are impossible? No. I am saying that there are (and must be) a finite set (containing more than one member) of standards which will be developed to meet the known needs of the computing community. New research and design in areas not covered by standards, or in methods not accepted by standards, will (and must) continue. As capabilities, expectations, and needs change, the standards must change. The only constant is change, and even that happens at varying rates.

Jim Rogers
hmj@tut.fi (Matti J{rvinen) (10/26/87)
In article <1446@haddock.ISC.COM> karl@haddock.isc.com (Karl Heuer) writes:
> The alphabet is the servant of Man,
> not the other way around; thus it is appropriate to suggest that it should
> evolve to meet Man's changing needs.
> If it is painful to adapt the software to handle the peculiarities of certain
> languages/alphabets (I have in mind Chinese, Japanese, and to a lesser extent
> the accented letters of some European languages, and to some extent English),
> then it is reasonable to consider the possibility that the language/alphabet
> should change instead of the software. I am not saying that the former *must*
> be the one to change, only that it should be considered. I recognize that
> there's a lot of inertia to overcome, but might not the benefits be worth it?

Is this a joke or are you really stupid enough to be serious?
Those so called "accented" letters are very important in some languages.
How would you change alphabets using them as separate letters?
If { refers to a with dots (umlaut a), I may write two Finnish words
        valittaa and
        v{litt{{
having meanings "mourn" and "deliver". So, replacing { with a can not be done. Letter e can be after letters a or o, so replacing { with ae can not be done.

Finnish is written as it is spoken. Every letter has only one way to pronounce it. If you drop letters off, how would you write words containing those letters?

This all is (partially) true for several languages (e.g. Swedish and German).
KEEP YOUR NASTY FINGERS OUT OF OUR ALPHABET AND FIX YOUR PROGRAMME(R)S!!
--
Hannu-Matti Jarvinen, Tampere University of Technology, Finland
Project EAST - European Advanced Software Technology
hmj@tut.fi, hmj@tut.uucp, hmj@tut.funet (tut.ARPA is not the same computer).
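The two objections in the post above can be sketched in a few lines, assuming the `{` of the 7-bit posting stands for ä. The helper names `strip_diaeresis` and `to_digraphs` are hypothetical, and the last comparison uses bare letter sequences rather than real Finnish words.

```python
def strip_diaeresis(word: str) -> str:
    """Replace ä/ö with the bare vowels a/o (lossy)."""
    return word.translate(str.maketrans({"ä": "a", "ö": "o"}))

def to_digraphs(word: str) -> str:
    """Replace ä/ö with the digraphs ae/oe."""
    return word.replace("ä", "ae").replace("ö", "oe")

mourn = "valittaa"    # glossed as "mourn" in the post
deliver = "välittää"  # glossed as "deliver" in the post

# Dropping the dots makes two distinct words identical.
print(strip_diaeresis(deliver) == mourn)       # True

# The digraph scheme cannot tell the letter ä from an a followed by an e.
print(to_digraphs("ä") == to_digraphs("ae"))   # True
```

Either mapping discards information that the original spelling carried, so neither can be undone without outside knowledge of the vocabulary.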
john@frog.UUCP (John Woods, Software) (10/27/87)
In article <365@zuring.cwi.nl>, dik@cwi.nl (Dik T. Winter) writes:
>In article <1446@haddock.ISC.COM> karl@haddock.isc.com (Karl Heuer) writes:
>> The alphabet is the servant of Man,
>>not the other way around; thus it is appropriate to suggest that it should
>>evolve to meet Man's changing needs.
>
> Oh yes, pi is about 3.1103.
> I do not understand you ask?
> ...etc...

Fie on you. Languages are constantly evolving to meet the needs of those using them (except, perhaps, for CERTAIN languages with governmental bodies created to ensure permanent ossification... :-). English, for instance, dropped grammar-coding endings many centuries ago, mostly because of the difficulties people encountered in trying to reconcile differing sets of endings (thanks to the recent Norse invaders, etc.) (there is a PBS series, and a corresponding book, "The Story of English", that tells of this and many more things, in quite an entertaining style).

Some believe that humans walk upright because of evolving to better use tools. Perhaps you feel this was a mistake, and that sticks should have been designed to be used while knuckle-walking... :-)

(Note, I don't necessarily feel that alphabets must, or even should, change because of inadequacies of computers. It's still an idea worth contemplating, however.)
--
John Woods, Charles River Data Systems, Framingham MA, (617) 626-1101
...!decvax!frog!john, ...!mit-eddie!jfw, jfw@eddie.mit.edu
"Cutting the space budget really restores my faith in humanity. It eliminates dreams, goals, and ideals and lets us get straight to the business of hate, debauchery, and self-annihilation." -- Johnny Hart
karl@haddock.ISC.COM (Karl Heuer) (10/27/87)
In article <190001@hpiacla.HP.COM> jim@hpiacla.HP.COM (Jim Rogers) writes:
>The concept that "natural languages" are tools has merit. The concept that
>these languages should be standardized to simplify the life of computer
>programmers is ludicrous.

Actually, not so much the programmers as the users. There are a lot more of the latter.

>Each local language has local customs, history, and even thought patterns
>deeply embedded in its fabric.

I don't think Spanish would be impoverished if "ch" were to be sorted as two letters instead of one, nor do I think Spanish-speaking people would be losing a significant part of their cultural heritage if they straightened this out. (Just to pick one example. Btw, an equally valid "fix" would be to make it a single letter, with its own ASCII value and everything.)

>The basenote made reference to the "inertia" involved in scrapping all
>"natural languages" in favor of a single standard language.

I did *not* suggest a single standard language. I didn't even ask for a single standard alphabet, although that would solve a lot of problems. I merely suggested that if the lexical warts get in the way, it's possible that they'll get removed rather than avoided.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
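The "ch" example above comes down to what a sort key looks like. The following is a minimal sketch, assuming a collation in which "ch" is a single letter ordered between "c" and "d"; the `traditional_key` helper and the word list are illustrative only, and other digraphs and accented vowels are deliberately left out.

```python
# Letter order with "ch" treated as one letter after "c" (assumption for this sketch).
LETTERS = "a b c ch d e f g h i j k l m n ñ o p q r s t u v w x y z".split()
RANK = {letter: i for i, letter in enumerate(LETTERS)}

def traditional_key(word: str) -> list[int]:
    """Sort key that consumes 'ch' as one letter; unknown characters sort last."""
    word = word.lower()
    key = []
    i = 0
    while i < len(word):
        if word[i:i + 2] == "ch":
            key.append(RANK["ch"])
            i += 2
        else:
            key.append(RANK.get(word[i], len(LETTERS)))
            i += 1
    return key

words = ["dama", "chico", "cuna", "cosa"]
print(sorted(words))                       # ['chico', 'cosa', 'cuna', 'dama']
print(sorted(words, key=traditional_key))  # ['cosa', 'cuna', 'chico', 'dama']
```

Sorting "ch" as two letters needs no key function at all, which is the simplification the post has in mind; giving it its own code point would make the one-letter ordering equally simple.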
edwards@uwmacc.UUCP (mark edwards) (10/27/87)
In article <1890@frog.UUCP> john@frog.UUCP (John Woods, Software) writes:
>In article <365@zuring.cwi.nl>, dik@cwi.nl (Dik T. Winter) writes:
>>In article <1446@haddock.ISC.COM> karl@haddock.isc.com (Karl Heuer) writes:
>>> The alphabet is the servant of Man,
>>>not the other way around; thus it is appropriate to suggest that it should
>>>evolve to meet Man's changing needs.
>Fie on you. Languages are constantly evolving to meet the needs of those
>using them (except, perhaps, for CERTAIN languages with governmental bodies
>created to ensure permanent ossification... :-). [...]
>(Note, I don't necessarily feel that alphabets must, or even should, change
>because of inadequacies of computers. It's still an idea worth contemplating,
>however.)

Well, time for my two cents.

There are historical cases of countries changing other countries' alphabets. Consider the countries of Indochina. They used to use Chinese characters. After France had its way, the connection between China and the countries of Indochina became historical. Imagine having to learn a different language just in order to read your country's history book in original form.

Now to take the argument of changing alphabets to meet the needs of computers. If people think that is a reasonable argument, then I suggest we go further and change natural language to meet the needs of the computer. After all, the benefits of having 100% computer understanding of everything we say and write and do are just astronomical.

We would have to mold our speaking to speak in simple, very distinct, and very unambiguous words and sentences. The sentences we would speak would be precise and very logical. Each sentence would have a determinable boolean qualifier. It would be either true or false; "maybe" causes too many problems. Words like cute or pretty, or hot and cold, would disappear, because what is cute to one person may not be to another. That is what we computer people call inconsistent. A computer can not have inconsistent rules. Language would become boring, life would become dull.

Well, unfortunately, or should I say fortunately, computer people will not get their way in this one. There are domains where the methods must change to meet the needs of the computer. Language is not one of these domains. The computer/programs will have to change to meet the needs of the humans. And that includes characters in alphabets also.

mark
--
edwards@vms.macc.wisc.edu
{allegra, ihnp4, seismo}!uwvax!uwmacc!edwards
UW-Madison, 1210 West Dayton St., Madison WI 53706
joe@haddock.ISC.COM (Joe Chapman) (10/29/87)
Mr Karl Heuer:
"The alphabet is the servant of Man, not the other way around; thus it is appropriate to suggest that it should evolve to meet Man's changing needs."

Hra Hannu-Matti Jarvinen:
"KEEP YOUR NASTY FINGERS OUT OF OUR ALPHABET AND FIX YOUR PROGRAMME(R)S!!"

We've probably hammered this topic to death (at least in sci.lang) but it seems to me rather silly to lump languages such as Finnish, which have a few non-ascii characters, in with languages such as Chinese. A proposal to truncate alphabets in the former group seems to me, at the risk of seeming unpatriotic, a peculiarly American sort of chauvinism. Whether the trend towards simplification of ideographic languages---for example, official simplified characters and the use of pinyin in Chinese, and the disuse of non-toyo list kanji in Japanese---reflects a natural evolution in the language or another instance of Western-influenced information-processing arrogance is anyone's guess.

The more interesting assertion to me is the notion that language is simply another tool which can be altered to suit societal needs, as opposed to something people and societies find themselves in the midst of. Granted, minor changes in the fabric of language can be made by governments and individuals, but one simply has to wait for the fundamental process of signification to change. This is a topic that can probably only be argued in French; any comments?
--
Joe Chapman
harvard!ima!joe
gls@odyssey.ATT.COM (g.l.sicherman) (10/29/87)
> Finnish is written as it is spoken. Every letter has only one way to
> pronounce it. If you drop letters off, how would you write words
> containing those letters?

In English we use combinations like "th" and "ch." It works fine unless you insist on phonetic spelling.

> This all is (partially) true for several languages (e.g. Swedish and German).
> KEEP YOUR NASTY FINGERS OUT OF OUR ALPHABET AND FIX YOUR PROGRAMME(R)S!!

This seems awfully possessive. He who steals my alphabet, steals trash. I can always invent a new one!

By the way, I have yet to see a standard that accommodates the Seuss postzetals: yuzz, wum, um, humph, ...
---
"No matter where you go, there you are ... except that when you're on the phone, you're nowhere." --Ollaroo MacNoonzai
--
Col. G. L. Sicherman
...!ihnp4!odyssey!gls
lisper@yale.UUCP (Bjorn Lisper) (10/31/87)
In article <1924@kuukkeli.tut.fi> hmj@kuukkeli.UUCP (Hannu-Matti J{rvinen) writes:
>In article <1446@haddock.ISC.COM> karl@haddock.isc.com (Karl Heuer) writes:
>> The alphabet is the servant of Man,
>> not the other way around; thus it is appropriate to suggest that it should
>> evolve to meet Man's changing needs.
>> If it is painful to adapt the software to handle the peculiarities of
>> certain languages/alphabets (I have in mind Chinese, Japanese, and to a
>> lesser extent the accented letters of some European languages, and to some
>> extent English), then it is reasonable to consider the possibility that the
>> language/alphabet should change instead of the software. I am not saying
>> that the former *must* be the one to change, only that it should be
>> considered. I recognize that there's a lot of inertia to overcome, but
>> might not the benefits be worth it?
>
>Is this a joke or are you really stupid enough to be serious?
>Those so called "accented" letters are very important in some languages.
>How would you change alphabets using them as separate letters?
>If { refers to a with dots (umlaut a), I may write two Finnish
>words
> valittaa and
> v{litt{{
>having meanings "mourn" and "deliver". So, replacing { with a can not be
>done. Letter e can be after letters a or o, so replacing { with ae
>can not be done.
>
>Finnish is written as it is spoken. Every letter has only one way to
>pronounce it. If you drop letters off, how would you write words
>containing those letters?
>
>This all is (partially) true for several languages (e.g. Swedish and German).
>KEEP YOUR NASTY FINGERS OUT OF OUR ALPHABET AND FIX YOUR PROGRAMME(R)S!!
>

There are, of course, sometimes ways to transcribe "nonstandard" characters to "standard" (with regard to the English character set) characters that are unambiguous. The three special Swedish letters a-with-circle, a-with-dieresis and o-with-dieresis, for instance, have the transcriptions aa, ae and oe, respectively. Thus all Swedish words can really be transcribed to "English" form. Context will then decide whether for instance "oe" means o-with-dieresis or o followed by e.

But why did the Swedish character set include these extra characters in the first place? The answer is that Swedish has more vowels than can be expressed with ordinary Latin characters without resorting to constructions like those above, and these "extra" vowels ARE AS IMPORTANT AS THE OTHERS for the meaning of the words and should not be treated differently; thus they deserve characters of their own. This is also economical, since these vowels are frequent in Swedish and "single character codes" for them save work and space. Another aspect is that according to the Swedish pronunciation rules "oe" should really be pronounced as "o" followed by "e", so the usage of this for o-with-dieresis would clutter the Swedish pronunciation rules with "unswedish" exceptions.

The alphabet is certainly the servant of Man; a national alphabet is especially the servant of the people of the nation in question.

Bjorn Lisper (for ignorant anglosaxons)
Bjoern Lisper (for somewh
larry@sgistl.SGI.COM (Larry Autry) (11/01/87)
In article <348@odyssey.ATT.COM>, gls@odyssey.ATT.COM (g.l.sicherman) writes:
>
> In English we use combinations like "th" and "ch." It works fine unless
> you insist on phonetic spelling.

When other languages such as Chinese, Japanese, and Polynesian are Anglicized, they appear to be spelled according to rules similar to the Spanish rules of pronunciation. Am I mistaken? If more languages, even English, were to adopt at least similar guidelines, a large gap would close.
--
Larry Autry
larry@sgistl.sgi.com
or
{ucbvax,sun,ames,pryamid,decwrl}!sgi!sgistl!larry
karl@haddock.UUCP (11/02/87)
In article <1924@kuukkeli.tut.fi> hmj@kuukkeli.UUCP (Hannu-Matti J{rvinen) writes:
>Is this a joke or are you really stupid enough to be serious?

Neither. Please note that I did not make any specific proposals for changing any alphabet. My article can be summarized as "alphabets are not immutable"; it was a rebuttal to previous articles which seemed to implicitly assume the opposite.

To forestall accusations of American chauvinism, let me concentrate on English (which my article also mentioned). English words include two non-letters, hyphen ("-") and apostrophe ("'"). Let's look at the latter.

It's been several years since I've seen the word "Halloween" spelled with an apostrophe. Many traffic lights say "DONT WALK". So many people confuse "its" and "it's" that they might as well be alternate spellings of each other.

Given the above, and the collation problem caused by apostrophe, I would consider it possible (not necessarily desirable) that American English may soon drop the use of apostrophe, at least in some contexts. This would create some collisions; I would guess that the existing words "cant" and "wont" (but not "shell") would probably be dropped from the language, just as "quean" disappeared after the Great Vowel Shift made it a homonym for "queen".

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
Followups to sci.lang only.
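The collisions mentioned above are easy to enumerate mechanically. This is a small sketch under the assumption that "dropping the apostrophe" means deleting the character outright; the word list and the helper names `drop_apostrophes` and `collisions` are illustrative only.

```python
from collections import defaultdict

def drop_apostrophes(word: str) -> str:
    """Spell a word the way an apostrophe-free English would."""
    return word.replace("'", "")

def collisions(words: list[str]) -> dict[str, list[str]]:
    """Group words that become identical once apostrophes are removed."""
    groups = defaultdict(set)
    for word in words:
        groups[drop_apostrophes(word)].add(word)
    return {new: sorted(old) for new, old in groups.items() if len(old) > 1}

lexicon = ["can't", "cant", "won't", "wont", "she'll", "shell",
           "it's", "its", "don't", "walk"]

for spelling, sources in collisions(lexicon).items():
    print(spelling, "<-", sources)
# cant <- ["can't", 'cant']
# wont <- ["won't", 'wont']
# shell <- ["she'll", 'shell']
# its <- ["it's", 'its']
```

The program can only report which pairs merge; which member of each pair would survive is the linguistic guess the post makes.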
minow@decvax.UUCP (Martin Minow) (11/08/87)
In article <18306@yale-celray.yale.UUCP> Bjorn Lisper (lisper@yale-celray.UUCP)
suggests that "All Swedish words can really be transcribed to 'English' form"
by replacing a-ring by 'aa', a-dieresis by 'ae', and o-dieresis by 'oe'.
Unfortunately, this will not work properly:
1. Many Finnish words are used in Swedish. For example, the common Finnish
name "Paavo" is pronounced with a long-a (as in father) sound, not with
an 'o' (as in boat). However P<aa>ve means "Pope"
2. Sequences of vowels become ambiguous. For example, Sj<aa>are (dock-worker).
Many of these sequences arise from Swedish compounding rules. In
general, a sequence of vowels will indicate a morpheme boundary.
Turning these sequences into what appear to be trigraphs will cause
confusion.
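
The two numbered objections above amount to saying that the aa/ae/oe transcription has no unique inverse. Here is a minimal sketch using the post's own examples in lowercase; `transcribe` and `readings` are hypothetical helpers, and nothing here knows any actual Swedish vocabulary.

```python
DIGRAPHS = {"å": "aa", "ä": "ae", "ö": "oe"}

def transcribe(word: str) -> str:
    """Write the three extra Swedish vowels as ASCII digraphs."""
    for letter, digraph in DIGRAPHS.items():
        word = word.replace(letter, digraph)
    return word

def readings(ascii_word: str) -> set[str]:
    """Every spelling that transcribe() would map to ascii_word."""
    if not ascii_word:
        return {""}
    # Read the first character literally ...
    found = {ascii_word[0] + rest for rest in readings(ascii_word[1:])}
    # ... or read a leading digraph as one of the extra vowels.
    for letter, digraph in DIGRAPHS.items():
        if ascii_word.startswith(digraph):
            found |= {letter + rest for rest in readings(ascii_word[2:])}
    return found

print(transcribe("påve"))                      # paave
print(sorted(readings("paavo")))               # ['paavo', 'påvo']
print(sorted(readings(transcribe("sjåare"))))  # ['sjaaare', 'sjaåre', 'sjåare']
```

Choosing among the candidate readings takes a dictionary or a human reader, which is the "context" the transcription proposal relies on.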
We can see examples of Bjorn's suggestion causing problems in modern Danish.
In 1948, Danish started using a-ring for the previously used 'aa' sequence.
Also, words beginning with a-ring were moved to the back of the alphabet,
*even* if they were written with 'aa.' Thus, AAlborg (the town) was
alphabetized after 'Z'. The one exception to this were foreign words,
such as from Finnish, with natural sequences of 'aa'.
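
The alphabetization rule just described can be sketched as a sort key. The assumptions are labeled: the alphabet string places æ, ø and å after z, every "aa" is folded to å, and the exception for foreign words with a natural "aa" is not handled. `danish_key` and the list of town names are illustrative.

```python
# Assumed letter order for this sketch: a-z, then æ, ø, å.
ALPHABET = "abcdefghijklmnopqrstuvwxyzæøå"
RANK = {letter: i for i, letter in enumerate(ALPHABET)}

def danish_key(word: str) -> list[int]:
    """Sort key that treats 'aa' as å; unknown characters sort last."""
    folded = word.lower().replace("aa", "å")
    return [RANK.get(ch, len(ALPHABET)) for ch in folded]

towns = ["Aalborg", "Esbjerg", "Odense", "Viborg"]
print(sorted(towns))                  # ['Aalborg', 'Esbjerg', 'Odense', 'Viborg']
print(sorted(towns, key=danish_key))  # ['Esbjerg', 'Odense', 'Viborg', 'Aalborg']
```

A name such as "Paavo" would be misfiled by this key, so a fuller version would need a list of exceptions.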
Martin Minow
decvax!minow
lisper@yale.UUCP (11/12/87)
In article <182@decvax.UUCP> minow@decvax.UUCP (Martin Minow) writes:
>In article <18306@yale-celray.yale.UUCP> Bjorn Lisper (lisper@yale-celray.UUCP)
>suggests that "All Swedish words can really be transcribed to 'English' form"
>by replacing a-ring by 'aa', a-dieresis by 'ae', and o-dieresis by 'oe'.
>
>Unfortunately, this will not work properly:
>
>1. Many Finnish words are used in Swedish. For example, the common Finnish
> name "Paavo" is pronounced with a long-a (as in father) sound, not with
> an 'o' (as in boat). However P<aa>ve means "Pope"
>
>2. Sequences of vowels become ambiguous. For example, Sj<aa>are (dock-worker).
> Many of these sequences arise from Swedish compounding rules. In
> general, a sequence of vowels will indicate a morpheme boundary.
> Turning these sequences into what appear to be trigraphs will cause
> confusion.
>
>We can see examples of Bjorn's suggestion causing problems in modern Danish.
>In 1948, Danish started using a-ring for the previously used 'aa' sequence.
>Also, words beginning with a-ring were moved to the back of the alphabet,
>*even* if they were written with 'aa.' Thus, AAlborg (the town) was
>alphabetized after 'Z'. The one exception to this were foreign words,
>such as from Finnish, with natural sequences of 'aa'.
>
>Martin Minow
>decvax!minow

Certainly there will be problems. The meanings of aa, ae and oe will be context-dependent, as I pointed out in my previous posting. This is for exactly the same reasons as you mention. (Another example: "o-" is the prefix in Swedish equivalent to the English "un-". Thus the Swedish word for uneconomical, "oekonomisk", contains the "oe", but it is pronounced as "o" followed by "e", NOT as o-with-dieresis.) My proposal was merely rhetorical and I do not advocate its enforcement.

Bjorn Lisper