OEYO8722%TREARN.BITNET@uga.cc.uga.edu ( Hur AKDULGER) (09/06/90)
!> Date: Wed, 5 Sep 90 07:37:29 -0500
!> From: convex!graham@uxc.cso.uiuc.edu (Marv Graham)
!>
!> In article <24386@adm.BRL.MIL> you write:
!> >and now we're converting the most repeated words (greater than 1)
!> >to unused chars. Unused characters are (K)(L)(M)(N)(O)(P) in the OUR STRING.
!>
!> Suppose there are no unused characters?
!>
!> Marv Graham; Convex Computer Corp. {uunet,sun,uiucdcs,allegra}!convex!graham

I received private mail from Mr. M. Graham today. It's private mail, but I'll answer his question here.

I think we can't suppose that, because it's impossible in my case. I'm coding a DICTIONARY (English-Turkish, Turkish-English). The English alphabet has 26 letters (uppers + lowers = 52 chars), and our (Turkish) alphabet has 29 letters (uppers + lowers = 58 chars). 23 letters are identical in both alphabets.

    26 - 23 = 3   (three letters aren't used in the Turkish alphabet)
    29 - 23 = 6   (six of our letters aren't used in the English alphabet)
    Total = 2 * (23 + 6 + 3) = 64 characters (capital letters + small letters)

The special characters are "~.:,;-'12345678910!&?<>()" and square brackets (I can't type square brackets on my terminal).

    New total = 64 + 25 = 89 characters

An 8-bit (extended ASCII) character set has 255 non-zero codes, so the count of unused characters can be 255 - 89 = 166.

My algorithm isn't generally useful, I think. It works because the dictionary's data file includes at most 89 different characters. It's a special case. I use it only in my DICTIONARY program.

--------- Now, we're developing new logic ----------------------

The two-bytes-to-one-byte algorithm is better than the one-word (string)-to-one-byte algorithm. Here is why, on the dictionary file (unused chars (A)(B)(C)(D)):

Word (string) to one byte algorithm:
String            Count  New String
----------------  -----  ----------------  ----------------------
HOP                 1    HOP               It didn't change (same size)
HOPE                1    HOPE                "
HOPEFUL             1    HOPEFUL             "
HOPEFULLY           1    HOPEFULLY           "
HOPEFULNESS         1    HOPEFULNESS         "
HOPELESS            1    HOPELESS            "
HOPELESSLY          1    HOPELESSLY          "
HOPELESSNESS        1    HOPELESSNESS        "

Two bytes to one byte algorithm (each word is cut into 2-byte pieces from the left):

String            Count  Change
----------------  -----  ----------------
HO                  8    HO ---> A
PE                  7    PE ---> B
FU                  3    FU ---> C
LL                  1    doesn't change
LN                  1      "
ES                  1      "
LE                  3    LE ---> D
SS                  4    SS ---> F
LY                  1    doesn't change
NE                  1    doesn't change

String            New String
----------------  ----------------  ----------------------
HOP               AP                 3 bytes will be 2 bytes
HOPE              AB                 4   "    "   "  2   "
HOPEFUL           ABCL               7   "    "   "  4   "
HOPEFULLY         ABCLLY             9   "    "   "  6   "
HOPEFULNESS       ABCLNESS          11   "    "   "  8   "
HOPELESS          ABDF               8   "    "   "  4   "
HOPELESSLY        ABDFLY            10   "    "   "  6   "
HOPELESSNESS      ABDFNEF           12   "    "   "  7   "

Our header string is "HOAPEBFUCLEDSSF" (15 bytes).

    size of input strings           : 64
    size of output strings + header : 39 + 15 = 54

It's not a bad result, because all "HO"s in our data file will become "A". The size of a header entry doesn't change (3 bytes each). If we have 166 unused chars and we use all of them, our header string size will be 166 * 3 = 498 bytes.

I'm trying a Huffman tree; it's a good way. But I can't code it...

Hur AKDULGER...........
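
For readers who want to check the numbers in the two-bytes-to-one-byte tables, here is a minimal Python sketch of one way the scheme could be coded. It is not the poster's program. It assumes that each word is cut into aligned 2-byte pieces from the left, that a pair is replaced only if it occurs more than once, and that the code characters never appear as literals in the encoded data. The function names (split_pairs, build_table, encode, decode) are made up for this illustration.

```python
def split_pairs(word):
    """Cut a word into aligned 2-byte pieces; the last piece may be 1 byte."""
    return [word[i:i + 2] for i in range(0, len(word), 2)]


def build_table(words, unused_chars, min_count=2):
    """Map each pair seen at least min_count times to an unused character.

    Pairs are taken in first-seen order (an assumption made for this sketch).
    """
    counts = {}
    for word in words:
        for piece in split_pairs(word):
            if len(piece) == 2:
                counts[piece] = counts.get(piece, 0) + 1
    repeated = [pair for pair, n in counts.items() if n >= min_count]
    return dict(zip(repeated, unused_chars))


def encode(word, table):
    """Replace every aligned pair found in the table with its 1-byte code."""
    return "".join(table.get(piece, piece) for piece in split_pairs(word))


def decode(text, header):
    """Expand codes using the header, read as 3-byte entries: pair + code."""
    table = {header[i + 2]: header[i:i + 2] for i in range(0, len(header), 3)}
    return "".join(table.get(ch, ch) for ch in text)


WORDS = ["HOP", "HOPE", "HOPEFUL", "HOPEFULLY", "HOPEFULNESS",
         "HOPELESS", "HOPELESSLY", "HOPELESSNESS"]

# "ABCDF" are the code characters used in the post's example.
table = build_table(WORDS, "ABCDF")
encoded = [encode(w, table) for w in WORDS]
header = "".join(pair + code for pair, code in table.items())

input_size = sum(len(w) for w in WORDS)
output_size = sum(len(e) for e in encoded)

print(header)                                 # HOAPEBFUCLEDSSF
print(encoded)
print(input_size, output_size + len(header))  # 64 54
```

Decoding is a lookup per byte: decode("ABDFNEF", header) gives back "HOPELESSNESS". That only works while the code characters stay out of the plain data, which is the special case the dictionary file provides.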