[comp.compression] Another naive question

byron@archone.tamu.edu (Byron Rakitzis) (04/08/91)

I was wondering if anyone could point me in the right direction so I can
find out more about the so-called "entropy" of data.

Just thinking about it in the shower, it occurs to me that one might be
able to take any given binary string, and say, "this turing machine foo
can print this string bar". If foo's number (you can enumerate all turing
machines) has a smaller number of digits than the original string bar, then
you've just shown something interesting about bar, namely that it has a more
compact representation in the form of a turing machine foo.

Now, I'm probably barking up the wrong tree when I speak of turing machines,
but I'm hoping someone will point me at the right references.

Byron.
---
char*c,q[512],m[256],*v[99],**u,*i[3];f[2],p;main(){for(m[m[60]=m[62]=32]=m[*m=m
[124]=9]=6;e(-8),gets(1+(c=q))||exit(0);r(0,0))for(;*++c;);}r(t,o){for(*i=i[2]=0
,u=v+98;m[*--c]^9;m[*c]&32?i[*c&2]=u[0],u-v^98&&++u:3)if(!m[*c]){for(*++c=0;!m[*
--c];);*--u=c+++1;}u-v^98?strcmp(*u,"cd")?*c?pipe(f),o=f[1]:1,(p=fork())?e(p),o?
r(o,0),z(o),z(*f):4,wait(0):(o?dup2(*f,0),z(*f),z(o):*i?1,z(0),e(open(*i,0)):5,t
?dup2(t,1),z(t):i[2]?9,z(1),e(creat(i[2],438)):2,e(execvp(*u,u))):e(chdir(u[1])*
2):6;}e(x){x<0?write(2,"?\n$ "-x/4,2),x+1||exit(1):5;}z(i){close(i);}
				(by Byron Rakitzis and Sean Dorward)

blaak@csri.toronto.edu (Raymond Blaak) (04/11/91)

byron@archone.tamu.edu (Byron Rakitzis) writes:
 
>Just thinking about it in the shower, it occurs to me that one might be
>able to take any given binary string, and say, "this turing machine foo
>can print this string bar". If foo's number (you can enumerate all turing
>machines) has a smaller number of digits than the original string bar, then
>you've just shown something interesting about bar, namely that it has a more
>compact representation in the form of a turing machine foo.

>Byron.
>---
>char*c,q[512],m[256],*v[99],**u,*i[3];f[2],p;main(){for(m[m[60]=m[62]=32]=m...
...lots of stuff deleted...
>2):6;}e(x){x<0?write(2,"?\n$ "-x/4,2),x+1||exit(1):5;}z(i){close(i);}
>				(by Byron Rakitzis and Sean Dorward)

First off, what does your signature do? I am afraid to run it on my machine.

About the entropy of data:

You are right. If you can find a program whose representation is smaller than
that of the string it generates, then you have a more compact representation.

But in order for this to happen their must be some regularity or pattern in
the string. If the string is totally random, the most compact representation
of the string is itself.

Another consideration is: even if there is a turing machine whose "number"
has less digits than the string it prints, how can you find it? There is no
universal algorithm for such a thing.

What compression algorithms do is just try a finite number of techniques
(often just 1) a choose the one that works the best. Even then the result may
be no where near perfect. Consider the digit string that represents pi. It is
infinitely long. Any frequency count method used will still result in an
infinitely long string. However, the program that generates pi is pretty
short (mind you, it takes infinitely long to run).

Cheers,
Ray

sigma@sun.ipl.rpi.edu (Kevin Martin) (04/11/91)

blaak@csri.toronto.edu (Raymond Blaak) writes:
>byron@archone.tamu.edu (Byron Rakitzis) writes:
>>char*c,q[512],m[256],*v[99],**u,*i[3];f[2],p;main(){for(m[m[60]=m[62]=32]=m...
>...lots of stuff deleted...
>>2):6;}e(x){x<0?write(2,"?\n$ "-x/4,2),x+1||exit(1):5;}z(i){close(i);}
>>				(by Byron Rakitzis and Sean Dorward)
>
>First off, what does your signature do? I am afraid to run it on my machine.

It's a simple shell.  Works pretty well, I'd say.  Wasn't it a winner in a
recent obfuscated C contest?

-- 
Kevin Martin
sigma@ipl.rpi.edu