[sci.philosophy.tech] Simplicity and truth

sarge@thirdi.UUCP (Sarge Gerbode) (01/01/70)

In article <20194@ucbvax.BERKELEY.EDU> kube@cogsci.berkeley.edu.UUCP (Paul Kube) writes:

>But a theory may make more assumptions
>(even logically stronger ones) than another, and yet have each of its
>assumptions play an explanatory role.  Ockham's razor enjoins us to
>disbelieve the first theory, other things being equal; and I'm still
>wondering if there's an argument for it.
>

I think if one theory has greater explanatory power than another, the two
theories don't satisfy the conditions for applying Occam's Razor (How do you
spell that, anyhow?).  The law (as I understand it -- I hope the same way
Occam did) states that of two explanations, each of which fits all the
available facts equally well, one should pick the simpler or more modest
one.  If one of the explanations doesn't explain all the facts as well (lacks
equal explanatory power), then Occam's razor doesn't apply.
-- 
"Absolute knowledge means never having to change your mind."

Sarge Gerbode
Institute for Research in Metapsychology
950 Guinda St.
Palo Alto, CA 94301
UUCP:  pyramid!thirdi!sarge

kube@cogsci.berkeley.edu (Paul Kube) (08/21/87)

In article <8727@ut-sally.UUCP> turpin@ut-sally.UUCP (Russell Turpin) writes:
>Ockham's razor is a demand for parsimony of assumptions. Whether
>or not one finds it compelling is a philosophic debate over which
>much paper has been dirtied (and bits flipped.) But it is a
>philosophic, as opposed to aesthetic, principle.

Philosophic it may be, but so far it is just a pronouncement of taste,
like Quine's preference for desert landscapes.  I was wondering if you
had an argument.

>In my original
>posting, I made it clear that I was talking about two physical
>theories which (a) had identical explanatory power and (b) one of
>which required logically weaker physical assumptions (laws). 

No, you did not make this clear.  You talked about three cases: A pair
of equiexplanatory theories, one of which makes fewer unexplained
assumptions than the other; a pair of equiexplanatory theories
which make the same assumptions but one; and a pair of equiexplanatory
theories which make all the same assumptions.  Entailment relations
between the assumption sets weren't mentioned, nor were assumptions
equated with `laws'.

>In short, the other theory is making physical assumptions that
>provide no additional explanatory power. Since belief in physical
>laws is justified (in many epistemologies) by reference to their
>explanatory power (amongst other things), these extra assumptions
>(putative laws) should be rejected. (This, of course, is just a
>restatement of a strict version of Ockham's razor.)

Now this is an argument, but it's not an argument for Ockham's razor.
It's an argument for not believing theories that make assumptions that
play no explanatory role.  But a theory may make more assumptions
(even logically stronger ones) than another, and yet have each of its
assumptions play an explanatory role.  Ockham's razor enjoins us to
disbelieve the first theory, other things being equal; and I'm still
wondering if there's an argument for it.

--Paul kube@berkeley.edu, ...!ucbvax!kube

eric@snark.UUCP (Eric S. Raymond) (08/22/87)

In article <20194@ucbvax.BERKELEY.EDU>, kube@cogsci.berkeley.edu (Paul Kube) writes:
>                                             Ockham's razor enjoins us to
> disbelieve the first [more complex] theory, other things being equal; and I'm
> still wondering if there's an argument for it.
> 
> --Paul kube@berkeley.edu, ...!ucbvax!kube

There is no *formal* argument for Occam's Razor. It's a heuristic, based on
experience of what kinds of theory-building practices yield the most predictive
and robust theories. The argument for it, like the argument for scientific
method itself, is simply that it works.
-- 
      Eric S. Raymond
      UUCP:  {{seismo,ihnp4,rutgers}!cbmvax,sdcrdcf!burdvax,vu-vlsi}!snark!eric
      Post:  22 South Warren Avenue, Malvern, PA 19355    Phone: (215)-296-5718

kube@cogsci.berkeley.edu (Paul Kube) (08/24/87)

In article <100@thirdi.UUCP> sarge@thirdi.UUCP (Sarge Gerbode) writes:
>In article <20194@ucbvax.BERKELEY.EDU> kube@cogsci.berkeley.edu.UUCP (Paul Kube) writes:
>
>>But a theory may make more assumptions
>>(even logically stronger ones) than another, and yet have each of its
>>assumptions play an explanatory role.  Ockham's razor enjoins us to
>>disbelieve the first theory, other things being equal; and I'm still
>>wondering if there's an argument for it.
>>
>
>I think if one theory has greater explanatory power than another, the two
>theories don't satisfy the conditions for applying Occam's Razor...

That's why I said "other things being equal".  I was supposing that
the only difference between the two theories was their "size" given
some metric on theories.  Others have suggested the metric be the
number of assumptions that the theory makes; I don't like this much
since assumptions seem hard to count, but I'll go along with it for the
sake of argument.  It's also unclear how to order theories with
respect to "explanatory power"... I've been assuming two theories to
have the same explanatory power if they license all the same
inferences among observation sentences, but this hasn't played much of
a role in the discussion yet.  (So if one of two equiexplanatory
theories has a logically stronger assumption set than the other, it
means the nonobservation sentences it entails are a superset of the
other; they entail the same observation sentences.)

>(How do you spell that, anyhow?). 

My Webster's prefers Ockham, giving Occam as a variant.

>The law (as I understand it -- I hope the same way
>Occam did) states that of two explanations, each of which fits all the
>available facts equally well, one should pick the simpler or more modest
>one.  

Yes, "Don't multiply entities beyond necessity" is the gist of it.  And
there are reasons for taking this advice:  You tend to get theories
that are easier to understand, and maybe easier to use in practice
(though simplifying one's ontology can make a theory harder to use;
cf.  Hartry Field's _Science without Numbers_).  But some people
(maybe Ockham himself) wanted to say that one should always pick the
simpler theory *because it's more likely to be true*, and I have
been wondering what reasons there are for believing *that*.

--Paul kube@berkeley.edu, ...!ucbvax!kube

myers@tybalt.caltech.edu (Bob Myers) (08/25/87)

In article <20271@ucbvax.BERKELEY.EDU> kube@cogsci.berkeley.edu.UUCP (Paul Kube) writes:
>cf.  Hartry Field's _Science without Numbers_).  But some people
>(maybe Ockham himself) wanted to say that one should always pick the
>simpler theory *because it's more likely to be true*, and I have
>been wondering what reasons there are for believing *that*.

Are we still talking about scientific theories? What does it mean
to say that a scientific theory is more likely to be true?

Do you mean more likely to fit observation, over a wider range
of phenomena? That's the only possible explanation I can see.

As I've said, I think it is a mistake to use the word "true" with
respect to science. It has too many absolute connotations that
science just doesn't deal with.

-------------------------------------------------------------------------------

Bob Myers                                         myers@tybalt.caltech.edu
			 {rutgers,amdahl}!cit-vax!tybalt.caltech.edu!myers

sarge@thirdi.UUCP (Sarge Gerbode) (08/25/87)

In article <20271@ucbvax.BERKELEY.EDU> kube@cogsci.berkeley.edu.UUCP (Paul Kube) writes:

>I've been assuming two theories to
>have the same explanatory power if they license all the same
>inferences among observation sentences, but this hasn't played much of
>a role in the discussion yet.  (So if one of two equiexplanatory
>theories has a logically stronger assumption set than the other, it
>means the nonobservation sentences it entails are a superset of the
>other; they entail the same observation sentences.)

Very interesting concept of explanation.  But would the "nonobservation
sentences" (by which I assume you mean statements about unobserved or
unobservable entities) necessarily have to be the same between two theories of
unequal explanatory power?  Seems to me that different explanations often make
different assertions about non-observed entities.  For instance, Jung talks
about Archetypes and Freud about Ego, Id, and Superego.  One could imagine
these theories explain the same observable phenomena but allege the existence
of *different* non-observables;  one is not merely a superset of the other.

In this case, you might have a problem deciding which has greater explanatory
power, a problem which wouldn't exist if one was a superset of the other.
You'd have to make some type of judgment about the number of non-observed
entities alleged and the modesty of asserting the existence of these
non-observed entities.  For instance it is *slightly* more modest to say that
the "Id" causes various psychological effects than to say that Men From Mars
cause them.

>My Webster's prefers Ockham, giving Occam as a variant.

OK, but "Occam" is simpler :-).

>But some people
>(maybe Ockham himself) wanted to say that one should always pick the
>simpler theory *because it's more likely to be true*, and I have
>been wondering what reasons there are for believing *that*.

Well, when a theory is fundamentally flawed (such as the Ptolemaic system), it
tends to get very complex when one tries to shoehorn it into existing
observations.  The history of science, as Kuhn points out, is that of gradually
expanding complexity of a given paradigm, followed by a new paradigm that is
simpler and fits the same facts (or most of them).  In this case, simplicity is
a sign of truth.

Whether that's because the universe actually *is* simple or whether we just
get along better with simpler theories is something we probably will never
know.  It's probably safer to believe the latter.  More modest, anyway.
-- 
"Absolute knowledge means never having to change your mind."

Sarge Gerbode
Institute for Research in Metapsychology
950 Guinda St.
Palo Alto, CA 94301
UUCP:  pyramid!thirdi!sarge

kube@cogsci.berkeley.edu (Paul Kube) (08/26/87)

In article <132@snark.UUCP> eric@snark.UUCP (Eric S. Raymond) writes:
>There is no *formal* argument for Occam's Razor. It's a heuristic, based on
>experience of what kinds of theory-building practices yield the most predictive
>and robust theories. The argument for it, like the argument for scientific
>method itself, is simply that it works.

If by `formal' you mean `deductive', I'd guess you're certainly right.
Ockham's razor isn't a truth of logic.  If you mean `rigorous', I don't see
what its being a heuristic has to do with it.  It's certainly possible to
give a rigorous justification for a heuristic.

I'd believe a strong inductive argument for Ockham's razor along the
lines you sketch, but I'm not optimistic about there being one.  For
one thing, the clearest sorts of violation of the razor--where a
theory is complexified *without any change in observational
consequences*--exactly preserve predictiveness and robustness.  For
another, I'd bet that application of the razor without regard to
observational consequences does at least as much harm as good; I'd
offhand expect complexification to work just as well.  But I'm
prepared to be convinced by a careful study of the history of science
that shows otherwise.  

I'd also believe an argument that goes along the following lines, but
I'm not optimistic about it being extendable in the appropriate ways.
Maybe you can suggest something: As I've characterized it, Ockham's
razor is a claim about the relative likelihoods of the truth of
theories, viz.  that the simpler of two is more likely to be true.  So
suppose there are two theories, T1 and T2, and let W1 and W2 be the
sets of possible worlds in which they are respectively true.  Suppose
T1 and T2 have all the same observational consequences; then W1 and W2
are both subsets of the set of possible worlds that, for all we can
tell by observation, the actual world is in.  The question is: Is the
actual world in W1 or W2?, and we want to maximize the likelihood of
making the right guess.  Well, the rational thing seems to be to pick
the bigger of W1 and W2 (and so the least restrictive, i.e. simplest,
of T1 and T2).  But it seems to me that for lots of cases we care
about, W1 and W2 are going to have the same cardinality; and a
natural measure will assign them both the same measure; and then I
don't know how to say one is more likely than the other.
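
The counting step can be sketched in a toy finite setting, where the
worry about equal cardinalities does not yet arise.  Everything below
is an illustrative assumption rather than part of the argument as
stated: the atoms "obs", "h1" and "h2", the two theories t1 and t2,
and the idea that a world is just an assignment of truth values.

```python
from itertools import product

# Hypothetical toy setup: a "world" fixes the truth value of each atom.
# "obs" is the only observable atom; "h1" and "h2" are unobservable.
ATOMS = ("obs", "h1", "h2")


def worlds():
    """Enumerate every possible world as a dict mapping atom -> bool."""
    for values in product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))


# Two illustrative theories with the same observational consequence (obs).
# T2 makes one more assumption about unobservables than T1 does.
def t1(w):
    return w["obs"] and w["h1"]


def t2(w):
    return w["obs"] and w["h1"] and w["h2"]


def truth_set(theory):
    """Return the set of worlds (as tuples) in which the theory is true."""
    return {tuple(w[a] for a in ATOMS) for w in worlds() if theory(w)}


W1, W2 = truth_set(t1), truth_set(t2)

# Worlds that, for all we can tell by observation, might be the actual one.
compatible = truth_set(lambda w: w["obs"])

print(len(W1), len(W2), len(compatible))  # 2 1 4
print(W2 < W1)                            # True: W2 is a proper subset of W1
```

With finitely many worlds the weaker theory wins simply by counting.
The difficulty raised above is that this comparison is not available
once W1 and W2 are infinite sets of the same cardinality and measure.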

--Paul kube@berkeley.edu, ...!ucbvax!kube

eric@snark.UUCP (08/27/87)

In the following, OR = Occam's Razor.

In article <20297@ucbvax.BERKELEY.EDU>, kube@cogsci.berkeley.edu (Paul Kube) writes:
> [quotes my claim that OR is just a heuristic, not formally demonstrable]
>
> I'd also believe an argument that goes along the following lines, but
> I'm not optimistic about it being extendable in the appropriate ways.
>
> [Argument that says we should trust weaker theories because they describe
>  larger subsets of the set of all possible universes, so our universe is
>  more likely to be in the set]

That's a really interesting way to think about the problem which hadn't
occurred to me at all. It's no kind of 'demonstration' of OR but it seems to
translate our intuitive notion of 'theory strength' into terms that may make
it easier to think about.

>   But it seems to me that for lots of cases we care about, [the truth sets
> of the theories being compared] are going to have the same cardinality; and a
> natural measure will assign them both the same measure; and then I
> don't know how to say one is more likely than the other.

Well, then...don't. There are situations like this in the real world where
scientists work with several predictively-equivalent but distinct formalisms. I
understand, for example, that there are three distinct formalisms, very
different in style, in which you can do quantum mechanics. One of them is
called S-matrix theory, I think the second one is based on systematic use of
Feynman diagrams, and I think the third involves trying to solve the
time-dependent Schrodinger equations by analytic means.

I may have those three flavors wrong (physicists be gentle with me, I am
merely a defrocked mathematician) but I hope this makes the point. Nobody
says that you *have* to choose one out of a bunch of predictively-equivalent
theories and swear allegiance to it...

> --Paul kube@berkeley.edu, ...!ucbvax!kube
-- 
      Eric S. Raymond
      UUCP:  {{seismo,ihnp4,rutgers}!cbmvax,sdcrdcf!burdvax,vu-vlsi}!snark!eric
      Post:  22 South Warren Avenue, Malvern, PA 19355    Phone: (215)-296-5718

sarge@thirdi.UUCP (08/27/87)

In article <20297@ucbvax.BERKELEY.EDU> kube@cogsci.berkeley.edu.UUCP (Paul Kube) writes:
>As I've characterized it, Ockham's
>razor is a claim about the relative likelihoods of the truth of
>theories, viz.  that the simpler of two is more likely to be true.  So
>suppose there are two theories, T1 and T2, and let W1 and W2 be the
>sets of possible worlds in which they are respectively true.  Suppose
>T1 and T2 have all the same observational consequences; then W1 and W2
>are both subsets of the set of possible worlds that, for all we can
>tell by observation, the actual world is in.  The question is: Is the
>actual world in W1 or W2?, and we want to maximize the likelihood of
>making the right guess.  Well, the rational thing seems to be to pick
>the bigger of W1 and W2 (and so the least restrictive, i.e. simplest,
>of T1 and T2).  But it seems to me that for lots of cases we care
>about, W1 and W2 are going to have the same cardinality; and a
>natural measure will assign them both the same measure; and then I
>don't know how to say one is more likely than the other.

Very interesting idea -- but it seems to me that you are talking about
*another* criterion for desirability in scientific theories, namely: scope.
Of two theories, we will pick the one that has the broadest potential
applicability (something that explains *all* trees, not just grapefruit trees,
for instance -- the biggest W1 or W2, as you say).  And this makes sense, for
the reason you give.  But is it really necessarily the *simpler* theory that
has the wider scope?  That doesn't seem to be necessarily true.  It seems that
sometimes to generate a wider W1 or W2 of applicability, one might have to add
complexities to the theory.  It *might* be the case, but if so, an example or
an argument to demonstrate this would help.
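
One way to see why the question is open is a toy case in which the
number of worlds a theory allows follows its logical strength, not the
length of its statement.  The atoms p, q, r, the three sample theories,
and the use of text length as a stand-in for "complexity" are all
assumptions made for illustration only.

```python
from itertools import product

ATOMS = ("p", "q", "r")

# Hypothetical theories keyed by their text; the length of the text is a
# crude, purely illustrative proxy for how "complex" the theory is.
THEORIES = {
    "p": lambda w: w["p"],
    "p or q or r": lambda w: w["p"] or w["q"] or w["r"],
    "p and q and r": lambda w: w["p"] and w["q"] and w["r"],
}


def count_worlds(pred):
    """Count the truth assignments over ATOMS in which pred holds."""
    return sum(
        1
        for values in product([False, True], repeat=len(ATOMS))
        if pred(dict(zip(ATOMS, values)))
    )


counts = {text: count_worlds(pred) for text, pred in THEORIES.items()}

for text, n in counts.items():
    print(f"{text!r}: length {len(text)}, true in {n} of 8 worlds")
# 'p': length 1, true in 4 of 8 worlds
# 'p or q or r': length 11, true in 7 of 8 worlds
# 'p and q and r': length 13, true in 1 of 8 worlds
```

On this proxy a longer statement can allow more worlds (the
disjunction) or fewer (the conjunction), so a wider W does not by
itself pick out the simpler theory.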

I may have missed the boat totally on what you are trying to say.
-- 
"Absolute knowledge means never having to change your mind."

Sarge Gerbode
Institute for Research in Metapsychology
950 Guinda St.
Palo Alto, CA 94301
UUCP:  pyramid!thirdi!sarge