[net.math] Hairy combinatorics problem.

greg@harvard.UUCP (Greg) (02/11/86)

For the combinatorics whizzes out there, I have a rather ugly combinatorics
problem; I would be interested in knowing whether or net there are any
good techniques for tackling such problems.  Anyway, here goes:

I wish to count the number of acyclic, saturated hydrocarbons with n carbons.
These molecules can be thought of as acyclic, undirected, connected graphs,
in which the carbons are nodes, the carbon-carbon bonds are edges, and the
hydrogens are ignored.  So, stated mathematically, the question is:  How many
non-isomorphic, acyclic, undirected, connected graphs are there with n nodes,
in which each node has at most four edges leaving it? 

Plus points are awarded for taking into account stereocenters; a stereocenter
in an organic molecule is a carbon atom with four different chemical groups
leading from it, so one should assign an orientation to every carbon with
either three or four non-isomorphic subgraphs leaving it.  Note that whether
or not the subgraphs leaving the carbon are isomorphic may depend on the
orientations in the subgraphs.
-- 
gregregreg

evert@botter.UUCP (Evert Wattel) (02/13/86)

The answer concerning the counts of different acyclic hydrocarbons is
already solved by Po'lya in 1937 in his article


  G. PO'LYA	Kombinatorische Anzahlbestimmungen fu"r Graphen, Gruppen
		und Chemische Verbindungen,
		Acta Math. 68 (1937) p145-254.

The method is simply a generating function application to study the number
regular tree graphs.
If this message is insufficient to help you solve the problem, please contact
directly through mail.
					Evert Wattel
					mcvax!vu44!botter!evert
					botter@evert

moews_b@h-sc1.UUCP (david moews) (02/14/86)

In article <702@harvard.UUCP> greg@harvard.UUCP (Greg) writes:
>... the question is:  How many
>non-isomorphic, acyclic, undirected, connected graphs are there with n nodes,
>in which each node has at most four edges leaving it? 

    The easiest way to do this is to first count the number of rooted trees with
n nodes and having degree <=4 at all vertices and degree <=3 at the root (this
is the number of alkyl groups with n carbons.)  Call these trees "Y-trees".
  If n=0, we set this number to be 1.  If we have a Y-tree with n>0 nodes, the
root will be connected to between 0 and 3 other vertices; the trees rooted at
these vertices will form 0 to 3 Y-trees: each one will have >=1 vertices, and
as a group, they will have a total of n-1 vertices.  If there are m<3 of these
Y-trees, we can add 3-m Y-trees of 0 vertices to bring the total number of 
Y-trees to 3 (since we have said that there is exactly one Y-tree with 0 
vertices.)  In this way, we equate every Y-tree with n vertices with a different
bag (= set with repetitions) of 3 Y-trees with a total of n-1 vertices.  It's
easy to see, also, that for each bag of 3 Y-trees with a total of n-1 vertices
there is a different Y-tree with n vertices; thus the number of Y-trees with n
vertices is in fact equal to the number of bags of 3 Y-trees with a total of
n-1 vertices.  Now if we let g[i] be the number of Y-trees with i vertices and
let h[i] be the number of bags of 3 Y-trees with a total of i vertices, we can

                                                           2        3
define the generating functions G(x) = g[0] + g[1]x + g[2]x  + g[3]x   + ...

                                                           2        3 
                         and    H(x) = h[0] + h[1]x + h[2]x  + h[3]x   + ...  .

Polya's Enumeration Theorem then gives us

   H(x) = Z(S  , G(x)); but by what we have said above, G(x) = 1 + x H(x), so
             3

                                                          1   3      
   G(x) = 1 + x Z(S  , G(x)).  The cycle index for S  is  - (c  + 3 c c + 2 c )
                   3                                3     6   1      1 2     3 ,

so this equation reduces to

                          3             2         3
    G(x) = 1 + 1/6 x (G(x)  + 3 G(x) G(x ) + 2 G(x )).

I don't have a closed-form solution for this equation, but it does at any rate
allow us to compute

                2     3     4     5      6      7      8       9       10
G(x) = 1 + x + x + 2 x + 4 x + 8 x + 17 x + 39 x + 89 x + 211 x + 507 x  + ... 

    Now we need to compute the number of unrooted trees with all vertices of
degree <=4.  In this case, there is a theorem which states that if we have
some variety of trees ("X-trees", say), then the number of unrooted X-trees
with n vertices is equal to the number of rooted X-trees with n vertices 
minus the number of X-trees rooted at an EDGE (instead of a vertex) that are
not symmetric about this edge.  Let an "X-tree" be a tree with all vertices of
degree <=4 (we do *not* allow any X-trees with 0 vertices).
A Y-tree is just a rooted X-tree with a root of degree <=3, and
we know that the generating function for Y-trees is G(x).  Now in the same
manner as every Y-tree with >=1 vertices corresponds to a bag of 3 Y-trees
with one less vertex, every rooted X-tree with n>=1 vertices corresponds
to a bag of 4 Y-trees with a total of n-1 vertices.  If the generating function
for rooted X-trees is J(x), then, we know from Polya that

  J(x) = x Z(S  , G(x)); the cycle index for S  is
              4                               4

  1     4         2      2         
 --- ( c  + 6 c  c  + 3 c  + 8 c  c + 6 c )   so this equation reduces to
 24     1      2  1      2      3  1     4 , 

                      4       2      2        2 2        3              4 
  J(x) = 1/24 x ( G(x) + 6 G(x ) G(x)  + 3 G(x )  + 8 G(x ) G(x) + 6 G(x )).

Now an X-tree rooted at an edge corresponds to a bag of 2 Y-trees, each 
with >=1 vertex (the edge connects their root vertices.)
The generating function for Y-trees with >=1 vertices is G(x)-1, and Polya
tells us that the generating function for bags of 2 Y-trees with >=1 vertices
each is 
                                               1    2
   Z(S  , G(x) - 1) ; the cycle index of S  is - ( c + c )  so this is just 
      2                                   2    2    1   2 ,  

                2      2
1/2 ( (G(x) - 1)  + G(x ) - 1) = I(x), say.

From this function we must subtract the generating function for numbers of 
X-trees rooted at an edge and symmetric about that edge.  The number of 
X-trees with n>=1 vertices that are rooted at an edge and symmetric about that
edge, however, is just the number of pairs of identical Y-trees with a total
of n>=1 vertices; this will be equal to the number of Y-trees with n/2 vertices
if n is even, or 0 if n is odd.  The generating function, then, will be just

       2
    G(x ) - 1.  The number of asymmetric X-trees rooted at an edge then has
generating function

                2
    I(x) - ( G(x ) - 1) =

                2     2               2
 1/2( (G(x) - 1) + G(x ) - 1) ) - (G(x ) - 1) = 

                2      2
 1/2( (G(x) - 1) - (G(x ) - 1) ) =

          2     2
 1/2( G(x) - G(x ) ) + 1 - G(x).

   Putting it all together, the generating function for X-trees is

                  2     2
  J(x) - 1/2( G(x) - G(x ) ) - 1 + G(x) 


                 4       2      2        2 2        3              4 
=   1/24 x ( G(x) + 6 G(x ) G(x)  + 3 G(x )  + 8 G(x ) G(x) + 6 G(x ))

             2        2 
   + 1/2( G(x ) - G(x) ) - 1 + G(x) 

         2    3      4      5      6      7       8       9       10
  = x + x  + x  + 2 x  + 3 x  + 5 x  + 9 x  + 18 x  + 35 x  + 75 x  + ...,

so, for example, there are 75 X-trees with 10 vertices.

--
David Moews        moews%h-sc4@harvard.ARPA     ...!harvard!h-sc4!moews

leimkuhl@uiucdcsp.CS.UIUC.EDU (02/16/86)

Bravo Bravo!

Now was that original?

eklhad@ihnet.UUCP (K. A. Dahlke) (02/17/86)

> I wish to count the number of acyclic, saturated hydrocarbons with n carbons.
> These molecules can be thought of as acyclic, undirected, connected graphs,
> in which the carbons are nodes, the carbon-carbon bonds are edges, and the
> hydrogens are ignored.  

Problems like this can be quite frustrating,
because I feel like I should be able to solve them.
We did similar problems in a graph theory course I
took 4 years ago, but I guess project planning and methodology
have withered my brain since then.
It has something to do with manipulating generating functions
for rooted instances of the carbon trees.

At any rate, I am not being mathematical today.
Instead, I am wearing my chemistry hat,
something I rarely do, since I have only taken one
college level chemistry course.  But here goes.
Representing hydrocarbons as trees with degree  <=  4 breaks down for
modest N (say N>15).
Yes, each hydrocarbon has a corresponding carbon tree,
but many trees do not correspond to physical molecules.
To illustrate, begin with a methane molecule,
represented by the degenerate tree (one point).
Now replace each hydrogen atom with a methyl group,
producing a pentane isomer.
The tree is a 4-way star.
Again, replace each hydrogen with a methyl group,
making a 17-carbon glob.
Repeat od-imnosium.
The graphical model gives no hint of trouble,
yet the number of carbon atoms grows exponentially, while the size of the
molecule (maximum radius) increases linearly.
We have a problem, right?!?
Even the 17-carbon glob may not be possible.
Don't have my table of bond angles and distances handy.
In fact, I conjecture, as N approaches infinity,
the probability that a random N-node tree will correspond to a physically
realizable hydrocarbon approaches zero.
Well anyways, it's still a fun problem.
-- 
	Why don't we do it in the road?
			Karl Dahlke    ihnp4!ihnet!eklhad