# Conditional probability with trees

Here’s an interesting question: consider a tree structure, where each branch (edge) has a leaf set $$L$$ containing two nodes. Choosing an edge uniformly at random, what is the probability that $$a$$ is in the leaf set given that $$b$$ is in the leaf set? Assume that there are a countable number of leaf set states.

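From the definition of conditional probability, we can write $$\hat{P}(a|b) = \hat{P}(a,b)/\hat{P}(b)$$. The empirical distribution of $$b$$ is given by the number of edges $$e$$ whose leaf sets $$L_e$$ contain at least one instance of $$b$$, divided by the total number of edges $$|E|$$. We write this as

$$\hat{P}(b) = \frac{1}{|E|} \sum_{e \in E} [b \in L_e],$$

where $$[\cdot]$$ is the Iverson bracket, equal to 1 when the enclosed statement is true and 0 otherwise.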

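The joint distribution is given by the number of edges $$e$$ whose leaf sets $$L_e$$ contain both $$a$$ and $$b$$, divided by the total number of edges $$|E|$$:

$$\hat{P}(a,b) = \frac{1}{|E|} \sum_{e \in E} [a \in L_e \land b \in L_e].$$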

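Dividing, the factors of $$1/|E|$$ cancel, and we have

$$\hat{P}(a|b) = \frac{\sum_{e \in E} [a \in L_e \land b \in L_e]}{\sum_{e \in E} [b \in L_e]}.$$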

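Using the properties of the Iverson bracket, in particular $$[P \land Q] = [P][Q]$$, we can rewrite this as

$$\hat{P}(a|b) = \frac{\sum_{e \in E} [a \in L_e][b \in L_e]}{\sum_{e \in E} [b \in L_e]} = \frac{1}{|E_b|} \sum_{e \in E_b} [a \in L_e],$$

where $$E_b = \{e \in E : b \in L_e\}$$ is the subset of edges $$e$$ whose leaf sets $$L_e$$ contain $$b$$.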

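As a sanity check, the two forms of the conditional probability can be compared numerically. The following is a minimal sketch, not a definitive implementation: the `edges` list is invented toy data (each edge is represented only by its two-node leaf set), and the function names are hypothetical helpers introduced for illustration.

```python
from fractions import Fraction

# Hypothetical toy data: each edge is represented only by its leaf set.
edges = [
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"b", "d"},
    {"c", "d"},
]


def iverson(condition):
    """Iverson bracket: 1 if the condition holds, 0 otherwise."""
    return 1 if condition else 0


def p_marginal(x, edges):
    """Fraction of all edges whose leaf set contains x."""
    return Fraction(sum(iverson(x in leaf_set) for leaf_set in edges), len(edges))


def p_joint(x, y, edges):
    """Fraction of all edges whose leaf set contains both x and y."""
    count = sum(iverson(x in leaf_set) * iverson(y in leaf_set) for leaf_set in edges)
    return Fraction(count, len(edges))


def p_conditional_ratio(x, y, edges):
    """P(x | y) as joint / marginal. Raises ZeroDivisionError if y never occurs."""
    return p_joint(x, y, edges) / p_marginal(y, edges)


def p_conditional_restricted(x, y, edges):
    """P(x | y) as a sum over only those edges whose leaf set contains y."""
    subset = [leaf_set for leaf_set in edges if y in leaf_set]
    return Fraction(sum(iverson(x in leaf_set) for leaf_set in subset), len(subset))


print(p_conditional_ratio("a", "b", edges))       # 1/3
print(p_conditional_restricted("a", "b", edges))  # 1/3
```

Using `Fraction` keeps the arithmetic exact, so the two forms can be compared with `==` rather than a floating-point tolerance.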
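Note that this is not the same as $$\hat{P}(a)$$, defined as the fraction of all edges $$e \in E$$ whose leaf sets $$L_e$$ contain $$a$$:

$$\hat{P}(a) = \frac{1}{|E|} \sum_{e \in E} [a \in L_e].$$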

Mistaking them would be entirely understandable—they do look quite similar—but the conditional probability involves a sum over a subset of the entire sample space, and has a different normalization constant.

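To make the difference in normalization concrete, here is a small worked example. The five leaf sets are invented for illustration and are not taken from any particular tree:

$$\{a,b\},\ \{a,c\},\ \{b,c\},\ \{b,d\},\ \{c,d\}$$

Counting edges gives

$$\hat{P}(a) = \frac{2}{5}, \qquad \hat{P}(a|b) = \frac{\hat{P}(a,b)}{\hat{P}(b)} = \frac{1/5}{3/5} = \frac{1}{3}.$$

The marginal counts the two edges containing $$a$$ out of all five edges, while the conditional counts the one edge containing $$a$$ out of only the three edges that contain $$b$$.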
Written on June 15, 2016