<h1>Uncertainty, risk, and loss</h1>
<h2><a href="http://daviddewhurst.github.io/continuum-preferential-attachment-redux">Continuum preferential attachment, volume 2</a> (2018-02-19)</h2>
<p>Awhile ago—in fact, too long ago—I <a href="http://daviddewhurst.github.io/choose-a-firm/">noted</a> that
Peter Dodds and I were working on a short paper regarding continuum preferential attachment processes.
Well, <a href="https://arxiv.org/abs/1710.07580">here it is</a>.
We submitted it to Physical Review E and got a “reject with resubmit” late last fall; I haven’t yet had a chance
to resubmit.</p>
<p>To recap: we construct a continuum (read: PDE) mean-field model for preferential attachment processes and solve
it in all generality.
Because we didn’t have anything better to do, we then extended this process to \(N \geq 1\) dimensions.
Finally, we note that the power-law distribution of firm sizes in the US can be partially explained by such a process.
While it is well known that a power law fits these data, a theoretical explanation was not known to us; we propose a
simple economic model here that reproduces this phenomenon.</p>
<p>Let me know if you have any suggestions for the resubmit!</p>
<h2><a href="http://daviddewhurst.github.io/bonferroni">Proof of a Bonferroni inequality</a> (2017-09-02)</h2>
<p>Here is a very enjoyable theorem due to Bonferroni.
Let \(n \geq 2 \) and consider the probability triple \( (\Omega, \mathcal{F}, P) \) and a collection of subsets of \( \Omega \) in \( \mathcal{F} \), denoted \( ( A_i )_{i=1}^n \).
Then the following holds:</p>
<script type="math/tex; mode=display">P \left( \bigcup_{i=1}^n A_i \right) \geq \sum_{i=1}^n P(A_i) - \sum_{i=1}^{n-1}\sum_{j=i+1}^n P(A_i \cap A_j)</script>
<h3 id="proof">Proof</h3>
<p>By induction. The case where \(n = 2\) is obvious.
Assume the inequality holds up to \(n\) and denote \(B = \bigcup_{i = 1}^n A_i \).
Then inclusion-exclusion for two sets gives</p>
<script type="math/tex; mode=display">P(A_{n + 1} \cup B) = P(A_{n + 1}) + P(B) - P(A_{n + 1} \cap B).</script>
<p>By the distributive law, the intersection term becomes</p>
<script type="math/tex; mode=display">P(A_{n + 1} \cap B) = P\Big(A_{n + 1} \cap \bigcup_{i=1}^n A_i \Big) = P\Big( \bigcup_{i=1}^n (A_{n + 1} \cap A_i) \Big)</script>
<p>which, by Boole’s inequality, is less than or equal to \( \sum_{i=1}^n P(A_{n + 1} \cap A_i) \).
So,</p>
<script type="math/tex; mode=display">P(A_{n + 1} \cup B) \geq P(A_{n + 1}) + P(B) - \sum_{i=1}^n P(A_{n + 1} \cap A_i)</script>
<script type="math/tex; mode=display">\qquad \geq P(A_{n + 1}) + \left( \sum_{i=1}^n P(A_i) - \sum_{i=1}^{n-1}\sum_{j=i+1}^n P(A_i \cap A_j) \right) - \sum_{i=1}^n P(A_{n + 1} \cap A_i)</script>
<p>by the inductive hypothesis.
Simplifying, we note that the term \( \sum_{i=1}^n P(A_{n + 1} \cap A_i) \) contributes exactly the new terms (those with \( j = n + 1 \)) of the double sum over \( i < j \), resulting in</p>
<script type="math/tex; mode=display">P(A_{n + 1} \cup B) \geq \sum_{i = 1}^{n + 1} P(A_i) - \sum_{i=1}^{n}\sum_{j=i+1}^{n + 1} P(A_i \cap A_j),</script>
<p>which was to be proved.</p>
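<p>The inequality is also easy to spot-check on a small finite probability space. Here is a quick sketch (the space and events below are arbitrary choices of mine, not part of the proof):</p>

```python
from itertools import combinations

# Uniform probability on Omega = {0, ..., 9}, with a few arbitrary events.
omega = set(range(10))
P = lambda S: len(S) / len(omega)

events = [{0, 1, 2, 3}, {2, 3, 4, 5}, {5, 6, 7}, {1, 3, 5, 7, 9}]

# Left side: probability of the union. Right side: Bonferroni's lower bound.
lhs = P(set().union(*events))
rhs = sum(P(A) for A in events) - sum(P(A & B) for A, B in combinations(events, 2))
print(lhs, rhs, lhs >= rhs)
```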
<h2><a href="http://daviddewhurst.github.io/bke-fke">Brief and (somewhat) intuitive description of the Kolmogorov equations</a> (2017-08-08)</h2>
<p>I have never taken a course in probability, let alone a course in stochastic differential equations, so I am sort of winging it here. I think the following is one of the shorter “derivations” of the BKE and FKE that can be performed–but the disclaimer above is just in case there’s a well-known shorter derivation of which I’m unaware!</p>
<p>Suppose \( (\mathcal{F}_t)_{t\geq 0}\) is a filtration that carries all known information about the system under study.
This simply means that \(\mathcal{F}_t\) carries all information about the system from time \(t=0\) up to, and including, time \(t\).
Let \(X_t\) be a diffusion process such that the following conditions are satisfied:</p>
<script type="math/tex; mode=display">\mathbb{E}[dX_t | \mathcal{F}_t] = \mu(x,t)dt</script>
<script type="math/tex; mode=display">\mathbb{E}[dX_t^2 | \mathcal{F}_t] = \sigma^2(x,t)dt</script>
<p>Since \(\mathcal{F}_t\) carries all information about the system up to time \(t\), we can rewrite the above as expectations conditioned on the current state and time:</p>
<script type="math/tex; mode=display">\mathbb{E}_{x,t}[dX_t] = \mu(x,t)dt</script>
<script type="math/tex; mode=display">\mathbb{E}_{x,t}[dX_t^2] = \sigma^2(x,t) dt</script>
<h3 id="bke">BKE</h3>
<p>We will first derive the BKE; it is fundamental as it defines the state evolution operator.
The FKE (or Fokker-Planck equation) can be derived from the BKE.
Suppose we have \(t' > t\).
Let us define \(V(X_T)\) as the function that gives the value of a payoff at time \(T\).
The value at time \(T\) is given by \(f(X_T, T) = V(X_T)\) (the payoff is known!), and hence, moving backward in time, the function \(f\) just gives the expectation of the final payoff: \(f(x,t) = \mathbb{E}_{x,t}[V(X_T)]\).
Thus we can write</p>
<script type="math/tex; mode=display">f(x,t) = \mathbb{E}_{x,t}[ \mathbb{E}[V(X_T) | \mathcal{F}_{t'}] ] = \mathbb{E}_{x,t}[ f(X_{t'}, t') ]</script>
<p>Let us set \(t' = t + dt \) and \(X_t = x\).
Expanding \(f(X_{t + dt}, t + dt)\) in Taylor series gives</p>
<script type="math/tex; mode=display">f(X_{t + dt}, t + dt) = f(x,t) + \partial_t f dt + \partial_x f dX_t + \frac{1}{2}\partial_x^2 f dX_t^2 + \cdots</script>
<p>(We truncate the expansion, dropping all terms of order higher than \(dt\); note that \(dX_t^2 = \mathcal{O}(dt)\).)
Expanding the above expectation, we have</p>
<script type="math/tex; mode=display">\mathbb{E}_{x,t}[ f(X_{t + dt}, t + dt) ] \simeq \mathbb{E}_{x,t}[f(x,t) + \partial_t f dt + \partial_x f dX_t + \frac{1}{2}\partial_x^2 f dX_t^2]</script>
<script type="math/tex; mode=display">= f(x,t) + \partial_t f dt + \partial_x f \mathbb{E}_{x,t}[dX_t] + \frac{1}{2}\partial_x^2f \mathbb{E}_{x,t}[dX_t^2]</script>
<script type="math/tex; mode=display">= f(x,t) + \partial_t f dt + \mu(x,t) \partial_x f dt + \frac{\sigma^2(x,t)}{2}\partial_x^2 f dt.</script>
<p>Now, since \( f(x,t) = \mathbb{E}_{x,t}[ f(X_{t + dt}, t + dt) ] \), we subtract \(f(x,t)\) from both sides of the equation to find the BKE:</p>
<script type="math/tex; mode=display">\partial_t f = -\mu(x,t)\partial_x f - \frac{\sigma^2(x,t)}{2}\partial_x^2f</script>
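<p>As a quick sanity check of the BKE (this example is mine, not part of the derivation), take standard Brownian motion (\(\mu = 0\), \(\sigma = 1\)) with payoff \(V(x) = x^2\): then \(f(x,t) = \mathbb{E}_{x,t}[X_T^2] = x^2 + (T - t)\), and the BKE reduces to \(\partial_t f = -\tfrac{1}{2}\partial_x^2 f\). A finite-difference sketch:</p>

```python
# f(x, t) = E_{x,t}[X_T^2] = x^2 + (T - t) for standard Brownian motion;
# the BKE then reads  d_t f = -(1/2) d_x^2 f.
T = 1.0
f = lambda x, t: x ** 2 + (T - t)

# Central finite differences at an arbitrary interior point.
h = 1e-5
x, t = 0.3, 0.4
df_dt = (f(x, t + h) - f(x, t - h)) / (2 * h)
d2f_dx2 = (f(x + h, t) - 2 * f(x, t) + f(x - h, t)) / h ** 2
print(df_dt, -0.5 * d2f_dx2)  # both approximately -1
```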
<h3 id="fke">FKE</h3>
<p>Clearly, the BKE can be rewritten as an operator equation:</p>
<script type="math/tex; mode=display">\partial_t f = - \mathcal{L}f,</script>
<p>where we have defined the linear operator \(\mathcal{L} = \mu(x,t) \partial_x + \frac{\sigma^2(x,t)}{2}\partial_x^2 \).
Heuristically speaking, the BKE tells us how the expectation of the terminal payoff depends on the starting point \( (x,t) \); it propagates information backward in time, and this is why it’s the more fundamental of the two equations.
What we would like is an equation that describes the probability distribution of the process going forward from the current point \( (x', \tau) \).
It is intuitive that this equation is given by</p>
<script type="math/tex; mode=display">\partial_{\tau}f = \mathcal{L}^{\dagger}f,</script>
<p>where the dagger denotes the adjoint; we want the operator to work “backwards” in some way.
Our task is thus to find the adjoint operator of \(\mathcal{L}\).
Since we have been implicitly assuming (by use of the double expectations above) that \(f \in L^2\), we must solve the operator equation</p>
<script type="math/tex; mode=display">\langle f, \mathcal{L}g \rangle_{L^2} = \langle g, \mathcal{L}^{\dagger}f \rangle_{L^2}</script>
<p>To do this we will integrate by parts:</p>
<script type="math/tex; mode=display">\langle f, \mathcal{L}g \rangle_{L^2} = \int_{-\infty}^{\infty} dx f(x) ( \mu(x) \partial_x g + \frac{1}{2}\sigma^2(x) \partial_x^2g)</script>
<script type="math/tex; mode=display">= g(x)\mu(x)f(x)|_{-\infty}^{\infty} - \int_{-\infty}^{\infty}dx g(x) \partial_x(\mu(x)f(x)) + \frac{1}{2}\sigma^2(x)f(x)\partial_xg|_{-\infty}^{\infty} - \frac{1}{2}\int_{-\infty}^{\infty}dx (\partial_x g)\partial_x(\sigma^2(x) f(x))</script>
<script type="math/tex; mode=display">= - \int_{-\infty}^{\infty}dx g(x) \partial_x(\mu(x)f(x)) - \frac{1}{2}g(x)\partial_x(\sigma^2(x)f(x))|_{-\infty}^{\infty} + \frac{1}{2}\int_{-\infty}^{\infty}dx g(x) \partial_x^2(\sigma^2(x)f(x))</script>
<script type="math/tex; mode=display">= \int_{-\infty}^{\infty} dx g(x) \left[ -\partial_x(\mu(x)f(x)) + \frac{1}{2}\partial_x^2(\sigma^2(x)f(x)) \right]</script>
<p>Thus the adjoint operator must be</p>
<script type="math/tex; mode=display">\mathcal{L}^{\dagger} = -\partial_x \mu(x, \tau) + \frac{1}{2}\partial_x^2 \sigma^2(x,\tau),</script>
<p>giving the familiar Fokker-Planck equation as</p>
<script type="math/tex; mode=display">\partial_{\tau} f = - \partial_x(\mu(x,\tau) f) + \frac{1}{2}\partial_x^2(\sigma^2(x,\tau) f)</script>
<p>The FKE can be derived formally–it takes a little longer–but this treatment is intuitive. We can understand taking the adjoint of the operator as performing time reversal on the system; note that the sign of the drift term changes in the time reversal.
Note also that the system does not exhibit time symmetry, as the drift and diffusion functions are acted upon by the derivatives in the adjoint but not in the BKE.</p>
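<p>The adjointness computation above can also be spot-checked numerically: discretize \(\mathcal{L}\) and \(\mathcal{L}^{\dagger}\) on a grid and compare the two inner products for rapidly decaying \(f\) and \(g\), so that the boundary terms vanish. The drift and diffusion below are arbitrary choices of mine:</p>

```python
import numpy as np

# Check that <f, L g> = <g, L^dagger f> on a grid, for decaying f and g.
x = np.linspace(-10, 10, 4001)
dx = x[1] - x[0]

mu = 1.0 - x               # drift mu(x); arbitrary illustrative choice
sig2 = 1.0 + 0.5 * x ** 2  # squared diffusion sigma^2(x) > 0; also arbitrary

f = np.exp(-(x - 1.0) ** 2)
g = np.exp(-((x + 0.5) ** 2) / 2)

d = lambda u: np.gradient(u, dx)      # first derivative on the grid
d2 = lambda u: np.gradient(d(u), dx)  # second derivative

Lg = mu * d(g) + 0.5 * sig2 * d2(g)        # L g
Ldag_f = -d(mu * f) + 0.5 * d2(sig2 * f)   # L^dagger f

lhs = np.sum(f * Lg) * dx      # <f, L g>
rhs = np.sum(g * Ldag_f) * dx  # <g, L^dagger f>
print(lhs, rhs)  # agree up to discretization and rounding error
```

(The agreement is in fact essentially exact here, since the central-difference operator is antisymmetric under the discrete inner product.)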
<h2><a href="http://daviddewhurst.github.io/diffusion-equation-why-not">Heat equation, apparently just for fun</a> (2017-08-01)</h2>
<p>It often happens in research that one writes a considerable amount of code only to find that it won’t turn out to be useful in whatever project one is working on. Such was the case with me today: I wrote a solver for the diffusion equation only to find that it won’t suit the purpose for which I originally wrote it. Oh well! Check out these cool numerical solutions anyway.</p>
<p>Recall that the diffusion equation with constant diffusivity (set to one here) is given by</p>
<script type="math/tex; mode=display">\frac{\partial \rho}{\partial t} = \nabla^2 \rho.</script>
<p>My application required me to solve it in two dimensions only, so that’s what I’ll do here, setting the domain of solution \(\Omega = [0,1]\times[0,1]\).
I considered two basic kinds of boundary conditions:</p>
<ol>
<li>Neumann, given by \( \nabla_{n}\rho = 0 \) (or, more generally, some function of time), where \(n\) is a vector normal to the boundary \(\partial \Omega\).</li>
<li>Dirichlet, given by \(\rho(\partial \Omega) = 0 \) (again, more generally, some function of time).</li>
</ol>
<p>First, let’s see what happens when we start out with a multivariate normal distribution centered at \( \mu = (\frac{1}{2}, \frac{1}{2})^T \) with covariance matrix given by \( \Sigma = \frac{1}{4} I \) and use Neumann boundary conditions:</p>
<p><img src="/documents/heat_eqn_neumann_mult_norm.png" alt="png" /></p>
<p>The left panel is the initial condition, while the right is the state of the system after \(N_t\) timesteps of size \(\Delta t\).
Since I won’t be able to use this code for my research, I just made some fun shapes…</p>
<p><img src="/documents/heat_eqn_neumann_squares.png" alt="png" /></p>
<p><img src="/documents/heat_eqn_neumann_uneven_outline.png" alt="png" /></p>
<p>The above both use Neumann BCs.
For an interesting example of how boundary conditions <strong>really</strong> matter in PDEs (even with ones as straightforward as these!), contrast the following two simulations.
The first uses the Neumann BCs, the second uses Dirichlet.</p>
<p><img src="/documents/heat_eqn_neumann_corner_square.png" alt="png" /></p>
<p><img src="/documents/heat_eqn_dirichlet_corner_square.png" alt="png" /></p>
<p>Physically, we can interpret the Dirichlet conditions as saying that there is some energy source (or sink, in this case) that constantly injects energy into (or withdraws it from) \(\Omega\) in order to keep the boundary at a fixed temperature.
The Neumann conditions, on the other hand, describe the case of a perfectly insulated boundary.</p>
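<p>The scheme behind these pictures can be sketched in a few lines. What follows is an illustrative reimplementation (not the code used for the figures): an explicit forward-time, centered-space update, with the Neumann condition imposed by reflecting the field at the boundary.</p>

```python
import numpy as np

# Explicit (FTCS) update for rho_t = laplacian(rho) on the unit square.
n = 51
dx = 1.0 / (n - 1)
dt = 0.2 * dx ** 2  # stability requires dt <= dx^2 / 4 in two dimensions

xx, yy = np.meshgrid(np.linspace(0, 1, n), np.linspace(0, 1, n))
rho = np.exp(-8 * ((xx - 0.5) ** 2 + (yy - 0.5) ** 2))  # Gaussian bump

def step_neumann(rho):
    # Insulated boundary: pad with edge values so the normal derivative is zero.
    p = np.pad(rho, 1, mode="edge")
    lap = (p[2:, 1:-1] + p[:-2, 1:-1] + p[1:-1, 2:] + p[1:-1, :-2]
           - 4 * p[1:-1, 1:-1]) / dx ** 2
    return rho + dt * lap

mass = rho.sum()
for _ in range(200):
    rho = step_neumann(rho)
print(rho.sum() / mass)  # ~1: Neumann BCs conserve total heat
```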
<h3 id="update">Update</h3>
<p>We’d better just solve the Laplace equation too, while we’re at it.
This is the steady state heat equation,</p>
<script type="math/tex; mode=display">\nabla^2\rho = 0,</script>
<p>and here I’ll solve it with Dirichlet boundary conditions only. What follows isn’t even related to anything else I’m doing–100% of this is just for fun.</p>
<p>With the BCs \( \rho(x, 0) = \cos(2\pi x) + 1,\ \rho(x, 1) = \sin(2\pi x) + 1,\ \rho(0, y) = \rho(1, y) = 1\):</p>
<p><img src="/documents/laplace_eqn_dirichlet_sin2pix1_cos2pix1_1_1.png" alt="png" /></p>
<p>With the tent map on the upper and lower boundaries, and zero boundary conditions on the sides:</p>
<p><img src="/documents/laplace_eqn_dirichlet_tent_map_0_0.png" alt="png" /></p>
<p>With the tent map on all sides:</p>
<p><img src="/documents/laplace_eqn_dirichlet_tent_map_all.png" alt="png" /></p>
<p>And yes, just to reassure you that my solutions are accurate, here’s one that many will instinctually recognize: bottom and left walls held constant at one, top and right walls held constant at zero:</p>
<p><img src="/documents/laplace_eqn_dirichlet_11_00.png" alt="png" /></p>
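<p>For that last example, a minimal Jacobi iteration reproduces the solution. This is a sketch independent of my actual solver (the wall assignments are one possible orientation convention):</p>

```python
import numpy as np

# Jacobi iteration for the Laplace equation with Dirichlet data:
# bottom and left walls at 1, top and right walls at 0.
n = 51
rho = np.zeros((n, n))
rho[0, :] = 1.0   # bottom wall (row 0 taken as y = 0)
rho[:, 0] = 1.0   # left wall
rho[-1, :] = 0.0  # top wall
rho[:, -1] = 0.0  # right wall

for _ in range(5000):
    # Each interior node becomes the average of its four neighbors.
    rho[1:-1, 1:-1] = 0.25 * (rho[2:, 1:-1] + rho[:-2, 1:-1]
                              + rho[1:-1, 2:] + rho[1:-1, :-2])
print(rho[n // 2, n // 2])  # ~0.5 at the center, by symmetry
```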
<p>The code is <a href="https://daviddewhurst.github.io/documents/fdm.py">here</a>. Be careful when running it for many timesteps (e.g., over 100K); it stores the entire state of the system at every timestep <strong>in memory</strong>, so that can grow quickly. (I’ll end up changing this behavior eventually.)</p>
<p>N.B.: This was compiled with a new compiler!</p>
<h2><a href="http://daviddewhurst.github.io/particles-torus">Quantum particles on the torus</a> (2017-07-20)</h2>
<p>Here is a simple yet beautiful problem in quantum mechanics: find the explicit form of the wavefunction in the position basis for a single free particle (or multiple, noninteracting, distinguishable free particles) confined to move on the surface of the two-dimensional torus \(\mathbb{T}^2\).</p>
<p>We will just be solving the time-independent Schrödinger equation at first, since we know how to go from those solutions to the time-dependent solutions. Our beginning equation is then</p>
<script type="math/tex; mode=display">H \Psi = E \Psi</script>
<p>I’ll note now, for any physicists reading, that I’ve set all the constants (mass, reduced Planck constant, etc.) to one in the correct units.
Anyway, the fun part of this problem is the torus; what coordinate system should we use? I’ll think of the torus as \(\mathbb{T}^2 = S^1 \times S^1\), which means that we should use a product of polar coordinates \(x = (\theta, \phi) \) with the radii held fixed.
(We could also see this essential linearity of the coordinates by thinking of the torus as the quotient group \(\mathbb{T}^2 = \mathbb{R}^2/\mathbb{Z}^2 \).)
Expressing the Hamiltonian in this basis gives \( \langle x | H | \Psi \rangle = -\nabla^2 \Psi \) (recall that the constants are set to one), so our equation to solve is</p>
<script type="math/tex; mode=display">-\left( \frac{1}{R_{\theta}^2}\frac{\partial^2}{\partial \theta^2} + \frac{1}{R_{\phi}^2}\frac{\partial^2}{\partial \phi^2} \right)\Psi(\theta, \phi) = E \Psi(\theta, \phi).</script>
<p>We also need some boundary conditions. Thankfully, those aren’t hard to figure out: since we’re on the torus, we know that any admissible solution must be periodic in both variables, so \(\Psi(\theta, \phi) = \Psi(\theta + 2 \pi, \phi)\) and \(\Psi(\theta, \phi) = \Psi(\theta, \phi + 2\pi)\).
Since \( \Psi^*(\theta, \phi)\Psi(\theta, \phi)\) must be a probability density, we also have the normalization condition</p>
<script type="math/tex; mode=display">\int_{0}^{2\pi}\int_{0}^{2\pi}d\theta d\phi R_{\theta} R_{\phi}\Psi^*(\theta, \phi)\Psi(\theta, \phi) = 1.</script>
<p>We’re ready to solve! This PDE is separable, so we write the wavefunction as a product of univariate functions \(\Psi(\theta, \phi) = \Theta(\theta)\Phi(\phi)\). Dividing through by \(\Psi\), the PDE becomes a sum of single-variable terms:</p>
<script type="math/tex; mode=display">\frac{1}{R_{\theta}^2}\frac{\Theta''(\theta)}{\Theta(\theta)} + \frac{1}{R_{\phi}^2} \frac{\Phi''(\phi)}{\Phi(\phi)} = -E</script>
<p>We will write \(E = \alpha + \beta\) and break this sum apart into two separate ODEs as</p>
<script type="math/tex; mode=display">\Theta''(\theta) = -\alpha R_{\theta}^2 \Theta(\theta); \qquad \Phi''(\phi) = -\beta R_{\phi}^2 \Phi(\phi).</script>
<p>These are easily solved; the solutions are complex exponentials (periodicity rules out negative \(\alpha\) or \(\beta\), which would give real, unbounded exponentials). We take \(\Theta(\theta) = c_1 \exp\left( -i R_{\theta}\sqrt{\alpha}\theta \right)\) and \(\Phi(\phi) = c_2 \exp\left( -i R_{\phi}\sqrt{\beta}\phi \right)\).
Thus the wavefunction has the form</p>
<script type="math/tex; mode=display">\Psi(\theta, \phi) = c \exp\left( -i(R_{\theta}\sqrt{\alpha}\theta + R_{\phi}\sqrt{\beta}\phi) \right).</script>
<p>There is still some work to be done: we need to find the values \(\alpha\) and \(\beta\) so that we can find an explicit form for \(E\). We also need to figure out the value of the normalization constant \(c\)–let’s do that first. Using our normalization condition, we see that</p>
<script type="math/tex; mode=display">c^2\int_{0}^{2\pi}\int_{0}^{2\pi}d\theta d\phi R_{\theta} R_{\phi} e^{ -i(R_{\theta}\sqrt{\alpha}\theta + R_{\phi}\sqrt{\beta}\phi)} e^{ i(R_{\theta}\sqrt{\alpha}\theta + R_{\phi}\sqrt{\beta}\phi)}= c^2 R_{\theta}R_{\phi}(2\pi)^2</script>
<p>so that, setting the integral equal to \(1\), \(c = \frac{1}{2\pi\sqrt{R_{\theta}R_{\phi}}}\) and the wavefunction is \( \Psi(\theta, \phi) = \frac{1}{2\pi\sqrt{R_{\theta}R_{\phi}}}\exp\left( -i(R_{\theta}\sqrt{\alpha}\theta + R_{\phi}\sqrt{\beta}\phi) \right) \).</p>
<p>Now we will deal with the boundary conditions as we figure out the energy levels. (You will note that this is a way to see that energy <strong>must</strong> be quantized in a quantum system.)
For \(\theta\), we have that \(e^{-iR_{\theta}\sqrt{\alpha}\theta} = e^{-iR_{\theta}\sqrt{\alpha}(\theta + 2\pi)} \), so that \(e^{-iR_{\theta}\sqrt{\alpha}2\pi} = 1 = e^{-i 2\pi n} \) for some \(n \in \mathbb{Z}\). Solving, we find that \(\alpha = \left( \frac{n}{R_{\theta}} \right)^2 \).
Performing an identical procedure for \(\phi\) gives \(\beta = \left( \frac{m}{R_{\phi}} \right)^2 \) with \(m \in \mathbb{Z}\).
Substituting into the wavefunction, we have (finally!) that</p>
<script type="math/tex; mode=display">\Psi_{nm}(\theta, \phi) = \frac{1}{2\pi\sqrt{R_{\theta}R_{\phi}}}\ \exp\left( -i(n\theta + m\phi) \right)</script>
<p>with energy levels given by</p>
<script type="math/tex; mode=display">E_{nm} = \alpha_n + \beta_m = \left( \frac{n}{R_{\theta}} \right)^2 + \left( \frac{m}{R_{\phi}} \right)^2</script>
<p>Now that we’ve done all the hard work, we can have fun. Let’s introduce time and solve the time-dependent Schrödinger equation in the position basis:</p>
<script type="math/tex; mode=display">i\frac{\partial}{\partial t}\Psi(t) = E\Psi(t).</script>
<p>This is a very easy equation; integrating gives the time dependence \(\exp(-i E t)\). Great! Now we can put everything together to find that</p>
<script type="math/tex; mode=display">\Psi_{E_{nm}}(\theta, \phi, t) = \frac{1}{2\pi\sqrt{R_{\theta}R_{\phi}}} \exp(-i E_{nm} t) \exp\left( -i(n\theta + m\phi) \right),</script>
<p>with energy levels given above.</p>
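<p>One fun consequence of the spectrum: distinct pairs \((n, m)\) can share an energy, so levels are degenerate. A short sketch counting degeneracies (unit radii assumed for illustration, and only nonnegative quantum numbers shown for brevity):</p>

```python
from collections import Counter

R_theta = R_phi = 1.0  # unit radii; an assumption made for illustration

# Energy levels E_{nm} = (n / R_theta)^2 + (m / R_phi)^2.
levels = Counter(
    (n / R_theta) ** 2 + (m / R_phi) ** 2
    for n in range(6) for m in range(6)
)
# E.g. E = 25 occurs for (n, m) = (0, 5), (5, 0), (3, 4), (4, 3).
for E in sorted(levels)[:8]:
    print(E, levels[E])
```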
<p>The great part about this problem is that, for noninteracting distinguishable particles, we could just repeat this process <em>ad infinitum</em> if we wanted to. The stationary wavefunction becomes a product of single-particle wavefunctions, \(\Psi = \prod_{j=1}^N\Psi^{(j)}(\theta_j, \phi_j) \); in this case, we’d have</p>
<script type="math/tex; mode=display">\Psi_{n_1m_1,...,n_Nm_N}(\theta_1,\phi_1,...,\theta_N,\phi_N) = \left( \frac{1}{2\pi\sqrt{R_{\theta}R_{\phi}}} \right)^{N} \prod_{j=1}^N \exp\left( -i(n_j\theta_j + m_j\phi_j) \right)</script>
<h2><a href="http://daviddewhurst.github.io/carbon-emissions">An efficient mechanism for carbon trading</a> (2017-07-19)</h2>
<p>Here’s a simple mechanism for formulating a carbon-trading market. This won’t be the fanciest thing ever, but it is efficient (defined in a very precise way) and guarantees a cap on total emissions into the infinite future, provided cheating the mechanism isn’t possible.</p>
<h3 id="primary-market-pricing">Primary market pricing</h3>
<p>Our goal is to limit total carbon emissions to, say, \(A\) tons of carbon.
In each time period \(t \in \mathbb{N}\) people want to emit carbon; writing \(A_t\) for the amount of carbon that may be emitted in period \(t\), we have the natural constraint \(\sum_{t = 0}^{\infty}A_t = A\).
Now, if we impose the condition that in each time period there will be precisely \(n\) contracts traded, and that each contract expires at the end of its issued time period, then we can rewrite this condition as</p>
<script type="math/tex; mode=display">A = \sum_{t = 0}^{\infty} A_t = \sum_{t = 0}^{\infty} na_t = n\sum_{t = 0}^{\infty} a_t = na,</script>
<p>so that the optimization problem can actually be formulated on a per-contract basis with the constraint that \(\sum_{t = 0}^{\infty}a_t = a\).
We suppose that each contract amount of carbon \(a_t\) causes some cost \(c_t\) to the planet and society of the form \(c_t = f(a_t)\).
Our objective will be to minimize the economic cost associated with the primary sale of carbon emissions contracts on a per-unit-cost basis; that is, if \(p_t \) is the primary sale price of a contract to emit \(a_t\) units of carbon, we require that \(p_t\) carry units such that \(p_t c_t\) is in dollars (or whatever other currency you prefer).
It is thus clear that our optimization problem is</p>
<script type="math/tex; mode=display">\min \sum_{t=0}^{\infty}b_t p_t c_t \qquad \text{s.t. } \sum_{t = 0}^{\infty} a_t = a,</script>
<p>where \(b_t\) is a discounting function; we convert the dollar cost of purchasing future contracts into their net present value.
Substituting our functional relationship between carbon emitted and social cost, we seek to find \(p_t\) such that</p>
<script type="math/tex; mode=display">\frac{\partial}{\partial a_t}\left[ \sum_{t=0}^{\infty}b_t p_t f(a_t) + \lambda\left( a - \sum_{t = 0}^{\infty}a_t \right) \right] = 0.</script>
<p>We find that the optimal primary market price is given by</p>
<script type="math/tex; mode=display">p_t = \lambda\left(b_t \frac{df}{da_t} \right)^{-1}.</script>
<p>The parameter \(\lambda\) cannot be found in terms of \(a_t\) explicitly and we will treat it as a policy parameter below.
For a concrete example of this optimum, assume an exponential discount function (assumptions!) given by \(b_t = e^{-t}\) and a quadratic social cost \(c_t = a_t^2\).
Since \(df/da_t = 2a_t\), the optimal primary market price is then given by \( p_t = \lambda e^{t} / (2 a_t) \).
Simple price theory gives a heuristic understanding of why \(\lim_{a_t \rightarrow 0^+}p_t(a_t) = +\infty\): if demand is essentially perfectly inelastic (at least, for a long amount of time), a shrinking allowance \(a_t\) will naturally lead to large upward price movement.</p>
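<p>To make this concrete, here is a sketch with an assumed geometric emissions schedule \(a_t = (1-q)aq^t\) (my choice, made only so that \(\sum_t a_t = a\)), plugging \(df/da_t = 2a_t\) into the optimum above:</p>

```python
import math

# Concrete example: b_t = e^{-t}, f(a_t) = a_t^2, so p_t = lambda e^t / (2 a_t).
# The geometric schedule a_t = (1 - q) a q^t is an assumed illustration.
lam, a, q = 1.0, 100.0, 0.9

def a_t(t):
    return (1 - q) * a * q ** t

def p_t(t):
    return lam * math.exp(t) / (2 * a_t(t))

print([round(p_t(t), 3) for t in range(5)])
# Prices rise as the per-period allowance a_t shrinks.
```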
<h3 id="primary-market-sale">Primary market sale</h3>
<p>I’m not an expert in auction theory, so I’ll propose two simple (and possibly simplistic) ways of auctioning emissions permits in the primary market using this scheme.</p>
<ol>
<li>Fix the parameter \(\lambda = \lambda_t\) every time period and randomly select \(n\) out of \(m_t\) market participants to pay the amount \(p_t = \lambda_t \left(b_t \frac{df}{da_t} \right)^{-1}\). In this scenario, each market participant signs a contract acknowledging that they are willing to purchase a single contract at \(p_t\), but understand that they may not be selected to purchase the contract. While analytically pleasant, this mechanism is unlikely to work in practice as it relies on the assumption that each market participant is constrained to purchase only one contract per time period–an unlikely condition for, say, a large manufacturer.</li>
<li>Let \(\lambda\) float and have an open market auction. This is the more practical approach and allows for price discovery beyond the fixed part of the price. This will, however, introduce monopoly concerns.</li>
</ol>
<p>In a realistic case, one might implement a modified version of the second of these two strategies as follows.
There would be income tranches \(y_1,…,y_n\); every tranche below the \(n\)th would have a price-multiplier cap above which permits could not trade. The highest tranche would be for the “major-league” players and would have no price cap; it would be a true open auction.</p>
<p>These mechanisms don’t address trading issues inherent in the secondary market; I’ll have to come back to that.</p>
<h2><a href="http://daviddewhurst.github.io/differentiability-least-squares">Differentiability and least squares</a> (2017-06-30)</h2>
<p>Here is an interesting problem I encountered in <a href="http://www.cems.uvm.edu/~jmwilson/">Mike Wilson’s</a> analysis course: let \(u \in \mathbb{R}^d \) be a unit vector and define the function \(f: \mathbb{R}^d \rightarrow \mathbb{R} \) by</p>
<script type="math/tex; mode=display">f(x) = \inf_{t \in \mathbb{R}} || x - tu ||^2.</script>
<p>Show that \(f\) is differentiable on all of \(\mathbb{R}^d\), and find an expression for \(f'(p)\) in terms of \(p\) and \(u\).</p>
<p>Check it out <a href="https://daviddewhurst.github.io/documents/least_squares_formal.pdf">here</a>!</p>
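<p>(Spoiler warning for the linked write-up.) Since \(\|u\| = 1\), the infimum is attained at \(t^* = \langle x, u \rangle\), giving \(f(x) = \|x\|^2 - \langle x, u \rangle^2\) and \(f'(p) = 2p - 2\langle p, u \rangle u\). A quick numerical check of that closed form:</p>

```python
import numpy as np

# f(x) = ||x||^2 - <x, u>^2 with gradient f'(p) = 2p - 2 <p, u> u.
rng = np.random.default_rng(0)
u = rng.normal(size=4)
u /= np.linalg.norm(u)  # make u a unit vector
p = rng.normal(size=4)

f = lambda x: x @ x - (x @ u) ** 2
grad = 2 * p - 2 * (p @ u) * u

# Brute-force the infimum over a fine grid of t ...
ts = np.linspace(-10, 10, 200001)
brute = ((p - ts[:, None] * u) ** 2).sum(axis=1).min()

# ... and compare the gradient against central finite differences.
h = 1e-6
fd = np.array([(f(p + h * e) - f(p - h * e)) / (2 * h) for e in np.eye(4)])
print(abs(brute - f(p)), np.max(np.abs(fd - grad)))  # both tiny
```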
<p>N.B.: Professor Wilson’s class is quite hard (this is one of the easier problems he assigned in the second half of the semester) but well worth the effort. I highly recommend it to any interested student.</p>
<h2><a href="http://daviddewhurst.github.io/health-insurance">Health insurance is hard</a> (2017-06-29)</h2>
<p>Health insurance has been front-and-center in the news recently, following Congressional Budget Office (CBO) reports that the current House and Senate healthcare bills may cause millions of Americans to lose their healthcare coverage while also removing several taxes imposed by the Affordable Care Act, colloquially known as Obamacare. Insurers haven’t been silent; some have stated that the proposed restructuring of Affordable Care Act exchanges would do wonders for their business, while others have vocally advocated for <a href="https://www.wsj.com/articles/health-insurers-uneasy-with-senates-approach-to-continuous-coverage-1498580450">retaining the individual mandate.</a>
Here, I’ll outline a simple model of a health insurance company to demonstrate why the problem of covering as many people as possible while also ensuring low premiums for those who can’t afford insurance is so difficult. I’ll also suggest guidelines for future policy.</p>
<h3 id="an-illustrative-model">An illustrative model</h3>
<p>The model I’ll detail is really quite simple; no Humana or Aetna here. The insurance company I’ll attempt to describe is the sort of company you might set up: premiums aren’t invested in a financial market, and there aren’t any employees.
Let’s begin with the health insurance consumer.
She has a probability distribution of incurring an illness or injury (or not-illness-or-injury; having nothing wrong with her at all) at time \(t\) given by \(p_i(x, t)\), for events \( x \in \mathbb{X}\).
Associated with each event \(x\) at time \(t\) is a cost \(C_i(x, t)\).
Thus, her expected discounted cost due to injury and illness over her life is given by</p>
<script type="math/tex; mode=display">\mathbb{E}[C_i] = \sum_{t=0}^{\infty}e^{-rt}\sum_{x \in \mathbb{X}}p_i(x,t)C_i(x,t).</script>
<p>Like anyone, she doesn’t want to have to pay this lifetime cost. More than that, though, she wants <em>certainty</em>; she’d rather pay a fair amount of money each month than pay what some event out on the tail of the distribution \(p_i(x,t)\) might cost her. (I’m making the (quite reasonable) assumption that rare events are positively correlated with higher cost.)
A health insurance company is willing to purchase that risk from her at a constant price per month (or whatever other time period you like), call it \(\pi_i\), but in return for taking on this risk they’ll increase the total cost to the consumer by \(\rho_i\), the risk premium.
To find out what they should charge the consumer per month, they’ll solve for \(\pi_i\) in the following equation:</p>
<script type="math/tex; mode=display">\mathbb{E}[C_i] + \rho_i = \sum_{t=0}^{\infty}\pi_i e^{-rt}</script>
<p>Summing the geometric series \(\sum_{t=0}^{\infty}e^{-rt} = 1/(1 - e^{-r})\) and doing some algebra, we see that she’ll be charged \(\pi_i = (\mathbb{E}[C_i] + \rho_i)( 1 - e^{-r} ) \).
So far, so good. If the insurance company does the same thing for each customer \(i\), they’ll have revenue given by \(R = \sum_i \sum_{t=0}^{\infty} e^{-rt}\pi_i\) and expected cost given by \(C = \sum_i \mathbb{E}[C_i]\) for expected profit of \(\mathcal{P} =\sum_i \rho_i\).
(The reader will note that I’m talking about long-run revenue and cost here; we’ve done all of the discounting “up front”, so to speak, so that we can algebraically manipulate quantities when estimating expected profit.)
Already we see the profit incentive for insurance companies to seek out consumers \(i\) with cost functions of low magnitude or event probability distributions with thin tails.</p>
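<p>The pricing rule above is easy to sketch numerically. A minimal illustration follows; all of the numbers below, including the event set, probabilities, costs, discount rate \(r\), and risk premium, are made up:</p>

```python
import math

def premium(expected_cost, risk_premium, r):
    """Per-period premium pi solving E[C_i] + rho_i = sum_t pi * e^{-rt}.
    The geometric series sum_{t>=0} e^{-rt} equals 1 / (1 - e^{-r}),
    so pi = (E[C_i] + rho_i) * (1 - e^{-r})."""
    return (expected_cost + risk_premium) * (1.0 - math.exp(-r))

def expected_lifetime_cost(event_probs, event_costs, r, horizon=2000):
    """Discounted expected cost sum_t e^{-rt} sum_x p(x,t) C(x,t),
    taking p and C time-independent for simplicity."""
    per_period = sum(p * c for p, c in zip(event_probs, event_costs))
    return sum(math.exp(-r * t) * per_period for t in range(horizon))

# Hypothetical consumer: three events (nothing, minor illness, major injury).
probs = [0.97, 0.025, 0.005]
costs = [0.0, 2000.0, 60000.0]
r = 0.05

EC = expected_lifetime_cost(probs, costs, r)
pi = premium(EC, risk_premium=1000.0, r=r)
```

<p>The discounted premium stream then recovers \(\mathbb{E}[C_i] + \rho_i\), as in the pricing equation above.</p>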
<p>Now suppose a policy is implemented such that all consumers \(i'\) must have monthly premiums set to \(\theta_{i'} = \pi_{i'} - \delta_{i'}\), where \(\delta_{i'} > 0\).
For now, we’ll assume that this policy appears for no reason at all, so that whether a consumer falls into the group \(i'\) is uncorrelated with her cost function and event probability distribution. The company can respond in one or both of two ways:</p>
<ol>
<li>Eat the cost; that is, let expected profits become</li>
</ol>
<script type="math/tex; mode=display">% <![CDATA[
\mathcal{P'} = \sum_{i \neq i'} \rho_i + \sum_{i'} (\rho_{i'} - \delta_{i'}) < \mathcal{P}, %]]></script>
<p>so that the company’s risk of ruin is higher and expected profitability is lower; or</p>
<ol>
<li>Charge other customers more; that is, given that <em>initially</em> the company will have the same number of consumers before and after the policy implementation, the revenue function becomes</li>
</ol>
<script type="math/tex; mode=display">R' = \sum_{t = 0}^{\infty}e^{-rt}\left( \sum_{i \neq i'}( \pi_i + \delta_i) + \sum_{i'} (\pi_{i'} - \delta_{i'}) \right)</script>
<p>where \( \sum_i \delta_i = \sum_{i'} \delta_{i'} \); that is, some consumers are in effect subsidizing others. The first option is pretty clearly flawed; the company exists to make a profit, and even more important than quarterly earnings is the company’s survival. Anything that increases its risk of ruin is going to be a no-go.</p>
<p>The second option is flawed, too, since rising premiums will discourage consumers \(i \neq i'\) from entering the insurance market, and likewise encourage consumers \(i'\) to enter the market when they might not have earlier.
This is problematic because the assumption that the discount \(\delta_{i'}\) is uncorrelated with a consumer’s cost function and event probability distribution is not at all realistic. In fact, the usual rationale for health insurance subsidies is to provide previously-unobtainable coverage for the poor.
Yet there is <a href="http://www.cmaj.ca/content/174/7/923.short">overwhelming evidence</a> to support the claim that poverty and sickness are positively correlated.
Thus, increasing enrollment of \(i'\) and decreasing enrollment of \(i \neq i'\) has the effect of increasing the company’s risk profile and thus its risk of ruin, along with decreasing its profitability.
The result? Insurance companies will withdraw from markets that enforce these premiums—which is <a href="https://www.usatoday.com/story/news/nation-now/2017/05/03/iowa-health-insurers-obamacare/309955001/">exactly what we see happening</a> right now.</p>
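<p>The two options can be made concrete with a toy calculation. In the sketch below all numbers are hypothetical, and \(\rho_i\) and \(\delta_{i'}\) are treated as already-discounted lifetime amounts; a negative discount encodes the option-2 surcharge \(\pi_i + \delta_i\) on consumers \(i \neq i'\):</p>

```python
# Toy portfolio: 80 "healthy" and 20 "sick" consumers (hypothetical numbers).
# rhos[i] is consumer i's lifetime risk premium; deltas[i] the mandated
# discount (negative values are surcharges on the unsubsidized group).
rhos = [20.0] * 100
deltas_option1 = [0.0] * 80 + [10.0] * 20     # option 1: eat the cost
deltas_option2 = [-2.5] * 80 + [10.0] * 20    # option 2: surcharge others

def expected_profit(rhos, deltas):
    """P' = sum_i (rho_i - delta_i), with all discounting done up front."""
    return sum(r - d for r, d in zip(rhos, deltas))

baseline = expected_profit(rhos, [0.0] * 100)
option1 = expected_profit(rhos, deltas_option1)
option2 = expected_profit(rhos, deltas_option2)
```

<p>Option 1 simply transfers the mandated discounts out of profit, while option 2 is profit-neutral only so long as enrollment in the two groups stays fixed, which is exactly the assumption that adverse selection breaks.</p>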
<h3 id="individual-mandate">Individual mandate?</h3>
<p>There isn’t an easy fix to this problem; if there were, I wouldn’t be writing about it. The Affordable Care Act, flawed as it is, does contain one important mechanism that the current Congress would do well to retain in their bill, at least in spirit: the individual mandate. This enforces penalties for not purchasing insurance to encourage the spry, healthy consumers (\(i \neq i’\)) to purchase insurance, thus reducing the risk of ruin to the insurance companies and thereby encouraging them to stay in the market. In fact, my criticism of the individual mandate is simply that its penalties for not purchasing insurance aren’t high enough; many consumers decide they incur less disutility from paying the penalty than from purchasing insurance they feel they don’t need. The Swiss <a href="https://en.wikipedia.org/wiki/Healthcare_in_Switzerland">figured this out</a> a long time ago; they mandate that all residents purchase a basic private healthcare plan, while ensuring that insurance companies don’t raise prices of these basic plans to unaffordable levels.</p>
<p><a href="https://twitter.com/share" class="twitter-share-button" data-via="d_r_dewhurst" data-size="large">Tweet</a>
<script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?'http':'https';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+'://platform.twitter.com/widgets.js';fjs.parentNode.insertBefore(js,fjs);}}(document, 'script', 'twitter-wjs');</script></p>Simon’s model displays a first-mover advantage2017-05-11T18:00:00+00:002017-05-11T18:00:00+00:00http://daviddewhurst.github.io/simons-model-first-mover<p><a href="https://www.uvm.edu/pdodds/">Peter Dodds</a>, myself, and some other friendly folks affiliated with the <a href="https://www.uvm.edu/storylab/">Computational Story Lab</a> just published a paper in <a href="https://journals.aps.org/pre/abstract/10.1103/PhysRevE.95.052301">Physical Review E</a> detailing an inherent first-mover advantage in Herbert Simon’s <a href="https://en.wikipedia.org/wiki/Simon_model">preferential attachment model</a>. Check it out!</p>
<p>Re-analysis of the generative algorithm led us to the discovery that the first group actually has a size advantage proportional to \(1/\rho\), where \(\rho\) is the innovation probability.
As \(\rho\) is typically quite small–on the order of \(10^{-2}\) to \(10^{-5}\)–this results in the first group being from 100 to 100000 times (respectively) as large as the mean-field analysis of the algorithm dictates.
There’s even evidence for this mechanism in real-world citation counts (props to Dodds for wrangling this data…expert emacs skills were on display).</p>
<p><a href="https://twitter.com/share" class="twitter-share-button" data-via="d_r_dewhurst" data-size="large">Tweet</a>
<script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?'http':'https';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+'://platform.twitter.com/widgets.js';fjs.parentNode.insertBefore(js,fjs);}}(document, 'script', 'twitter-wjs');</script></p>Random walks on networks2017-03-20T15:30:00+00:002017-03-20T15:30:00+00:00http://daviddewhurst.github.io/random-walks-networks<p>Given some network, weighted or unweighted, directed or undirected, what is the probability that a random walker starting at node \(i\) reaches node \(j\) after \(t\) timesteps?</p>
<p>It’s an old, solved problem, but it’s still fun.
Brief consideration of the problem allows us to write down the solution as</p>
<script type="math/tex; mode=display">\Pr(i \rightarrow j \text{ in } t \text{ timesteps }) = \sum_{\text{all paths}} \Pr(i \rightarrow j_1) \Pr(j_1 \rightarrow j_2) \cdots \Pr(j_{t - 1} \rightarrow j)</script>
<p>Denoting \(\Pr(j_{k} \rightarrow j_{k + 1})\) by \(P(j_{k}, j_{k + 1})\), we can rewrite this as</p>
<script type="math/tex; mode=display">\Pr(i \rightarrow j \text{ in } t \text{ timesteps }) = \sum_{j_1,...,j_{t - 1}} P(i, j_1)P(j_{t - 1}, j)\prod_{\ell = 1}^{t - 2}P(j_{\ell}, j_{\ell + 1})</script>
<p>Note that we can think of the term outside of the product as the “entry” and “exit” condition; if \(P(i, j_1) = 0\) or \(P(j_{t - 1}, j) = 0\) for some \(j_{1}\) or \(j_{t - 1}\), that means that, e.g., node \(i\) isn’t connected to node \(j_1\) and so the probability of transferring to that node is zero.
If these probabilities are zero for <em>all</em> nodes, e.g., \(j_1\), then that means node \(i\) is an isolated node–connected to no other nodes–and therefore we obviously can’t get to node \(j\) from \(i\).</p>
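<p>Since each factor in the path sum is an entry of the one-step transition matrix, \(P_{ij}(t)\) is just the \((i, j)\) entry of the \(t\)-th matrix power, which gives a convenient way to compute it. A sketch for a small, hypothetical unweighted graph:</p>

```python
import numpy as np

def p_ij(A, i, j, t):
    """P_ij(t): probability a walker starting at i is at j after t steps,
    for the uniform walk with P(i, j) = A_ij / K_i."""
    P = A / A.sum(axis=1, keepdims=True)   # row-stochastic transition matrix
    return np.linalg.matrix_power(P, t)[i, j]

# Hypothetical 4-cycle 0-1-2-3-0.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
```

<p>On the 4-cycle, for instance, two steps from node 0 leave the walker at node 0 or node 2 with probability \(1/2\) each.</p>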
<p>From the above we can derive the various forms of \(\Pr(i \rightarrow j \text{ in } t \text{ timesteps }) \equiv P_{ij}(t)\) for different types of networks.</p>
<ul>
<li>
<p><strong>Undirected, unweighted, uniform probability</strong>: Here, \(P(j_{\ell}, j_{\ell + 1})\)
is inversely proportional to the degree of the outgoing node:</p>
<script type="math/tex; mode=display">P(j_{\ell}, j_{\ell + 1}) = \frac{A_{j_{\ell}, j_{\ell + 1}}}{K_{j_{\ell}}}</script>
<p>so that the general equation becomes</p>
<script type="math/tex; mode=display">P_{ij}(t) = \sum_{j_1,...,j_{t - 1}} \frac{A_{i, j_{1}}}{K_i} \frac{A_{j_{t - 1}, j}}{K_{j_{t - 1}}} \prod_{\ell = 1}^{t - 2}\frac{A_{j_{\ell}, j_{\ell + 1}}}{K_{j_{\ell}}}</script>
    <p>The corresponding equation for \( P_{ji}(t) \) is</p>
    <script type="math/tex; mode=display">P_{ji}(t) = \sum_{j_1,...,j_{t - 1}} \frac{A_{j, j_{1}}}{K_j} \frac{A_{j_{t - 1}, i}}{K_{j_{t - 1}}} \prod_{\ell = 1}^{t - 2}\frac{A_{j_{\ell}, j_{\ell + 1}}}{K_{j_{\ell}}}</script>
<p>Thus we see that \(K_i P_{ij}(t) = K_j P_{ji}(t) \), so that the system exhibits detailed balance and thus we could solve for its stationary distribution.
This result was shown by Noh and Rieger in 2004; see <a href="https://arxiv.org/pdf/cond-mat/0307719.pdf">their paper</a>.</p>
</li>
<li>
<p><strong>Undirected, weighted networks, weight-proportional probability</strong>: This problem is almost identical to that given above. Here the transition probabilities are given by</p>
<script type="math/tex; mode=display">P(j_{\ell}, j_{\ell + 1}) = \frac{ w_{j_{\ell}, j_{\ell +1}} }{ \sum_{k \in N(j_{\ell})} w_{j_{\ell} k} } \equiv \frac{ w_{j_{\ell}, j_{\ell +1}} }{w_{j_{\ell}}}</script>
<p>Then the above equations become</p>
<script type="math/tex; mode=display">P_{ij}(t) = \sum_{j_1,...,j_{t - 1}} \frac{ w_{i, j_{1}} }{w_{i}} \frac{ w_{j_{t - 1}, j} }{w_{j_{t - 1}}} \prod_{\ell = 1}^{t - 2}\frac{ w_{j_{\ell}, j_{\ell +1}} }{w_{j_{\ell}}}</script>
<p>and</p>
<script type="math/tex; mode=display">P_{ji}(t) = \sum_{j_1,...,j_{t - 1}} \frac{ w_{j, j_{1}} }{w_{j}} \frac{w_{j_{t - 1}, i}}{w_{j_{t - 1}}} \prod_{\ell = 1}^{t - 2}\frac{ w_{j_{\ell}, j_{\ell +1}} }{w_{j_{\ell}}}</script>
<p>so that this system also exhibits detailed balance; \(w_iP_{ij}(t) = w_jP_{ji}(t)\).</p>
</li>
<li>
<p>Here is an interesting example. Suppose a network has some “central” node \(n_0\), and define the distance from node \(j\) to this central node by</p>
<script type="math/tex; mode=display">r(j) = d(j, n_0) \equiv \text{length of the shortest path from } j \text{ to } n_0</script>
<p>Suppose a random walker of mass \(m = 1\) moves through the network influenced by some potential function \(V(r) = r^2 \), and that the walker passes from node \(j_{\ell}\) to node \(j_{\ell + 1}\) with probability proportional to the negative inverse of the force exerted on the walker by node \(j_{\ell + 1}\).
Considering the corresponding deterministic flow on the line, we see that \(\ddot{r} = -2r\), so the deterministic system is conservative.
Then \(F = -\frac{\partial V}{\partial r} = -2r\), so we have</p>
<script type="math/tex; mode=display">P(j_{\ell}, j_{\ell + 1}) = \frac{1}{2r(j_{\ell + 1})} \Big/ \sum_{k \in N(j_{\ell})}\frac{1}{2r(k)} = \Big( \sum_{k \in N(j_{\ell})} \frac{r(j_{\ell + 1})}{r(k)} \Big)^{-1}</script>
<p>whereupon the general equation for \(P_{ij}(t)\) becomes</p>
<script type="math/tex; mode=display">P_{ij}(t) = \sum_{j_1,...,j_{t - 1}} \Big(4 r(j_1)r(j) \prod_{\ell = 1}^{t - 2}\sum_{k \in N(j_{\ell})} \frac{r(j_{\ell + 1})}{r(k)} \Big)^{-1}</script>
<p>We find that \( r(j)P_{ij}(t) = r(i)P_{ji}(t) \) which we write as</p>
<script type="math/tex; mode=display">\frac{P_{ij}(t)}{r(i)} = \frac{P_{ji}(t)}{r(j)}</script>
<p>So we find that the stationary distribution is proportional to the inverse of the distance from the origin:</p>
<script type="math/tex; mode=display">P_i(\infty) \propto \frac{1}{r(i)}</script>
<p>What about if \(r(i) = 0\)–that is, if we’re already at \(n_0\)?
This is a systemic problem in classical physics that is due to the concept of
a point mass, and cannot be resolved here.
(Consider, for another example, the gravitational force…)</p>
</li>
</ul>
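<p>The detailed-balance relations in the first two cases above are easy to verify numerically. A sketch, using an arbitrary random weighted graph and the relation \(w_i P_{ij}(t) = w_j P_{ji}(t)\):</p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary symmetric weight matrix on 5 nodes, zero diagonal (hypothetical).
W = rng.uniform(0.5, 2.0, size=(5, 5))
W = (W + W.T) / 2.0
np.fill_diagonal(W, 0.0)

w = W.sum(axis=1)                   # node strengths w_i
P = W / w[:, None]                  # P(i, j) = w_ij / w_i
Pt = np.linalg.matrix_power(P, 7)   # t = 7 steps

balance = w[:, None] * Pt           # entries w_i P_ij(t); should be symmetric
pi = w / w.sum()                    # candidate stationary distribution
```

<p>The matrix of entries \(w_i P_{ij}(t)\) comes out symmetric, and every row of a high matrix power converges to the stationary distribution proportional to \(w_i\).</p>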
<h3 id="distribution-of-first-passage">Distribution of first passage</h3>
<p>Before considering another interesting case, let us discuss the distribution of first passage; that is</p>
<script type="math/tex; mode=display">\Pr(i \rightarrow j \text{ for the first time in } t \text{ timesteps }) \equiv F_{ij}(t)</script>
<p>If we think about \(P_{ij}(t)\) carefully, we realize that we can write it as</p>
<script type="math/tex; mode=display">P_{ij}(t) = \sum_{\tau = 0}^t \Pr(i \rightarrow j \text{ for the first time in } \tau \text{ timesteps }) \Pr(\text{is at } j \text{ after } t - \tau \text{ timesteps})</script>
<p>which we can write as</p>
<script type="math/tex; mode=display">P_{ij}(t) = \sum_{\tau = 0}^t F_{ij}(\tau)P_{jj}(t - \tau)</script>
<p>We’re almost right, but we need a correction term.
What happens if we’re already at \(j\) when we start?
Really, we should write the above as \( P_{ij}(t) = [t = 0] [i = j] + \sum_{\tau = 0}^t F_{ij}(\tau)P_{jj}(t - \tau) \) or, using Kronecker’s delta,</p>
<script type="math/tex; mode=display">P_{ij}(t) = \delta(t) \delta(i - j) + \sum_{\tau = 0}^t F_{ij}(\tau)P_{jj}(t - \tau)</script>
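<p>Since \(P_{jj}(0) = 1\) and \(F_{ij}(0) = 0\), this relation can also be unwound directly in the time domain as \(F_{ij}(t) = P_{ij}(t) - \sum_{\tau = 1}^{t - 1} F_{ij}(\tau) P_{jj}(t - \tau)\) for \(t \geq 1\). A sketch on a hypothetical triangle graph, where \(F_{01}(t) = (1/2)^t\) for \(t \geq 1\):</p>

```python
import numpy as np

def first_passage(P, i, j, T):
    """F_ij(t) for t = 0..T via the renewal recursion
    F_ij(t) = P_ij(t) - sum_{tau=1}^{t-1} F_ij(tau) P_jj(t - tau)."""
    powers = [np.linalg.matrix_power(P, t) for t in range(T + 1)]
    F = np.zeros(T + 1)
    for t in range(1, T + 1):
        F[t] = powers[t][i, j] - sum(F[tau] * powers[t - tau][j, j]
                                     for tau in range(1, t))
    return F

# Uniform walk on a triangle (3-cycle).
A = np.ones((3, 3)) - np.eye(3)
P = A / A.sum(axis=1, keepdims=True)
F = first_passage(P, 0, 1, 60)
```

<p>Summing \(F_{01}(t)\) over \(t\) recovers (essentially) 1, and the mean first-passage time \(\sum_t t F_{01}(t)\) comes out to 2, as the geometric form dictates.</p>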
<p>Now we have to solve this functional equation for \(F_{ij}(t)\).
There are a lot of ways to do this, but probably the easiest is to use some sort of frequency-space transform.
Here, we’ll use the Laplace transform. (An instructive exercise would be to use the Z-transform to replicate these results.)
Applying the transform to both sides of the above, denoting \( \mathcal{L}\{ \cdot \} \equiv \tilde{\cdot} \), and using common identities gives</p>
<script type="math/tex; mode=display">\tilde{P_{ij}}(s) = \delta(i - j) + \sum_{t = 0}^{\infty}\Big( \sum_{\tau = 0}^t F_{ij}(\tau)P_{jj}(t - \tau) \Big) e^{ -st }</script>
<p>The first term on the right was easy, since the Laplace transform of the Dirac spike is the identity.
The second sum might worry us momentarily, but substituting \(u = t - \tau\), extending the inner sum to infinity, and noting that \(P_{jj}(u) = 0\) for \(u < 0\) lets us factor it as a product of transforms:</p>
<script type="math/tex; mode=display">\sum_{t = 0}^{\infty}\Big( \sum_{\tau = 0}^t F_{ij}(\tau)P_{jj}(t - \tau) \Big) e^{ -st } = \Big( \sum_{\tau = 0}^{\infty} F_{ij}(\tau) e^{ -s\tau } \Big) \Big( \sum_{u = 0}^{\infty} P_{jj}(u) e^{ -su } \Big)</script>
<p>so that the Laplace transform of the whole equation is actually just</p>
<script type="math/tex; mode=display">\tilde{P_{ij}}(s) = \delta(i - j) + \tilde{F_{ij}}(s) \tilde{P_{jj}}(s)</script>
<p>whereupon we see that the first passage distribution in frequency space is</p>
<script type="math/tex; mode=display">\tilde{F_{ij}}(s) = \frac{\tilde{P_{ij}}(s) - \delta(i - j)}{\tilde{P_{jj}}(s)}</script>
<p>Now, the mean first passage time from \(i\) to \(j\), denoted \(\langle T_{ij} \rangle \), is just \( \langle T_{ij} \rangle \equiv \mathbb{E}_{F_{ij}}[t] = \sum_{t = 0}^{\infty}t F_{ij}(t) \).
Let us introduce the operator \( \theta_n \equiv (-1)^n \frac{d^n}{ds^n} \).
Then we see that the first moment of \(F_{ij}(t)\) is just \(\langle T_{ij} \rangle = (\theta_1 \tilde{F_{ij}})(0) \).
Performing this calculation we find that</p>
<script type="math/tex; mode=display">\langle T_{ij} \rangle = \mathbb{E}_{P_{jj}}[t] - \mathbb{E}_{P_{ij}}[t]</script>
<p>valid for \(i \neq j\).
Now, we may want to examine more specialized cases of this distribution.
For example, we could ask what the average time of first return is–how long, on average, will our walker spend traversing the network before returning to the node from which it started?
To answer this question requires a bit of work.
First, suppose that \(P_{ij}\) satisfies detailed balance.
Letting \(\pi_i\) be the equilibrium probability of the system being in state \(i\) (i.e., of the walker being at node \(i\) ), we will denote the equilibrium distribution as \(P_i(\infty) = \pi_i / c_i\) where \(c_i\) is the normalization constant.
(Recall that we are, for now, considering finite networks, so that all distributions are normalizable.)
Since this distribution has a steady-state, let us consider the transient distribution \(Q_{ij}(t) \equiv P_{ij}(t) - P_j(\infty) \).
We will denote the moments of this distribution by \(\rho_{ij}(n) = (\theta_n \tilde{Q_{ij}})(0) \).
Separating into steady-state and transient parts, we then have</p>
<script type="math/tex; mode=display">\tilde{P_{ij}}(s) = \frac{\pi_j c_j^{-1}}{1 - e^{-s}} + \text{ transient terms }</script>
<p>where the transient terms are simply the Laplace transform of the transient distribution: \( \sum_{t = 0}^{\infty} (P_{ij}(t) - P_j(\infty))e^{-st} \ \).
Now, \( e^{- st} = \sum_{n=0}^{\infty}\frac{(-1)^n s^n t^n}{n!} \), so the above sum becomes</p>
<script type="math/tex; mode=display">\sum_{t = 0}^{\infty} ( P_{ij}(t) - P_j(\infty) ) \sum_{n=0}^{\infty}\frac{(-1)^n s^n t^n}{n!} = \sum_{n = 0}^{\infty} \Big( \sum_{t = 0}^{\infty} ( P_{ij}(t) - P_j(\infty) ) t^n \Big)\frac{(-1)^n s^n }{n!}</script>
<p>which we recognize (with joy!) as \( \sum_{n = 0}^{\infty} \rho_{ij}(n) \frac{(-1)^n s^n }{n!} \).
Then the above formula becomes</p>
<script type="math/tex; mode=display">\tilde{P_{ij}}(s) = \frac{\pi_j c_j^{-1}}{1 - e^{-s}} + \sum_{n = 0}^{\infty} \rho_{ij}(n) \frac{(-1)^n s^n }{n!}</script>
<p>Substituting this into the equation for the first passage distribution in frequency space and then calculating \( (\theta_1 \tilde{F_{ij}})(0) \) gives the general mean time of first passage as</p>
<script type="math/tex; mode=display">\langle T_{ij} \rangle =
\begin{cases}
\frac{c_{j}}{\pi_{j}} \text{ if } i = j \\
\frac{c_{j}}{\pi_{j}}(\rho_{jj}(0) - \rho_{ij}(0)) \text{ if } i \neq j
\end{cases}</script>
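<p>For the uniform walk on a connected undirected network, the \(i = j\) case reduces to Kac's formula \(\langle T_{jj} \rangle = 1/P_j(\infty) = 2M/K_j\), where \(M\) is the number of edges. We can sanity-check this by solving the standard linear system for mean hitting times (a sketch; the small graph below is arbitrary):</p>

```python
import numpy as np

def mean_return_time(A, j):
    """Mean first-return time to j for the walk P(i, k) = A_ik / K_i,
    via mean hitting times h_i solving h_i = 1 + sum_k P(i, k) h_k
    for i != j, with h_j = 0."""
    n = A.shape[0]
    P = A / A.sum(axis=1, keepdims=True)
    others = [i for i in range(n) if i != j]
    h = np.zeros(n)
    h[others] = np.linalg.solve(np.eye(n - 1) - P[np.ix_(others, others)],
                                np.ones(n - 1))
    return 1.0 + P[j] @ h

# Arbitrary connected graph with edges (0,1), (1,2), (1,3), (2,3): M = 4.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
```

<p>On this graph, for example, node 0 has degree 1, so its mean return time is \(2M/K_0 = 8\).</p>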
<h3 id="preferential-attachment">Preferential attachment</h3>
<p>With this result in mind, let’s consider one more type of random walk.
Suppose the walker transitions from node \(j_{\ell}\) to \(j_{\ell + 1}\) with probability proportional to some function, call it \(R\), of the edge degree of \(j_{\ell + 1}\):</p>
<script type="math/tex; mode=display">P(j_{\ell}, j_{\ell + 1}) = \frac{R(K_{j_{\ell + 1}})}{\sum_{k \in N(j_{\ell})} R(K_k) } \equiv \frac{R(K_{j_{\ell + 1}})}{R_{j_{\ell}}}</script>
<p>This is similar to many other preferential attachment processes, beginning with Yule and Simon, then de Solla Price, then Barabasi / Albert, etc. The process considered here differs from those above because it does not feature injection of probability into the system; probability is conserved.
Now, from the general equation for \(P_{ij}(t)\) we have that</p>
<script type="math/tex; mode=display">P_{ij}(t) = \sum_{j_1,...,j_{t - 1}} \frac{R(K_{j_1})}{R_i} \frac{R(K_j)}{R_{j_{t - 1}}} \prod_{\ell = 1}^{t - 2}\frac{R(K_{j_{\ell + 1}})}{R_{j_{\ell}}}</script>
<p>We are interested in the time of first return.
Does the function \(R\) affect the time of first return?
We see that the system satisfies detailed balance. In particular, we have</p>
<script type="math/tex; mode=display">R_i R(K_i) P_{ij}(t) = R_j R(K_j)P_{ji}(t)</script>
<p>so that</p>
<script type="math/tex; mode=display">\langle T_{ii} \rangle \propto \frac{1}{R_i R(K_i)}</script>
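<p>This is easy to check numerically: with \(\pi_i \propto R_i R(K_i)\), one-step detailed balance \(\pi_i P(i, j) = \pi_j P(j, i)\) holds because both sides reduce to \(R(K_i) R(K_j)\) up to normalization. A sketch, with an arbitrary graph and the arbitrary choice \(R(k) = k^2\):</p>

```python
import numpy as np

def pa_walk(A, R):
    """Transition matrix with P(i, j) proportional to R(K_j) over the
    neighbors of i, plus the unnormalized stationary weights R_i * R(K_i)."""
    K = A.sum(axis=1)
    edge_weights = A * R(K)[None, :]   # weight R(K_j) on each edge (i, j)
    R_i = edge_weights.sum(axis=1)     # R_i = sum_{k in N(i)} R(K_k)
    return edge_weights / R_i[:, None], R_i * R(K)

# Arbitrary connected graph containing a triangle (so the walk is aperiodic).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)

P, pi = pa_walk(A, lambda k: k ** 2)
pi = pi / pi.sum()
```

<p>The matrix of entries \(\pi_i P(i, j)\) comes out symmetric, and the rows of a high matrix power of \(P\) converge to \(\pi\).</p>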
<p>We will define the first return statistic of a network on which the above process operates in the following manner: let \(p(k)\) be a probability mass function with mean \(\mu \equiv \langle k \rangle\) that describes a network’s degree distribution.
Then the first return statistic \(FR(\mu)\) is given by</p>
<script type="math/tex; mode=display">FR(\mu) = \frac{1}{\mathbb{E}_i [ R(K_i)\sum_{\beta \in N(i)} R(K_\beta) ] }</script>
<p>For example, in the relatively simple case where \( R(\beta) = \beta \), we have that</p>
<script type="math/tex; mode=display">FR(\mu) \propto \mu^{-2}</script>
<p><a href="https://twitter.com/share" class="twitter-share-button" data-via="d_r_dewhurst" data-size="large">Tweet</a>
<script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?'http':'https';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+'://platform.twitter.com/widgets.js';fjs.parentNode.insertBefore(js,fjs);}}(document, 'script', 'twitter-wjs');</script></p>