Jekyll2018-06-02T05:25:40+00:00http://daviddewhurst.github.io/david dewhurstUncertainty, risk, and lossNaturally bounded normal processes2018-05-19T17:00:00+00:002018-05-19T17:00:00+00:00http://daviddewhurst.github.io/bounded-normal-processes<p>Here’s a quick overview of a naturally-bounded Wiener process (I’ll explain what that is momentarily) and
an application to election prediction. In particular, we will see why it isn’t really at all surprising that
Donald Trump won the 2016 presidential election.</p>
<p>Recall that a Wiener process is a type of continuous-time stochastic process that can be characterized in
many ways.
I am going to use the informal characterization (the physicist’s description) and merely say that a
Wiener process is a stochastic process \(W_t\) defined such that \(dW_t \sim \mathcal{N}(\mu=0,
\sigma=\sqrt{dt})\).
Brownian motion \(X_t\) with drift \(\mu(x,t)\) and volatility \(\sigma(x,t)\) is defined by</p>
<script type="math/tex; mode=display">dX_t = \mu(X_t, t)\ dt + \sigma(X_t, t)\ dW_t,</script>
<p>and is fundamental to many areas of science.</p>
<p>Now, we will consider a modified Wiener process with natural bound \([0,1]\).
By naturally bounded we mean that we are not imposing boundary conditions on the process \(X_t\), but
rather constructing a process with Gaussian (well, almost Gaussian) increments that is still unable to
exit the interval \([0,1]\).
Here is how we will do this.
Recall the definition of the truncated Gaussian distribution \(\text{Trunc}-\mathcal{N}(\mu, \sigma; a, b)\) as a continuous
random variable with PDF, for \(a \leq x \leq b\), given by</p>
<script type="math/tex; mode=display">p(x|\mu, \sigma, a, b) = \frac{1}{\sigma}\frac{\phi(\frac{x-\mu}{\sigma})}
{\Phi(\frac{b-\mu}{\sigma}) - \Phi(\frac{a-\mu}{\sigma})}.</script>
<p>We have denoted by \(\Phi(z)\) the CDF of the standard Gaussian and by \(\phi(z)\) its derivative.
Define the truncated (or naturally-bounded) Wiener process as</p>
<script type="math/tex; mode=display">dW^{[a,b]}_t \sim \text{Trunc}-\mathcal{N}(0, \sqrt{dt}; a, b),</script>
<p>and its corresponding Brownian motion with drift and volatility is</p>
<script type="math/tex; mode=display">dX_t = \mu(X_t, t)\ dt + \sigma(X_t, t)\ dW^{[-X_t,1-X_t]}_t.</script>
<p>The reader will note that we have normalized the process to the interval \([0,1]\) for convenience of
analysis and for the application that follows.</p>
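As a concrete sketch of the definition above (numpy only; `scipy.stats.truncnorm` would also work), a single naturally-bounded increment can be drawn by rejection sampling — note the starting point `x = 0.95` and the number of draws are illustrative choices, not from the post:

```python
import numpy as np

def bounded_increment(x, dt, rng):
    """Draw dW ~ Trunc-N(0, sqrt(dt); -x, 1-x) by rejection sampling,
    so that x + dW always remains inside [0, 1]."""
    s = np.sqrt(dt)
    while True:
        dw = rng.normal(0.0, s)
        if -x <= dw <= 1.0 - x:
            return dw

rng = np.random.default_rng(0)
x = 0.95                                # start near the upper boundary
dws = np.array([bounded_increment(x, 0.01, rng) for _ in range(2000)])
states = x + dws                        # every post-increment state is in [0, 1]
```

Near the boundary the accepted increments are visibly skewed back toward the interior of the interval, which is exactly the boundary behavior discussed below.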
<p>It is not too hard to see that, with sufficient conditions on \(\mu\) (you can derive these yourself), we have
\(\Pr(X_t \in [0,1]) = 1\) for all \(t\).
We should note that, although this process is very similar in appearance (notation?) to a standard Wiener
process, the resemblance really holds only near \(\frac{1}{2}\), where, for
suitably small volatility \(\sigma\), there is very little probability density near
zero and one even in the standard (unbounded) Gaussian.
If \(X_t\) is close to the natural boundaries zero and one, however, the process becomes heavily skewed back
toward the center of the interval and its behavior diverges substantially from normality.
We refer to this process as naturally-bounded as we have not imposed reflecting or absorbing boundary conditions,
but rather constructed the collection of underlying random variables to have zero probability of crossing
the boundaries.</p>
<p>Here’s an example of what this process can look like.
We will set \(\mu(x,t) \sim \text{Laplace}(\text{mean }=0.01)_t \), where we draw a new Laplace random variable
with mean 0.01 for each \(t\), and set \(\sigma = 0.1\).
We draw ten processes from the naturally-bounded Wiener process distribution, each with initial condition
set to \(X_0 = 0.48\).</p>
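A minimal simulation in this spirit might look as follows; the Laplace scale parameter, the number of steps, and the step size are assumptions, since the post only specifies the drift's mean, \(\sigma = 0.1\), and \(X_0 = 0.48\):

```python
import numpy as np

def simulate_paths(n_paths=10, n_steps=500, dt=1.0 / 500, x0=0.48,
                   sigma=0.1, laplace_scale=0.01, seed=1):
    """Euler scheme for the naturally-bounded process: each Gaussian
    increment is truncated (by rejection) so the next state stays in
    [0, 1]. The drift is redrawn each step from a Laplace centered at
    0.01, as in the text; the Laplace scale is an assumption."""
    rng = np.random.default_rng(seed)
    X = np.full((n_paths, n_steps + 1), x0)
    s = sigma * np.sqrt(dt)
    for t in range(n_steps):
        mu = rng.laplace(loc=0.01, scale=laplace_scale, size=n_paths)
        for i in range(n_paths):
            lo = -X[i, t] - mu[i] * dt       # bounds chosen so that
            hi = 1.0 - X[i, t] - mu[i] * dt  # X + mu*dt + dw is in [0, 1]
            while True:  # rejection-sample the truncated Gaussian noise
                dw = rng.normal(0.0, s)
                if lo <= dw <= hi:
                    break
            X[i, t + 1] = X[i, t] + mu[i] * dt + dw
    return X

paths = simulate_paths()
ensemble_mean = paths.mean(axis=0)   # analogous to the thick black line
```

The truncation bounds here are shifted by \(\mu\,dt\) so that the full Euler step, not just the noise, respects the natural bounds; that is one way to realize the process, not necessarily the post's exact implementation.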
<p><img src="/documents/trunc-norm-laplace-drift.png" alt="png" /></p>
<p>At each \(t\), we compute the spatial mean \(\mathbb{E}_W^{[0,1]}[X_t] \), which is plotted as the
thick black line.
We may also be interested in the spatial probability distributions \(p(X_t)\) at each point in time, which are
shown below.</p>
<p><img src="/documents/trunc-norm-laplace-drift-pdf.png" alt="png" /></p>
<p>We see that, for this specific process, the probability that \(X_T\) (the value of the
naturally-bounded Wiener process at the last time point) is greater than 0.5 is a little more than one half.</p>
<h3 id="application-to-elections">Application to elections</h3>
<p>This process is not only analytically interesting but also useful in understanding the behavior of random
processes that are naturally constrained to lie in some compact domain.
Consider the case of some politicians running for higher office. If the voting system used is purely
a first-past-the-post plurality vote, we can model it pretty well with this process.
We start by denoting each candidate \(i\)’s polling popularity at time \(t\) by \(X^{(i)}_t\)
and noting that, if \(C\) candidates are in the race, we have</p>
<script type="math/tex; mode=display">\Pr(i \text{ wins}) = \Pr\left(X^{(i)}_T \geq \max_{j \neq i} X^{(j)}_T\right).</script>
<p>If there are only two candidates in the race, \(X\) and
\(Y\), this simplifies to \( \Pr(X_T > Y_T) = \Pr(X_T - Y_T > 0) \).</p>
<p>Let us take a (by now) classic case of an election that “no one saw coming”: the 2016 presidential
election between Hillary Clinton and Donald Trump.
I collected data from
<a href="https://daviddewhurst.github.io/documents/RCP_polling_trump_clinton.xls">Real Clear Politics</a>
that gives the result of every head-to-head polling contest between Clinton and Trump up until the day before
the election.
When multiple poll results occurred on the same day, I averaged the results. (I did not take into account
the repute of the pollster <em>à la</em> 538.)
Here’s what the data looks like—again, this is up until the day before the election.</p>
<p><img src="/documents/clinton-trump.png" alt="png" /></p>
<p>Let’s denote these processes \(X\) for Clinton and \(Y\) for Trump.
We can sample from the naturally-constrained processes that generated these mean-field estimates as follows,
assuming that the randomness is, in fact, distributed as a truncated-normal process on \([0,1]\).
There is no reason to assume that the randomness is multiplicative, so we will assume that the noise is
additive and sample from the underlying truncated-normal process.
We set \(\sigma\) to be the volatility of each process—e.g., we have</p>
<script type="math/tex; mode=display">\sigma_X^2 = \text{Var} \left(\frac{d}{dt}\log X_t \right)</script>
<p>and similarly for \(Y\).
Below are the resulting simulations.</p>
<p><img src="/documents/clinton-trump-sampled.png" alt="png" /></p>
<p>Compared to the mean estimates given above, we see a lot more confusion over who exactly is going to win the
popular vote—the result isn’t clear at all!
Looking at the PDF for \(Z_T = X_T - Y_T\), it’s even more of a toss-up.</p>
<p><img src="/documents/clinton-trump-pdf.png" alt="png" /></p>
<p>Wow! This estimate gives that Clinton has only a 58% chance of winning the popular vote on November 8th, 2016.
Since <a href="https://www.washingtonpost.com/opinions/the-power-that-gerrymandering-has-brought-to-republicans/2016/06/17/045264ae-2903-11e6-ae4a-3cdd5fe74204_story.html?utm_term=.e5194048d7fa">Republicans seem to be better at gerrymandering than Democrats</a>,
it shouldn’t really be surprising that Trump might be able to secure a majority of electoral votes.</p>
<p>By the way, if you haven’t been paying attention, that’s what happened.</p>
<p>Looking at this sequence of distributions in time kind of tells the story of the election.</p>
<p><img src="/documents/clinton-trump-pdf-time.png" alt="png" /></p>
<p>We see the distribution start off heavily favoring Clinton, swing over toward favoring Trump for a short
while, and then finally swing back over to favoring Clinton, albeit less than before, until it makes a sharp
run for the center during the final days of the contest.</p>
<p>By the way, if you’re wondering why the estimate \(\Pr(Z_T \geq 0 ) \) is changing in the fourth decimal place
every time I calculate it, it’s because I’m calculating a Monte Carlo estimate for this probability:</p>
<script type="math/tex; mode=display">\Pr(Z_T \geq 0 ) \simeq \frac{1}{N}\sum_{n=1}^N \delta(Z_T^{(n)} \geq 0),</script>
<p>where I have made \(N = \) 10,000 draws from the estimated distribution \(p(Z_T)\).</p>
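In code, the Monte Carlo estimator is just an indicator average. The draws below are synthetic stand-ins for samples from the estimated \(p(Z_T)\) (the location and scale are hypothetical, chosen only to land near the probabilities quoted above):

```python
import numpy as np

rng = np.random.default_rng(42)
N = 10_000
# hypothetical draws standing in for the estimated distribution of Z_T
z_draws = rng.normal(loc=0.01, scale=0.05, size=N)
p_hat = np.mean(z_draws >= 0)        # (1/N) * sum of indicators
# the standard error is ~sqrt(p(1-p)/N), on the order of 0.005 here,
# which is why repeated runs wobble in the later decimal places
```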
<p>About that estimation.
One reasonable criticism could be that my estimate is sensitive to the method by which I estimate \(p(Z_T)\).
So far I have been using a Gaussian kernel, so that the PDF is estimated as</p>
<script type="math/tex; mode=display">p(z) = \frac{1}{Mh}\sum_{m=1}^M \phi\left( \frac{z - Z_T^{(m)}}{h} \right),</script>
<p>where \(h\) is the bandwidth of the kernel.
To alleviate these concerns, I’ll re-estimate \(p\) using the top hat kernel, defined as</p>
<script type="math/tex; mode=display">% <![CDATA[
K(z - Z_T| h) \propto
\begin{cases}
1 & \quad |z - Z_T| \leq h\\
0 & \quad |z - Z_T| > h
\end{cases}. %]]></script>
<p>We will now estimate</p>
<script type="math/tex; mode=display">p(z) = \frac{1}{M}\sum_{m=1}^M K\left( z - Z_T^{(m)}| h \right).</script>
<p>Here are the same quantities as calculated above.</p>
<p><img src="/documents/clinton-trump-pdf-tophat.png" alt="png" />
<img src="/documents/clinton-trump-pdf-time-tophat.png" alt="png" /></p>
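Both kernel estimators can be sketched in a few lines; the draws and the bandwidth \(h\) here are synthetic and arbitrary, respectively:

```python
import numpy as np

def kde(z_grid, draws, h, kernel="gaussian"):
    """Kernel density estimate p(z) = (1/(M*h)) * sum_m K((z - Z_m)/h).
    'gaussian' uses the standard normal pdf; 'tophat' is the boxcar
    kernel with height 1/2 on [-1, 1], so both integrate to one."""
    u = (z_grid[:, None] - draws[None, :]) / h
    if kernel == "gaussian":
        K = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    else:
        K = 0.5 * (np.abs(u) <= 1.0)
    return K.mean(axis=1) / h

rng = np.random.default_rng(7)
draws = rng.normal(0.0, 1.0, size=2000)
grid = np.linspace(-4.0, 4.0, 401)
p_gauss = kde(grid, draws, 0.3, "gaussian")
p_tophat = kde(grid, draws, 0.3, "tophat")

dz = grid[1] - grid[0]
mass_gauss = p_gauss.sum() * dz      # each estimate should integrate to ~1
mass_tophat = p_tophat.sum() * dz
```

The two estimates differ only in smoothness, which is the point of the robustness check: the probability of interest should not move much between them.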
<p>This method gives Clinton a roughly 2% higher probability of winning the election, but that is still a very
far cry from certainty.</p>
<p><a href="https://twitter.com/share" class="twitter-share-button" data-via="d_r_dewhurst" data-size="large">Tweet</a>
<script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?'http':'https';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+'://platform.twitter.com/widgets.js';fjs.parentNode.insertBefore(js,fjs);}}(document, 'script', 'twitter-wjs');</script></p>Continuum preferential attachment, volume 22018-02-19T18:00:00+00:002018-02-19T18:00:00+00:00http://daviddewhurst.github.io/continuum-preferential-attachment-redux<p>A while ago—in fact, too long ago—I <a href="http://daviddewhurst.github.io/choose-a-firm/">noted</a> that
Peter Dodds and I were working on a short paper regarding continuum preferential attachment processes.
Well, <a href="https://arxiv.org/abs/1710.07580">here it is</a>.
We submitted it to Physical Review E and got a “reject with resubmit” late last fall; I haven’t yet had a chance
to resubmit.</p>
<p>To recap: we construct a continuum (read: PDE) mean-field model for preferential attachment processes and solve
it in full generality.
Because we didn’t have anything better to do, we then extended this process to \(N \geq 1\) dimensions.
We then note that the power-law distribution of firm sizes in the US can be partially explained by such a process.
While it is well known that a power law fits these data, a theoretical explanation was not known to us; we propose a
simple economic model here that reproduces this phenomenon.</p>
<p>Let me know if you have any suggestions for the resubmit!</p>
<p><a href="https://twitter.com/share" class="twitter-share-button" data-via="d_r_dewhurst" data-size="large">Tweet</a>
<script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?'http':'https';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+'://platform.twitter.com/widgets.js';fjs.parentNode.insertBefore(js,fjs);}}(document, 'script', 'twitter-wjs');</script></p>Proof of a Bonferroni inequality2017-09-02T23:59:59+00:002017-09-02T23:59:59+00:00http://daviddewhurst.github.io/bonferroni<p>Here is a very enjoyable theorem due to Bonferroni.
Let \(n \geq 2 \) and consider the probability triple \( (\Omega, \mathcal{F}, P) \) and a collection of subsets of \( \Omega \) in \( \mathcal{F} \) denoted \( ( A_i )_{i=1}^n \).
Then the following holds:</p>
<script type="math/tex; mode=display">P \left( \bigcup_{i=1}^n A_i \right) \geq \sum_{i=1}^n P(A_i) - \sum_{i=1}^{n-1}\sum_{j=i+1}^n P(A_i \cap A_j)</script>
<h3 id="proof">Proof</h3>
<p>By induction. The base case \(n = 2\) is exactly inclusion-exclusion for two events (with equality).
Assume the inequality holds up to \(n\) and denote \(B = \bigcup_{i = 1}^n A_i \).
Then inclusion-exclusion implies</p>
<script type="math/tex; mode=display">P(A_{n + 1} \cup B) = P(A_{n + 1}) + P(B) - P(A_{n + 1} \cap B).</script>
<p>Using the distributive law, this becomes</p>
<script type="math/tex; mode=display">P(A_{n + 1} \cap B) = P\Big(A_{n + 1} \cap \bigcup_{i=1}^n A_i \Big) = P\Big( \bigcup_{i=1}^n (A_{n + 1} \cap A_i) \Big)</script>
<p>which, by Boole’s inequality, is less than or equal to \( \sum_{i=1}^n P(A_{n + 1} \cap A_i) \).
So,</p>
<script type="math/tex; mode=display">P(A_{n + 1} \cup B) \geq P(A_{n + 1}) + P(B) - \sum_{i=1}^n P(A_{n + 1} \cap A_i)</script>
<script type="math/tex; mode=display">\qquad \geq P(A_{n + 1}) + \left( \sum_{i=1}^n P(A_i) - \sum_{i=1}^{n-1}\sum_{j=i+1}^n P(A_i \cap A_j) \right) - \sum_{i=1}^n P(A_{n + 1} \cap A_i)</script>
<p>by the inductive hypothesis.
Simplifying, we note that the term \( \sum_{i=1}^n P(A_{n + 1} \cap A_i) \) supplies exactly the \( j = n + 1 \) terms missing from the double sum over \( i < j \), resulting in</p>
<script type="math/tex; mode=display">P(A_{n + 1} \cup B) \geq \sum_{i = 1}^{n + 1} P(A_i) - \sum_{i=1}^{n}\sum_{j=i+1}^{n + 1} P(A_i \cap A_j),</script>
<p>which was to be proved.</p>
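The inequality is easy to spot-check numerically on a finite uniform probability space with randomly chosen events (the space size and event count below are arbitrary):

```python
import itertools
import random

# finite uniform probability space Omega = {0,...,19}; five random events
random.seed(0)
omega = set(range(20))
A = [set(random.sample(sorted(omega), random.randint(1, 15))) for _ in range(5)]

def P(S):
    """Uniform measure: P(S) = |S| / |Omega|."""
    return len(S) / len(omega)

lhs = P(set.union(*A))
rhs = sum(P(Ai) for Ai in A) - sum(
    P(A[i] & A[j]) for i, j in itertools.combinations(range(len(A)), 2))
# the Bonferroni lower bound: lhs >= rhs
```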
<p><a href="https://twitter.com/share" class="twitter-share-button" data-via="d_r_dewhurst" data-size="large">Tweet</a>
<script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?'http':'https';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+'://platform.twitter.com/widgets.js';fjs.parentNode.insertBefore(js,fjs);}}(document, 'script', 'twitter-wjs');</script></p>Brief and (somewhat) intuitive description of the Kolmogorov equations2017-08-08T19:00:00+00:002017-08-08T19:00:00+00:00http://daviddewhurst.github.io/bke-fke<p>I have never taken a course in probability, let alone a course in stochastic differential equations, so I am sort of winging it here. I think the following is one of the shorter “derivations” of the BKE and FKE that can be performed–but the disclaimer above is just in case there’s a well-known shorter derivation of which I’m unaware!</p>
<p>Suppose \( (\mathcal{F}_t)_{t\geq 0}\) is a filtration that carries all known information about the system under study.
This simply means that \(\mathcal{F}_t\) carries all information about the system from time \(t=0\) up to, and including, time \(t\).
Let \(X_t\) be a diffusion process such that the following conditions are satisfied:</p>
<script type="math/tex; mode=display">\mathbb{E}[dX_t | \mathcal{F}_t] = \mu(x,t)dt</script>
<script type="math/tex; mode=display">\mathbb{E}[dX_t^2 | \mathcal{F}_t] = \sigma^2(x,t)dt</script>
<p>Since \(\mathcal{F}_t\) carries all information about the system up to time \(t\), we can rewrite the above as expectations under the space and time variables:</p>
<script type="math/tex; mode=display">\mathbb{E}_{x,t}[dX_t] = \mu(x,t)dt</script>
<script type="math/tex; mode=display">\mathbb{E}_{x,t}[dX_t^2] = \sigma^2(x,t) dt</script>
<h3 id="bke">BKE</h3>
<p>We will first derive the BKE; it is fundamental as it defines the state evolution operator.
The FKE (or Fokker-Planck equation) can be derived from the BKE.
Suppose we have \(t’ > t\).
Let us define \(V(X_T)\) as the function that gives the value of a payoff at time \(T\).
The value at time \(T\) is given by \(f(X_T, T) = V(X_T)\) (the payoff is known!), and hence, moving backward in time, the function \(f\) just gives the expectation of the final payoff: \(f(x,t) = \mathbb{E}_{x,t}[V(X_T)]\).
Thus we can write</p>
<script type="math/tex; mode=display">f(x,t) = \mathbb{E}_{x,t}[ \mathbb{E}[V(X_T) | \mathcal{F}_{t'}] ] = \mathbb{E}_{x,t}[ f(X_{t'}, t') ]</script>
<p>Let us set \(t’ = t + dt \) and \(X_t = x\).
Expanding \(f(X_{t + dt}, t + dt)\) in Taylor series gives</p>
<script type="math/tex; mode=display">f(X_{t + dt}, t + dt) = f(x,t) + \partial_t f dt + \partial_x f dX_t + \frac{1}{2}\partial_x^2 f dX_t^2 + \cdots</script>
<p>(We will truncate terms of the expansion at \(\mathcal{O}(dt^2)\).)
Expanding the above expectation, we have</p>
<script type="math/tex; mode=display">\mathbb{E}_{x,t}[ f(X_{t + dt}, t + dt) ] \simeq \mathbb{E}_{x,t}[f(x,t) + \partial_t f dt + \partial_x f dX_t + \frac{1}{2}\partial_x^2 f dX_t^2]</script>
<script type="math/tex; mode=display">= f(x,t) + \partial_t f dt + \partial_x f \mathbb{E}_{x,t}[dX_t] + \frac{1}{2}\partial_x^2f \mathbb{E}_{x,t}[dX_t^2]</script>
<script type="math/tex; mode=display">= f(x,t) + \partial_t f dt + \mu(x,t) \partial_x f dt + \frac{\sigma^2(x,t)}{2}\partial_x^2 f dt.</script>
<p>Now, since \( f(x,t) = \mathbb{E}_{x,t}[ f(X_{t + dt}, t + dt) ] \), we subtract \(f(x,t)\) from both sides of the equation to find the BKE:</p>
<script type="math/tex; mode=display">\partial_t f = -\mu(x,t)\partial_x f - \frac{\sigma^2(x,t)}{2}\partial_x^2f</script>
<h3 id="fke">FKE</h3>
<p>Clearly, the BKE can be rewritten as an operator equation:</p>
<script type="math/tex; mode=display">\partial_t f = - \mathcal{L}f,</script>
<p>where we have defined the linear operator \(\mathcal{L} = \mu(x,t) \partial_x + \frac{\sigma^2(x,t)}{2}\partial_x^2 \).
Heuristically speaking, the BKE propagates the expected value of a terminal payoff backward from the terminal time to the initial point \( (x,t) \); this is why it’s the more fundamental of the two equations.
What we would like is an equation that describes the probability distribution of the process going forward from the current point \( (x', \tau) \).
It is intuitive that this equation is given by</p>
<script type="math/tex; mode=display">\partial_{\tau}f = \mathcal{L}^{\dagger}f,</script>
<p>where the dagger denotes the adjoint; we want the operator to work “backwards” in some way.
Our task is thus to find the adjoint operator of \(\mathcal{L}\).
Since we have been implicitly assuming (by use of the double expectations above) that \(f \in L^2\), we must solve the operator equation</p>
<script type="math/tex; mode=display">\langle f, \mathcal{L}g \rangle_{L^2} = \langle g, \mathcal{L}^{\dagger}f \rangle_{L^2}</script>
<p>To do this we will integrate by parts:</p>
<script type="math/tex; mode=display">\langle f, \mathcal{L}g \rangle_{L^2} = \int_{-\infty}^{\infty} dx f(x) ( \mu(x) \partial_x g + \frac{1}{2}\sigma^2(x) \partial_x^2g)</script>
<script type="math/tex; mode=display">= g(x)\mu(x)f(x)|_{-\infty}^{\infty} - \int_{-\infty}^{\infty}dx g(x) \partial_x(\mu(x)f(x)) + \frac{1}{2}\sigma^2(x)f(x)\partial_xg|_{-\infty}^{\infty} - \frac{1}{2}\int_{-\infty}^{\infty}dx (\partial_x g)\partial_x(\sigma^2(x) f(x))</script>
<script type="math/tex; mode=display">= - \int_{-\infty}^{\infty}dx g(x) \partial_x(\mu(x)f(x)) - \frac{1}{2}g(x)\partial_x(\sigma^2(x)f(x))|_{-\infty}^{\infty} + \frac{1}{2}\int_{-\infty}^{\infty}dx g(x) \partial_x^2(\sigma^2(x)f(x))</script>
<script type="math/tex; mode=display">= \int_{-\infty}^{\infty} dx g(x) \left[ -\partial_x(\mu(x)f(x)) + \frac{1}{2}\partial_x^2(\sigma^2(x)f(x)) \right]</script>
<p>Thus the adjoint operator must be</p>
<script type="math/tex; mode=display">\mathcal{L}^{\dagger} = -\partial_x \mu(x, \tau) + \frac{1}{2}\partial_x^2 \sigma^2(x,\tau),</script>
<p>giving the familiar Fokker-Planck equation as</p>
<script type="math/tex; mode=display">\partial_{\tau} f = - \partial_x(\mu(x,\tau) f) + \frac{1}{2}\partial_x^2(\sigma^2(x,\tau) f)</script>
<p>The FKE can be derived formally–it takes a little longer–but this treatment is intuitive. We can understand taking the adjoint of the operator as performing time reversal on the system; note that the sign of the drift term changes in the time reversal.
Note also that the system does not exhibit time symmetry, as the drift and diffusion functions are acted upon by the derivatives in the adjoint but not in the BKE.</p>
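As a numerical sanity check, a crude explicit finite-difference scheme for the FKE with constant \(\mu\) and \(\sigma\) should advance the density's mean at rate \(\mu\) and its variance at rate \(\sigma^2\); the grid sizes and initial width below are arbitrary choices:

```python
import numpy as np

mu, sigma = 0.5, 0.4
x = np.linspace(-5.0, 5.0, 501)
dx = x[1] - x[0]
dt = 0.2 * dx**2 / sigma**2          # conservative explicit time step

# start from a narrow Gaussian near zero, normalized on the grid
f = np.exp(-x**2 / (2 * 0.1**2))
f /= f.sum() * dx

t, T = 0.0, 1.0
while t < T:
    dfdx = np.gradient(f, dx)
    d2fdx2 = np.gradient(dfdx, dx)
    # constant-coefficient FKE: f_t = -mu f_x + (sigma^2 / 2) f_xx
    f = f + dt * (-mu * dfdx + 0.5 * sigma**2 * d2fdx2)
    t += dt

mass = f.sum() * dx
mean = (x * f).sum() * dx
var = ((x - mean)**2 * f).sum() * dx
```

After time \(T = 1\) the mean lands near \(\mu T = 0.5\) and the variance near \(0.1^2 + \sigma^2 T = 0.17\), as the FKE predicts for a drifting, spreading Gaussian.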
<p><a href="https://twitter.com/share" class="twitter-share-button" data-via="d_r_dewhurst" data-size="large">Tweet</a>
<script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?'http':'https';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+'://platform.twitter.com/widgets.js';fjs.parentNode.insertBefore(js,fjs);}}(document, 'script', 'twitter-wjs');</script></p>Heat equation, apparently just for fun2017-08-01T17:00:00+00:002017-08-01T17:00:00+00:00http://daviddewhurst.github.io/diffusion-equation-why-not<p>It often happens in research that one writes a considerable amount of code only to find that it won’t turn out to be useful in whatever project one is working on. Such was the case with me today: I wrote a solver for the diffusion equation only to find that it won’t suit the purpose for which I originally wrote it. Oh well! Check out these cool numerical solutions anyway.</p>
<p>Recall that the diffusion equation with constant rate of diffusivity is given by</p>
<script type="math/tex; mode=display">\frac{\partial \rho}{\partial t} = \nabla^2 \rho.</script>
<p>My application required me to solve it in two dimensions only, so that’s what I’ll do here, setting the domain of solution \(\Omega = [0,1]\times[0,1]\).
I considered two basic kinds of boundary conditions:</p>
<ol>
<li>Neumann, given by \( \nabla_{n}\rho = 0 \) (or, more generally, some function of time), where \(n\) is a vector normal to the boundary \(\partial \Omega\).</li>
<li>Dirichlet, given by \(\rho(\partial \Omega) = 0 \) (again, more generally, some function of time).</li>
</ol>
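A minimal sketch of an explicit scheme handling both boundary conditions (the grid size and step count are illustrative; the post's actual solver is linked at the end):

```python
import numpy as np

def step(rho, dx, dt, bc="neumann"):
    """One explicit Euler step of rho_t = laplacian(rho) on a square grid.
    Dirichlet pins the boundary at zero; zero-flux Neumann copies the
    adjacent interior values onto the boundary."""
    lap = (np.roll(rho, 1, 0) + np.roll(rho, -1, 0) +
           np.roll(rho, 1, 1) + np.roll(rho, -1, 1) - 4.0 * rho) / dx**2
    new = rho + dt * lap
    if bc == "dirichlet":
        new[0, :] = new[-1, :] = new[:, 0] = new[:, -1] = 0.0
    else:
        new[0, :], new[-1, :] = new[1, :], new[-2, :]
        new[:, 0], new[:, -1] = new[:, 1], new[:, -2]
    return new

n = 51
dx = 1.0 / (n - 1)
dt = 0.2 * dx**2                     # explicit stability needs dt <= dx^2 / 4
xx, yy = np.meshgrid(np.linspace(0, 1, n), np.linspace(0, 1, n))
rho = np.exp(-((xx - 0.5)**2 + (yy - 0.5)**2) / (2 * 0.25**2))
peak0 = rho.max()
for _ in range(500):
    rho = step(rho, dx, dt, bc="neumann")
```

Note the boundary rows and columns are overwritten after the (wraparound) Laplacian, so the periodic `np.roll` artifacts never touch the solution.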
<p>First, let’s see what happens when we start out with a multivariate normal distribution centered at \( \mu = (\frac{1}{2}, \frac{1}{2})^T \) with covariance matrix given by \( \Sigma = \frac{1}{4} I \) and use Neumann boundary conditions:</p>
<p><img src="/documents/heat_eqn_neumann_mult_norm.png" alt="png" /></p>
<p>The left panel is the initial condition, while the right is the state of the system after \(N_t\) timesteps of size \(\Delta t\).
Since I won’t be able to use this code for my research, I just made some fun shapes…</p>
<p><img src="/documents/heat_eqn_neumann_squares.png" alt="png" /></p>
<p><img src="/documents/heat_eqn_neumann_uneven_outline.png" alt="png" /></p>
<p>The above are both using the Neumann BCs.
For an interesting example of how boundary conditions <strong>really</strong> matter in PDEs (even with ones as straightforward as these!), contrast the following two simulations.
The first uses the Neumann BCs, the second uses Dirichlet.</p>
<p><img src="/documents/heat_eqn_neumann_corner_square.png" alt="png" /></p>
<p><img src="/documents/heat_eqn_dirichlet_corner_square.png" alt="png" /></p>
<p>Physically, we can interpret the Dirichlet conditions as saying that there is some energy source (or sink, in this case) that constantly injects energy into (or withdraws energy from) \(\Omega\) in order to keep the boundary at a fixed temperature.
The Neumann conditions, on the other hand, describe the case of a perfectly insulated boundary.</p>
<h3 id="update">Update</h3>
<p>We’d better just solve the Laplace equation too, while we’re at it.
This is the steady state heat equation,</p>
<script type="math/tex; mode=display">\nabla^2\rho = 0,</script>
<p>and here I’ll solve it with Dirichlet boundary conditions only. What follows isn’t even related to anything else I’m doing–100% of this is just for fun.</p>
<p>With the BCs \( \rho(x, 0) = \cos(2\pi x) + 1,\ \rho(x, 1) = \sin(2\pi x) + 1,\ \rho(0, y) = \rho(1, y) = 1\):</p>
<p><img src="/documents/laplace_eqn_dirichlet_sin2pix1_cos2pix1_1_1.png" alt="png" /></p>
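A Jacobi-iteration sketch for this first set of boundary conditions (grid size and iteration count are arbitrary, and this is a sketch rather than the linked solver):

```python
import numpy as np

# Laplace equation on [0,1]^2 with rho(x,0) = cos(2 pi x) + 1,
# rho(x,1) = sin(2 pi x) + 1, and the sides held at one
n = 65
xs = np.linspace(0.0, 1.0, n)
u = np.ones((n, n))                  # row index = y, column index = x
u[0, :] = np.cos(2.0 * np.pi * xs) + 1.0
u[-1, :] = np.sin(2.0 * np.pi * xs) + 1.0
u[:, 0] = u[:, -1] = 1.0

for _ in range(5000):                # Jacobi sweep: average of neighbors
    u[1:-1, 1:-1] = 0.25 * (u[2:, 1:-1] + u[:-2, 1:-1] +
                            u[1:-1, 2:] + u[1:-1, :-2])

residual = np.abs(u[2:, 1:-1] + u[:-2, 1:-1] + u[1:-1, 2:]
                  + u[1:-1, :-2] - 4.0 * u[1:-1, 1:-1]).max()
```

By the discrete maximum principle, the converged interior values stay between the minimum and maximum of the boundary data.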
<p>With the tent map on the upper and lower boundaries, and zero boundary conditions on the sides:</p>
<p><img src="/documents/laplace_eqn_dirichlet_tent_map_0_0.png" alt="png" /></p>
<p>With the tent map on all sides:</p>
<p><img src="/documents/laplace_eqn_dirichlet_tent_map_all.png" alt="png" /></p>
<p>And yes, just to reassure you that my solutions are accurate, here’s one that many will instinctively recognize: bottom and left walls held constant at one, top and right walls held constant at zero:</p>
<p><img src="/documents/laplace_eqn_dirichlet_11_00.png" alt="png" /></p>
<p>The code is <a href="https://daviddewhurst.github.io/documents/fdm.py">here</a>. Be careful when running it for many timesteps (e.g., over 100K); it stores the entire state of the system at every timestep <strong>in memory</strong>, so that can grow quickly. (I’ll end up changing this behavior eventually.)</p>
<p>N.B.: This was compiled with a new compiler!</p>
<p><a href="https://twitter.com/share" class="twitter-share-button" data-via="d_r_dewhurst" data-size="large">Tweet</a>
<script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?'http':'https';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+'://platform.twitter.com/widgets.js';fjs.parentNode.insertBefore(js,fjs);}}(document, 'script', 'twitter-wjs');</script></p>Quantum particles on the torus2017-07-20T11:59:00+00:002017-07-20T11:59:00+00:00http://daviddewhurst.github.io/particles-torus<p>Here is a simple yet beautiful problem in quantum mechanics: find the explicit form of the wavefunction in the position basis for a single free particle (or multiple, noninteracting, distinguishable free particles) confined to move on the surface of the two-dimensional torus \(\mathbb{T}^2\).</p>
<p>We will just be solving the time-independent Schrödinger equation at first, since we know how to go from those solutions to the time-dependent solutions. Our beginning equation is then</p>
<script type="math/tex; mode=display">H \Psi = E \Psi</script>
<p>I’ll note now, for any physicists reading, that I’ve set all the constants (mass, reduced Planck constant, etc.) to one in the correct units.
Anyway, the fun part of this problem is the torus; what coordinate system should we use? I’ll think of the torus as \(\mathbb{T}^2 = S^1 \times S^1\), which means that we should use a product of polar coordinates \(x = (\theta, \phi) \) with the radius held fixed.
(We could also see this essential linearity of the coordinates by thinking of the torus as the quotient group \(\mathbb{T}^2 = \mathbb{R}^2/\mathbb{Z}^2 \).)
Expressing the Hamiltonian in this basis gives \( \langle x | H | \Psi \rangle = -\nabla^2 \Psi \), so our equation to solve is</p>
<script type="math/tex; mode=display">-\left( \frac{1}{R_{\theta}^2}\frac{\partial^2}{\partial \theta^2} + \frac{1}{R_{\phi}^2}\frac{\partial^2}{\partial \phi^2} \right)\Psi(\theta, \phi) = E \Psi(\theta, \phi).</script>
<p>We also need some boundary conditions. Thankfully, those aren’t hard to figure out: since we’re on the torus, we know that any admissible solution must be periodic in both variables, so \(\Psi(\theta, \phi) = \Psi(\theta + 2 \pi, \phi)\) and \(\Psi(\theta, \phi) = \Psi(\theta, \phi + 2\pi)\).
Since \( \Psi^*(\theta, \phi)\Psi(\theta, \phi)\) must be a probability distribution, we also have the normalization condition</p>
<script type="math/tex; mode=display">\int_{0}^{2\pi}\int_{0}^{2\pi}d\theta d\phi R_{\theta} R_{\phi}\Psi^*(\theta, \phi)\Psi(\theta, \phi) = 1.</script>
<p>We’re ready to solve! This PDE is separable, so we write the wavefunction as a product of univariate functions \(\Psi(\theta, \phi) = \Theta(\theta)\Phi(\phi)\). Now we can write the PDE as a sum of ODEs:</p>
<script type="math/tex; mode=display">-\frac{1}{R_{\theta}^2}\frac{\Theta''(\theta)}{\Theta(\theta)} - \frac{1}{R_{\phi}^2} \frac{\Phi''(\phi)}{\Phi(\phi)} = E</script>
<p>We will write \(E = \alpha + \beta\) and break this sum apart into two separate ODEs as</p>
<script type="math/tex; mode=display">\Theta''(\theta) = -\alpha R_{\theta}^2 \Theta(\theta); \qquad \Phi''(\phi) = -\beta R_{\phi}^2 \Phi(\phi).</script>
<p>These are easily solved; for \(\alpha, \beta > 0\) the solutions are the bounded complex exponentials \(e^{\pm i R_{\theta}\sqrt{\alpha}\theta}\) and \(e^{\pm i R_{\phi}\sqrt{\beta}\phi}\), and we take \(\Theta(\theta) = c_1 \exp\left( -i R_{\theta}\sqrt{\alpha}\theta \right)\) and \(\Phi(\phi) = c_2 \exp\left( -i R_{\phi}\sqrt{\beta}\phi \right)\).
Thus the wavefunction has the form</p>
<script type="math/tex; mode=display">\Psi(\theta, \phi) = c \exp\left( -i(R_{\theta}\sqrt{\alpha}\theta + R_{\phi}\sqrt{\beta}\phi) \right).</script>
<p>There is still some work to be done: we need to find the values \(\alpha\) and \(\beta\) so that we can find an explicit form for \(E\). We also need to figure out the value of the normalization constant \(c\)–let’s do that first. Using our normalization condition, we see that</p>
<script type="math/tex; mode=display">c^2\int_{0}^{2\pi}\int_{0}^{2\pi}d\theta d\phi R_{\theta} R_{\phi} e^{ -i(R_{\theta}\sqrt{\alpha}\theta + R_{\phi}\sqrt{\beta}\phi)} e^{ i(R_{\theta}\sqrt{\alpha}\theta + R_{\phi}\sqrt{\beta}\phi)}= c^2 R_{\theta}R_{\phi}(2\pi)^2</script>
<p>so that \(c = \frac{1}{2\pi\sqrt{R_{\theta}R_{\phi}}}\) and the wavefunction is \( \Psi(\theta, \phi) = \frac{1}{2\pi\sqrt{R_{\theta}R_{\phi}}}\exp\left( -i(R_{\theta}\sqrt{\alpha}\theta + R_{\phi}\sqrt{\beta}\phi) \right) \).</p>
<p>Now we will deal with the boundary conditions as we figure out the energy levels. (You will note that this is a way to see that energy <strong>must</strong> be quantized in a quantum system.)
For \(\theta\), we have that \(e^{-iR_{\theta}\sqrt{\alpha}\theta} = e^{-iR_{\theta}\sqrt{\alpha}(\theta + 2\pi)} \), so that \(e^{-iR_{\theta}\sqrt{\alpha}2\pi} = 1 = e^{-i 2\pi n} \). Solving, we find that \(\alpha = \left( \frac{n}{R_{\theta}} \right)^2 \).
Performing an identical procedure for \(\phi\) gives \(\beta = \left( \frac{m}{R_{\phi}} \right)^2 \).
Substituting into the wavefunction, we have (finally!) that</p>
<script type="math/tex; mode=display">\Psi_{nm}(\theta, \phi) = \frac{1}{2\pi\sqrt{R_{\theta}R_{\phi}}}\ \exp\left( -i(n\theta + m\phi) \right)</script>
<p>with energy levels given by</p>
<script type="math/tex; mode=display">E_{nm} = \alpha_n + \beta_m = \left( \frac{n}{R_{\theta}} \right)^2 + \left( \frac{m}{R_{\phi}} \right)^2</script>
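For instance, enumerating the lowest levels for the symmetric torus \(R_{\theta} = R_{\phi} = 1\) (an illustrative special case) shows the degeneracies coming from sign flips of \(n\) and \(m\) and, for equal radii, from swapping them:

```python
from collections import Counter

# Count degeneracies of E_nm = (n / R_theta)^2 + (m / R_phi)^2 for the
# symmetric torus R_theta = R_phi = 1, over a small range of n and m.
R_theta = R_phi = 1.0
levels = Counter()
for n in range(-3, 4):
    for m in range(-3, 4):
        E = (n / R_theta) ** 2 + (m / R_phi) ** 2
        levels[round(E, 9)] += 1

# the ground state E = 0 is unique, while E = 1 is fourfold degenerate:
# (n, m) in {(1, 0), (-1, 0), (0, 1), (0, -1)}
```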
<p>Now that we’ve done all the hard work, we can have fun. Let’s introduce time and solve the time-dependent Schrödinger equation in the position basis:</p>
<script type="math/tex; mode=display">i\frac{\partial}{\partial t}\Psi(t) = E\Psi(t).</script>
<p>This is a very easy equation; we just integrate to find \(\Psi(t) = \exp(-i E t)\). Great! Now we can put everything together to find that</p>
<script type="math/tex; mode=display">\Psi_{E_{nm}}(\theta, \phi, t) = \frac{1}{2\pi\sqrt{R_{\theta}R_{\phi}}} \exp(-i E_{nm} t) \exp\left( -i(n\theta + m\phi) \right),</script>
<p>with energy levels given above.</p>
<p>The great part about this problem is that, for noninteracting distinguishable particles, we could just repeat this process <em>ad infinitum</em> if we wanted to. The stationary wavefunction just becomes a product of the single-particle wavefunctions: \(\Psi(\theta_1, \phi_1,...,\theta_N, \phi_N) = \prod_{j=1}^N\Psi^{(j)}(\theta_j, \phi_j) \); in this case, we’d have</p>
<script type="math/tex; mode=display">\Psi_{n_1m_1,...,n_Nm_N}(\theta_1,\phi_1,...,\theta_N,\phi_N) = \left( \frac{1}{2\pi\sqrt{R_{\theta}R_{\phi}}} \right)^{N} \prod_{j=1}^N \exp\left( -i(n_j\theta_j + m_j\phi_j) \right)</script>
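This product structure is easy to see numerically. Here is a minimal Python sketch; the radii and quantum numbers are arbitrary illustrative choices:

```python
import numpy as np

# Single-particle stationary wavefunction on the torus; radii are illustrative.
def psi(theta, phi, n, m, r_theta=1.0, r_phi=1.0):
    norm = 1.0 / (2.0 * np.pi * np.sqrt(r_theta * r_phi))
    return norm * np.exp(-1j * (n * theta + m * phi))

# N noninteracting distinguishable particles: just the product of the
# single-particle wavefunctions.
def psi_many(thetas, phis, ns, ms):
    out = 1.0 + 0.0j
    for th, ph, n, m in zip(thetas, phis, ns, ms):
        out *= psi(th, ph, n, m)
    return out
```

Since each factor is a pure phase times the normalization, the modulus of the many-body wavefunction is constant, as it should be for free particles.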
An efficient mechanism for carbon trading (2017-07-19T12:00:00+00:00) http://daviddewhurst.github.io/carbon-emissions
<p>Here’s a simple mechanism for formulating a carbon-trading market. This won’t be the fanciest thing ever, but it is efficient (defined in a very precise way) and guarantees a cap on total emissions into the infinite future, provided cheating the mechanism isn’t possible.</p>
<h3 id="primary-market-pricing">Primary market pricing</h3>
<p>Our goal is to limit total carbon emissions to, say, \(A\) tons of carbon.
In each time period \(t \in \mathbb{N}\) people want to emit carbon, so the amount \(A_t\) that can be emitted in each time period is naturally constrained by \(\sum_{t = 0}^{\infty}A_t = A\).
Now, if we impose the condition that in each time period there will be precisely \(n\) contracts traded, and that each contract expires at the end of its issued time period, then we can rewrite this condition as</p>
<script type="math/tex; mode=display">A = \sum_{t = 0}^{\infty} A_t = \sum_{t = 0}^{\infty} na_t = n\sum_{t = 0}^{\infty} a_t = na,</script>
<p>so that the optimization problem can actually be formulated on a per-contract basis with the constraint that \(\sum_{t = 0}^{\infty}a_t = a\).
We suppose that each contract amount of carbon \(a_t\) causes some cost \(c_t\) to the planet and society of the form \(c_t = f(a_t)\).
Our objective will be to minimize the economic cost associated with the primary sale of carbon emissions contracts on a per-unit-cost basis; that is, if \(p_t \) is the primary sale price of a contract to emit \(a_t\) units of carbon, we require that \(p_t\) has units so that \(p_t c_t\) has units of dollar (or whatever other currency you prefer).
It is thus clear that our optimization problem is</p>
<script type="math/tex; mode=display">\min \sum_{t=0}^{\infty}b_t p_t c_t \qquad \text{s.t. } \sum_{t = 0}^{\infty} a_t = a,</script>
<p>where \(b_t\) is a discounting function; we convert the dollar cost of purchasing future contracts into their net present value.
Substituting our functional relationship between carbon emitted and social cost, we seek to find \(p_t\) such that</p>
<script type="math/tex; mode=display">\frac{\partial}{\partial a_t}\left[ \sum_{t=0}^{\infty}b_t p_t f(a_t) + \lambda\left( a - \sum_{t = 0}^{\infty}a_t \right) \right] = 0.</script>
<p>We find that the optimal primary market price is given by</p>
<script type="math/tex; mode=display">p_t = \lambda\left(b_t \frac{df}{da_t} \right)^{-1}.</script>
<p>The parameter \(\lambda\) cannot be found in terms of \(a_t\) explicitly and we will treat it as a policy parameter below.
For a concrete example of this optimum, assume a rational discount function (assumptions!) given by \(b_t = e^{-t}\) and a quadratic social cost \(c_t = a_t^2\).
The optimal primary market price is then given by \( p_t = \frac{\lambda}{2} e^{t} a_t^{-1} \).
A heuristic understanding of why \(\lim_{a_t \rightarrow 0^+}p_t(a_t) = +\infty\) is possible with simple price theory: if demand is essentially perfectly inelastic (at least, for a long amount of time), a shrinking per-contract allowance \(a_t\) will naturally lead to large upward price movement.</p>
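To make the example concrete, here is a small Python sketch of this price schedule. The value of \(\lambda\) and the per-contract amounts passed in are illustrative policy choices, not quantities derived above:

```python
import math

# Price from p_t = lambda / (b_t * df/da_t), with the example choices
# b_t = e^{-t} and f(a) = a^2, so df/da = 2a. lam is a free policy parameter.
def primary_price(t, a_t, lam=1.0):
    b_t = math.exp(-t)
    dfda = 2.0 * a_t
    return lam / (b_t * dfda)
```

As expected from the price-theory argument, the price blows up as the per-contract allowance shrinks: `primary_price(0, 1e-6)` is a million times `primary_price(0, 1.0)`.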
<h3 id="primary-market-sale">Primary market sale</h3>
<p>I’m not an expert in auction theory, so I’ll propose two simple (and possibly simplistic) ways of auctioning emissions permits in the primary market using this scheme.</p>
<ol>
<li>Fix the parameter \(\lambda = \lambda_t\) every time period and randomly select \(n\) out of \(m_t\) market participants to pay the amount \(p_t = \lambda_t \left(b_t \frac{df}{da_t} \right)^{-1}\). In this scenario, each market participant signs a contract acknowledging that they are willing to purchase a single contract at \(p_t\), but understand that they may not be selected to purchase the contract. While analytically pleasant, this mechanism is unlikely to work in practice as it relies on the assumption that each market participant is constrained to purchase only one contract per time period–an unlikely condition for, say, a large manufacturer.</li>
<li>Let \(\lambda\) float and have an open market auction. This is the more practical approach and allows for price discovery beyond the fixed part of the price. This will, however, introduce monopoly concerns.</li>
</ol>
<p>In a realistic case, one might implement a modified version of the second of these two strategies as follows.
There would be income tranches \(y_1,…,y_n\) and price multiplier caps on all tranches below \(n\) above which permits could not trade. The highest tranche would be for the “major-league” players and would have no price cap; it would be a true open auction.</p>
<p>These mechanisms don’t address trading issues inherent in the secondary market; I’ll have to come back to that.</p>
Differentiability and least squares (2017-06-30T12:30:00+00:00) http://daviddewhurst.github.io/differentiability-least-squares
<p>Here is an interesting problem I encountered in <a href="http://www.cems.uvm.edu/~jmwilson/">Mike Wilson’s</a> analysis course: let \(u \in \mathbb{R}^d \) be a unit vector and define the function \(f: \mathbb{R}^d \rightarrow \mathbb{R} \) by</p>
<script type="math/tex; mode=display">f(x) = \inf_{t \in \mathbb{R}} || x - tu ||^2.</script>
<p>Show that \(f\) is differentiable on all of \(\mathbb{R}^d\), and find an expression for \(f’(p)\) in terms of \(p\) and \(u\).</p>
<p>Check it out <a href="https://daviddewhurst.github.io/documents/least_squares_formal.pdf">here</a>!</p>
<p>N.B.: Professor Wilson’s class is quite hard (this is one of the easier problems he assigned in the second half of the semester) but well worth the effort. I highly recommend it to any interested student.</p>
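If you want to sanity-check your answer numerically before reading the writeup (mild spoiler): minimizing over \(t\) gives \(t^* = u \cdot x\), so \(f(x) = \|x\|^2 - (u \cdot x)^2\) with \(f'(p) = 2(p - (u \cdot p)u)\). A quick Python sketch of that closed form, checked against a finite difference (the derivation itself is in the linked PDF):

```python
import numpy as np

# f(x) = inf_t ||x - t u||^2; for a unit vector u the infimum is attained
# at t = u.x, giving f(x) = ||x||^2 - (u.x)^2.
def f(x, u):
    r = x - np.dot(u, x) * u
    return float(np.dot(r, r))

# Gradient of f: f'(p) = 2(p - (u.p)u), i.e. twice the residual of
# projecting p onto the line spanned by u.
def grad_f(p, u):
    return 2.0 * (p - np.dot(u, p) * u)
```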
Health insurance is hard (2017-06-29T23:00:00+00:00) http://daviddewhurst.github.io/health-insurance
<p>Health insurance has been front-and-center in the news recently, following Congressional Budget Office (CBO) reports that the current House and Senate healthcare bills may cause millions of Americans to lose their healthcare coverage while also removing several taxes imposed by the Affordable Care Act, colloquially known as Obamacare. Insurers haven’t been silent; some have stated that the proposed restructuring of Affordable Care Act exchanges would do wonders for their business, while others have vocally advocated for <a href="https://www.wsj.com/articles/health-insurers-uneasy-with-senates-approach-to-continuous-coverage-1498580450">retaining the individual mandate.</a>
Here, I’ll outline a simple model of a health insurance company to demonstrate why the problem of covering as many people as possible while also ensuring low premiums for those who can’t afford insurance is so difficult. I’ll also suggest guidelines for future policy.</p>
<h3 id="an-illustrative-model">An illustrative model</h3>
<p>The model I’ll detail is really quite simple; no Humana or Aetna here. The insurance company I’ll attempt to describe is the sort of company you might set up: premiums aren’t invested in a financial market, and there aren’t any employees.
Let’s begin with the health insurance consumer.
She has a probability distribution of incurring an illness or injury (or not-illness-or-injury; having nothing wrong with her at all) at time \(t\) given by \(p_i(x, t)\), for events \( x \in \mathbb{X}\).
Associated with each event \(x\) at time \(t\) is a cost \(C_i(x, t)\).
Thus, her expected discounted cost due to injury and illness over her life is given by</p>
<script type="math/tex; mode=display">\mathbb{E}[C_i] = \sum_{t=0}^{\infty}e^{-rt}\sum_{x \in \mathbb{X}}p_i(x,t)C_i(x,t).</script>
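Schematically, this sum looks like the following Python sketch, truncated at a finite horizon (the sum above is infinite); the event probabilities and costs are illustrative callables, not from any real actuarial table:

```python
import math

# Discounted expected lifetime cost E[C_i], truncated at a finite horizon.
# p(x, t) is the probability of event x at time t; cost(x, t) its cost.
def expected_lifetime_cost(p, cost, events, r, horizon):
    total = 0.0
    for t in range(horizon):
        discount = math.exp(-r * t)
        total += discount * sum(p(x, t) * cost(x, t) for x in events)
    return total
```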
<p>Like anyone, she doesn’t want to have to pay this lifetime cost. More than that, though, she wants <em>certainty</em>; she’d rather pay a fair amount of money each month than pay what some event out on the tail of the distribution \(p(x,t)\) might cost her. (I’m making the (quite reasonable) assumption that rare events are positively correlated with higher cost.)
A health insurance company is willing to purchase that risk from her at a constant price per month (or whatever other time period you like), call it \(\pi_i\), but in return for taking on this risk they’ll increase the total cost to the consumer by \(\rho_i\), the risk premium.
To find out what they should charge the consumer per month, they’ll solve for \(\pi_i\) in the following equation:</p>
<script type="math/tex; mode=display">\mathbb{E}[C_i] + \rho_i = \sum_{t=0}^{\infty}\pi_i e^{-rt}</script>
<p>Summing the geometric series and doing some algebra, we see that she’ll be charged \(\pi_i = (\mathbb{E}[C_i] + \rho_i)( 1 - e^{-r} ) \).
So far, so good. If the insurance company does the same thing for each customer \(i\), they’ll have revenue given by \(R = \sum_i \sum_{t=0}^{\infty} e^{-rt}\pi_i\) and expected cost given by \(C = \sum_i \mathbb{E}[C_i]\) for expected profit of \(\mathcal{P} =\sum_i \rho_i\).
(The reader will note that I’m talking about long-run revenue and cost here; we’ve done all of the discounting “up front”, so to speak, so that we can algebraically manipulate quantities when estimating expected profit.)
Already we see the profit incentive for insurance companies to seek out consumers \(i\) with cost functions of low magnitude or event probability distributions with thin tails.</p>
<p>Now suppose a policy is implemented such that all consumers \(i’\) must have monthly premiums set to \(\theta_{i’} = \pi_{i’} - \delta_{i’}\), where \(\delta_{i’} > 0\).
For now, we’ll assume that this policy appears for no reason at all, and thus that there’s no correlation between consumer \(i’\) and the consumer’s cost function or event probability distribution. The company can respond in one or both of two ways:</p>
<ol>
<li>Eat the cost; that is, let expected profits become</li>
</ol>
<script type="math/tex; mode=display">% <![CDATA[
\mathcal{P'} = \sum_{i \neq i'} \rho_i + \sum_{i'} (\rho_{i'} - \delta_{i'}) < \mathcal{P}, %]]></script>
<p>so that the company’s risk of ruin is higher and expected profitability is lower; or</p>
<ol>
<li>Charge other customers more; that is, given that <em>initially</em> the company will have the same number of consumers before and after the policy implementation, the revenue function becomes</li>
</ol>
<script type="math/tex; mode=display">R' = \sum_{t = 0}^{\infty}e^{-rt}\left( \sum_{i \neq i'}( \pi_i + \delta_i) + \sum_{i'} (\pi_{i'} - \delta_{i'}) \right)</script>
<p>where \( \sum_i \delta_i = \sum_{i’} \delta_{i’} \); that is, some consumers are in effect subsidizing others. The first option is pretty clearly flawed; the company exists to make a profit, and even more important than quarterly earnings is the company’s survival. Anything that increases its risk of ruin is going to be a no-go.</p>
<p>The second option is flawed, too, since it will discourage those consumers \(i \neq i’\) from entering the insurance market because of rising premiums, and likewise encourage consumers \(i’\) to enter the market when they might not have earlier.
This is problematic because the assumption that the assignment of a discount \(\delta_{i’}\) is uncorrelated with consumer \(i’\)’s cost function and event probability distribution is not at all realistic. In fact, the usual rationale for health insurance subsidies is to provide previously-unobtainable coverage for the poor.
Yet there is <a href="http://www.cmaj.ca/content/174/7/923.short">overwhelming evidence</a> to support the claim that poverty and sickness are positively correlated.
Thus, increasing enrollment of \(i’\) and decreasing enrollment of \(i \neq i’\) has the effect of increasing the company’s risk profile and thus its risk of ruin, along with decreasing its profitability.
The result? Insurance companies will withdraw from markets that enforce these premiums—which is <a href="https://www.usatoday.com/story/news/nation-now/2017/05/03/iowa-health-insurers-obamacare/309955001/">exactly what we see happening</a> right now.</p>
<h3 id="individual-mandate">Individual mandate?</h3>
<p>There isn’t an easy fix to this problem; if there were, I wouldn’t be writing about it. The Affordable Care Act, flawed as it is, does contain one important mechanism that the current Congress would do well to retain in their bill, at least in spirit: the individual mandate. This enforces penalties for not purchasing insurance to encourage the spry, healthy consumers (\(i \neq i’\)) to purchase insurance, thus reducing the risk of ruin to the insurance companies and thereby encouraging them to stay in the market. In fact, my criticism of the individual mandate is simply that its penalties for not purchasing insurance aren’t high enough; many consumers decide they incur less disutility from paying the penalty than from purchasing insurance they feel they don’t need. The Swiss <a href="https://en.wikipedia.org/wiki/Healthcare_in_Switzerland">figured this out</a> a long time ago; they mandate that all residents purchase a basic private healthcare plan, while ensuring that insurance companies don’t raise prices of these basic plans to unaffordable levels.</p>
Simon’s model displays a first-mover advantage (2017-05-11T18:00:00+00:00) http://daviddewhurst.github.io/simons-model-first-mover
<p><a href="https://www.uvm.edu/pdodds/">Peter Dodds</a>, myself, and some other friendly folks affiliated with the <a href="https://www.uvm.edu/storylab/">Computational Story Lab</a> just published a paper in <a href="https://journals.aps.org/pre/abstract/10.1103/PhysRevE.95.052301">Physical Review E</a> detailing an inherent first-mover advantage in Herbert Simon’s <a href="https://en.wikipedia.org/wiki/Simon_model">preferential attachment model</a>. Check it out!</p>
<p>Re-analysis of the generative algorithm led us to the discovery that the first group actually has a size advantage proportional to \(1/\rho\), where \(\rho\) is the innovation probability.
As \(\rho\) is typically quite small–on the order of \(10^{-2}\) to \(10^{-5}\)–this results in the first group being from 100 to 100000 times (respectively) as large as the mean-field analysis of the algorithm dictates.
There’s even evidence for this mechanism in real-world citation counts (props to Dodds for wrangling this data…expert emacs skills were on display).</p>