Mean square displacement

In a molecular dynamics simulation a convenient quantity that can be used to monitor the state of the system is the mean square displacement, defined as:

$\displaystyle m(t) = \frac{1}{N} \sum_{i=1}^N \vert{\bf r}_i(t+t_0) - {\bf r}_i(t_0)\vert^2,$ (7.106)

where $t_0$ is some initial reference time and $N$ is the number of particles. In a system with no diffusive behaviour, such as a solid, $m(t)$ is expected to rise at first and then reach a constant value, which is related to the maximum displacement of the particles from their equilibrium positions. By contrast, in a fluid $m(t)$ is expected to keep rising with time. If the motion of the particles is random, which is a good approximation for a system in thermal equilibrium, then $m(t)$ increases linearly with time at long times, and its slope is proportional to the diffusion coefficient $D$ (in three dimensions, $m(t) \simeq 6Dt$).

To understand where the linear behaviour of $m(t)$ comes from, consider a random walk. The walk could be in any number of dimensions, but for simplicity let us consider a one-dimensional system. Let us associate with the $i^{th}$ step of the walk a variable $z_i$, which is either $+1$ or $-1$ with equal probability, depending on whether the step is taken to the right or to the left. Let us also define the variable $s_N = \sum_{i=1}^N z_i$, which is the net displacement from the origin after $N$ steps (here $N$ counts steps, not particles). The average value of $s_N$ is clearly zero, as:

$\displaystyle \langle s_N \rangle = \left \langle \sum_{i=1}^N z_i \right \rangle = \sum_{i=1}^N \langle z_i \rangle = 0,$ (7.107)

but the average value of $s_N^2$ is not:

$\displaystyle \langle s_N^2 \rangle = \left \langle \sum_{i=1}^N z_i \sum_{j=1}^N z_j \right \rangle = \sum_{i=1}^N \langle z_i^2 \rangle + \sum_{i,j=1;i\ne j}^N \langle z_i z_j \rangle = N,$ (7.108)

because $\langle z_i^2 \rangle = 1$ and $\langle z_i z_j \rangle = \langle z_i \rangle \langle z_j \rangle = 0$ since the $i^{th}$ and the $j^{th}$ steps are uncorrelated. This shows that in a random walk of step size 1 the mean square displacement from the origin of the walk is equal to the number of steps, and so it is linearly proportional to time, if the number of steps per unit time is constant. In systems with continuous displacements this translates into a linear dependence on time of the mean square displacement $m(t)$ defined above.
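The result $\langle s_N \rangle = 0$, $\langle s_N^2 \rangle = N$ can be checked numerically. The following is a minimal sketch, assuming unbiased and uncorrelated unit steps; the function name `mean_square_walk` and its parameters `n_walks` and `seed` are illustrative choices, not part of the original text. It samples many independent walks and averages $s_N$ and $s_N^2$ over them.

```python
import random


def mean_square_walk(n_steps, n_walks=5000, seed=0):
    """Estimate <s_N> and <s_N^2> for a 1D random walk of unit steps.

    Each step z_i is +1 or -1 with equal probability (an assumption of
    this sketch), and s_N is the sum of the n_steps steps of one walk.
    """
    rng = random.Random(seed)  # private generator, reproducible for a given seed
    total = 0
    total_sq = 0
    for _ in range(n_walks):
        s = sum(rng.choice((-1, 1)) for _ in range(n_steps))
        total += s
        total_sq += s * s
    return total / n_walks, total_sq / n_walks


if __name__ == "__main__":
    for n in (10, 50, 100):
        mean, mean_sq = mean_square_walk(n)
        # mean should be close to 0 and mean_sq close to n,
        # up to the statistical error of a finite sample
        print(n, mean, mean_sq)
```

With a finite number of walks the estimates only agree with 0 and $N$ to within their statistical error, which shrinks as `n_walks` grows.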

In a simulation of total length $T$ one only has access to $m(t)$ for $0\le t \le T$. To improve the statistics it is useful to compute $m(t)$ by averaging over the time origins $t_0$:

$\displaystyle m(t) = \frac{1}{T-t+1} \sum_{t_0=0}^{T-t} \frac{1}{N} \sum_{i=1}^N \vert{\bf r}_i(t+t_0) - {\bf r}_i(t_0)\vert^2,$ (7.109)

where times are counted in units of the time step, so that the sum over $t_0$ contains $T-t+1$ terms. For $t=0$ it is possible to average over the whole length of the simulation, but as $t$ increases the number of available time origins is reduced to $T-t+1$, and so the statistical error on $m(t)$ increases with $t$. For $t=T$ there is only one available time origin, $t_0=0$.
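The average over time origins can be sketched in a few lines of code. This is an illustrative implementation under stated assumptions, not a prescribed one: the trajectory is taken to be a list of equally spaced frames, `traj[k][i]` is the position of particle `i` at frame `k` as a tuple of coordinates, and the coordinates are assumed to be unwrapped (not folded back into a periodic simulation cell). The function name `msd_time_origins` is hypothetical. Each lag is normalised by the number of time origins actually summed over.

```python
def msd_time_origins(traj):
    """Mean square displacement averaged over particles and time origins.

    traj[k][i] is the position (tuple of coordinates) of particle i at
    frame k. Assumes equally spaced frames and unwrapped coordinates.
    Returns a list msd where msd[lag] is the MSD at a lag of `lag` frames.
    """
    n_frames = len(traj)
    n_particles = len(traj[0])
    msd = []
    for lag in range(n_frames):
        n_origins = n_frames - lag  # origins t0 for which t0 + lag is still in range
        acc = 0.0
        for t0 in range(n_origins):
            for r_start, r_end in zip(traj[t0], traj[t0 + lag]):
                # squared distance travelled by one particle between the two frames
                acc += sum((b - a) ** 2 for a, b in zip(r_start, r_end))
        msd.append(acc / (n_origins * n_particles))
    return msd


if __name__ == "__main__":
    # Two particles moving at constant velocity: the MSD grows as lag**2
    traj = [[(2.0 * k, 0.0, 0.0), (0.0, -1.0 * k, 0.0)] for k in range(5)]
    print(msd_time_origins(traj))
```

The cost of this direct double loop grows quadratically with the number of frames, so for long trajectories one would typically restrict the maximum lag or use array-based routines. The largest lags are averaged over very few origins and are correspondingly noisy.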