Theory behind PIC
tl;dr
Our goal in this section be to derive an equation which could describe an evolution of a collection of charged particles in self-induced eletric and magnetic fields. Ultimately, since we will want to solve this partial differential equation on the computer, we will construct an algorithm relying on the method of characteristics which approximates the exact solution, when the number of integrated characteristics (macroparticles) tends to infinity.
At the beginning, we make no simplifying assumptions, but, as we will see further, given the long-range nature of the interaction, we will have to sacrifice generality to make the equations tractable.
Clasically, there are two equivalent approaches to this problem. Below, we consider both of them, which hopefully will help highlight different aspects of the resulting equation as well as the assumptions made.
Klimontovich equation & ensemble average¶
We start with a distribution function for a collection of charged particles of species \(s\) with charge \(q_s\) and mass \(m_s\) in the phase space \((\bm{x},\bm{u})\), with \(\bm{u}\) being the relativistic four-velocity:
where \(\bm{x}_i^s(t)\) and \(\bm{u}_i^s(t)\) are the positions and four-velocities of all the particles of species \(s\). We can then take the derivative of this equation in time to get an equation for the evolution of \(N_s\):
We can now use the definition \((1)\) to finally write:
(the non-relativistic version of) which was first obtained by Yu. L. Klimontovich in 1958. Here \(\gamma \equiv \sqrt{1+|\bm{u}|^2}\). For a typical plasma with no additional interactions from outside, \(\bm{F}_s\) is the Lorentz force from self-induced electromagnetic fields:
where the hyperbolic Maxwell's equations are implied:
with \(V=\int d^3\bm{x}'d^3\bm{u}'\), and the other two equations serving as "boundary conditions"
The superscript "\(N\)" here implies that the fields are generated from the full distribution function \(N_s\).
Notice, that in deriving eq. \((2)\), we made no assumptions about the strengths of interactions of plasma particles with each other, or the scales of gradients of the fields. In other words, Klimontovich's equation (coupled with Maxwell's equations above) is the most general way of describing the plasmas without the loss of generality.
The function \(N_s\) still has \(6N\) unknowns (where \(N\) is the number of plasma particles), and is, in general, not very useful for studying the behavior of plasmas. To advance further, we consider an ensemble of microstates defined by \(N\) positions and velocities of particles, \(\bm{x}_i^s\) and \(\bm{u}_i^s\). We further define a special function, \(f_s(\bm{x},~\bm{u})\equiv \langle N_s(\bm{x},~\bm{u}) \rangle\), where from now on we will assume \(\langle\cdot\rangle\) to be an ensemble average. Notice, that by doing so, the function \(f_s\) loses any information about each individual particle, becoming essentially a continuous function of two phase-space variables.
In each microscopic realization, the function \(N_s\) can deviate from the ensemble average, which we conveniently denote as \(\delta N_s \equiv N_s - f_s\). Similarly, we may also define \(\bm{E}\equiv\langle\bm{E}^N\rangle\), \(\bm{B} \equiv \langle\bm{B}^N\rangle\), and \(\delta \bm{E}^N\equiv\bm{E}^N-\bm{E}\), \(\delta \bm{B}^N \equiv \bm{B}^N - \bm{B}\).
Note
Note that by definition, \(\langle\delta N_s\rangle = 0\), \(\langle\delta \bm{E}^N\rangle = 0\), \(\langle\delta \bm{B}^N\rangle = 0\).
Plugging \(N_s=f_s+\delta N_s\), \(\bm{F}^N_s=\bm{F}_s+\delta\bm{F}_s^N\) into equation \((2)\), and ensemble averaging the result, we find:
The left-hand-side of this equation is the advective derivative of \(f_s\) in phase-space. The non-conservation of the phase-space volume is thus governed solely by the right-hand-side. Ultimately, the right-hand-side contains terms proportional to \(\left\langle\delta N_s(\bm{x}',~\bm{u}')\delta N_s(\bm{x}'',~\bm{u}'') \right\rangle\), which is also referred to as the correlation function. This term describes non-linear correlations of the fluctuations of \(N_s\) in two different locations of the phase-space. When the fluctuations from the ensemble average are completely decorelated (independent), this term becomes zero, and we obtain the famous Vlasov equation:
In which case can one assume that fluctuations are decorelated? If the evolution of each particle is solely governed by the ensemble averaged quantities (i.e., smooth electromagnetic fields), and (on average) does not depend on the particular realizations of the thermodynamic microstate, then all the non-linear terms must vanish, when ensemble averaged. This may not hold if, for instance, particles actively experience dynamically important close encounters with each other (Coulomb collisions), in which case their evolutions are no longer solely governed by the smooth average forces. For this reason, the right-hand-side of the eq. \((3)\) is often referred to as the collisional (Landau) integral.
Liouville equation & BBGKY hierarchy¶
Instead of thinking of a collection of particles with predefined coordinates in phase-space in the microstate, one can also think of the whole system as a random realization one such microstate. Then we can write down a probability distribution function of the system of \(6N\) variables in the following form:
where \(\bm{X}_k(t)\) and \(\bm{U}_k(t)\) are the actual coordinates of the particles in phase-space implicitly depending on time (we drop the species index for brevity). To derive the time evolution of such a system, we differentiate \(f_N\) w.r.t. time:
Using the properties of a delta function, and assuming that \(\dot{\bm{X}}_k = \bm{U}_k/\Gamma_k\) (with \(\Gamma_k\equiv \sqrt{1+|\bm{U}_k|^2}\)), and \(\dot{\bm{U}}_k = (1/m_k)\sum\limits_l \bm{F}_{l\to k}\), with the latter expression defining the force from particle with index \(l\) to particle with index \(k\). Plugging the expression \((5)\) back into this equation, we get:
As with the Klimontovich equation before, here we have not yet made any simplifying assumptions, and eq. \((6)\) describes the evolution of the combined probability density function, \(f_N\), of \(N\) particles in the \(6N\)-dimensional phase-space in its most general form.
In practice, most of the time we are not interested in the exact behavior of each individual particle, as for all practicle purposes they are indistinguishable. It is thus helpful to introduce the reduced probability distribution of \(M<N\) particles as follows:
In other words, in transiting from \(N\)-particle distribution to an \(M\)-particle one we "integrate away" the explicit dependency (or correlation) of the evolution of the first \(M\) particles on the last \(N-M\) particles. This becomes obvious, when we interate the equation \((6)\) w.r.t. \(\bm{x}_N\) and \(\bm{u}_N\) to get the evolution equation for the \(N-1\)-particle distribution:
The dependency of the evolution of the first \(N-1\) particles on that of the last particle is now encoded in the last term of the equation; thus in general, the \(M<N\) dimensional probability distribution function still depends on the full \(N\)-dimensional one:
where it is obvious, that the evolution of \(f_{M}\) also depends on the evolution of \(f_{M+1}\). This is often called in the literature the BBGKY hierarchy, after Bogoliubov (1946), Born, & Green (1946), Kirkwood (1946, 1947), and Yvon (1935).
For \(M=1\) we find:
where to derive the equation for a single-particle distribution we need to know the two-particle one, \(f_2\). It is useful to decompose the latter into two single-particle distributions and a term describing their correlation:
where, just like before, the last term, \(\delta f_2\) describes two-particle interactions. Likewise, we could have written the evolution equation for \(f_2\), and decomposed it further into many-body interactions. For our purposes, we will limit ourselves to the evolution of one-particle distribution ignoring the two-particle correlations. Adopting \(f\equiv f_1\), and plugging the \(f_2\) decomposition ino the equation above (ignoring \(\delta f_2\)), we reproduce the previously found Vlasov equation:
where \(\bm{F} \equiv (N/\mathcal{V})\int d\bm{x}'d\bm{u}' \bm{F}_{2\to 1} f(\bm{x}',\bm{v}',t)\)
Deriving equations for PIC¶
We now derived the evolution equation for the continuous distribution describing a system of particles that do not undergo binary interactions, and only interact via collective large-scale self-induced fields:
which, coupled to the two Maxwell's equations,
forms a closed system of equations, otherwise known as the Vlasov-Maxwell system. The two boundary conditions
are satisfied automatically, if one follows the evolution using the system of equations in \((1)\) and \((2)\). In the most general case, this system is 6-dimensional and non-linear, so solving it numerically is not only challenging, but also costly.
Particle-in-cell (PIC) algorithm is a widely used technique which significantly simplifies the solution of this system. To introduce the technique, we first need to decompose the 6D distribution function \(f_s\) into special basis functions in phase-space by introducing a finite number of the so-called macroparticles:
where \(w_i^s\) are called macroparticle weights, and \(S\) is often referred to as a shape function. Since \(f_s\) in general is an arbitrary continuous function, and we only introduce a finite set of particles with phase-space coordinates \((\bm{x}_i^s,~\bm{u}_i^s)\), the equality above is only approximate, and it approaches the exact solution as the number of macroparticles increases. From the shape function we will require the following properties:
where \(V\) is a finite volume in the real space which includes the origin. Here and further we will use the terms particle and macroparticle interchangeable, but one should really think of these entities (ha-ha), as discrete sampling of the original continuous distribution function.
Plugging \((3)\) into \((1)\), we get the following equation (for brevity, we use \(\delta_{\bm{u}} = \delta(\bm{u}-\bm{u}_i^s)\), and \(S_{\bm{x}}=S(\bm{x}-\bm{x}_i^s)\)):
where again, we used the fact that \(\partial \bm{F}_s/\partial \bm{u} = 0\). We are now ready to derive the evolution equations for macroparticles which will exactly reproduce (given large-enough number of them) the original solution of the Vlasov-Maxwell system. We integrate equation \((4)\) separately over \(d^3\bm{x}\) and \(d^3\bm{u}\):
where we used the Gauss-Ostrogradsky theorem to transform the integral in phase-space volume to an integral over the boundaries. The equality is satisfied for an arbitrarily chosen volume if and only if the following two equations hold:
Naively, one could recognize the equations of motion for particles with coordinates \(\bm{x}_i^s\), and four-velocities \(\bm{u}_i^s\). But the striking and crucial difference from regular equations of motion is in the fact that the force in this case is effectively "interpolated" for each macroparticle using the shape function \(S\). The absence of this in the first equation is due to our choice of the \(\delta\)-function for the "shape" in velocity space.
One should really think of the system \((5)\) as \(6N\)-dimensional system of characteristic equations along which the convective derivative of \(f_s\) (given by eq. \(1\)) is exactly zero.
To see how these macroparticles should induce current densities in our algorithm to exactly satisfy the charge conservation, i.e., \(\nabla\cdot\bm{E} = 4\pi \rho\), we differentiate the latter equation in time, substituting \(f_s\) from \((3)\), to get the following:
meaning that the divergence of the deposited current density has to satisfy
Discretization¶
Having the system of equations on macroparticles defined in \((5)\), and with the equations for the electromagnetic fields \((2)\) and the current density constrained by \((6)\), we can proceed to formulate the particle-in-cell numerical scheme.
First, we discretize the electromagnetic fields and the current densities following the Yee staggering convention. According to that, the \((i,j,k)\) element of each of the field components is defined in slightly different spatial locations:
Here we use Cartesian coordinates, but this convention naturally translates to any other coordinate system.
Why Yee?
The reason behind adopting this counter-intuitive convention has to do with the finite-differencing in Maxwell's equations. In particular, if we look at the \(x\) component of the Ampere's law, the term \(\partial E_x/\partial t\) is coaligned with the position of \(E_x\) (in our case, \(i+1/2,j,k\)). Terms in the curl, on the other hand, have a form: \(\partial B_z/\partial y - \partial B_y/\partial z\). If we attempt to discretize the first term, assuming that \(B_z\) is defined at some location \((*,j',*)\), we will get:
We must thus adopt \(j'-1/2 = j\), or \(j' = j+1/2\). Likewise, all the other components can also be inferred. In other words, the choice is dictated by the requirement that the left-hand-side of Faraday's and Ampere's laws be co-aligned with the right-hand-side written in the finite difference form.
Time in PIC is also discretized (we typically use the index \(n\) to indicate the timestep). In our particular case, for the forward-integration we employ the so-called leapfrog algorithm, where the electric and magnetic fields are defined to be staggered with respect to each other by half a timestep (defined to be \(\Delta t\)). In other words, we have \(\bm{E}^{(n)}\), and \(\bm{B}^{(n-1/2)}\).
Coordinates and velocities of macroparticles are tracked separately and have continuous values at each discrete timestep. We use the same leapfrog algorithm for integrating particle equations of motion defined in \((5)\), and thus the velocities and coordinates are staggered with respect to each other in time: \(\bm{x}_i^{(n)}\), and \(\bm{u}_i^{(n-1/2)}\).
There are two ingredients which were left out in this picture: interpolation of the electromagnetic force from the grid to the position of the particle (required by the second equation in eq. \((5)\)), and the deposition of currents on the discretized grid, which have to satisfy the requirement in \((6)\) to conserve charge. For both of these issues, we first need to make a choice of the shape function, \(S(\bm{x})\). By far the most common choice is the first-order (linear) shape, defined as:
Using this definition, the integral in \((5)\) is simply a linear interpolation of each field component from each corresponding location to the position of the particle.
Current deposition is slightly trickier, and we will not go through the entire derivation of the algorithm (for the reference, see Esirkepov 2001 or Umeda+ 2003). However, it can be shown mathematically, that asserting specific properties for the shape function (e.g., symmetry in all directions, etc.), there exists a unique set of coefficients to translate the shape function of each particle at timesteps \((n)\) and \((n+1)\) to the deposited current components.
Finally, the full timestepping algorithm with everything discussed above is shown on the page of this wiki about the PIC algorithm.