As is oftentimes the case when aiming to model realistic physical systems, the equations of motion presented here do not admit closed-form analytical solutions. As such, we must rely on numerical methods and computer simulations to solve them. Specifically, we solve \(N\) independent sets of Hamilton’s equations, each consisting of six first-order, coupled differential equations.
Numerical integration presents several challenges. There is the practical question: how do we actually solve these equations? How can we be confident that our numerical solutions faithfully approximate the true dynamics, especially when the true solution is unknown? And how do we handle performance, data volume, and the trade-offs between speed and accuracy?
To do this, I chose to write my own code, tstrippy, despite the existence of several established codes (Pelupessy et al. 2013; Bovy 2015; Wang et al. 2015; Price-Whelan 2017; Vasiliev 2018). My motivation was partly practical: I wanted to avoid installation difficulties, steep learning curves, and uncertainty over whether existing tools could implement my specific setup. But above all else, I wanted to try my hand at it.
This chapter documents how we solve the equations of motion numerically, how we validate the accuracy of the solutions, and how the code is organized under the hood.
When writing any code, the choice of units is important. Astronomical units are rarely the same as SI units: they were often defined for observational and historical reasons, resulting in a system that uses multiple units for the same physical quantity, which can be confusing at first.
For instance, sky positions are typically reported in spherical coordinates. Right ascension (analogous to longitude) is often expressed in either degrees or hours, while declination (similar to latitude) is given in degrees. Distances are reported in parsecs when derived from parallax measurements. Line-of-sight velocities, obtained via spectroscopic Doppler shifts, are reported in kilometers per second. Proper motions describe angular displacements over time on the sky and are usually reported in milliarcseconds per year. Already, we encounter several different units for angles (degrees, hours, arcseconds), time (years, seconds), and distance (km, kpc), none of which align with SI’s standard units of radians, seconds, or meters, as summarized in Table 1.
| | Distance | RA | DEC | \(\mathrm{v}_\mathrm{LOS}\) | \(\mu_\alpha\) | \(\mu_\delta\) |
|---|---|---|---|---|---|---|
| Galactic | [kpc] | [deg] or [HH:MM:SS] | [deg] | [km/s] | [mas/yr] | [mas/yr] |
| S.I. | [m] | [rad] | [rad] | [m/s] | [rad/s] | [rad/s] |
This raises practical concerns—for example, what would be the unit of acceleration? km/s\(^2\)? parsec/year/second? To systematically manage units, we turn to dimensional analysis, notably the Buckingham Pi theorem (Buckingham 1914). In classical mechanics, physical quantities are typically expressed in terms of three fundamental dimensions: length, time, and mass. Any quantity can then be represented as a product of powers of these base units: \[\left[\mathrm{Quantity}\right] = l^a t^b m^c = \begin{bmatrix} a\\ b\\ c \end{bmatrix}\] For example, velocity has dimensions \([1, -1, 0]\), momentum is \([1, -1, 1]\), and acceleration is \([1, -2, 0]\).
It is not strictly necessary to adopt length-time-mass as the fundamental basis, as long as the three chosen base units are linearly independent. In stellar dynamics, it is often more natural to use distance, velocity, and mass as the base units. In this thesis, we adopt:
Distance: 1 kpc
Velocity: 1 km/s
Mass: 1 solar mass \(\mathrm{M}_\odot\)
In this system, time has derived units of: \[\left[t\right] = \frac{\mathrm{distance}}{\mathrm{velocity}} = \frac{\mathrm{kpc}}{\mathrm{km/s}}.\] While not immediately intuitive, this unit of time is convenient because: \[1~\mathrm{Gyr} \approx 1~\mathrm{s}\cdot\frac{\mathrm{kpc}}{\mathrm{km}}.\] The gravitational constant has dimensions: \[\left[G\right]=\frac{v^2 \cdot l}{m},\] which evaluates numerically to: \[G = 4.301 \times 10^{-6} \left(\mathrm{km}/\mathrm{s}\right)^2 \cdot \mathrm{kpc} \cdot \mathrm{M}_\odot^{-1}.\] Once the base units are defined, derived quantities such as acceleration follow directly. Whether considering acceleration as \(v^2~l^{-1}\) or \(l \cdot t^{-2}\), the two are equivalent and yield: \(\left(\mathrm{km}/\mathrm{s}\right)^2 \cdot \mathrm{kpc}^{-1}\).
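The following snippet, using the Astropy units module recommended later in this chapter, verifies these two numerical values; it is an illustrative sketch, not part of tstrippy.

```python
from astropy import units as u
from astropy.constants import G

# G expressed in the thesis unit system: kpc (km/s)^2 / Msun
print(G.to(u.kpc * u.km**2 / (u.s**2 * u.M_sun)))  # ~4.301e-6

# The derived time unit kpc/(km/s) is close to 1 Gyr
print((1 * u.kpc / (u.km / u.s)).to(u.Gyr))        # ~0.978 Gyr
```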
It is worth mentioning that \(N\)-body codes often select distance,
velocity, and the gravitational constant as the base units, setting
\(G = 1\). While this choice simplifies
force computations, it introduces less intuitive units for mass. For
instance, by choosing 1 kpc for distance and 1 km/s for velocity, and
setting \(G = 1\), the derived mass
unit becomes: \[\left[\mathrm{mass}\right] =
\frac{l \cdot v^2}{G} = 232509~\mathrm{M}_\odot.\] This approach
was used in our first paper (see Chapter 4). The widely used galactic dynamics Python package, Galpy, makes a different choice and introduces natural units (Bovy 2015). More specifically, Bovy (2015) uses a normalization in which \(R_0\), the cylindrical scale length of the galaxy, and \(v_\mathrm{circ}\), the circular velocity at this radius, are both set to 1. This choice is motivated by a galaxy's rotation curve and is embodied in: \[\frac{v_\mathrm{circ}^2}{R_0} = \nabla \Phi \left(R_0, z=0\right). \label{eq:vcirc}\] Note that the gravitational constant is also set to 1. Whatever the form of the potential, the scale lengths must be normalized to \(R_0\), and the mass parameter is subsequently determined through Eq. [eq:vcirc]. The total potential is a linear combination of individual components, with the user selecting the contribution of each component to the force at the characteristic radius. For example, \(\Phi = \sum_i a_i\Phi_i\), where \(a_i\) are weights such that \(\nabla \Phi_i(R_0, z=0) = a_i\) in normalized units. In this system of units, emphasis is placed on the rotation curve and how much each component contributes to it at the reference radius of the galaxy. Note that \(v_\mathrm{circ}(R_0)\) is not necessarily the maximum rotational velocity.
In short, each code presents its own preferred units and
normalization. Tstrippy, by contrast, expects the user to
pass masses in solar masses, velocities in kilometers per second, and
distances in kiloparsecs. However, physical constants are not
hard-coded, so the user may pass any numerical values to the code as
long as they are based on a self-consistent unit system. Nonetheless,
the code comes equipped with parameters for the
pouliasis2017pii potential (Pouliasis, Di Matteo, and Haywood
2017) and for the catalog of globular clusters (Baumgardt and Hilker
2018) in units of kpc, km/s, and \(\mathrm{M}_\odot\).
A valid general strategy when developing numerical codes is to
implement a module that converts user-defined units to the internal
units. This functionality also exists in Galpy and a
similar system is implemented in Agama (Vasiliev
2018). I chose not to add such a layer to Tstrippy
since Astropy provides an excellent unit-handling module
that allows users to convert between units easily (Astropy Collaboration et al.
2013), and I recommend its use in the documentation.
Long before the advent of computers, Euler (1707-1783) proposed a simple method for numerically solving differential equations. In this method, a solution is approximated by \[y_{i+1} = y_i + \Delta t \frac{dy}{dt}\left(y_i,t_i\right),\] where \(i\) is a timestep index. This means that at each point \((t_i, y_i)\), the function is extrapolated forward using a linear approximation.
The accuracy of this method can be understood using a Taylor series expansion of the exact solution \(y(t_i + \Delta t)\) about \(t_i\): \[y(t_i + \Delta t) = y(t_i) + \Delta t y'(t_i) + \frac{1}{2!}\Delta t^2 y''(t_i) + \dots\] Euler’s method captures only the first two terms. The difference between the exact solution and the Euler estimate is dominated by the second-order term. Thus, the local truncation error (error per step) is \[\mathrm{Err}_{\mathrm{step}} \approx \frac{1}{2} \Delta t^2 y''(t_i) = \mathcal{O}(\Delta t^2).\]
The global error (accumulated over many steps) is approximately the number of steps times the average error per step: \[\mathrm{Err} \approx N_{\mathrm{step}} \cdot \langle \mathrm{Err}_{\mathrm{step}} \rangle \approx \frac{T}{\Delta t} \cdot \Delta t^2 \langle y'' \rangle = \mathcal{O}(\Delta t).\] This means that halving the timestep roughly halves the global error.
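As a concrete illustration of this scaling (a sketch, not drawn from tstrippy), the following integrates \(\dot{y} = y\) with Euler's method and shows the global error at \(t = 1\) shrinking roughly linearly with the timestep:

```python
import math

def euler(f, y0, t0, t1, n_steps):
    """Integrate dy/dt = f(y, t) from t0 to t1 in n_steps Euler steps."""
    dt = (t1 - t0) / n_steps
    y, t = y0, t0
    for _ in range(n_steps):
        y += dt * f(y, t)
        t += dt
    return y

# dy/dt = y with y(0) = 1 has the exact solution y(t) = exp(t)
for n in [10, 20, 40]:
    err = abs(euler(lambda y, t: y, 1.0, 0.0, 1.0, n) - math.e)
    print(n, err)  # doubling the steps roughly halves the error
```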
It is important to note that using a Taylor series to estimate the error is not mathematically rigorous and not always generalizable. The actual error behavior depends strongly on the properties of the function being integrated. For instance:
If \(y\) is linear in \(t\), then \(y'\) is constant and Euler’s method gives the exact result.
If \(y(t) = t^a\), the local errors accumulate and grow monotonically.
If \(y\) has curvature that changes sign, local errors can partially cancel out over the course of the integration.
For a more systematic treatment of integration methods and their error properties, Chapter 16 of Numerical Recipes in C provides an excellent introduction (Press et al. 1992).
Regardless of the method used, sanity checks are essential to validate the result. These include:
Trying different integration schemes.
Performing convergence tests to ensure the solution stabilizes as \(\Delta t \to 0\).
Leveraging any known properties of the solution to verify correctness.
For example, we can exploit the properties of Hamiltonian systems to design integrators. In this thesis, we implemented the Leapfrog integrator and the Forest-Ruth scheme (Bovy 2025a; Forest and Ruth 1990). These schemes are derived from the structure of Hamiltonian mechanics and are known as symplectic integrators. Before continuing, I would like to quote Bovy (2025a):
Hamiltonian integrators are often called symplectic. This name comes from the fact that these integrators are Hamiltonian maps, whose mathematical structure is that of a vector flow on a symplectic manifold. Many fine dynamicists have made great contributions to the field without delving deeply into the meaning of the previous sentence and we do not discuss this further.
However, my curiosity about linguistics pushed me to delve further: what does symplectic mean? Weyl (1946) coined the term because complex was already taken. The Latin prefix com- means "together", and plexus means "woven" or "braided". Symplectic translates the same construction into Greek: sym- is the Greek prefix for "together", and the root likewise means "woven". The idea remains the same: in Hamiltonian dynamics, the evolution of position and momentum are interdependent. This becomes clearer in matrix form: \[\begin{bmatrix} \dot{\bf{q}}\\ \dot{\bf{p}} \end{bmatrix} = \begin{bmatrix} 0 & I_n \\ -I_n & 0 \end{bmatrix} \begin{bmatrix} \frac{\partial \mathcal{H}}{\partial \bf{q}} \\ \frac{\partial \mathcal{H}}{\partial \bf{p}} \end{bmatrix} \label{eq:symplectic}\] Here, the skew-symmetric symplectic matrix "weaves" the positions and momenta together.
Although the equations of motion do not admit analytical solutions, they possess several known properties. First, trajectories governed solely by gravity are time-reversible. This property is important for our methodology, where we integrate the equations of motion backward in time and then forward again to the present-day position. Secondly, the total orbital energy is conserved. Moreover, according to Liouville's theorem, Hamiltonian flows preserve the local phase-space volume. A corollary is that the determinant of the Jacobian matrix of the transformation \(\left(q,p\right)\rightarrow \left(q',p'\right)\) must be one, which means that the transformation may rotate or shear an infinitesimal volume but cannot shrink or expand it. Consider the naive update: \[\begin{aligned} q' &= q + \frac{\partial \mathcal{H}}{\partial p}\Delta t, \\ p' &= p - \frac{\partial \mathcal{H}}{\partial q}\Delta t. \end{aligned}\] The Jacobian matrix, \(\left(\frac{\partial x_i'}{\partial x_j }\right)\), is: \[\begin{bmatrix} 1 & \Delta t \frac{\partial^2 \mathcal{H}}{\partial p^2} \\ -\Delta t \frac{\partial^2 \mathcal{H}}{\partial q^2} & 1 \\ \end{bmatrix}\] and the subsequent determinant is: \[\mathrm{det}\left(J\right) = 1 + \Delta t^2 \frac{\partial^2 \mathcal{H}}{\partial q^2} \frac{\partial^2 \mathcal{H}}{\partial p^2}.\] In general, neither \(\frac{\partial^2 \mathcal{H}}{\partial q^2}\) nor \(\frac{\partial^2 \mathcal{H}}{\partial p^2}\) is zero, so the volume is not conserved. There is a quick fix to this dilemma: step in only \(q\) or \(p\) while holding the other constant. In turn, the transformation of a single step has a Jacobian whose determinant is exactly 1. The transformation order becomes: \((q,p) \rightarrow (q',p) \rightarrow (q',p')\). This is commonly referred to as a sequence of drifts and kicks. A drift updates the position while holding the momentum fixed, and a kick updates the momentum while holding the position fixed. Symplectic integrators alternate these operations in a specific sequence to preserve the Hamiltonian structure and phase-space volume.
The scheme outlined above is essentially a first-order method and is closely related to Euler’s method. More sophisticated integrators use values from multiple timesteps to construct higher-order estimates of the system’s evolution. For example, some schemes temporarily evolve the position to an intermediate value \(q_\mathrm{temp}\), use this to compute a momentum \(p_\mathrm{temp}\), and then adjust both using weighted averages or predictor-corrector steps to reach the final state. These methods carefully balance forward and backward steps to optimize accuracy while preserving the symplectic structure.
One of the most commonly used symplectic integrators in galactic dynamics is the Leapfrog scheme. It works by interleaving updates of positions and momenta using time-centered values. Specifically, the momentum at the midpoint between \(q_i\) and \(q_{i+1}\) (denoted \(p_{i+1/2}\)) is used to advance the position, and the force evaluated at the midpoint position (derived from the potential) is used to update the momentum. In Cartesian coordinates—used throughout this thesis—the Leapfrog algorithm can be written as: \[\begin{aligned} x_{i+1/2} &= x_i + \frac{1}{2} \dot{x}_i \Delta t , \\ \ddot{x}_{i+1/2} &= -\nabla \Phi(x_{i+1/2}), \\ \dot{x}_{i+1} &= \dot{x}_i + \ddot{x}_{i+1/2} \Delta t, \\ x_{i+1} &= x_{i+1/2} + \frac{1}{2} \dot{x}_{i+1} \Delta t. \end{aligned}\] As will be shown in the next section, the Leapfrog algorithm is sufficient for our purposes. However, the question of computational efficiency versus numerical accuracy is ever present. Leapfrog uses only the local values of position and momentum to evolve the system; other schemes use additional intermediate evaluations to obtain more accurate estimates of the local derivatives.
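To make the scheme concrete, here is a minimal drift-kick-drift step for a single particle in a Plummer potential; this is an illustrative Python sketch, not the Fortran implementation used in tstrippy.

```python
import numpy as np

def plummer_accel(x, GM=1.0, a=0.1):
    """Acceleration -grad(Phi) for Phi(r) = -GM / sqrt(r^2 + a^2)."""
    return -GM * x / (np.dot(x, x) + a**2) ** 1.5

def leapfrog_step(x, v, dt, accel=plummer_accel):
    x_half = x + 0.5 * dt * v          # half drift
    v_new = v + dt * accel(x_half)     # full kick, force at the midpoint
    x_new = x_half + 0.5 * dt * v_new  # half drift
    return x_new, v_new
```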
Forest and Ruth (1990) proposed one such method for symplectic integration. The scheme advances the system through four sub-steps per timestep, with sub-step sizes set by coefficients derived from the root of a cubic polynomial, chosen to give the best estimates of the local derivatives. While the derivation is mathematically involved, the final scheme is straightforward to implement. I implemented this method, tested its efficiency against Leapfrog, and present the results in the following section. There are eight coefficients in this method, which are presented in Table 2.
| Coefficient | \(n=1\) | \(n=2\) | \(n=3\) | \(n=4\) |
|---|---|---|---|---|
| Velocity \(c_n\) | \(w + \frac{1}{2}\) | \(-w\) | \(-w\) | \(w + \frac{1}{2}\) |
| Acceleration \(d_n\) | \(2w + 1\) | \(-4w - 1\) | \(2w + 1\) | \(0\) |
The coefficients all derive from the real root of the cubic polynomial \(48 w^3 + 24 w^2 - 1 = 0\). For a single step, the positions and velocities are updated over four sub-steps as follows: \[\begin{aligned} x' &= x + c_n \dot{x} \Delta t, \\ t' &= t + c_n \Delta t, \\ \ddot{x} &= -\nabla \Phi (x'), \\ \dot{x}' &= \dot{x} + d_n \ddot{x} \Delta t, \end{aligned}\] where \(n\) indexes the sub-step. Notice that \(\sum_{n=1}^{4} c_n\) and \(\sum_{n=1}^{4} d_n\) both equal 1, so the four sub-steps together advance the system by one full timestep \(\Delta t\).
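A sketch of one Forest-Ruth step, reusing the acceleration function from the Leapfrog sketch above; the coefficients follow Table 2, with \(w\) the real root of \(48w^3 + 24w^2 - 1 = 0\):

```python
import numpy as np

# Real root of 48 w^3 + 24 w^2 - 1 = 0 (w ~ 0.1756)
roots = np.roots([48.0, 24.0, 0.0, -1.0])
w = roots[np.argmin(np.abs(roots.imag))].real

c = [w + 0.5, -w, -w, w + 0.5]          # velocity (drift) coefficients
d = [2*w + 1, -4*w - 1, 2*w + 1, 0.0]   # acceleration (kick) coefficients

def forest_ruth_step(x, v, dt, accel=plummer_accel):
    for cn, dn in zip(c, d):
        x = x + cn * dt * v             # drift by a fraction of dt
        v = v + dn * dt * accel(x)      # kick (the last kick has d4 = 0)
    return x, v
```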
Lastly, it is important to note that the Leapfrog algorithm is symplectic and time-reversible only for Hamiltonians that are both time-independent and separable—that is, where the Hamiltonian can be written as a sum of a kinetic term depending only on momenta, \(T(p)\), and a potential term depending only on positions, \(\Phi(q)\). These conditions are satisfied for systems in an inertial frame with conservative forces, as is the case when integrating the motion of the centers of mass of the globular clusters. However, the Hamiltonian for the integration of the star particles does depend on time, so the Leapfrog algorithm may introduce systematic integration errors due to the violation of its underlying assumptions, beyond ordinary rounding errors.
Similarly, when we integrate the orbits of either the particles or the globular clusters in the Galaxy containing a bar, we are faced with a choice: we can either work in a time-dependent inertial frame, where the potential rotates and the Hamiltonian explicitly depends on time, or we can transform to a rotating frame, in which case the kinetic energy becomes position-dependent due to Coriolis and centrifugal forces, which breaks the necessary criterion of separability: \(\mathcal{H}(q,p) = T(p)+\Phi(q)\). In both cases, the standard assumptions of the Leapfrog algorithm are violated.
Nonetheless, we will continue to use Leapfrog as it remains a robust and efficient integrator for a wide range of astrophysical systems. Its good long-term energy conservation makes it a reasonable approximation even when the ideal assumptions are not strictly met. However, this highlights the need for careful validation: we must verify that the integration errors remain within acceptable bounds, especially in systems with non-separable or time-dependent dynamics. This validation is the subject of the next section.
To ensure the quality of the integration, we perform two main checks. The first is to ensure that the initial orbital energy of a given particle is conserved to high precision. At each timestep, the relative error in the energy conservation is: \[\mathrm{err}(E(t)) = \left|\frac{E(t) - E_0}{E_0}\right|,\] where \(E_0\) is the initial energy and \(E\) is the orbital energy at a given timestep \(t\). For the case of a globular cluster, the total orbital energy is its own kinetic energy plus its gravitational potential energy in the Galaxy: \(E = T(\mathbf{v}_{\mathrm{GC}}) + \Phi_{\mathrm{MW}}\left(\mathbf{x}_{\mathrm{GC}}\right)\), where \(\mathbf{x}_{\mathrm{GC}}\) and \(\mathbf{v}_{\mathrm{GC}}\) are the Cartesian galactocentric position and velocity of the globular cluster. For the case of the \(i\)-th star-particle within a globular cluster, the potential energy of the cluster is included: \(E_i = T(\textbf{v}_i) + \Phi_{\mathrm{MW}}\left(\textbf{x}_i\right) + \Phi_\mathrm{GC}\left(\textbf{x}_i - \textbf{x}_{\mathrm{GC}}\right)\), where \(\mathbf{x}_i\) is the position relative to the Galactic center. The same approach holds when a bar is included, the only difference being that the Galactic potential has a time-dependent element.
For potentials that include the Galactic bar, the total energy is not conserved; instead, the Jacobi energy is, and this holds only for the globular clusters, since we treat them as point masses. For the star-particles, the energy is not conserved in the simulations, since they are subject to the force from the globular cluster. We nonetheless track the energy, particularly when performing the second check: time-reversibility.
To check the time-reversibility, we integrate a cluster back in time, and then flip the sign of its velocity to subsequently integrate forward in time. If the integration is correct, the cluster should remain on the same trajectory, retracing its steps. We investigate this for the following scenarios and show the results below (a minimal sketch of the check itself follows the list):
The globular cluster population orbiting within a static Milky Way potential;
Star-particles orbiting within a stationary and isolated globular cluster;
Full stream generation, i.e., star particles orbiting within a globular cluster that orbits the Galaxy.
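The reversibility check reduces to a few lines; here is a minimal sketch using the Leapfrog step from the previous section (the actual tests use tstrippy with the full Galactic potential):

```python
import numpy as np

def retrace_error(x0, v0, dt, n_steps):
    """Integrate forward, flip the velocity, integrate back, and
    return the relative distance from the starting position."""
    x, v = x0.copy(), v0.copy()
    for _ in range(n_steps):
        x, v = leapfrog_step(x, v, dt)
    v = -v  # time reversal
    for _ in range(n_steps):
        x, v = leapfrog_step(x, v, dt)
    return np.linalg.norm(x - x0) / np.linalg.norm(x0)
```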
At this stage, I did not have time to perform a full quality check of orbits within the barred potential. However, I refer the reader to the tstrippy documentation, where I demonstrate the time-reversibility of cluster orbits in the barred potential.
The initial conditions for the globular cluster system (positions and velocities) were taken from the online globular cluster catalog of Baumgardt and Hilker (2018), whose data derive from Gaia Early Data Release 3 among other sources (Baumgardt and Vasiliev 2021; Gaia Collaboration et al. 2021, 2023; Vasiliev and Baumgardt 2021).
To test the integrator, we integrated the whole globular cluster system for 5 Gyr, and then integrated it back to the initial conditions. We used four timesteps, \(10^4, 10^5, 10^6, 10^7\) years, corresponding to \(500{,}000\), \(50{,}000\), \(5{,}000\), and \(500\) integration steps, respectively. In general, the timestep should scale with the dynamical time of the orbit: the farther the system is from the Galactic center, the larger the timestep that can be used to achieve a given numerical error.
Of course, the timestep does not simply scale with a body's orbital energy; it should also account for the maximum acceleration experienced along the orbit. A highly eccentric orbit requires a smaller timestep to properly integrate the motion near pericenter than a circular orbit of the same orbital energy. To avoid cluttering the graph, Fig. 1 presents the whole globular cluster system only twice: once integrated with the smallest timestep, \(10^4\) years, and once with the largest, \(10^7\) years.
In Fig. 1 we notice that orbits with higher orbital energies (in red) have lower numerical errors than those with lower energies (in blue), which penetrate deeper into the potential well. It is clear that, for the whole system, \(10^7\) years is far too large a timestep. Interestingly, however, for some of the farthest globular clusters, a timestep of 10 million years resolves their orbits to an error of \(\langle \Delta E / E_0 \rangle \sim 10^{-5}\), which is still far smaller than the effect of observational uncertainties. Note that in both cases, the errors neither accumulate nor grow with time. This is a testament to the quality of the Leapfrog integration scheme, which is designed to conserve the Hamiltonian accurately.
Fig. 2 illustrates the time reversibility of the integrator, i.e., its ability to retrace its own steps. For each timestep, I compute the difference between the forward integration and the backward retrace, normalizing to the mean of the two positions. The same computation is performed for the velocities. Specifically, I take the current kinematics of each cluster and integrate them forward in time for five billion years, storing positions and velocities at each step. Starting from the final state, I then integrate backward for the same duration, again recording positions and velocities at every step. At each corresponding time stamp \(t_i\), I calculate the vector differences in position and velocity, normalizing the positional error to the mean position of the forward and backward steps.
The positional error for the \(10^7\) yr timestep saturates after about 2 Gyr. The distances do not continue to grow because the orbital energy differs by only one part in ten: at later times, the cluster still occupies the same region of phase space, but the retrace lies at a completely different location along the orbit than the initial integration. The errors for the \(10^6\) yr timestep become noticeable, though by the end of the 5 Gyr integration they are still only one part in ten thousand. The timesteps of \(10^5\) and \(10^4\) yr have excellent retraceability, differing on average by only one part in \(10^{14}\), and are thus limited only by round-off error from the use of double-precision floating-point numbers.
Computation time is also an important consideration. Fig. 3 presents the total computation time for integrating the globular cluster system, as well as the time per integration step, computed by normalizing the total time by the number of objects and the number of steps taken: \(T_{\mathrm{total}}/N_{\mathrm{GCs}}/N_{\mathrm{steps}}\). In general, the relationship is linear and the integration time per step per object is roughly constant. The slight downward trend in Fig. 3 is partly coincidental, as some timing realizations happen to minimize it, and partly due to the fixed overhead of initializing and finalizing the calculation, which contributes proportionally less as the integration time increases. Nonetheless, on this processor (an Apple M2), the mean time for a single integration step is \(\sim\) 128 nanoseconds.
The Forest-Ruth algorithm discussed in the previous section was also implemented; its results are reported in Fig. 4. As expected, the precision greatly increases as the timestep decreases. However, since there are four force evaluations per timestep, this method is naturally slower per step than Leapfrog. How do the two methods compare overall?
Fig. 5 compares the numerical error as a function of the total number of steps (which is inversely proportional to the timestep). For a given step size, Forest-Ruth clearly outperforms Leapfrog, but is it actually better? To answer this question, I fit each curve with its own trend line to find the number of steps required to reach a relative error of \(10^{-8}\). With this requirement, the Leapfrog scheme requires 262,641 steps while the Forest-Ruth scheme requires 102,773 steps. However, given the difference in computation time per step, the Forest-Ruth scheme takes on average \(\sim 1.5\times\) more time than Leapfrog for the same numerical precision. For this reason, we use the Leapfrog algorithm. In this problem, numerical uncertainties are much less of a limiting factor than modeling and observational uncertainties, so a more precise integrator is not worth the cost of longer computation times.
A classic challenge in astronomy and the physical sciences arises when a problem involves two or more physical processes that operate on very different time scales. This is certainly the case when studying globular clusters. Figure 6 illustrates the orders-of-magnitude differences in time scales both across the globular cluster population and within individual clusters.
A useful metric for characterizing a system’s time scale is the crossing time, which is the time it would take a star to reach the center of the system given its current speed: \[t_\mathrm{cross} = \frac{r}{v}.\] While this quantity is not an integral of motion and varies as a star moves, it still provides a convenient and informative estimate of the dynamical time scale. It breaks down in extreme cases, such as a purely radial orbit near pericenter or apocenter where the instantaneous velocity approaches zero, but such cases are rare and do not undermine its utility. We compute the crossing time for each globular cluster in the galaxy (red distribution in the top panel of Fig. 6) as well as for each individual star particle within the clusters.
For the cluster as a whole, a robust characteristic dynamical time can be defined as: \[\tau = \sqrt{\frac{a^3}{GM}},\] where \(a\) is a characteristic size of the system. This is often taken to be the half-mass radius. In this work, I adopt the half-mass radius rather than the Plummer scale radius, though the two are related by \(r_{\mathrm{1/2}} \approx 1.3a\). This time scale was computed for each cluster and is shown as the blue distribution in the top panel of Fig. 6.
Note that the distribution of cluster dynamical times has a long tail toward longer values, overlapping with the galactic crossing times. A natural question arises: could any cluster have a longer internal dynamical time than its orbital crossing time? The answer should be no. If it takes longer for stars to orbit within the cluster than for the entire cluster to orbit the Galaxy, the cluster is effectively unbound or fully disrupted. I examined this question for the Galactic globular cluster population, and the results are shown in Fig. 7. As expected, all clusters have internal dynamical times shorter than their orbital crossing times.
Interestingly, the ratio of these two time scales serves as a useful diagnostic of cluster stability. Clusters in which stars complete hundreds or thousands of internal orbits per galactic orbit are significantly more stable than those where stars complete only a few.
To properly compute the orbits of the star particles, the timestep must be small enough to resolve the orbit accurately while the star is inside the cluster. How should we choose this timestep? There are two criteria. The first is that the timestep should be some fraction of the cluster’s dynamical time: \[\Delta t' = \alpha ' \tau.\] I use a prime to indicate that this is a trial timestep, since the second criterion must also be satisfied: the total number of timesteps should be an integer, \[N = \left\lceil \frac{T}{\alpha' \tau} \right\rceil,\] where \(T\) is the total integration time. We round \(N\) up to ensure a slightly smaller timestep, which becomes \(\Delta t = T/N\). This also redefines the effective timestep fraction as \(\alpha = \Delta t / \tau\).
In the experiments presented in this section, I choose several fractions of each cluster's dynamical time and examine the resulting numerical errors. To efficiently explore a wide range of \(\alpha\), I begin with a few trial values of \(\alpha'\) and then round the number of steps up so that \(N\) is not only an integer, but also a power of two. This leads to the following condition: \[N = 2^{k-1},\] where \(k \in \mathbb{N}\) is an integer index. The expression for \(k\) becomes: \[k = \left\lceil \log_2\left(\frac{T}{\alpha ' \tau}\right) + 1 \right\rceil. \label{eq:binary_time_step_criterion}\]
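Combined, the two criteria and the power-of-two rounding amount to the following (a sketch; \(T\), \(\alpha'\), and \(\tau\) must be given in consistent units):

```python
import math

def choose_timestep(T, alpha_trial, tau):
    """Return (dt, N, alpha) such that N = 2**(k-1) steps of size dt
    cover the total integration time T, with dt <= alpha_trial * tau."""
    k = math.ceil(math.log2(T / (alpha_trial * tau)) + 1)
    N = 2 ** (k - 1)          # number of steps, a power of two
    dt = T / N                # effective timestep
    return dt, N, dt / tau    # dt / tau is the effective alpha
```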
The goal is now to determine the appropriate value of \(\alpha\) to ensure energy conservation and time-reversibility. In the analysis presented in the previous sections, I selected timesteps without explicitly relating them to the crossing times of globular cluster orbits. Fig. 1 shows that a timestep of \(\Delta t = 10^4~\mathrm{yr}\) achieves a mean relative energy error of \(10^{-8}\). Meanwhile, Fig. 2 shows that a timestep of \(\Delta t = 10^5~\mathrm{yr}\) already achieves convergence in time-reversibility. Comparing these values to the typical crossing time shown in Fig. 6, we find that they correspond to \(\alpha\) values of approximately \(2 \times 10^{-4}\) and \(2 \times 10^{-3}\), respectively.
In Fig. 8, I investigate energy conservation in a static Plummer sphere using trial values of \(\alpha \in [10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}, 1, 2]\). This experiment shows that values of \(\alpha < 10^{-2}\) yield excellent energy conservation. In this section, we do not consider the time-reversibility as the previous section has shown the integration’s robustness with regard to this metric. However, this is our main metric for the next section.
The preceding sections assessed the numerical stability of integrating globular cluster orbits within the Milky Way, focusing on time-reversibility and relative energy error across two integration schemes: Leapfrog and Forest-Ruth. I also considered the computational cost associated with different temporal resolutions. Separately, I evaluated the energy conservation for star particles evolving within a stationary Plummer sphere, quantifying the relative error as a function of timestep. In that context, the cluster’s potential was scaled to its scale radius, mass, and the gravitational constant. This non-dimensionalization allowed me to select timesteps as fractions of the internal dynamical time \(\tau\) of each cluster.
In this section, I combine them: I examine the quality of orbit integration for star particles evolving within a globular cluster, which in turn is orbiting in the Galactic potential.
I restrict the analysis here to the Leapfrog integrator for two main reasons. First, although the Forest-Ruth scheme achieves better energy conservation (see Fig. 5), this comes at a significantly higher computational cost, which outweighs its marginal gains for this application. Second, the equations of motion used to model stream generation require loading the position of the host cluster for each force computation. Because Forest-Ruth evaluates forces at non-uniform substeps, using it would require either: (1) storing and loading the cluster orbit at each substep, which would demand excessive disk space and complex code restructuring; or (2) interpolating the orbit at intermediate times, which introduces ambiguity regarding whether time-reversibility is preserved with linear or cubic interpolation. For these reasons, I opted not to explore Forest-Ruth further in the context of full stream generation.
To assess the performance of the stream-generation method, I designed a quality assurance experiment, illustrated in Fig. 9. I began by integrating the orbit of a globular cluster’s center of mass backward in time by 1 Gyr. At that point, I initialized a Plummer sphere with 512 star particles using the cluster’s half-mass radius and total mass. This system was translated to the center-of-mass phase-space coordinates and then integrated forward to the present day, forming a stream. I recorded the resulting positions and computation time. To test time-reversibility, I subsequently integrated the stream particles backward again for 1 Gyr.
A note on terminology: although the cluster orbit is initialized from present-day observations, in the context of stream generation, I refer to the past position (1 Gyr ago) as the “initial conditions”. The forward-integrated system represents the “stream”, and the backward-integrated version of the stream is referred to as the “retrace”.
Fig. 9 shows the results for four different timesteps. As expected, increasing the timestep degrades time-reversibility. For the largest timestep (\(\alpha \approx 1\), top left), the retrace fails to recover the original configuration. The retraced structure remains a stream and does not re-coalesce into a cluster. The results improve rapidly with decreasing timestep: at \(\alpha \approx 2 \times 10^{-1}\), most particles return close to their starting positions with only slight offsets. At \(\alpha \approx 2 \times 10^{-2}\), nearly all particles recover their initial conditions, and by \(\alpha \approx 6 \times 10^{-3}\), the retrace is nearly perfect.
Fig. 10 presents the integrator’s energy conservation performance during the retrace phase of Fig. 9, across the same timesteps.
Although the retrace fails for the largest timestep in Fig. 9, Fig. 10 shows that the relative energy error remains modest—between \(10^{-4}\) and \(10^{-2}\). However, this metric requires careful interpretation. What appears as an “error” primarily reflects the intrinsic spread in orbital energies among star particles, relative to the center-of-mass energy.
We can estimate this ratio by comparing the internal potential energy scale of the cluster to its orbital energy in the Galaxy. The cluster’s characteristic potential is roughly \(\Phi \sim GM/a\), with \(M \sim 10^5~M_\odot\) and \(a \sim 0.005\) kpc, giving \(\Phi \sim 10^2~\mathrm{km}^2\,\mathrm{s}^{-2}\). Meanwhile, the typical orbital energy of a globular cluster is around \(10^5~\mathrm{km}^2\,\mathrm{s}^{-2}\).
This three-order-of-magnitude difference implies that the intrinsic energy spread from the Plummer sampling is about \(10^{-3}\) of the total orbital energy, which is consistent with the apparent energy "error" in the stream of Fig. 10. Thus, this difference does not arise from numerical drift, but from the physical energy distribution encoded in the initial conditions.
Fig. 9 and Fig. 10 demonstrate the case of a single cluster. How is the energy conservation for the entire catalog? Fig. 11 presents the relative error in energy conservation when retracing the orbit of each star particle, for every cluster in the catalog. Each data point reports the mean error in the conservation of energy, and each cluster was integrated for 5 Gyr. The upper limits for the timesteps were sampled logarithmically between \(10^{0}\) and \(10^{-2}\); the exact timesteps are based on the criterion from Eq. [eq:binary_time_step_criterion].
The quality of the solutions is fundamental for credible results, but computation time and cost are also very important. The main motivation for using the restricted three-body problem was to save computation time. With direct \(N\)-body methods, the computation time scales as \(N_p^2\), while for the restricted three-body problem it scales as \(N_p\). In both cases, it should scale linearly with the number of integration steps.
How does tstrippy perform? Fig. 12 presents the results from an experiment with all the globular clusters, integrating each for 1 Gyr. I chose the timestep to be at most 1/20 of each cluster's internal dynamical time; since each cluster has a different dynamical time, each requires a different number of steps. For each of these tests I used \(10^1,~10^2,~10^2,~10^3\) particles and launched the runs on a computing cluster at the Paris Observatory. I then fit a scaling law of the form \(C(N_p,N_s) = \langle C\rangle N_p^a N_s^b\). If the code scales linearly, each exponent, \(a\) and \(b\), should be 1, and \(\langle C\rangle\) would be the mean computation time per particle per step.
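In practice, this fit is a linear least-squares problem in log space. A minimal sketch with hypothetical timing data (the actual measurements are those of Fig. 12):

```python
import numpy as np

# Hypothetical wall-clock times C (s) for particle counts Np and step counts Ns
Np = np.array([10.0, 100.0, 100.0, 1000.0])
Ns = np.array([1e4, 1e4, 1e5, 1e5])
C = np.array([0.9, 11.0, 95.0, 1800.0])

# log C = log<C> + a log Np + b log Ns  ->  linear least squares
A = np.column_stack([np.ones_like(C), np.log(Np), np.log(Ns)])
(logC0, a, b), *_ = np.linalg.lstsq(A, np.log(C), rcond=None)
print(np.exp(logC0), a, b)  # amplitude and scaling exponents
```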
Fig. 12 shows that the code scales worse than linearly with the number of particles, since the best-fit exponent is \(a = 1.35\). However, it scales slightly better than linearly with the number of steps. This can likely be explained by the fixed overhead of initializing and finalizing each simulation: as the number of steps increases, this overhead takes up proportionally less of the total time. Since the power law has an exponent of \(b = 0.95\), the overhead is marginal. It is nonetheless fortunate that the exponent is not greater than 1, which would mean the code slows down with execution time.
Putting Fig. 11 and Fig. 12 together, we can estimate how long it takes to run these simulations. If we want all simulations to achieve a relative error on the retraced energy conservation better than about \(10^{-3}\), then we must pick timesteps smaller than about 1/20 of the dynamical time. With this criterion selected, we can compute the number of timesteps necessary for each globular cluster, which is a function of its internal dynamical time. Doing so, I find that the fastest cluster can be computed in \(\sim\) 30 minutes, the median computation time is \(\sim\) 17 hours, and the longest is \(\sim\) 5 days. The total CPU time for the whole catalog is \(\sim\) 194 days.
During the writing of this thesis, I discovered a bug in my code that affected the integration scheme. Specifically, the error concerned the way I handled the host orbit during the integration of the star particles. In earlier implementations, I would integrate the orbit of the host using the same timestep that I used later to integrate the orbits of the star particles. However, this approach violated the structure of the Leapfrog integration scheme.
In Hamiltonian integration, it is essential that the drift and kick steps alternate in a consistent manner. Since I implemented a drift-kick-drift (DKD) scheme, the kicks occur at the midpoint of each timestep, meaning that forces should be evaluated at intermediate positions. Previously, I erroneously computed the relative position vector as \[\Delta \vec{r}_i = \vec{r}_{p,i+1/2} - \vec{r}_\mathrm{GC,i},\] instead of the correct expression: \[\Delta \vec{r}_i = \vec{r}_{p,i+1/2} - \vec{r}_\mathrm{GC,i+1/2}.\]
This mistake appeared in both Ferrone et al. (2023) and Ferrone et al. (2025). Does this invalidate our results? To assess the impact, we can compare Figs. 13 and 10. In Fig. 13, the same retracing experiment was performed, but using the flawed integration: the host orbit was only provided at the same grid points as the stream. In the left panel, we see that the globular cluster does not perfectly re-coalesce when integrating backward in time, but the result is not catastrophic. Physically, we do not expect this process to be perfectly time-reversible; however, such irreversibility should ideally arise from physical modeling, not from numerical artifacts.
We also observe that the distribution of the relative error in energy conservation is nearly identical between the retraced and the original stream. In any case, once the stars move beyond the Jacobi radius, their dynamics are dominated by the galactic potential. Since this potential was integrated correctly and symplectically, the orbits of stars outside the cluster remain reliable and physically meaningful.
As a result of this discovery, I updated the tstrippy
code to warn users if the host orbit is not sampled at \(2N + 1\) points, where \(N\) is the number of integration steps.
While the flawed implementation breaks strict symplecticity for the
internal cluster dynamics, the Leapfrog integration for the globular
cluster orbits was still used correctly. This ensures that the cluster
center of mass returns to its present-day sky position and that stars
outside the cluster follow accurate galactic orbits. Ultimately, the
simplification of using a static Plummer sphere—whose mass and size
remain fixed—is a greater limitation than this particular numerical
error.
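To make the required sampling concrete, the following sketch (illustrative Python, not the Fortran source) shows a drift-kick-drift step that reads the host position at the half-step index when the host orbit is stored at \(2N+1\) points:

```python
# r_host has 2N + 1 rows: r_host[2*i] is the host position at step i,
# r_host[2*i + 1] the position at the midpoint between steps i and i+1.
def dkd_step(i, x, v, dt, accel_galaxy, accel_host, r_host):
    x = x + 0.5 * dt * v                                   # drift
    a = accel_galaxy(x) + accel_host(x - r_host[2*i + 1])  # kick at midpoint
    v = v + dt * a
    x = x + 0.5 * dt * v                                   # drift
    return x, v
```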
Lastly, I discovered this error while writing about the Forest-Ruth integration method. To implement this scheme correctly, I would need to know the position of the globular cluster’s center of mass at four intermediate points across two timesteps, with the spacing determined by the coefficients in Table 2. This would require either storing additional mini-time-step data or interpolating the globular cluster’s trajectory between saved snapshots. Both options would necessitate a substantial refactoring of the code or an impractical increase in memory usage per orbit. Given the results shown in Fig. 5, the gain in accuracy does not justify the computational cost.
This project is built on f2py, which allows integration
between Fortran and Python. The core motivation behind this choice is
performance: Fortran, as a compiled language, provides significantly
faster execution for numerically intensive tasks, while
Python—especially within Jupyter notebooks—offers a convenient
environment for development, experimentation, and visualization.
F2py stands for Fortran to Python (Peterson 2009),
and it is included as a module within NumPy (NumPy Developers
2025; Harris et al. 2020). The name of the project,
tstrippy, stands for Tidal
Stripping in Python.
F2py supports Fortran 77, 90, and 95 standards, so we
chose to write the code in Fortran 90 to make use of modules.
In Fortran, a module encapsulates data and subroutines in a manner
somewhat analogous to classes in object-oriented programming. However,
Fortran modules do not support inheritance, and only a single instance
of a module can exist at a time—unlike classes, which can be
instantiated multiple times.
The tstrippy package is structured around five core Fortran modules, each responsible for a distinct aspect of the simulation:
integrator: This is the central module of the code.
It stores particle positions and velocities, computes forces, and
evolves the system forward in time. It also handles the writing of
output data at specified intervals and interfaces with all other modules
in the code.
potentials: This module defines the analytical
potentials used to compute gravitational forces. It currently supports
several models, including Plummer, Hernquist,
AllenSantillian, MiyamotoNagai, the bar model
LongMuraliBar from Long and Murali (1992), and the
composite model from pouliasis2017pii (Pouliasis, Di
Matteo, and Haywood 2017). This module can also be called in
Python, allowing users to call potential functions directly, e.g., for
computing energies during post-processing.
hostperturber: This module handles the host globular
cluster. It stores its orbit (i.e., timestamps, positions, and
velocities) and ensures its synchronization with the simulation’s
internal clock. It computes the gravitational influence of the host
cluster on each star particle. This is an internal module only.
perturbers: Similar in function to
hostperturber, this module supports additional perturbing
clusters. It allows for the inclusion of multiple perturbers and
computes their collective force on each particle. If object-oriented
programming were available in Fortran, both this module and
hostperturber would naturally inherit from a shared parent
class. This module only uses the positions, masses, and characteristic
radii of the other perturbers. The velocities are not imported.
galacticbar: This module stores parameters for the
bar, including the polynomial coefficients for its angular displacement
as a function of time: \[\theta(t) = \theta_0
+ \omega t + \dot{\omega} t^2 + \ddot{\omega} t^3 + \dots\] If a
user wants a bar with a constant rotation speed, they may pass two coefficients. If they want a bar that accelerates or decelerates, they may pass more coefficients for the higher-order terms. The module
performs transformations into the rotating bar frame, computes the
forces in that frame, and then transforms the forces back to the
Galactocentric reference frame.
This modular structure makes it straightforward to extend the code by adding new physics in the form of additional modules.
To make the package installable and easy to distribute, I initially
followed the guide by Bovy (2025b), which describes
how to create a Python package using setuptools—the
standard build system in the Python ecosystem. However, compatibility
between setuptools and f2py was broken
starting with NumPy \(>\) 1.22
(released June 22, 2022) and Python \(>\) 3.9.18 (released June 24, 2024). This meant that Fortran extensions
could only be compiled using deprecated versions of both. These older
versions of NumPy were also not compatible with Apple’s ARM-based M1 and
M2 processors, rendering the code unusable on modern Mac systems.
This limitation stemmed from the deprecation and eventual removal of
numpy.distutils, the tool that previously enabled seamless
integration of Fortran code in NumPy-based packages. As of NumPy 1.23
and later, numpy.distutils was deprecated, and with
NumPy 2.0, it was removed entirely. The NumPy developers recommended
migrating to meson (Meson Developers 2025) or
CMake.
To address these issues, I migrated the build process to
meson, a language-agnostic build system capable of
compiling Fortran, C, and Python extensions across architectures. This
eliminated compatibility problems and made the build process
architecture-independent. The build system now automatically detects the
system architecture and compiles accordingly.
The code is fully open source and available on GitHub. To support users, I created documentation hosted on readthedocs.io, which includes working examples and basic usage guides. A minimal test suite, written using pytest, is also included. While not exhaustive, the tests ensure that core functionality remains intact as the code evolves. In the next section, I present a minimal example of how the user may use the code within a Python script.
If the package is properly installed on the system, it can be imported at the top of any Python script:
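```python
import tstrippy
```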
Next, the user must load or define the initial conditions. The code provides:
The masses, sizes, and kinematics of the globular cluster catalog from Baumgardt and Hilker (2018);
The galactic potential parameters for model II of Pouliasis, Di Matteo, and Haywood (2017); and
A galactic reference frame.
The user must then select the system to integrate. For example, to
integrate the orbits of observed globular clusters, one must convert the
ICRS coordinates to a Galactocentric frame using astropy
and the provided MW reference frame. Alternatively, to simulate a star
cluster, one can generate a Plummer sphere:
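For instance (a sketch: the sampler name and argument order below are placeholders; consult the tstrippy documentation for the exact call):

```python
G = 4.301e-6           # kpc (km/s)^2 / Msun
NP = 512               # number of particles
halfmassradius = 0.01  # kpc
massHost = 1e5         # Msun

# Hypothetical sampler; the actual function name in tstrippy may differ
x, y, z, vx, vy, vz = tstrippy.ergodic.isotropicplummer(
    G, massHost, halfmassradius, NP
)
```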
Here, NP is the number of particles,
halfmassradius is the system’s half-mass radius,
massHost is the total mass of the Plummer sphere, and
G is the gravitational constant. All values must be in the
same unit system.
The integrator must then be initialized. All parameters are passed via lists that are unpacked at the function call. Here is an example of initializing the integrator for a stellar stream in a potential that includes a rotating bar:
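The following is a sketch of that sequence: the setter names are those described in the list below, while the module path and the contents of the parameter lists are placeholders for the user's own values.

```python
integrator = tstrippy.integrator  # hypothetical module path

staticgalaxy = ["pouliasis2017pii", galaxyparams]
initialkinematics = [x, y, z, vx, vy, vz]
integrationparameters = [t0, dt, nsteps]
hostperturber = [thost, xhost, yhost, zhost, vxhost, vyhost, vzhost,
                 masshost, radiushost]
galacticbar = ["longmuralibar", barparams, barpolycoeffs]

# Each parameter list is unpacked at the function call
integrator.setstaticgalaxy(*staticgalaxy)
integrator.setinitialkinematics(*initialkinematics)
integrator.setintegrationparameters(*integrationparameters)
integrator.inithostperturber(*hostperturber)
integrator.initgalacticbar(*galacticbar)
```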
setstaticgalaxy specifies the static potential model
and passes its parameters.
setinitialkinematics provides the initial positions
and velocities of the particles.
setintegrationparameters defines the initial time,
timestep, and number of steps.
inithostperturber specifies the globular cluster’s
trajectory and mass as a function of time.
initgalacticbar defines a rotating bar. It takes the
name of the bar model, potential parameters, and spin
parameters.
setbackwardorbit reverses the velocity vectors and sets the internal clock to count down: \(t_i = t_0 - i \cdot \Delta t\). For the use case presented in this work, setbackwardorbit is used for computing the globular cluster orbits and not for the star-particles.
The user can choose between two output modes during integration, which represent two output paradigms:
initwriteparticleorbits saves the full orbit of each
particle to an individual file.
initwritestream saves full snapshots of all
particles at selected timesteps.
Both functions take:
nskip: number of timesteps to skip between
outputs;
myoutname: the base file name;
myoutdir: the output directory.
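For example (the argument order here is a hypothetical sketch; both functions take the same three parameters):

```python
integrator.initwriteparticleorbits(nskip, myoutname, myoutdir)  # one file per particle
# or
integrator.initwritestream(nskip, myoutname, myoutdir)          # snapshots every nskip steps
```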
The output files will be named like: ../dir/temp0.bin,
../dir/temp1.bin, ..., up to ../dir/tempN.bin,
where \(N = N_\mathrm{step} /
N_\mathrm{skip}\). Note that the files are written in Fortran
binary format. Although scipy.io.FortranFile can read them,
I use a custom parser based on numpy.frombuffer to avoid
the SciPy dependency. Once all parameters are set, the user can proceed
with integration using one of two methods:
leapfrogintime stores the full orbit of each particle in
memory. This is useful for a small number of particles or short
integrations—e.g., rapid parameter studies in a notebook. However, for
large simulations it can be prohibitively memory-intensive. For
instance, integrating all globular clusters at high time resolution
might require: \[7 \times N_p \times
N_\mathrm{step} \times 8~\mathrm{Byte} \approx 450~\mathrm{GB}\]
if \(N_\mathrm{step} \approx 10^7\).
This will likely exceed system RAM.
leapfrogtofinalpositions() performs the integration but
only returns the final phase-space coordinates. These arrays must be
copied before deallocating memory:
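For example (a sketch; the name of the cleanup routine is a placeholder):

```python
import numpy as np

xf, yf, zf, vxf, vyf, vzf = integrator.leapfrogtofinalpositions()

# Copy out of the Fortran-owned arrays before freeing them
xf, yf, zf = np.copy(xf), np.copy(yf), np.copy(zf)
vxf, vyf, vzf = np.copy(vxf), np.copy(vyf), np.copy(vzf)

integrator.deallocate()  # hypothetical cleanup call; see the documentation
```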
Deallocating is necessary to avoid memory leaks or crashes in Jupyter when rerunning code cells.
The earliest version of this code began as a simple Fortran script that was interfaced to Python via f2py and was built to
integrate test particles in specific gravitational potential models.
Since I was already familiar with Fortran and had working routines, I
chose to build on that foundation, gradually transforming the script
into a modular and more reusable package. Rather than switching to C++
or a Python-only solution, I continued using Fortran in combination with
f2py.
One of the primary motivations for writing my own code was
flexibility. When I attempted to implement a particle spray method in
Galpy, I found that performance degraded significantly when
using custom potentials not constructed from its internal C++ backend.
For example, I wanted to use the AllenSantillian halo
model, which is not natively supported. I followed the documentation and
implemented a class for it. However, custom potentials bypass
Galpy’s optimized C++ backend, resulting in slow
computations and rendering actions uncomputable (or at least with the
functions I tried). This pushed me to continue developing
tstrippy. I had a similar experience with
Amuse.
This choice, however, came with challenges. At one point, I tried implementing potentials derived from exponential density profiles, which do not admit closed-form solutions for the potential. I attempted to work in elliptical coordinates, motivated by the idea that the potential would depend only on the “distance” from the center along the equipotential surfaces. While I successfully implemented simple potential models in elliptical coordinates, I naively overlooked that this changes the equations of motion entirely due to the underlying geometry. I found that my orbits always diverged (except in special cases). It was only later that I realized I would need to account for the metric tensor and Christoffel symbols to properly integrate orbits in elliptical coordinates. At that point, I chose not to pursue this further, prioritizing scientific analysis and launching other simulations instead. This experience helped me appreciate why many codes and other published works prefer basis function expansions for such problems. In hindsight, this episode was a perfect example of how even unsuccessful attempts can lead to valuable insights. Below, I summarize some of the key advantages and limitations I encountered while developing a code from scratch to answer a scientific question.
Developing and maintaining this codebase brought several clear benefits:
Understanding. Writing the code forced me to deeply understand the modeling techniques involved. Otherwise, the results would have been incorrect.
Flexibility. I could implement exactly the models I needed and extend them as required.
Transparency. Results produced by my code can be verified and reproduced by others.
Reusability. Ongoing development helped me uncover and fix subtle bugs (e.g., related to non-symplectic integrators) that do not trigger obvious errors and only appear in edge cases or after close inspection. Long-term engagement with the code is key to catching these issues.
Collaboration. A fellow researcher at the Paris Observatory is now using the code. I also used it to supervise a master’s student during their semester research project, something that would not have been possible without building a user-friendly tool.
Growth. This project pushed me to adopt best practices:
version control, documentation, modularity. Developing my own code has
also made it easier to understand external libraries. For example, when
I implemented the King model, I studied Galpy’s internals
to cross-check my own method. I was delighted by how much easier it was for me to use, despite the fact that I had not touched it in over a year.
However, there were also drawbacks:
Time cost. Developing the code took time away from direct scientific analysis. It's possible I could have run more simulations or performed more analyses had I not developed my code to such an extent.
Feature limitations. My code still lacks capabilities present in other packages, such as basis function expansions, action-angle variable computation, and parallelization strategies using MPI or OpenMP.
Changing relevance. Scientific priorities and available tools evolve rapidly. A general-purpose tool may become obsolete more quickly than a single-use script written for a specific question.
Compatibility and maintenance burden. Making the code
accessible to other users also introduces challenges related to
cross-platform compatibility and dependency management. Software
environments evolve, compilers are updated, dependencies can deprecate,
and build tools change. Even with the help of modern tools like
meson or f2py, ensuring continued
compatibility requires regular testing and adaptation. As the codebase
grows, the maintenance load increases, and sustaining it as a single
developer becomes increasingly difficult, especially if the tool is made
too general.
Nonetheless, the code was designed to address concrete scientific questions about Milky Way stellar streams and globular clusters. In the next two chapters, I present how this tool contributed to advancing our understanding of the Galaxy.