Python algorithms in particle tracking microrheology
© Maier and Haraszti; licensee Chemistry Central Ltd. 2012
Received: 13 September 2012
Accepted: 14 November 2012
Published: 27 November 2012
Particle tracking passive microrheology relates recorded trajectories of microbeads, embedded in soft samples, to the local mechanical properties of the sample. The method requires intensive numerical data processing and tools allowing control of the calculation errors.
We report the development of a software package collecting functions and scripts written in Python for automated and manual data processing, to extract viscoelastic information about the sample using recorded particle trajectories. The resulting program package analyzes the fundamental diffusion characteristics of particle trajectories and calculates the frequency dependent complex shear modulus using methods published in the literature. In order to increase conversion accuracy, segmentwise, double step, range-adaptive fitting and dynamic sampling algorithms are introduced to interpolate the data in a splinelike manner.
The presented set of algorithms allows for flexible data processing for particle tracking microrheology. The package presents improved algorithms for mean square displacement estimation, controlling effects of frame loss during recording, and a novel numerical conversion method using segmentwise interpolation, decreasing the conversion error from about 100% to the order of 1%.
Particle tracking microrheology is a modern tool to investigate the viscoelastic properties of soft matter, for example, biopolymers and the interior, or the membrane of living cells [1, 2] on the microscopic scale. Though embedding tracer particles into such a sample alters the local structure, this method is still considered non-invasive and provides important information not available by other methods [1–4].
⟨Δr²(τ)⟩ = N_D k_B T J(τ) / (3πa)   (1)

where τ denotes the time step in which the particle moves a distance Δr, ⟨Δr²(τ)⟩ the mean square displacement, J(τ) the creep compliance, N_D the dimensionality of the motion (usually N_D = 2 for particle tracking digital microscopy), k_B the Boltzmann constant, T the absolute temperature and a the particle radius.
G∗(ω) = 1/(iω J̃(ω))   (2)

where J̃(ω) is the Fourier transform of J(t). Assuming that the particle tracks have been obtained beforehand, the frequency dependent complex shear modulus G∗(ω) can be derived using equations (1) and (2) after calculating the mean square displacement.
There are two major algorithm libraries available on the Internet addressing data handling for microrheology: the algorithm collection of J. Crocker et al., written in the Interactive Data Language (IDL), which was translated to Matlab and expanded by the Kilfoil lab, and a separate stand-alone algorithm provided by M. Tassieri for calculating the complex shear modulus from the creep compliance, written in LabView. However, an extensible integrated framework relying on freely accessible software and source code and integrating multiple conversion methods is not yet available.
In this paper we present a software package written in the interpreted programming language Python (http://www.python.org), collecting functions to support particle tracking microrheology related calculations, with emphasis on the parts providing enhanced functionality. This library is meant to be an open source, platform independent, freely extendable set of algorithms for extracting rheology information from previously obtained particle tracks.
The software depends on the following Python packages for calculations and for displaying results: NumPy, SciPy and Matplotlib.
After installation, its functions are available using the following import command within the interpreter or a Python script:
from Rheology import *
Instead of defining individual classes for each data type (MSD, creep compliance and complex shear modulus), an alternative technique in Python is to employ generalized lists, the so called dictionaries or ‘dicts’. A dict is a container class holding data which are indexed with arbitrary keys (marked by quotes in this paper). This is a very general and flexible data structure, which we used for all data in the Rheology package. For example, MSD dicts contain keys “tau” and “MSD” referring to arrays holding the time lag and the mean square displacement data, respectively.
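As a minimal illustration (with synthetic values, not data produced by the package), such a dict can be built and extended on the fly:

```python
import numpy as np

# A hypothetical MSD dict in the style described above: a plain Python
# dictionary keyed by name, holding NumPy arrays.
tau = np.arange(1, 11) * 0.04                 # lag times in seconds
msd = {
    "tau": tau,                               # time lag values
    "MSD": 0.09 * np.ones_like(tau),          # mean square displacement (um^2)
}

# New keys can be attached without touching any class definition:
msd["dtau"] = np.zeros_like(tau)              # e.g. add error estimates later
print(sorted(msd.keys()))                     # -> ['MSD', 'dtau', 'tau']
```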
Particle tracking microrheology starts with the imaging experiments and the image treatment in a strict sense; however, obtaining particle trajectories from video microscopy is well described in the literature [16–20], including the statistical difficulties of the process [21–23]. There are various implementations of particle tracking algorithms available in IDL (see the same website as for the rheology code) [11, 24], Matlab [12, 25], LabView [20, 26] and C++. An implementation of the Grier method has also been translated to Python.
Thus, we start our discussion by extracting rheological information from the particle trajectories, leaving the implementation of data input/output to the user. As a good starting template, we recommend the ReadTable() and SaveTable() functions in the ProcessRheology.py example script.
Several experimental systems show a drift: a slow, oriented motion of the sample relative to the imaging frame, caused by various factors of the experimental apparatus. To remove this drift, which is independent of the sample, there are multiple possibilities one may consider. If a reference bead bound to the sample is available, its trajectory provides the drift itself. If multiple particles are tracked in the same time interval, an average trajectory may be calculated and used as a reference. If these possibilities are not available, one has to consider whether the long time motion is due to drift or is a characteristic of the investigated sample, because subtracting it changes the resulting long time lag (low frequency) part of the viscoelastic characteristics.
The simplest way of treating the data in such cases is to calculate a smoothened trajectory, e.g. using a running average, and either subtract this smoothened data or, in one further step, fit a low order polynomial to the smoothened data and subtract the fitted values from the trajectory.
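This two-step correction can be sketched as follows (a hypothetical helper for one coordinate, not the package's GetData()):

```python
import numpy as np

def remove_drift(t, x, window=51, order=3):
    """Illustrative drift removal: smooth the trajectory with a running
    average, fit a low-order polynomial to the smoothed curve, and
    subtract the fitted drift from the raw positions."""
    # Running average of the raw coordinate (zero-padded at the edges):
    smooth = np.convolve(x, np.ones(window) / window, mode="same")
    # Fit a low-order polynomial to the smoothed curve and subtract it:
    drift = np.polyval(np.polyfit(t, smooth, order), t)
    return x - drift

# A linear drift superposed on small positional noise is removed:
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 1001)
x = 0.5 * t + rng.normal(0.0, 0.01, t.size)   # drift + noise
corrected = remove_drift(t, x)
print(abs(corrected.mean()) < 0.1)            # -> True
```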
A very simple implementation is available for two dimensional data sets as:
GetData(timestamps, poslist, indx=0, order=3, resolution=0.1434, Nd=1000)
This algorithm takes a position list (parameter poslist), which is a list of dicts, each describing the positions of a particle. The indx parameter is used to select one of them. A position dict contains “X”, and “Y”, which are arrays of the x and y positions. The dict also contains an index array denoted with key “indx”, identifying the image index (serial number) of the given positions. Using this index allows the tracking algorithm to miss individual frames (e. g. when the particle drifted out of focus). This index is also used to define the time point of a position, either by identifying the corresponding time stamps, provided in seconds in the timestamps array, or if this variable is set to None, multiplying the index by the optional tscale parameter. The Nd parameter gives the number of points used in the running average and the order parameter identifies the order of the polynomial to be fitted for drift correction. If Nd is set to −1, the running averaging is off, and if order is -1, the drift correction is turned off. The resolution is used to scale the coordinates to micrometers (the same value for both coordinates).
Characterizing a soft sample using particle tracking microrheology strongly relies on the determination of the mean square displacement. Calculating the MSD has two possible sets of assumptions: 1.) ensemble averages are based on recording many tracer particles and assuming homogeneity across the sample. This is the averaging method considered in theories, and it has the advantage of allowing estimation of the time-dependent aging of the system. Technically this can be achieved by using video recording-based particle tracking, where the number of tracers can be increased to the order of tens. 2.) Assuming ergodicity, one can switch from ensemble averaging to time averaging. This is very important for systems which are not homogeneous and for cases where only a few particles (1−5) can be observed at a time. For samples of biological origin, time averaging is more suitable because these samples are seldom homogeneous.
Calculating the time average is done by splitting up the trajectory into non-overlapping parts and averaging their displacement. Because the number of intervals decreases with increasing lag time, this method has very high error for large lag time values. Alternatively, it is also possible to split the trajectory into overlapping regions and then do the averaging. The resulting statistical errors follow a non-normal distribution, but it has been shown that using overlapping segments the resulting MSD may show improved accuracy [29, 30] when using lag time values up to about a quarter of the measurement length (N/4 for N data points). The results of a test calculation using positions randomly drawn from a normal distribution with standard deviation σ_X = σ_Y = 0.15 μm in both X and Y directions are presented in Figure 2. The MSD oscillates around the theoretical value of 4σ² = 0.09 μm², as expected, but the data calculated using overlapping intervals show visibly better accuracy.
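The time-averaged MSD with overlapping or non-overlapping pairs can be sketched as follows (an illustrative reimplementation, not the package's MSD() function):

```python
import numpy as np

def msd_time_average(x, y, max_lag, overlap=True):
    """Time-averaged 2D MSD: for each lag, average the squared
    displacements over overlapping (stride 1) or non-overlapping
    (stride = lag) windows."""
    lags = np.arange(1, max_lag + 1)
    out = np.empty(lags.size)
    for i, lag in enumerate(lags):
        starts = np.arange(0, x.size - lag, 1 if overlap else lag)
        dx = x[starts + lag] - x[starts]
        dy = y[starts + lag] - y[starts]
        out[i] = np.mean(dx**2 + dy**2)
    return lags, out

# Positions drawn independently from a normal distribution with
# sigma = 0.15 um should give a flat MSD of 4*sigma**2 = 0.09 um^2:
rng = np.random.default_rng(1)
x = rng.normal(0.0, 0.15, 20000)
y = rng.normal(0.0, 0.15, 20000)
lags, m = msd_time_average(x, y, max_lag=5)
print(np.allclose(m, 0.09, atol=0.01))        # -> True
```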
msd = MSD(positionlist, tau=0)
The second parameter (tau) is used to generate the lag time values (steps) for which the MSD is calculated. This parameter can take various values: if an array or a list of integer values is given, those are used as index step values. If a single integer between 0 and the number of positions is given, that many index steps (lag time values) are generated in an equidistant manner. If nothing, or 0, is given, the values 1…N/4 are used.
Each tau value results in M = M(tau) pairs, where the step r[i + tau]−r[i] is calculated. M also depends on whether the set was generated using overlapping intervals (if overlap = True is set).
If an array of time values (in seconds) is provided for the position data using the optional tvector parameter, the algorithm will check the time step between each data pair used. Calculating the mean value of these time steps and using a relative error, every value outside mean·(1 ± tolerance) is ignored (by default tolerance = 0.1). This process eliminates jumps in the data caused by computer latency during recording.
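The latency filter described above can be sketched as (a hypothetical helper, not the package's internal code):

```python
import numpy as np

def timestep_mask(dt, tolerance=0.1):
    """Keep only time steps lying within mean*(1 - tolerance) ...
    mean*(1 + tolerance); steps outside this band are ignored."""
    m = dt.mean()
    return np.abs(dt - m) <= tolerance * m

# One long frame (computer latency) among regular 0.04 s steps is dropped:
dt = np.array([0.04] * 99 + [0.40])
mask = timestep_mask(dt)
print(mask.sum())          # -> 99
```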
The function returns a dictionary containing “tau”, “dtau”, “MSD”, “DMSD” keys. If the time values were not provided, then “tau” holds the index steps between the positions and “dtau” is not used.
J = MSD_to_J(msd, t0=0.1, tend=150, T=25.0, a=1.0)
The calculation requires a dict structure (msd) having the time values under the “tau” key and the MSD values under the “MSD” key (error values are optional). Further parameters are the temperature T of the experiment in degrees Celsius, the radius a of the applied tracer particle in micrometers, and optionally the dimensionality of the motion D (denoted N_D in equation (1)), which is set to D = 2 by default.
Calculating the frequency dependent shear modulus from J(t) with the numerical method proposed by Evans et al. requires extrapolated values to the zero time point and to infinite time values. These values are estimated here, allowing the user to override them before being used to calculate the complex shear modulus G ∗(ω). The zero time value J 0 = J(t = 0) is extrapolated from a linear fit in the t < t0 region, and the end extrapolation is obtained from a linear fit to the tend < t part. The slope of the extrapolated end part is 1/η, where η is the steady state viscosity .
The function returns a new dict containing: “J” (in 1/Pa), “tau”, “eta”, “J0”, “const”, “dJ”, and the fit parameters “a0”, “b0” for the first part and “a1”, “b1” for the end part, where the linear equation J = a_i t + b_i (i = 0, 1) holds.
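A minimal sketch of this conversion and of the two end fits, assuming the standard generalized Stokes-Einstein form J(t) = 3πa⟨Δr²(t)⟩/(N_D k_B T) for equation (1) (this is an illustration, not the package's MSD_to_J()):

```python
import numpy as np

kB = 1.380649e-23                              # Boltzmann constant, J/K

def creep_compliance(msd_um2, T_celsius=25.0, a_um=1.0, Nd=2):
    """J(t) = 3*pi*a*<dr^2(t)> / (Nd*kB*T), in 1/Pa."""
    T = T_celsius + 273.15
    return 3.0 * np.pi * (a_um * 1e-6) * (msd_um2 * 1e-12) / (Nd * kB * T)

def end_fits(tau, J, t0=0.1, tend=150.0):
    """Linear fits J = a_i*t + b_i to the t < t0 and t > tend regions;
    the first intercept estimates J0 = J(0), the second slope gives 1/eta."""
    a0, b0 = np.polyfit(tau[tau < t0], J[tau < t0], 1)
    a1, b1 = np.polyfit(tau[tau > tend], J[tau > tend], 1)
    return {"J0": b0, "eta": 1.0 / a1, "a0": a0, "b0": b0, "a1": a1, "b1": b1}

print(creep_compliance(np.array([0.09]))[0])   # about 103 1/Pa

# A purely viscous sample, J(t) = t/eta with eta = 2 Pa s, is recovered
# exactly by the end fit:
tau = np.logspace(-2, 2.5, 200)                # 0.01 s ... ~316 s
fits = end_fits(tau, tau / 2.0)
print(round(fits["eta"], 6))                   # -> 2.0
```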
While the connection presented by equation (2) is simple, there is a major problem with determining J̃(ω) numerically. It is well known in numerical analysis that applying a numerical Fourier transform amplifies the experimental noise enormously. In microrheology, there are four commonly applied methods to address this problem. The first two address the noise of the Fourier transform directly, by averaging or by fitting; the latter two were suggested in the last decade to improve the transform itself [6, 9, 31–33].
In a homogeneous system, where multiple particles can be tracked, converting their MSDs to creep compliance and then to G∗(ω) using a discrete Fourier transform allows one to average the converted values and thereby decrease the noise [32, 33]. For cases when the creep compliance can be modeled in an analytical form, the Fourier transform of the fitted analytical function may be calculated and used to estimate G∗(ω).
As we have discussed above, samples of biological origin are often not homogeneous and their MSD does not follow a well-described analytical function. However, model calculations suggest, in agreement with experiments, that biopolymers and many polymer gels show power law behavior over various time ranges [34–36]. The third and fourth conversion approaches have been suggested for such systems in the microrheology literature. One uses a power law approximation of the MSD or the creep compliance (which we shall call the Mason method), and the other calculates a linear interpolation between the data points and applies a discrete Fourier transform to it (which we shall call the Evans method).
Because these methods have their strengths and weaknesses, we summarize them and their implementation below. Recently we have shown that the accuracy of the Evans method can be greatly improved by using local interpolation of the data in a splinelike manner, without forcing a single function to be fitted to the whole data set. This improvement is also included in the Rheology framework and is discussed below.
Usually this equation is presented containing the MSD, but the key feature, the symmetry between the time dependent creep compliance and the frequency dependent complex shear modulus, is more apparent in this form. The method generalizes this symmetry between ω ↔ t: ω = 1/t, and assumes it holds for the whole measured time range, even when the exponent α of the power law varies with time.
However, this is a highly specific case, and the symmetry does not hold for most functions. To improve the fit quality, a slightly more complicated version of this formula, based on empirical corrections, has been proposed by Dasgupta et al.
G = J_to_G_Mason(J, N=30, advanced=True, logstep=True)
The algorithm takes a creep-compliance dict (J) and fits a power function locally, in the form of equation (3), to estimate α, then calculates the complex shear modulus at ω = 1/t using equation (4). The algorithm uses a Gaussian function to weight the neighbors in the local fit, as described by Mason et al. N defines the desired number of resulting data points, which are created by equidistant sampling in linear or logarithmic space between 1/t_max…1/t_min. The logarithmic sampling is activated by the logstep parameter, which is set by default. The resulting dict contains “omega” and “f” for the circular frequency and the frequency, respectively, and a “G” array storing the corresponding complex shear moduli.
There are several further parameters to control the process; setting the advanced parameter forces the use of the method proposed by Dasgupta et al. instead of the original Mason method, and the verbose switch provides graphical feedback on how the local fitting proceeds. A test example is presented in Figure 3, using a creep compliance in the form of equation (3) converted both numerically and analytically. Because this method is accurate for power law creep compliances, the conversion matches within machine precision.
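A simplified Mason-type conversion can be sketched as follows; this illustration uses a plain log-log gradient for α(t) instead of the package's Gaussian-weighted local fits, and assumes the textbook form |G∗(ω)| = 1/(J(1/ω)Γ(1+α)) with phase πα/2:

```python
import numpy as np
from scipy.special import gamma

def mason_G(tau, J):
    """Sketch of a Mason-type conversion: alpha(t) is the local log-log
    slope of J(t), and G*(1/t) ~ exp(i*pi*alpha/2) / (J(t)*Gamma(1+alpha))."""
    alpha = np.gradient(np.log(J), np.log(tau))    # local power-law exponent
    mag = 1.0 / (J * gamma(1.0 + alpha))
    return 1.0 / tau, mag * np.exp(1j * np.pi * alpha / 2.0)

# For a purely viscous sample J(t) = t/eta, this reduces to G* = i*omega*eta:
tau = np.logspace(-2, 2, 50)
omega, G = mason_G(tau, tau / 2.0)
print(np.allclose(G.imag, 2.0 * omega))        # -> True
```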
J 0 and η are already estimated using the linear fits in the MSD_to_J() function. Using equation (5) is straightforward and allows for the calculation of G∗(ω) at any ω values. A natural selection of a suitable frequency range would be from 0 up to the Nyquist limit (in units of 1/s) in N/2 steps, as is common for the discrete Fourier transform of equally sampled data. The corresponding function is:
G = J_to_G(J)
There are various details worth mentioning about this method which may affect the accuracy of the result in general cases. It is clear from equation (5) that the method is sensitive to the A_{k+1} − A_k terms, which, in extreme cases, may be either very small for a nearly linear part of J(t) or very large for sudden jumps in the experimental data. In order to reduce round-off errors, one may eliminate the close-to-zero values (when |A_{k+1} − A_k| < ε) by activating the filter parameter.
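For illustration, the conversion can be sketched assuming the formula of Evans et al. (Phys. Rev. E 80, 012501, 2009); the package's J_to_G() may differ in details such as the filter parameter:

```python
import numpy as np

def evans_G(t, J, J0, eta, omega):
    """Evans-type conversion of a sampled creep compliance to G*(omega).
    t, J: sampled creep compliance; J0: extrapolated J(0);
    eta: steady-state viscosity; omega: circular frequencies."""
    A = np.diff(J) / np.diff(t)                # piecewise-linear slopes A_k
    G = np.empty(omega.size, dtype=complex)
    for n, w in enumerate(omega):
        denom = (1j * w * J0
                 + (1.0 - np.exp(-1j * w * t[0])) * (J[0] - J0) / t[0]
                 + np.exp(-1j * w * t[-1]) / eta
                 + np.sum(A * (np.exp(-1j * w * t[:-1])
                               - np.exp(-1j * w * t[1:]))))
        G[n] = 1j * w / denom
    return G

# A purely viscous J(t) = t/eta converts exactly to G* = i*omega*eta,
# because the piecewise-linear sum telescopes:
t = np.linspace(0.1, 10.0, 100)
omega = np.array([1.0, 5.0, 20.0])
G = evans_G(t, t / 2.0, J0=0.0, eta=2.0, omega=omega)
print(np.allclose(G, 1j * omega * 2.0))        # -> True
```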
The conversion can be improved by increasing the bandwidth of the data, or by decreasing the frequency range where G∗(ω) is calculated (using the omax parameter). A third alternative is to use model interpolations to increase the bandwidth numerically. Because most experimental data cannot be fitted with a single analytical function for all τ values, we have developed a method to fit the MSD on consecutive intervals in a splinelike manner.
Biopolymer samples commonly show monotonic MSDs, frequently described by power laws in various time segments. This makes the power function a good candidate for the local fitting. To account for deviations from power law behavior at short time values, the fitting system also allows the use of a Kelvin-Voigt solid as an alternative model at short time scales for more elastic samples. Including a Maxwell fluid as an option was not necessary, since the Evans method provides perfect fits for linear MSDs (see Figure 4).
1. define the data range using indices i_0 = 0 and i_1 = i(t_0), containing at least 4 data points;
2. fit the function in the range i_0…i_1, calculate the squared error of each point and estimate the average error;
3. find the last point around i_1 where the pointwise error still stays within the estimated average error, and redefine i_1 as this point, i_1 = i_2 (stretching or shrinking the fitting range to a reasonable maximum);
4. recalculate the fit and errors, and store them as the current segment;
5. if the end (or a point close to the end) has been reached, finish the cycle and return the results;
6. otherwise, define the new fitting range using the scaler parameter (by default i_0…scaler × i_0);
7. return to step 2 (see Additional file 1).
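The fitting loop described above can be sketched as follows; this illustrative helper grows segments only by the scaler parameter and omits the error-driven range adjustment, and it is not the package code:

```python
import numpy as np

def segmentwise_power_fits(t, y, start=4, scaler=10.0):
    """Fit y = b * t**alpha on consecutive index ranges in log-log space,
    growing the segment boundary by 'scaler' each round."""
    fits = []
    i0, i1 = 0, start
    while True:
        i1 = min(i1, t.size)
        if i1 - i0 < 2:                        # need >= 2 points for a line
            break
        a, logb = np.polyfit(np.log(t[i0:i1]), np.log(y[i0:i1]), 1)
        fits.append({"i0": i0, "i1": i1, "alpha": a, "b": float(np.exp(logb))})
        if i1 == t.size:
            break
        i0, i1 = i1, int(scaler * i1)          # next range starts at old end
    return fits

# A global power law y = 3 * t**0.5 is recovered in every segment:
t = np.logspace(-2, 2, 100)
fits = segmentwise_power_fits(t, 3.0 * t**0.5)
print(len(fits))                               # -> 3
```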
The exact procedure calculating the new range in each step may vary depending on mode, an optional keyword parameter, allowing some control for functions which change quickly, or slowly with noise. The default method (mode=“scaler”) described above works well for most MSDs with some noise but monotonic trends. (More details are available in the help of the library and in the config.example text file in the Examples folder of the package.)
The corresponding function call is:
fits = MSD_power_fit(msd, t0=0.2, scaler=10.0)
Completing the above algorithm, in the next step the program checks where the fits of neighboring ranges would cross each other, and readjusts them to the crossing point if it lies within the union of the two ranges. This step helps to maintain a smoother approximation of the experimental data. Turning verbosity on with verbose=True, one may see details about how the algorithm operates.
The return value (fits) is a list of dicts, each containing the fitting parameters of one segment. This list can be directly used to calculate the interpolation of the original data using:
msd2 = MSD_interpolate(msd, fits, smoothing=True, Nsmoothing=10, insert=10, factor=3)
The interpolation and insertion of new points before the first data point will increase the bandwidth of the original data. The factor parameter controls the oversampling of the original data (here it is set to 3× oversampling). Inserting new data points between 0 and the first time τ 0 is specified by insert = 10.
Estimating how many new points are required during this resampling procedure is a difficult question in general. The above example uses a static approach, simply inserting factor points between every two original data points, which results in a very large equidistant data set. Alternatively we provide a dynamic method, which is controlled by an error parameter.
Using h as a minimal step size for each fitted segment, the program can dynamically interpolate the data and increase the number of points only where the creep compliance changes faster. This results in a non-equally sampled data set, which (after calculating the creep compliance) can be handled well by both the MSD_to_J() function and, subsequently, the numerical conversion method J_to_G(), resulting in an improved complex shear modulus. To eliminate further bandwidth problems, the maximal desired circular frequency ω_max can be set explicitly using the omax parameter.
The advantage of employing non-equally sampled data as the result of the dynamic interpolation becomes clear if we compare the smallest required time step produced by this method. The results indicate h_min ≈ 2 × 10⁻⁶ s, which would increase the number of data points about 5000-fold in the case of equidistant sampling. In comparison, the dynamic method increases the number of data points by only about 20%.
For data sets following different trends in different time segments (where the fitting procedure identifies multiple fitted regions), the transition between the regions can be smoothened using a linear interpolation between the two fitted functions. The range of this smoothing is defined by the NSmooth parameter. The range is identified in the original data, but then applied to the refined data set. The result is an MSD in which sudden jumps are reduced, minimizing oscillatory artifacts in the resulting complex shear modulus G∗(ω).
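The linear blending between two neighboring segment fits can be sketched as (a hypothetical helper, not the package's smoothing code):

```python
import numpy as np

def blend_segments(f_left, f_right, t, t_lo, t_hi):
    """Linearly cross-fade two fitted segment functions over the transition
    window [t_lo, t_hi] to avoid jumps at the segment boundary."""
    w = np.clip((t - t_lo) / (t_hi - t_lo), 0.0, 1.0)  # 0 -> left, 1 -> right
    return (1.0 - w) * f_left(t) + w * f_right(t)

# Two constant segments, 1.0 and 3.0, meet halfway inside the window:
t = np.array([0.0, 0.5, 1.0])
out = blend_segments(lambda s: np.full_like(s, 1.0),
                     lambda s: np.full_like(s, 3.0), t, 0.0, 1.0)
print(out)                                     # -> [1. 2. 3.]
```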
In this paper we have presented a free software solution for analyzing particle tracking data for microrheology. Our software library implements the time average calculation of the mean square displacement, with control over the time shift in the data, and conversion methods to calculate the creep compliance and the complex shear modulus. Beyond the two most common methods in the literature, we have developed a dynamic local fitting procedure which allows spline-like fitting of the MSD and improves the conversion accuracy from about 100% to about 1% in a Kelvin-Voigt model test.
Availability and requirements:
Project name: Rheology for Python
Project home page: http://launchpad.net/microrheologypy/
Operating systems: Platform independent (Linux, Windows and Mac OS X tested)
Programming language: Python 2.7
Other requirements: Numpy 1.5, Scipy 0.1, Matplotlib 1.0
License: LGPL v3
Any restrictions to use by non-academics: see license
MSD: Mean square displacement.
This work was supported by the Ministry of Science, Research, and the Arts of Baden-Württemberg (AZ:720.830-5-10a). The authors would like to thank Professor Dr. Joachim P. Spatz, Dr. Heike Boehm and the Max-Planck Society for generous financial support, and Dr. Claire Cobley for assistance.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License(http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.