kinisi.diffusion#

The modules is focused on tools for the evaluation of the mean squared displacement and resulting diffusion coefficient from a material.

class kinisi.diffusion.Bootstrap(delta_t, disp_3d, n_o, sub_sample_dt=1, dimension='xyz')#

Bases: object

The top-level class for bootstrapping.

Parameters:
  • delta_t (ndarray) – An array of the timestep values.

  • disp_3d (List[ndarray]) – A list of arrays, where each array has the axes [atom, displacement observation, dimension]. There is one array in the list for each delta_t value. Note: it is necessary to use a list of arrays as the number of observations is not necessary the same at each data point.

  • sub_sample_dt (int) – The frequency in observations to be sampled. Optional, default is 1 (every observation).

  • progress – Show tqdm progress for sampling. Optional, default is True.

to_dict()#
Return type:

dict

Returns:

Dictionary description of the Bootstrap.

classmethod from_dict(my_dict)#

Generate a Bootstrap object from a dictionary.

Parameters:

my_dict (dict) – The input dictionary.

Return type:

Bootstrap

Returns:

New :py:class`Bootstrap` object.

property dt: ndarray#
Returns:

Timestep values that were resampled.

property n: ndarray#
Returns:

The mean MSD/MSTD/MSCD, as determined from the bootstrap resampling process, in units Å:sup:2.

property s: ndarray#
Returns:

The MSD/MSTD/MSCD standard deviation, as determined from the bootstrap resampling process, in units Å:sup:2.

property v: ndarray#
Returns:

The MSD/MSTD/MSCD variance as determined from the bootstrap resampling process, in units Å:sup:4.

property euclidian_displacements: List[Distribution]#
Returns:

Displacements between particles at each dt.

property ngp: ndarray#
Returns:

Non-Gaussian parameter as a function of dt.

property n_i: ndarray#
Returns:

The number of independent trajectories as a function of dt.

property intercept: Distribution | None#
Returns:

The estimated intercept. Note that if fit_intercept is False is the relavent method call, then this is None

property covariance_matrix: ndarray#
Returns:

The covariance matrix for the trajectories.

static iterator(progress, loop)#

Get the iteration object, using tqdm as appropriate.

Parameters:
  • progress (bool) – Should tqdm be used to give a progress bar.

  • loop (Union[list, range]) – The object that should be looped over.

Return type:

Union[tqdm, range]

Returns:

Iterator object.

static sample_until_normal(array, n_samples, n_resamples, max_resamples, alpha=0.001, random_state=None)#

Resample from the distribution until a normal distribution is obtained or a maximum is reached.

Args: :type array: ndarray :param array: The array to sample from. :type n_samples: float :param n_samples: Number of samples. :param r_resamples: Number of resamples to perform initially. :type max_resamples: int :param max_resamples: The maximum number of resamples to perform. :type alpha: float :param alpha: Level that p-value should be below in scipy.stats.normaltest() for the distribution

to be normal. Optional, default is 1e-3.

Parameters:

random_state (Optional[RandomState]) – A RandomState object to be used to ensure reproducibility. Optional, default is None.

Return type:

Distribution

Returns:

The resampled distribution.

static ngp_calculation(d_squared)#

Determine the non-Gaussian parameter, from S. Song et al, “Transport dynamics of complex fluids” (2019): 10.1073/pnas.1900239116

Parameters:

d_squared (ndarray) – Squared displacement values.

Return type:

float

Returns:

Value of non-Gaussian parameter.

bootstrap_GLS(start_dt, model=True, fit_intercept=True, n_samples=1000, n_walkers=32, n_burn=500, thin=10, progress=True, random_state=None)#

Use the covariance matrix estimated from the resampled values to estimate the gradient and intercept using a generalised least squares approach.

Parameters:
  • start_dt (float) – The starting time for the analysis to find the diffusion coefficient. This should be the start of the diffusive regime in the simulation.

  • model (bool) – Use the model for the covariance matrix, if False this may lead to numerical instability. Optional, default is True.

  • fit_intercept (bool) – Should the intercept of the diffusion relationship be fit. Optional, default is True.

  • n_samples (int) – Number of samples of the Gaussian process to perform. Optional, default is 1000.

  • n_walkers (int) – Number of MCMC walkers to use. Optional, default is 32.

  • n_burn (int) – Number of burn in samples (these allow the sampling to settle). Optional, default is 500.

  • thin (int) – Use only every thin samples for the MCMC sampler. Optional, default is 10.

  • progress (bool) – Show tqdm progress for sampling. Optional, default is True.

  • random_state (Optional[RandomState]) – A RandomState object to be used to ensure reproducibility. Optional, default is None.

generate_covariance_matrix(diff_regime)#

Generate the covariance matrix, including the modelling and finding the closest matrix that is positive definite.

Parameters:

diff_regime (int) – The index of the point where the analysis should begin.

Returns:

Modelled covariance matrix that is positive definite.

diffusion(start_dt, **kwargs)#

Use the bootstrap-GLS method to determine the diffusivity for the system. Keyword arguments will be passed of the bootstrap_GLS() method.

Parameters:

start_dt (float) – The starting time for the analysis to find the diffusion coefficient. This should be the start of the diffusive regime in the simulation.

property D: Distribution | None#

An alias for the diffusion coefficient Distribution.

Returns:

Diffusion coefficient, with units of cm:sup:2`s:sup:-1`.

jump_diffusion(start_dt, **kwargs)#

Use the bootstrap-GLS method to determine the jump diffusivity for the system. Keyword arguments will be passed of the bootstrap_GLS() method.

Parameters:

start_dt (float) – The starting time for the analysis to find the diffusion coefficient. This should be the start of the diffusive regime in the simulation.

property D_J: Distribution | None#

Alias for the jump diffusion coefficient Distribution.

Returns:

Jump diffusion coefficient, with units of cm:sup:2`s:sup:-1`.

conductivity(start_dt, temperature, volume, **kwargs)#

Use the bootstrap-GLS method to determine the ionic conductivity for the system, in units of mScm:sup:-1. Keyword arguments will be passed of the bootstrap_GLS() method.

Parameters:
  • start_dt (float) – The starting time for the analysis to find the diffusion coefficient. This should be the start of the diffusive regime in the simulation.

  • temperature (float) – System temperature, in Kelvin.

  • volume (float) – System volume, in Å^{3}.

property sigma: Distribution | None#
Returns:

The estimated conductivity, based on the generalised least squares approach, with units mScm:sup:-1.

posterior_predictive(n_posterior_samples=None, n_predictive_samples=256, progress=True)#

Sample the posterior predictive distribution. The shape of the resulting array will be (n_posterior_samples * n_predictive_samples, start_dt).

Parameters:
  • n_posterior_samples (Optional[int]) – Number of samples from the posterior distribution. Optional, default is the number of posterior samples.

  • n_predictive_samples (int) – Number of random samples per sample from the posterior distribution. Optional, default is 256.

  • progress (bool) – Show tqdm progress for sampling. Optional, default is True.

Return type:

ndarray

Returns:

Samples from the posterior predictive distribution.

class kinisi.diffusion.MSDBootstrap(delta_t, disp_3d, n_o, sub_sample_dt=1, bootstrap=False, block=False, n_resamples=1000, max_resamples=10000, dimension='xyz', alpha=0.001, random_state=None, progress=True)#

Bases: Bootstrap

Perform a bootstrap resampling to obtain accurate estimates for the mean and uncertainty for the mean squared displacements.

Parameters:
  • delta_t (ndarray) – An array of the timestep values, units of ps

  • disp_3d (List[ndarray]) – A list of arrays, where each array has the axes [atom, displacement observation, dimension]. There is one array in the list for each delta_t value. Note: it is necessary to use a list of arrays as the number of observations is not necessary the same at each data point.

  • n_o (ndarray) – Number of statistically independent observations of the MSD at each timestep.

  • sub_sample_dt (int) – The frequency in observations to be sampled. Default is 1 (every observation)

  • bootstrap (bool) – Should bootstrap resampling be used to estimate the observed MSD distribution. Optional, default is False.

  • block (bool) – Should the blocking method be used to estimate the variance, if False an approximation is used to estimate the variance. Optional, default is False.

  • n_resamples (int) – The initial number of resamples to be performed. Default is 1000

  • max_resamples (int) – The max number of resamples to be performed by the distribution is assumed to be normal. This is present to allow user control over the time taken for the resampling to occur. Default is 100000

  • dimension (str) – Dimension/s to find the displacement along, this should be some subset of ‘xyz’ indicating the axes of interest. Optional, defaults to ‘xyz’.

  • alpha (float) – Value that p-value for the normal test must be greater than to accept. Default is 1e-3

:param random_stateA RandomState object to be used to ensure reproducibility. Default

is None

Parameters:

progress (bool) – Show tqdm progress for sampling. Default is True

class kinisi.diffusion.MSTDBootstrap(delta_t, disp_3d, n_o, sub_sample_dt=1, bootstrap=False, block=False, n_resamples=1000, max_resamples=10000, dimension='xyz', alpha=0.001, random_state=None, progress=True)#

Bases: Bootstrap

Perform a bootstrap resampling to obtain accurate estimates for the mean and uncertainty for the total mean squared displacements.

Parameters:
  • delta_t (ndarray) – An array of the timestep values.

  • disp_3d (List[ndarray]) – A list of arrays, where each array has the axes [atom, displacement observation, dimension]. There is one array in the list for each delta_t value. Note: it is necessary to use a list of arrays as the number of observations is not necessary the same at each data point.

  • n_o (ndarray) – Number of statistically independent observations of the MSD at each timestep.

  • sub_sample_dt (int) – The frequency in observations to be sampled. Optional, default is 1 (every observation).

  • bootstrap (bool) – Should bootstrap resampling be used to estimate the observed MSD distribution. Optional, default is False.

  • block (bool) – Should the blocking method be used to estimate the variance, if False an approximation is used to estimate the variance. Optional, default is False.

  • n_resamples (int) – The initial number of resamples to be performed. Optional, default is 1000

  • max_resamples (int) – The max number of resamples to be performed by the distribution is assumed to be normal. This is present to allow user control over the time taken for the resampling to occur. Optional, default is 100000

  • dimension (str) – Dimension/s to find the displacement along, this should be some subset of ‘xyz’ indicating the axes of interest. Optional, defaults to ‘xyz’.

  • alpha (float) – Value that p-value for the normal test must be greater than to accept. Optional, default is 1e-3

:param random_stateA RandomState object to be used to ensure reproducibility. Optional,

default is None

Parameters:

progress (bool) – Show tqdm progress for sampling. Optional, default is True

class kinisi.diffusion.MSCDBootstrap(delta_t, disp_3d, ionic_charge, n_o, sub_sample_dt=1, bootstrap=False, block=False, n_resamples=1000, max_resamples=10000, dimension='xyz', alpha=0.001, random_state=None, progress=True)#

Bases: Bootstrap

Perform a bootstrap resampling to obtain accurate estimates for the mean and uncertainty for the mean squared charge displacements.

Parameters:
  • delta_t (ndarray) – An array of the timestep values.

  • disp_3d (List[ndarray]) – A list of arrays, where each array has the axes [atom, displacement observation, dimension]. There is one array in the list for each delta_t value. Note: it is necessary to use a list of arrays as the number of observations is not necessary the same at each data point.

  • ionic_charge (Union[ndarray, int]) – The charge on the mobile ions, either an array with a value for each ion or a scalar if all values are the same.

  • n_o (ndarray) – Number of statistically independent observations of the MSD at each timestep.

  • bootstrap (bool) – Should bootstrap resampling be used to estimate the observed MSD distribution. Optional, default is False.

  • block (bool) – Should the blocking method be used to estimate the variance, if False an approximation is used to estimate the variance. Optional, default is False.

  • sub_sample_dt (int) – The frequency in observations to be sampled. Optional, default is 1 (every observation).

  • n_resamples (int) – The initial number of resamples to be performed. Optional, default is 1000.

  • max_resamples (int) – The max number of resamples to be performed by the distribution is assumed to be normal. This is present to allow user control over the time taken for the resampling to occur. Optional, default is 100000.

  • dimension (str) – Dimension/s to find the displacement along, this should be some subset of ‘xyz’ indicating the axes of interest. Optional, defaults to ‘xyz’.

  • alpha (float) – Value that p-value for the normal test must be greater than to accept. Optional, default is 1e-3.

  • random_state (Optional[RandomState]) – A RandomState object to be used to ensure reproducibility. Optional, default is None.

  • progress (bool) – Show tqdm progress for sampling. Optional, default is True.