Comparison with pymatgen#
The pymatgen
project also has tools capable of calculating the mean-squared displacement and diffusion coefficient from a relevant input. So why should you use kinisi
over pymatgen
?
The simple answer is that the approach taken by kinisi
, which is outlined in the methodology, uses a higher precision approach to estimate the diffusion coefficent and offers an accurate estimate in the variance of the mean-squared displacements and diffusion coefficient from a single simulation.
In this notebook, we will compare the results from pymatgen
and kinisi
. First we will import the kinisi
and pymatgen
DiffusionAnalyzer
classes.
[1]:
import numpy as np
from kinisi.analyze import DiffusionAnalyzer as KinisiDiffusionAnalyzer
from pymatgen.analysis.diffusion.analyzer import DiffusionAnalyzer as PymatgenDiffusionAnalyzer
from pymatgen.io.vasp import Xdatcar
np.random.seed(42)
The kinisi.DiffusionAnalyzer
API was based on the pymatgen
equivalent, therefore, the two take the same inputs and can parse the Xdatcar.structures
.
[2]:
p_params = {'specie': 'Li',
'time_step': 2.0,
'step_skip': 50
}
[3]:
xd = Xdatcar('./example_XDATCAR.gz')
/home/docs/checkouts/readthedocs.org/user_builds/kinisi/envs/latest/lib/python3.12/site-packages/pymatgen/io/vasp/outputs.py:4496: EncodingWarning: We strongly encourage explicit `encoding`, and we would use UTF-8 by default as per PEP 686
file_len = sum(1 for _ in zopen(filename, mode="rt"))
/home/docs/checkouts/readthedocs.org/user_builds/kinisi/envs/latest/lib/python3.12/site-packages/pymatgen/io/vasp/outputs.py:4499: EncodingWarning: We strongly encourage explicit `encoding`, and we would use UTF-8 by default as per PEP 686
with zopen(filename, mode="rt") as file:
We can then run both the pymagten
analysis and the kinisi
analysis (the pymatgen
requires and additional temperature
keyword which is not used in this example).
[4]:
pymatgen_diff = PymatgenDiffusionAnalyzer.from_structures(
xd.structures, temperature=300, **p_params)
[5]:
p_params['progress'] = False
u_params = {'progress': False}
kinisi_diff = KinisiDiffusionAnalyzer.from_pymatgen(
xd.structures, parser_params=p_params, uncertainty_params=u_params)
/home/docs/checkouts/readthedocs.org/user_builds/kinisi/envs/latest/lib/python3.12/site-packages/kinisi/parser.py:116: UserWarning: Converting triclinic cell to orthorhombic this may have unexpected results. Triclinic a, b, c are not equilivalent to orthorhombic x, y, z.
warnings.warn(
/home/docs/checkouts/readthedocs.org/user_builds/kinisi/envs/latest/lib/python3.12/site-packages/kinisi/parser.py:437: UserWarning: UserWarning: Be aware that the expected input unit for 'time_step' is femtoseconds, not picoseconds.
warnings.warn("UserWarning: Be aware that the expected input unit for 'time_step' is femtoseconds, not picoseconds.")
Now we can plot the mean-squared displacement from each to check agreement, the pymatgen
time units are femtoseconds so these are adjusted.
[6]:
import matplotlib.pyplot as plt
[7]:
plt.plot(pymatgen_diff.dt / 1000, pymatgen_diff.msd, label='pymatgen')
plt.plot(kinisi_diff.dt, kinisi_diff.msd, label='kinisi')
plt.legend()
plt.ylabel('MSD/Å$^2$')
plt.xlabel(r'$\Delta t$/ps')
plt.show()
We can see that the results overlap almost entirely.
However, this doesn’t show the benefits for using kinisi
over pymatgen
. The first benefit is that kinisi
will accurately estimate the variance in the observed mean-squared displacements, giving error bars for the above plot.
[8]:
plt.errorbar(kinisi_diff.dt, kinisi_diff.msd, kinisi_diff.msd_std, c='#ff7f0e')
plt.ylabel('MSD/Å$^2$')
plt.xlabel(r'$\Delta t$/ps')
plt.show()
The second benefit is that kinisi
will estimate the diffusion coefficient with an accurate uncertainty. pymatgen
also estimates this uncertainty, however, pymatgen
assumes that the data is independent and applies weighted least squares. However, mean-squared displacement observations are inherently dependent (as discussed in the thought experiment in the
methodology), so kinisi
accounts for this and applied a generalised least squares style approach. This means that the estimated variance in the diffusion coefficient from kinisi
is accurate (while, pymatgen
will heavily underestimate the value) and given the
BLUE nature of the GLS approach, kinisi
has a higher probability of determining a value for the diffusion coefficient closer to the true diffusion coefficient.
[9]:
kinisi_diff.diffusion(kinisi_diff.ngp_max, {'progress': False})
[10]:
from uncertainties import ufloat
[11]:
print('D from pymatgen:',
ufloat(pymatgen_diff.diffusivity, pymatgen_diff.diffusivity_std_dev))
D from pymatgen: (1.33307+/-0.00008)e-05
[12]:
print('D from kinisi:',
ufloat(np.mean(kinisi_diff.D), np.std(kinisi_diff.D, ddof=1)))
D from kinisi: (1.53+/-0.07)e-05
The comparison between weighted and generalised least squared estimators will be discussed in full in a future publication covering the methodology of kinisi
.