{ "cells": [ { "cell_type": "markdown", "id": "bfc0c1b1-3769-4945-adaf-8336fc7117ba", "metadata": {}, "source": [ "# Comparison with pymatgen\n", "\n", "The `pymatgen` project also has [tools capable of calculating the mean-squared displacement and diffusion coefficient](https://pymatgen.org/addons#add-ons-for-analysis) from a relevant input. \n", "So why should you use `kinisi` over `pymatgen`?\n", "\n", "The simple answer is that the approach taken by `kinisi`, which is outlined in the [methodology](./methodology.html), uses a higher precision approach to estimate the diffusion coefficent and offers an accurate estimate in the variance of the mean-squared displacements and diffusion coefficient from a single simulation. \n", "\n", "In this notebook, we will compare the results from `pymatgen` and `kinisi`. \n", "First we will import the `kinisi` and `pymatgen` `DiffusionAnalyzer` classes. " ] }, { "cell_type": "code", "execution_count": null, "id": "9a4f9657-1ac8-4e7c-90d2-3970bce9b318", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from kinisi.analyze import DiffusionAnalyzer as KinisiDiffusionAnalyzer\n", "from pymatgen.analysis.diffusion.analyzer import DiffusionAnalyzer as PymatgenDiffusionAnalyzer\n", "from pymatgen.io.vasp import Xdatcar\n", "np.random.seed(42)" ] }, { "cell_type": "markdown", "id": "29e4e68d-8032-4e41-bd6c-86624f663ba9", "metadata": {}, "source": [ "The `kinisi.DiffusionAnalyzer` API was based on the `pymatgen` equivalent, therefore, the two take the same inputs and can parse the `Xdatcar.structures`. " ] }, { "cell_type": "code", "execution_count": null, "id": "ff13aa9f-fc85-484b-8b09-37932b45aec1", "metadata": {}, "outputs": [], "source": [ "p_params = {'specie': 'Li',\n", " 'time_step': 2.0,\n", " 'step_skip': 50\n", " }" ] }, { "cell_type": "code", "execution_count": null, "id": "759490f5-26ed-47d2-9704-9d32465efbc1", "metadata": {}, "outputs": [], "source": [ "xd = Xdatcar('./example_XDATCAR.gz')" ] }, { "cell_type": "markdown", "id": "4fa3cfb3-bd45-407a-a0d7-6ad4ffe3e5a8", "metadata": {}, "source": [ "We can then run both the `pymagten` analysis and the `kinisi` analysis (the `pymatgen` requires and additional `temperature` keyword which is not used in this example). " ] }, { "cell_type": "code", "execution_count": null, "id": "fdb0a4aa-6ba4-48ab-9f65-b0ab2b8e15a4", "metadata": {}, "outputs": [], "source": [ "pymatgen_diff = PymatgenDiffusionAnalyzer.from_structures(\n", " xd.structures, temperature=300, **p_params)" ] }, { "cell_type": "code", "execution_count": null, "id": "5d8cd6e2-11d4-4926-9119-a7d81da4f3f5", "metadata": {}, "outputs": [], "source": [ "p_params['progress'] = False\n", "u_params = {'progress': False}\n", "\n", "kinisi_diff = KinisiDiffusionAnalyzer.from_pymatgen(\n", " xd.structures, parser_params=p_params, uncertainty_params=u_params)" ] }, { "cell_type": "markdown", "id": "cc65f185-3e64-40e9-8200-696f9366715a", "metadata": {}, "source": [ "Now we can plot the mean-squared displacement from each to check agreement, the `pymatgen` time units are femtoseconds so these are adjusted. " ] }, { "cell_type": "code", "execution_count": null, "id": "f3aa84e8-0876-42e3-92c0-28164ac2dd00", "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": null, "id": "6c34728b-6aa1-495f-a2c5-31cba27b0b30", "metadata": {}, "outputs": [], "source": [ "plt.plot(pymatgen_diff.dt / 1000, pymatgen_diff.msd, label='pymatgen')\n", "plt.plot(kinisi_diff.dt, kinisi_diff.msd, label='kinisi')\n", "plt.legend()\n", "plt.ylabel('MSD/Å$^2$')\n", "plt.xlabel(r'$\\Delta t$/ps')\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "3af71036-f616-416f-9a11-082135f00b9e", "metadata": {}, "source": [ "We can see that the results overlap almost entirely.\n", "\n", "However, this doesn't show the benefits for using `kinisi` over `pymatgen`. \n", "The first benefit is that `kinisi` will accurately estimate the variance in the observed mean-squared displacements, giving error bars for the above plot. " ] }, { "cell_type": "code", "execution_count": null, "id": "82b2379b-5dbd-4d57-937c-1b84e318bf6d", "metadata": {}, "outputs": [], "source": [ "plt.errorbar(kinisi_diff.dt, kinisi_diff.msd, kinisi_diff.msd_std, c='#ff7f0e')\n", "plt.ylabel('MSD/Å$^2$')\n", "plt.xlabel(r'$\\Delta t$/ps')\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "50fa7a3d-1171-4896-a2ce-36370a7a9620", "metadata": {}, "source": [ "The second benefit is that `kinisi` will estimate the diffusion coefficient with an accurate uncertainty. \n", "`pymatgen` also estimates this uncertainty, however, `pymatgen` assumes that the data is independent and applies [weighted least squares](https://en.wikipedia.org/wiki/Weighted_least_squares). \n", "However, mean-squared displacement observations are inherently dependent (as discussed in the [thought experiment in the methodology](https://kinisi.readthedocs.io/en/latest/methodology.html#Understanding-the-correlation-between-measurements)), so `kinisi` accounts for this and applied a [generalised least squares](https://en.wikipedia.org/wiki/Generalized_least_squares) style approach. \n", "This means that the estimated variance in the diffusion coefficient from `kinisi` is accurate (while, `pymatgen` will heavily underestimate the value) and given the [BLUE](https://en.wikipedia.org/wiki/Gauss–Markov_theorem#Generalized_least_squares_estimator) nature of the GLS approach, `kinisi` has a higher probability of determining a value for the diffusion coefficient closer to the true diffusion coefficient. " ] }, { "cell_type": "code", "execution_count": null, "id": "b53004e8-4550-4a7d-9b7f-42c478529496", "metadata": {}, "outputs": [], "source": [ "kinisi_diff.diffusion(kinisi_diff.ngp_max, {'progress': False})" ] }, { "cell_type": "code", "execution_count": null, "id": "f40d9b74-15d8-4971-90b2-4a8771fb7698", "metadata": {}, "outputs": [], "source": [ "from uncertainties import ufloat" ] }, { "cell_type": "code", "execution_count": null, "id": "65bdd80a-9396-40e3-b45b-a09862a79534", "metadata": {}, "outputs": [], "source": [ "print('D from pymatgen:', \n", " ufloat(pymatgen_diff.diffusivity, pymatgen_diff.diffusivity_std_dev))" ] }, { "cell_type": "code", "execution_count": null, "id": "4933f40b-1b26-4de5-be07-f34e7c108b27", "metadata": {}, "outputs": [], "source": [ "print('D from kinisi:', \n", " ufloat(np.mean(kinisi_diff.D), np.std(kinisi_diff.D, ddof=1)))" ] }, { "cell_type": "markdown", "id": "555bc479-d641-47d7-8421-5dea4ab14555", "metadata": {}, "source": [ "The comparison between weighted and generalised least squared estimators will be discussed in full in a future publication covering the methodology of `kinisi`." ] } ], "metadata": { "kernelspec": { "display_name": "kinisi", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.13" } }, "nbformat": 4, "nbformat_minor": 5 }