{ "cells": [ { "cell_type": "markdown", "id": "bfc0c1b1-3769-4945-adaf-8336fc7117ba", "metadata": {}, "source": [ "# Comparison with pymatgen\n", "\n", "The `pymatgen` project also has [tools capable of calculating the mean-squared displacement and diffusion coefficient](https://pymatgen.org/addons#add-ons-for-analysis) from a relevant input. \n", "So why should you use `kinisi` over `pymatgen`?\n", "\n", "The simple answer is that the approach taken by `kinisi`, which is outlined in the [methodology](./methodology.html), uses a higher precision approach to estimate the diffusion coefficent and offers an accurate estimate in the variance of the mean-squared displacements and diffusion coefficient from a single simulation. \n", "\n", "In this notebook, we will compare the results from `pymatgen` and `kinisi`. \n", "First we will import the `kinisi` and `pymatgen` `DiffusionAnalyzer` classes. " ] }, { "cell_type": "code", "execution_count": null, "id": "5a662c28-a60b-4686-8c0a-6f5411408439", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import scipp as sc\n", "from kinisi.analyze import DiffusionAnalyzer as KinisiDiffusionAnalyzer\n", "from pymatgen.analysis.diffusion.analyzer \\\n", " import DiffusionAnalyzer as PymatgenDiffusionAnalyzer\n", "from pymatgen.io.vasp import Xdatcar\n", "np.random.seed(42)\n", "rng = np.random.RandomState(42)" ] }, { "cell_type": "markdown", "id": "29e4e68d-8032-4e41-bd6c-86624f663ba9", "metadata": {}, "source": [ "The `kinisi.DiffusionAnalyzer` and `pymatgen.DiffusionAnalyzer` are very similar, but pymatgen does not use `scipp` to handle dimensions. Pymatgen therefore takes the same imputs, but in in a different format, assuming fs as unit for time. We will define the pymatgen input parameters as `p_params`." ] }, { "cell_type": "code", "execution_count": null, "id": "ff13aa9f-fc85-484b-8b09-37932b45aec1", "metadata": {}, "outputs": [], "source": [ "p_params = {'specie': 'Li',\n", " 'time_step': 2.0,\n", " 'step_skip': 50\n", " }" ] }, { "cell_type": "code", "execution_count": null, "id": "759490f5-26ed-47d2-9704-9d32465efbc1", "metadata": {}, "outputs": [], "source": [ "xd = Xdatcar('./example_XDATCAR.gz')" ] }, { "cell_type": "markdown", "id": "d3028fc4-2491-491f-a978-a2a57b9ddc4a", "metadata": {}, "source": [ "We parse `Xdatcar.structures` to `pymatgen` to run the analysis. `pymatgen` also requires an additional `temperature` keyword. " ] }, { "cell_type": "code", "execution_count": null, "id": "fdb0a4aa-6ba4-48ab-9f65-b0ab2b8e15a4", "metadata": {}, "outputs": [], "source": [ "pymatgen_diff = PymatgenDiffusionAnalyzer.from_structures(\n", " xd.structures, temperature=300, **p_params)" ] }, { "cell_type": "markdown", "id": "ffe9eb41-4487-4e85-a6ab-186e6c39085c", "metadata": {}, "source": [ "We can set up the `kinisi` analysis as in this [previous example](./vasp_d.html)." ] }, { "cell_type": "code", "execution_count": null, "id": "9746e927-bda1-4256-8931-a6b310b57fa6", "metadata": {}, "outputs": [], "source": [ "params = {'specie': 'Li',\n", " 'time_step': 2.0 * sc.Unit('fs'),\n", " 'step_skip': 50 * sc.Unit('dimensionless'),\n", " 'progress': False\n", " }" ] }, { "cell_type": "code", "execution_count": null, "id": "5d8cd6e2-11d4-4926-9119-a7d81da4f3f5", "metadata": {}, "outputs": [], "source": [ "kinisi_diff = KinisiDiffusionAnalyzer.from_xdatcar(xd, **params)" ] }, { "cell_type": "markdown", "id": "cc65f185-3e64-40e9-8200-696f9366715a", "metadata": {}, "source": [ "Now we can plot the mean-squared displacement from each to check agreement." ] }, { "cell_type": "code", "execution_count": null, "id": "6c34728b-6aa1-495f-a2c5-31cba27b0b30", "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots()\n", "\n", "ax.plot(pymatgen_diff.dt, pymatgen_diff.msd, label='pymatgen')\n", "ax.plot(kinisi_diff.dt.values, kinisi_diff.msd.values, label='kinisi')\n", "ax.legend()\n", "ax.set_xlabel(f'Time / {kinisi_diff.dt.unit}')\n", "ax.set_ylabel(f'MSD / {kinisi_diff.msd.unit}')\n", "ax.set_xlim(0, None)\n", "ax.set_ylim(0, None)\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "3af71036-f616-416f-9a11-082135f00b9e", "metadata": {}, "source": [ "We can see that the results overlap almost entirely.\n", "\n", "However, this doesn't show the benefits for using `kinisi` over `pymatgen`. \n", "The first benefit is that `kinisi` will accurately estimate the variance in the observed mean-squared displacements, giving error bars for the above plot. " ] }, { "cell_type": "code", "execution_count": null, "id": "82b2379b-5dbd-4d57-937c-1b84e318bf6d", "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots()\n", "\n", "ax.errorbar(kinisi_diff.dt.values, \n", " kinisi_diff.msd.values, \n", " np.sqrt(kinisi_diff.msd.variances), \n", " c='#ff7f0e')\n", "ax.set_xlabel(f'Time / {kinisi_diff.dt.unit}')\n", "ax.set_ylabel(f'MSD / {kinisi_diff.msd.unit}')\n", "ax.set_xlim(0, None)\n", "ax.set_ylim(0, None)\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "50fa7a3d-1171-4896-a2ce-36370a7a9620", "metadata": {}, "source": [ "The second benefit is that `kinisi` will estimate the diffusion coefficient with an accurate uncertainty. \n", "`pymatgen` also estimates this uncertainty, however, `pymatgen` assumes that the data is independent and applies [weighted least squares](https://en.wikipedia.org/wiki/Weighted_least_squares). \n", "However, mean-squared displacement observations are inherently dependent (as discussed in the [thought experiment in the methodology](https://kinisi.readthedocs.io/en/latest/methodology.html#Understanding-the-correlation-between-measurements)), so `kinisi` accounts for this and applied a [generalised least squares](https://en.wikipedia.org/wiki/Generalized_least_squares) style approach. \n", "This means that the estimated variance in the diffusion coefficient from `kinisi` is accurate (while, `pymatgen` will heavily underestimate the value) and given the [BLUE](https://en.wikipedia.org/wiki/Gauss–Markov_theorem#Generalized_least_squares_estimator) nature of the GLS approach, `kinisi` has a higher probability of determining a value for the diffusion coefficient closer to the true diffusion coefficient. " ] }, { "cell_type": "code", "execution_count": null, "id": "b53004e8-4550-4a7d-9b7f-42c478529496", "metadata": {}, "outputs": [], "source": [ "start_of_diffusion = 3000 * sc.Unit('fs')\n", "kinisi_diff.diffusion(start_of_diffusion, progress=False, random_state=rng)" ] }, { "cell_type": "code", "execution_count": null, "id": "f40d9b74-15d8-4971-90b2-4a8771fb7698", "metadata": {}, "outputs": [], "source": [ "from uncertainties import ufloat" ] }, { "cell_type": "code", "execution_count": null, "id": "65bdd80a-9396-40e3-b45b-a09862a79534", "metadata": {}, "outputs": [], "source": [ "print('D from pymatgen:', \n", " ufloat(pymatgen_diff.diffusivity, pymatgen_diff.diffusivity_std_dev))" ] }, { "cell_type": "code", "execution_count": null, "id": "4933f40b-1b26-4de5-be07-f34e7c108b27", "metadata": {}, "outputs": [], "source": [ "kinisi_diff.D" ] }, { "cell_type": "markdown", "id": "555bc479-d641-47d7-8421-5dea4ab14555", "metadata": {}, "source": [ "The comparison between weighted and generalised least squared estimators will be discussed in full in a future publication covering the methodology of `kinisi`." ] } ], "metadata": { "kernelspec": { "display_name": "kinisi-dev", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.13" } }, "nbformat": 4, "nbformat_minor": 5 }