{ "cells": [ { "cell_type": "markdown", "id": "yeh-hummer-intro", "metadata": {}, "source": [ "# Finite-size corrections with Yeh-Hummer\n", "\n", "When performing molecular dynamics simulations with periodic boundary conditions, the calculated diffusion coefficients are affected by finite-size effects. The `kinisi` package includes tools to apply the Yeh-Hummer finite-size correction, which extrapolates the diffusion coefficient to the infinite system size limit.\n", "\n", "In this tutorial, we will demonstrate how to use the `YehHummer` class to correct diffusion coefficients obtained from simulations with different box sizes.\n", "\n", "
\n", "\n", "**Note**\n", "\n", "The Yeh-Hummer correction is based on the work by Yeh & Hummer (J. Phys. Chem. B 2004, 108, 15873-15879). The method assumes that the finite-size effects follow a linear relationship with the inverse box length.\n", "
" ] }, { "cell_type": "code", "execution_count": null, "id": "imports", "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import scipp as sc\n", "\n", "from kinisi.yeh_hummer import YehHummer\n", "\n", "np.random.seed(42)" ] }, { "cell_type": "markdown", "id": "data-setup", "metadata": {}, "source": [ "## Setting up the data\n", "\n", "For this tutorial, we will use the TIP3P water diffusion data from the original Yeh & Hummer paper. This data represents diffusion coefficients calculated from molecular dynamics simulations of water at different box sizes.\n", "\n", "For obtaining diffusion coefficients from your simulations, you can use the `kinisi.analyze.DiffusionAnalyzer` (from `DiffusionAnalyzer.D`) by following the steps outlined in the [Diffusion coefficient from a VASP file](./vasp_d.html) tutorial." ] }, { "cell_type": "code", "execution_count": null, "id": "tip4p-data", "metadata": {}, "outputs": [], "source": [ "# TIP3P water data from Yeh & Hummer paper\n", "box_lengths = np.array([18.58, 23.42, 29.51, 37.19, 46.86]) # Angstroms\n", "D_values = np.array([4.884e-5, 5.123e-5, 5.315e-5, 5.466e-5, 5.590e-5]) # cm^2/s\n", "D_errors = np.array([0.032e-5, 0.027e-5, 0.014e-5, 0.011e-5, 0.013e-5]) # cm^2/s" ] }, { "cell_type": "markdown", "id": "dataarray-creation", "metadata": {}, "source": [ "We need to organize this data into a `scipp.DataArray` with the appropriate coordinates and variances." ] }, { "cell_type": "code", "execution_count": null, "id": "create-dataarray", "metadata": {}, "outputs": [], "source": [ "td = sc.DataArray(\n", " data=sc.array(dims=['system'], values=D_values, variances=D_errors**2, unit='cm^2/s'),\n", " coords={'box_length': sc.Variable(dims=['system'], values=box_lengths, unit='angstrom')},\n", ")\n", "td" ] }, { "cell_type": "markdown", "id": "yeh-hummer-object", "metadata": {}, "source": [ "## Creating the YehHummer object\n", "\n", "With our data prepared, we can now create a `YehHummer` object. The temperature is required for the viscosity calculation. You can optionally provide bounds for the fitting parameters." ] }, { "cell_type": "code", "execution_count": null, "id": "create-yeh-hummer", "metadata": {}, "outputs": [], "source": [ "# Create YehHummer object with temperature 298 K\n", "yh = YehHummer(td, temperature=sc.scalar(value=298, unit='K'))" ] }, { "cell_type": "markdown", "id": "mcmc-sampling", "metadata": {}, "source": [ "## MCMC sampling for uncertainty estimation\n", "\n", "To obtain full probability distributions for the parameters, we can perform Markov chain Monte Carlo (MCMC) sampling:" ] }, { "cell_type": "code", "execution_count": null, "id": "run-mcmc", "metadata": {}, "outputs": [], "source": [ "yh.mcmc(n_samples=500, n_walkers=16)" ] }, { "cell_type": "markdown", "id": "parameter-distributions", "metadata": {}, "source": [ "After MCMC sampling, we can visualize the probability distributions for the fitted parameters:" ] }, { "cell_type": "code", "execution_count": null, "id": "corner-plot", "metadata": {}, "outputs": [], "source": [ "from corner import corner\n", "\n", "corner(\n", " np.array([i.values for i in yh.flatchain.values()]).T,\n", " labels=[' / '.join([k, str(v.unit)]) for k, v in yh.flatchain.items()],\n", ")\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "results-access", "metadata": {}, "source": [ "## Accessing the results\n", "\n", "The infinite-system diffusion coefficient and shear viscosity are now available as probability distributions:" ] }, { "cell_type": "code", "execution_count": null, "id": "show-d-infinite", "metadata": {}, "outputs": [], "source": [ "yh.D_infinite" ] }, { "cell_type": "code", "execution_count": null, "id": "show-viscosity", "metadata": {}, "outputs": [], "source": [ "yh.shear_viscosity" ] }, { "cell_type": "markdown", "id": "plotting-results", "metadata": {}, "source": [ "## Plotting the results\n", "\n", "We can now plot the linear relationship between the diffusion coefficient and the inverse box length with credible intervals.\n", "For the estimation of the diffusion coefficient $D_\\infty$ we extrapolate to infinite box size $\\frac{1}{L} \\to 0$:" ] }, { "cell_type": "code", "execution_count": null, "id": "plot-results", "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots()\n", "\n", "credible_intervals = [[16, 84], [2.5, 97.5], [0.15, 99.85]]\n", "alpha = [0.6, 0.4, 0.2]\n", "\n", "inv_L_data = 1 / yh.box_lengths.values\n", "max_inv_L = np.max(inv_L_data)\n", "inv_L_extended = np.linspace(0, max_inv_L * 1.1, 50)\n", "\n", "ax.errorbar(\n", " inv_L_data,\n", " yh.diffusion.values,\n", " np.sqrt(yh.diffusion.variances),\n", " marker='o',\n", " ls='',\n", " color='k',\n", " zorder=10,\n", ")\n", "\n", "D_0_samples = yh.data_group['D_0'].values\n", "slope_samples = yh.data_group['slope'].values\n", "n_samples = len(D_0_samples)\n", "predictions_extended = np.zeros((len(inv_L_extended), n_samples))\n", "\n", "for i in range(n_samples):\n", " predictions_extended[:, i] = D_0_samples[i] - slope_samples[i] * inv_L_extended\n", "\n", "for i, ci in enumerate(credible_intervals):\n", " ax.fill_between(\n", " inv_L_extended,\n", " *np.percentile(predictions_extended, ci, axis=1),\n", " alpha=alpha[i],\n", " color='#0173B2',\n", " lw=0,\n", " )\n", "\n", "ax.set_xlabel(f'1 / L / {yh.box_lengths.unit}')\n", "ax.set_ylabel(f'D / {yh.diffusion.unit}')\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "understanding-results", "metadata": {}, "source": [ "## Understanding the results\n", "\n", "The Yeh-Hummer correction provides two important quantities:\n", "\n", "1. **Infinite-system diffusion coefficient ($D_{\\infty}$)**: This is the extrapolated diffusion coefficient for an infinite system.\n", "\n", "2. **Shear viscosity ($\\eta$)**: This is estimated from the slope of the linear relationship:\n", "\n", "$$D_{\\text{PBC}} = D_\\infty - \\frac{k_B T \\xi}{6\\pi\\eta L}$$\n", "\n", "where:\n", "- $D_{\\text{PBC}}$ is the diffusion coefficient from the periodic simulation\n", "- $D_\\infty$ is the infinite-system diffusion coefficient\n", "- $k_B$ is the Boltzmann constant\n", "- $T$ is the temperature\n", "- $\\xi$ constant from Ewald summation (2.837297 for cubic boxes)\n", "- $\\eta$ is the shear viscosity\n", "- $L$ is the box length" ] } ], "metadata": { "kernelspec": { "display_name": "kinisi", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" } }, "nbformat": 4, "nbformat_minor": 5 }