Getting started with MD simulations
Learn the Python needed to write MD analysis scripts.
Learn the Unix needed to install MD simulation programs, run MD simulations, and access supercomputers by following the tutorial here.
Amber, one of the commonly used MD simulation engines, has well-documented tutorials here. Run your very first alanine dipeptide simulation by following the tutorial here. After that, I would recommend Amber tutorials 5.5 (an advanced version of the alanine dipeptide tutorial) and 6.3 (an accelerated MD, or aMD, tutorial; note that the most up-to-date method is GaMD, or Gaussian accelerated molecular dynamics). Please download and install Amber and VMD from their respective websites before attempting these tutorials.
GROMACS is another commonly used MD simulation engine and can be downloaded for free here, but there are many more, such as LAMMPS!
PyMOL is another commonly used visualization program for MD simulations (other than VMD) and can be downloaded here, free for college students. ChimeraX is also a great visualization program!
How to set up MD simulations (taken from https://ctlee.github.io/BioChemCoRe-2018/system-prep/)
Protein structures from various structure determination methods are often incomplete. For example, structures from X-ray crystallography typically do not have resolved hydrogens. Because hydrogen bonding, which requires hydrogen atoms, is important for protein stability and receptor-ligand interactions, X-ray crystal structures cannot be used in molecular dynamics (MD) right “off the shelf.”
Protein preparation: Use pdb2pqr (or PROPKA, H++, or Schrödinger Maestro) to prepare the PDB, i.e., add hydrogens at the appropriate pH (usually 7). If using pdb2pqr, set both the force field and the residue naming scheme to AMBER. You can then download the resulting PQR file, whose format is very similar to PDB (in PQR files, the last two columns give the partial charge and the van der Waals radius of each atom), and use it like a PDB file. If disulfide bonds need to be created, this can be done at the Amber tleap stage (please read about Amber tleap here). Filling in missing side chains and loops requires Schrödinger Prime or other programs like AlphaFold. If this is the case for your system, please contact me/PI so that we can work on this together, unless you're an expert.
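As a quick sanity check on the PQR file, it can help to add up the partial charges, since the net charge is what the ions will later have to neutralize. The following is a minimal sketch, assuming a whitespace-separated PQR file in which the last two columns are charge and radius, as described above. The function names and the three-atom fragment are hypothetical and for illustration only. Whitespace splitting can fail if two coordinate fields run together, so treat this as a starting point rather than a robust parser.

```python
def parse_pqr_atoms(pqr_text):
    """Return (atom_name, residue_name, charge, radius) for each ATOM/HETATM record."""
    atoms = []
    for line in pqr_text.splitlines():
        fields = line.split()
        if not fields or fields[0] not in ("ATOM", "HETATM"):
            continue
        # The last two columns of a PQR record hold the partial charge and radius.
        charge, radius = float(fields[-2]), float(fields[-1])
        atoms.append((fields[2], fields[3], charge, radius))
    return atoms


def net_charge(pqr_text):
    """Sum the partial charges of all atoms in the PQR text."""
    return sum(atom[2] for atom in parse_pqr_atoms(pqr_text))


# Hypothetical fragment for illustration only (not a complete residue).
EXAMPLE_PQR = """\
ATOM      1  N   ALA     1      -0.677  -1.230  -0.491  0.1414 1.8240
ATOM      2  H1  ALA     1      -0.131  -2.162  -0.491  0.1997 0.6000
ATOM      3  CA  ALA     1      -0.001   0.064  -0.491  0.0962 1.9080
"""

if __name__ == "__main__":
    print(f"net charge: {net_charge(EXAMPLE_PQR):+.4f}")
```

For a full protein, the total should come out very close to an integer. A clearly fractional total usually means that some records were skipped or misread.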
Ligand parameterization: If your system has ligands, pdb2pqr will delete them because they are non-standard residues. Make sure to copy the ligand records from your original PDB file into the resulting PQR file. We will have to create parameters for these ligands using Amber's Antechamber and the Generalized Amber Force Field (GAFF). Please complete this tutorial before proceeding with this step. If you need to deal with DNA or RNA, please complete this tutorial before proceeding with this step.
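Copying the ligand records by hand is error-prone for larger ligands, so the selection can be scripted. The following is a minimal sketch, assuming the ligand is stored as HETATM records in a standard fixed-column PDB file (residue name in columns 18-20). The residue name LIG, the function name, and the example coordinates are hypothetical. The sketch only selects lines; it does not add hydrogens or create any parameters.

```python
WATER_NAMES = {"HOH", "WAT"}


def extract_ligand_records(pdb_text, ligand_resnames):
    """Return the HETATM lines whose residue name is in ligand_resnames."""
    wanted = {name.upper() for name in ligand_resnames}
    records = []
    for line in pdb_text.splitlines():
        if not line.startswith("HETATM"):
            continue
        resname = line[17:20].strip()  # PDB columns 18-20 hold the residue name
        if resname in wanted and resname not in WATER_NAMES:
            records.append(line)
    return records


# Hypothetical records for illustration only.
EXAMPLE_PDB = "\n".join([
    "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N",
    "HETATM 1201  C1  LIG A 301      10.000  12.000  14.000  1.00 20.00           C",
    "HETATM 1202  O1  LIG A 301      11.200  12.500  14.300  1.00 20.00           O",
    "HETATM 1301  O   HOH A 401       5.000   6.000   7.000  1.00 30.00           O",
])

if __name__ == "__main__":
    for record in extract_ligand_records(EXAMPLE_PDB, ["LIG"]):
        print(record)
```

The selected lines can then be written to their own file for Antechamber or pasted into the PQR file, depending on which tutorial workflow you are following.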
Protein parameterization: For proteins, we can use the Amber ff19SB force field along with the OPC water model, which has been shown to work best with ff19SB (do not use the TIP3P water model with this force field!). If your protein is an intrinsically disordered protein (IDP), use the a99SB-disp force field and its associated water model (see here). You can use the loadpdb command in Amber tleap with the PQR file from the protein preparation step (Step 1). Use Amber tleap to create your prmtop and rst7 files as done in the Amber tutorials. Please complete this tutorial before proceeding with this step. Remember that we need to add ions (Na+, K+, and Cl- are common ones) to neutralize the charge of the system and to set the appropriate ionic concentration (usually 0.1-0.15 M NaCl or KCl, but please refer to relevant experimental papers to see if this is really the case). If the system has ligands, we also need to load their parameter files from the ligand parameterization step (Step 2) at the Amber tleap stage.
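To make the ion step concrete, here is a minimal sketch that estimates how many salt pairs correspond to a target concentration and writes a matching tleap input. The estimate is simply N = C × N_A × V, with the box volume V converted from Å³ to liters. The file names (protein.pqr, system), the 10 Å buffer, the truncated octahedron box, and the choice of Na+/Cl- are all assumptions for illustration. The box volume is only known after solvation, so in practice one would run tleap once, read the reported volume, and then rerun with the ion count. Ligand parameter files are not loaded in this sketch.

```python
AVOGADRO = 6.02214076e23
LITERS_PER_CUBIC_ANGSTROM = 1e-27


def salt_pairs_for_concentration(molarity, box_volume_A3):
    """Estimate the number of cation/anion pairs for a given molarity and box volume."""
    return round(molarity * AVOGADRO * box_volume_A3 * LITERS_PER_CUBIC_ANGSTROM)


def build_tleap_input(pqr_file, n_salt_pairs, prefix="system", buffer_A=10.0):
    """Return tleap input text that solvates, neutralizes, and adds salt."""
    lines = [
        "source leaprc.protein.ff19SB",
        "source leaprc.water.opc",
        f"mol = loadpdb {pqr_file}",
        f"solvateoct mol OPCBOX {buffer_A}",
        # A count of 0 asks tleap to add just enough ions to neutralize.
        "addionsrand mol Na+ 0",
        "addionsrand mol Cl- 0",
        # Extra salt pairs on top of neutralization (illustrative choice of NaCl).
        f"addionsrand mol Na+ {n_salt_pairs} Cl- {n_salt_pairs}",
        f"saveamberparm mol {prefix}.prmtop {prefix}.rst7",
        "quit",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    pairs = salt_pairs_for_concentration(0.15, 500_000)  # hypothetical 500,000 Å^3 box
    print(build_tleap_input("protein.pqr", pairs))
```

The generated text can be saved to a file and passed to tleap with `tleap -f`. Only one of the two neutralizing lines will actually add ions, depending on the sign of the net charge of the system.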
Relaxing the system: Now that you have finished building the system, we need to relax it before moving on to production MD. We first need to minimize, heat to the appropriate temperature (usually room temperature, i.e., 298 or 300 K, but again please refer to relevant experimental papers to see if this is really the case), and equilibrate the system as done in the introductory Amber tutorials. Complete or read over this and this to make sure that you understand each step thoroughly. You can take and adapt the minimization, heating, and equilibration scripts from the Amber tutorials for your system and run those steps on your computer/workstation or on a high-performance computing (HPC) cluster.
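When adapting the tutorial scripts, it can be convenient to keep the settings for each stage in one place and generate the Amber input files from them. The following is a minimal sketch that renders a dictionary of &cntrl settings as an Amber mdin file. The helper name and all numerical values are placeholders for illustration, not recommended settings; please take the actual values from the Amber tutorials and adjust them for your system.

```python
def render_mdin(title, cntrl):
    """Render a title line plus an &cntrl namelist in Amber mdin format."""
    def fmt(value):
        # Namelist strings (e.g. restraint masks) must be quoted.
        return f"'{value}'" if isinstance(value, str) else str(value)

    body = "\n".join(f"  {key}={fmt(value)}," for key, value in cntrl.items())
    return f"{title}\n &cntrl\n{body}\n /\n"


# Placeholder values for illustration only; adapt from the Amber tutorials.
MINIMIZATION = {
    "imin": 1,       # run energy minimization
    "maxcyc": 5000,  # total minimization cycles
    "ncyc": 2500,    # steepest-descent cycles before conjugate gradient
    "ntb": 1,        # constant-volume periodic boundaries
    "cut": 10.0,     # nonbonded cutoff in angstroms
    "ntpr": 100,     # energy print frequency
}

HEATING = {
    "imin": 0, "irest": 0, "ntx": 1,   # new MD run starting from coordinates only
    "nstlim": 50000, "dt": 0.002,      # number of steps and time step in ps
    "ntc": 2, "ntf": 2,                # SHAKE on bonds involving hydrogen
    "tempi": 0.0, "temp0": 300.0,      # initial and target temperature in K
    "ntt": 3, "gamma_ln": 2.0,         # Langevin thermostat
    "ntb": 1, "cut": 10.0,
    "ntpr": 500, "ntwx": 500,
}

if __name__ == "__main__":
    print(render_mdin("Minimization", MINIMIZATION))
    print(render_mdin("Heating", HEATING))
```

Keeping the stages as dictionaries makes it easy to see what differs between minimization, heating, and equilibration, and to change the target temperature in one place if the experimental conditions call for it.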
Production MD: Depending on what you want from your MD simulations (free energies? rate constants? continuous pathways?), we will run production MD with the appropriate enhanced sampling method (usually GaMD, the weighted ensemble (WE) method, or a combination of the two). Please read about the enhanced sampling method that you will use in the relevant papers and websites.
Great introductory slides on MD and WE by Prof. Matthew Zwier!
Seminal papers for Ahn lab members:
WE Review Paper: Zuckerman, Daniel M., and Lillian T. Chong. "Weighted ensemble simulation: review of methodology, applications, and software." Annual Review of Biophysics 46 (2017): 43-57.
WESTPA 2.0: Russo, John D., et al. "WESTPA 2.0: High-performance upgrades for weighted ensemble simulations and analysis of longer-timescale applications." Journal of Chemical Theory and Computation 18.2 (2022): 638-649.
WESTPA 1.0 (older version but still has a valuable overview of the WESTPA program): Bogetti, Anthony T., et al. "A suite of tutorials for the WESTPA rare-events sampling software [Article v1.0]." Living Journal of Computational Molecular Science 1.2 (2019).
GaMD: Miao, Yinglong, Victoria A. Feher, and J. Andrew McCammon. "Gaussian accelerated molecular dynamics: unconstrained enhanced sampling and free energy calculation." Journal of Chemical Theory and Computation 11.8 (2015): 3584-3595.
Additional reading and websites to get started on the weighted ensemble (WE) method:
An example of how WE is able to sample biologically relevant timescales
Download the latest WESTPA here for your workstation and supercomputer cluster
Installation for PowerPC architecture, etc. (alternative installations) can be found here
Checklist for WESTPA simulation
Using the HDF5 trajectory storage scheme
Minimal adaptive binning scheme video
Getting thermodynamic and kinetic properties from WESTPA simulations