Computational Chemistry Tutorial: 1. Obtaining Molecular Structures

Sources of Molecular Structures

Structures of molecules can be determined experimentally or predicted computationally. For small molecules, such as ethanol and cyclohexane, both spectroscopic methods and rigorous quantum chemical computations can provide highly accurate molecular geometries in the vapor phase. Such highly accurate methods are not readily applicable for most drug molecules due to their larger size. However, experiments can give quite accurate structures of drugs in the crystalline state or while bound to target macromolecules. Approximate computations can give sufficiently accurate structures for many drugs in the gas or liquid phase.

Figure: Acetylacetone tautomerism

Solvation can change the structure of molecules significantly in two common scenarios. First, a molecule in solution may exist in a tautomeric form that is not the most stable one in the gas phase. For example, acetylacetone, a prototypical β-diketone, exists in the gas phase in the enol form, where an intramolecular hydrogen bond stabilizes the structure. In a polar solvent such as water, the carbonyl groups are well solvated, and the molecule is found predominantly in the keto form. Second, larger flexible molecules with both polar and non-polar moieties (e.g. some HIV protease inhibitors) undergo significant conformational changes in aqueous solution. Such molecules tend to maximize intramolecular hydrogen bonds in the gas phase, and thus the polar moieties are often found "inside". In aqueous solution, where these polar moieties are solvated, the hydrophobic effect forces the non-polar parts "inside". This process is sometimes called hydrophobic collapse. Finally, structures of biological macromolecules are sensitive to the environment. For example, the structure of dehydrated DNA differs significantly from that of well-hydrated DNA. In such cases, advanced computational methods that account for the effects of the environment can be applied.

Molecular Structure Files

Molecular structures can be stored in computer files. At a minimum, one needs to define each atom's type and Cartesian coordinates to uniquely describe a molecule's structure. However, largely for historical reasons, several different file formats currently represent the same structural data in slightly different ways. Some file formats that you may encounter are the PDB format, the MOL2 format, Tinker's XYZ format, Gaussian's Z-Matrix format, and GAMESS's XYZ format. In the past, practitioners of computational chemistry had to learn how to hand-craft input files in a suitable format. Today, various model-building programs help with this task, and programs such as Babel can convert between different formats.
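To make the "atom type plus Cartesian coordinates" idea concrete, here is a minimal Python sketch that builds water from a bond length and an angle and writes it in a plain XYZ layout (atom count, comment line, then one atom per line). The function names are hypothetical, the geometry values are approximate textbook numbers used only for illustration, and the plain layout is an assumption: the program-specific formats named above may carry additional fields.

```python
import math


def water_atoms(r_oh=0.9572, angle_deg=104.5):
    """Return [(element, x, y, z), ...] for water built from internal coordinates.

    r_oh (in angstroms) and angle_deg are approximate, illustrative values.
    """
    half = math.radians(angle_deg) / 2.0
    x = r_oh * math.sin(half)
    y = r_oh * math.cos(half)
    # Oxygen at the origin, hydrogens placed symmetrically in the xy plane.
    return [
        ("O", 0.0, 0.0, 0.0),
        ("H", x, y, 0.0),
        ("H", -x, y, 0.0),
    ]


def to_xyz(atoms, comment=""):
    """Format atoms as plain XYZ text: count, comment, one atom per line."""
    lines = [str(len(atoms)), comment]
    for element, x, y, z in atoms:
        lines.append(f"{element:<2s} {x:12.6f} {y:12.6f} {z:12.6f}")
    return "\n".join(lines) + "\n"


def from_xyz(text):
    """Parse plain XYZ text back into [(element, x, y, z), ...]."""
    lines = text.splitlines()
    count = int(lines[0])
    atoms = []
    for line in lines[2:2 + count]:
        element, x, y, z = line.split()
        atoms.append((element, float(x), float(y), float(z)))
    return atoms


xyz_text = to_xyz(water_atoms(), comment="water, illustrative geometry")
print(xyz_text)
```

Reading the text back with `from_xyz(xyz_text)` recovers the same three atoms, which is the essence of what a format converter does before rewriting the data in another layout.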

Downloading Molecular Structures

Despite the emergence of large chemical information resources such as PubChem, finding molecular structures that are suitable starting points for computational modeling remains a challenge. For example, you could search PubChem for cubane and then save its coordinates in the SDF format, but closer inspection (e.g. open the molecule in PyMOL and rotate it) reveals serious problems. Databases such as Klotho provide model structures for some common small molecules. Chemicals with Pharmaceutical Activity from the University of Oxford offers access to many drug models via the JMol plugin, from which the mol file can be saved. The ZINC database at UCSF provides MOL2 structure files for millions of compounds. The Protein Data Bank offers access to experimentally determined structures of macromolecules and macromolecular complexes.
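The PubChem download can also be scripted. The sketch below is not part of the original tutorial: it assumes PubChem's PUG REST URL pattern for name lookups with an SDF output and a `record_type=3d` option, and the function names are hypothetical. A 3D record may not exist for every compound, in which case the request is expected to fail, and any file obtained this way still needs the visual inspection described above.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base address of PubChem's PUG REST service (assumed, not from the tutorial).
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pubchem_sdf_url(name, three_d=True):
    """Build the URL for an SDF record looked up by compound name."""
    url = f"{PUG_REST}/compound/name/{quote(name, safe='')}/SDF"
    if three_d:
        # Ask for a computed 3D conformer instead of the 2D depiction.
        url += "?record_type=3d"
    return url


def download_sdf(name, path, three_d=True, timeout=30):
    """Fetch the SDF record and save it to path; raises on HTTP errors."""
    with urlopen(pubchem_sdf_url(name, three_d), timeout=timeout) as response:
        data = response.read()
    with open(path, "wb") as handle:
        handle.write(data)
    return path


if __name__ == "__main__":
    # Only prints the URL; call download_sdf("cubane", "cubane.sdf") to fetch.
    print(pubchem_sdf_url("cubane"))
```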

Molecular Structures via SMILES Servers

Several web sites generate 3D molecular structures from a SMILES string using the program CORINA. CORINA uses built-in tables of standard bond lengths and angles to create a reasonable model for small or rigid molecules. However, the model geometry for larger and flexible molecules is likely to be quite different from the most prevalent geometry in aqueous solution. One such site that generates 3D model structures is the Online SMILES Translator from the National Institutes of Health. Practice creating a molecule of ethanol using this service. Start the Structure Editor and sketch an ethanol skeleton consisting of two bonds and an oxygen atom; do not worry about the hydrogens on the carbons. Hit Submit Molecule and notice that the SMILES string for ethanol was generated. On the right-side panel, select PDB and 3D and hit Translate. Right-click on the link to download the molecule and save it as ethanol.pdb into your directory. Examine the file with a text editor.
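When examining ethanol.pdb, it helps to know that PDB coordinate records are fixed-width: the fields are defined by column position, not by whitespace. The following Python sketch reads ATOM and HETATM records using the standard column layout. The function names are hypothetical, the sample coordinates are placeholders and not a real ethanol geometry, and the fallback for a missing element column is a crude assumption, since the exact output of the translator is not shown here.

```python
def format_hetatm(serial, name, resname, x, y, z, element, resseq=1):
    """Build one fixed-width HETATM line (78 columns) for demonstration."""
    return (
        f"HETATM{serial:5d} {name:^4s} {resname:>3s}  {resseq:4d}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}          {element:>2s}"
    )


def parse_pdb_atoms(text):
    """Extract name, element, and coordinates from ATOM/HETATM records."""
    atoms = []
    for line in text.splitlines():
        if line[0:6].strip() not in ("ATOM", "HETATM"):
            continue  # skip headers, CONECT, END, and other records
        name = line[12:16].strip()           # columns 13-16: atom name
        element = line[76:78].strip()        # columns 77-78: element symbol
        if not element:
            # Crude fallback (assumption): letters of the atom name.
            element = "".join(ch for ch in name if ch.isalpha())
        atoms.append({
            "name": name,
            "element": element,
            "x": float(line[30:38]),         # columns 31-38
            "y": float(line[38:46]),         # columns 39-46
            "z": float(line[46:54]),         # columns 47-54
        })
    return atoms


# Placeholder heavy-atom records; coordinates are NOT a real geometry.
sample = "\n".join([
    format_hetatm(1, "C1", "UNK", 0.0, 0.0, 0.0, "C"),
    format_hetatm(2, "C2", "UNK", 1.5, 0.0, 0.0, "C"),
    format_hetatm(3, "O1", "UNK", 2.0, 1.4, 0.0, "O"),
    "END",
])

for atom in parse_pdb_atoms(sample):
    print(atom["name"], atom["element"], atom["x"], atom["y"], atom["z"])
```

To apply this to the downloaded file, read it with `open("ethanol.pdb").read()` and pass the text to `parse_pdb_atoms`; the number of returned atoms should match what you see in the text editor.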

Sketching 2D Molecular Structures

Most chemists are familiar with drawing 2D molecular structures, and several programs make it easy to draw 2D representations of three-dimensional molecules. Two of the most popular 2D chemical diagram editors for Windows and Mac OS systems are ChemDraw from CambridgeSoft and MDL Draw from Elsevier MDL. Students can download MDL Isis/Draw, a fully functional free chemical drawing program, from the Elsevier MDL website after registration. Some chemical drawing tools allow generation and export of 3D coordinates of the drawn molecule. The JME Molecular Editor allows simple molecules to be sketched online and exported as SMILES strings.


Tutorial by Dr. Kalju Kahn, Department of Chemistry and Biochemistry, UC Santa Barbara. ©2005-2009