Chemical Table File Formats
Molfile
An '''MDL Molfile''' is a file format for holding information about the atoms, bonds, connectivity and coordinates of a molecule.
The molfile consists of some header information, the Connection Table (CT) containing atom info, then bond connections and types, followed by sections for more complex information.
The molfile is sufficiently common that most, if not all, [[cheminformatics]] software systems/applications are able to read the format, though not always to the same degree. It is also supported by some computational software such as [[Mathematica]].
The current ''de facto'' standard version is molfile V2000, although, more recently, the V3000 format has been circulating widely enough to present a potential compatibility issue for those applications that are not yet V3000-capable.
{| class="wikitable" style="margin-left: auto; margin-right: auto; border: none;"
|+ [[File:L-Alanine.svg|thumb|center|The contents of a Molfile of L-Alanine]]
| L-Alanine
|'''Title line''' (can be blank but line must exist)
! rowspan="3" |'''Header Block''' (3 lines)
|-
|<pre> ABCDEFGH09071717443D</pre>
|'''Program / file timestamp line''' (Name of source program and a file timestamp)
|-
|<pre>Exported</pre>
|'''Comment line''' (can be blank but line must exist)
|-
|<pre>6 5 0 0 1 0 3 V2000</pre>
|'''Counts line'''
! rowspan="4" |Connection table
|-
|<pre>-0.6622 0.5342 0.0000 C 0 0 2 0 0 0
0.6622 -0.3000 0.0000 C 0 0 0 0 0 0
-0.7207 2.0817 0.0000 C 1 0 0 0 0 0
-1.8622 -0.3695 0.0000 N 0 3 0 0 0 0
0.6220 -1.8037 0.0000 O 0 0 0 0 0 0
1.9464 0.4244 0.0000 O 0 5 0 0 0 0</pre>
|'''Atom block''' (1 line for each atom): x, y, z (in [[angstrom]]s), element, etc.
|-
|<pre>1 2 1 0 0 0
1 3 1 1 0 0
1 4 1 0 0 0
2 5 2 0 0 0
2 6 1 0 0 0</pre>
|'''Bond block''' (1 line for each bond): 1st atom, 2nd atom, type, etc.
|-
|<pre>M CHG 2 4 1 6 -1
M ISO 1 3 13</pre>
|'''Properties block'''
|-
| M END
|'''END line''' (NOTE: some programs don’t like a blank line before M END)
!'''END'''
|}
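The block layout above can be walked through programmatically. The following is a minimal sketch rather than a full molfile parser: it assumes a V2000 file, takes the atom and bond counts from the first two whitespace-separated tokens of the counts line (real files use fixed-width columns, so this shortcut would fail when the counts run together), and the function name <code>split_molfile</code> is hypothetical.

```python
def split_molfile(text):
    """Split a V2000 molfile into its blocks (illustrative sketch only)."""
    lines = text.splitlines()
    if len(lines) < 4:
        raise ValueError("expected three header lines and a counts line")
    title, program, comment, counts = lines[:4]
    tokens = counts.split()
    if not tokens or tokens[-1] != "V2000":
        raise ValueError("only V2000 counts lines are handled in this sketch")
    # Assumption: the atom and bond counts are separated by whitespace.
    n_atoms, n_bonds = int(tokens[0]), int(tokens[1])
    bond_start = 4 + n_atoms
    prop_start = bond_start + n_bonds
    properties = []
    for line in lines[prop_start:]:
        if line.split() == ["M", "END"]:  # END line closes the molfile
            break
        properties.append(line)
    return {
        "header": (title, program, comment),
        "counts": counts,
        "atoms": lines[4:bond_start],
        "bonds": lines[bond_start:prop_start],
        "properties": properties,
    }


# The L-Alanine listing from the table above, with whitespace as displayed there.
L_ALANINE = """\
L-Alanine
 ABCDEFGH09071717443D
Exported
6 5 0 0 1 0 3 V2000
-0.6622 0.5342 0.0000 C 0 0 2 0 0 0
0.6622 -0.3000 0.0000 C 0 0 0 0 0 0
-0.7207 2.0817 0.0000 C 1 0 0 0 0 0
-1.8622 -0.3695 0.0000 N 0 3 0 0 0 0
0.6220 -1.8037 0.0000 O 0 0 0 0 0 0
1.9464 0.4244 0.0000 O 0 5 0 0 0 0
1 2 1 0 0 0
1 3 1 1 0 0
1 4 1 0 0 0
2 5 2 0 0 0
2 6 1 0 0 0
M CHG 2 4 1 6 -1
M ISO 1 3 13
M END
"""

blocks = split_molfile(L_ALANINE)
print(len(blocks["atoms"]), len(blocks["bonds"]), len(blocks["properties"]))
```

For the L-Alanine example this reports six atom lines, five bond lines and two property lines; the END line is consumed as a terminator and not counted as a property.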
==== Counts line block specification ====
{| class="wikitable" style="margin-left: auto; margin-right: auto; border: none;"
|+
!Value
!6
!5
!0
!0
!0
!1
!V2000
|-
|Description
|number of atoms
|number of bonds
|number of atom lists
|Chiral flag, 1 = chiral; 0 = not chiral
|number of stext entries
|number of lines of additional properties
|mol version
|-
|Type
|[Generic]
|[Generic]
|[Query]
|[Generic]
|[ISIS/Desktop]
|[Generic]
|
|}
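As an illustration of reading these fields, the sketch below slices a counts line by column. It assumes the conventional V2000 layout of consecutive three-character fields followed by a six-character version tag; the column offsets, the <code>Counts</code> tuple and the name <code>parse_counts_line</code> are assumptions for this example and are not given in the table above. The sample is the counts line of the L-Alanine molfile, rebuilt with explicit padding.

```python
from typing import NamedTuple


class Counts(NamedTuple):
    atoms: int
    bonds: int
    atom_lists: int
    chiral: bool
    stext_entries: int
    property_lines: int
    version: str


def _int_field(line, start, width=3):
    """Read one fixed-width integer field; a blank field counts as 0."""
    raw = line[start:start + width].strip()
    return int(raw) if raw else 0


def parse_counts_line(line):
    # Assumed layout: aaabbblllfffcccsssxxxrrrpppiiimmmvvvvvv
    return Counts(
        atoms=_int_field(line, 0),
        bonds=_int_field(line, 3),
        atom_lists=_int_field(line, 6),
        chiral=_int_field(line, 12) == 1,
        stext_entries=_int_field(line, 15),
        property_lines=_int_field(line, 30),
        version=line[33:39].strip(),
    )


# Six populated fields, four blank fields, the property count, then the version.
sample = "  6  5  0  0  1  0" + " " * 12 + "  3" + " V2000"
counts = parse_counts_line(sample)
print(counts.atoms, counts.bonds, counts.chiral, counts.version)
```

Slicing by column rather than splitting on whitespace keeps the parse correct when adjacent fields have no space between them.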
==== Bond block specification ====
The [[Chemical bond|Bond]] Block is made up of bond lines, one line per bond, with the following format:
111 222 ttt sss xxx rrr ccc
where the values are described in the following table:
{| class="wikitable"
|+
!Field
!Meaning
!Values
|-
|111
|first atom number
|
|-
|222
|second atom number
|
|-
|ttt
|bond type
|1 = Single, 2 = Double, 3 = Triple, 4 = Aromatic, 5 = Single or Double, 6 = Single or Aromatic, 7 = Double or Aromatic, 8 = Any
|-
|sss
|bond stereo
|For single bonds: 0 = not stereo, 1 = up, 4 = either, 6 = down<br />For double bonds: 0 = use x-, y-, z-coords from atom block to determine cis or trans, 3 = cis or trans (either) double bond
|-
|xxx
|not used
|
|-
|rrr
|bond topology
|0 = Either, 1 = Ring, 2 = Chain
|-
|ccc
|reacting center status
|0 = unmarked, 1 = a center, -1 = not a center. Additional: 2 = no change, 4 = bond made/broken, 8 = bond order changes, 12 = 4 + 8 (both made/broken and changes); 5 = (4 + 1), 9 = (8 + 1), and 13 = (12 + 1) are also possible
|}
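The codes in the table can be turned into readable labels with a few lookup tables. This is a rough sketch under stated assumptions: it splits the bond line on whitespace (as the lines are displayed here, whereas real files use fixed-width fields), treats missing trailing fields as 0, and the names <code>parse_bond_line</code> and <code>decode_center</code> are hypothetical. The reacting center status is decoded as a sum of flags, which is how combined values such as 13 = 8 + 4 + 1 arise.

```python
BOND_TYPES = {
    1: "single", 2: "double", 3: "triple", 4: "aromatic",
    5: "single or double", 6: "single or aromatic",
    7: "double or aromatic", 8: "any",
}
SINGLE_BOND_STEREO = {0: "not stereo", 1: "up", 4: "either", 6: "down"}
DOUBLE_BOND_STEREO = {0: "from coordinates", 3: "cis or trans (either)"}
TOPOLOGY = {0: "either", 1: "ring", 2: "chain"}
CENTER_FLAGS = {1: "a center", 2: "no change",
                4: "bond made/broken", 8: "bond order changes"}


def decode_center(value):
    """Expand a reacting center status into its component meanings."""
    if value == 0:
        return ["unmarked"]
    if value == -1:
        return ["not a center"]
    return [name for bit, name in CENTER_FLAGS.items() if value & bit]


def parse_bond_line(line):
    """Decode one bond line: 111 222 ttt sss xxx rrr ccc."""
    fields = [int(tok) for tok in line.split()]
    fields += [0] * (7 - len(fields))  # assumption: absent fields mean 0
    first, second, btype, stereo, _unused, topology, center = fields[:7]
    stereo_table = DOUBLE_BOND_STEREO if btype == 2 else SINGLE_BOND_STEREO
    return {
        "atoms": (first, second),
        "type": BOND_TYPES.get(btype, f"unknown ({btype})"),
        "stereo": stereo_table.get(stereo, f"unknown ({stereo})"),
        "topology": TOPOLOGY.get(topology, f"unknown ({topology})"),
        "center": decode_center(center),
    }


# Second bond of the L-Alanine example: a single bond drawn "up".
print(parse_bond_line("1 3 1 1 0 0"))
print(decode_center(13))
```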
Structure-Data (SDF) File
SDF is one of a family of chemical-data file formats developed by MDL; it is intended especially for structural information. “SDF” stands for structure-data file, and SDF files wrap the molfile ([[#Molfile|MDL Molfile]]) format. Multiple records are [[delimiter|delimited]] by lines consisting of four dollar signs (<code>$$$$</code>). A feature of the SDF format is its ability to include associated data. SDF files use the MIME type <code>chemical/x-mdl-sdfile</code>.
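Splitting an SDF file into its records follows directly from the delimiter rule. The sketch below is a minimal illustration, not a complete SDF reader: the name <code>split_sdf_records</code> is hypothetical, the sample text is a placeholder and not real molfile content, and accepting a final record that lacks a closing delimiter is an assumption of this example.

```python
def split_sdf_records(text):
    """Return the records of an SDF file, split on '$$$$' delimiter lines."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    # Assumption: keep a trailing record even if its delimiter is missing.
    if any(part.strip() for part in current):
        records.append("\n".join(current))
    return records


# Placeholder records; each would normally hold a molfile plus data items.
sample_sdf = "first record, line A\nfirst record, line B\n$$$$\nsecond record\n$$$$\n"
records = split_sdf_records(sample_sdf)
print(len(records))
```

Each returned string holds one record, that is, a molfile together with whatever associated data follows it.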
Associated data items are denoted as follows:
