Vol. 31 No. 3
IUPAC International Chemical Identifier—InChI Update
The final InChI version 1.02 software was issued in January 2009 as an implementation for generating Standard InChI (see below) and the corresponding Standard InChIKey. The complete package, which can be downloaded at <www.iupac.org/inchi/download/>, contains:
- source code and Application Program Interface (API)
- stand-alone executables for Windows and Linux (stdinchi-1.exe, stdwinchi-1.exe, and stdinchi-1.gz)
- description of new features, with examples using new functionality
- copy of GNU LGPL license
In response to user requests, a Standard InChI (i.e., without options for such properties as tautomerism and stereoconfiguration) has been defined as follows:
- Standard InChI is intended for interoperability and compatibility between large databases, for web searching, and for information exchange.
- Standard InChI and nonstandard InChI are always distinguishable.
- Standard InChI is a stable identifier. However, periodic updates may be necessary; they are reflected in the identifier version designation, which is included in the InChI string.
- Any shortcomings in Standard InChI may be addressed using nonstandard InChI (currently obtainable using InChI v. 1.02 beta).
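As a rough illustration of the second point, the check below tells the two kinds of identifier apart by their prefix. It is only a sketch: the prefixes "InChI=1S/" (standard) and "InChI=1/" (nonstandard) are not spelled out in this article and are assumed here from the commonly seen form of InChI strings, and the function name classify_inchi is hypothetical.

```python
def classify_inchi(identifier: str) -> str:
    """Return 'standard', 'nonstandard', or 'unknown' for an InChI string.

    Assumption (not stated in the article): version 1 Standard InChI
    starts with 'InChI=1S/' and nonstandard InChI with 'InChI=1/'.
    """
    if identifier.startswith("InChI=1S/"):
        return "standard"
    if identifier.startswith("InChI=1/"):
        return "nonstandard"
    return "unknown"


if __name__ == "__main__":
    # Methane, written once in each form.
    print(classify_inchi("InChI=1S/CH4/h1H4"))
    print(classify_inchi("InChI=1/CH4/h1H4"))
```

Because the version designation sits at the start of the string, a prefix test of this kind is enough; no parsing of the layers is needed.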
In response to user feedback, the format of InChIKey has been changed; it is different from that in InChI software v. 1.02 beta, having 27 characters rather than 25. Standard InChIKey has five distinct components:
- 14-character hash of the basic (mobile-H) InChI layer
- 8-character hash of the remaining layers, except for the “/p” segment, which accounts for added or removed protons; that segment is not hashed at all, and the number of protons is instead encoded at the end of the Standard InChIKey
- 1 flag character
- 1 version character
- a [de]protonation indicator as the last character
The overall length of InChIKey is fixed at 27 characters, including separators (dashes), which is significantly shorter than a typical InChI string. Its layout is AAAAAAAAAAAAAA-BBBBBBBBFV-P, where:
- AAAAAAAAAAAAAA is a 14-character hash.
- BBBBBBBB is an 8-character hash.
- F is a flag indicating a Standard InChIKey (produced from Standard InChI); it always has the value “S”.
- V is the InChI version character: “A” for version 1, “B” for version 2, and so forth.
- P is an indicator for the number of protons; this number is not encoded in the hash but is given as a separate two-character block at the end, of which the first character is a hyphen: -N for neutral, -M for −1 hydrogen, -O for +1 hydrogen, and so forth.
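The component list above can be sketched as a small parser. This is a minimal illustration rather than part of the InChI software: the names split_inchikey and InChIKeyParts are hypothetical, the restriction to uppercase letters is an assumption, and reading “and so forth” as a simple alphabetical offset from “N” (so that “L” would mean −2 and “P” +2) is likewise an assumption. The key used in the usage lines is a placeholder, not a real compound.

```python
import re
from typing import NamedTuple

# 14 letters, dash, 8 letters + flag + version, dash, protonation indicator.
# Assumption: every position other than the dashes is an uppercase letter.
_KEY_PATTERN = re.compile(r"([A-Z]{14})-([A-Z]{8})([A-Z])([A-Z])-([A-Z])")


class InChIKeyParts(NamedTuple):
    first_hash: str      # 14-character hash of the basic (mobile-H) layer
    second_hash: str     # 8-character hash of the remaining layers
    flag: str            # "S" for a Standard InChIKey
    version: int         # "A" -> 1, "B" -> 2, ...
    proton_offset: int   # "N" -> 0, "M" -> -1, "O" -> +1 (offset from "N")


def split_inchikey(key: str) -> InChIKeyParts:
    """Split a 27-character InChIKey into its five components."""
    match = _KEY_PATTERN.fullmatch(key)
    if match is None:
        raise ValueError(f"not a 27-character InChIKey: {key!r}")
    first, second, flag, version_char, proton_char = match.groups()
    return InChIKeyParts(
        first_hash=first,
        second_hash=second,
        flag=flag,
        version=ord(version_char) - ord("A") + 1,
        # Assumed extrapolation of the N/M/O examples given in the text.
        proton_offset=ord(proton_char) - ord("N"),
    )


if __name__ == "__main__":
    parts = split_inchikey("AAAAAAAAAAAAAA-BBBBBBBBSA-N")  # placeholder key
    print(parts)
    print(parts.flag == "S")  # True when the key is a Standard InChIKey
```

Matching against a fixed pattern also enforces the 27-character length, so a 25-character key from the v. 1.02 beta software is rejected rather than misread.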
Full details and examples are provided in the documentation accompanying the software download.
Software implementing the final InChI version 1.02 for nonstandard InChI (i.e., with all previous options retained and with the 27-character InChIKey) will be issued in due course.
Users are encouraged to report their experiences and any problems via the SourceForge website <sourceforge.net/projects/inchi>.
last modified 24 April 2009.