Resources¶
Algorithms¶
Fingerprinting is a common technique for molecular screening. Daylight, Inc. introduced this technique, and it is described in the following article. Bingo fingerprints, as compared to Daylight fingerprints, are built not from bond paths, but from trees and rings. A Russian article describes the enumeration of subtrees, which in turn is based on reverse search by David Avis and Komei Fukuda.
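The screening idea can be illustrated with a minimal sketch. It uses simple Daylight-style linear paths rather than Bingo's trees and rings, and the names `path_fragments`, `fingerprint`, and `may_contain` are hypothetical, not part of Bingo or Indigo. Molecules are assumed to be given as a list of atom labels plus a list of bonds between atom indices.

```python
import zlib

def path_fragments(atoms, bonds, max_len=3):
    """Enumerate linear atom paths of up to max_len atoms as label strings."""
    adj = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        adj[a].append(b)
        adj[b].append(a)
    frags = set()

    def walk(path):
        forward = "-".join(atoms[i] for i in path)
        backward = "-".join(atoms[i] for i in reversed(path))
        frags.add(min(forward, backward))  # direction-independent label
        if len(path) < max_len:
            for nxt in adj[path[-1]]:
                if nxt not in path:
                    walk(path + [nxt])

    for start in adj:
        walk([start])
    return frags

def fingerprint(atoms, bonds, n_bits=64):
    """Hash every fragment to one bit of an integer bit set."""
    fp = 0
    for frag in path_fragments(atoms, bonds):
        fp |= 1 << (zlib.crc32(frag.encode()) % n_bits)
    return fp

def may_contain(target_fp, query_fp):
    """Screening test: every query bit must also be set in the target."""
    return (target_fp & query_fp) == query_fp

ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])
query = (["C", "O"], [(0, 1)])
print(may_contain(fingerprint(*ethanol), fingerprint(*query)))
```

A failed test rules the target out without running subgraph matching. A passed test only means the target is a candidate, since different fragments can hash to the same bit.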
For tautomer substructure search, Bingo incorporates a unique type of fingerprinting.
For subgraph matching, we developed an original algorithm. It is somewhat similar to an algorithm by Luigi Cordella et al.
For the molecule layout in Indigo and Bingo, we developed a unique algorithm, based on biconnected components extraction. It incorporates some ideas of Craig Shelley.
For the aromaticity matching and resonance search, we developed unique algorithms as well.
For the canonical SMILES, Brendan McKay’s nauty canonical labeling algorithm was re-implemented, leaving out many of nauty’s original features and optimizations. There is also a description of the algorithm in Russian.
For the affine transformation matching, Wolfgang Kabsch’s algorithm was implemented.
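As a rough illustration of the underlying least-squares superposition problem, here is a sketch of the 2D special case, where the optimal rotation has a closed form. The 3D Kabsch algorithm instead obtains the rotation from a singular value decomposition of the covariance matrix. The function name `superpose_2d` is hypothetical and is not taken from Indigo.

```python
import math

def superpose_2d(source, target):
    """Find the rotation and translation that best map source onto target.

    Both arguments are equally long lists of (x, y) pairs in matching order.
    Returns (angle in radians, transformed source points, RMSD).
    """
    n = len(source)
    # Centroids of both point sets
    csx = sum(x for x, _ in source) / n
    csy = sum(y for _, y in source) / n
    ctx = sum(x for x, _ in target) / n
    cty = sum(y for _, y in target) / n

    # Accumulate dot and cross products of the centered coordinates
    dot = cross = 0.0
    for (sx, sy), (tx, ty) in zip(source, target):
        ax, ay = sx - csx, sy - csy
        bx, by = tx - ctx, ty - cty
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx

    theta = math.atan2(cross, dot)  # optimal rotation angle
    c, s = math.cos(theta), math.sin(theta)
    moved = [
        (c * (x - csx) - s * (y - csy) + ctx,
         s * (x - csx) + c * (y - csy) + cty)
        for x, y in source
    ]
    rmsd = math.sqrt(
        sum((mx - tx) ** 2 + (my - ty) ** 2
            for (mx, my), (tx, ty) in zip(moved, target)) / n
    )
    return theta, moved, rmsd

# Target is the source rotated by 90 degrees and shifted by (2, 3)
theta, moved, rmsd = superpose_2d([(0, 0), (1, 0), (0, 1)],
                                  [(2, 3), (2, 4), (1, 3)])
```

A small RMSD after superposition indicates that the two point sets match up to a rigid motion.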
In internal molecule and reaction formats, the LZW algorithm with some modifications was used for compression.
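For readers unfamiliar with LZW, the following is a sketch of the textbook algorithm with an unbounded dictionary and integer output codes. It does not reproduce the modifications used in the internal formats, which are not described here; the function names are illustrative.

```python
def lzw_compress(data):
    """Compress bytes into a list of integer codes (textbook LZW)."""
    table = {bytes([i]): i for i in range(256)}
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate  # keep extending the known phrase
        else:
            codes.append(table[current])
            table[candidate] = len(table)  # learn the new phrase
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes

def lzw_decompress(codes):
    """Rebuild the original bytes from the list of codes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # Code refers to the phrase that is being defined right now
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)

data = b"C1=CC=CC=C1 C1=CC=CC=C1"
codes = lzw_compress(data)
restored = lzw_decompress(codes)
```

Repetitive input such as the repeated ring notation above yields fewer codes than input bytes, because each repeated phrase is replaced by a single dictionary code.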
For the exact maximum common substructure search, we developed a unique algorithm partially based on Thierry Hanser’s algorithm (which in turn has its roots in an algorithm by Coen Bron and Joep Kerbosch). For the approximate maximum common substructure search, the 2DOM algorithm with some modifications was implemented. A Russian paper describes both algorithms.
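The Bron-Kerbosch algorithm mentioned above enumerates the maximal cliques of a graph. Below is a minimal sketch of the standard variant with pivoting. It shows only the clique enumeration, not Hanser's algorithm or the search implemented in Indigo, and the graph representation (a dictionary mapping each vertex to its neighbor set) is an assumption made for the example.

```python
def bron_kerbosch(adj):
    """Yield all maximal cliques of an undirected graph.

    adj maps every vertex to the set of its neighbors.
    """
    def expand(r, p, x):
        # r: current clique, p: candidates, x: already processed vertices
        if not p and not x:
            yield frozenset(r)
            return
        # Pivot on the vertex with the most neighbors among the candidates
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())

# Triangle 0-1-2 with an extra edge 2-3
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
cliques = set(bron_kerbosch(graph))
```

Clique-based approaches to common substructure search typically run such an enumeration on a compatibility graph built from pairs of atoms or bonds of the two molecules.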
Oracle Interface¶
A C++ wrapper was implemented for the Oracle OCI library. One Russian article describes the pitfalls of handling Oracle LOBs in OCI, while another Russian article covers the performance of queries.
Tautomer Methods¶
InChI Code¶
The IUPAC International Chemical Identifier (InChI) is a textual identifier for chemical substances, designed to provide a standard and human-readable way to encode molecular information. Read more in The IUPAC Chemical Identifier – Technical Manual.
InChI code provides information about mobile H atoms, which allows us to enumerate all possible tautomers based on (1,3)-shifts for open-chain molecules and (1,n)-shifts (with n being an odd number >3) for ring systems. Please see Tautomer Identification and Tautomer Structure Generation Based on the InChI Code for details.
RSMARTS Rules¶
RSMARTS rules are another approach to enumerating tautomers. Currently the set of rules is taken from the article Tautomerism in large databases.
Cairo¶
Cairo 1.8.6 was used for rendering in Indigo. In Linux builds, a system-wide installation of libcairo must be present (the -I/usr/include/cairo flag is passed to the compiler and -lcairo is passed to the linker). In Mac OS X builds, a system-wide installation of libcairo must be present as well (the /opt/local/lib/libcairo.dylib path is passed to the linker). You can install it using MacPorts. The 32-bit and 64-bit Windows builds of Cairo are included in the source tree. They can be downloaded from the GTK and GNOME project sites, respectively. Binaries of libpng, zlib, Freetype, Fontconfig, and Expat, needed for Cairo, are there as well.
File Formats¶
The molecule and reaction format descriptions are available on the following sites: Daylight SMILES, ChemAxon SMILES extensions, and MDL (Symyx) Molfiles.
Web standards¶
More information about web standards used by Ketcher can be found here: SVG, VML.
Web frameworks¶
Ketcher includes two JavaScript libraries: Raphael for vector graphics and Prototype for overall code improvement.
Code Examples¶
The Chemistry Toolkit Rosetta wiki contains some examples of small utilities written using the Indigo C++ API.
Development Tools¶
CMake is used for building the projects. We use NetBeans on Linux workstations, Microsoft Visual Studio on Windows workstations, and Xcode on Mac OS X workstations.
Commercial availability¶
We have dual-licensed our code. If the GPL-licensed code does not fit your needs, please contact us to discuss the purchase of a commercial license. You may need the commercial license if you want to:
Receive ongoing support and maintenance of our code
Include our code in your proprietary software product
Contributors¶
Indigo Toolkit and Bingo¶
Core team: Aleksandr Savelev, Iurii Puzanov, Valerii Samoilov, Vladislav Karnaukhov
Java Lucene: Artem Malykh
KNIME Nodes: Anton Pikhtin
QA, Devops: Irina Tuzkova, Mikhail Kviatkovskii
Ketcher¶
Core team: Sergei Gelmetdinov, Nikolay Kuznetsov, Nikita Ryzhov
QA: Irina Tuzkova
Indigo ELN¶
Nikita Karuze, Vladislav Alekseev, Evgeniia Demianchuk
Parso¶
Petr Tsurinov, Igor Printsev
Lifesciences Portal¶
Design: Iuliia Nikolskaia, Daniil Nosenko, Alexander Telenkov
Page-proofs, UX and implementation: Ekaterina Leonteva, Egor Tarakanov, Nikita Ryzhov