Resources

Algorithms

Fingerprinting is a common technique for molecular screening. Daylight, Inc. introduced this technique, and it is described in the following article. Unlike Daylight fingerprints, which are built from bond paths, Bingo fingerprints are built from trees and rings. A Russian article describes the enumeration of subtrees, which in turn is based on the reverse search method of David Avis and Komei Fukuda.
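To illustrate the general idea of fingerprint screening, here is a minimal sketch of the path-based (Daylight-style) scheme, not of Bingo's tree-and-ring fingerprints. The molecule representation (a list of atom labels plus a list of index pairs for bonds), the 256-bit width, the maximum path length, and all function names are assumptions made for this example only.

```python
import zlib

FP_BITS = 256  # fingerprint width; an arbitrary choice for this sketch


def enumerate_paths(atoms, bonds, max_len=3):
    """Yield a label string for every simple atom path of up to max_len atoms."""
    adj = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        adj[a].append(b)
        adj[b].append(a)

    def walk(path):
        yield "-".join(atoms[i] for i in path)
        if len(path) < max_len:
            for nxt in adj[path[-1]]:
                if nxt not in path:
                    yield from walk(path + [nxt])

    for start in range(len(atoms)):
        yield from walk([start])


def fingerprint(atoms, bonds, max_len=3):
    """Hash every path label to one bit of an integer used as a bit vector."""
    fp = 0
    for label in enumerate_paths(atoms, bonds, max_len):
        # crc32 is used because it is stable across runs, unlike hash(str)
        fp |= 1 << (zlib.crc32(label.encode()) % FP_BITS)
    return fp


def may_contain(target_fp, query_fp):
    """Screening test: every bit set in the query must be set in the target."""
    return (target_fp & query_fp) == query_fp


# Hypothetical toy molecules: ethanol skeleton (C-C-O) and a C-O query
ethanol = fingerprint(["C", "C", "O"], [(0, 1), (1, 2)])
query = fingerprint(["C", "O"], [(0, 1)])
print(may_contain(ethanol, query))
```

A passing screen only means the target may contain the query: hash collisions can produce false positives, so candidates that pass still need an exact subgraph match. A failing screen rules the target out.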

For tautomer substructure search, Bingo incorporates a unique type of fingerprinting.

For subgraph matching, we developed an original algorithm. It is somewhat similar to an algorithm by Luigi Cordella et al.

For the molecule layout in Indigo and Bingo, we developed a unique algorithm, based on biconnected components extraction. It incorporates some ideas of Craig Shelley.

For the aromaticity matching and resonance search, we developed unique algorithms as well.

For the canonical SMILES, Brendan McKay’s nauty canonical labeling algorithm was re-implemented. Many of the original nauty’s features and optimizations were not included. There is also a description of the algorithm in Russian.

For the affine transformation matching, Wolfgang Kabsch’s algorithm was implemented.

In internal molecule and reaction formats, the LZW algorithm with some modifications was used for compression.
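For readers unfamiliar with LZW, the following is a sketch of the textbook algorithm over bytes. It does not reproduce the modifications mentioned above, which are not described here, and the function names are chosen for this example only.

```python
def lzw_compress(data):
    """Encode bytes as a list of integer codes using textbook LZW."""
    table = {bytes([i]): i for i in range(256)}  # start with all single bytes
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate  # keep extending the longest known match
        else:
            codes.append(table[current])
            table[candidate] = len(table)  # learn the new sequence
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the original bytes, re-creating the table on the fly."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # the code refers to the entry that is about to be defined
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)


sample = b"TOBEORNOTTOBEORTOBEORNOT"
packed = lzw_compress(sample)
print(len(sample), len(packed), lzw_decompress(packed) == sample)
```

The decoder never receives the table; it rebuilds the same entries in the same order as the encoder, which is why only the code sequence needs to be stored.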

For the exact maximum common substructure search, we developed a unique algorithm partially based on Thierry Hanser’s algorithm (which in turn has its roots in an algorithm by Coen Bron and Joep Kerbosch). For the approximate maximum common substructure search, the 2DOM algorithm with some modifications was implemented. A Russian paper describes both algorithms.
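As background for the Bron and Kerbosch reference, here is a sketch of the basic Bron-Kerbosch procedure (without pivoting) for enumerating maximal cliques. In clique-based formulations of maximum common substructure search, it would be run on a compatibility graph built from the two molecules; that construction, and the algorithm actually used in Bingo, are not shown. The adjacency format and function name are assumptions for this example.

```python
def bron_kerbosch(adj, r=None, p=None, x=None):
    """Yield every maximal clique of a graph given as {vertex: set_of_neighbours}.

    r: vertices in the clique being grown
    p: candidates that can still extend r
    x: vertices already explored, used to skip non-maximal cliques
    """
    if r is None:
        r, p, x = set(), set(adj), set()
    if not p and not x:
        yield frozenset(r)  # nothing can extend r, so it is maximal
        return
    for v in list(p):
        yield from bron_kerbosch(adj, r | {v}, p & adj[v], x & adj[v])
        p.remove(v)  # v is done: move it from candidates to excluded
        x.add(v)


# Hypothetical toy graph: a triangle 0-1-2 with a pendant edge 2-3
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
cliques = set(bron_kerbosch(graph))
print(sorted(sorted(c) for c in cliques))
```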

Oracle Interface

A C++ wrapper was implemented for the Oracle OCI library. One Russian article describes the pitfalls of handling Oracle LOBs in OCI, while another Russian article covers query performance.

Tautomer Methods

InChI Code

The IUPAC International Chemical Identifier (InChI) is a textual identifier for chemical substances, designed to provide a standard and human-readable way to encode molecular information. For more details, see The IUPAC Chemical Identifier – Technical Manual.

The InChI code provides information about mobile H atoms, which allows us to enumerate all possible tautomers based on (1,3)-shifts for open-chain molecules and (1,n)-shifts (with n being an odd number >3) for ring systems. Please see Tautomer Identification and Tautomer Structure Generation Based on the InChI Code for details.

RSMARTS Rules

RSMARTS rules are another approach to enumerating tautomers. Currently the set of rules is taken from the article Tautomerism in large databases.

Cairo

Cairo 1.8.6 was used for rendering in Indigo. In Linux builds, a system-wide installation of libcairo must be present (the -I/usr/include/cairo flag is passed to the compiler and -lcairo is passed to the linker). In Mac OS X builds, a system-wide installation of libcairo must be present as well (the /opt/local/lib/libcairo.dylib path is passed to the linker). You can install it using MacPorts. The 32-bit and 64-bit Windows builds of Cairo are included in the source tree. They can be downloaded from the GTK and GNOME project sites, respectively. Binaries of libpng, zlib, Freetype, Fontconfig, and Expat, which Cairo needs, are included there as well.

File Formats

The molecule and reaction format descriptions are available on the following sites: Daylight SMILES, ChemAxon SMILES extensions, and MDL (Symyx) Molfiles.

Web standards

More information about web standards used by Ketcher can be found here: SVG, VML.

Web frameworks

Ketcher includes two JavaScript libraries: Raphael for vector graphics and Prototype for overall code improvement.

Code Examples

The Chemistry Toolkit Rosetta wiki contains some examples of small utilities written using the Indigo C++ API.

Development Tools

CMake is used to build the projects. We use NetBeans on Linux workstations, Microsoft Visual Studio on Windows workstations, and Xcode on Mac OS X workstations.

Commercial availability

We have dual-licensed our code. If the GPL-licensed code does not fit your needs, please contact us to discuss the purchase of a commercial license. You may need the commercial license if you want to:

  • Receive ongoing support and maintenance of our code

  • Include our code in your proprietary software product

Contributors

Indigo Toolkit and Bingo

  • Core team: Aleksandr Savelev, Iurii Puzanov, Valerii Samoilov, Vladislav Karnaukhov

  • Java Lucene: Artem Malykh

  • KNIME Nodes: Anton Pikhtin

  • QA, DevOps: Irina Tuzkova, Mikhail Kviatkovskii

Ketcher

  • Core team: Sergei Gelmetdinov, Nikolay Kuznetsov, Nikita Ryzhov

  • QA: Irina Tuzkova

Indigo ELN

Nikita Karuze, Vladislav Alekseev, Evgeniia Demianchuk

Parso

Petr Tsurinov, Igor Printsev

Lifesciences Portal

  • Design: Iuliia Nikolskaia, Daniil Nosenko, Alexander Telenkov

  • Page-proofs, UX and implementation: Ekaterina Leonteva, Egor Tarakanov, Nikita Ryzhov