Indigo 1.2.2beta

18 November 2015

Summary

New features and improvements:

  • New layout algorithm: SMART layout mode was added (details)
  • Rich SGroup support was implemented (details)
  • TGroups and SCSR transformations were added (details)
  • Tautomer enumeration was implemented (details)
  • Ionize and pKa calculation methods were added (details)
  • New Maven repository deployment (details)
  • Simple implementation of CIP descriptors (details)

Bugfixes:

  • SGroups related bugs were fixed
  • Layout bugfixes
  • Bug with dearomatization was fixed
  • Bug with empty CDX objects was fixed
  • SDF ordering issue was fixed
  • Other small bugfixes

Details

Layout enhancement

A new, original algorithm was implemented to improve the layout of difficult cyclic structures. A new SMART option was added; by switching it on and off, one can compare the two layout procedures.

Please see the examples for more details.
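
As a rough illustration of comparing the two layout procedures, the sketch below lays out the same structure with the option off and on. It assumes the Indigo Python bindings and assumes the option is exposed under the name "smart-layout"; the helper name layout_both_ways is hypothetical. The import is guarded so the helper can be read and exercised without the library installed.

```python
try:
    from indigo import Indigo  # Indigo Python bindings, if installed
except ImportError:
    Indigo = None


def layout_both_ways(indigo, smiles):
    """Lay out one structure with the default and the SMART procedure.

    Returns a dict with the Molfile text produced by each procedure.
    The option name "smart-layout" is an assumption, not taken from
    these release notes.
    """
    results = {}
    for label, smart in (("default", False), ("smart", True)):
        indigo.setOption("smart-layout", smart)
        molecule = indigo.loadMolecule(smiles)
        molecule.layout()  # computes 2D coordinates in place
        results[label] = molecule.molfile()
    return results


if __name__ == "__main__" and Indigo is not None:
    # Cubane: a small polycyclic cage used here only as a demo input.
    for label, molfile in layout_both_ways(Indigo(), "C12C3C4C1C5C2C3C45").items():
        print(label)
        print(molfile)
```

Rendering or diffing the two Molfiles is then enough to see where the procedures disagree.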

SGroups

There are many kinds of S-groups; Indigo now supports all of those described in the format:

  • generic SGroup (GEN)
  • abbreviation (superatom) (SUP)
  • structure repeating unit (SRU)
  • multiple SGroup (MUL)
  • data SGroup (DAT)
  • monomer SGroup (MON)
  • mer SGroup (MER)
  • copolymer SGroup (COP)
  • crosslink SGroup (CRO)
  • modification SGroup (MOD)
  • graft SGroup (GRA)
  • component SGroup (COM)
  • mixture SGroup (MIX)
  • formulation SGroup (FOR)
  • any polymer SGroup (ANY)

Please see the new Indigo API methods.

TGroups

Indigo supports the hybrid representation (SCSR) for a molecule loaded from a V3000 Molfile. SCSR uses TEMPLATE blocks to represent residues, and this representation is widely used for biological sequences.

There are methods for transforming SCSR into the full CTAB form and vice versa:

  • transformSCSRtoCTAB - transforms SCSR into the full CTAB representation (templates are transformed into S-groups)
  • transformCTABtoSCSR - transforms CTAB into SCSR (accepts a collection of templates and replaces matched fragments with pseudoatoms and the corresponding templates)

Examples of using these methods are in the corresponding Examples section.
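
A minimal sketch of the SCSR to CTAB direction is shown below, assuming the Indigo Python bindings and assuming that transformSCSRtoCTAB modifies the molecule in place. The helper name expand_scsr and the file name peptide_scsr.mol are hypothetical.

```python
import os

try:
    from indigo import Indigo  # Indigo Python bindings, if installed
except ImportError:
    Indigo = None

# Hypothetical input: a V3000 Molfile containing TEMPLATE blocks.
SCSR_FILE = "peptide_scsr.mol"


def expand_scsr(indigo, path):
    """Load an SCSR Molfile and return its full CTAB form as Molfile text."""
    molecule = indigo.loadMoleculeFromFile(path)
    # Assumed to work in place: templates become S-groups.
    molecule.transformSCSRtoCTAB()
    return molecule.molfile()


if __name__ == "__main__" and Indigo is not None and os.path.exists(SCSR_FILE):
    print(expand_scsr(Indigo(), SCSR_FILE))
```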

Enumeration of Tautomers

Indigo provides a method to enumerate tautomers of a selected molecule. Currently there are two algorithms to enumerate tautomers: based on InChI code and based on a set of reaction SMARTS rules.

The iterateTautomers method returns an iterator over tautomers. It accepts a molecule and options as parameters. There are two possible options: INCHI, to use the method based on InChI code, and RSMARTS, to use reaction SMARTS templates.
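
The call pattern can be sketched as follows, assuming the Indigo Python bindings. The helper name tautomer_smiles is hypothetical, and cloning each iterator item before use is an assumption about how the iterator items should be handled.

```python
try:
    from indigo import Indigo  # Indigo Python bindings, if installed
except ImportError:
    Indigo = None


def tautomer_smiles(indigo, smiles, method="INCHI"):
    """Return sorted, de-duplicated canonical SMILES of the tautomers.

    method is "INCHI" or "RSMARTS", the two options named above.
    """
    if method not in ("INCHI", "RSMARTS"):
        raise ValueError("method must be 'INCHI' or 'RSMARTS'")
    molecule = indigo.loadMolecule(smiles)
    found = set()
    for item in indigo.iterateTautomers(molecule, method):
        # Clone the iterator item to get a standalone molecule object.
        found.add(item.clone().canonicalSmiles())
    return sorted(found)


if __name__ == "__main__" and Indigo is not None:
    # 2-hydroxypyridine, a textbook tautomerism example used as demo input.
    for s in tautomer_smiles(Indigo(), "OC1=CC=CC=N1", "INCHI"):
        print(s)
```

Running the helper once with each option on the same input is a quick way to see how the two algorithms differ.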

Please see the API description or the Enumeration of Tautomers for detailed examples.

Ionize and pKa calculations

The new IndigoObject.ionize method can be used to build the protonated/deprotonated form of a molecule in accordance with a given pH and pH tolerance. The pKa model used for the estimation can be selected with the corresponding options (see the Options section).

The IndigoObject.getAcidPkaValue and IndigoObject.getBasicPkaValue methods can be used to estimate pKa values for individual atoms in a molecule. The pKa model used for the estimation can be selected with the corresponding options (see the Options section).

The IndigoObject.buildPkaModel method is used to build a pKa model based on a custom set of structures.

See the API methods for some examples.
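
A minimal sketch of calling ionize over several pH values is given below. It assumes the Indigo Python bindings, assumes the argument order (pH, pH tolerance), and assumes that ionize modifies the molecule in place; the helper name ionization_profile is hypothetical.

```python
try:
    from indigo import Indigo  # Indigo Python bindings, if installed
except ImportError:
    Indigo = None


def ionization_profile(indigo, smiles, ph_values, ph_tolerance):
    """Map each pH to the canonical SMILES of the ionized form."""
    profile = {}
    for ph in ph_values:
        # Reload for every pH so each result starts from the input form.
        molecule = indigo.loadMolecule(smiles)
        molecule.ionize(ph, ph_tolerance)  # assumed to act in place
        profile[ph] = molecule.canonicalSmiles()
    return profile


if __name__ == "__main__" and Indigo is not None:
    # Glycine as demo input: one acidic and one basic site.
    for ph, form in ionization_profile(Indigo(), "NCC(O)=O", [2.0, 7.0, 12.0], 0.1).items():
        print(ph, form)
```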

Maven Central Repository

All the Indigo Java packages are uploaded to The Central Repository.

GroupId            ArtifactId
com.epam.indigo    indigo
com.epam.indigo    indigo-inchi
com.epam.indigo    indigo-renderer
com.epam.indigo    bingo-nosql

Just add a dependency to your Maven project to download the Indigo Java API automatically:

<dependencies>
    ...
    <dependency>
        <groupId>com.epam.indigo</groupId>
        <artifactId>indigo</artifactId>
        <version>1.2.2beta-r37</version>
    </dependency>
    ...
</dependencies>

Please note: all Java packages were changed to use the com.epam package.

CIP Stereo Descriptors

Indigo provides calculation of CIP stereo descriptors. These calculations correspond to the latest chemical nomenclature requirements (Nomenclature of Organic Chemistry - IUPAC Recommendations and Preferred Names (2013)). The current implementation includes some simplifications and supports only R/S and E/Z descriptors.

Please see the CIP Descriptors for detailed examples.