This report compares the Imago OCR application with other applications for optical recognition of molecule images.
For each image, the testing framework measures the execution time and the similarity score against a reference
molecule file in Molfile format. The Indigo toolkit is used to measure molecule similarity.
Because different applications produce output in different ways, the testing framework applies the following rules to standardize molecules:
Hydrogens are folded.
If the output contains multiple molecules in SDF format, all of them are merged into a single molecule with several disconnected fragments.
Both aromatized and dearomatized structures are compared, and the best score is selected.
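The rules above can be sketched roughly as follows. This is an illustration only, not the framework's actual code: the function name standardized_score and all of its callable parameters are hypothetical, the order in which the rules are applied is an assumption, and so is the choice to fold hydrogens on the reference molecule as well as on the output.

```python
from typing import Callable, Sequence, TypeVar

M = TypeVar("M")  # whatever molecule type the chemistry toolkit provides


def standardized_score(
    outputs: Sequence[M],
    reference: M,
    *,
    merge: Callable[[Sequence[M]], M],
    fold_hydrogens: Callable[[M], M],
    aromatize: Callable[[M], M],
    dearomatize: Callable[[M], M],
    similarity: Callable[[M, M], float],
) -> float:
    """Score recognized molecule(s) against a reference after standardization.

    All callables are hypothetical hooks supplied by the caller; each one
    is expected to return a molecule rather than modify its argument.
    """
    if not outputs:
        raise ValueError("no output molecules to score")

    # Multiple output molecules are merged into one molecule
    # with several disconnected fragments.
    candidate = merge(outputs) if len(outputs) > 1 else outputs[0]

    # Hydrogens are folded (assumed here to apply to both sides).
    candidate = fold_hydrogens(candidate)
    reference = fold_hydrogens(reference)

    # Compare in aromatized and dearomatized form and keep the best score.
    return max(
        similarity(form(candidate), form(reference))
        for form in (aromatize, dearomatize)
    )
```

With the Indigo Python bindings, these hooks could plausibly be thin wrappers around operations such as foldHydrogens, aromatize, dearomatize, merge, and similarity, but that mapping has not been checked against the testing framework itself.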
Compared versions:
$VERSIONS$
Test sets
$TOC$
Summary
$SUMMARY$
Note: each number links to the corresponding histogram.