Volume 3 Supplement 1

4th German Conference on Chemoinformatics: 22. CIC-Workshop

Open Access

Chemical complexity mapping in QSAR models

  • T Thalheim (1),
  • R-U Ebert (1),
  • R Kühne (1) and
  • G Schüürmann (1)
Chemistry Central Journal 2009, 3(Suppl 1):P40

https://doi.org/10.1186/1752-153X-3-S1-P40

Published: 05 June 2009

QSAR models relate features of chemical structures to target properties or effects. High-quality models are expected to be built on validated data sets. Typically, the target data are validated in terms of accuracy and reliability. A chemical structure is assigned to each data item, and for models based on 3D geometry a more or less sophisticated geometry optimisation is performed. However, less attention is usually paid to the proper representation of the chemical identities themselves before they enter the model training set. Reported chemical names, and even registry numbers, often correspond to ambiguous chemical structures. Sources of ambiguity include isomerism, mesomerism, and tautomerism; in addition, measured data may refer to generic compound specifications or to mixtures of defined or even undefined composition.

Within the framework of the EU projects OSIRIS and 2-FUN, a database concept is introduced to reflect these aspects of chemical complexity. One of the goals of this development is to provide a tool for obtaining representative data sets for QSAR developments, taking into account the chemical complexity in an appropriate manner.

The importance of this approach is demonstrated by example calculations that show how uncertainties arising from ambiguous chemical structures affect the output of QSAR models. This study is supported by the EU projects OSIRIS (contract No. 037017) and 2-FUN (contract No. 036976).
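To illustrate the kind of example calculation referred to above, the following is a minimal sketch and not the authors' implementation. It assumes a toy linear QSAR model and a single data item whose reported identity could correspond to two candidate structures (for instance two tautomers). The class `LinearQsar`, the function `prediction_spread`, the candidate labels, and all numerical values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class LinearQsar:
    """Toy linear model: prediction = intercept + sum(coef_i * descriptor_i)."""
    intercept: float
    coefficients: Tuple[float, ...]

    def predict(self, descriptors: Tuple[float, ...]) -> float:
        if len(descriptors) != len(self.coefficients):
            raise ValueError("descriptor vector has the wrong length")
        return self.intercept + sum(
            c * d for c, d in zip(self.coefficients, descriptors)
        )


def prediction_spread(
    model: LinearQsar,
    candidates: Dict[str, Tuple[float, ...]],
) -> Tuple[Dict[str, float], float]:
    """Predict the target for every candidate structure of one ambiguous
    data item and return the predictions with their max-min spread."""
    if not candidates:
        raise ValueError("at least one candidate structure is required")
    predictions = {
        label: model.predict(descriptors)
        for label, descriptors in candidates.items()
    }
    spread = max(predictions.values()) - min(predictions.values())
    return predictions, spread


# Hypothetical model and descriptor values (illustration only, not real data).
model = LinearQsar(intercept=0.5, coefficients=(1.0, -0.25))
candidates = {
    "candidate_A": (2.0, 4.0),  # e.g. one tautomer of the reported compound
    "candidate_B": (3.0, 2.0),  # e.g. the other tautomer
}

predictions, spread = prediction_spread(model, candidates)
print(predictions)  # {'candidate_A': 1.5, 'candidate_B': 3.0}
print(spread)       # 1.5
```

In this sketch the spread between candidate predictions serves as a simple indicator of how much the structural ambiguity of one entry could shift the model output. A real study would compute descriptors from the actual candidate structures and use the trained model in question.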

Authors’ Affiliations

(1) Department of Ecological Chemistry, Helmholtz-Centre for Environmental Research

Copyright

© Thalheim et al; licensee BioMed Central Ltd. 2009

This article is published under license to BioMed Central Ltd.