Creating chemo- & bioinformatics workflows, further developments within the CDK-Taverna Project
© Kuhn et al. 2008
Published: 26 March 2008
The CDK-Taverna project aims to build an open-source pipelining solution by combining several open-source projects: Taverna, the Chemistry Development Kit (CDK), and Bioclipse.
Pipelining or workflow tools allow the Lego™-like, graphical assembly of I/O modules and algorithms into a complex workflow that can be easily deployed, modified, and tested without the hassle of implementing it as a monolithic application.
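The idea of assembling a workflow from interchangeable modules can be illustrated with a minimal sketch. It is not Taverna or CDK-Taverna code: the names `Step`, `strip_blank_records`, `deduplicate`, and `run_workflow` are hypothetical, and each module is simply assumed to be a function that takes a list of records and returns a new list.

```python
from typing import Callable, List, Sequence

# Assumption: every workflow module maps a list of records to a new list.
Step = Callable[[List[str]], List[str]]


def strip_blank_records(records: List[str]) -> List[str]:
    """Input module: trim whitespace and drop empty records."""
    return [r.strip() for r in records if r.strip()]


def deduplicate(records: List[str]) -> List[str]:
    """Algorithm module: remove repeated records, keeping first-seen order."""
    return list(dict.fromkeys(records))


def run_workflow(steps: Sequence[Step], records: List[str]) -> List[str]:
    """Feed the output of each module into the next one, in order."""
    for step in steps:
        records = step(records)
    return records


if __name__ == "__main__":
    raw = ["CCO", "", " CCO ", "c1ccccc1"]
    print(run_workflow([strip_blank_records, deduplicate], raw))
```

Because each module shares the same signature, a step can be added, removed, or reordered by editing the list passed to `run_workflow`, without touching the other modules.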
Current development in CDK-Taverna focuses on a soft-computing framework that allows flexible use of different methods from, for example, the WEKA library. Within this framework, properties of chemical substances can be calculated using descriptors from the QSAR/QSPR package of the CDK.
In addition, a reaction enumeration algorithm for combinatorial chemistry, based on existing methods of the CDK, is being developed. The algorithm enumerates a reaction, provided that the reactants and products are given as “Markush” structures.
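The combinatorial nature of enumerating a Markush structure can be sketched as follows. This is only an illustration and not the CDK-based algorithm described above: it works by plain string substitution on SMILES-like templates, performs no valence or chemistry checks, and the function name `enumerate_markush` and the `{R1}`-style placeholder syntax are assumptions made for this example.

```python
from itertools import product
from typing import Dict, Iterator, List


def enumerate_markush(template: str,
                      substituents: Dict[str, List[str]]) -> Iterator[str]:
    """Yield one structure string per combination of R-group fragments.

    `template` marks variable positions with placeholders such as {R1};
    `substituents` maps each placeholder label to its candidate fragments.
    """
    labels = sorted(substituents)  # fixed label order gives reproducible output
    fragment_lists = [substituents[label] for label in labels]
    for combination in product(*fragment_lists):
        yield template.format(**dict(zip(labels, combination)))


if __name__ == "__main__":
    # Hypothetical ester template with two variable positions.
    ester = "{R1}C(=O)O{R2}"
    groups = {"R1": ["C", "CC"], "R2": ["C", "c1ccccc1"]}
    for structure in enumerate_markush(ester, groups):
        print(structure)
```

The number of enumerated structures is the product of the sizes of the substituent lists, which is why such libraries grow quickly as more variable positions or fragments are added.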
- Oinn T, Addis M, Ferris M, Marvin D, Senger M, Greenwood M, Carver T, Glover K, Pocock M, Wipat A, Li P: Taverna: A tool for the composition and enactment of bioinformatics workflows. Bioinformatics. 2004, 20 (17): 3045-3054. 10.1093/bioinformatics/bth361.
- Steinbeck C, Han YQ, Kuhn S, Horlacher O, Luttmann E, Willighagen E: The Chemistry Development Kit (CDK): An open-source Java library for chemo- and bioinformatics. J Chem Inf Comput Sci. 2003, 43: 493-500. 10.1021/ci025584y.
- Spjuth O, Helmus T, Willighagen EL, Kuhn S, Eklund V, et al: An open rich client workbench for chemo- and bioinformatics. Submitted.
- Witten IH, Frank E: Data Mining: Practical machine learning tools and techniques. 2005, Morgan Kaufmann, San Francisco, 2nd Edition.
- Hassan M, Brown RB, Varma-O'Brien, Rogers D: Cheminformatics analysis and learning in a data pipelining environment. Molecular Diversity. 2006, 10: 283-299. 10.1007/s11030-006-9041-5.