MUVing SAFEly: Scale for Assessing Figures of Effectiveness (SAFE) in virtual screening using Maximum Unbiased Validation (MUV) datasets
© Rohrer and Baumann; licensee BioMed Central Ltd. 2009
Published: 05 June 2009
Refined nearest neighbour analysis was recently introduced to analyse the bias that benchmark datasets bring into the validation of virtual screening techniques [1]. It is based on spatial statistics [2] and provides a mathematical framework for the nonparametric analysis of mapped point patterns.
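As a rough illustration of the kind of function nearest neighbour analysis works with, the sketch below computes two standard empirical quantities from spatial statistics: a nearest neighbour function G(t) among the actives and an empty space function F(t) from the decoys to the actives. It is a minimal sketch under stated assumptions, not the authors' implementation: the two-dimensional toy coordinates, the Euclidean metric and the names `g_function` and `f_function` are all chosen here for illustration only.

```python
import math


def nearest_distance(point, others):
    """Euclidean distance from `point` to the closest point in `others`."""
    return min(math.dist(point, other) for other in others)


def g_function(actives, t):
    """Empirical G(t): share of actives whose nearest other active lies within t."""
    hits = 0
    for i, active in enumerate(actives):
        rest = actives[:i] + actives[i + 1:]  # leave the point itself out
        if nearest_distance(active, rest) <= t:
            hits += 1
    return hits / len(actives)


def f_function(decoys, actives, t):
    """Empirical F(t): share of decoys whose nearest active lies within t."""
    hits = sum(1 for decoy in decoys if nearest_distance(decoy, actives) <= t)
    return hits / len(decoys)


# Toy descriptor-space coordinates (assumed, purely illustrative).
actives = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
decoys = [(0.0, 0.5), (3.0, 3.0), (10.0, 10.0), (5.0, 4.0)]

for t in (1.0, 3.0, 7.5):
    print(t, g_function(actives, t), f_function(decoys, actives, t))
```

Comparing the two curves over a range of t gives a nonparametric picture of how the actives are distributed relative to one another and relative to the decoys. In a real setting the points would be compounds in some descriptor space, and the choice of descriptors and metric would need to follow the cited method.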
Here, the application of refined nearest neighbour analysis to the design of benchmark datasets for virtual screening based on PubChem bioactivity data is presented. Datasets of compounds active against pharmaceutically relevant targets were purged of unselective hits by a data-centred workflow. Topological optimization using experimental design strategies, monitored by refined nearest neighbour analysis functions, was applied to generate corresponding datasets of actives and decoys that are as unbiased as possible with regard to analogue bias and artificial enrichment.
These datasets provide a tool for Maximum Unbiased Validation (MUV) of virtual screening methods. The datasets and a software package implementing the MUV design workflow are freely available at http://www.pharmchem.tu-bs.de/lehre/baumann/MUV.html.
Using a large number of designed sub-samples from established virtual screening (VS) benchmark datasets, a general regression model was developed that allows the prediction of the estimated performance of VS methods. Based on this model and the MUV datasets, SAFE (Scale for Assessing Figures of Effectiveness), an open, standardized platform for the benchmarking of virtual screening methods, was developed. SAFE will allow developers to benchmark novel methods against established ones and will help researchers select the best VS tool for prospective virtual screening projects.
1. Rohrer SG, Baumann K: Impact of Benchmark Data Set Topology on the Validation of Virtual Screening Methods: Exploration and Quantification by Spatial Statistics. J Chem Inf Model. 2008, 48: 704-718. 10.1021/ci700099u.
2. Fortin MJ, Dale MRT: Spatial Analysis: A Guide for Ecologists. 2005, Cambridge University Press: Cambridge, UK.
This article is published under license to BioMed Central Ltd.