Datasets used for the Evaluation of the Applicability Domain
This page provides access to the datasets used in the article
Nikolas Fechner, Andreas Jahn, Georg Hinselmann and Andreas Zell. Estimation of the Applicability Domain of Kernel-based Machine Learning Models for Virtual Screening. Journal of Cheminformatics, 2010, 2:2. DOI:10.1186/1758-2946-2-2
The SVMs were trained on the following SD files. Some of the files had to be preprocessed, so they differ from the versions provided by cheminformatics.org. To allow a fair comparison of future research with this paper, we provide exactly the SD files used in the article.
We are very grateful to the original authors of the datasets for granting us permission to make these files publicly available. Feel free to use these data for your own research, but please cite the original publications as well.
Training Sets
Factor Xa
Training set as SD file consisting of 290 benzamidine-type molecules annotated with pKi values in the ki SD tag. The compounds were originally published by
Fabien Fontaine, Manuel Pastor, Ismael Zamora and Ferran Sanz. Anchor-GRIND: Filling the Gap between Standard 3D QSAR and the GRid-INdependent Descriptors. Journal of Medicinal Chemistry, 2005, 48(7), pp. 2687-2694. DOI:10.1021/jm049113+
PDGFR β
Training set as SD file consisting of 79 piperazinylquinazoline analogues annotated with pIC50 values in the Activity SD tag. The compounds were originally published by
Rajarshi Guha and Peter C. Jurs. Development of Linear, Ensemble and Nonlinear Models for the Prediction and Interpretation of the Biological Activity of a Set of PDGFR Inhibitors. Journal of Chemical Information and Computer Sciences, 2004, 44(6), pp. 2179-2189. DOI:10.1021/ci049849f
Thrombin
Training set as SD file consisting of 88 benzamidine-type molecules annotated with pKi values in the Activity SD tag. The compounds were originally published by
Markus Böhm, Jörg Stürzebecher and Gerhard Klebe. Three-Dimensional Quantitative Structure-Activity Relationship Analyses Using Comparative Molecular Field Analysis and Comparative Molecular Similarity Indices Analysis To Elucidate Selectivity Differences of Inhibitors Binding to Trypsin, Thrombin, and Factor Xa. Journal of Medicinal Chemistry, 1999, 42(3), pp. 458-477. DOI:10.1021/jm981062r
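The activity annotations described above are stored as SD data items (the ki tag for Factor Xa, the Activity tag for the other two sets). As a minimal sketch of how such values could be extracted, the following Python code reads the data items of each record using only the standard library. It assumes the files follow the usual SD layout (a molblock ending in "M  END", data headers of the form "> <tag>", records terminated by "$$$$"); the function names and the file name in the usage comment are hypothetical and not part of the original datasets or article.

```python
import re
from typing import Dict, Iterator, List, Optional

# A data header looks like "> <ki>" or ">  <Activity> (12)".
TAG_HEADER = re.compile(r"^>.*?<([^<>]+)>")


def iter_sd_records(text: str) -> Iterator[Dict[str, str]]:
    """Yield one {tag: value} dict per "$$$$"-terminated SD record."""
    tags: Dict[str, List[str]] = {}
    current: Optional[str] = None
    in_data = False  # True once the molblock ("M  END") has been passed
    for line in text.splitlines():
        if line.strip() == "$$$$":
            yield {name: "\n".join(lines) for name, lines in tags.items()}
            tags, current, in_data = {}, None, False
            continue
        if not in_data:
            if line.startswith("M  END"):
                in_data = True
            continue
        match = TAG_HEADER.match(line)
        if match:
            current = match.group(1)
            tags[current] = []
        elif current is not None:
            if line.strip() == "":
                current = None  # a blank line closes the data item
            else:
                tags[current].append(line)


def read_activities(text: str, tag: str) -> List[float]:
    """Return the numeric value of `tag` for every record that carries it."""
    values = []
    for record in iter_sd_records(text):
        raw = record.get(tag, "").strip()
        if raw:
            values.append(float(raw))
    return values


# Usage (hypothetical file name):
#   with open("factor_xa.sdf", encoding="utf-8") as handle:
#       pki_values = read_activities(handle.read(), "ki")
```

A cheminformatics toolkit would normally be the better choice when the molecular structures are needed as well; this sketch only reads the annotations, for example to check that the number of records matches the counts stated above.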
Screening Sets
The QSAR models were used to rank the corresponding data sets from DUD LIB VS 1.0. This compilation was originally published by
Andreas Jahn, Georg Hinselmann, Nikolas Fechner and Andreas Zell. Optimal Assignment Methods for Ligand-based Virtual Screening. Journal of Cheminformatics, 2009, 1(14). DOI:10.1186/1758-2946-1-14
Contact:
Nikolas Fechner, Andreas Jahn