Kernel-Based Multi-Target Drug Design

We use kernel-based machine learning methods, such as the well-known support vector machine (SVM) [1], to develop models for virtual screening and quantitative structure-activity relationship (QSAR) analysis that assess the suitability of chemical compounds as potential new drugs. Multi-target drugs are increasingly considered beneficial for complex diseases such as cancer. We therefore extend single-target modeling to the multi-target setting, which requires dedicated multi-target algorithms. Several multi-task algorithms can exploit the similarity between targets to transfer knowledge from a well-studied domain to a similar, less-studied domain [2].

During lead optimization, support vector regression (SVR) can be used to model the specific affinity of each molecule. Developing a multi-target agent requires taking the affinity profile against several targets into account. If the targets of a multi-target drug are sufficiently similar, knowledge can be transferred between target-specific QSAR models to improve model accuracy [3], so that the resulting QSAR model generalizes better to unseen data. An algorithm-specific regularizer enforces similarity between the target-specific (task-specific) models. The relatedness between the targets can be derived from a given taxonomy, e.g. the human kinome [4].
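As a rough illustration of how task relatedness might be derived from a taxonomy, the following sketch computes a pairwise similarity from the tree distance between two targets. The toy tree, the node names, and the mapping 1 / (1 + distance) are all assumptions made for this example; they are not taken from [3] or [4].

```python
# Hypothetical toy taxonomy, stored as child -> parent (the root has parent None).
PARENT = {
    "root": None,
    "group_1": "root",
    "group_2": "root",
    "target_a": "group_1",
    "target_b": "group_1",
    "target_c": "group_2",
}


def path_to_root(node):
    """Return the list of nodes from `node` up to and including the root."""
    path = [node]
    while PARENT[node] is not None:
        node = PARENT[node]
        path.append(node)
    return path


def tree_distance(a, b):
    """Number of edges between two nodes via their lowest common ancestor."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_b = set(path_b)
    for steps_a, node in enumerate(path_a):
        if node in ancestors_b:
            return steps_a + path_b.index(node)
    raise ValueError("nodes are not in the same tree")


def similarity(a, b):
    """Assumed mapping from tree distance to a similarity in (0, 1]."""
    return 1.0 / (1.0 + tree_distance(a, b))


# Similarity matrix over the leaf targets, usable as task-relatedness weights.
targets = ["target_a", "target_b", "target_c"]
sim_matrix = [[similarity(s, t) for t in targets] for s in targets]
```

Targets that share a parent node receive a higher similarity than targets that only meet at the root, which is the kind of information a multi-task regularizer can use.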

Figure 1: The top-down domain adaptation multi-task (TDMT) SVR successively trains more specific models along a supplied taxonomy (left). The graph-regularized multi-task (GRMT) SVR assumes the tasks to be pairwise related with a given similarity and trains all tasks in one step (right) [3].



Multi-task learning can increase performance compared to training separate models, provided the tasks are sufficiently similar. It is most beneficial when knowledge can be transferred from a similar task with a lot of in-domain knowledge to a task with little in-domain knowledge. Furthermore, the benefit increases as the overlap between the chemical spaces spanned by the tasks' training compounds decreases. Thus, multi-target algorithms can compensate for a small number of compounds for a specific problem by transferring knowledge from a similar problem.
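The effect of transferring knowledge to a data-poor task can be sketched with a small toy example. This is not the SVR formulation of [3]: to stay short and dependency-free, it assumes linear models, replaces the epsilon-insensitive SVR loss with a squared loss, and uses plain gradient descent. The coupling term is an assumed pairwise penalty of the form 0.5 * sum over s,t of A[s][t] * ||w_s - w_t||^2, and the function name and data are hypothetical.

```python
def train_multitask(tasks, similarity, dim, lr=0.01, epochs=2000):
    """Gradient descent on squared loss plus a pairwise graph penalty.

    tasks: list of (X, y) pairs, one per task; X is a list of feature lists.
    similarity: matrix A of task similarities (an all-zero A decouples tasks).
    Squared loss stands in for the SVR loss; this is an assumption of the sketch.
    """
    weights = [[0.0] * dim for _ in tasks]
    for _ in range(epochs):
        grads = [[0.0] * dim for _ in tasks]
        for t, (X, y) in enumerate(tasks):
            # Data term: gradient of 0.5 * sum_i (w_t . x_i - y_i)^2
            for x_i, y_i in zip(X, y):
                err = sum(w * x for w, x in zip(weights[t], x_i)) - y_i
                for d in range(dim):
                    grads[t][d] += err * x_i[d]
            # Coupling term: gradient of the pairwise penalty w.r.t. w_t
            for s in range(len(tasks)):
                a = similarity[t][s] + similarity[s][t]
                for d in range(dim):
                    grads[t][d] += a * (weights[t][d] - weights[s][d])
        # Update all tasks simultaneously from the gradients computed above.
        for t in range(len(tasks)):
            for d in range(dim):
                weights[t][d] -= lr * grads[t][d]
    return weights


# Toy data: both tasks follow the same true weights [2, -1].
# Task 1 has four training points, task 2 has only one.
tasks = [
    ([[1, 0], [0, 1], [1, 1], [2, 1]], [2, -1, 1, 3]),
    ([[1, 1]], [1]),
]
alone = train_multitask(tasks, [[0, 0], [0, 0]], dim=2)    # separate models
coupled = train_multitask(tasks, [[0, 1], [1, 0]], dim=2)  # related tasks
print("task 2 alone:  ", alone[1])
print("task 2 coupled:", coupled[1])
```

In this toy setting, the single training point of task 2 cannot identify both weights, so the separately trained model settles on the minimum-norm fit near [0.5, 0.5]. With the coupling term, the model for task 2 is pulled toward the model of the data-rich task 1 and ends up close to the true weights.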

Figure 2: The regularizer J(w1, ..., wT) forces the model of task 1 (w1) to be more similar to the model of task 2 (w2) [3].


References

  • [1] Cortes C, and Vapnik V. Support-Vector Networks. Machine Learning, 20:273-297, 1995.
  • [2] Widmer C, and Rätsch G. Multitask Learning in Computational Biology. JMLR W&CP, 27:207-216, 2012.
  • [3] Rosenbaum L, Dörr A, Bauer M, Boeckler FM, and Zell A. Inferring multi-target QSAR models with taxonomy-based multi-task learning. J. Cheminf., 5:33, 2013.
  • [4] Manning G, Whyte DB, Martinez R, Hunter T, and Sudarsanam S. The Protein Kinase Complement of the Human Genome. Science, 298(5600):1912-1934, 2002.

Contact

Alexander Dörr, Tel.: (07071) 29-77174, alexander.doerr (at) uni-tuebingen.de