Virtual screening is the computational, or in silico, screening of biological compounds, and it complements the high-throughput screening (HTS) process. It is used to aid the selection of compounds for screening in HTS bioassays or for inclusion in a compound-screening library.

Virtual screening can use several computational techniques, depending on the amount and type of information available about the compounds and the target. Protein-based methods are employed when the 3D structure of the bioassay target is known; they involve the docking (virtual binding), and subsequent scoring, of candidate ligands (the part of the compound that is capable of binding) to the protein target.

Ligand-based approaches are usually used when there are compounds known to be active or inactive for a specific target. If a few active compounds are known, structure-similarity techniques may be used; if the activity of several compounds is known, discriminant analysis techniques, such as machine learning approaches, may be applied. This is achieved by choosing several compounds that have known activity for a specific biological target and building predictive models that can discriminate between the active and inactive compounds. The goal is then to apply these models to other, unscreened compounds so that the compounds most likely to be active may be selected for screening. This is the approach taken in this research.

The rationale behind the use of machine learning is to discover patterns and signatures in data sets from high-throughput in vitro assays.
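The structure-similarity idea mentioned above can be illustrated with a small sketch. It assumes each compound is represented as a set of "on" bit positions from some binary fingerprint, and it uses the Tanimoto coefficient, a common similarity measure in cheminformatics; the text does not say which fingerprint or measure the module relies on. The sketch is in plain Python rather than the R used in the module, and the compound names and bit positions are made up for illustration.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient between two fingerprints.

    Each fingerprint is a set of 'on' bit positions.
    """
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_by_similarity(query, library):
    """Return (name, score) pairs, most similar to the query first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy fingerprints, not derived from real molecules.
known_active = {1, 4, 7, 9, 12}
unscreened = {
    "cmpd_A": {1, 4, 7, 9, 13},
    "cmpd_B": {2, 3, 5},
    "cmpd_C": {1, 4, 12},
}

ranking = rank_by_similarity(known_active, unscreened)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Compounds at the top of such a ranking would be the first candidates to send to a bioassay. With only one or a few known actives this kind of nearest-neighbour search is about all that can be done; once more labelled compounds are available, a trained model becomes an option.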

Predictive Modeling in Cheminformatics

In this module, Abhik Seal describes the technical process of bringing together a variety of computational tools in the R statistical environment to enable predictive modeling of compound-target interaction using supervised machine learning methods.

If you understood from the video how to perform predictive modeling, have a go at this KDD dataset (Prediction of Molecular Bioactivity for Drug Design -- Binding to Thrombin) and try to post your results. This is not an assignment, and the results will not be graded.

Max Kuhn (Director at Pfizer), the developer of the caret package, demonstrates how to use caret in R. It is a popular package for predictive modeling.
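To make the predictive-modeling workflow concrete (train on compounds with known activity, then rank unscreened compounds by predicted activity), here is a minimal sketch. It is not the caret workflow from the video: it is plain Python, and it uses a Bernoulli naive Bayes classifier purely because it fits in a few lines. The fingerprints, labels, and compound names are hypothetical toy data.

```python
import math


def train_bernoulli_nb(fingerprints, labels, n_bits):
    """Fit a Bernoulli naive Bayes model on binary fingerprints.

    fingerprints: list of sets of 'on' bit positions
    labels: list of 1 (active) or 0 (inactive)
    Returns {class: (log_prior, per-bit probability of being 'on')}.
    """
    model = {}
    for cls in (0, 1):
        members = [fp for fp, y in zip(fingerprints, labels) if y == cls]
        # Laplace smoothing keeps every probability strictly between 0 and 1.
        prior = (len(members) + 1) / (len(labels) + 2)
        p_on = [
            (sum(bit in fp for fp in members) + 1) / (len(members) + 2)
            for bit in range(n_bits)
        ]
        model[cls] = (math.log(prior), p_on)
    return model


def active_probability(model, fp, n_bits):
    """Posterior probability that a compound with fingerprint fp is active."""
    log_post = {}
    for cls, (log_prior, p_on) in model.items():
        total = log_prior
        for bit in range(n_bits):
            p = p_on[bit]
            total += math.log(p if bit in fp else 1.0 - p)
        log_post[cls] = total
    # Normalise the two log scores (subtracting the max for numerical stability).
    top = max(log_post.values())
    norm = sum(math.exp(v - top) for v in log_post.values())
    return math.exp(log_post[1] - top) / norm


# Hypothetical training set: three actives and three inactives, 6-bit fingerprints.
N_BITS = 6
train_fps = [{0, 1, 2}, {0, 1, 3}, {0, 1}, {3, 4, 5}, {2, 4, 5}, {4, 5}]
train_y = [1, 1, 1, 0, 0, 0]
model = train_bernoulli_nb(train_fps, train_y, N_BITS)

# Score unscreened compounds and rank them, most likely active first.
unscreened = {"cmpd_X": {0, 1, 2}, "cmpd_Y": {1, 4}, "cmpd_Z": {3, 4, 5}}
ranked = sorted(
    ((name, active_probability(model, fp, N_BITS)) for name, fp in unscreened.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, prob in ranked:
    print(f"{name}: P(active) = {prob:.3f}")
```

A real screening set such as the thrombin data would also need a held-out evaluation step before the ranking could be trusted; that is left out here to keep the sketch short.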

Links to the papers for Predictive Modeling.

There are various kinds of classification models. Below I have listed some of them and their different properties, taken from Tom Mitchell's book; here is the link.

Resources for Learning Machine Learning

The following knowledge is a prerequisite for making any sense of machine learning:

- Linear Algebra by Gilbert Strang: http://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/video-lectures/
- Convex Optimization by Boyd: http://see.stanford.edu/see/courseinfo.aspx?coll=2db7ced4-39d1-4fdb-90e8-364129597c87
- Probability and statistics for ML: http://videolectures.net/bootcamp07_keller_bss/
- Some mathematical tools for ML: http://videolectures.net/mlss03_burges_smtml/ (video and audio quality is very poor)
- Probability primer (measure theory and probability theory): http://www.youtube.com/playlist?list=PL17567A1A3F5DB5E4&feature=plcp
- Machine Learning Cheat sheet

Once the prerequisites are complete, the following are good series of lectures on machine learning.

## Basic ML:

## Advanced ML:

- SVMs and kernel methods, Schölkopf: http://videolectures.net/mlss03_scholkopf_lk/
  Basics of Support Vector Machines and related kernel methods. The video and audio quality is very poor.
- Kernel methods and Support Vector Machines, Smola: http://videolectures.net/mlss08au_smola_ksvm/
  An introduction to the main ideas of statistical learning theory, Support Vector Machines, and kernel feature spaces, with an overview of the applications of kernel methods.
- Easily one of the best talks on SVMs, almost like a run-down tutorial: http://videolectures.net/mlss06tw_lin_svm/
- Introduction to Learning Theory, Olivier Bousquet: http://videolectures.net/mlss06au_bousquet_ilt/
  This tutorial focuses on the “larger picture” rather than on mathematical proofs; it is not restricted to statistical learning theory, however. 5 lectures.
- Statistical Learning Theory, Olivier Bousquet: http://videolectures.net/mlss07_bousquet_slt/
  This course gives a detailed introduction to learning theory with a focus on the classification problem.
- Statistical Learning Theory, John Shawe-Taylor, University of London. 7 lectures: http://videolectures.net/mlss04_taylor_slt/
- Advanced Statistical Learning Theory, Olivier Bousquet. 3 lectures: http://videolectures.net/mlss04_bousquet_aslt/

Most of the above links have been filtered from http://onionesquereality.wordpress.com/2008/08/31/demystifying-support-vector-machines-for-beginners/

## Other Important Links:

- Channel for the probability primer and machine learning: http://www.youtube.com/user/mathematicalmonk#grid/user/D0F06AA0D2E8FFBA
- A comprehensive blog post collecting the best resources for ML: http://onionesquereality.wordpress.com/2008/08/31/demystifying-support-vector-machines-for-beginners/
- Another great list of resources for ML: http://www.quora.com/Machine-Learning/What-are-some-good-resources-for-learning-about-machine-learning-Why
- Lectures 21-28 by Gilbert Strang, the linear algebra approach to optimization: http://academicearth.org/courses/mathematical-methods-for-engineers-ii