Practical drug-protein predictive modeling with R

**Predictive Modeling in Cheminformatics**
Virtual screening is the computational, or in-silico, screening of biological compounds, and it complements the high-throughput screening (HTS) process. It is used to aid the selection of compounds for screening in HTS bioassays or for inclusion in a compound-screening library.

Virtual screening can use several computational techniques, depending on the amount and type of information available about the compounds and the target. Protein-based methods are employed when the 3D structure of the bioassay target is known; they involve docking (virtual binding) candidate ligands (the part of the compound that is capable of binding) to the protein target and then scoring the result.

Ligand-based approaches are usually used when compounds are known to be active or inactive for a specific target. If only a few active compounds are known, structure-similarity techniques may be used; if the activity of several compounds is known, discriminant analysis techniques, such as machine learning approaches, may be applied. This is done by choosing compounds with known activity for a specific biological target and building predictive models that discriminate between the active and inactive compounds. The goal is then to apply these models to other, unscreened compounds so that those most likely to be active can be selected for screening. This is the approach taken in this research. The rationale for using machine learning is to discover patterns and signatures in data sets from high-throughput in-vitro assays.
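The structure-similarity idea described above can be illustrated with a minimal sketch. It is written in plain Python rather than R and is not the workflow shown in this module. It assumes each compound is represented by a binary fingerprint, stored here as a set of "on" bit positions, and uses the Tanimoto coefficient as the similarity measure. The compound names and fingerprints are made up for illustration, and `rank_by_similarity` is a hypothetical helper, not part of any library.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient: shared on-bits divided by the union of on-bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_by_similarity(actives: dict, library: dict) -> list:
    """Score each library compound by its best similarity to any known active."""
    scored = [
        (name, max(tanimoto(fp, active_fp) for active_fp in actives.values()))
        for name, fp in library.items()
    ]
    # Most similar compounds first: these would be prioritised for screening.
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical fingerprints (sets of on-bit positions), for illustration only.
known_actives = {
    "active_1": frozenset({1, 4, 7, 9}),
    "active_2": frozenset({2, 4, 7, 11}),
}
unscreened = {
    "cmpd_A": frozenset({1, 4, 7, 10}),
    "cmpd_B": frozenset({3, 5, 8}),
    "cmpd_C": frozenset({2, 4, 7, 11, 12}),
}

ranking = rank_by_similarity(known_actives, unscreened)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In practice the fingerprints would come from a cheminformatics toolkit, and taking the maximum similarity over the known actives is only one of several possible ways to combine the scores.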


**In this module, Abhik Seal describes the technical process of bringing together a variety of computational tools in the R statistics package to enable predictive modeling of compound-target interaction using supervised machine learning methods.**

Video: http://www.youtube.com/watch?v=hnzqReZoqRk


If the video made clear how to perform predictive modeling, try it on this [|KDD dataset] (**Prediction of Molecular Bioactivity for Drug Design -- Binding to Thrombin**) and post your results. This is not an assignment, and the results will not be graded.

Max Kuhn (Director at Pfizer), the developer of the caret package for R, demonstrates how the package is used. caret is a popular package for predictive modeling.

(Embedded video: Max Kuhn on the caret package)


**Links to the papers for Predictive Modeling**
 * [|Introduction to ROC analysis]
 * [|Virtual Screening of Bioassay Data]
 * [|In-silico predictive mutagenicity model generation using supervised learning approaches]
 * [|PubChem as a source of polypharmacology]
 * [|Open Source platform to benchmark fingerprints for ligand based virtual screening]
 * [|Modeling of non-additive mixture properties using the Online CHEmical database and Modeling environment (OCHEM)]
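
ROC analysis, the topic of the first paper listed above, is a common way to evaluate how well a model separates active from inactive compounds. As a minimal sketch in plain Python (not the R workflow of this module), the area under the ROC curve can be computed as the fraction of active/inactive pairs in which the active compound receives the higher score, with ties counted as one half. The labels and scores below are invented for illustration, and `roc_auc` is a hypothetical helper written for this example.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random active outscores a random inactive.

    labels: 1 for active, 0 for inactive; scores: higher means "more likely active".
    """
    actives = [s for y, s in zip(labels, scores) if y == 1]
    inactives = [s for y, s in zip(labels, scores) if y == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive compound")
    # Count pairwise wins; a tie contributes half a win.
    wins = sum(
        1.0 if a > i else 0.5 if a == i else 0.0
        for a in actives
        for i in inactives
    )
    return wins / (len(actives) * len(inactives))


# Invented example: three actives and three inactives with model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 1.0 means every active is ranked above every inactive, while 0.5 corresponds to random ranking. The pairwise loop is quadratic in the number of compounds, so it suits small examples only.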

There are various kinds of classification models. Below I have listed some of them and their different properties, taken from [|Tom Mitchell's Book].


**Resources for Learning Machine Learning**

The following background is a prerequisite for making sense of machine learning. Once the prerequisites are covered, the lecture series listed under Basic ML and Advanced ML are good next steps.
 * Linear Algebra by Gilbert Strang: http://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/video-lectures/
 * Convex Optimization by Boyd: http://see.stanford.edu/see/courseinfo.aspx?coll=2db7ced4-39d1-4fdb-90e8-364129597c87
 * Probability and statistics for ML: http://videolectures.net/bootcamp07_keller_bss/
 * Some mathematical tools for ML (video and audio quality is very poor): http://videolectures.net/mlss03_burges_smtml/
 * Probability primer (measure theory and probability theory): http://www.youtube.com/playlist?list=PL17567A1A3F5DB5E4&feature=plcp
 * [|Machine Learning Cheat sheet]

Basic ML:

 * Andrew Ng’s video lectures (CS229): http://see.stanford.edu/see/courseinfo.aspx?coll=348ca38a-3a6d-4052-937d-cb017338d7b1
 * Andrew Ng’s online course offering: http://www.ml-class.org
 * Learning from Data by Yaser Abu-Mostafa
 * Tom Mitchell’s video lectures (10-701): http://www.cs.cmu.edu/~tom/10701_sp11/lectures.shtml
 * Mathematicalmonk’s videos: http://www.youtube.com/playlist?list=PLD0F06AA0D2E8FFBA&feature=plpp
 * Videos on Machine Learning: clustering, EM, SVM, Naive Bayes, PCA

Advanced ML:
 * SVMs and kernel methods, Scholkopf. Basics of Support Vector Machines and related kernel methods (video and audio quality is very poor): http://videolectures.net/mlss03_scholkopf_lk/
 * Kernel methods and Support Vector Machines, Smola. Introduction to the main ideas of statistical learning theory, Support Vector Machines, and kernel feature spaces, with an overview of the applications of kernel methods: http://videolectures.net/mlss08au_smola_ksvm/
 * Easily one of the best talks on SVM. Almost like a run-down tutorial: http://videolectures.net/mlss06tw_lin_svm/
 * Introduction to Learning Theory, Olivier Bousquet. This tutorial focuses on the “larger picture” rather than on mathematical proofs, and it is not restricted to statistical learning theory. 5 lectures: http://videolectures.net/mlss06au_bousquet_ilt/
 * Statistical Learning Theory, Olivier Bousquet. This course gives a detailed introduction to learning theory with a focus on the classification problem: http://videolectures.net/mlss07_bousquet_slt/
 * Statistical Learning Theory, John Shawe-Taylor, University of London. 7 lectures: http://videolectures.net/mlss04_taylor_slt/
 * Advanced Statistical Learning Theory, Olivier Bousquet. 3 lectures: http://videolectures.net/mlss04_bousquet_aslt/

Most of the above links have been filtered from http://onionesquereality.wordpress.com/2008/08/31/demystifying-support-vector-machines-for-beginners/

=Other Important Links:=
 * Lectures 21-28 by Gilbert Strang, on the linear algebra approach to optimization: http://academicearth.org/courses/mathematical-methods-for-engineers-ii
 * Channel for the probability primer and machine learning: http://www.youtube.com/user/mathematicalmonk#grid/user/D0F06AA0D2E8FFBA
 * A comprehensive blog post collecting some of the best resources for ML: http://onionesquereality.wordpress.com/2008/08/31/demystifying-support-vector-machines-for-beginners/
 * Another great blog for ML: http://www.quora.com/Machine-Learning/What-are-some-good-resources-for-learning-about-machine-learning-Why