= IMPORTANT NOTICE: This site contains material from the I590 Data Science in Drug Discovery, Health and Translational Medicine course held in Spring 2016. The course running in Spring 2017 is being redesigned and will use updated materials and a different content delivery mechanism. This site is just being preserved for archival purposes. =

= Data Science for Drug Discovery, Health and Translational Medicine =

Welcome to //INFO I590// //Data Science for Drug Discovery, Health and Translational Medicine.// This course is hosted in the [|School of Informatics and Computing] at Indiana University. The lead instructor is [|David Wild]. The course is open to both Data Science and Informatics graduate students.

**This course is being held in the spring semester of 2016.** Online data science students should register for section number **32673**. Residential data science students should register for **32698** with the associated 13840 discussion section. Residential informatics students should register for **32699** with the associated 10883 discussion section. The course will be taught entirely online, using videos, group discussions, forums and other online resources. Residential students will have the opportunity to meet for a discussion section. Students will be graded through online quizzes (50%) and practical assignments (50%) given on the Canvas site for this class. //To tweet about this course, please use the hashtag **#dsdht** and also **#datascience** (if there is room!)//

Course description
With exploding healthcare costs, greater longevity and widespread health challenges of diabetes, obesity, cancer and cardiovascular disease, medicine and healthcare will be a primary scientific and economic focus for the remainder of this century. Informatics and data science offer the promise of a level of understanding of health, disease and treatment on a scale never before imagined. This course will address the big data techniques that are being used in the drug discovery, healthcare and translational medicine domains and will be organized around three questions: how can data science help researchers find new drugs and reuse old ones? How can data science help doctors treat patients better? And how can data science help us all lead healthier lives?

The course is broken down into sections, based around these questions, and modules. Each week of the course will focus on 1-3 modules. Each module has four parts: a //Video//, which gives an overview of the topic; //Learning Goals//, which list what you should aim to know after completing the module; //Learning Tasks//, which all students should complete in addition to watching the video; and //Going Deeper//, which gives resources for advanced students and those who want to go deeper into the material.

Course goals
Students will: Understand the current scientific and human challenges of drug discovery, health and translational medicine; be able to describe the demonstrated or potential value of data science techniques in each of these areas; understand the specific opportunities afforded by crossing domain boundaries; be able to practically work with drug discovery and EMR data using the R statistics package and network visualization tools.

Prerequisites
Students should have a good foundational knowledge of data science tools, including familiarity with the R statistical package, and at least some exposure to [|machine learning]. Ability to program and some background in a healthcare field are desirable but not essential.

Resources
Videos and class materials will be posted on this site, at the links below. The other required resource is the Canvas site, which will be used for discussion, announcements and grading. Course code is available from the IU Data Science GitHub account: **https://github.com/IUCCRG/dsdht**

Textbooks
There are no required texts for this course. However, several books are recommended as background reading for parts of the course:
 * [|R for Medicine and Biology] - practical guide to using the R statistics package with biology and healthcare data
 * [|Introducing Cheminformatics] - eBook overview of the field of cheminformatics, in [|PDF] and [|Kindle format]
 * [|Groovy Cheminformatics with the Chemistry Development Kit] - eBook programming for cheminformatics

**Course Developers**
Abhik Seal, David J Wild

Course instructors

 * [|Dr David Wild], Associate Professor of Informatics and Computing, Primary Instructor
 * Varsha Kulkarni, Associate Instructor
 * Adithya Nagaraj-Tirumale, Associate Instructor


**COURSE INTRODUCTION**
 * Course introduction
 * Using the R statistics package


**HOW CAN DATA SCIENCE HELP RESEARCHERS FIND NEW DRUGS AND REUSE OLD ONES?**
 * What it takes to find a new drug
 * Traditional drug discovery paradigms
 * Rational drug discovery case study: HIV protease inhibitors
 * Rational drug discovery case study: Rezulin and Avandia for diabetes
 * Are there any new drugs left?
 * Information-based drug discovery
 * A new epoch: information-based drug discovery
 * Chemistry-based data
 * Using chemistry data in R
 * Biology-based data - protein and DNA sequences, pathways, metagenomics
 * Retrieving Biological data
 * Using R to retrieve biological data and perform alignments
 * Using workflow tools (KNIME): let's make our lives simple!
 * Predictive modeling in cheminformatics
 * QSAR and Scaffold analysis of drug-protein interactions
 * Practical drug-protein predictive modeling with R
 * Predicting drug-target interactions with 3D visualization and molecular docking
 * Virtual Screening in Drug Discovery
 * More KNIME workflows
 * Predictive modeling in bioinformatics
 * Mapping structure to function
 * Gene expression and microarrays
 * Gene expression data analysis with R and Bioconductor
 * Network approaches to molecular data
 * Systems chemical biology and network pharmacology
 * Searching on integrative drug discovery data repositories
 * Networks in R


**HOW CAN DATA SCIENCE HELP DOCTORS TREAT PATIENTS BETTER?**
 * The big picture of data science, pharmaceutical research and healthcare
 * Overview of Data Science opportunities in the U.S. healthcare system
 * Patient-based data - clinical trials, side effects and EMRs
 * Mining adverse drug events from EMRs
 * Why most medical research is wrong
 * [|Doctors tell all - and it's bad (external link)]
 * [|Introduction to Health IT systems (external link)]
 * [|Vendor-specific EHR systems (external link)]


**HOW CAN DATA SCIENCE HELP US ALL LEAD HEALTHIER LIVES?**
 * Rethinking the doctor-patient model
 * Boiling the ocean - linking everything together
 * Health apps - do they really make a difference?
 * Managing disease with data science - diabetes example
 * [|Public Health IT] - improving registries, biosurveillance, and epidemiology with data science (external link)


**OTHER MATERIALS**
 * Building R applications with R-Shiny

Assignments
Below are five assignments. You should **choose one** of these assignments and submit your completed PDFs to the appropriate assignment on Canvas (i.e. the one with the same assignment number) by **May 1, 2016.** Course code is on GitHub: https://github.com/IUCCRG/dsdht
 * Assignment 1 - Predictive modeling for chemical compound toxicity
 * Assignment 2 - Identifying relationship of genes to Down syndrome
 * Assignment 3 - Network visualization of semantic disease-related data using Cytoscape / Sci2
 * Assignment 4 - Designing a data science health app
 * Assignment 5 - Mining electronic medical record data