IMPORTANT NOTICE: This site contains material from the I590 Data Science in Drug Discovery, Health and Translational Medicine course held in Spring 2016. The course running in Spring 2017 is being redesigned and will use updated materials and a different content delivery mechanism. This site is just being preserved for archival purposes.
Welcome to INFO I590 Data Science for Drug Discovery, Health and Translational Medicine. This course is hosted in the School of Informatics and Computing at Indiana University. The lead instructor is David Wild. The course is open to both Data Science and Informatics graduate students.
This course is being held in the spring semester of 2016. Online data science students should register for section number 32673. Residential data science students should register for 32698 with the associated 13840 discussion section. Residential informatics students should register for 32699 with the associated 10883 discussion section. The course will be taught entirely online, using videos, group discussions, forums and other online resources. Residential students will have the opportunity to meet for a discussion section. Students will be graded through online quizzes (50%) and practical assignments (50%), both given on the Canvas site for this class. To tweet about this course, please use the hashtag #dsdht and also #datascience (if there is room!).
With exploding healthcare costs, greater longevity and widespread health challenges of diabetes, obesity, cancer and cardiovascular disease, medicine and healthcare will be a primary scientific and economic focus for the remainder of this century. Informatics and data science offer the promise of a level of understanding of health, disease and treatment on a scale never before imagined. This course will address the big data techniques that are being used in the drug discovery, healthcare and translational medicine domains and will be organized around three questions: how can data science help researchers find new drugs and reuse old ones? How can data science help doctors treat patients better? And how can data science help us all lead healthier lives?
The course is broken down into sections, based around these questions, and each section is divided into modules. Each week of the course will focus on 1-3 modules. Each module has four parts: a Video, which gives an overview of the topic; Learning Goals, which list what you should aim to know after completing the module; Learning Tasks, which all students should complete in addition to watching the video; and Going Deeper, which gives resources for advanced students and those who want to go deeper into the material.
Students will: understand the current scientific and human challenges of drug discovery, health and translational medicine; be able to describe the demonstrated or potential value of data science techniques in each of these areas; understand the specific opportunities afforded by crossing domain boundaries; and be able to work practically with drug discovery and EMR data using the R statistics package and network visualization tools.
Familiarity with the R statistical package, and at least some exposure to machine learning. Ability to program and some background in a healthcare field are desirable but not essential.
- R for Medicine and Biology - practical guide to using the R statistics package with biology and healthcare data
- Introducing Cheminformatics - eBook overview of the field of cheminformatics, in PDF and Kindle format
- Groovy Cheminformatics with the Chemistry Development Kit - eBook on programming for cheminformatics
- Dr David Wild, Associate Professor of Informatics and Computing, Primary Instructor
- Varsha Kulkarni, Associate Instructor
- Adithya Nagaraj-Tirumale, Associate Instructor
HOW CAN DATA SCIENCE HELP RESEARCHERS FIND NEW DRUGS AND REUSE OLD ONES?
- What it takes to find a new drug
- Information-based drug discovery
- A new epoch: information-based drug discovery
- Chemistry-based data
- Using chemistry data in R
- Biology-based data - protein and DNA sequences, pathways, metagenomics
- Retrieving biological data
- Using R to retrieve biological data and perform alignments
- Using workflow tools (KNIME): let's make our life simple!
- Predictive modeling in cheminformatics
- Predictive modeling in bioinformatics
- Network approaches to molecular data
HOW CAN DATA SCIENCE HELP DOCTORS TREAT PATIENTS BETTER?
- The big picture of data science, pharmaceutical research and healthcare
- Overview of Data Science opportunities in the U.S. healthcare system
- Patient-based data - clinical trials, side effects and EMRs
- Mining adverse drug events from EMRs
- Why most medical research is wrong
- Doctors tell all - and it's bad (external link)
- Introduction to Health IT systems (external link)
- Vendor-specific EHR systems (external link)
HOW CAN DATA SCIENCE HELP US ALL LEAD HEALTHIER LIVES?
- Rethinking the doctor-patient model
- Boiling the ocean - linking everything together
- Health apps - do they really make a difference?
- Managing disease with data science - diabetes example
- Public Health IT - improving registries, biosurveillance, and epidemiology with data science (external link)
- Assignment 1 - Predictive modeling for chemical compound toxicity
- Assignment 2 - Identifying relationship of genes to Down syndrome
- Assignment 3 - Network visualization of semantic disease-related data using Cytoscape / Sci2
- Assignment 4 - Designing a data science health app
- Assignment 5 - Mining electronic medical record data