Data Science for Drug Discovery, Health and Translational Medicine




Welcome to INFO I590 Data Science for Drug Discovery, Health and Translational Medicine. This course is hosted in the School of Informatics and Computing at Indiana University. The lead instructor is David Wild. The course is open to both Data Science and Informatics graduate students.

This course is being held in the spring semester of 2016. Online data science students should register for section number 32673. Residential data science students should register for 32698 with the associated 13840 discussion section. Residential informatics students should register for 32699 with the associated 10883 discussion section. The course will be taught entirely online, using videos, group discussions, forums and other online resources. Residential students will also have the opportunity to meet for a discussion section. Students will be graded through online quizzes (50%) and practical assignments (50%), both given on the Canvas site for this class. To tweet about this course, please use the hashtag #dsdht and, if there is room, #datascience.

Course description

With exploding healthcare costs, greater longevity and widespread health challenges such as diabetes, obesity, cancer and cardiovascular disease, medicine and healthcare will be a primary scientific and economic focus for the remainder of this century. Informatics and data science offer the promise of a level of understanding of health, disease and treatment on a scale never before imagined. This course will address the big data techniques that are being used in the drug discovery, healthcare and translational medicine domains and will be organized around three questions: How can data science help researchers find new drugs and reuse old ones? How can data science help doctors treat patients better? And how can data science help us all lead healthier lives?

The course is broken down into sections, based around these questions, and modules. Each week of the course will focus on one to three modules. Each module has four parts: a Video, which gives an overview of the topic; Learning Goals, which list what you should aim to know after completing the module; Learning Tasks, which all students should complete in addition to watching the video; and Going Deeper, which gives resources for advanced students and those who want to go deeper into the material.

Course goals

Students will: understand the current scientific and human challenges of drug discovery, health and translational medicine; be able to describe the demonstrated or potential value of data science techniques in each of these areas; understand the specific opportunities afforded by crossing domain boundaries; and be able to work practically with drug discovery and electronic medical record (EMR) data using the R statistics package and network visualization tools.

Prerequisites

Students should have a good foundational knowledge of data science tools, including familiarity with the R statistical package, and at least some exposure to machine learning. Ability to program and some background in a healthcare field are desirable but not essential.

Resources

Videos and class materials will be posted on this site, at the links below. The other required resource is the Canvas site, which will be used for discussion, announcements and grading. Code for the course is available from the IU Data Science GitHub account (https://github.com/IUCCRG/dsdht).

Textbooks

There are no required texts for this course. However, several books are recommended as background reading for parts of the course.

Course instructors

  • Dr David Wild, Associate Professor of Informatics and Computing, Primary Instructor
  • Varsha Kulkarni, Associate Instructor
  • Adithya Nagaraj-Tirumale, Associate Instructor

COURSE INTRODUCTION

HOW CAN DATA SCIENCE HELP RESEARCHERS FIND NEW DRUGS AND REUSE OLD ONES?

HOW CAN DATA SCIENCE HELP DOCTORS TREAT PATIENTS BETTER?

HOW CAN DATA SCIENCE HELP US ALL LEAD HEALTHIER LIVES?

OTHER MATERIALS

Assignments

Below are five assignments. You should choose one of these assignments and submit your completed PDFs to the appropriate assignment on Canvas (i.e., the one with the same assignment number) by May 1, 2016.
  • Assignment 1 - Predictive modeling for chemical compound toxicity
  • Assignment 2 - Identifying relationship of genes to Down syndrome
  • Assignment 3 - Network visualization of semantic disease-related data using Cytoscape / Sci2
  • Assignment 4 - Designing a data science health app
  • Assignment 5 - Mining electronic medical record data
https://github.com/IUCCRG/dsdht