Assignment 3: Network visualization of semantic disease-related data using Cytoscape / Sci2

In this assignment you are required to perform some network analysis on the drug target network given in the Canvas Files/Assignment 3 folder. For this assignment you will need to install the igraph library in R. igraph is available as a CRAN package and as a Cytoscape tool.

1. Data Loading and Conversion

The data is given as a csv file. Load the csv file and then convert it to a matrix object:

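# filename is the path to the csv file.
# row.names=1 takes the row names from the first column;
# check.names=FALSE keeps the column names exactly as written in the file.
net <- read.csv(filename, header=TRUE, row.names=1, check.names=FALSE)
net <- as.matrix(net)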
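To see what the read.csv() arguments do, here is a minimal sketch on a made-up file. The protein and drug names, the temporary file, and the orientation (proteins as rows, drugs as columns, as the later matrix code implies) are all assumptions for illustration, not the assignment data.

```r
# Hypothetical toy data: 3 proteins (rows) x 4 drugs (columns).
# A 1 means the drug targets the protein.
csv_text <- c(
  "protein,Drug A,Drug-B,3-Drug,Drug D",
  "P1,1,1,0,0",
  "P2,0,1,1,0",
  "P3,0,0,0,1"
)
filename <- tempfile(fileext = ".csv")  # stand-in for the real csv path
writeLines(csv_text, filename)

# Same call as in the assignment text.
net <- read.csv(filename, header = TRUE, row.names = 1, check.names = FALSE)
net <- as.matrix(net)

print(dim(net))       # number of rows and columns
print(rownames(net))  # taken from the first column of the file
print(colnames(net))  # kept exactly as written in the header line

# For comparison: with the default check.names = TRUE, R rewrites names
# that are not syntactically valid (spaces, dashes, leading digits).
mangled <- read.csv(filename, header = TRUE, row.names = 1)
print(colnames(mangled))
```

Keeping the original names matters later, when drugs are looked up by name in the graph.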

If you open the csv file you will find it is a bimodal network (consisting of drugs and proteins). We call a network bimodal, or bipartite, when it has exactly two types of nodes and every edge connects nodes of different types. To convert the matrix into a bipartite graph object, call the graph.incidence() function and pass it the matrix. You can then project the network onto its two partitions with the bipartite.projection() function, which returns two separate graphs that you can access by list index.

Another way to convert the network into two separate unimodal networks is given below. Take net as read from the csv file and convert it to a matrix with as.matrix(); multiplying this matrix by its transpose then gives two separate matrices from the one bipartite network. In the drug network, two drugs are connected if they share at least one protein target; in the protein network, two proteins are connected if they share at least one drug. We will use the drug and protein networks to calculate some network properties and to visualize the networks.

net <- as.matrix(net)

# For the proteins
protein <- net %*% t(net)   # t() is the matrix transpose
protein[protein > 0] <- 1   # set every count greater than 0 to 1 to make an unweighted network
diag(protein) <- 0          # remove self-loops

# Same for the drug data
drug <- t(net) %*% net
drug[drug > 0] <- 1
diag(drug) <- 0             # avoid self-loops in the network
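The two routes described above (matrix products versus graph.incidence() followed by bipartite.projection()) should give the same unimodal networks. The following is a minimal sketch that checks this on a made-up incidence matrix. The names P1..P3 and D1..D4 are hypothetical, and the code uses the underscore spellings from igraph 1.0 and later: graph_from_incidence_matrix() and bipartite_projection() correspond to graph.incidence() and bipartite.projection() in the text.

```r
library(igraph)

# Hypothetical toy incidence matrix: proteins as rows, drugs as columns.
net <- matrix(
  c(1, 1, 0, 0,
    0, 1, 1, 0,
    0, 0, 0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("P1", "P2", "P3"), c("D1", "D2", "D3", "D4"))
)

# Route 1: matrix products, as in the code above.
protein <- net %*% t(net)
protein[protein > 0] <- 1
diag(protein) <- 0

drug <- t(net) %*% net
drug[drug > 0] <- 1
diag(drug) <- 0

# Route 2: bipartite graph object plus projection.
# Rows get type FALSE (first projection), columns get type TRUE (second).
bg <- graph_from_incidence_matrix(net)
proj <- bipartite_projection(bg, multiplicity = FALSE)
protein_adj <- as_adjacency_matrix(proj$proj1, sparse = FALSE)
drug_adj    <- as_adjacency_matrix(proj$proj2, sparse = FALSE)

# Compare the two routes entry by entry, matching rows and columns by name.
same_protein <- all(protein == protein_adj[rownames(protein), colnames(protein)])
same_drug    <- all(drug == drug_adj[rownames(drug), colnames(drug)])
print(drug)
print(c(same_protein = same_protein, same_drug = same_drug))
```

In this toy matrix D1 and D2 share P1, and D2 and D3 share P2, so the drug network has exactly those two edges and D4 stays isolated.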

Using graph.adjacency() you can convert the final drug and protein matrices into igraph objects. The example below converts the drug matrix into an undirected graph object:

drug<-graph.adjacency(drug,mode="undirected",diag=FALSE)
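The same conversion applies to the protein matrix. Here is a minimal sketch on made-up 0/1 matrices. The variable names drug_mat, protein_mat, drug_g and protein_g are hypothetical, and graph_from_adjacency_matrix() is the underscore spelling of graph.adjacency() in igraph 1.0 and later.

```r
library(igraph)

# Hypothetical 0/1 adjacency matrices standing in for the real projections.
drug_mat <- matrix(0, 4, 4, dimnames = list(paste0("D", 1:4), paste0("D", 1:4)))
drug_mat["D1", "D2"] <- drug_mat["D2", "D1"] <- 1
drug_mat["D2", "D3"] <- drug_mat["D3", "D2"] <- 1

protein_mat <- matrix(0, 3, 3, dimnames = list(paste0("P", 1:3), paste0("P", 1:3)))
protein_mat["P1", "P2"] <- protein_mat["P2", "P1"] <- 1

# Convert both matrices; the graphs get their own names so the
# matrices are not overwritten.
drug_g    <- graph_from_adjacency_matrix(drug_mat, mode = "undirected", diag = FALSE)
protein_g <- graph_from_adjacency_matrix(protein_mat, mode = "undirected", diag = FALSE)

# Quick sanity checks before any analysis.
print(is_directed(drug_g))
print(c(vertices = vcount(drug_g), edges = ecount(drug_g)))
print(V(protein_g)$name)
```

Storing the graph under a separate name is a design choice of this sketch; the call above reuses the name drug, which replaces the matrix with the graph.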

2. Analyzing networks

Using the drug and protein networks, we will analyze some basic properties of each network with the R igraph library. The igraph package provides many functions; we will use some of them to calculate these properties. Report your answers to each of the questions below:

1. Decompose the protein graph and the drug graph into components, keeping only components with at least 2 vertices. What is the size of the largest component?

2. Calculate the normalized betweenness of all the vertices of the drug and protein networks and list the 5 compounds with the highest betweenness.

3. Calculate the density of the drug and protein networks.

4. Using the walktrap community algorithm with steps = 3, find the communities of both the drug and protein networks. Report the total number of communities in each network. Extract the largest community and plot it with the kamada.kawai layout. Include an image of this plot in your report.

5. How many first neighbors do the drugs Acetaminophen and Risperidone have in the network? What are the names of the drugs in the first neighborhood of Sunitinib, and what diseases are those drugs used for? Now look at the 2nd neighborhood of Sunitinib to see which new drugs show up, and write a brief report on how the new drugs support the treatment of the diseases that the 2nd neighborhood drugs are used for.

6. Plot the drug and protein networks showing the communities. Keep the vertex labels small enough that the plot stays readable, and choose your own settings so that the node and label sizes are adjusted.
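The igraph calls behind these questions can be tried on a small graph first. What follows is a minimal sketch on a hypothetical toy graph, not the assignment data: the vertex names d1..d9, the plot settings, and the temporary output file are all made up. It uses the underscore spellings from igraph 1.0 and later (decompose(), edge_density(), cluster_walktrap() and layout_with_kk() for decompose.graph(), graph.density(), walktrap.community() and layout.kamada.kawai()).

```r
library(igraph)

# Hypothetical toy graph: two triangles joined by one edge, a separate pair,
# and one isolated vertex.
g <- graph_from_literal(d1-d2, d2-d3, d3-d1, d3-d4,
                        d4-d5, d5-d6, d6-d4, d7-d8, d9)

# Components with at least 2 vertices, and the size of the largest one.
comps <- decompose(g, min.vertices = 2)
comp_sizes <- sapply(comps, vcount)
print(max(comp_sizes))

# Normalized betweenness, five highest values.
btw <- betweenness(g, directed = FALSE, normalized = TRUE)
top5 <- head(sort(btw, decreasing = TRUE), 5)
print(top5)

# Density: number of edges divided by the number of possible edges.
dens <- edge_density(g)
print(dens)

# Walktrap communities using random walks of length 3.
wc <- cluster_walktrap(g, steps = 3)
print(length(wc))   # number of communities
print(sizes(wc))    # vertices per community
biggest <- as.integer(names(which.max(sizes(wc))))
sub <- induced_subgraph(g, which(membership(wc) == biggest))

# First neighbors of d4, and vertices that only appear at distance 2.
first <- neighbors(g, "d4")$name
within2 <- ego(g, order = 2, nodes = "d4", mindist = 1)[[1]]$name
new_at_2 <- setdiff(within2, first)
print(first)
print(new_at_2)

# Plots go to a temporary PDF; sizes are example settings to tune by eye.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
plot(sub, layout = layout_with_kk(sub), vertex.size = 12, vertex.label.cex = 0.8)
plot(wc, g, vertex.size = 6, vertex.label.cex = 0.6)
invisible(dev.off())
```

On the real networks, vertex.size and vertex.label.cex will likely need smaller values than in this toy case, since there are many more labels competing for space.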

Report your results in a PDF and submit on Canvas.