This work is licensed under a Creative Commons Attribution 4.0 International License.

Journal of Psychiatry and Brain Science 2017; 2 (6) : 3; DOI:10.20900/jpbs.20170019

Article

Omnigenic Model: The Evidence from Neurodegenerative Diseases

Libing Shen1* , Qili Shi1

1 Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai, 200032, P. R. China.

*Correspondence: Libing Shen, Ph.D.; Email: shenlibing@sioc.ac.cn.

Published: 12/25/2017 16:50:18 PM

ABSTRACT

Recently, a seminal model called the omnigenic model was proposed for understanding complex traits such as schizophrenia. In this study, we examined this model in Alzheimer’s disease and Parkinson’s disease from the perspectives of expression spectrum, shared disease-associated genes, common biological pathways, organ specificity, and network properties. Our results support the arguments put forward by the omnigenic model. Although we provide only limited evidence for the omnigenic model in neurodegenerative diseases, we hope that our effort can improve the understanding of these diseases and spur new ideas on how to prevent and treat them. The main message of our study is that more genes than expected play a role in the pathogenesis of neurodegenerative diseases, and that it is insufficient to study these diseases by focusing only on a few core genes or genetic pathways.

Complex traits such as height and intelligence have long fascinated geneticists. Two central questions are how many genes underlie a complex trait and whether they contribute equally to it. In the era of genome projects and next-generation sequencing, we seem to be approaching the answers to these two questions. Nevertheless, the big data of this era have so far brought confounding results rather than definitive answers. A study of complex traits using common SNPs shows that all autosomal SNPs together explain 45 % of the variance in height and that the variance explained by each chromosome is proportional to its length[1]. In other words, each gene in an individual’s genome could make a tiny contribution to his or her height. For complex diseases, such an answer is far from satisfactory, because we are seeking possible treatments, not probabilistic statements.

In a recently published seminal paper, Boyle et al. proposed a new omnigenic model for understanding complex diseases[2]. The omnigenic model proposes that all genes expressed in disease-relevant cells influence the functions of core disease-related genes, and thus the genetic propensity for complex diseases can lie outside core gene pathways. The hypothesis behind the model is that all genes expressed in a cell are tightly interconnected by an omnibus gene regulatory network, so each of them can affect the development of disease. We found this model both troubling and enlightening and tested it with Alzheimer’s disease and Parkinson’s disease microarray data. Both are neurodegenerative diseases whose causes are still poorly understood[3, 4].

First, we downloaded expression data for Alzheimer’s disease and Parkinson’s disease from the GEO database and identified the differentially expressed genes (DEGs) between cases and controls in each dataset. The DEGs were classified into two categories: down-regulated genes and up-regulated genes (Table 1). The numbers of down-regulated and up-regulated genes vary from dataset to dataset, ranging from a couple of hundred to a few thousand. Accordingly, the gene expression spectra differ widely among the disease datasets.

TABLE 1
Table 1. The number of differentially expressed genes (DEGs) in each microarray dataset.

Second, we looked for DEGs shared across datasets, separately for the down-regulated and up-regulated categories, in Alzheimer’s disease and in Parkinson’s disease. Among the four Alzheimer’s disease datasets, only dozens of common DEGs were found among the down-regulated genes and among the up-regulated genes (Fig. 1a and 1b). Among the five Parkinson’s disease datasets, only three common DEGs were found among the down-regulated genes and none among the up-regulated genes. It remains possible that these disease cases share some prominent core gene pathways even though they share very few genes. To test this possibility, we performed gene-GO term enrichment analyses for 1300 down-regulated genes and 1651 up-regulated genes in Alzheimer’s disease and for 2042 down-regulated genes and 2469 up-regulated genes in Parkinson’s disease. The enrichment results are shown in Fig. 2. Although a couple of thousand genes were used in each analysis, the most significant biological processes contain only dozens of genes. For the down-regulated genes in Alzheimer’s disease and Parkinson’s disease, the significant biological processes mainly involve metabolic processes rather than conspicuous cell degenerative pathways, such as the apoptotic process, that we had expected. Furthermore, the down-regulation of cell metabolism is a result of cell death rather than a cause of it[5]. In summary, very few genes are shared among the Alzheimer’s disease datasets or among the Parkinson’s disease datasets, and our GO analyses failed to detect any common initiating event for neuron degeneration in either disease.
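The search for shared DEGs described above amounts to a set intersection across datasets. The following is a minimal Python sketch of that idea, not the authors' code; the dataset labels and gene symbols are hypothetical placeholders.

```python
def common_degs(deg_sets):
    """Return the genes that appear in every dataset's DEG set."""
    sets = [set(s) for s in deg_sets]
    if not sets:
        return set()
    return set.intersection(*sets)


# Hypothetical down-regulated gene sets, one per dataset (illustration only).
down_by_dataset = {
    "dataset_A": {"GENE1", "GENE2", "GENE3"},
    "dataset_B": {"GENE2", "GENE3", "GENE4"},
    "dataset_C": {"GENE2", "GENE5"},
}

shared = common_degs(down_by_dataset.values())     # genes in all datasets
combined = set().union(*down_by_dataset.values())  # genes in at least one dataset

print(sorted(shared), len(combined))
```

The union of the per-dataset sets is one plausible way to obtain a combined gene list for a downstream enrichment analysis; the source does not state how its combined lists were assembled.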

FIGURE 1
Fig. 1 Venn diagram of DEGs in Alzheimer’s disease datasets and Parkinson’s disease datasets.

a. Down-regulated genes in four Alzheimer’s disease datasets. b. Up-regulated genes in four Alzheimer’s disease datasets. c. Down-regulated genes in five Parkinson’s disease datasets. d. Up-regulated genes in five Parkinson’s disease datasets. AD stands for Alzheimer’s disease and PD stands for Parkinson’s disease.

FIGURE 2
Fig. 2 GO term enrichment results for DEGs in Alzheimer’s disease and Parkinson’s disease.

a. GO term enrichment for down-regulated genes in Alzheimer’s disease. b. GO term enrichment for up-regulated genes in Alzheimer’s disease. c. GO term enrichment for down-regulated genes in Parkinson’s disease. d. GO term enrichment for up-regulated genes in Parkinson’s disease. AD stands for Alzheimer’s disease and PD stands for Parkinson’s disease.

Third, we checked the expression levels of the DEGs in normal organs, since the omnigenic model states that all genes expressed in disease-relevant cells can influence disease development. Conversely, the disease should affect the expression levels of these genes in disease-relevant organs. Fig. 3 shows the expression levels of the down-regulated and up-regulated genes from the four Alzheimer’s disease datasets in six normal human organs. The down-regulated genes in the four Alzheimer’s disease datasets show significantly higher expression in brain and cerebellum than in heart, kidney, liver and testis, whereas the up-regulated genes do not (Fig. 3a and 3b). In the five Parkinson’s disease datasets, the down-regulated genes show a similar pattern of higher expression in brain and cerebellum, although the pattern is less obvious in dataset GSE20295 (Fig. 4a). Moreover, the up-regulated genes in the five Parkinson’s disease datasets show a more diverse expression pattern (Fig. 4b), which suggests that the pathogenic process might be more complex in Parkinson’s disease than in Alzheimer’s disease. Fig. 3 and 4 show that the down-regulated genes are specifically expressed in the normal central nervous system (CNS), because they have higher expression in brain and/or cerebellum than in the other four organs. The finding that CNS-specifically-expressed genes are down-regulated in neurodegenerative diseases is consistent with the assumption of the omnigenic model.

FIGURE 3
Fig. 3 The expression level of DEGs from four Alzheimer’s disease datasets in six normal human organs.

a. The expression level of down-regulated genes in six organs. b. The expression level of up-regulated genes in six organs. A star indicates a statistically significant difference between two gene sets (P-value < 0.01, Wilcoxon test and Kolmogorov–Smirnov test). The color of the star indicates the corresponding organ.

FIGURE 4
Fig. 4 The expression level of DEGs from five Parkinson’s disease datasets in six normal human organs.

a. The expression level of down-regulated genes in six organs. b. The expression level of up-regulated genes in six organs. A star indicates a statistically significant difference between two gene sets (P-value < 0.01, Wilcoxon test and Kolmogorov–Smirnov test). The color of the star indicates the corresponding organ.

Finally, we examined the network properties of the down-regulated genes in Alzheimer’s disease and Parkinson’s disease using both protein-protein-interaction information and gene co-expression information. The key point of the omnigenic model is that all genes expressed in disease-relevant cells are sufficiently interconnected by gene regulatory networks. Because our results show that the down-regulated genes in neurodegenerative diseases are also CNS-specifically-expressed genes, we inferred that these genes should constitute a biological network characterized by a power-law degree distribution and a hierarchical structure[6]. Fig. 5a shows that the degree distribution $P(k)$ of the down-regulated genes in Alzheimer’s disease fits a power law in the protein-protein-interaction network (the power-law fit is shown as a red line, $R^2 = 0.94$, P-value $< 2\times10^{-16}$). Fig. 5b shows that the clustering coefficient of these genes scales as $C(k) \sim k^{-1}$ in the protein-protein-interaction network (shown as a red straight line with negative slope, $R^2 = 0.37$, P-value $< 2\times10^{-16}$). The down-regulated genes in Parkinson’s disease have the same properties in the protein-protein-interaction network (Fig. 5c and 5d). The network analysis based on gene co-expression information gives a similar result for the down-regulated genes in both diseases (Fig. 6a, 6b, 6c and 6d). A degree distribution that fits a power law indicates that the gene set constitutes a scale-free network, in which a small number of highly connected nodes, known as hubs, exist. The scaling of the clustering coefficient in both sets of down-regulated genes indicates that they form a hierarchical network (sparsely connected nodes are part of highly clustered areas, and communication between different clustered areas is maintained by a few hubs). The hierarchical network structure exhibits the small-world property of disease-related genes, which is predicted by Boyle et al.[2]. Thus, the results of our network analyses further support the omnigenic model.
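A power-law fit of this kind can be illustrated with an ordinary least-squares regression on log-log axes. The source does not state how its fits were computed, so the Python sketch below is an assumption about the general approach rather than the authors' procedure, and the degree values are synthetic.

```python
import math
from collections import Counter


def degree_distribution(degrees):
    """Map each observed degree k > 0 to its relative frequency P(k)."""
    counts = Counter(d for d in degrees if d > 0)
    total = sum(counts.values())
    return {k: c / total for k, c in counts.items()}


def loglog_fit(xs, ys):
    """Least-squares fit of log10(y) on log10(x); returns slope, intercept, R^2."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((x - mx) ** 2 for x in lx)
    sxy = sum((x - mx) * (y - my) for x, y in zip(lx, ly))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(lx, ly))
    ss_tot = sum((y - my) ** 2 for y in ly)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return slope, intercept, r2


# Synthetic degrees whose counts follow 400 / k^2 (illustration only).
synthetic_counts = {1: 400, 2: 100, 4: 25, 5: 16, 10: 4, 20: 1}
degrees = [k for k, c in synthetic_counts.items() for _ in range(c)]

pk = degree_distribution(degrees)
ks = sorted(pk)
slope, intercept, r2 = loglog_fit(ks, [pk[k] for k in ks])
print(round(slope, 3), round(r2, 3))
```

The same `loglog_fit` helper can be applied to the mean clustering coefficient at each degree to check for a slope near -1, the signature of hierarchical organization discussed above.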

FIGURE 5
Fig. 5 Protein-protein-interaction (PPI) network properties of down-regulated genes in Alzheimer’s disease and Parkinson’s disease.

a. PPI degree distribution of down-regulated genes in Alzheimer’s disease. b. Scaling of the PPI clustering coefficient in down-regulated genes in Alzheimer’s disease. c. PPI degree distribution of down-regulated genes in Parkinson’s disease. d. Scaling of the PPI clustering coefficient in down-regulated genes in Parkinson’s disease.

FIGURE 6
Fig. 6 Co-expression network properties of down-regulated genes in Alzheimer’s disease and Parkinson’s disease.

a. Co-expression degree distribution of down-regulated genes in Alzheimer’s disease. b. Scaling of the co-expression clustering coefficient in down-regulated genes in Alzheimer’s disease. c. Co-expression degree distribution of down-regulated genes in Parkinson’s disease. d. Scaling of the co-expression clustering coefficient in down-regulated genes in Parkinson’s disease.

There is a major difference between the two network analysis results. The coefficients of determination ($R^2$) are smaller in the gene co-expression network analyses than in the protein-protein-interaction ones. We find that a gene’s average degree and clustering coefficient are much higher in gene co-expression networks than in protein-protein-interaction networks, and that genes with high degree and clustering coefficient are also more numerous in gene co-expression networks. For example, for the down-regulated genes in Alzheimer’s disease, the highest degree is 50 in the protein-protein-interaction network but 250 in the gene co-expression network (Fig. 5a and Fig. 6a), which suggests that the available protein-protein-interaction information might be limited. This limitation could bias our network analyses, although the bias would favor our conclusion. To check for such bias, we performed the network property analysis for all genes with protein-protein interaction information and used them as the background (17644 genes in total). The result shows that all genes with protein-protein interaction information also constitute a genuine biological network (scale-free and highly modular, Supplementary Fig. 1). Consequently, our network analysis in Fig. 5 is less likely to be biased, since it is based on a small protein-protein-interaction network drawn from a much bigger one. In addition, the large average degree and clustering coefficient in the gene co-expression network analysis suggest that the genes down-regulated in disease samples are often expressed in a modular fashion in normal brain regions.

SUPPLEMENTARY FIGURE 1
Supplementary Fig. 1 Protein-protein-interaction (PPI) network properties of 17644 genes.

a. Their PPI degree distribution. b. Scaling of the PPI clustering coefficient in them.

In this comment, we used expression data from Alzheimer’s disease and Parkinson’s disease to test the omnigenic model in neurodegenerative diseases. We originally had some qualms about this model, because it poses a greater challenge for those who, like us, work in the field of complex diseases. To our surprise, our results fully support it, although our evidence is circumstantial. Acquiring direct evidence for this model would not be an easy task. First, we would need to know exactly how many genes are expressed in a disease-relevant cell. Then, we would have to understand the cellular network thoroughly in order to separate peripheral genes from core genes, since they play different roles in disease pathogenesis. In our view, the most important message of the omnigenic model is that new methods, both experimental and bioinformatic, are needed for studying complex diseases such as Alzheimer’s disease and Parkinson’s disease, because more genes than expected participate in the development of neurodegenerative diseases.

1 MATERIALS AND METHODS

1.1 Microarray data and detection of differentially expressed genes

All microarray datasets used in this study were downloaded from the GEO database. Four are Alzheimer’s disease datasets: GSE28146 (15 cases and 8 controls), GSE48350 (19 cases and 43 controls), GSE1297 (15 cases and 9 controls), and GSE26927 (11 cases and 7 controls). Five are Parkinson’s disease datasets: GSE7621 (16 cases and 9 controls), GSE8397 (29 cases and 16 controls), GSE49036 (16 cases and 8 controls), GSE20295 (12 cases and 18 controls), and GSE26927 (10 cases and 10 controls). The detailed disease information for each dataset is given in previous studies[7-13].

In each dataset, the expression values for each gene were normalized with the quantile algorithm in the limma package in the R environment[14]. Differentially expressed genes (DEGs) were detected with the limma package using linear models and a P-value cutoff of < 0.01.
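To illustrate what quantile normalization does, the following is a minimal pure-Python sketch of the textbook algorithm. It is not the limma implementation used in this study: ties are broken by input order here, which may differ from limma's handling, and the expression matrix is made up. The linear-model step for detecting DEGs is not reproduced.

```python
def quantile_normalize(matrix):
    """Quantile-normalize a genes-by-samples matrix (list of rows).

    Every sample (column) ends up with the same distribution: the mean of
    the sorted columns. Ties are broken by input order (an assumption).
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_cols)]
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: average value at each rank across samples.
    reference = [sum(sc[i] for sc in sorted_cols) / n_cols for i in range(n_rows)]
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for j, col in enumerate(columns):
        order = sorted(range(n_rows), key=lambda i: col[i])
        for rank, i in enumerate(order):
            result[i][j] = reference[rank]
    return result


# Made-up expression values: 4 genes (rows) x 3 samples (columns).
expr = [
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 3.0, 6.0],
    [4.0, 2.0, 8.0],
]
normalized = quantile_normalize(expr)
for row in normalized:
    print([round(v, 3) for v in row])
```

After normalization, sorting any column yields the same reference values, which is what makes expression levels comparable across arrays before differential expression testing.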

1.2 GO term enrichment analysis

To test whether certain biological pathways are common to both diseases, we used the clusterProfiler package to perform gene-GO term enrichment analyses for the combined DEGs in Alzheimer’s disease or Parkinson’s disease[15]. We used the ggplot2 package to visualize the GO results.
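Over-representation analyses of this kind are commonly based on the hypergeometric distribution. The Python sketch below shows that test for a single GO term under this assumption; it does not reproduce clusterProfiler's full pipeline, which includes multiple-testing correction, and all counts in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, drawn, observed):
    """Upper-tail hypergeometric P-value: P(X >= observed).

    population: number of background genes
    annotated:  background genes annotated to the GO term
    drawn:      size of the DEG list
    observed:   DEGs annotated to the GO term
    """
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(observed, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20000 background genes, 200 annotated to a term,
# 1300 DEGs, 30 of which carry the annotation (about 13 expected by chance).
p_value = hypergeom_enrichment_p(20000, 200, 1300, 30)
print(p_value)
```

Exact integer arithmetic is used before the final division, so the result does not suffer from floating-point underflow in the binomial coefficients.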

1.3 RNA-Seq data for six human organs

The RNA-Seq data for six human organs (brain, cerebellum, heart, kidney, liver, and testis) were downloaded from the supplementary information of Brawand et al.[16]. We calculated the RPKM (Reads Per Kilobase per Million mapped reads) value for each gene from the downloaded data (unique read coverage per exon). Because the number of samples is uneven across organs and species, we used the mean RPKM value whenever multiple RPKM values were available for a human organ. We then transformed the RPKM values into log2(RPKM) values and calculated a Z-score for every log2(RPKM) value within each organ, so that gene expression values are comparable among organs.
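The transformation chain described above (RPKM, mean over replicates, log2, per-organ Z-score) can be sketched in Python as follows. This is an illustration using the standard RPKM definition, not the authors' script. Two choices are assumptions not stated in the source: genes with zero mean RPKM are skipped because log2 is undefined for them, and the sample standard deviation is used for the Z-score. The gene names and values are hypothetical.

```python
import math
from statistics import mean, stdev


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def organ_zscores(rpkm_by_gene):
    """Z-scores of log2(mean RPKM) within one organ.

    rpkm_by_gene maps a gene to its list of replicate RPKM values.
    Genes with zero mean RPKM are skipped (assumption).
    """
    logs = {}
    for gene, values in rpkm_by_gene.items():
        avg = mean(values)
        if avg > 0:
            logs[gene] = math.log2(avg)
    mu = mean(list(logs.values()))
    sd = stdev(list(logs.values()))  # sample standard deviation (assumption)
    return {gene: (value - mu) / sd for gene, value in logs.items()}


# Hypothetical replicate RPKM values for one organ.
brain = {"G1": [1.0], "G2": [4.0, 4.0], "G3": [16.0], "G4": [0.0]}
z = organ_zscores(brain)
print(rpkm(500, 2000, 10_000_000), z)
```

Because the Z-score is computed within each organ, a value of 1 means one standard deviation above that organ's average log expression, regardless of sequencing depth.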

1.4 Statistical analysis and data display

R (version 3.2.4) was used to perform the statistical analyses in this study. Both the Wilcoxon test and the Kolmogorov–Smirnov test were used to compare the expression levels of the DEGs in Alzheimer’s disease or Parkinson’s disease among the six human organs, and a P-value smaller than 0.01 was considered statistically significant. The R package VennDiagram was used to plot the Venn diagrams[17].
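As an illustration of one of the two tests, the Python sketch below computes the two-sample Kolmogorov–Smirnov statistic and an asymptotic P-value from the Kolmogorov series. It is a simplified stand-in for R's implementation, which may use exact methods for small samples, and the Wilcoxon test is not shown. The input values are made up.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Largest absolute difference between the two empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    for x in sa + sb:
        fa = bisect_right(sa, x) / len(sa)
        fb = bisect_right(sb, x) / len(sb)
        d = max(d, abs(fa - fb))
    return d


def ks_asymptotic_p(d, n1, n2, terms=100):
    """Asymptotic two-sided P-value from the Kolmogorov distribution."""
    lam = math.sqrt(n1 * n2 / (n1 + n2)) * d
    if lam < 0.3:
        # The series converges slowly here and the true value is ~1.
        return 1.0
    s = sum(
        (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
        for j in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * s))


# Made-up expression Z-scores of one gene set in two organs.
organ_x = [1.0, 2.0, 3.0]
organ_y = [4.0, 5.0, 6.0]
d_stat = ks_statistic(organ_x, organ_y)
p_asym = ks_asymptotic_p(d_stat, len(organ_x), len(organ_y))
print(d_stat, p_asym)
```

The asymptotic approximation is only reliable for reasonably large samples; with gene sets of hundreds to thousands of genes, as in this study, it is usually adequate.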

1.5 Network property analysis

We downloaded the protein-protein interaction (PPI) information from the mentha database and the gene co-expression information from www.brainExp.org, a website hosting gene co-expression data from different brain regions, age stages, and genders[18]. This information was used to perform the network property analysis in this study. We filtered out non-human proteins for the PPI network analysis and, for the co-expression network analysis, used only the top 5 % of each node gene’s positively correlated co-expression partners (weighted correlation > 0.3).

We used the down-regulated genes in Alzheimer’s disease or Parkinson’s disease for the network property analysis, because they are specifically expressed in the central nervous system (brain and cerebellum), whereas the up-regulated genes do not exhibit a uniform tissue-specific expression pattern. According to the omnigenic model, a cellular regulatory network is made of the genes specifically expressed in a cell type.

For each down-regulated gene, we calculated its degree k and clustering coefficient C(k) using both protein-protein interaction information and gene co-expression information. Degree is a measure of a node’s connectivity in a network. In a protein-protein interaction network, it measures how many interaction neighbors (neighbor genes) a specific gene has. In a gene co-expression network, it measures how many co-expression partners a specific gene has. The clustering coefficient measures a node’s modularity in a network, i.e. the degree to which the node’s neighbors tend to cluster together. In a cellular network, modularity implies a certain biological function.

The clustering coefficient is mathematically defined as follows[6]:

$$C(k) = \frac{2n}{k(k-1)}$$

where $n$ is the number of direct links among a specific gene’s neighbors or co-expression partners and $k(k-1)/2$ is the total possible number of direct links among its neighbors or partners.
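The definition above translates directly into code. The following is a minimal Python sketch that computes the degree and clustering coefficient of a node from an undirected edge list; the helper names and the toy network are hypothetical, and returning 0 for nodes with fewer than two neighbors is a convention assumed here because the formula is undefined in that case.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def degree_and_clustering(adj, gene):
    """Return (k, C) with C = 2n / (k(k-1)), where n counts links among neighbors."""
    neighbors = adj.get(gene, set())
    k = len(neighbors)
    if k < 2:
        return k, 0.0  # convention: undefined case reported as 0
    n = sum(1 for u, v in combinations(neighbors, 2) if v in adj.get(u, set()))
    return k, 2.0 * n / (k * (k - 1))


# Toy interaction network (hypothetical gene labels).
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
toy_adj = build_adjacency(toy_edges)
for gene in sorted(toy_adj):
    print(gene, degree_and_clustering(toy_adj, gene))
```

In the toy network, gene A has three neighbors with one link among them, giving C = 1/3, while gene B's two neighbors are linked to each other, giving C = 1.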

AUTHOR CONTRIBUTIONS

The first author and second author contributed equally to this article.

REFERENCES

1.

2.

3.

4.

5.

6.

7.

8.

9.

10.

11.

12.

13.

14.

15.

16.

17.

18.
