Pinal skeletal muscles and its mutation is widely found to occur in various types of muscular dystrophy [38-41].

Conclusions

Microarray experiments are crucial because they measure the behaviour of individual genes with respect to diseases or treatments. Results from these experiments are heavily scrutinised to obtain biological insights into the occurrence of diseases or the effectiveness of certain types of treatments. To provide more in-depth analysis of these experiments, contemporary algorithms have incorporated biological information into their analysis so that it can be more descriptive and, hopefully, more useful to researchers. Our techniques take this approach one step further. Firstly, we no longer consider prior biological knowledge as a separate aspect of microarray analysis. Rather, we take into account the integrity of the biological information that is provided to the algorithm for analysis. Secondly, our algorithm uses both gene-gene interaction information and pathway information. Because of these two enhancements, we are able to generate subnetworks in real-time according to the responses of the microarray experiments. These contributions help us avoid some of the potential caveats present within microarray experiments.

We are certainly not the first to integrate gene-expression data with gene-gene relationships; GNEA [42] is one such example. GNEA uses a global protein-protein interaction network and finds subnetworks that correspond to regions of significantly differentially expressed genes; these subnetworks are called HSNs in the paper. GNEA then determines which gene sets in a library of gene sets are significantly enriched in HSNs. There are two possible shortcomings in this approach.
Firstly, in using a single global protein interaction network, GNEA makes the biological assumption that the local behaviour of proteins translates in a similar fashion globally and that gene expression levels correspond tightly to protein levels (which is not generally true). A similar issue is raised in [7], where the authors argued that proteins which are very well connected have an extremely high chance of obtaining a low p-value and being ranked as significant. Because of their high connectivity, such proteins are liable to be involved in various disjoint biological processes, leading to the error of combining independent subnetworks through these proteins. To prevent such scenarios, we instead implemented our algorithm by identifying localised gene-gene subnetworks within pathways.

Figure 2 Sample subnetwork from leukaemia dataset. A sample subnetwork from leukaemia dataset [26,27].

Figure 3 Sample subnetwork from leukaemia database. A sample subnetwork from DMD dataset [16,17].

Soh et al. BMC Bioinformatics 2011, 12(Suppl 13):S15 http://www.biomedcentral.com/1471-2105/12/S13/S

Secondly, while a gene set that is significantly enriched in HSNs is likely to be relevant, a large gene set may not be found significantly enriched in HSNs even though it may contain a subset that is significantly enriched. This is also an issue that we find in GSEA. We obtain a low result overlap with GSEA, possibly because the pathways from PathwayApi are very large and GSEA relies on a large portion of a pathway exhibiting a correlated change. Hence, when only a subset of a pathway demonstrates differential expression, GSEA may be unable to pick this up. We verified this hypothe.