Functional and Network Exploration of RNA Seq data of Breast Cancer

Analysis of RNASeq data of breast cancer


  • Tehreem Anwar Lahore College for Women University, Lahore, Pakistan
  • Mirza Jawad ul Hasnain Virtual University of Pakistan
  • Vina Kanwal Virtual University of Pakistan



Breast cancer, Bioinformatics, R software, Network analysis, Gene expression analysis


This study comprises an analysis of RNA-Seq data from breast cancer, including statistical, functional, and network analysis with several bioinformatics tools. Breast cancer is the most frequent cancer in women and affects people of all ages and socioeconomic backgrounds. Objective: To explore a breast cancer dataset at the functional and network levels. Although breast cancer has been researched extensively, in silico studies on this topic are rare. Methods: Data were collected from the GEO (Gene Expression Omnibus) database. The breast cancer sample data were normalized in R (using the limma package and RPKM values), which yielded the differentially expressed genes mainly implicated in breast cancer; up- and down-regulated genes were identified from their logFC values. Functional analysis of these up- and down-regulated genes was then performed using DAVID. Network analysis followed, showing how the genes correlate with one another in relation to the prevalence of breast cancer in patients. Finally, the importance of the identified genes was studied using the cBioPortal database. Results: Six important and novel genes were identified as differentially expressed using R. Functional and network analysis, together with the significance assessment in cBioPortal, indicated several candidate genes that take part in important cancer-related and other pathways, paving the way for further research. Conclusions: The pathways and candidate genes were selected on the basis of high enrichment scores, and these genes and pathways play a significant role in breast cancer.
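The quantities named in the Methods (RPKM values and logFC-based up/down calls) can be illustrated with a minimal sketch. This is not the authors' limma pipeline, which fits linear models with moderated statistics in R; it only shows the arithmetic behind the two terms. The gene names, counts, the pseudocount of 1, and the logFC cutoff of 1 are all hypothetical assumptions, since the abstract does not state the thresholds used.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log2_fold_change(tumor_value, normal_value, pseudocount=1.0):
    """log2 ratio of tumor to normal expression.

    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    return math.log2((tumor_value + pseudocount) / (normal_value + pseudocount))


def classify(log_fc, threshold=1.0):
    """Label a gene as up- or down-regulated using a hypothetical logFC cutoff."""
    if log_fc >= threshold:
        return "up"
    if log_fc <= -threshold:
        return "down"
    return "unchanged"


# Hypothetical expression values (tumor, normal) for three placeholder genes.
expression = {
    "GENE_A": (7.0, 1.0),
    "GENE_B": (1.0, 7.0),
    "GENE_C": (3.0, 3.0),
}

calls = {
    gene: classify(log2_fold_change(tumor, normal))
    for gene, (tumor, normal) in expression.items()
}

for gene, call in calls.items():
    print(gene, call)
```

In a real analysis the cutoff would normally be combined with an adjusted p-value from the statistical model rather than applied to logFC alone.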






How to Cite

Anwar, T., Jawad ul Hasnain, M., & Kanwal, V. (2022). Functional and Network Exploration of RNA Seq data of Breast Cancer: Analysis of RNASeq data of breast cancer. Pakistan BioMedical Journal, 5(10), 28–33.



Original Article