Review Article

Improving the Prediction of Survival in Cancer Patients by Using Machine Learning Techniques: Experience of Gene Expression Data: A Narrative Review


Background: Despite many advances in the early detection of disease, cancer patients still have a poor prognosis and low survival rates. Recently, microarray technologies have been used to gather thousands of measurements of gene expression levels in cancer cells. These data are the main indicators in predicting cancer survival. This study highlights how machine learning techniques improve survival prediction based on gene expression data in cancer patients.

Methods: This review was conducted by searching scientific databases and e-journals for articles published between 2000 and 2016. The search keywords included machine learning, gene expression data, survival, and cancer.

Results: Studies have shown that gene expression data are more accurate and effective than clinical data in survival prediction. Because such data are complex and voluminous, studies have highlighted the importance of machine learning algorithms, such as artificial neural networks (ANNs), for identifying the distinctive gene expression signatures of cancer patients. These algorithms make probing and analyzing gene expression profiles for cancer survival prediction more efficient.

Conclusion: Given the capabilities of machine learning techniques in proteomics and genomics applications, developing clinical decision support systems that use these methods to analyze gene expression data can prevent potential errors in survival estimation, provide appropriate and individualized treatment to patients, and improve cancer prognosis.
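
As an illustration of the kind of algorithm the abstract refers to, the following is a minimal sketch, not the method of any study reviewed here. It trains a small artificial neural network on synthetic data standing in for gene expression profiles. The data generator, the class name `TinyANN`, the network size, the learning rate, and the binary short-term versus long-term survival labels are all assumptions made for the example.

```python
import math
import random


def make_synthetic_expression(n_patients=40, n_genes=20, n_informative=3, seed=0):
    """Synthetic stand-in for microarray data: rows are patients, columns are genes."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_patients):
        label = i % 2  # hypothetical labels: 1 = long-term survivor, 0 = short-term
        shift = 1.0 if label == 1 else -1.0
        # Only the first n_informative genes carry signal; the rest are noise.
        row = [rng.gauss(shift if g < n_informative else 0.0, 1.0)
               for g in range(n_genes)]
        X.append(row)
        y.append(label)
    return X, y


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


class TinyANN:
    """One hidden tanh layer and a sigmoid output, trained by batch gradient descent."""

    def __init__(self, n_inputs, n_hidden=4, seed=0):
        rng = random.Random(seed)
        self.W1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.W2 = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.W1, self.b1)]
        prob = sigmoid(sum(w * h for w, h in zip(self.W2, hidden)) + self.b2)
        return hidden, prob

    def loss(self, X, y):
        """Mean binary cross-entropy over all patients."""
        eps = 1e-12
        total = 0.0
        for x, t in zip(X, y):
            _, p = self.forward(x)
            total -= t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
        return total / len(X)

    def train(self, X, y, epochs=400, lr=0.05):
        n = len(X)
        for _ in range(epochs):
            gW1 = [[0.0] * len(X[0]) for _ in self.W1]
            gb1 = [0.0] * len(self.b1)
            gW2 = [0.0] * len(self.W2)
            gb2 = 0.0
            for x, t in zip(X, y):
                hidden, p = self.forward(x)
                d_out = p - t  # gradient of the loss w.r.t. the output pre-activation
                gb2 += d_out
                for j, h in enumerate(hidden):
                    gW2[j] += d_out * h
                    d_hid = d_out * self.W2[j] * (1.0 - h * h)  # tanh derivative
                    gb1[j] += d_hid
                    for k, xk in enumerate(x):
                        gW1[j][k] += d_hid * xk
            # Apply the averaged gradients.
            for j in range(len(self.W2)):
                self.W2[j] -= lr * gW2[j] / n
                self.b1[j] -= lr * gb1[j] / n
                for k in range(len(self.W1[j])):
                    self.W1[j][k] -= lr * gW1[j][k] / n
            self.b2 -= lr * gb2 / n

    def predict(self, x):
        return 1 if self.forward(x)[1] >= 0.5 else 0


if __name__ == "__main__":
    X, y = make_synthetic_expression()
    net = TinyANN(n_inputs=len(X[0]))
    print("loss before training:", net.loss(X, y))
    net.train(X, y)
    print("loss after training:", net.loss(X, y))
```

The sketch treats survival as a two-class label for simplicity. An analysis of real patients would need measured expression profiles, held-out validation, and a way to handle censored survival times.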



Issue: Vol 46 No 2 (2017)
Section: Review Article(s)
Keywords: Survival; Cancer; Gene expression; Machine-learning techniques; Clinical decision support system

Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
How to Cite
BASHIRI A, GHAZISAEEDI M, SAFDARI R, SHAHMORADI L, EHTESHAM H. Improving the Prediction of Survival in Cancer Patients by Using Machine Learning Techniques: Experience of Gene Expression Data: A Narrative Review. Iran J Public Health. 2017;46(2):165-172.