Iranian Journal of Public Health 2017. 46(11):1563-1571.

Prognosis and Early Diagnosis of Ductal and Lobular Type in Breast Cancer Patient



Background: Breast cancer is one of the most common cancers among women and has a high mortality rate. Prognosis and early diagnosis can considerably reduce this mortality. Intelligent systems are therefore being developed to predict and diagnose breast cancer early and thereby reduce mortality among women.

Methods: Overall, 208 samples were collected from 2014 to 2015 from two oncologists' offices and Javadalaemeh Clinic in Kerman, southeastern Iran. The data source was the patients' medical records. Sixty-four data mining models were then applied in MATLAB and WEKA, and the precision and accuracy of each model were measured.

Results: Among the 64 data mining models, the Bayes-Net model achieved 95.67% accuracy and 95.70% precision and was therefore identified as the best model for prognosis and diagnosis of breast cancer.
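As an illustration of the two reported metrics, the sketch below computes accuracy and a class-weighted precision from a confusion matrix. The abstract does not state how precision was averaged, so the weighted average is an assumption. The function names and the matrix counts are hypothetical and are not the study's data; only the total of 208 samples is taken from the Methods.

```python
def accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def weighted_precision(cm):
    """Per-class precision, averaged with weights equal to each class's true count.

    Rows of cm are actual classes, columns are predicted classes.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    result = 0.0
    for j in range(n):
        predicted_j = sum(cm[i][j] for i in range(n))  # column sum: predicted as class j
        actual_j = sum(cm[j])                          # row sum: truly class j
        precision_j = cm[j][j] / predicted_j if predicted_j else 0.0
        result += precision_j * actual_j / total
    return result


# Hypothetical two-class confusion matrix with 208 samples in total.
# These counts are invented for illustration only.
example_cm = [
    [150, 4],
    [5, 49],
]

if __name__ == "__main__":
    print(f"accuracy  = {accuracy(example_cm):.4f}")
    print(f"precision = {weighted_precision(example_cm):.4f}")
```

With 208 samples, an accuracy of 95.67% corresponds arithmetically to 199 correct classifications (199/208 ≈ 0.9567), which is the diagonal total used in the hypothetical matrix.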

Conclusion: Intelligent and reliable data mining models are proposed. Hence, these models are recommended as a useful tool for breast cancer prediction as well as medical decision-making.




Keywords: Diagnosis, Breast cancer, Ductal and lobular, Data mining models





Creative Commons Attribution-NonCommercial 3.0

This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License which allows users to read, copy, distribute and make derivative works for non-commercial purposes from the material, as long as the author of the original work is cited properly.