Application of decision trees for prediction of heart disease
Abstract
The study proposes using a decision tree to predict the onset of cardiovascular disease. The tree was built from a dataset of 12 key factors: age, gender, type of chest pain, resting blood pressure, cholesterol level, blood sugar level, type of resting electrocardiogram result, maximum heart rate observed during examination, presence of angina during physical exertion, exercise-induced ST segment depression relative to rest, slope of the ST segment at peak exercise, and diagnostic confirmation of heart disease. A correlation matrix of the investigated factors was constructed from the input dataset. The paper presents a table listing the studied factors, their English-language names, and the acceptable ranges of their values, and outlines the fundamental steps of the decision tree construction algorithm. A textual representation of the decision tree, with the transition conditions for its nodes, is provided. To assess how well the decision tree classifies the input data, the classification results are presented as a confusion matrix, and the precision and accuracy of the model are evaluated. A Receiver Operating Characteristic (ROC) analysis of the relationship between sensitivity and specificity yields an Area Under the Curve (AUC) value of 0.9. This AUC value corresponds to excellent classification quality, which is a prerequisite for using the proposed decision tree in the design of an information-diagnostic system for predicting the risk of cardiovascular disease.
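The evaluation measures named in the abstract (confusion matrix, accuracy, precision, sensitivity, specificity, and AUC) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 0.5 decision threshold, and the eight labels and scores are hypothetical and chosen only for illustration, and AUC is computed here through its pairwise-ranking interpretation rather than by integrating a plotted ROC curve.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp) for binary labels, where 1 means heart disease."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, sensitivity, and specificity from the confusion matrix."""
    tn, fp, fn, tp = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored above
    a random negative case; tied scores count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted probabilities (NOT data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
# Assumed decision threshold of 0.5 for turning scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(confusion_counts(y_true, y_pred))
print(classification_metrics(y_true, y_pred))
print(roc_auc(y_true, scores))
```

On real data the scores would be the class probabilities produced by the fitted decision tree for held-out patients. Changing the threshold moves sensitivity and specificity in opposite directions, which is the trade-off the ROC curve summarizes.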