Analysis of the effectiveness of machine learning algorithms in big data processing

Keywords: artificial intelligence, predictive analytics, data processing, classification algorithms, model optimization

Abstract

This article analyzes the effectiveness of machine learning algorithms and identifies optimal approaches to applying them under heavy loads and large data volumes. Special attention is paid to tasks that demand both high accuracy and high speed, such as financial forecasting, medical diagnostics, and behavioral data analysis. The methodology comprised a comparative analysis of machine learning algorithms, including linear regression, decision trees, support vector machines, and deep neural networks. The key factors affecting processing speed and accuracy were evaluated: data size, model complexity, computational resources, and data quality. A series of experiments on real-world datasets assessed each algorithm’s accuracy, training time, and computational resource demands. The results showed that deep neural networks achieve high accuracy on unstructured data but require significant computational resources and training time. Linear regression and decision trees processed simpler datasets quickly, though their accuracy decreased as task complexity increased. Support vector machines proved effective for classification and prediction tasks, particularly in financial and medical applications. Random forests were efficient for text classification, balancing speed and accuracy. The conclusions indicate that the choice of algorithm depends on the specific characteristics of the task, the data size, and the accuracy requirements. Optimizing models for distributed computing environments is a key direction for improving performance, because it enables parallelization and reduces training time.
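
The comparison described in the abstract (accuracy and training time per algorithm) can be illustrated with a minimal benchmarking sketch. This is not the authors' experimental code. The names `benchmark`, `make_data`, `MajorityClass`, and `NearestCentroid` are hypothetical, the data are synthetic, and the two toy classifiers are stand-ins chosen only so the example runs with the Python standard library. Any model exposing `fit` and `predict` could presumably be passed in the same way, for example scikit-learn's `DecisionTreeClassifier` or `SVC`.

```python
import random
import time
from collections import Counter


class MajorityClass:
    """Baseline stand-in: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class NearestCentroid:
    """Simple stand-in: assigns each point to the class with the closest mean."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, value in enumerate(row):
                acc[i] += value
        self.centroids_ = {
            label: [s / counts[label] for s in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, X):
        def closest(row):
            # Pick the label whose centroid has the smallest squared distance.
            return min(
                self.centroids_,
                key=lambda label: sum(
                    (a - b) ** 2 for a, b in zip(row, self.centroids_[label])
                ),
            )

        return [closest(row) for row in X]


def benchmark(models, X_train, y_train, X_test, y_test):
    """Return {name: {"train_seconds": ..., "accuracy": ...}} for each model."""
    results = {}
    for name, model in models.items():
        start = time.perf_counter()
        model.fit(X_train, y_train)
        elapsed = time.perf_counter() - start
        predictions = model.predict(X_test)
        correct = sum(p == t for p, t in zip(predictions, y_test))
        results[name] = {"train_seconds": elapsed, "accuracy": correct / len(y_test)}
    return results


def make_data(n, seed):
    """Synthetic two-class data (an assumption): class 1 is shifted by +3 on both features."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(3 * label, 1), rng.gauss(3 * label, 1)])
        y.append(label)
    return X, y


X_train, y_train = make_data(2000, seed=0)
X_test, y_test = make_data(500, seed=1)
results = benchmark(
    {"majority": MajorityClass(), "centroid": NearestCentroid()},
    X_train, y_train, X_test, y_test,
)
for name, metrics in results.items():
    print(f"{name}: {metrics['accuracy']:.3f} accuracy, {metrics['train_seconds']:.4f}s")
```

Timing only the `fit` call keeps training cost separate from prediction cost, which mirrors the abstract's distinction between training time and accuracy. Measuring memory or other resource demands would need additional instrumentation that this sketch does not attempt.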

Published: 2024-09-28
How to Cite
Bondarchuk, O., Kozub, V., & Kozub, Y. (2024). Analysis of the effectiveness of machine learning algorithms in big data processing. COMPUTER-INTEGRATED TECHNOLOGIES: EDUCATION, SCIENCE, PRODUCTION, (56), 107-116. https://doi.org/10.36910/6775-2524-0560-2024-56-13
Section: Computer science and computer engineering