Peculiarities of the application of distributed computing for processing streaming data.

Keywords: cloud service, hardware and software complex, distributed information system, routing of data flow, task deployment, channel capacity, task schedule.

Abstract

This paper analyzes modern algorithms for stream processing of digital data arrays and methods for formalizing these procedures in order to build an appropriate mathematical apparatus. A generalized scheme of streaming data arrays and a scheme of the hardware-software complex of a cloud service are constructed. The paper describes how the hardware-software complex of a network node is organized under the architecture of a distributed information system, and identifies the tasks that must be solved to optimize this structure. In particular, it considers two problems: optimizing the request-processing schedule in accordance with the operation of the overall complex, and optimizing the algorithms for parallel processing. A specialized mathematical model of the distributed information system of the cloud service's hardware-software complex has been developed. The model consists of a central computing node and peripheral computing nodes, and includes the parameters of the corresponding components, functions that represent the task deployment procedure, and routing functions for the input data flow. Based on this model, a method has been developed for calculating throughput and delay time when processing requests from cloud service users; these two indicators are treated as the objective functions.

Published
2021-06-26
How to Cite
Arpentii, S. (2021). Peculiarities of the application of distributed computing for processing streaming data. COMPUTER-INTEGRATED TECHNOLOGIES: EDUCATION, SCIENCE, PRODUCTION, (43), 171-176. https://doi.org/10.36910/6775-2524-0560-2021-43-28
Section
Computer science and computer engineering