Method of compiling declarations of classes of WEB-programming languages into byte-code languages based on machine learning

Keywords: compiler, machine learning, generative artificial intelligence, byte-code CIL, .NET, TypeScript, ChatGPT, LLM, ML.NET

Abstract

This article proposes a machine-learning-based method for compiling class declarations of web programming languages into byte-code languages. The method combines two kinds of artificial intelligence. To check a program entered by the user, a neural network is trained to perform binary classification of the input program as correct or incorrect. The SDCA (stochastic dual coordinate ascent) algorithm, which has performed well on binary classification problems, was chosen for training the network. To generate CIL instructions from the input program, generative artificial intelligence based on an LLM from OpenAI is used: the ChatGPT model was fine-tuned on relevant examples. The method was evaluated on a prototype compiler that checks a program written in TypeScript for correctness and generates the corresponding code in the Common Intermediate Language (CIL) with sufficiently high accuracy. The result shows that machine learning methods can be used to build compilers. This approach reduces compiler development to preparing a correct data set for fine-tuning the corresponding models, which is much simpler and less time-consuming than the classical approach, in which the lexical, syntactic, and semantic analyzers and the code generator must be modified and refined for each new or changed construct of the input programming language.
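The two-stage structure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `Validator`, `CilGenerator`, `compile`, `stubValidator`, and `stubGenerator` are hypothetical, the validator is a trivial brace-balance check standing in for the trained binary classifier, and the generator returns a fixed CIL class skeleton standing in for the fine-tuned LLM.

```typescript
// Hypothetical sketch of the two-stage pipeline; all names are illustrative.

/** Stage 1: binary classifier that accepts or rejects the input program. */
type Validator = (source: string) => boolean;

/** Stage 2: generator that maps an accepted program to CIL text. */
type CilGenerator = (source: string) => string;

type CompileResult =
  | { status: "ok"; cil: string }
  | { status: "rejected"; error: string };

/** Runs the generator only if the classifier accepts the input. */
function compile(
  source: string,
  isValid: Validator,
  generate: CilGenerator
): CompileResult {
  if (!isValid(source)) {
    return { status: "rejected", error: "input rejected by the classifier" };
  }
  return { status: "ok", cil: generate(source) };
}

// Stand-in for the trained classifier (assumption): requires balanced
// braces and the presence of the `class` keyword.
const stubValidator: Validator = (source) => {
  let depth = 0;
  for (let i = 0; i < source.length; i++) {
    const ch = source.charAt(i);
    if (ch === "{") depth++;
    if (ch === "}") depth--;
    if (depth < 0) return false;
  }
  return depth === 0 && /\bclass\b/.test(source);
};

// Stand-in for the fine-tuned LLM (assumption): emits an empty CIL class
// declaration carrying the name found in the TypeScript source.
const stubGenerator: CilGenerator = (source) => {
  const match = /class\s+([A-Za-z_$][\w$]*)/.exec(source);
  const name = match?.[1] ?? "Anonymous";
  return `.class public auto ansi beforefieldinit ${name} extends [System.Runtime]System.Object { }`;
};
```

In a real system the two stubs would be replaced by calls to the trained classifier and to the fine-tuned model; keeping them behind plain function types means the pipeline itself does not change when either model is retrained.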

Published
2024-09-28
How to Cite
Ivanenko, A., & Marchenko, O. (2024). Method of compiling declarations of classes of WEB-programming languages into byte-code languages based on machine learning. COMPUTER-INTEGRATED TECHNOLOGIES: EDUCATION, SCIENCE, PRODUCTION, (56), 165-173. https://doi.org/10.36910/6775-2524-0560-2024-56-21
Section
Computer science and computer engineering