A parsing tool based on the ANTLR parser generator and the Clojure language

Keywords: ANTLR, Clojure, parsing, parse tree, lexical analysis, token, reflection

Abstract

This article addresses the parsing of program code. A review of publications comparing existing parser generators shows that the ANTLR parser generator is popular among developers because of its simplicity, its broad feature set, and the quality and efficiency of the code it generates. The article presents a tool for parsing program code that is built on ANTLR and on Clojure, a modern functional programming language. Clojure's characteristics make tree-like data structures much easier to create, analyze, and modify than in non-functional programming languages, and its wide range of constructs allows code to be more concise and understandable. This is a significant advantage when processing complex data structures such as syntax trees. Existing Clojure libraries that offer ANTLR-based parsing run the generator in an inefficient interpreter mode. In contrast, the developed tool contains functions that give the developer full-fledged capabilities for parsing program code with ANTLR. These functions are: a generation function, which produces the lexer and parser classes and is intended to run as a separate stage of the project build; a registration function, which analyzes the generated classes and creates Clojure data structures that serve as an interface between Clojure programs and the generated Java classes; and a parsing function, which parses program code using the registered data structures. Because Clojure code runs on the Java Virtual Machine (JVM), these functions can use the classes of the ANTLR libraries, which are written in Java.
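
To illustrate the registration step described in the abstract, the following Java sketch shows one way reflection could discover the rule methods of a generated parser class and collect them in a lookup structure. It is an illustration under stated assumptions, not the tool's actual implementation: `DemoParser` and `RuleContext` are hypothetical stand-ins for an ANTLR-generated parser and ANTLR's `ParserRuleContext`, and `register` and `invokeRule` are names invented for this example. Java is used only to keep the sketch self-contained; the tool itself is written in Clojure.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

public class ParserRegistry {

    // Hypothetical stand-in for ANTLR's ParserRuleContext (a parse tree node).
    public static class RuleContext {
        public final String rule;
        public RuleContext(String rule) { this.rule = rule; }
    }

    // Hypothetical stand-in for a generated parser: one public
    // zero-argument method per grammar rule, each returning a context.
    public static class DemoParser {
        public RuleContext expr() { return new RuleContext("expr"); }
        public RuleContext term() { return new RuleContext("term"); }
        public void reset() { }                                   // not a rule
        public String getGrammarFileName() { return "Demo.g4"; }  // not a rule
    }

    // "Registration": inspect the class once and build a rule-name -> Method map.
    public static Map<String, Method> register(Class<?> parserClass, Class<?> contextType) {
        Map<String, Method> rules = new TreeMap<>();
        for (Method m : parserClass.getDeclaredMethods()) {
            boolean isRule = Modifier.isPublic(m.getModifiers())
                    && m.getParameterCount() == 0
                    && contextType.isAssignableFrom(m.getReturnType());
            if (isRule) {
                rules.put(m.getName(), m);
            }
        }
        return rules;
    }

    // "Parsing": look up a registered rule by name and invoke it on a parser instance.
    public static Object invokeRule(Map<String, Method> rules, Object parser, String ruleName) {
        Method m = rules.get(ruleName);
        if (m == null) {
            throw new IllegalArgumentException("Unknown rule: " + ruleName);
        }
        try {
            return m.invoke(parser);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, Method> rules = register(DemoParser.class, RuleContext.class);
        System.out.println(rules.keySet());   // prints [expr, term]
        RuleContext ctx = (RuleContext) invokeRule(rules, new DemoParser(), "expr");
        System.out.println(ctx.rule);         // prints expr
    }
}
```

A similar filter would presumably apply to a real ANTLR 4 parser, where each grammar rule without parameters is generated as a public zero-argument method returning a subclass of `ParserRuleContext`. Performing the reflective scan once, at registration time, keeps the per-parse cost to a map lookup and a method call.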

Published
2024-09-28
How to Cite
Kasianchuk, D., & Marchenko, O. (2024). A parsing tool based on the ANTLR parser generator and the Clojure language. COMPUTER-INTEGRATED TECHNOLOGIES: EDUCATION, SCIENCE, PRODUCTION, (56), 174-184. https://doi.org/10.36910/6775-2524-0560-2024-56-22
Section
Computer science and computer engineering