The Architecture of the Intellectual System for Determining the Degree of Uniqueness of Armenian Text in a Multilingual Environment
Petrosyan Gevorg

Summary
Key words: natural language processing, interlingual borrowing, plagiarism, text originality, language model, sentence embedding, transformer
Determining the uniqueness of texts in a multilingual environment has become more important now that translation and paraphrasing tools are widely available. Traditional methods for detecting monolingual borrowings rely on word matches and therefore cannot capture semantic correspondences between typologically different languages such as Armenian, English, and Russian. Unless interlingual borrowings are identified, the degree of text uniqueness cannot be determined objectively or accurately. This work presents the architecture of the intellectual system for determining the degree of uniqueness of Armenian texts, with a focus on detecting interlingual borrowings in the Armenian–English and Armenian–Russian language pairs. The proposed architecture searches for borrowings at two levels. At the first level, possible sources are retrieved using the most informative parts of speech, which keeps candidate selection fast while remaining sufficiently accurate. At the second level, semantic analysis is performed with a multilingual transformer-based model that maps sentences from different languages into a common vector space. Structural analysis is also performed at this level, using a method based on Markov chains.
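The two-level search described in the summary could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the system's actual implementation: the function names (`select_candidates`, `find_borrowings`, `uniqueness`), the use of Jaccard overlap for the first level, the cosine threshold of 0.8, and the definition of uniqueness as the share of unmatched sentences are all hypothetical. The term sets are assumed to be already reduced to informative parts of speech and normalized to a shared form across languages, and `embed` stands in for any multilingual sentence encoder. The Markov-chain structural analysis is not covered.

```python
from math import sqrt
from typing import Callable, Dict, List, Sequence, Set, Tuple

Vector = Sequence[float]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two term sets; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_candidates(
    query_terms: Set[str], corpus_terms: Dict[str, Set[str]], top_k: int = 2
) -> List[str]:
    """Level 1 (assumed form): rank sources by term overlap, keep the top_k
    sources that share at least one term with the query."""
    scored = [(jaccard(query_terms, terms), doc_id)
              for doc_id, terms in corpus_terms.items()]
    scored = [(score, doc_id) for score, doc_id in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]


def find_borrowings(
    sentences: List[str],
    candidates: Dict[str, List[str]],
    embed: Callable[[str], Vector],
    threshold: float = 0.8,  # hypothetical cut-off, not taken from the paper
) -> List[Tuple[str, str, str, float]]:
    """Level 2 (assumed form): compare sentence vectors in the shared space."""
    hits = []
    for sentence in sentences:
        sentence_vec = embed(sentence)
        for doc_id, doc_sentences in candidates.items():
            for target in doc_sentences:
                score = cosine(sentence_vec, embed(target))
                if score >= threshold:
                    hits.append((sentence, doc_id, target, score))
    return hits


def uniqueness(sentences: List[str], hits: List[Tuple[str, str, str, float]]) -> float:
    """Assumed metric: fraction of sentences with no match above the threshold."""
    if not sentences:
        return 1.0
    matched = {hit[0] for hit in hits}
    return 1.0 - len(matched) / len(sentences)
```

Keeping the first level to cheap set operations means the costlier sentence encoding is only run against the few sources that survive candidate selection, which matches the speed motivation given in the summary.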
DOI: https://doi.org/10.58726/27382923-2025.2-66
