LONG-RANGE CORRELATIONS AS A CRITERION FOR RELEVANCE OF WORDS IN TEXTUAL DOCUMENTS

Authors

  • Oleh Kushnir, Ivan Franko National University of Lviv
  • Bohdan Horon, Ivan Franko National University of Lviv
  • Ivan Dovhan, Ivan Franko National University of Lviv
  • Marta Dufanets, Ivan Franko National University of Lviv

Keywords:

information retrieval, keyword detection, long-range correlations, word-token clustering

Abstract

Efficient keyword detection underpins successful information retrieval. We present a new criterion for the relevance of words in textual documents, based on the long-range autocorrelations of word-token time series. This approach is compared with a canonical keyword detection method based on word-token clustering in a text.
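
The abstract does not define the criterion itself, so the following is only a minimal sketch of the two kinds of quantities it mentions, written under stated assumptions. It assumes that the word-token time series is a binary indicator series (1 where the word occurs, 0 elsewhere), that autocorrelation means the usual normalized sample autocorrelation at a given lag, and that clustering is measured by the standard deviation of gaps between successive occurrences divided by the mean gap. All function names are hypothetical and are not taken from the paper.

```python
import math


def indicator_series(tokens, word):
    """Binary series: 1 where `word` occurs in the token list, 0 elsewhere (assumed encoding)."""
    return [1 if t == word else 0 for t in tokens]


def autocorrelation(series, lag):
    """Normalized sample autocorrelation of `series` at the given lag.

    Returns 0.0 for a constant series or when the lag is not shorter than the series.
    """
    n = len(series)
    if n == 0 or lag >= n:
        return 0.0
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / n
    if var == 0:
        return 0.0
    cov = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    ) / (n - lag)
    return cov / var


def clustering_sigma(tokens, word):
    """Standard deviation of gaps between occurrences divided by the mean gap.

    This is an assumed stand-in for a word-token clustering measure;
    it returns 0.0 when fewer than two gaps are available.
    """
    positions = [i for i, t in enumerate(tokens) if t == word]
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    if len(gaps) < 2:
        return 0.0
    mean = sum(gaps) / len(gaps)
    var = sum((g - mean) ** 2 for g in gaps) / len(gaps)
    return math.sqrt(var) / mean
```

Under these assumptions, a word could be ranked by how slowly `autocorrelation(indicator_series(tokens, word), lag)` decays as the lag grows, and that ranking could be set against the one given by `clustering_sigma(tokens, word)`. Whether the paper ranks words this way is not stated in the abstract.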

Published

2025-06-03