2018 | 7 | nr 4 | 227--236
Towards the Data Structure for Effective Word Search

This paper discusses the problem of finding the base forms of words in the Polish language. Polish is highly inflected, so an effective method for finding base forms is important in many NLP tasks, for example text indexing. A search method based on an open-source dictionary of Polish is presented. The method depends on a structure that stores all words from the dictionary in a way that allows base forms to be found quickly. Two dictionary structures, a ternary search tree and an associative table, are presented and discussed. Tests are performed on six real texts and three crafted artificial texts, and the results are compared with other possible dictionary structures. Finally, conclusions about the effectiveness of the structures are formulated. (original abstract)
  • Warsaw University of Life Sciences - SGGW, Poland
  • Warsaw University of Life Sciences - SGGW, Poland
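
The two structures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class and method names (`TernarySearchTree`, `insert`, `lookup`) and the few sample word forms are assumptions made for illustration, and the associative table is represented here simply by a built-in `dict`.

```python
from typing import Optional


class _Node:
    """One character of a stored word form, with three child links."""
    __slots__ = ("ch", "lo", "eq", "hi", "base")

    def __init__(self, ch: str) -> None:
        self.ch = ch
        self.lo: Optional["_Node"] = None   # characters smaller than ch
        self.eq: Optional["_Node"] = None   # next character of the word
        self.hi: Optional["_Node"] = None   # characters greater than ch
        self.base: Optional[str] = None     # base form if a word ends here


class TernarySearchTree:
    """Maps inflected word forms to base forms (hypothetical API)."""

    def __init__(self) -> None:
        self._root: Optional[_Node] = None

    def insert(self, form: str, base: str) -> None:
        if not form:
            raise ValueError("empty word form")
        if self._root is None:
            self._root = _Node(form[0])
        node, i = self._root, 0
        while True:
            ch = form[i]
            if ch < node.ch:
                if node.lo is None:
                    node.lo = _Node(ch)
                node = node.lo
            elif ch > node.ch:
                if node.hi is None:
                    node.hi = _Node(ch)
                node = node.hi
            else:
                i += 1
                if i == len(form):
                    node.base = base        # mark the end of the word
                    return
                if node.eq is None:
                    node.eq = _Node(form[i])
                node = node.eq

    def lookup(self, form: str) -> Optional[str]:
        node, i = self._root, 0
        while node is not None and form:
            ch = form[i]
            if ch < node.ch:
                node = node.lo
            elif ch > node.ch:
                node = node.hi
            else:
                i += 1
                if i == len(form):
                    return node.base        # None if only a prefix is stored
                node = node.eq
        return None


# Illustrative entries only; a real dictionary would hold millions of forms.
SAMPLE = [("kot", "kot"), ("kota", "kot"), ("kotem", "kot"), ("domu", "dom")]

tree = TernarySearchTree()
for form, base in SAMPLE:
    tree.insert(form, base)

table = dict(SAMPLE)  # associative table variant of the same mapping
```

Both `tree.lookup("kotem")` and `table.get("kotem")` return the stored base form, and both return `None` for a form that was never inserted. Which structure is faster or more compact on real texts is the question the paper's tests address; this sketch makes no claim about that.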
