2018 | 7 | No. 4 | 227--236
Article title

Towards the Data Structure for Effective Word Search

Contents
Title variants
Publication languages
EN
Abstracts
EN
The paper discusses the problem of finding the base forms of words in the Polish language. Polish has very extensive inflection, so an effective method for finding base forms is important in many NLP tasks, for example text indexing. A search method based on an open-source dictionary of the Polish language is presented. In this method it is important to design a structure that stores all words from the dictionary in a way that allows base forms to be found quickly. Two dictionary structures, a ternary search tree and an associative table, are presented and discussed. Tests are performed on six real texts and three crafted artificial texts, and the results are compared with other possible dictionary structures. Finally, conclusions about the effectiveness of the structures are formulated. (original abstract)
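The two structures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names, the `SAMPLE` word list, and the choice of a plain `dict` as the associative table are assumptions made only for illustration. Each inflected form is stored as a key whose value is its base form.

```python
from typing import Dict, Optional


class Node:
    """One ternary search tree node: a character and three children."""
    __slots__ = ("ch", "lo", "eq", "hi", "base")

    def __init__(self, ch: str) -> None:
        self.ch = ch
        self.lo: Optional["Node"] = None   # subtree for smaller characters
        self.eq: Optional["Node"] = None   # subtree for the next character
        self.hi: Optional["Node"] = None   # subtree for larger characters
        self.base: Optional[str] = None    # base form if a word ends here


class TernarySearchTree:
    """Maps inflected forms to base forms (hypothetical helper class)."""

    def __init__(self) -> None:
        self.root: Optional[Node] = None

    def insert(self, word: str, base: str) -> None:
        if not word:
            raise ValueError("word must be non-empty")
        if self.root is None:
            self.root = Node(word[0])
        node, i = self.root, 0
        while True:
            ch = word[i]
            if ch < node.ch:
                if node.lo is None:
                    node.lo = Node(ch)
                node = node.lo
            elif ch > node.ch:
                if node.hi is None:
                    node.hi = Node(ch)
                node = node.hi
            else:
                i += 1
                if i == len(word):
                    node.base = base       # mark the end of the word
                    return
                if node.eq is None:
                    node.eq = Node(word[i])
                node = node.eq

    def lookup(self, word: str) -> Optional[str]:
        node, i = self.root, 0
        while node is not None and word:
            ch = word[i]
            if ch < node.ch:
                node = node.lo
            elif ch > node.ch:
                node = node.hi
            else:
                i += 1
                if i == len(word):
                    return node.base       # None if only a prefix matched
                node = node.eq
        return None


# Toy data (assumption): a few Polish forms of "kot" (cat) and "pies" (dog).
SAMPLE: Dict[str, str] = {
    "kot": "kot", "kota": "kot", "kotem": "kot",
    "pies": "pies", "psa": "pies", "psy": "pies",
}

tree = TernarySearchTree()
for form, base_form in SAMPLE.items():
    tree.insert(form, base_form)

# Associative table: here simply Python's built-in hash-based dict.
table: Dict[str, str] = dict(SAMPLE)

print(tree.lookup("psy"), table.get("psy"))
```

Both structures answer the same query, so they can be swapped behind one lookup interface and timed on the same texts; the tree compares one character at a time, while the table hashes the whole word.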
Year
Volume
7
Issue
Pages
227--236
Physical description
Authors
  • Warsaw University of Life Sciences - SGGW, Poland
  • Warsaw University of Life Sciences - SGGW, Poland
Bibliography
  • Bentley J., Sedgewick R. (1998) Ternary Search Trees. Dr. Dobb's Journal, April 1998
  • Cormen T. H., Leiserson C. E., Rivest R. L., Stein C. (2001) Chapter 11 Hash Tables, Introduction to Algorithms (2nd ed.), MIT Press and McGraw-Hill
  • Karwowski W., Wrzeciono P. (2014) Automatic indexer for Polish agricultural texts. Information Systems in Management 2014, Vol. 3, No. 4, pp. 229-238
  • Karwowski W., Wrzeciono P. (2017) Methods of automatic topic mining in publications in agriculture domain. Information Systems in Management 2016, Vol. 6 (3), pp. 192-202
  • Karwowski W., Wrzeciono P. (2017) The dictionary structure for effective word search. Information Systems in Management 2017, Vol. 6 (4), pp. 284-293
  • Mehlhorn K., Sanders P. (2008) Chapter 4 Hash Tables and Associative Arrays, Algorithms and Data Structures: The Basic Toolbox, Springer
  • Morphosyntactic dictionary for the Polish language, https://github.com/morfologik/
  • Polish language dictionary, http://www.sjp.pl
  • Stempel - Algorithmic Stemmer for Polish Language, http://getopt.org/stempel/
  • Weiss D. (2005) A Survey of Freely Available Polish Stemmers and Evaluation of Their Applicability in Information Retrieval. 2nd Language and Technology Conference, Poznań, Poland, pp. 216-221
  • Wrzeciono P., Karwowski W. (2013) Automatic Indexing and Creating Semantic Networks for Agricultural Science Papers in the Polish Language, Computer Software and Applications Conference Workshops (COMPSACW), 2013 IEEE 37th Annual, Kyoto
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.ekon-element-000171546045
