2015 | 5 | 355--364
Article title

Concepts Extraction from Unstructured Polish Texts: a Rule Based Approach

Authors
Title variants
Publication languages
EN
Abstracts
EN
We present a recently developed solution for extracting concepts from unstructured Polish texts, with a special focus on the correct morphological forms of the obtained concept names. As Polish is a highly inflected language, detected names need to be transformed according to Polish grammar rules. We propose a user-friendly method for specifying transformation patterns, based on a simple annotation language. Annotations prepared by a user are compiled into transformation rules. During the concept extraction process, the input document is split into sentences and the rules are applied to the sequences of words that make up each sentence. Recognized strings forming concept names are aggregated at various levels and assigned scores. We also report the results of initial experiments performed on a medical text. (original abstract)
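The pipeline outlined in the abstract (sentence splitting, rule application over word sequences, aggregation with scores) can be illustrated with a minimal Python sketch. This is not the authors' implementation or their annotation language: the toy lexicon `LEXICON`, the rule format in `RULES`, and all function names are hypothetical, and a plain frequency count stands in for the scoring, which the abstract does not specify. The example only shows the general idea of mapping an inflected phrase such as "ostrego zapalenia płuc" to its base form "ostre zapalenie płuc" while leaving the genitive modifier unchanged.

```python
import re
from collections import Counter

# Hypothetical toy lexicon: surface form -> (tag, nominative form).
# A real system would consult a morphological dictionary instead.
LEXICON = {
    "ostrego": ("adj", "ostre"),
    "ostre": ("adj", "ostre"),
    "zapalenia": ("noun", "zapalenie"),
    "zapalenie": ("noun", "zapalenie"),
    "płuc": ("gen", "płuc"),
}

# Hypothetical rule format: a tag pattern plus one action per word.
# "nom" puts the word in the nominative, "keep" leaves it unchanged.
RULES = [
    (("adj", "noun", "gen"), ("nom", "nom", "keep")),
    (("noun", "gen"), ("nom", "keep")),
]


def split_sentences(text):
    """Split text after sentence-final punctuation (a naive assumption)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and return its word tokens."""
    return re.findall(r"\w+", sentence.lower())


def apply_rules(tokens):
    """Scan left to right and emit a normalized name for each rule match."""
    names = []
    i = 0
    while i < len(tokens):
        matched = False
        for pattern, actions in RULES:
            window = tokens[i:i + len(pattern)]
            if len(window) < len(pattern):
                continue
            tags = tuple(LEXICON.get(w, ("unk", w))[0] for w in window)
            if tags == pattern:
                words = [LEXICON[w][1] if a == "nom" else w
                         for w, a in zip(window, actions)]
                names.append(" ".join(words))
                i += len(pattern)
                matched = True
                break
        if not matched:
            i += 1
    return names


def extract_concepts(text):
    """Aggregate normalized concept names over the whole document."""
    scores = Counter()
    for sentence in split_sentences(text):
        scores.update(apply_rules(tokenize(sentence)))
    return scores


text = ("Pacjent z objawami ostrego zapalenia płuc. "
        "Ostre zapalenie płuc wymaga leczenia.")
print(extract_concepts(text))
```

In this sketch both the genitive and the nominative occurrence are counted under the same normalized name, which is the reason for transforming detected names before aggregating them.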
Year
Volume
5
Pages
355--364
Physical description
Contributors
author
  • AGH University of Science and Technology Kraków, Poland
Bibliography
  • S. Acedański, "A morphosyntactic brill tagger for inflectional languages," in Advances in Natural Language Processing. Springer, 2010, pp. 3-14.
  • C. Blake and W. Pratt, "Better rules, fewer features: a semantic approach to selecting features from text," in Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on. IEEE, 2001, pp. 59-66.
  • S. Bloehdorn, P. Cimiano, and A. Hotho, "Learning ontologies to improve text clustering and classification," in From Data and Information Analysis to Knowledge Engineering, ser. Studies in Classification, Data Analysis, and Knowledge Organization, M. Spiliopoulou, R. Kruse, C. Borgelt, A. Nürnberger, and W. Gaul, Eds. Springer Berlin Heidelberg, 2006, pp. 334-341. [Online]. Available: http://dx.doi.org/10.1007/3-540-31314-1_40
  • C. Carpineto and G. Romano, Concept data analysis: Theory and applications. John Wiley & Sons, 2004.
  • J. Challis, "Lateral thinking in information retrieval white paper," Concept Searching, Tech. Rep., 2003.
  • S.-M. Chen, J.-s. Ke, and J.-F. Chang, "Knowledge representation using fuzzy petri nets," Knowledge and Data Engineering, IEEE Transactions on, vol. 2, no. 3, pp. 311-319, Sep 1990.
  • P. Cimiano, A. Hotho, and S. Staab, "Learning concept hierarchies from text corpora using formal concept analysis." J. Artif. Intell. Res.(JAIR), vol. 24, pp. 305-339, 2005.
  • J. Daciuk, "Incremental construction of finite-state automata and transducers, and their use in the natural language processing," Ph.D. dissertation, Gdansk University of Technology, ETI faculty, Gabriela Narutowicza 11/12, 80-233 Gdansk Poland, 1998.
  • N. Dalvi, R. Kumar, B. Pang, R. Ramakrishnan, A. Tomkins, P. Bohannon, S. Keerthi, and S. Merugu, "A web of concepts," in Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems. ACM, 2009, pp. 1-12.
  • F. Graliński, K. Jassem, and M. Junczys-Dowmunt, "Psi-toolkit: A natural language processing pipeline," in Computational Linguistics, ser. Studies in Computational Intelligence, A. Przepiórkowski, M. Piasecki, K. Jassem, and P. Fuglewicz, Eds. Springer Berlin Heidelberg, 2013, vol. 458, pp. 27-39. [Online]. Available: http://dx.doi.org/10.1007/978-3-642-34399-5_2
  • D. Janus, "Smyrna prosty konkordancer obsługujący język polski," 2015, accessed: May 2015. [Online]. Available: http://smyrna.danieljanus.pl/
  • K. Jensen, Coloured Petri Nets: Basic Concepts, Analysis Methods and Practical Use. Springer, 1996, vol. 1, no. Basic Concepts.
  • A. Ligeza, Principles of Verification of Rule-Based Systems. Springer, 2006.
  • A. Maedche and S. Staab, "Ontology learning for the semantic web," Intelligent Systems, IEEE, vol. 16, no. 2, pp. 72-79, Mar 2001.
  • E. H. Mamdani and S. Assilian, "An experiment in linguistic synthesis with a fuzzy logic controller," International Journal of Man-Machine Studies, vol. 7, no. 1, pp. 1-13, 1975. [Online]. Available: http://linkinghub.elsevier.com/retrieve/pii/S0020737375800022
  • M. Miłkowski, "Developing an open-source, rule-based proofreading tool," Software: Practice and Experience, vol. 40, no. 7, pp. 543-566, 2010.
  • "Morfologik," 2015, accessed: May 2015. [Online]. Available: http://morfologik.blogspot.com/
  • D. Naber, "Language tool style and grammar check," 2015, accessed: May 2015. [Online]. Available: https://www.languagetool.org/
  • S. Osinski and D. Weiss, "A concept-driven algorithm for clustering search results," Intelligent Systems, IEEE, vol. 20, no. 3, pp. 48-54, 2005.
  • A. Parameswaran, H. Garcia-Molina, and A. Rajaraman, "Towards the web of concepts: Extracting concepts from large datasets," Proceedings of the VLDB Endowment, vol. 3, no. 1-2, pp. 566-577, 2010.
  • P. Pęzik, "Wyszukiwarka PELCRA dla danych NKJP," 2012.
  • T. Ross, Fuzzy Logic with Engineering Applications. Wiley, 2009.
  • A. Stavrianou, P. Andritsos, and N. Nicoloyannis, "Overview and semantic issues of text mining," ACM Sigmod Record, vol. 36, no. 3, pp. 23-34, 2007.
  • P. Szwed, "Application of fuzzy ontological reasoning in an implementation of medical guidelines," in Human System Interaction (HSI), 2013 The 6th International Conference on, June 2013, pp. 342-349.
  • P. Szwed, "Video event recognition with Fuzzy Semantic Petri Nets," in Man-Machine Interactions 3, ser. Advances in Intelligent Systems and Computing, A. Gruca, T. Czachórski, and S. Kozielski, Eds. Springer International Publishing, 2014, vol. 242, pp. 431-439. [Online]. Available: http://dx.doi.org/10.1007/978-3-319-02309-0_47
  • P. Szwed and M. Komorkiewicz, "Object tracking and video event recognition with fuzzy semantic petri nets," in Proceedings of the 2013 Federated Conference on Computer Science and Information Systems, Kraków, Poland, September 8-11, 2013., M. Ganzha, L. A. Maciaszek, and M. Paprzycki, Eds., 2013, pp. 167-174. [Online]. Available: http://fedcsis.org/2013/
  • M. Wolinski, M. Milkowski, M. Ogrodniczuk, and A. Przepiórkowski, "Polimorf: a (not so) new open morphological dictionary for polish." in LREC, 2012, pp. 860-864.
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.ekon-element-000171422176
