2014 | 2 | 331--335
Article title

Data Cleansing of the Fire & Rescue Text Corpus. The Case Study of Correction of the Misspellings and Segmentation into Sentences

Title variants
Publication languages
EN
Abstracts
EN
The article presents a case study of applying data cleansing methods and segmentation procedures in order to correct and enhance the structure of the domain corpus of fire service. During the study we present our approach and the results in the task of correcting the misspellings, as well as the method of segmenting the corpus into sentences. (original abstract)
Year
Volume
2
Pages
331--335
Physical description
Authors
  • The Main School of Fire Service
  • The Main School of Fire Service
Bibliography
  • Elzinga P., Poelmans J., Viaene S., Dedene G., and Morsing S., "Terrorist threat assessment with formal concept analysis," in Intelligence and Security Informatics (ISI), 2010 IEEE International Conference on. IEEE, 2010, pp. 77-82.
  • Hernández M. A. and Stolfo S. J., "Real-world data is dirty: Data cleansing and the merge/purge problem," Data mining and knowledge discovery, vol. 2, no. 1, pp. 9-37, 1998.
  • Krasuski A., Kreński K., Wasilewski P., and Łazowy S., "Granular approach in knowledge discovery," in Rough Sets and Knowledge Technology. Springer, 2012, pp. 416-421.
  • Lee M. L., Lu H., Ling T. W., and Ko Y. T., "Cleansing data for mining and warehousing," in Database and Expert Systems Applications. Springer, 1999, pp. 751-760.
  • Levenshtein V. I., "Binary codes capable of correcting deletions, insertions, and reversals," Soviet physics doklady, vol. 10, pp. 707-710, 1966.
  • Müller H. and Freytag J.-C., Problems, methods, and challenges in comprehensive data cleansing. Professoren des Inst. für Informatik, 2005.
  • Poelmans J., Elzinga P., Dedene G., Viaene S., and Kuznetsov S., "A concept discovery approach for fighting human trafficking and forced prostitution," Conceptual Structures for Discovering Knowledge, pp. 201-214, 2011.
  • Rudolf M. and Świdziński M., "Automatic utterance boundaries recognition in large Polish text corpora," in Intelligent Information Processing and Web Mining. Springer, 2004, pp. 247-256.
  • Wikipedia, "Zipf's law," http://en.wikipedia.org/wiki/Zipf's_law, [Access: 23.04.2014].
  • Work C., "Ewidencja zdarzeń - EWID99," Abacus, http://www.ewid.pl/, Tech. Rep., [Access: 23.04.2014].
  • Zipf G. K., "Selected studies of the principle of relative frequency in language." 1932.
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.ekon-element-000171325183
