2016 | 5 | no. 1 | 24--35
Article title

Effective Multi-Label Classification Method with Applications to Text Document Categorization

Publication languages
EN
Abstracts
EN
The growing number of online document repositories has increased the demand for automatic categorization algorithms. In many cases, however, a text should be assigned to more than one class. This paper considers a new multi-label classification algorithm for short documents. The presented Labels Chain (LC) algorithm, a problem transformation method, is based on the relationships between labels and uses the labels obtained at each step as new attributes in the subsequent classification steps. The method is validated by experiments on several real text datasets of restaurant reviews with different numbers of instances, using the kNN, Naive Bayes, SVM and C4.5 classifiers. The results show that the LC method performs well compared with problem transformation methods such as Binary Relevance and Label Powerset. (original abstract)
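The abstract does not spell out the LC procedure, so the following is only a minimal sketch of the general idea it mentions: one binary classifier per label, where each classifier also receives the labels decided before it as extra attributes. The 1-nearest-neighbour base learner, the fixed label order, and the names `nearest_neighbor_fit`, `fit_label_chain` and `predict_label_chain` are illustrative assumptions, not the authors' implementation.

```python
from typing import Callable, List, Sequence

Predictor = Callable[[Sequence[float]], int]


def nearest_neighbor_fit(X: List[List[float]], y: List[int]) -> Predictor:
    """Assumed base learner: a 1-nearest-neighbour binary classifier."""
    def predict(x: Sequence[float]) -> int:
        # Return the label of the training row closest to x (squared Euclidean distance).
        best = min(
            range(len(X)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], x)),
        )
        return y[best]
    return predict


def fit_label_chain(
    X: List[List[float]],
    Y: List[List[int]],
    fit: Callable[[List[List[float]], List[int]], Predictor] = nearest_neighbor_fit,
) -> List[Predictor]:
    """Train one model per label; model j sees the features plus labels 0..j-1."""
    augmented = [list(row) for row in X]
    models: List[Predictor] = []
    for j in range(len(Y[0])):
        column = [labels[j] for labels in Y]
        # Pass a copy so later augmentation does not alter this model's training data.
        models.append(fit([list(row) for row in augmented], column))
        # The true value of label j becomes a new attribute for the next model.
        for row, value in zip(augmented, column):
            row.append(float(value))
    return models


def predict_label_chain(models: List[Predictor], x: Sequence[float]) -> List[int]:
    """Predict labels in order, feeding each prediction to the next model."""
    features = list(x)
    predicted: List[int] = []
    for model in models:
        label = model(features)
        predicted.append(label)
        features.append(float(label))
    return predicted
```

By contrast, Binary Relevance would train each label's classifier on the original attributes alone, so it could not exploit relationships between labels in this way.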
Year
Volume
5
Issue
Pages
24--35
Physical description
Authors
author
  • Lodz University of Technology
  • Lodz University of Technology
Bibliography
  • [1] Glinka K., Zakrzewska D. (2015) Effective Multi-label Classification Method for Multidimensional Datasets, Proceedings of the 11th International Conference FQAS 2015, Cracow, Poland, 127-138.
  • [2] Schapire R.E., Singer Y. (2000) BoosTexter: A boosting-based system for text categorization, Machine Learning 39(2/3), 135-168.
  • [3] Li T., Ogihara M. (2004) Content-based music similarity search and emotion detection, Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (volume 5), Canada, 705-708.
  • [4] Tsoumakas G., Katakis I., Vlahavas I. (2010) Mining Multi-label Data, Maimon O., Rokach L. [ed.]: Data Mining and Knowledge Discovery Handbook, Springer US, Boston, MA, 667-685.
  • [5] Madjarov G., Kocev D., Gjorgjevikj D., Džeroski S. (2012) An extensive experimental comparison of methods for multi-label learning, Pattern Recognition 45(9), 3084-3104.
  • [6] Sajnani H., Javanmardi S., McDonald D.W., Lopes C.V. (2011) Multi-label classification of short text: A study on wikipedia barnstars, Analyzing Microtext: Papers from the 2011 AAAI Workshop.
  • [7] Boutell M.R., Luo J., Shen X., Brown C.M. (2004) Learning multi-label scene classification, Pattern Recognition 37(9), 1757-1771.
  • [8] Esuli A., Fagni T., Sebastiani F. (2008) Boosting multi-label hierarchical text categorization, Information Retrieval 11(4), 287-313.
  • [9] Comité F.D., Gilleron R., Tommasi M. (2003) Learning multi-label alternating decision trees from text and data, Lecture Notes in Computer Science, vol. 2734, Springer, Heidelberg, 35-49.
  • [10] Lee S.-J., Jiang J.-Y. (2014) Multilabel text categorization based on fuzzy relevance clustering, IEEE Transactions on Fuzzy Systems 22(6), 1457-1471.
  • [11] Read J., Pfahringer B., Holmes G., Frank E. (2009) Classifier Chains for Multi-label Classification, Buntine W., Grobelnik M., Mladenić D., Shawe-Taylor J. [ed.]: Machine Learning and Knowledge Discovery in Databases, Lecture Notes in Computer Science, vol. 5782, Springer, Heidelberg, 254-269.
  • [12] Kajdanowicz T., Kazienko P. (2012) Multi-label classification using error correcting output codes, Applied Mathematics and Computer Science 22(4), 829-840.
  • [13] http://www.yelp.com/
  • [14] http://www.ics.uci.edu/~vpsaini/
  • [15] Koehn P. (2010) Statistical Machine Translation, Cambridge University Press, UK.
  • [16] Witten I.H., Frank E., Hall M.A. (2011) Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, San Francisco, USA.
  • [17] http://www.cs.waikato.ac.nz/ml/weka/index.html
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.ekon-element-000171428617
