

2014 | 43 | No. 1 | 133-160
Article title

Specialized, MSE-Optimal m-Estimators of the Rule Probability Especially Suitable for Machine Learning

Content
Title variants
Publication languages
EN
Abstracts
EN
The paper presents an improved sample-based estimation of the rule probability, an important indicator of rule quality and credibility in machine learning systems. It concerns rules obtained, e.g., with decision trees or rough set theory. Individual rules are frequently supported by only a small or very small number of data items. The rule probability is usually estimated with global estimators such as the frequency, Laplace, or m-estimator, which are constructed for the full probability interval [0, 1]. The paper shows that the precision of rule probability estimation can be increased considerably by using m-estimators specialized for the interval [ph_min, ph_max] given by a problem expert. The paper also presents a new interpretation of the m-estimator parameters, which can be optimized in the estimators. (original abstract)
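
As a rough illustration of the estimators named in the abstract, the sketch below (in Python) implements the standard frequency, Laplace, and m-estimators, plus one hypothetical way of restricting a Laplace-type estimate to an expert-given interval [ph_min, ph_max] by rescaling it onto that interval. The function names and the interval-restricted formula are illustrative assumptions only, not the MSE-optimal specialized m-estimators derived in the paper.

# Sketch of sample-based rule-probability estimators (illustrative, not the paper's formulas).
# n_s: number of samples supporting the rule, n: total number of samples covered by the rule.

def frequency_estimator(n_s, n):
    """Relative frequency n_s / n; falls back to 0.5 when the rule covers no samples."""
    return n_s / n if n > 0 else 0.5


def laplace_estimator(n_s, n, k=2):
    """Laplace correction for k classes: (n_s + 1) / (n + k)."""
    return (n_s + 1) / (n + k)


def m_estimator(n_s, n, p0, m):
    """Classical m-estimator: (n_s + m * p0) / (n + m),
    with prior probability p0 and equivalent sample size m."""
    return (n_s + m * p0) / (n + m)


def interval_estimator(n_s, n, ph_min, ph_max):
    """Hypothetical interval-restricted estimator (an assumption, not the paper's result):
    a Laplace-type estimate rescaled affinely onto the expert-given interval
    [ph_min, ph_max], so the estimate never leaves the expert's bounds."""
    return ph_min + (ph_max - ph_min) * (n_s + 1) / (n + 2)


if __name__ == "__main__":
    # A rule supported by 3 of the 4 samples it covers; expert bounds: [0.6, 0.9].
    print(frequency_estimator(3, 4))              # 0.75
    print(laplace_estimator(3, 4))                # ~0.667
    print(m_estimator(3, 4, p0=0.75, m=2))        # 0.75
    print(interval_estimator(3, 4, 0.6, 0.9))     # 0.8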
Year
2014
Volume
43
Issue
1
Pages
133-160
Physical description
Authors
  • West Pomeranian University of Technology in Szczecin
  • Maritime University of Szczecin
Bibliography
  • CESTNIK, B. (1990), Estimating probabilities: A crucial task in machine learning. In: L. C. Aiello (Ed.), ECAI'90. Pitman, London, 147-149.
  • CESTNIK, B. (1991), Estimating probabilities in machine learning. Ph.D. thesis, University of Ljubljana, Faculty of Computer and Information Science.
  • CHAWLA, N. V., CIEŚLAK, D. A. (2006), Evaluating calibration of probability estimation from decision trees. AAAI Workshop on the Evaluation Methods in Machine Learning, The AAAI Press, Boston, July 2006, 18-23.
  • CICHOSZ, P. (2000), Systemy uczące się (Learning systems). Wydawnictwa Naukowo-Techniczne, Warsaw, Poland.
  • CUSSENS, J. (1993), Bayes and pseudo-Bayes estimates of conditional probabilities and their reliabilities. In: Proceedings of the European Conference on Machine Learning, ECML-93. LNCS 667, 136-152.
  • FURNKRANZ, J., FLACH, P. A. (2005), ROC 'n' rule learning - towards a better understanding of covering algorithms. Machine Learning, 58(1), 39-77.
  • HAJEK, A. (2010), Interpretations of probability. The Stanford Encyclopedia of Philosophy (E. N. Zalta, ed.). Available from: http://plato.stanford.edu/entries/probability-interpret/.
  • LAROSE, D. T. (2010), Discovering Statistics. W. H. Freeman and Company, New York.
  • LUTZ, H., WENDT, W. (1998), Taschenbuch der Regelungstechnik. Verlag Harri Deutsch, Frankfurt am Main.
  • MOZINA, M., DEMSAR, J., ZABKAR, J., BRATKO, I. (2006), Why is rule learning optimistic and how to correct it. In: European Conference on Machine Learning, ECML 2006. LNCS 4212, 330-340.
  • PIEGAT, A., LANDOWSKI, M. (2012), Optimal estimator of hypothesis probability for data mining problems with small samples. Int. J. Appl. Math. Comput. Sci., 22, 3, 629-645.
  • POLKOWSKI, L. (2002), Rough Sets. Physica-Verlag, Heidelberg, New York.
  • ROKACH, L., MAIMON, O. (2008), Data Mining with Decision Trees: Theory and Applications. Series in Machine Perception and Artificial Intelligence, Vol. 69. World Scientific Publishing Co. Pte. Ltd., New Jersey, Singapore.
  • SIEGLER, R. S. (1976), Three Aspects of Cognitive Development. Cognitive Psychology, 8, 481-520.
  • SIEGLER, R. S. (1994), Balance Scale Weight & Distance Database. UCI Machine Learning Repository. Available from: http://archive.ics.uci.edu/ml/datasets/Balance+Scale.
  • STARZYK, A., WANG, F. (2004), Dynamic probability estimator for machine learning. IEEE Transactions on Neural Networks, 15(2), 298-308.
  • SULZMANN, J. N., FURNKRANZ, J. (2009), An empirical comparison of probability estimation techniques for probabilistic rules. In: J. Gama, V. S. Costa, A. Jorge, P. Brazdil (Eds.), Proceedings of the 12th International Conference on Discovery Science (DS-09), Porto, Portugal. Springer-Verlag, 317-331.
  • SULZMANN, J. N., FURNKRANZ, J. (2010), Probability estimation and aggregation for rule learning. Technical Report TUD-KE-201-03, TU Darmstadt, Knowledge Engineering Group.
  • WITTEN, I. H., FRANK, E. (2005), Data Mining. Second edition, Elsevier, Amsterdam.
  • ZADROZNY, B., ELKAN, C. (2001), Learning and decision making when costs and probabilities are both unknown. In: Proceedings of the Seventh International Conference on Knowledge Discovery and Data Mining, San Francisco, August 2001. ACM, 204-213.
  • ZHANG, Z. (1995), Parameter Estimation Techniques: A Tutorial with Application to Conic Fitting. M-estimators. INRIA. Available from: http://research.microsoft.com/en-us/um/people/zhang/INRIA/Publis/Tutorial-Estim/Main.html.
  • ZIARKO, W. (1999), Decision making with probabilistic decision tables. In: N. Zhong, ed., RSFDGrC'99 Proceedings of the 7th International Workshop on New Directions in Rough Sets, Data Mining, and Granular-Soft Computing, Yamaguchi, Japan. Springer-Verlag, Berlin, Heidelberg, New York, 463-471.
  • VON MISES, R. (1957), Probability, Statistics and the Truth. Macmillan, Dover, New York.
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.ekon-element-000171481896
