Recurrent Drifts: Applying Fuzzy Logic to Concept Similarity Function
Recurrent drift is a specific type of concept drift characterised by the reappearance of previously seen concepts. In such cases, the learning process can be avoided, or at least shortened, by reusing an already trained classification model. In this paper we propose Fuzzy-Rec, a framework that deals with recurrent concept drift by means of a repository of classification models and a similarity function. Fuzzy logic is used to implement the similarity function needed to compare different classification models. This is a crucial aspect of handling drift recurrence, since some measure is needed to determine which stored model best fits a previously seen context. The experimental results in this paper show that this fuzzy similarity function provides excellent results on both synthetic and real datasets. We conclude that fuzzy logic comparisons between models can lead to more efficient reuse of previously seen concepts, saving computational resources by applying not just identical models but also similar ones.
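The idea of a model repository queried through a fuzzy similarity function can be illustrated with a minimal Python sketch. The abstract does not specify how Fuzzy-Rec computes similarity, so everything here is an assumption made for illustration: models are plain callables, similarity is based on how often two models agree on a window of recent instances, and the fuzzy set "similar" is a simple ramp membership function with hypothetical breakpoints `low` and `high`. The names `agreement`, `fuzzy_similar`, and `ModelRepository` are likewise hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical model type: maps a feature vector to a class label.
Model = Callable[[Sequence[float]], int]


def agreement(a: Model, b: Model, window: Sequence[Sequence[float]]) -> float:
    """Fraction of instances in the window on which both models predict the same label."""
    if not window:
        return 0.0
    same = sum(1 for x in window if a(x) == b(x))
    return same / len(window)


def fuzzy_similar(agree: float, low: float = 0.5, high: float = 0.9) -> float:
    """Membership degree in the fuzzy set 'similar' (assumed ramp shape).

    Returns 0 at or below `low`, 1 at or above `high`, and rises linearly in between.
    """
    if agree <= low:
        return 0.0
    if agree >= high:
        return 1.0
    return (agree - low) / (high - low)


@dataclass
class ModelRepository:
    """Stores trained models and returns the most similar one, if it is similar enough."""
    threshold: float = 0.5  # assumed minimum membership degree for reuse
    models: List[Model] = field(default_factory=list)

    def store(self, model: Model) -> None:
        self.models.append(model)

    def best_match(
        self, candidate: Model, window: Sequence[Sequence[float]]
    ) -> Tuple[Optional[Model], float]:
        """Return (stored model, degree) for the best match, or (None, degree) if below threshold."""
        best: Optional[Model] = None
        best_degree = 0.0
        for stored in self.models:
            degree = fuzzy_similar(agreement(candidate, stored, window))
            if degree > best_degree:
                best, best_degree = stored, degree
        if best is not None and best_degree >= self.threshold:
            return best, best_degree
        return None, best_degree
```

Under these assumptions, a stored model is reused after a drift only when its membership degree in "similar" reaches the threshold; otherwise `best_match` returns `None` and a new model would have to be trained. Returning a graded degree rather than a yes/no answer is what allows similar, and not only identical, models to be reused.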