IT Infrastructure Downtime Preemption using Hybrid Machine Learning and NLP

Roy, Chiranjiv; Moitra, Sourov; Malhotra, Rashika; Srinivasan, Subramaniyan; Das, Mainak

doi:10.15439/2015F400

Article - details

Journal

Annals of Computer Science and Information Systems

2015 | 6 | 39--44

Article title

IT Infrastructure Downtime Preemption using Hybrid Machine Learning and NLP

Authors

Chiranjiv Roy, Sourov Moitra, Rashika Malhotra, Subramaniyan Srinivasan, Mainak Das

Abstract

IT Infrastructure Management and server downtime have been an area of exploration by researchers and industry experts for over a decade. Despite the research on web server downtime, system failure, fault prediction and related topics, there is a void in the field of IT Infrastructure Downtime Management. Downtime in an IT Infrastructure can cause enormous financial, reputational and relationship losses for customer and vendor. Our attempt is to address this gap by developing an innovative architecture which predicts IT Infrastructure failure. We have used a hybrid approach of human-machine interaction through Big Data, Machine Learning, NLP and IR. We sourced real-time machine, operating system and application logs, together with unstructured case notes, into an algorithm for multi-dimensional symptoms mining, using iterative deepening depth-first search traversal to create transactions for Sequential Pattern Mining of symptoms to events. The mined patterns went through multiple statistical tests and review by technology experts to create and update a dynamic Pattern Dictionary. This dictionary is used for training unsupervised and supervised classification models of machine learning, namely SVM and Random Forest, to score and predict new logs in real time. The approach is also dynamic: it uses unsupervised clustering methods to give directions to the technicians on future or unknown patterns of errors or faults, to constantly update the Pattern Dictionary and to improve classification for new IT products. (original abstract)

Keywords

Neuro-Linguistic Programming (NLP); Algorithms; Machine learning

Journal

Annals of Computer Science and Information Systems

Year

2015

Pages

39--44

Contributors

author

Chiranjiv Roy

Technology Services, GSD CSC Bangalore, India

author

Sourov Moitra

Technology Services, GSD CSC Bangalore, India

author

Rashika Malhotra

Technology Services, GSD CSC Bangalore, India

author

Subramaniyan Srinivasan

Technology Services, GSD CSC Bangalore, India

author

Mainak Das

Technology Services, GSD CSC Bangalore, India

Document type

Bibliography

Identifiers

DOI

10.15439/2015F400

YADDA identifier

bwmeta1.element.ekon-element-000171422726
