Feature Engineering for Anti-Fraud Models Based on Anomaly Detection
This paper presents two algorithms for identifying a customer's fraudulent intent. Both generate variables that improve the predictive power of fraud models, and together they form a novel approach to feature engineering based on anomaly detection. Because the choice of statistical model improves a solution's predictive capability only to a limited extent, most attention should be paid to choosing proper predictors. The main finding is that enriching a model with additional predictors further improves both the predictive power and the interpretability of an anti-fraud model, at the cost of increased model complexity. Although the paper contributes to the fraud prediction problem, the method can generate input variables for any tool equipped with a variable selection algorithm. The approach is illustrated on a dataset from a European bank. (original abstract)
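The abstract does not spell out the two algorithms, so the following is only a minimal sketch of the general idea of anomaly-based feature engineering, not the authors' method. It assumes a simple k-nearest-neighbour distance as the anomaly score and appends that score to each record as an extra predictor. The function names, the toy data, and the choice of k are all hypothetical, and the inputs are assumed to be already scaled to comparable ranges.

```python
import math


def knn_anomaly_score(rows, k=3):
    """Mean Euclidean distance from each row to its k nearest other rows.

    A larger score means the row lies further from its neighbours,
    i.e. it looks more anomalous. This scoring rule is an assumption
    chosen for illustration only.
    """
    if k >= len(rows):
        raise ValueError("k must be smaller than the number of rows")
    scores = []
    for i, a in enumerate(rows):
        # Distances to every other row, smallest first.
        dists = sorted(math.dist(a, b) for j, b in enumerate(rows) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def add_anomaly_feature(rows, k=3):
    """Return a copy of rows with the anomaly score appended as a new column."""
    scores = knn_anomaly_score(rows, k)
    return [list(row) + [score] for row, score in zip(rows, scores)]


# Hypothetical toy data: (requested amount, declared income), pre-scaled.
applications = [
    (1.0, 1.1),
    (1.2, 0.9),
    (0.9, 1.0),
    (1.1, 1.2),
    (1.0, 0.8),
    (5.0, 0.2),  # deliberately unusual record
]

enriched = add_anomaly_feature(applications, k=3)
for row in enriched:
    print(row)
```

The enriched rows could then be passed to any model with built-in variable selection, which would decide whether the generated score earns a place among the predictors.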