Ocena wpływu wartości stałej Minkowskiego na możliwość identyfikacji struktury grupowej danych o wysokim wymiarze przy występowaniu wartości nietypowych

Migdał-Najman, Kamila; Najman, Krzysztof

Article details

Journal

Zarządzanie i Finanse

2015 | Vol. 13, No. 4, Part 2 | 229--239

Article title

Ocena wpływu wartości stałej Minkowskiego na możliwość identyfikacji struktury grupowej danych o wysokim wymiarze przy występowaniu wartości nietypowych

Authors

Kamila Migdał-Najman, Krzysztof Najman

Content

Full texts:

http://zif.wzr.pl/pim/2015_4_2_14.pdf [remote]

Title variants

An Evaluation of the Impact of a Minkovsky Constant on the Possibility of Identification of the Group Structure in High Dimensional in the Presence of Outliers

Publication languages

Abstracts

An important decision in analyzing how units differ in a multidimensional space is the choice of a distance measure suited to the problem at hand. Empirical studies most often use distance measures based on the power metric. In this metric, when units are described by a very large number of features, the choice of an appropriate value of the Minkowski constant becomes important. The choice matters because it significantly affects how well the metric differentiates units. These properties change as the dimension of the space increases. The article analyzes the impact of the value of the Minkowski constant on the correct identification of clusters in high-dimensional data in the presence of outlying units. (original abstract, translated from Polish)

An important decision in the analysis of the variability of units in a multidimensional space is the choice of a distance measure that is appropriate for a given problem. In empirical studies, the most commonly used distance measures are based on the power metric. When the units are described by a very large number of features, the choice of an appropriate Minkowski constant in the power metric becomes relevant. The choice is very important because it affects the properties of the power metric. As dimensionality increases, the properties of the metric may change. The aim of this paper is to assess the influence of the Minkowski constant and of a high-dimensional space on the obtained group structure in the presence of outliers. (original abstract)
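The role of the Minkowski constant described in the abstracts can be illustrated with a minimal sketch. It assumes the usual definition of the power metric, d_p(x, y) = (sum_i |x_i - y_i|^p)^(1/p), where p is the Minkowski constant. The relative contrast statistic, the synthetic uniform data, the way outliers are generated, and all function names are assumptions made for illustration only; they are not taken from the article and do not reproduce its simulation design. Note that for p < 1 the formula no longer satisfies the triangle inequality, so it is a dissimilarity rather than a true metric.

```python
import random


def minkowski_distance(x, y, p):
    """Power (Minkowski) distance: (sum_i |x_i - y_i|**p) ** (1/p), p > 0."""
    if p <= 0:
        raise ValueError("p must be positive")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def relative_contrast(points, query, p):
    """(d_max - d_min) / d_min over the distances from query to each point.

    Illustrative statistic (an assumption, not the article's measure):
    small values mean the nearest and farthest points are hard to tell apart.
    """
    distances = [minkowski_distance(query, pt, p) for pt in points]
    d_min, d_max = min(distances), max(distances)
    return (d_max - d_min) / d_min


def demo(dim=200, n_points=100, n_outliers=5, seed=0):
    """Relative contrast for several values of p on synthetic data."""
    rng = random.Random(seed)
    # Hypothetical data: points drawn uniformly from the unit hypercube.
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    # Hypothetical outliers: a few points shifted far away in one feature.
    for pt in points[:n_outliers]:
        pt[0] += 10.0
    query = [rng.random() for _ in range(dim)]
    return {p: relative_contrast(points, query, p) for p in (0.5, 1, 2, 4)}


if __name__ == "__main__":
    for p, contrast in demo().items():
        print(f"p = {p}: relative contrast = {contrast:.3f}")
```

Running the script prints one contrast value per tested p. Changing dim, n_outliers, or the list of p values gives a quick feel for how the choice of the constant interacts with dimensionality and outliers, without implying anything about the article's own results.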

Keywords

Cluster analysis; Simulation; Scientific research

Journal

Zarządzanie i Finanse

Year

2015

Issue

Vol. 13, No. 4, Part 2

Pages

229--239

Physical description

Creators

author

Kamila Migdał-Najman

Uniwersytet Gdański

author

Krzysztof Najman

Uniwersytet Gdański


Document type

Bibliography

Identifiers

YADDA identifier

bwmeta1.element.ekon-element-000171415261
