Ocena wpływu wartości stałej Minkowskiego na możliwość identyfikacji struktury grupowej danych o wysokim wymiarze

Migdał-Najman, Kamila

doi:10.15611/pn.2015.384.20

Article - details

Journal

Prace Naukowe Uniwersytetu Ekonomicznego we Wrocławiu. Taksonomia

2015 | 24 | no. 384 Klasyfikacja i analiza danych - teoria i zastosowania | 192-199

Article title

Ocena wpływu wartości stałej Minkowskiego na możliwość identyfikacji struktury grupowej danych o wysokim wymiarze

Authors

Kamila Migdał-Najman

Content

Full texts:

http://www.dbc.wroc.pl/publication/32805 [remote]

Title variants

The Assessment of Impact value of Minkowski's Constant for the Possibility of Group Structure Identification in High dimensional data

Publication languages

Abstracts

In analysing the differentiation of units in a multidimensional space, the choice of an appropriate distance measure is important. This choice gains significance when the data set under analysis contains a large number of units described by hundreds of features. Distance measures based on the power metric are used most often. In this metric, the choice of an appropriate value of the Minkowski constant becomes essential. The aim of the presented research is to assess the influence of the value of the Minkowski constant and of the dimension of the space on the group structure that can be obtained. Based on simulation studies, the article shows that in a high-dimensional space the use of a fractional exponent in the power norm affects the ability to identify the existing group structure of the units under study. (original abstract)

An important decision in analysing the variability of units in a multidimensional space is the choice of a distance measure suited to the problem at hand. This choice is of particular importance when the data set is described by hundreds of features. In empirical studies, the most commonly used distance measure is the power (Minkowski) metric. When the units are described by a very large number of features, an appropriate choice of the value of the Minkowski constant is important because it affects the properties of the power metric, and these properties may change as dimensionality increases. The aim of this paper is to assess the influence of the Minkowski constant and of the dimension of the space on the group structure that can be obtained. Based on simulation studies, the author shows that, in a high-dimensional space, the use of a fractional exponent in the power metric affects the ability to identify the group structure.
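
The effect described in the abstracts can be illustrated with a small sketch. It is not the author's simulation design: the uniform random data, the origin as query point, the sample size, and the function names `minkowski_distance`, `relative_contrast` and `contrast_table` are all assumptions made for illustration. The sketch computes the power (Minkowski) distance d_p(x, y) = (sum_i |x_i - y_i|^p)^(1/p) and the relative contrast (d_max - d_min) / d_min for several dimensions and exponents. For p < 1 the formula no longer satisfies the triangle inequality, so it is a dissimilarity rather than a metric in the strict sense.

```python
import numpy as np


def minkowski_distance(x, y, p):
    """Power (Minkowski) distance between two vectors for an exponent p > 0."""
    diff = np.abs(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    return float(np.sum(diff ** p) ** (1.0 / p))


def relative_contrast(points, query, p):
    """Return (d_max - d_min) / d_min for distances from `query` to each row of `points`."""
    diffs = np.abs(points - query)
    dists = np.sum(diffs ** p, axis=1) ** (1.0 / p)
    return float((dists.max() - dists.min()) / dists.min())


def contrast_table(dims=(2, 10, 100, 1000), exponents=(0.5, 1.0, 2.0),
                   n_points=500, seed=0):
    """Relative contrast per dimension and exponent on uniform random data (an assumption)."""
    rng = np.random.default_rng(seed)
    table = {}
    for d in dims:
        points = rng.random((n_points, d))  # points drawn uniformly from [0, 1)^d
        query = np.zeros(d)                 # hypothetical query point: the origin
        table[d] = {p: relative_contrast(points, query, p) for p in exponents}
    return table


if __name__ == "__main__":
    for d, row in contrast_table().items():
        print(d, {p: round(c, 3) for p, c in row.items()})
```

Running the script prints one row per dimension, so the contrast obtained with a fractional exponent such as p = 0.5 can be compared against p = 1 and p = 2 as the dimension grows.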

Keywords

Analiza skupień

Cluster analysis

Journal

Prace Naukowe Uniwersytetu Ekonomicznego we Wrocławiu. Taksonomia

Year

2015

Volume

Issue

no. 384 Klasyfikacja i analiza danych - teoria i zastosowania

Pages

192-199

Physical description

Creators

author

Kamila Migdał-Najman

Uniwersytet Gdański

Bibliography

Bellman R.E., 1961, Adaptive Control Processes, A Guided Tour, Princeton University Press, Princeton, New Jersey.
Beyer K., Goldstein J., Ramakrishnan R., Shaft U., 1999, When Is "Nearest Neighbor" Meaningful, International Conference on Database Theory, Jerusalem, Israel, pp. 217-235.
Bishop C.M., 1995, Neural Networks for Pattern Recognition, Clarendon Press, Oxford.
Hair J.F., Anderson R.E., Tatham R.L., Black W.C., 1995, Multivariate Data Analysis with Readings, Prentice Hall International, Ltd., London (4th ed.).
Hinneburg A., Aggarwal C.C., Keim D.A., 2001, On the Surprising Behavior of Distance Metrics in High Dimensional Space, [in:] Van den Bussche, Vianu V. (eds.), International Conference on Database Theory, LNCS, vol. 1973, Springer, Heidelberg, pp. 420-434.
Hinneburg A., Aggarwal C.C., Keim D.A., 2000, What is the Nearest in High Dimensional Spaces, The VLDB Journal, Bibliothek der Universität Konstanz, pp. 506-515.
Houle M.E., Kriegel H.P., Kröger P., Schubert E., Zimek A., 2010, Can Shared-Neighbor Distances Defeat the Curse of Dimensionality, [in:] Proceedings of the 22nd International Conference on Scientific and Statistical Database Management, Heidelberg, pp. 482-500.
Schnitzer D., Flexer A., Tomasev N., 2014, Choosing the metric in high-dimensional spaces based on hub analysis, European Symposium on Artificial Neural Networks ESANN.
Scott D., Thompson J., 1983, Probability Density Estimation in Higher Dimensions, [in:] Gentle J. (ed.), Computer Science and Statistics: Proceedings of the Fifteenth Symposium on the Interface, pp. 173-179.
Silverman B., 1986, Density Estimation for Statistics and Data Analysis, Chapman and Hall, London.
Taylor C.C., 1977, Principal Component and Factor Analysis, [in:] O'Muircheartaigh C.A., Payne C. (eds.), The Analysis of Survey Data, vol. I: Exploring data structures, Wiley & Sons, New York, pp. 89-123.
Walesiak M., 2005, Problemy selekcji i ważenia zmiennych w zagadnieniach klasyfikacji, Taksonomia 12, Prace Naukowe Uniwersytetu Ekonomicznego we Wrocławiu, no. 1076, pp. 106-118.
White H., 1989, Learning in artificial neural networks: a statistical perspective, Neural Computation, vol. 1, pp. 425-464.
Verleysen M., François D., 2005, The curse of dimensionality in data mining and time series prediction, 8th International Workshop on Artificial Neural Networks, IWANN, pp. 758-770.

Document type

Bibliography

Identifiers

DOI

10.15611/pn.2015.384.20

YADDA identifier

bwmeta1.element.ekon-element-000171379599
