Evaluation of clustering results based on the silhouette index

Najman, Kamila

Article - details

Journal

Prace i Materiały Wydziału Zarządzania Uniwersytetu Gdańskiego

2006 | no. 2 | 111-120

Article title

Ocena wyniku grupowania w oparciu o indeks silhouette [Evaluation of clustering results based on the silhouette index]

Authors

Kamila Najman

Title variants

Evaluation of Clusters on the Strength Silhouette Index

Publication languages

Abstracts

The article presents the silhouette index and its methodology as a measure used to assess the quality of clustering. The results of the analysis are demonstrated using the k-means algorithm and the silhouette index; the number of clusters at which the index reaches its maximum is taken as the optimal number of clusters in the data set. The silhouette index appears to be a useful tool for obtaining information and knowledge about a data set.

Clustering is an unsupervised classification scheme in which no a priori knowledge of the data set is available. Predicting the correct number of clusters is a fundamental problem in classification, and many clustering algorithms require the number of clusters to be specified beforehand. To overcome this problem, various cluster validity indices have been proposed to assess the quality of a clustering partition. The main goal of a cluster validity technique is thus to identify the partition for which a measure of quality is optimal, so validity indices play an important role in clustering. A large number of cluster validity indices can be found in the clustering literature; the silhouette index is one of them. This article describes a cluster validity index and its methodology, which can provide a measure of the goodness of clustering for different partitions of a data set. Results are demonstrated using the k-means algorithm and the silhouette index proposed by Rousseeuw. The maximum value of this index indicates the best partitioning. The silhouette index may be an effective tool for discovering knowledge in data sets. (original abstract)
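
The abstracts do not reproduce the formula, so the following is only a minimal sketch of the standard silhouette definition from Rousseeuw (1987): for object i, a(i) is the mean distance to the other members of its own cluster, b(i) is the smallest mean distance to the members of any other cluster, and s(i) = (b(i) - a(i)) / max(a(i), b(i)). The function names, the toy points, and the two hand-written partitions are assumptions made for illustration; they are not taken from the article, and a real analysis would obtain the partitions from k-means runs with different numbers of clusters.

```python
from math import dist  # Euclidean distance, Python 3.8+


def silhouette_values(points, labels):
    """Return s(i) = (b(i) - a(i)) / max(a(i), b(i)) for every point."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    values = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # Rousseeuw's convention: s(i) = 0 for a single-object cluster.
            values.append(0.0)
            continue
        # a(i): mean distance to the other members of the same cluster.
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b(i): smallest mean distance to the members of another cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


def mean_silhouette(points, labels):
    """Average silhouette width of a whole partition."""
    vals = silhouette_values(points, labels)
    return sum(vals) / len(vals)


# Toy data (hypothetical): two visibly separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels_k2 = [0, 0, 0, 1, 1, 1]  # candidate partition with 2 clusters
labels_k3 = [0, 0, 1, 2, 2, 2]  # candidate partition with 3 clusters

candidates = {2: labels_k2, 3: labels_k3}
# The number of clusters whose partition has the largest mean silhouette.
best_k = max(candidates, key=lambda k: mean_silhouette(points, candidates[k]))
print(best_k)
```

On this toy data the two-cluster partition has the larger mean silhouette, which is the selection rule the abstracts describe: the number of clusters is chosen where the index is largest.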

Keywords

Taxonomy, Grouping methods, Statistical analysis, Mathematical statistics, Cluster validity indices

Journal

Prace i Materiały Wydziału Zarządzania Uniwersytetu Gdańskiego

Year

2006

Issue

no. 2

Pages

111-120

Physical description

Contributors

author

Kamila Najman

Bibliography

Bolshakova A., Azuaje F. (2003), Cluster validation techniques for genome expression data, Signal Processing 83.
Kaufman L., Rousseeuw P.J. (1990), Finding groups in data: an introduction to cluster analysis, Wiley, New York.
Najman K., Najman K. (2005), Analityczne metody ustalania liczby skupień, Prace Naukowe Akademii Ekonomicznej we Wrocławiu, Nr 1076 Taksonomia 12, Klasyfikacja i analiza danych - teoria i zastosowania, Wrocław.
Rousseeuw P.J. (1987), Silhouettes: a graphical aid to the interpretation and validation of cluster analysis, J. Comput. Appl. Math. 20.
Stąpor K. (2005), Automatyczna klasyfikacja obiektów, Akademicka Oficyna Wydawnicza EXIT, Warszawa.
Struyf A., Hubert M., Rousseeuw P.J. (1997), Integrating robust clustering techniques in S-PLUS, Computational Statistics & Data Analysis 26.

Document type

Bibliography

Identifiers

YADDA identifier

bwmeta1.element.ekon-element-000126710944
