Function Reference Manual: Difference between revisions

Latest revision as of 19:34, 2 September 2008

Redirect to:

PLS Toolbox Topics

@@ Line 1: / Line 1: @@
-== cluster ==
+#REDIRECT [[PLS_Toolbox_Topics]]
-'''Purpose'''
-Agglomerative and K-means cluster analysis with dendrograms.
-'''Synopsis'''
-:[results,fig] = cluster(data,labels,options)
-:[results,fig] = cluster(data,options)
-:options = cluster('options')
-'''Description'''
-''cluster(data)'' performs a cluster analysis using either one of six agglomerative
-methods (including K-Nearest-Neighbor (KNN), furthest neighbor, and Ward's
-method) or the K-means clustering algorithm, and plots a dendrogram. The input ''data'' is of class double or
-dataset.
-Optional input ''labels'' can be used to put labels on the
-dendrogram plots. If ''data'' is ''M'' by ''N'', then ''labels'' must be a
-character array with ''M'' rows. When ''labels'' is not specified and ''data'' is class “double”, the
-dendrogram is plotted using sample numbers. When ''labels'' is not specified
-and ''data'' is class
-“dataset”, the dendrogram is plotted using the dataset's sample labels. If the labels field is empty,
-sample numbers are used.
-The output is a dendrogram showing the sample distances.
-Note: Calling ''cluster'' with no inputs starts the graphical user interface (GUI) for this analysis
-method.
-OUTPUTS:
-The outputs are ''results'', a structure containing the results of
-the clustering (defined below), and ''fig'', the handle to any plot created. The
-results structure contains the following fields:
-<p class="optionsbody">dist : the distance threshold at which each
-cluster forms.
-<p class="optionsbody">class : the classes of each sample (columns of class) for each distance
-(rows of class).
-<p class="optionsbody">order : the
-order of the samples which locates similar samples nearest to each other (this
-is the order used for the plots).
-<p class="optionsbody">linkage : a
-table of linkages where each row indicates a linkage of one group to another.
-Each row in the matrix represents one group. The first two columns indicate the
-sample or group numbers which were linked to form the group. The final column
-indicates the distance between linked items. Group numbers start at m+1 (where
-m is the number of samples in the input data matrix); thus, row j of this matrix
-is group number m+j. This matrix can be used with the Statistics Toolbox
-dendrogram function.
-The (results.class) matrix can be used with the
-(results.dist) matrix to determine clusters of samples for any distance using:
-<p class="MATLABCommand">&nbsp;
-<p class="MATLABCommand">results = cluster(data); %run the cluster analysis
-<p class="MATLABCommand">ind = max(find(results.dist&lt;threshold)); %largest index below the user-desired threshold
-<p class="MATLABCommand">thisclass = results.class(ind,:); %classes of all samples at that distance
-<p class="Ref2">Options
-<p class="optionsbody">''options'' = a structure array with the following fields:
-<p class="optionsbody">plots: ['none' | {'final'}] Governs plotting. When set to 'none', the
-distance/cluster matrix is returned without plotting; 'final' produces a dendrogram plot showing
-sample distances.
-<p class="optionsbody">algorithm: [] clustering algorithm, one of:
-<p class="optionsbody">&nbsp;&nbsp;&nbsp;&nbsp;'knn' {DEFAULT} : K-Nearest Neighbor
-<p class="optionsbody">&nbsp;&nbsp;&nbsp;&nbsp;'fn' : Furthest Neighbor
-<p class="optionsbody">&nbsp;&nbsp;&nbsp;&nbsp;'avgpair' : Average Paired Distance
-<p class="optionsbody">&nbsp;&nbsp;&nbsp;&nbsp;'med' : Median
-<p class="optionsbody">&nbsp;&nbsp;&nbsp;&nbsp;'cnt' : Centroid
-<p class="optionsbody">&nbsp;&nbsp;&nbsp;&nbsp;'ward' : Ward's Method
-<p class="optionsbody">&nbsp;&nbsp;&nbsp;&nbsp;'kmeans' : K-means
-<p class="optionsbody">preprocessing: {[]} Preprocessing structure
-or keyword (see PREPROCESS),
-<p class="optionsbody">pca: [{'off'} | 'on'] if 'on' then CLUSTER performs PCA first and clustering on the
-scores,
-<p class="optionsbody">ncomp: [] number of PCA factors to use {default = [], the user is
-prompted to select the number of factors from the SSQ table},
-<p class="optionsbody">mahalanobis: [{'off'} | 'on'] if 'on'
-then a Mahalanobis distance on the scores is used,
-<p class="optionsbody">slack: [0] integer number indicating how many samples can be
-"overridden" when two class branches merge. If the smaller of the two
-classes has no more than this number of samples, the branch will be absorbed
-into the larger class. This feature is only valid when classes are supplied in
-the input data. A value of 0 (zero) disables this feature.
-<p class="optionsbody">&nbsp;
-The default options can be retrieved using: options = cluster('options');
-<p class="Ref2">See Also
-agcluster, [analysis.html analysis], [corrmap.html corrmap], dendrogram, [gcluster.html gcluster], [simca.html simca]
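
As a rough illustration of how the options structure and the threshold lookup described in the removed text fit together, the following MATLAB sketch may help. It assumes PLS_Toolbox is installed and on the path, and that cluster behaves as the text above documents. The data matrix, the choice of 'ward', and the threshold value are hypothetical and chosen only for illustration.

```matlab
% Sketch only: relies on the cluster interface as described in the removed
% reference text; not verified against any particular PLS_Toolbox release.
data = [randn(10,3); randn(10,3) + 5];    % hypothetical data: 20 samples, 3 variables

options           = cluster('options');   % retrieve the default options structure
options.plots     = 'none';               % return results without a dendrogram
options.algorithm = 'ward';               % Ward's method instead of the default 'knn'

results = cluster(data, options);

threshold = 2;                            % hypothetical distance threshold
ind = find(results.dist < threshold, 1, 'last');   % same index as max(find(...))
if isempty(ind)
    disp('No cluster forms below the chosen threshold.');
else
    thisclass = results.class(ind, :);    % class of every sample at that distance
    disp(unique(thisclass));              % distinct class numbers at that level
end
```

The explicit isempty check is an addition to the original snippet: if no entry of results.dist falls below the threshold, indexing results.class with an empty index would return an empty matrix rather than a class assignment.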
