AGNES
toyml.clustering.agnes.AGNES
dataclass
AGNES(n_cluster: int, linkage: Literal['single', 'complete', 'average'] = 'single', distance_metric: Literal['euclidean'] = 'euclidean', distance_matrix_: list[list[float]] = list(), clusters_: list[ClusterTree] = list(), labels_: list[int] = list(), cluster_tree_: Optional[ClusterTree] = None, linkage_matrix: list[list[float]] = list(), _cluster_index: int = 0)
Agglomerative clustering algorithm (Bottom-up Hierarchical Clustering)
Examples:
>>> from toyml.clustering import AGNES
>>> dataset = [[1, 0], [1, 1], [1, 2], [10, 0], [10, 1], [10, 2]]
>>> agnes = AGNES(n_cluster=2).fit(dataset)
>>> print(agnes.labels_)
[0, 0, 0, 1, 1, 1]
>>> # Using fit_predict method
>>> labels = agnes.fit_predict(dataset)
>>> print(labels)
[0, 0, 0, 1, 1, 1]
>>> # Using different linkage methods
>>> agnes = AGNES(n_cluster=2, linkage="complete").fit(dataset)
>>> print(agnes.labels_)
[0, 0, 0, 1, 1, 1]
>>> # Plotting dendrogram
>>> agnes = AGNES(n_cluster=1).fit(dataset)
>>> agnes.plot_dendrogram(show=True)
The AGNES Dendrogram Plot
References
- Zhou Zhihua
- Tan
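As a rough illustration of the bottom-up procedure, the sketch below starts with one cluster per sample and repeatedly merges the two closest clusters until the requested number remains. It is a minimal pure-Python sketch using single linkage and Euclidean distance, not toyml's implementation; the function name `agnes_sketch` is hypothetical.

```python
import math


def agnes_sketch(points, n_cluster):
    """Hypothetical helper: merge the two closest clusters until n_cluster remain."""
    # Start with every sample in its own cluster (a list of sample indices).
    clusters = [[i] for i in range(len(points))]

    def single_link(a, b):
        # Single linkage: smallest Euclidean distance between any two members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_cluster:
        # Find the pair of clusters with the smallest linkage distance.
        _, p, q = min(
            (single_link(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        # Merge cluster q into cluster p, then drop q (q > p, so p stays valid).
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]

    # Label each sample with the position of its cluster.
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


dataset = [[1, 0], [1, 1], [1, 2], [10, 0], [10, 1], [10, 2]]
print(agnes_sketch(dataset, n_cluster=2))  # [0, 0, 0, 1, 1, 1]
```

The brute-force search over all cluster pairs keeps the sketch short; it recomputes distances on every pass, which is fine for toy datasets only.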
linkage
class-attribute
instance-attribute
linkage: Literal['single', 'complete', 'average'] = 'single'
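The linkage method used to measure the distance between two clusters: 'single' (minimum pairwise distance), 'complete' (maximum pairwise distance), or 'average' (mean pairwise distance).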
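To make the three linkage options concrete, here is a small sketch that computes the distance between two clusters under each one, using the standard textbook definitions. The function `linkage_distance` and its signature are assumptions for illustration and are not part of toyml.

```python
import math


def linkage_distance(points, a, b, linkage="single"):
    """Hypothetical helper: distance between clusters a and b (lists of sample indices)."""
    # All Euclidean distances between a member of a and a member of b.
    pairwise = [math.dist(points[i], points[j]) for i in a for j in b]
    if linkage == "single":
        return min(pairwise)  # closest pair
    if linkage == "complete":
        return max(pairwise)  # farthest pair
    if linkage == "average":
        return sum(pairwise) / len(pairwise)  # mean over all pairs
    raise ValueError(f"unsupported linkage: {linkage!r}")


points = [[0, 0], [0, 1], [3, 0], [4, 0]]
left, right = [0, 1], [2, 3]
for name in ("single", "complete", "average"):
    print(name, linkage_distance(points, left, right, name))
```

For the same pair of clusters, single linkage never exceeds average linkage, which never exceeds complete linkage.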
distance_metric
class-attribute
instance-attribute
distance_metric: Literal['euclidean'] = 'euclidean'
The distance metric to use. Only 'euclidean' is currently supported.
distance_matrix_
class-attribute
instance-attribute
The distance matrix.
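The page does not say exactly what the matrix holds. Assuming it stores pairwise Euclidean distances as a nested list, such a matrix could be built as in the sketch below; `euclidean_distance_matrix` is a hypothetical name, not a toyml function.

```python
import math


def euclidean_distance_matrix(points):
    """Hypothetical helper: symmetric matrix of pairwise Euclidean distances."""
    n = len(points)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Only compute the upper triangle, then mirror it.
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


matrix = euclidean_distance_matrix([[0, 0], [3, 4], [6, 8]])
for row in matrix:
    print(row)
```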
clusters_
class-attribute
instance-attribute
clusters_: list[ClusterTree] = field(default_factory=list)
The clusters.
labels_
class-attribute
instance-attribute
The labels of each sample.
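One plausible way to derive per-sample labels from a list of clusters is to give each sample the position of the cluster that contains it. The sketch below assumes that convention and represents each cluster as a plain list of sample indices; `labels_from_clusters` is a hypothetical helper.

```python
def labels_from_clusters(clusters, n_samples):
    """Hypothetical helper: label each sample with the index of its cluster."""
    labels = [-1] * n_samples  # -1 marks a sample that belongs to no cluster
    for label, members in enumerate(clusters):
        for sample_index in members:
            labels[sample_index] = label
    return labels


print(labels_from_clusters([[0, 1, 2], [3, 4, 5]], n_samples=6))  # [0, 0, 0, 1, 1, 1]
```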
fit
Fit the model.
Source code in toyml/clustering/agnes.py
fit_predict
Fit the model and return the labels of each sample.
Source code in toyml/clustering/agnes.py
plot_dendrogram
Plot the dendrogram of the clustering result.
This method visualizes the hierarchical structure of the clustering using a dendrogram. It requires the number of clusters to be set to 1 during initialization.
PARAMETER | DESCRIPTION |
---|---|
figure_name | The filename for saving the plot. Defaults to "agnes_dendrogram.png". |
show | If True, displays the plot. Defaults to False. |
RAISES | DESCRIPTION |
---|---|
ValueError | If the number of clusters is not 1. |
Note
This method requires matplotlib and scipy to be installed.
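A dendrogram needs the full merge history, which is why clustering has to continue down to a single cluster. SciPy's dendrogram routine consumes that history as a linkage matrix whose rows are `[cluster_a, cluster_b, distance, merged_size]`, with the cluster created by the i-th merge receiving index `n_samples + i`. The pure-Python sketch below builds such rows with single linkage; whether toyml's `linkage_matrix` field follows exactly this layout is an assumption, and `single_linkage_matrix` is a hypothetical name.

```python
import math


def single_linkage_matrix(points):
    """Hypothetical helper: record every merge as [id_a, id_b, distance, size]."""
    n = len(points)
    # Active clusters: cluster id -> list of sample indices.
    active = {i: [i] for i in range(n)}
    rows = []
    next_id = n  # ids n, n + 1, ... are assigned to merged clusters
    while len(active) > 1:
        ids = sorted(active)
        # Closest pair of active clusters under single linkage.
        dist, a, b = min(
            (min(math.dist(points[i], points[j])
                 for i in active[p] for j in active[q]), p, q)
            for k, p in enumerate(ids)
            for q in ids[k + 1:]
        )
        merged = active.pop(a) + active.pop(b)
        rows.append([float(a), float(b), dist, float(len(merged))])
        active[next_id] = merged
        next_id += 1
    return rows


rows = single_linkage_matrix([[0], [1], [5], [7]])
for row in rows:
    print(row)
```

With matplotlib and SciPy installed, rows in this layout can be passed to `scipy.cluster.hierarchy.dendrogram` to draw the tree.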
Source code in toyml/clustering/agnes.py
toyml.clustering.agnes.ClusterTree
dataclass
ClusterTree(cluster_index: int, parent: Optional[ClusterTree] = None, children: list[ClusterTree] = list(), sample_indices: list[int] = list(), children_cluster_distance: Optional[float] = None)
Represents a node in the hierarchical clustering tree.
Each node is a cluster containing sample indices. Leaf nodes represent individual samples, while internal nodes represent merged clusters. The root node contains all samples.
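The tree structure described above can be sketched with a small dataclass whose fields mirror the `ClusterTree` signature. The `Node` class and the `merge` helper below are illustrative assumptions, not toyml code.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for a node in the clustering tree."""
    cluster_index: int
    parent: Optional[Node] = None
    children: list[Node] = field(default_factory=list)
    sample_indices: list[int] = field(default_factory=list)
    children_cluster_distance: Optional[float] = None

    def is_leaf(self) -> bool:
        # Leaf nodes hold a single sample and have no children.
        return not self.children


def merge(a: Node, b: Node, index: int, distance: float) -> Node:
    """Create an internal node whose children are a and b."""
    node = Node(
        cluster_index=index,
        children=[a, b],
        sample_indices=a.sample_indices + b.sample_indices,
        children_cluster_distance=distance,
    )
    a.parent = node
    b.parent = node
    return node


# Three leaves, merged bottom-up into a single root.
leaves = [Node(cluster_index=i, sample_indices=[i]) for i in range(3)]
inner = merge(leaves[0], leaves[1], index=3, distance=1.0)
root = merge(inner, leaves[2], index=4, distance=2.0)
print(root.sample_indices)  # [0, 1, 2]
```

Using `field(default_factory=list)` gives each node its own `children` and `sample_indices` lists instead of sharing one mutable default across instances.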
children
class-attribute
instance-attribute
children: list[ClusterTree] = field(default_factory=list)
Children nodes.