Hierarchical Clustering of Univariate (1d) Data.
Introduction
Package hclust1d (Hierarchical CLUSTering for 1D) is a suite of algorithms for univariate agglomerative hierarchical clustering that run in $\mathcal{O}(n\log n)$ time. It supports a comprehensive choice of linkage functions; please consult supported_methods for the current list.
The better algorithmic time complexity (compared to multidimensional hierarchical clustering), paired with an efficient C++ implementation, makes hclust1d very fast. Its computational time beats stats::hclust on all data sizes and is on par with fastcluster::hclust on small data sizes. On larger data sizes, however, it is orders of magnitude faster than both multivariate clustering routines.
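A rough way to check the speed claim on your own machine is to time both routines on the same simulated data. The sketch below assumes that hclust1d accepts a plain numeric vector of points and a method argument (please verify against the package documentation). The sample size and seed are arbitrary, and no particular timings are implied.

```r
library(hclust1d)

set.seed(42)              # arbitrary seed, for reproducibility only
n <- 2000                 # arbitrary sample size; dist() needs O(n^2) memory
x <- rnorm(n)             # simulated univariate data

# stats::hclust works on a full dissimilarity structure
time_stats <- system.time(
  res_stats <- stats::hclust(dist(x), method = "complete")
)

# hclust1d is assumed here to take the vector of 1D points directly
time_1d <- system.time(
  res_1d <- hclust1d(x, method = "complete")
)

# Elapsed wall-clock time in seconds for each routine
print(time_stats["elapsed"])
print(time_1d["elapsed"])
```

Increasing n should make the gap more visible, but keep in mind that dist() alone becomes expensive for large inputs.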
The output of hclust1d is of the same S3 class and format as the output of stats::hclust or fastcluster::hclust, so the resulting clustering can be further investigated with standard calls to print, plot (which plots a dendrogram), etc. In fact, for 1D cases the call to hclust can simply be replaced by a call to hclust1d in a plug-and-play manner, with the surrounding code unchanged. The how-to is covered in detail in our replacing stats::hclust vignette.
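As a minimal sketch of the replacement described above, the snippet below clusters the same points with stats::hclust and with hclust1d and then reuses the same downstream code. It assumes that hclust1d takes the vector of points directly, together with a method argument; the replacing stats::hclust vignette remains the authoritative reference, and the data are made up for illustration.

```r
library(hclust1d)

points <- c(0.1, 0.35, 1.9, 2.0, 2.4, 7.5, 8.1)   # made-up 1D data

# Before: the standard multivariate routine on a distance structure
res_before <- stats::hclust(dist(points), method = "complete")

# After: the univariate routine (argument names are an assumption)
res_after <- hclust1d(points, method = "complete")

# The surrounding code stays the same for both results
print(res_after)
plot(res_after)                      # dendrogram
groups <- cutree(res_after, k = 3)   # cut the tree into three clusters
```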
For information on how to get started using hclust1d, see our getting started vignette.
Installing hclust1d package
To install the development version of the package, please execute
library(devtools)
devtools::install_github("SzymonNowakowski/hclust1d")
Alternatively, to install the current stable CRAN version please execute
install.packages("hclust1d")
After that, you can load the installed package into memory with a call to library(hclust1d).
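Once the package is loaded, a quick smoke test could look like the sketch below. It assumes that supported_methods, mentioned in the introduction, is callable without arguments and that hclust1d accepts a numeric vector with its default linkage; both are assumptions to verify against the package help pages. The input values are made up.

```r
library(hclust1d)

# List the linkage functions the installed version reports as supported
print(supported_methods())

# Cluster a few made-up 1D points and inspect the result
res <- hclust1d(c(1.2, 3.4, 3.5, 9.0, 9.3))
print(res)
```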