Clustering Analysis Using Survival Tree and Forest Algorithms.
SurvivalClusteringTree
Outcome-guided clustering by recursive partitioning methods.
Installation
remotes::install_github("luyouepiusf/SurvivalClusteringTree")
Introduction
Suppose that we are given a dataset of samples with survival outcomes and sample features. The main objective of the method is to identify several clusters of samples with similar survival outcomes and similar features. The package implements a novel statistical machine learning method to perform the task.
Algorithm
Perform bootstrap aggregation. For each bootstrap dataset, build a survival tree by recursively sweeping through all variables to find the best split in the sample space, and repeat until no further splits can be found.
Define a distance between each pair of samples in the original dataset with respect to the survival tree. The distances between samples in the same terminal node are 0. For samples that are not in the same terminal node, there is a unique path on the tree connecting the two terminal nodes, the sum of the absolute log-rank test z-scores on the path is the distance between them.
Aggregate the distances by averaging pairwise distances as the final distance.
Apply hierarchical clustering analysis to the distance matrix to identify clusters.