Subgroup classification is a basic task in genomic data analysis. The cola package provides a general framework for subgroup classification by consensus partitioning.
Zuguang Gu, et al., cola: an R/Bioconductor package for consensus partitioning through a general framework, Nucleic Acids Research, 2021. https://doi.org/10.1093/nar/gkaa1146
Zuguang Gu, et al., Improve consensus partitioning via a hierarchical procedure, Briefings in Bioinformatics, 2022. https://doi.org/10.1093/bib/bbac048
cola is available on Bioconductor. You can install it with:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("cola")
The latest version can be installed directly from GitHub:
library(devtools)
install_github("jokergoo/cola")
cola supports two types of consensus partitioning: standard consensus partitioning and hierarchical consensus partitioning.
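The steps of consensus partitioning include scoring the rows of the matrix with "the top-value method", for example the variance (by var() or sd()). Note that the choice of "the top-value method" is not fixed; the top_value_method argument in the call below lists "SD" and "MAD" among its options.
Three lines of code to perform cola analysis: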
mat = adjust_matrix(mat) # optional
rl = run_all_consensus_partition_methods(
    mat,
    top_value_method = c("SD", "MAD", ...),
    partition_method = c("hclust", "kmeans", ...),
    cores = ...)
cola_report(rl, output_dir = ...)
The following plots compare consensus heatmaps with k = 4 under all combinations of methods.
Three lines of code to perform hierarchical consensus partitioning analysis:
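To make the idea behind a top-value method and a consensus matrix more concrete, here is a minimal sketch in base R. It does not call cola and is not how cola is implemented. The toy matrix, the 80% row sampling fraction, the number of repetitions, and every variable name are assumptions chosen for illustration only.

```r
# Illustrative sketch in base R only; it does not call cola.
set.seed(1)

# Hypothetical toy matrix: 200 rows (features) x 20 columns (samples),
# where samples 11-20 are shifted in the first 50 rows.
mat <- matrix(rnorm(200 * 20), nrow = 200)
mat[1:50, 11:20] <- mat[1:50, 11:20] + 3

# "Top-value method": score every row, here by its standard deviation,
# and keep the 50 rows with the highest scores.
top_value <- apply(mat, 1, sd)
top_rows <- order(top_value, decreasing = TRUE)[1:50]
sub <- mat[top_rows, ]

# Repeatedly sample 80% of the selected rows, partition the samples with
# k-means, and count how often each pair of samples shares a cluster.
k <- 2
n_rep <- 25
n_sample <- ncol(sub)
consensus <- matrix(0, n_sample, n_sample)
for (i in seq_len(n_rep)) {
  idx <- sample(nrow(sub), size = round(0.8 * nrow(sub)))
  cl <- kmeans(t(sub[idx, ]), centers = k, nstart = 5)$cluster
  consensus <- consensus + outer(cl, cl, "==")
}

# Each entry is the fraction of runs in which two samples were co-clustered.
consensus <- consensus / n_rep
```

Entries of consensus close to 0 or 1 indicate pairs of samples that are assigned consistently across runs, which is the kind of pattern a consensus heatmap visualizes.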
mat = adjust_matrix(mat) # optional
rh = hierarchical_partition(mat, mc.cores = ...)
cola_report(rh, output_dir = ...)
The following figure shows the hierarchy of the subgroups.
The following figure shows the signature genes.
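The general idea of building a hierarchy of subgroups can be sketched in base R as a recursive two-way split. This is a simplified stand-in, not the procedure used by hierarchical_partition(): it swaps in plain k-means at each node and stops on an arbitrary variance-ratio threshold. The function split_node, its arguments min_size and min_ratio, and the toy data are all hypothetical.

```r
# Illustrative sketch in base R only; it does not call cola.
set.seed(1)

# Hypothetical toy data: 100 features x 30 samples in three groups of 10,
# where groups 2 and 3 are closer to each other than to group 1.
mat <- matrix(rnorm(100 * 30), nrow = 100)
mat[1:40, 11:30] <- mat[1:40, 11:30] + 4    # group 1 vs groups 2 and 3
mat[41:60, 21:30] <- mat[41:60, 21:30] + 4  # group 2 vs group 3
colnames(mat) <- paste0("s", 1:30)

# Recursively split the samples in two. A node becomes a leaf when it is
# small or when the split explains little of the total variance.
# Both thresholds are arbitrary assumptions for this sketch.
split_node <- function(m, label = "0", min_size = 6, min_ratio = 0.3) {
  leaf <- setNames(rep(label, ncol(m)), colnames(m))
  if (ncol(m) < min_size) return(leaf)
  km <- kmeans(t(m), centers = 2, nstart = 10)
  if (km$betweenss / km$totss < min_ratio) return(leaf)
  c(
    split_node(m[, km$cluster == 1, drop = FALSE], paste0(label, "1"),
               min_size, min_ratio),
    split_node(m[, km$cluster == 2, drop = FALSE], paste0(label, "2"),
               min_size, min_ratio)
  )
}

# Node labels encode the path from the root, e.g. "02" is a child of "0".
labels <- split_node(mat)
table(labels[colnames(mat)])
```

The label strings record where each sample sits in the hierarchy, so samples that were separated late share a longer label prefix than samples separated at the root.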
MIT @ Zuguang Gu