SGI: automatic clinical subgroup identification in omics datasets. Academic Article

Overview

abstract

  • SUMMARY: The 'Subgroup Identification' (SGI) toolbox provides an algorithm to automatically detect clinical subgroups of samples in large-scale omics datasets. It is based on hierarchical clustering trees in combination with a specifically designed association testing and visualization framework that can process an arbitrary number of clinical parameters and outcomes in a systematic fashion. A multi-block extension allows for the simultaneous use of multiple omics datasets on the same samples. In this article, we first describe the functionality of the toolbox and then demonstrate its capabilities through application examples on a type 2 diabetes metabolomics study as well as two copy number variation datasets from The Cancer Genome Atlas.
    AVAILABILITY AND IMPLEMENTATION: SGI is an open-source package implemented in R. The package source code and hands-on tutorials are available at https://github.com/krumsieklab/sgi. The QMdiab metabolomics data are included in the package and can be downloaded from https://doi.org/10.6084/m9.figshare.5904022.
    SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
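
The core idea in the summary, testing clinical associations along a hierarchical clustering tree, can be illustrated with a minimal sketch. This is not the SGI package or its API (SGI is written in R); it is a Python approximation under stated assumptions: the function name `subgroup_scan`, the `min_size` threshold, the use of Welch's t-test for a single continuous outcome, and the Bonferroni correction are all illustrative choices, and the data are synthetic.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, to_tree
from scipy.stats import ttest_ind


def subgroup_scan(X, outcome, min_size=5, method="ward"):
    """Hypothetical helper: at every branch point of a hierarchical
    clustering tree, test whether the two child clusters differ in a
    continuous clinical outcome (Welch's t-test, an assumed choice)."""
    root = to_tree(linkage(X, method=method))
    results = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.is_leaf():
            continue
        left, right = node.get_left(), node.get_right()
        stack.extend([left, right])
        # pre_order() returns the sample indices (leaf ids) under a node
        li, ri = left.pre_order(), right.pre_order()
        if len(li) < min_size or len(ri) < min_size:
            continue  # skip splits where either child cluster is too small
        _, p = ttest_ind(outcome[li], outcome[ri], equal_var=False)
        results.append({"node": node.get_id(), "n_left": len(li),
                        "n_right": len(ri), "p": float(p)})
    # Bonferroni adjustment over all tested branch points (assumed choice)
    m = len(results)
    for r in results:
        r["p_adj"] = min(1.0, r["p"] * m)
    return sorted(results, key=lambda r: r["p"])


# Synthetic example: two sample groups that differ in both the
# omics-like features and the outcome.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 10)), rng.normal(4, 1, (30, 10))])
outcome = np.concatenate([rng.normal(0, 1, 30), rng.normal(3, 1, 30)])
hits = subgroup_scan(X, outcome)
print(hits[0])
```

Each returned entry describes one branch point whose two child clusters were compared; handling several clinical parameters would amount to repeating the test per parameter with a test suited to its type, which this sketch leaves out.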

publication date

  • January 3, 2022

Research

keywords

  • DNA Copy Number Variations
  • Diabetes Mellitus, Type 2

Identity

PubMed Central ID

  • PMC8723155

Scopus Document Identifier

  • 85126325958

Digital Object Identifier (DOI)

  • 10.1093/bioinformatics/btab656

PubMed ID

  • 34529048

Additional Document Info

volume

  • 38

issue

  • 2