Biology

Geometric averaging provides normalization-invariant feature ranking in compositional sequencing data

AI Insight

This study demonstrates that the arithmetic mean, the default method for summarizing feature abundances in compositional sequencing data such as microbiome and RNA-seq analyses, produces unstable feature rankings that can reverse group-level biological conclusions. Using the dietswap dataset, the authors show that 23 of 102 prevalent genera (22.5%) yielded opposite group-level conclusions depending on whether the arithmetic or the geometric mean was used. Rankings based on the arithmetic mean also change with the within-sample normalization, whereas rankings based on the geometric mean or on the centered log-ratio (CLR) transformation are invariant to it. This property was verified numerically: the Spearman correlation between geometric-mean and CLR-based rankings was 1.000 in both groups.


The findings identify a concrete and previously underappreciated source of irreproducibility in biomarker discovery across microbiome research, transcriptomics, metagenomics, and metabolomics. Adopting geometric mean-based summaries and CLR-transformed abundances requires no new software and could meaningfully improve consistency across studies in multiple high-impact biomedical fields.


⚠️ Preprint: not yet peer-reviewed

This article has not yet been reviewed by independent experts. The results are preliminary and should be interpreted with caution.

In compositional next-generation sequencing (NGS) analyses, including microbiome studies, RNA-seq and metagenomics, the arithmetic mean (AM) of relative proportions is the default operator for summarizing feature abundances. We show that this default produces unstable rankings in real compositional data. Across 102 prevalent genera in the dietswap dataset (n = 38 baseline samples), 23 genera (22.5%), including members of Bacteroides, Eubacterium and Bilophila, yielded opposite group-level conclusions under AM and the geometric mean (GM). This pattern reflects two formal properties of compositional aggregation. First, AM-based rankings change with the within-sample normalization domain, whereas GM-based rankings are invariant under the multiplicative structure of compositional data. Second, the centered log-ratio (CLR) transformation absorbs geometric averaging into the data representation, so that arithmetic averaging in CLR space recovers the GM ranking exactly. Both properties were verified numerically on the dietswap dataset, where the Spearman correlation between GM- and CLR-based rankings was 1.000 in both groups. The operator-choice problem propagates to between-group differential inference: under AM, log2 fold-changes vary across normalizations and the relative ranking of features by effect size is not preserved; under GM and CLR, the ranking is preserved. We recommend GM-based summaries for feature ranking and CLR-transformed abundances for cross-sample comparisons. This change requires no new computational tools and is fully compatible with existing differential-abundance pipelines, yet it eliminates an under-recognized source of irreproducibility in biomarker discovery across microbiome studies, transcriptomics, metagenomics, and mass-spectrometry-based metabolomics, that is, in all settings where features are quantified relative to a sample total.
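The two properties described in the abstract can be illustrated with a minimal sketch. It is not the authors' code and does not use the dietswap data: the 2 x 3 count matrix is a hypothetical toy example, and the helper names `normalize`, `clr` and `rank_features` are invented for illustration. The sketch assumes strictly positive values, since logarithms are undefined at zero; real count tables would need some form of zero handling first. A within-sample normalization multiplies each sample by its own positive constant. That rescales every feature's geometric mean by the same factor, so the GM ranking cannot change, while the arithmetic mean weights samples differently and its ranking can.

```python
import numpy as np

# Hypothetical toy counts: 2 samples (rows) x 3 features (columns).
counts = np.array([[1.0, 2.0, 97.0],
                   [50.0, 40.0, 10.0]])

def normalize(x, domain):
    """Divide each sample by its total over the feature indices in `domain`."""
    return x / x[:, domain].sum(axis=1, keepdims=True)

def clr(x):
    """Centered log-ratio: log values minus each sample's mean log value."""
    logx = np.log(x)
    return logx - logx.mean(axis=1, keepdims=True)

def rank_features(scores):
    """Feature indices ordered from highest to lowest score."""
    return np.argsort(-scores).tolist()

# Two normalization domains: all features vs. only features 0 and 1.
norm_all = normalize(counts, [0, 1, 2])
norm_sub = normalize(counts, [0, 1])

# Arithmetic mean across samples.
am_all = rank_features(norm_all.mean(axis=0))
am_sub = rank_features(norm_sub.mean(axis=0))

# Geometric mean across samples (exp of the mean log).
gm_all = rank_features(np.exp(np.log(norm_all).mean(axis=0)))
gm_sub = rank_features(np.exp(np.log(norm_sub).mean(axis=0)))

# Arithmetic mean of CLR-transformed values.
clr_all = rank_features(clr(norm_all).mean(axis=0))
clr_sub = rank_features(clr(norm_sub).mean(axis=0))

print("AM :", am_all, am_sub)    # AM : [2, 0, 1] [2, 1, 0]
print("GM :", gm_all, gm_sub)    # GM : [2, 1, 0] [2, 1, 0]
print("CLR:", clr_all, clr_sub)  # CLR: [2, 1, 0] [2, 1, 0]
```

In this toy case, features 0 and 1 swap places under the arithmetic mean when the normalization domain changes, while the GM and CLR rankings stay identical. The CLR values themselves do not depend on the normalization at all, because the per-sample constant cancels when the sample's mean log is subtracted.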

Source: Geometric averaging provides normalization-invariant feature ranking in compositional sequencing data