
A Characterization of Level-k Realizability for Clustering Systems

AI Insight

This paper establishes a complete mathematical characterization of when a clustering system on a finite set of taxa can be realized as the hardwired clustering system of a rooted level-k phylogenetic network. The authors introduce a parameter mu(B) for each non-trivial block of the Hasse diagram associated with the clustering system, and prove that a level-k network realization exists if and only if mu(B) does not exceed k for every such block. The sufficiency direction is demonstrated through a constructive algorithm that iteratively modifies the Hasse diagram by splitting hybrid vertices while preserving the clustering system, providing both an existence proof and an implicit network construction method.
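
To make the objects in this summary concrete, here is a minimal Python sketch that builds the Hasse diagram of a small clustering system, lists its overlapping cluster pairs, and finds the hybrid vertices (in-degree at least 2) of the diagram. The example set X = {a, b, c}, the clusters, and all function names are hypothetical illustrations, not taken from the paper. The sketch does not compute mu(B), whose exact definition is only outlined here.

```python
from itertools import combinations


def hasse_diagram(clusters):
    """Return (vertices, edges) of the inclusion order on the clusters.

    An edge (a, b) means b is a proper subset of a and no cluster
    lies strictly between them (a covers b).
    """
    cs = {frozenset(c) for c in clusters}
    edges = set()
    for a in cs:
        for b in cs:
            if b < a and not any(b < c < a for c in cs):
                edges.add((a, b))
    return cs, edges


def overlaps(a, b):
    """Two clusters overlap if they intersect and neither contains the other."""
    return bool(a & b) and not a <= b and not b <= a


def hybrid_vertices(edges):
    """Vertices of the diagram with at least two incoming edges."""
    indeg = {}
    for _, child in edges:
        indeg[child] = indeg.get(child, 0) + 1
    return {v for v, d in indeg.items() if d >= 2}


# Hypothetical clustering system on X = {a, b, c}.
X = "abc"
example = [set(X), {"a", "b"}, {"b", "c"}, {"a"}, {"b"}, {"c"}]

vertices, edges = hasse_diagram(example)
overlap_pairs = [(a, b) for a, b in combinations(vertices, 2) if overlaps(a, b)]
hybrids = hybrid_vertices(edges)
```

In this example the only overlapping pair is {a, b} and {b, c}, and their intersection {b} is the single hybrid vertex of the diagram.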


Phylogenetic networks are increasingly important in evolutionary biology for representing reticulate events such as hybridization and horizontal gene transfer, and this result gives researchers a precise combinatorial tool to assess whether observed biological groupings are compatible with a network of bounded complexity. The characterization could inform the design of more efficient algorithms for network inference from empirical clustering data.


arXiv:2605.21945v1
Abstract: We give a Hasse-diagram characterization of when a clustering system $\mathcal C$ on a finite taxa set $X$ is the hardwired clustering system $C_N$ of a rooted level-$k$ network. For each non-trivial block $B$ of $H=\mathcal H[\mathcal C]$, we define a parameter $\mu(B)$ using minimum families of clusters that generate all overlap-intersections inside $B$. The main theorem proves that there exists a rooted level-$k$ network $N$ with $C_N=\mathcal C$ if and only if $\mu(B)\le k$ for every non-trivial block $B$ of $H$. The necessity proof shows that overlap-intersection pieces must be represented by non-root hybrid vertices in any realizing block. The sufficiency proof is constructive: starting from the Hasse diagram, it iteratively splits selected hybrid vertices, preserves the hardwired clustering system, and terminates with a realization whose level is bounded by the block-wise values of $\mu$.
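
As an illustration of the two quantities the theorem relates, the following Python sketch computes the hardwired clustering system $C_N$ of a small rooted network and its level. It assumes the commonly used definitions: $C_N$ collects, for each vertex, the set of leaves reachable from it, and the level is the largest number of hybrid vertices in any block, not counting the block's top vertex. The paper's exact definitions may differ. The network `N`, its vertex names, and all function names are hypothetical, and the sketch implements neither $\mu(B)$ nor the splitting procedure.

```python
def hardwired_clusters(children):
    """C_N: for every vertex v, the set of leaves reachable from v."""
    verts = set(children) | {c for cs in children.values() for c in cs}
    memo = {}

    def below(v):
        if v not in memo:
            kids = children.get(v, [])
            if not kids:  # a leaf is its own cluster
                memo[v] = frozenset([v])
            else:
                memo[v] = frozenset().union(*(below(k) for k in kids))
        return memo[v]

    return {below(v) for v in verts}


def blocks(children):
    """Vertex sets of the biconnected components of the underlying
    undirected graph (recursive DFS with an edge stack; assumes no
    parallel arcs)."""
    adj = {}
    for p, cs in children.items():
        for c in cs:
            adj.setdefault(p, set()).add(c)
            adj.setdefault(c, set()).add(p)
    disc, low, stack, out = {}, {}, [], []

    def dfs(v, parent):
        disc[v] = low[v] = len(disc)
        for w in adj[v]:
            if w == parent:
                continue
            if w not in disc:
                stack.append((v, w))
                dfs(w, v)
                low[v] = min(low[v], low[w])
                if low[w] >= disc[v]:  # v separates the subtree of w
                    comp = set()
                    while True:
                        e = stack.pop()
                        comp.update(e)
                        if e == (v, w):
                            break
                    out.append(comp)
            elif disc[w] < disc[v]:  # back edge to an ancestor
                stack.append((v, w))
                low[v] = min(low[v], disc[w])

    for v in adj:
        if v not in disc:
            dfs(v, None)
    return out


def level(children):
    """Max over blocks of the number of hybrids that have a parent
    inside the block (this excludes the block's top vertex)."""
    parents = {}
    for p, cs in children.items():
        for c in cs:
            parents.setdefault(c, set()).add(p)
    hybrids = {v for v, ps in parents.items() if len(ps) >= 2}
    return max(
        (sum(1 for v in b & hybrids if parents[v] & b) for b in blocks(children)),
        default=0,
    )


# Hypothetical network: root r, tree vertices u and w, hybrid h, leaves a, b, c.
N = {"r": ["u", "w"], "u": ["a", "h"], "w": ["h", "c"], "h": ["b"]}
target = {frozenset(c) for c in ["abc", "ab", "bc", "a", "b", "c"]}

realizes = hardwired_clusters(N) == target
net_level = level(N)
```

Here `N` has a single non-trivial block, the cycle through r, u, h and w, which contains one hybrid vertex below its top, so `N` is a level-1 network whose hardwired clusters coincide with `target`.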

Source: A Characterization of Level-k Realizability for Clustering Systems