Table 3 Comparison of widely-used tasks for modular analysis of networks using the introduced synthetic and real datasets

From: BicNET: Flexible module discovery in large-scale biological networks using biclustering

| Approach | Method | Solution aspects and concerns | Efficiency |
| --- | --- | --- | --- |
| Clustering (exhaustive and non-overlapping node coverage) | k-Means | Majority of clusters show loose connectedness; high variation in module size (1 to 3 clusters cover almost all nodes, and the remaining clusters are statistically non-significant [66]) | Efficiency problems for networks with >100,000 interactions |
| Clustering (exhaustive and non-overlapping node coverage) | Spectral | Able to isolate modules where the degree of connectedness is approximately constant per module; only a small subset of clusters is relevant (medium-to-high degree of connectedness) | Medusa implementation only scales for networks with <10,000 interactions |
| Clustering (exhaustive and non-overlapping node coverage) | Affinity propagation | Clusters collected from (small samples of) the target biological networks show a generalized lack of biological relevance | Time and memory bottlenecks for small networks (<1000 interactions) |
| Clustering (non-exhaustive and possibly overlapping node coverage) | CPMw (weighted k-clique percolation) | Intolerance to noise; intractably large solutions (explosion of similar clusters) under the strict coherency criterion (k-clique); dependence on parameters (e.g. k, intensity level) | Only scales for networks with <5000 nodes (5–10 % density). Bottlenecks for the target biological data even when removing >95 % of interactions |
| Biclustering (bi-sets of nodes) | Hypercliques (unweighted) | Intolerant to missing interactions; large number of highly similar modules; dense coherency only | BicNET implementation efficient for large networks (>10,000 nodes) with density up to 25 % |
| Biclustering (bi-sets of nodes) | Hypercliques (differential) | Intolerant to noise and prone to the item-boundaries problem during the selection of differential weights; dense coherency only | BicNET implementation scales for large dense networks |
| Biclustering (bi-sets of nodes) | BicNET (dense assumption) | Focus on dissimilar modules robust to noise and missing interactions, with possibly distinct forms of coherency strength (\(\lvert L\rvert \in \{1,2,3,5\}\)) | Efficiency bounded by the search for unweighted hypercliques (\(\lvert L\rvert = 1\)) |
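
The "intolerant to missing interactions" concern listed for unweighted hypercliques can be illustrated with a toy example. The sketch below is not taken from the BicNET implementation. The helper names (`is_biclique`, `density`), the 3-by-3 toy module, and the use of a density score as a stand-in for noise tolerance are all assumptions made for illustration. It shows that a strict all-pairs-connected criterion rejects a module as soon as one interaction is missing, even though the module remains almost fully connected.

```python
from itertools import product


def is_biclique(edges, rows, cols):
    """True when every (row, col) pair is an observed interaction."""
    return all((r, c) in edges for r, c in product(rows, cols))


def density(edges, rows, cols):
    """Fraction of (row, col) pairs that are observed interactions."""
    pairs = list(product(rows, cols))
    return sum(p in edges for p in pairs) / len(pairs)


# Hypothetical toy module: two node sets of size 3.
rows = ["a", "b", "c"]
cols = ["x", "y", "z"]

# All 9 interactions present.
full = set(product(rows, cols))
# The same module with a single interaction missing.
noisy = full - {("b", "y")}

print(is_biclique(full, rows, cols))          # True
print(is_biclique(noisy, rows, cols))         # False
print(round(density(noisy, rows, cols), 2))   # 0.89
```

Under the strict criterion the noisy module is discarded outright, while a tolerance-based score still rates it at 8 of 9 interactions. This is the kind of gap that noise-robust approaches aim to close.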