Table 4 Feature importances of the random forest trained on the largest dataset (\(M=1000\) and \(\max L=100\)), based on normal (a) and LGT (b) network data

From: Constructing phylogenetic networks via cherry picking and machine learning

| Features | Importance, (a) Normal | Importance, (b) LGT |
|---|---|---|
| Leaf distance (t) | 0.190 | 0.162 |
| Trivial | 0.155 | 0.184 |
| Cherry in tree | 0.143 | 0.146 |
| Leaf distance (d) | 0.122 | 0.114 |
| LCA distance (t) | 0.068 | 0.056 |
| Depth x/y (t) | 0.050 | 0.058 |
| Cherry depth (t) | 0.047 | 0.045 |
| Depth x/y (d) | 0.043 | 0.038 |
| LCA distance (d) | 0.028 | 0.032 |
| Leaf depth x (t) | 0.023 | 0.024 |
| Leaf depth y (t) | 0.023 | 0.023 |
| Cherry depth (d) | 0.020 | 0.023 |
| Leaf depth x (d) | 0.020 | 0.022 |
| Leaf depth y (d) | 0.020 | 0.022 |
| Before/after | 0.015 | 0.016 |
| Tree depth (d) | 0.012 | 0.013 |
| Tree depth (t) | 0.011 | 0.011 |
| New cherries | 0.006 | 0.006 |
| Leaves in tree | 0.004 | 0.003 |

Higher importance indicates that a feature has a larger effect on the trained model. The values in each column sum to one, up to rounding. Descriptions of the features are given in Table 1.
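
The note above can be checked directly against the numbers in Table 4. The following is a minimal sketch in plain Python, not code from the paper: the dictionary `IMPORTANCES` and the helpers `column_sum`, `ranking`, and `top_share` are hypothetical names introduced here for illustration, and the values are copied by hand from the table.

```python
# Importances copied from Table 4: feature -> (normal, LGT).
# All names in this sketch are illustrative, not taken from the paper.
IMPORTANCES = {
    "Leaf distance (t)": (0.190, 0.162),
    "Trivial": (0.155, 0.184),
    "Cherry in tree": (0.143, 0.146),
    "Leaf distance (d)": (0.122, 0.114),
    "LCA distance (t)": (0.068, 0.056),
    "Depth x/y (t)": (0.050, 0.058),
    "Cherry depth (t)": (0.047, 0.045),
    "Depth x/y (d)": (0.043, 0.038),
    "LCA distance (d)": (0.028, 0.032),
    "Leaf depth x (t)": (0.023, 0.024),
    "Leaf depth y (t)": (0.023, 0.023),
    "Cherry depth (d)": (0.020, 0.023),
    "Leaf depth x (d)": (0.020, 0.022),
    "Leaf depth y (d)": (0.020, 0.022),
    "Before/after": (0.015, 0.016),
    "Tree depth (d)": (0.012, 0.013),
    "Tree depth (t)": (0.011, 0.011),
    "New cherries": (0.006, 0.006),
    "Leaves in tree": (0.004, 0.003),
}


def column_sum(col):
    """Sum of the importances in one column (0 = normal, 1 = LGT)."""
    return sum(values[col] for values in IMPORTANCES.values())


def ranking(col):
    """Feature names ordered from most to least important in one column."""
    return sorted(IMPORTANCES, key=lambda name: IMPORTANCES[name][col], reverse=True)


def top_share(col, k):
    """Fraction of the column total carried by its k largest importances."""
    values = sorted((v[col] for v in IMPORTANCES.values()), reverse=True)
    return sum(values[:k]) / sum(values)


if __name__ == "__main__":
    for col, label in enumerate(("normal", "LGT")):
        print(label, "sum:", round(column_sum(col), 3))
        print(label, "top 4:", ranking(col)[:4])
        print(label, "share of top 4:", round(top_share(col, 4), 3))
```

With the rounded values printed in the table, column (a) adds up to 1.000 and column (b) to 0.998, so the "sum to one" statement holds up to rounding. The same four features lead both columns (in a different order), and together they carry roughly 61% of the total importance in each case.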