type of training | algorithm | time | memory | reference |
---|---|---|---|---|
training one parameter at a time | | | | |
Viterbi | Viterbi | O(T_max LM) | O(ML) | [17] |
Viterbi | Lam-Meyer | O(T_max LM) | O(M) | this paper |
Baum-Welch | Baum-Welch | O(T_max LM) | O(ML) | [13] |
Baum-Welch | checkpointing | O(T_max LM log(L)) | O(M log(L)) | [34] |
Baum-Welch | linear-memory | O(T_max LM) | O(M) | [29] |
stochastic EM | forward & back-tracing | O(T_max L(M + K)) | O(ML) | [32] |
stochastic EM | Lam-Meyer | O(T_max LMK) | O(MK + T_max) | this paper |
training P of Q parameters at the same time, with P ∈ {1, ..., Q} and Q/P ∈ ℕ | | | | |
Viterbi | Viterbi | O(T_max LMQ/P) | O(ML) | [17] |
Viterbi | Lam-Meyer | O(T_max LMQ/P) | O(MP) | this paper |
Baum-Welch | Baum-Welch | O(T_max LMQ/P) | O(ML + P) | [13] |
Baum-Welch | checkpointing | O(T_max LMQ log(L/P)) | O(M log(L)) | [34] |
Baum-Welch | linear-memory | O(T_max LMQ/P) | O(M) | [29] |
stochastic EM | forward & back-tracing | O(T_max L(M + K)Q/P) | O(ML) | [32] |
stochastic EM | Lam-Meyer | O(T_max LMKQ/P) | O(MKP + T_max) | this paper |
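
To get a feel for how the entries in the table scale, the dominant terms for training one parameter at a time can be evaluated directly. The following is a minimal sketch, not part of the original work. It assumes that M is the number of states, L the sequence length, T_max the maximum number of transitions per state and K the number of sampled state paths; these readings, the function names and the example values are all assumptions made for illustration. Constants hidden by the asymptotic notation are ignored, so the numbers indicate scaling only.

```python
# Illustrative only: evaluates the dominant terms listed in the table for
# training one parameter at a time. Function names and parameter values are
# hypothetical; constant factors hidden by the asymptotic notation are ignored.


def viterbi_costs(t_max, L, M):
    """Return {algorithm: (time_term, memory_term)} for Viterbi training."""
    return {
        "Viterbi": (t_max * L * M, M * L),
        "Lam-Meyer": (t_max * L * M, M),
    }


def stochastic_em_costs(t_max, L, M, K):
    """Return {algorithm: (time_term, memory_term)} for stochastic EM training."""
    return {
        "forward & back-tracing": (t_max * L * (M + K), M * L),
        "Lam-Meyer": (t_max * L * M * K, M * K + t_max),
    }


if __name__ == "__main__":
    # Hypothetical model: 10 states, at most 4 transitions per state,
    # a sequence of length 10,000 and 20 sampled state paths.
    t_max, L, M, K = 4, 10_000, 10, 20

    for training, costs in (
        ("Viterbi", viterbi_costs(t_max, L, M)),
        ("stochastic EM", stochastic_em_costs(t_max, L, M, K)),
    ):
        for algorithm, (time_term, memory_term) in costs.items():
            print(f"{training:14s} {algorithm:24s} "
                  f"time ~ {time_term:>9,d}  memory ~ {memory_term:>7,d}")
```

Under these assumed values, the memory terms of the Lam-Meyer rows do not grow with L, whereas the O(ML) entries do; for stochastic EM the table shows that this comes with a time term of T_max LMK instead of T_max L(M + K).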