GRISOTTO: A greedy approach to improve combinatorial algorithms for motif discovery with prior knowledge

Table 1. Definition of terms used in describing the algorithms presented in Methods.

Symbol	Meaning
Σ	alphabet (usually DNA or IUPAC)
f	input sequences
f_i	i-th input sequence
f_ij	j-th position of the i-th input sequence
N	number of input sequences
n_i	length of f_i
k	motif size
S_p	p-th prior (in PSP format)
ℓ	number of priors (it can be zero)
S	S = 〈S_1, ..., S_ℓ〉 is the list of all priors
z_min	minimum number of motifs expected to be returned by a RISOTTO run
z_max	maximum number of motifs expected to be returned by a RISOTTO run
z	number of top motifs post-processed from RISOTTO output
	the set with the z top motifs to be post-processed by GRISOTTO
m	motif of size k
m〈i, α〉	motif m where the i-th position (starting with 0) is replaced by α ∈ Σ
ε	empty motif (its BIS score is the minimum possible value)
f_i[j ... j + k - 1]	k-long segment of the i-th input sequence that starts at position j
S_p[i, j]	prior probability at the j-th position of f_i
j_i	annotated position for f_i with maximum BIS score for a motif m
P_m	probability distribution given by the PSSM induced by m
α_p	the weight of the p-th prior
λ	coefficient to balance priors and over-representation contribution
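
As an illustration of two entries in Table 1, the substitution m〈i, α〉 and the segment f_i[j ... j + k - 1] can be sketched in Python as below. This is a minimal sketch, not the authors' implementation: the function names `substitute` and `segment` are hypothetical, motifs and sequences are assumed to be plain strings over Σ, and the segment start j is assumed to be 0-based (the table states 0-based indexing only for motif positions).

```python
def substitute(m: str, i: int, alpha: str) -> str:
    """Return m<i, alpha>: motif m with its i-th position (0-based) replaced by alpha."""
    if len(alpha) != 1:
        raise ValueError("alpha must be a single symbol of the alphabet")
    if not 0 <= i < len(m):
        raise IndexError("position i is outside the motif")
    return m[:i] + alpha + m[i + 1:]


def segment(f_i: str, j: int, k: int) -> str:
    """Return f_i[j ... j + k - 1], the k-long segment of f_i starting at j.

    Assumption: j is 0-based here; the table does not fix the convention
    for sequence positions.
    """
    if j < 0 or j + k > len(f_i):
        raise IndexError("segment does not fit inside the sequence")
    return f_i[j:j + k]
```

For example, under these assumptions `substitute("ACGT", 0, "T")` yields `"TCGT"`, and `segment("ACGTAC", 2, 3)` yields the 3-long segment starting at position 2.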

ISSN: 1748-7188