Figure 1 | Algorithms for Molecular Biology

Figure 1

From: A strand specific high resolution normalization method for chip-sequencing data employing multiple experimental control measurements

Strategy overview, synthetic and experimental results. (A) Results of the proposed normalization strategy on synthetic data. The top three panels show, from top to bottom: i) the synthetic read signal (sense in blue, antisense in red and combined in black (XSET)); ii) the individual fragment starts and ends (5' and 3'); and iii) the sampled, smoothed fragment start signal used in the normalization step. The next three panels show a synthetic control signal, and the two bottom panels show the per-bp signal re-created after normalization. (B) Brief overview of the proposed normalization strategy. For each observation (Oi) in the genome we calculate a moving-window average of the number of start sites in the window centred on i, e.g. from i-5 to i+5. The observations, which correspond to the window centres, are taken at intervals that can be shorter than the window size, generating 'overlapping' measurements, or greater, yielding 'side-by-side' windows. In the proposed normalization procedure, the read counts are represented by the resulting value at each window centre Oi. A linear regression modelling the AB (antibody) signal against the IP (input) and IGG (IgG) signals is fitted, and the residuals are stored. These are finally used to rebuild a per-bp signal that can be reported in the BED file format. (C) Number of MAX peaks detected by the three peak finders SISSRs, FindPeaks and MACS using statistical control. The numbers represent peaks unique to the displayed fraction; the sizes of the areas reflect the sizes of the sets.
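
The steps in panel (B) can be illustrated with a minimal Python sketch using NumPy. It is not the authors' implementation: the function names, the `half_width` and `step` parameters, and the use of ordinary least squares with an intercept are assumptions made for illustration. In particular, the caption does not say how the residuals are turned back into a per-bp signal, so the linear interpolation in `rebuild_per_bp` is only one plausible choice.

```python
import numpy as np

def windowed_start_signal(starts, genome_length, half_width=5, step=5):
    """Mean number of fragment starts per bp in windows centred on sampled positions.

    starts: 0-based start positions, all assumed to lie in [0, genome_length).
    """
    counts = np.bincount(np.asarray(starts, dtype=int), minlength=genome_length)
    # Window centres O_i; step < window size gives overlapping windows.
    centres = np.arange(half_width, genome_length - half_width, step)
    # A prefix sum lets every window total be read off in constant time.
    csum = np.concatenate(([0.0], np.cumsum(counts, dtype=float)))
    window = 2 * half_width + 1
    sums = csum[centres + half_width + 1] - csum[centres - half_width]
    return centres, sums / window

def regression_residuals(ab, ip, igg):
    """Residuals of the least-squares fit ab ~ b0 + b1*ip + b2*igg (assumed model form)."""
    ab = np.asarray(ab, dtype=float)
    ip = np.asarray(ip, dtype=float)
    igg = np.asarray(igg, dtype=float)
    design = np.column_stack((np.ones_like(ip), ip, igg))
    coef, *_ = np.linalg.lstsq(design, ab, rcond=None)
    return ab - design @ coef

def rebuild_per_bp(centres, residuals, genome_length):
    """Per-bp signal by linear interpolation between window centres (an assumption)."""
    return np.interp(np.arange(genome_length), centres, residuals)
```

With `half_width=5` each window spans i-5 to i+5, as in the caption's example; setting `step` below 11 produces overlapping windows, while `step=11` places the windows side by side. The three signals (AB, IP, IGG) would each be passed through `windowed_start_signal` with identical settings, per strand, before calling `regression_residuals`.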
