Fig 1. Overview of SeqUnwinder, which takes an input list of annotated genomic sites and identifies label-specific discriminative motifs. (A) Schematic showing a typical input instance for SeqUnwinder: a list of genomic coordinates and corresponding annotation labels. (B) The underlying classification framework implemented in SeqUnwinder. Subclasses (combinations of annotation labels) are treated as different classes in a multi-class classification framework. The label-specific properties are implicitly modeled using L1-regularization. (C) Weighted k-mer models are used to identify 10–15 bp focus regions called hills. MEME is used to identify motifs at hills. (D) Motifs identified de novo in (C) are scored using the weighted k-mer model to obtain label-specific scores.
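The hill-finding step in panel (C) can be illustrated with a minimal Python sketch. It is not SeqUnwinder's implementation: the caption does not specify how hills are scored or selected, so the sketch assumes that a window's score is the sum of the weights of the k-mers it fully contains, that windows scoring above a threshold are candidates, and that candidates are chosen greedily without overlap. The function names (`kmer_scores`, `find_hills`), the threshold, the choice of k = 4, and the toy weights are all hypothetical; in practice the weights would come from the trained classifier in (B).

```python
from typing import Dict, List, Tuple

Hill = Tuple[int, int, float]  # (start, end, score); end is exclusive


def kmer_scores(seq: str, weights: Dict[str, float], k: int) -> List[float]:
    """Weight of the k-mer starting at each position (0.0 if it has no weight)."""
    return [weights.get(seq[i:i + k], 0.0) for i in range(len(seq) - k + 1)]


def find_hills(seq: str, weights: Dict[str, float], k: int,
               min_len: int = 10, max_len: int = 15,
               threshold: float = 1.0) -> List[Hill]:
    """Return non-overlapping windows of min_len..max_len bp scoring above threshold.

    Assumption: a window's score is the sum of the weights of all k-mers
    that lie entirely inside it.
    """
    per_kmer = kmer_scores(seq, weights, k)
    candidates: List[Hill] = []
    for length in range(min_len, max_len + 1):
        for start in range(len(seq) - length + 1):
            # k-mers fully inside the window start at start .. start+length-k
            score = sum(per_kmer[start:start + length - k + 1])
            if score > threshold:
                candidates.append((start, start + length, score))

    # Highest score first; ties go to the shorter, then the leftmost window.
    candidates.sort(key=lambda h: (-h[2], h[1] - h[0], h[0]))

    hills: List[Hill] = []
    for start, end, score in candidates:
        # Keep a candidate only if it overlaps no previously chosen hill.
        if all(end <= s or start >= e for s, e, _ in hills):
            hills.append((start, end, score))
    return hills


# Toy example with made-up weights (not taken from any trained model).
toy_weights = {"GATA": 2.0, "ATAA": 1.5, "CCCC": -1.0}
toy_seq = "CCCCCCCCTTTTAGATAAGGTTTTCCCCCCCC"
example_hills = find_hills(toy_seq, toy_weights, k=4)
for start, end, score in example_hills:
    print(start, end, toy_seq[start:end], score)
```

In this sketch, the sequences under the selected hills would then be passed to a motif finder such as MEME, as described in (C).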