Probabilistic K-mean with local alignment to locally cluster curves
and discover functional motifs
Marzia Cremona (joint work with Francesca Chiaromonte)
In this work we develop a new method to locally cluster misaligned
curves and to address the problem of discovering functional motifs,
i.e., typical “shapes” that may recur several times along and across
a set of curves, capturing important local characteristics of these
curves.
We formulate probabilistic K-mean with local alignment, a novel
algorithm that leverages ideas from functional data analysis (joint
clustering and alignment of curves), bioinformatics (local alignment
through the extension of high-similarity “seeds”), and fuzzy
clustering (a curve may belong to more than one cluster if it
contains more than one typical “shape”). Our methodology can employ
various dissimilarity measures and incorporate derivatives in the
discovery process, in order to capture different shape
characteristics.
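
To make the combination of local alignment and fuzzy membership concrete, the following is a minimal sketch and not the algorithm as formulated in this work. It assumes curves sampled on a common grid, a mean squared distance as the dissimilarity, and the standard fuzzy c-means membership rule; the names best_local_alignment and fuzzy_memberships and the toy data are hypothetical and chosen only for illustration.

```python
def best_local_alignment(curve, motif):
    """Slide `motif` along `curve`; return (shift, dissimilarity) of the best window.

    Assumption: dissimilarity is the mean squared difference over the window.
    """
    w = len(motif)
    best_shift, best_dist = 0, float("inf")
    for s in range(len(curve) - w + 1):
        d = sum((curve[s + t] - motif[t]) ** 2 for t in range(w)) / w
        if d < best_dist:
            best_shift, best_dist = s, d
    return best_shift, best_dist


def fuzzy_memberships(dists, m=2.0, eps=1e-12):
    """Standard fuzzy c-means memberships from squared dissimilarities.

    `m` > 1 is the fuzzifier; `eps` avoids division by zero on exact matches.
    """
    inv = [(d + eps) ** (-1.0 / (m - 1.0)) for d in dists]
    total = sum(inv)
    return [v / total for v in inv]


# Toy curve containing two different shapes: a peak and a valley.
curve = [0.0, 0.1, 1.0, 2.0, 1.0, 0.0, -1.0, -2.0, -1.0, 0.0]
motifs = {"peak": [1.0, 2.0, 1.0], "valley": [-1.0, -2.0, -1.0]}

# Locally align every candidate motif to the curve.
aligned = {name: best_local_alignment(curve, mo) for name, mo in motifs.items()}

# Membership of the curve in each motif cluster, in the order of `motifs`.
memberships = fuzzy_memberships([aligned[name][1] for name in motifs])
```

In this toy case both motifs occur exactly in the curve, at different positions, so the curve is shared equally between the two clusters rather than forced into one. A different dissimilarity, for example one that also compares derivatives, could be swapped into best_local_alignment without changing the membership step.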
After demonstrating the performance of our method on simulated data,
and showing how it generalizes other clustering methods for
functional data, we apply it to discover functional motifs in
“Omics” signals related to mutagenesis and genome dynamics.