Probabilistic K-mean with local alignment to locally cluster curves and discover functional motifs
Marzia Cremona (Joint work with Francesca Chiaromonte)

In this work we develop a new method to locally cluster misaligned curves and to discover functional motifs, i.e., typical “shapes” that may recur several times along and across a set of curves and that capture important local characteristics of these curves.
We formulate probabilistic K-mean with local alignment, a novel algorithm that leverages ideas from functional data analysis (joint clustering and alignment of curves), bioinformatics (local alignment through the extension of high similarity “seeds”) and fuzzy clustering (curves belonging to more than one cluster, if they contain more than one typical “shape”). Our methodology can employ various dissimilarity measures and incorporate derivatives in the discovery process, in order to capture different shape characteristics.
After demonstrating the performance of our method on simulated data, and showing how it generalizes other clustering methods for functional data, we apply it to discover functional motifs in “Omics” signals related to mutagenesis and genome dynamics.
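
To make two of the ingredients above concrete, here is a minimal Python sketch of local alignment (sliding a short candidate motif along a longer curve) and of fuzzy memberships (a curve can belong to several clusters). It is an illustration under stated assumptions, not the algorithm presented in this work: it uses a plain mean squared distance, an exhaustive search over shifts instead of seed extension, and the standard fuzzy c-means membership formula. It also omits derivatives and the iterative update of the motifs. The function names `best_local_alignment` and `fuzzy_memberships` and the fuzzifier `m` are hypothetical.

```python
import numpy as np


def best_local_alignment(curve, motif):
    """Slide `motif` along `curve`; return (best shift, mean squared distance there)."""
    curve = np.asarray(curve, dtype=float)
    motif = np.asarray(motif, dtype=float)
    if len(motif) > len(curve):
        raise ValueError("motif must not be longer than the curve")
    # Every contiguous window of the curve with the same length as the motif.
    windows = np.lib.stride_tricks.sliding_window_view(curve, len(motif))
    # Assumed dissimilarity: mean squared difference on the sampling grid.
    dists = np.mean((windows - motif) ** 2, axis=1)
    shift = int(np.argmin(dists))
    return shift, float(dists[shift])


def fuzzy_memberships(dists, m=2.0, eps=1e-12):
    """Fuzzy c-means style memberships from dissimilarities; they sum to 1."""
    d = np.maximum(np.asarray(dists, dtype=float), eps)  # avoid division by zero
    weights = d ** (-1.0 / (m - 1.0))
    return weights / weights.sum(axis=-1, keepdims=True)


# Toy curve containing two different "shapes" at different locations.
bump = np.sin(np.linspace(0.0, np.pi, 10))
ramp = np.linspace(0.0, 1.0, 10)
curve = np.zeros(60)
curve[10:20] = bump
curve[40:50] = ramp

alignments = [best_local_alignment(curve, motif) for motif in (bump, ramp)]
memberships = fuzzy_memberships([dist for _, dist in alignments])
print("best shifts:", [shift for shift, _ in alignments])
print("memberships:", memberships)
```

Because the toy curve contains both shapes, each candidate motif aligns to its own portion of the curve, and the curve receives a non-negligible membership in both clusters. This is the behaviour that motivates the fuzzy formulation.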