Time series classification (TSC) is one of the most fundamental problems in time series analysis. Time series shapelets (or simply, shapelets) are discriminative subsequences that have recently been found both effective and interpretable for solving TSC. However, shapelet discovery is known to be computationally costly. Meanwhile, the matrix profile has recently been proposed for efficient motif discovery and anomaly detection. Our preliminary experiment shows that directly adopting the matrix profile for TSC does not yield superior classification accuracy. We have identified two issues with such an adoption: 1) discords being treated as “shapelets”, and 2) a lack of diversity among the discovered subsequences. In response to these issues, we propose Instance Profile for Shapelets (IPS), which uses an instance profile (IP) for shapelet discovery in TSC. The main challenges are to utilize the IP to capture the characteristics of shapelets in a robust manner and to discover high-quality shapelets efficiently. First, we use our IP to generate abundant shapelet candidates. Next, we efficiently prune candidates that do not align with the definition of shapelets using a novel distribution-aware bloom filter (DABF). We propose three utility functions to measure the shapelet candidates; DABF is used to compute these functions efficiently, at the cost of a slight drop in accuracy.
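For readers unfamiliar with the two building blocks mentioned above, the following is a minimal sketch of the textbook definitions only: the distance between a shapelet and a time series, and a brute-force matrix profile. It is not the IP, the DABF, or the utility functions of IPS. The function names (`znorm`, `subsequence_distance`, `naive_matrix_profile`) and the choice of z-normalized Euclidean distance with an exclusion zone of half the window length are assumptions made for illustration.

```python
import math


def znorm(x):
    """Z-normalize a sequence; a constant sequence maps to all zeros."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    if std == 0:
        return [0.0] * n
    return [(v - mean) / std for v in x]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def subsequence_distance(shapelet, series):
    """Smallest z-normalized distance between the shapelet and any
    window of the same length in the series."""
    m = len(shapelet)
    s = znorm(shapelet)
    return min(
        euclidean(s, znorm(series[i:i + m]))
        for i in range(len(series) - m + 1)
    )


def naive_matrix_profile(series, m):
    """Brute-force matrix profile: for every window of length m, the
    distance to its nearest neighbour elsewhere in the same series.
    Windows closer than the exclusion zone are skipped as trivial matches."""
    n = len(series) - m + 1
    windows = [znorm(series[i:i + m]) for i in range(n)]
    excl = max(1, m // 2)  # assumed exclusion zone: half the window length
    profile = []
    for i in range(n):
        best = math.inf
        for j in range(n):
            if abs(i - j) >= excl:
                best = min(best, euclidean(windows[i], windows[j]))
        profile.append(best)
    return profile
```

A low profile value marks a window that repeats elsewhere (a motif), while a high value marks a window with no close match (a discord). This is why selecting subsequences by profile value alone can surface discords rather than class-discriminative shapelets.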
Guozhong Li, Byron Choi, et al. Supplementary Material of IPS for Time Series Classification, Supplementary.
Yesterday is history, tomorrow is a mystery, today is a present. :)
Thanks to the research community for supporting the datasets.