Modified Dynamic Time Warping for Hierarchical Clustering

Mahmoud Sammour, Zulaiha Ali Othman, Amalia Mabrina Masbar Rus, Rosmayati Mohamed

Abstract


Time series clustering is the process of grouping similar sequences into the same cluster. Its key ingredient is the similarity/distance function used to identify sequential matches. Dynamic Time Warping (DTW) is one of the common distance measures and has demonstrated competitive results compared to other functions. DTW uses dynamic programming to find the shortest warping path between two sequences, choosing the smaller accumulated distance at each step. However, when the candidate distances are equal, DTW selects the path randomly. This random choice can be misguided and significantly affects matching quality, because it may lead to a longer path that drifts away from the optimum. This paper proposes a modified DTW that improves the selection of the shortest path when handling equal distances. Experiments were conducted on twenty UCR benchmark datasets. The proposed modified DTW was compared, in terms of precision, recall and F-measure, with competitive state-of-the-art distance measures, including the original DTW, the Minkowski distance and the Euclidean distance. The results show that the proposed modified DTW outperforms the standard DTW, using either Minkowski or Euclidean distance. This demonstrates the effectiveness of the proposed modification: optimizing the shortest path improves clustering performance. The proposed modified DTW can be used to build a good clustering method for any time series data.
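The tie-handling idea can be illustrated with a minimal Python sketch. The abstract does not state the exact rule used by the modified DTW, so the rule below is an assumption made for illustration only: when two predecessor cells have the same accumulated distance, the one reached by the shorter warping path is chosen, and the diagonal step is preferred if the lengths also match. The function name `dtw_with_tie_break` and the absolute-difference local distance are likewise hypothetical choices, not taken from the paper.

```python
from typing import List, Sequence, Tuple


def dtw_with_tie_break(
    a: Sequence[float], b: Sequence[float]
) -> Tuple[float, List[Tuple[int, int]]]:
    """DTW distance and warping path with deterministic tie-breaking.

    Assumption (not from the paper): among predecessors with equal
    accumulated cost, prefer the one with the shorter path so far;
    if still tied, prefer the diagonal step.
    """
    if len(a) == 0 or len(b) == 0:
        raise ValueError("both sequences must be non-empty")

    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j]: minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    # steps[i][j]: number of cells on the chosen path ending at (i, j)
    steps = [[0] * (m + 1) for _ in range(n + 1)]
    # back[i][j]: predecessor cell used to reach (i, j)
    back = [[(0, 0)] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Diagonal is listed first, so min() keeps it on a full tie.
            candidates = [
                (cost[i - 1][j - 1], steps[i - 1][j - 1], (i - 1, j - 1)),
                (cost[i - 1][j], steps[i - 1][j], (i - 1, j)),
                (cost[i][j - 1], steps[i][j - 1], (i, j - 1)),
            ]
            # Compare by accumulated cost first, then by path length.
            best = min(candidates, key=lambda c: (c[0], c[1]))
            cost[i][j] = local + best[0]
            steps[i][j] = best[1] + 1
            back[i][j] = best[2]

    # Walk the back-pointers from the last cell to recover the path.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while (i, j) != (0, 0):
        path.append((i - 1, j - 1))  # report 0-based indices
        i, j = back[i][j]
    path.reverse()
    return cost[n][m], path
```

Calling `dtw_with_tie_break([1, 1, 1], [1, 1, 1])` is a simple case in which every warping path has the same cost; with the assumed rule the sketch returns the purely diagonal path rather than an arbitrary longer one. The returned distance could then be used as the pairwise dissimilarity in an agglomerative hierarchical clustering routine.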


Keywords


hierarchical clustering; dynamic time warping; distance measures.





DOI: http://dx.doi.org/10.18517/ijaseit.9.5.7079




Published by INSIGHT - Indonesian Society for Knowledge and Human Development