Regression-based Analytical Approach for Speech Emotion Prediction based on Multivariate Additive Regression Spline (MARS)

Budi Triandi, Syahril Efendi, Herman Mawengkang, Sawaluddin

Abstract


Regression analysis is a resource-efficient approach to speech emotion recognition (SER). Labeled speech emotion data exhibit high emotional complexity and ambiguity, which makes this research difficult. The maximum average difference is used to assess the marginal agreement between the source and target domains without relying on the prior class distributions of the two domains. To address this issue, we propose speech emotion recognition using a regression analysis technique based on local domain adaptation. The results of this study show that the generalization ability of the model, combined with the local additive method, substantially improves speech emotion recognition performance. Although regression-based analytical techniques offer clear benefits in resource efficiency, they are rarely used in the SER field; we nevertheless consider this method a strong solution to SER problems. Using the Multivariate Additive Regression Spline (MARS), this study developed a predictive model for the presence of angry and non-angry emotions. By analyzing the probability of the error values, this approach can handle regression on data that are not normally distributed. The method yields an optimal set of basis functions that captures significant changes in emotional form. The resulting prediction model achieves a Mean Square Error (MSE) of 0.0130, a Generalized Cross-Validation (GCV) value of 0.0062, and an R-Square (RSQ) value of 0.9721, yielding test results with a 97% accuracy rate.
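To make the modeling pipeline concrete, the sketch below (a minimal illustration, not the authors' implementation) fits a MARS-style model built from hinge basis functions to a toy one-dimensional feature and reports the same three scores quoted above: MSE, GCV, and R-Square. The NumPy helpers, knot locations, GCV penalty value, and the synthetic angry/non-angry target are all illustrative assumptions.

```python
# Minimal MARS-style sketch (illustrative only): hinge basis expansion,
# least-squares fit, and MSE / GCV / R-Square scoring.
import numpy as np

def hinge_basis(x, knots):
    """Expand a 1-D feature into an intercept plus MARS hinge pairs
    max(0, x - t) and max(0, t - x) for each knot t."""
    cols = [np.ones_like(x)]                       # intercept column
    for t in knots:
        cols.append(np.maximum(0.0, x - t))        # right hinge
        cols.append(np.maximum(0.0, t - x))        # left hinge
    return np.column_stack(cols)

def fit_and_score(x, y, knots, penalty=3.0):
    B = hinge_basis(x, knots)
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)   # least-squares basis weights
    y_hat = B @ coef
    n, m = B.shape
    mse = np.mean((y - y_hat) ** 2)
    # Generalized cross-validation in the spirit of Friedman (1991):
    # effective parameters grow with the basis size, plus a penalty per knot.
    eff_params = m + penalty * len(knots)
    gcv = mse / (1.0 - eff_params / n) ** 2
    rsq = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
    return coef, mse, gcv, rsq

# Toy usage: x stands in for an acoustic feature (e.g., mean pitch) and y for
# a 0/1 angry vs. non-angry label treated as a regression target.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 200)
y = (x > 0.6).astype(float) + rng.normal(0.0, 0.1, 200)
_, mse, gcv, rsq = fit_and_score(x, y, knots=[0.4, 0.6, 0.8])
print(f"MSE={mse:.4f}  GCV={gcv:.4f}  RSQ={rsq:.4f}")
```

In this sketch the knots are fixed in advance; a full MARS procedure would instead select knots greedily in a forward pass and prune them in a backward pass using the GCV score.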

Keywords


Speech emotion; multivariate additive regression spline; regression analytic; predictive model; generalized cross validation


DOI: http://dx.doi.org/10.18517/ijaseit.13.6.18603
