Machine Learning for Classification of Emotion in Speech
Vivek Sharma1, Chandrabhanu Mishra2, Saroj Meher3, Santosh Mishra4
1Vivek Sharma, Biju Patnaik University of Technology, Rourkela, Odisha, India.
2Chandrabhanu Mishra, Associate Professor, Electronics & Instrumentation Engineering, College of Engineering and Technology, Bhubaneswar, Odisha, India.
3Saroj Meher, Systems Science and Informatics Unit, Indian Statistical Institute, Bangalore, India.
4Santosh Mishra, Biju Patnaik University of Technology, Rourkela, Odisha, India.
Manuscript received on January 02, 2020. | Revised Manuscript received on January 15, 2020. | Manuscript published on January 30, 2020. | PP: 2118-2124 | Volume-8 Issue-5, January 2020. | Retrieval Number: E6084018520/2020©BEIESP | DOI: 10.35940/ijrte.E6084.018520
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: The naturalness of human speech comes from the speaker's emotions. People convey and interpret messages with heavy reliance on emotion, so there is a need for a speech interface through which the emotions embedded in a speech signal can be analyzed and processed. Many speech translation systems have been developed with the intent of interpreting the emotions inherent in speech signals, but they fall short in processing those emotions because of gaps in how the emotions are modeled and represented. The main objective of any speech processing system is to retrieve information of interest from speech, such as features and models, so that this knowledge can be used in further speech processing applications. This paper explores the features of speech and their respective models with the goal of distinguishing emotions by capturing emotion-specific information. It also studies several sources of emotional information in speech: the excitation source, the shapes of the vocal tract system and their sequence, and supra-segmental features. The paper concludes that the excitation source and its characteristics may by themselves be sufficient for efficient recognition of emotions.
Keywords: Linear Prediction (LP), Glottal Volume Velocity (GVV), Glottal Closure Instants (GCI).
Scope of the Article: Machine Learning.