IDENTIFICATION OF FEATURES IN PREDICTING PROMINENT MALAY WORDS USING DECISION TREE

Authors

  • Sabrina Tiun, Centre of Artificial Intelligence Technology, Faculty of Technology and Information Science, Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia
  • Liew Siaw Hong, Centre of Artificial Intelligence Technology, Faculty of Technology and Information Science, Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia

DOI:

https://doi.org/10.22452/mjcs.vol33no4.4

Keywords:

Malay prosody, Prominent words, Prominent Features, Decision Tree

Abstract

Predicting word prominence is a major topic in speech synthesis, because identifying prominent words is necessary to produce natural-sounding synthetic speech. In our previous work, prominent words in a speech corpus had to be marked so that the most suitable unit could be selected for synthesis; however, because the marking is performed manually, building a large speech corpus is expensive in both labor and time. Prominent words therefore need to be predicted automatically, and the choice of features is an important aspect of that prediction. This study presents experimental work on identifying features (including part-of-speech (POS) sequence, phrasal break, and word position) for predicting prominent Malay words using a decision tree and the correlation-based feature selection method in WEKA. The decision tree performs best (Precision = 85.0%, Recall = 84.2%, F-measure = 83.5%) when phrasal break is omitted as a feature, and worst (Precision = 66.4%, Recall = 67.2%, F-measure = 66.6%) when the POS sequence is excluded from the features. The study therefore concludes that phrasal break is a weak (noisy) feature, whereas POS sequence is an important feature in predicting prominent Malay words.
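
In the best configuration, the reported F-measure (83.5%) sits below both precision (85.0%) and recall (84.2%). A single class's F-measure is a harmonic mean and always lies between its precision and recall, so this pattern suggests class-weighted averages, which is how WEKA summarizes results in its "Weighted Avg." row. The sketch below is written in Python rather than WEKA and assumes the reported figures are such weighted averages (the abstract does not say). The confusion-matrix counts and the function name are invented for illustration and are not the study's data.

```python
from typing import List, Tuple


def weighted_prf(confusion: List[List[int]]) -> Tuple[float, float, float]:
    """Support-weighted precision, recall and F-measure.

    confusion[i][j] is the number of items whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    w_precision = w_recall = w_f = 0.0
    for k in range(n):
        true_pos = confusion[k][k]
        support = sum(confusion[k])                        # items truly in class k
        predicted = sum(confusion[i][k] for i in range(n))  # items labelled k
        p = true_pos / predicted if predicted else 0.0
        r = true_pos / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total
        w_precision += weight * p
        w_recall += weight * r
        w_f += weight * f
    return w_precision, w_recall, w_f


# Hypothetical counts (NOT from the paper):
# row 0 = non-prominent words, row 1 = prominent words.
HYPOTHETICAL_CONFUSION = [
    [90, 10],
    [30, 70],
]

precision, recall, f_measure = weighted_prf(HYPOTHETICAL_CONFUSION)
print(f"precision={precision:.4f} recall={recall:.4f} f_measure={f_measure:.4f}")
```

With these invented counts the weighted precision is 0.8125, the weighted recall is about 0.80, and the weighted F-measure is about 0.798, so the F-measure again falls below both of the other two figures.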

Published

2020-10-30

How to Cite

Tiun, S., & Hong, L. S. (2020). IDENTIFICATION OF FEATURES IN PREDICTING PROMINENT MALAY WORDS USING DECISION TREE. Malaysian Journal of Computer Science, 33(4), 298–305. https://doi.org/10.22452/mjcs.vol33no4.4