ANALYSIS OF A MODIFIED STEMMING ALGORITHM FOR OVERSTEMMING CASES

Stephanie Betha Rossi Hersianie

Abstract

Overstemming is the excessive truncation of a word toward its root form. The truncated word ends up with a meaning very different from that of the original word, yet the stems produced are identical in form. To address this problem, previous research applied a stemming algorithm with a word rule table. The drawback of the word rule table, however, is that it is difficult to add new types of words that undergo overstemming. This study therefore aims to modify that algorithm for handling overstemming. It combines several stemming algorithms (hybrid stemming): a look-up table algorithm, a word rule table, and the commonly used Porter stemming algorithm. The dataset used for testing is the title attribute of scientific publication documents. The test results show that the modified stemming algorithm achieves a recall of 89.9%. For future research, testing could be carried out on other attributes of the publication documents.
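The hybrid lookup order described in the abstract (look-up table, then word rule table, then a general stemmer) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the table contents, the names `LOOKUP_TABLE`, `RULE_TABLE`, and `hybrid_stem`, and in particular `crude_suffix_stemmer`, which is only a stand-in for the Porter stemmer and does not implement the Porter algorithm.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical look-up table: words known to overstem, mapped to the stem to keep.
LOOKUP_TABLE: Dict[str, str] = {
    "university": "university",
    "universe": "universe",
}

# Hypothetical word rule table: (suffix, replacement) pairs, tried in order.
RULE_TABLE: List[Tuple[str, str]] = [
    ("ization", "ize"),
    ("ies", "y"),
]


def crude_suffix_stemmer(word: str) -> str:
    """Stand-in for the Porter stemmer (assumption); strips one common suffix."""
    for suffix in ("ity", "ing", "ed", "es", "e", "s"):
        # Keep at least three characters so very short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def hybrid_stem(
    word: str, fallback: Callable[[str], str] = crude_suffix_stemmer
) -> str:
    """Try the look-up table, then the rule table, then the fallback stemmer."""
    token = word.lower()
    if token in LOOKUP_TABLE:
        return LOOKUP_TABLE[token]
    for suffix, replacement in RULE_TABLE:
        if token.endswith(suffix):
            return token[: -len(suffix)] + replacement
    return fallback(token)


if __name__ == "__main__":
    for w in ("university", "universe", "studies", "stemming"):
        print(w, "->", crude_suffix_stemmer(w), "|", hybrid_stem(w))
```

In this sketch the fallback stemmer alone conflates "university" and "universe" into one stem, which is the overstemming case the abstract describes, while the look-up table keeps them apart. Adding a new overstemmed word only requires a new table entry, which is the kind of extensibility the abstract says the rule table alone lacks.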

How to Cite
Rossi Hersianie, S. B. (2020). ANALISA MODIFIKASI ALGORITMA STEMMING UNTUK KASUS OVERSTEMMING [Analysis of a modified stemming algorithm for overstemming cases]. TEKNOKOM, 3(2), 23–28. https://doi.org/10.31943/teknokom.v3i2.51
