An Empirical Investigation of Performances of Different Word Embedding Algorithms in Comment Clustering
Date
2019
Authors
Publisher
IEEE
Access Rights
info:eu-repo/semantics/closedAccess
Abstract
With the rapid growth in the use of and interest in social network services, evaluating comment clustering has become increasingly important for various commercial and scientific applications. Analyzing, organizing, and ascertaining the overall theme of a large volume of comments is a challenging and time-consuming task that has recently attracted much attention. In this study, we propose a method to address the comment clustering problem. Extensive experiments were conducted on seven different comment datasets using TF-IDF and three word embedding algorithms, namely Word2vec, GloVe, and FastText; internal clustering validation was used to evaluate the performance of each method in clustering the comments. We observed that word embeddings produced significantly better results in comment clustering than TF-IDF. In addition, Word2vec showed the best performance overall; however, GloVe was the most stable and consistent across all datasets, with its performance improving as dataset size increased.
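The evaluation idea in the abstract (represent comments as vectors, cluster them, then score the clustering with an internal validation index) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the record does not say which internal index was used, so the silhouette coefficient stands in as one common choice; the whitespace tokenizer, the smoothed IDF formula, the cosine distance, the toy comments, and the function names `tfidf_vectors`, `cosine_distance`, and `mean_silhouette` are all hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]  # assumed tokenizer
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption) keeps every weight strictly positive.
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine_distance(u, v):
    """1 - cosine similarity of two sparse vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def mean_silhouette(vectors, labels):
    """Mean silhouette coefficient; requires at least two clusters."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(cosine_distance(vectors[i], vectors[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(cosine_distance(vectors[i], vectors[j]) for j in members)
            / len(members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy comments (invented for this sketch): two about a phone, two about delivery.
comments = [
    "great phone battery lasts long",
    "battery life of this phone is great",
    "delivery was late and package damaged",
    "late delivery damaged package",
]
vectors = tfidf_vectors(comments)
good = mean_silhouette(vectors, [0, 0, 1, 1])  # labels follow the topics
bad = mean_silhouette(vectors, [0, 1, 0, 1])   # labels mix the topics
print(f"topic-aligned labels: {good:.3f}, mixed labels: {bad:.3f}")
```

To compare representations in the same way, the TF-IDF vectors would be swapped for averaged word embeddings and the same index recomputed. That step is omitted because it needs pretrained models not described in this record.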
Description
Innovations in Intelligent Systems and Applications Conference (ASYU) -- OCT 31-NOV 02, 2019 -- Izmir, TURKEY
Keywords
Comment Clustering, Word2vec, GloVe, FastText, TF-IDF
Source
2019 Innovations in Intelligent Systems and Applications Conference (ASYU)
WoS Q Value
N/A
Scopus Q Value
N/A