An Empirical Investigation of Performances of Different Word Embedding Algorithms in Comment Clustering


Date

2019

Publisher

IEEE

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

With the rapid growth in the use of, and interest in, social network services, comment clustering has become increasingly important for various commercial and scientific applications. Analyzing, organizing, and ascertaining the overall theme of a large volume of comments is a challenging and time-consuming task that has attracted much attention recently. In this study, we propose a method to address the comment clustering problem. Extensive experiments were conducted on seven different comment datasets using TF-IDF and three word embedding algorithms, namely Word2vec, GloVe, and FastText; internal clustering validation was used to evaluate the performance of each method in clustering the comments. We observed that word embeddings produced significantly better results in comment clustering than TF-IDF. In addition, Word2vec showed the best performance overall; however, we found that GloVe was the most stable and consistent across all datasets, with its performance improving as dataset size increased.
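The evaluation idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It builds TF-IDF vectors for a few made-up comments and scores two candidate clusterings with the silhouette coefficient. The silhouette coefficient is an assumption here, since the abstract does not name the internal validation index used. The toy comments, the whitespace tokenization, the cosine distance, and all function names are likewise hypothetical. Pretrained embeddings are left out to keep the example dependency-free.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight dict) per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine_distance(a, b):
    """1 - cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def silhouette(vectors, labels):
    """Mean silhouette coefficient of a labeling (assumed internal index)."""
    scores = []
    for i, vi in enumerate(vectors):
        # Group distances from point i to every other point by cluster label.
        by_cluster = {}
        for j, vj in enumerate(vectors):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(cosine_distance(vi, vj))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)  # singleton cluster or only one cluster
            continue
        a = sum(own) / len(own)  # mean intra-cluster distance
        b = min(sum(d) / len(d) for d in by_cluster.values())  # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy comments: two about a battery, two about delivery.
comments = [
    "great phone battery",
    "battery lasts great",
    "slow delivery late",
    "late delivery again",
]
vectors = tfidf_vectors([c.split() for c in comments])

score_by_topic = silhouette(vectors, [0, 0, 1, 1])  # groups comments by topic
score_mixed = silhouette(vectors, [0, 1, 0, 1])     # mixes the two topics
print(score_by_topic > score_mixed)
```

To compare representations in the way the abstract describes, the same index would be computed on clusterings obtained from embedding-based comment vectors, for example averaged word vectors, with the distance function adapted to dense vectors.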

Description

Innovations in Intelligent Systems and Applications Conference (ASYU) -- OCT 31-NOV 02, 2019 -- Izmir, TURKEY

Keywords

Comment Clustering, Word2vec, GloVe, FastText, TF-IDF

Source

2019 Innovations in Intelligent Systems and Applications Conference (ASYU)

WoS Q Value

N/A

Scopus Q Value

N/A
