A Knowledge-Poor Approach to Turkish Text Categorization

Yildirim, Savas

dc.authorid	yildirim, savas/0000-0002-7764-2891
dc.authorwosid	yildirim, savas/AAG-4639-2019
dc.contributor.author	Yildirim, Savas
dc.date.accessioned	2024-07-18T20:51:00Z
dc.date.available	2024-07-18T20:51:00Z
dc.date.issued	2014
dc.department	İstanbul Bilgi Üniversitesi	en_US
dc.description	15th Annual Conference on Intelligent Text Processing and Computational Linguistics (CICLing) -- APR 06-12, 2014 -- Ctr Commun & Dev, Kathmandu, NEPAL	en_US
dc.description.abstract	Document categorization is the task of determining a category for a given document. Supervised methods mostly rely on training data and rich linguistic resources that are either language-specific or generic. This study proposes a knowledge-poor approach to text categorization that uses no rule sets or language-specific resources such as a part-of-speech tagger or shallow parser. Knowledge-poor here refers to the lack of a reasonable amount of background knowledge. The proposed system architecture takes data as-is and simply separates tokens by spaces. Documents represented in vector space models are used as training data for many machine learning algorithms. We empirically examined and compared several factors, from similarity metrics to learning algorithms, in a variety of experimental setups. Although researchers believe that some particular classifiers or metrics are better than others for text categorization, recent studies show that the ranking of the models depends purely on the class, experimental setup, and domain. The study features extensive evaluation and comparison across a variety of experiments. We evaluate models and similarity metrics for Turkish, an agglutinative language, within the knowledge-poor framework. The output of the study is expected to be beneficial for other studies.	en_US
dc.description.sponsorship	Inst Politecnico Nacl Centro Invest Computac Nat Language & Text Proc Lab, Mexican Soc Artificial Intelligence	en_US
dc.identifier.endpage	440	en_US
dc.identifier.isbn	978-3-642-54902-1
dc.identifier.isbn	978-3-642-54903-8
dc.identifier.issn	0302-9743
dc.identifier.issn	1611-3349
dc.identifier.scopus	2-s2.0-84958521342	en_US
dc.identifier.scopusquality	Q3	en_US
dc.identifier.startpage	428	en_US
dc.identifier.uri	https://hdl.handle.net/11411/8344
dc.identifier.volume	8404	en_US
dc.identifier.wos	WOS:000342990000036	en_US
dc.identifier.wosquality	N/A	en_US
dc.indekslendigikaynak	Web of Science	en_US
dc.indekslendigikaynak	Scopus	en_US
dc.language.iso	en	en_US
dc.publisher	Springer-Verlag Berlin	en_US
dc.relation.ispartof	Computational Linguistics and Intelligent Text Processing, CICLing 2014, Part II	en_US
dc.relation.publicationcategory	Conference Item - International - Institutional Faculty Member	en_US
dc.rights	info:eu-repo/semantics/closedAccess	en_US
dc.subject	Text Categorization	en_US
dc.subject	Vector Space Model	en_US
dc.subject	Machine Learning	en_US
dc.title	A Knowledge-Poor Approach to Turkish Text Categorization	en_US
dc.type	Conference Object	en_US
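
The abstract describes a pipeline that splits text on spaces, represents documents in a vector space model, and compares similarity metrics and learning algorithms. The following is a minimal sketch of that idea, not the paper's implementation: the nearest-centroid classifier, the raw term-frequency weighting, the cosine metric, the toy Turkish sentences, and all function names (`tokenize`, `tf_vector`, `cosine`, `train_centroids`, `classify`) are assumptions chosen for illustration only.

```python
import math
from collections import Counter

def tokenize(text):
    # Knowledge-poor tokenization: split on whitespace only,
    # with no stemming, POS tagging, or other language resources.
    return text.split()

def tf_vector(text):
    # Sparse term-frequency vector (token -> count).
    return Counter(tokenize(text))

def cosine(a, b):
    # Cosine similarity between two sparse vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def train_centroids(labeled_docs):
    # Sum the vectors of each class. Cosine similarity is
    # scale-invariant, so the sum behaves like the mean here.
    centroids = {}
    for text, label in labeled_docs:
        centroids.setdefault(label, Counter()).update(tf_vector(text))
    return centroids

def classify(text, centroids):
    # Assign the label whose centroid is most similar to the document.
    vec = tf_vector(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))

# Hypothetical toy training set (not data from the paper).
TRAIN = [
    ("takım maçı kazandı", "spor"),
    ("futbol maçı bugün oynandı", "spor"),
    ("borsa bugün yükseldi", "ekonomi"),
    ("dolar ve borsa düştü", "ekonomi"),
]

CENTROIDS = train_centroids(TRAIN)
print(classify("takım futbol maçı", CENTROIDS))
print(classify("borsa düştü", CENTROIDS))
```

Because tokens are matched as raw surface strings, inflected forms of the same Turkish stem count as different features. That is the trade-off a knowledge-poor setup accepts in exchange for needing no language-specific tools.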

Collection

Web of Science Indexed Publications
Scopus Indexed Publications
