Browse by Author "Diri, Banu"
Showing 1 - 7 of 7
Item
Acquisition of Turkish meronym based on classification of patterns
(Springer, 2016) Yildiz, Tugba; Diri, Banu; Yildirim, Savas
The identification of semantic relations from a raw text is an important problem in Natural Language Processing. This paper provides semi-automatic pattern-based extraction of part-whole relations. We utilized and adopted some lexico-syntactic patterns to disclose the meronymy relation from a Turkish corpus. We applied two different approaches to prepare patterns: one is based on pre-defined patterns taken from the literature; the second automatically produces patterns by means of a bootstrapping method. While pre-defined patterns are applied directly to the corpus, the other patterns must first be discovered from manually prepared unambiguous seeds. Then, word pairs are extracted by their occurrence in those patterns. In addition, we used statistical selection on global data obtained from the results of all patterns. This data is a whole-by-part matrix to which several association metrics, such as information gain and T-score, are applied. We examined how all these approaches improve the system accuracy, especially within the corpus-based approach and the distributional features of words. Finally, we conducted a variety of experiments with a comparative analysis and showed the advantages and disadvantages of the approaches, with promising results.

Item
A Hybrid Method for Extracting Turkish Part-Whole Relation Pairs from Corpus
(IEEE, 2016) Sahin, Gurkan; Diri, Banu; Yildiz, Tugba
Extraction of various semantic relation pairs from different sources (dictionary definitions, corpus, etc.) with high accuracy is one of the most popular topics in natural language processing (NLP). In this study, a hybrid method is proposed to extract Turkish part-whole pairs from a corpus. Corpus statistics, WordNet similarities, and Word2Vec word vector similarities are used together. First, initial part-whole seeds are prepared, and these seeds are used to extract part-whole patterns from the corpus. For each pattern, a reliability score is calculated, and reliable patterns are selected to produce new pairs from the corpus. Various reliability scores are used for new pairs. To measure the success of the method, 19 target whole words are selected, and average precisions of 83% (first 10 pairs), 74% (first 20 pairs), and 68% (first 30 pairs) are obtained.

Item
An Integrated Approach to Automatic Synonym Detection in Turkish Corpus
(Springer International Publishing Ag, 2014) Yildiz, Tugba; Yildirim, Savas; Diri, Banu
In this study, we designed a model to determine synonymy. Our main assumption is that synonym pairs, by definition, show similar semantic and dependency relations. They share the same meronym/holonym and hypernym/hyponym relations. Unlike synonymy, hypernymy and meronymy relations can probably be acquired by applying lexico-syntactic patterns to a big corpus. Such acquisition might be utilized to ease the detection of synonymy. Likewise, we utilized some particular dependency relations, such as object/subject of a verb. Machine learning algorithms were applied to all these acquired features. The first aim is to find out which dependency and semantic features are the most informative and contribute most to the model. The performance of each feature is individually evaluated with cross-validation. The model that combines all features shows promising results and successfully detects the synonymy relation. The main contribution of the study is to integrate both semantic and dependency relations within a distributional aspect. The second contribution is that it is the first major attempt at Turkish synonym identification based on a corpus-driven approach.

Item
Pattern and Semantic Similarity Based Automatic Extraction of Hyponym-Hypernym Relation from Turkish Corpus
(IEEE, 2015) Sahin, Gurkan; Diri, Banu; Yildiz, Tugba
Extraction of semantic relations from various resources (Wikipedia, Web, corpus, etc.) is an important issue in natural language processing. This paper aims at the automatic extraction of hyponym-hypernym pairs from a Turkish corpus. To extract hyponym-hypernym pairs, pattern-based and semantic-similarity-based methods are used together. Patterns are extracted from initial hyponym-hypernym pairs, and these patterns are used to extract hyponyms for various hypernyms. Incorrect candidate hyponyms are removed using elimination methods based on document frequency and semantic similarity. In experiments on 14 hypernyms, an average accuracy of 77% was obtained.

Item
A Study on Turkish Meronym Extraction Using a Variety of Lexico-Syntactic Patterns
(Springer International Publishing Ag, 2016) Yildiz, Tugba; Yildirim, Savas; Diri, Banu
In this paper, we applied lexico-syntactic patterns to disclose the meronymy relation from a huge Turkish raw text. Once the system takes a huge raw corpus and extracts the matched cases for a given pattern, it proposes a list of whole-part pairs depending on their co-occurrence frequencies. For this purpose, we exploited and compared a list of pattern clusters. The clusters examined fall into three types: general patterns, dictionary-based patterns, and bootstrapped patterns. We evaluated how these patterns improve the system performance, especially within the corpus-based approach and the distributional features of words. Finally, we discuss all the experiments with a comparative analysis and show the advantages and disadvantages of the approaches, with promising results.

Item
Turkish synonym identification from multiple resources: monolingual corpus, mono/bilingual online dictionaries, and WordNet
(2017) Yıldız, Tuğba; Diri, Banu; Yıldırım, Savaş
In this study, a model is proposed to determine synonymy by incorporating several resources. The model extracts the features from monolingual online dictionaries, a bilingual online dictionary, WordNet, and a monolingual Turkish corpus. Once it has built a candidate list, it determines the synonymy for a given word by means of those features. All these resources and the approaches are evaluated. Taking all features into account and applying machine learning algorithms, the model shows good performance, with an F-measure of 81.4%. The study contributes to the literature by integrating several resources and attempting the first corpus-driven synonym detection system for Turkish.

Item
Turkish synonym identification from multiple resources: monolingual corpus, mono/bilingual online dictionaries, and WordNet
(TUBITAK SCIENTIFIC & TECHNICAL RESEARCH COUNCIL TURKEY, 2017) Yıldız, Tuğba; Diri, Banu; Yıldırım, Savaş
In this study, a model is proposed to determine synonymy by incorporating several resources. The model extracts the features from monolingual online dictionaries, a bilingual online dictionary, WordNet, and a monolingual Turkish corpus. Once it has built a candidate list, it determines the synonymy for a given word by means of those features. All these resources and the approaches are evaluated. Taking all features into account and applying machine learning algorithms, the model shows good performance, with an F-measure of 81.4%. The study contributes to the literature by integrating several resources and attempting the first corpus-driven synonym detection system for Turkish.