DEMIR at CLEF eHealth 2019: Information retrieval based classification of animal experiments summaries


Ahmed N., Arıbaş A., ALPKOÇAK A.

20th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2019, Lugano, Switzerland, 9 - 12 September 2019, vol.2380

  • Publication Type: Conference Paper / Full-Text Paper
  • Volume number: 2380
  • City of Publication: Lugano
  • Country of Publication: Switzerland
  • Keywords: Elasticsearch, K-Nearest Neighbor (k-NN), Multi-label classification, Threshold-Nearest Neighbor (t-NN)
  • Affiliated with Dokuz Eylül University: Yes

Abstract

© 2019 CEUR-WS. All rights reserved. Information retrieval systems have recently become powerful tools for retrieving full-text results for a given query, which may itself be a document. Elasticsearch is an open-source search engine built on Apache Lucene that also works as a distributed search and analytics engine. It can therefore also serve as a machine learning approach to problems such as document classification. This study, published as a working-notes paper for CLEF eHealth 2019 Task 1 on Multilingual Information Extraction, proposes k-nearest neighbor (k-NN) and threshold-nearest neighbor (t-NN) approaches for classifying animal experiment summaries into their correct ICD-10 codes. Two further methods are then proposed to control and adjust the labels of the retrieved documents in order to assign ICD-10 codes to the query document. In our experiments on the development dataset, these approaches achieved high precision, recall, and F-measure.
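
The retrieval-based classification idea in the abstract can be illustrated with a small sketch. The abstract does not say how the threshold is defined or how labels are chosen from the retrieved documents, so everything below is an assumption made for illustration: the hit list stands in for search engine results, the threshold is taken as a fraction of the top relevance score, and labels are kept when a minimum share of the neighbors carry them. The function names, scores, and label strings are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical retrieval output for one query summary:
# (relevance score, labels of the retrieved training summary).
# Scores and label strings are made up for illustration.
hits = [
    (12.4, ["II", "IV"]),
    (11.9, ["II"]),
    (7.3, ["IX"]),
    (6.8, ["II", "IX"]),
    (2.1, ["XIV"]),
]


def knn_neighbors(hits, k):
    """Keep the k highest-scoring retrieved documents."""
    return sorted(hits, key=lambda h: h[0], reverse=True)[:k]


def tnn_neighbors(hits, ratio):
    """Keep every document scoring at least ratio * top score.

    Defining the threshold relative to the top score is an assumption.
    """
    if not hits:
        return []
    top_score = max(score for score, _ in hits)
    return [h for h in hits if h[0] >= ratio * top_score]


def vote_labels(neighbors, min_share):
    """Assign each label carried by at least min_share of the neighbors.

    This simple voting rule is an assumed stand-in for the label
    adjustment methods mentioned in the abstract.
    """
    if not neighbors:
        return []
    counts = Counter(
        label for _, labels in neighbors for label in set(labels)
    )
    total = len(neighbors)
    return sorted(l for l, c in counts.items() if c / total >= min_share)


knn_labels = vote_labels(knn_neighbors(hits, k=3), min_share=0.5)
tnn_labels = vote_labels(tnn_neighbors(hits, ratio=0.5), min_share=0.5)
print(knn_labels, tnn_labels)
```

In this sketch the k-NN variant always considers a fixed number of neighbors, while the threshold variant lets the neighborhood grow or shrink with the score distribution, which is one plausible reason to compare the two for multi-label assignment.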