Fine-tuning Turkish LLM With Human-Annotated Paraphrase Corpus

Teker G., Koşaner Ö.

2025 International Conference on Artificial Intelligence, Computer, Data Sciences and Applications (ACDSA), Antalya, Türkiye, 7-9 August 2025, pp. 1-6 (Full Text Paper)

Publication Type: Conference Paper / Full Text Paper
DOI: 10.1109/acdsa65407.2025.11166212
City of Publication: Antalya
Country of Publication: Türkiye
Pages: pp. 1-6
Affiliated with Dokuz Eylül University: Yes

Abstract

Paraphrasing is common in everyday language, and it has been studied in both theoretical and applied research. In natural language processing, paraphrasing tasks have attracted significant interest and a large body of work. This study focuses on Turkish paraphrase generation, an area with limited research. In the first phase, we aimed to build a Turkish paraphrase corpus annotated twice by humans. We then fine-tuned the Turkish large language model TURNA on this corpus and obtained the best results in NLU mode: 75.9 ROUGE-1, 61.4 ROUGE-2, 73.5 ROUGE-L, 49.6 BLEU, and 72.5 METEOR.
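
For readers unfamiliar with the metrics reported above, the following is a minimal sketch of how a ROUGE-1 F1 score can be computed from unigram overlap between a reference paraphrase and a generated one. It is an illustration only: the function name `rouge1_f1`, the lowercasing, and the whitespace tokenization are assumptions made here, and the abstract does not state which evaluation tooling or tokenization the authors used. Scores such as 75.9 correspond to this value multiplied by 100.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate sentence.

    Hypothetical helper for illustration; tokenization is a plain
    lowercase whitespace split, which is an assumption, not the
    setup described in the paper.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each token counts at most as often as it
    # appears in both the reference and the candidate.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "bugün hava çok güzel"
    candidate = "bugün hava oldukça güzel"
    # Report on the 0-100 scale commonly used for ROUGE.
    print(round(100 * rouge1_f1(reference, candidate), 1))
```

Published ROUGE implementations typically add stemming or language-specific tokenization, so values from this sketch are not directly comparable with the scores in the abstract.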