On the cryptographic patterns and frequencies in Turkish language


Dalkilic M., Dalkilic G.

ADVANCES IN INFORMATION SYSTEMS, vol.2457, pp.144-153, 2002 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 2457
  • Publication Date: 2002
  • Journal Name: ADVANCES IN INFORMATION SYSTEMS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED)
  • Page Numbers: pp.144-153
  • Dokuz Eylül University Affiliated: Yes

Abstract

Although Turkish is a significant language with over 60 million native speakers, its cryptographic characteristics are relatively unknown. In this paper, some language patterns and frequencies of Turkish (such as the letter frequency profile, letter contact patterns, the most frequent digrams, trigrams and words, common word beginnings and endings, vowel/consonant patterns, etc.) relevant to information security, cryptography and plaintext recognition applications are presented and discussed. The data are collected from a large Turkish corpus, and their use is illustrated through cryptanalysis of a mono-alphabetic substitution cipher. A new vowel identification method is developed using a distinctive pattern of Turkish: the (almost complete) absence of double consonants at word boundaries.
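
As an illustration of the kind of statistics the abstract lists, the following Python sketch counts letters, digrams and trigrams in a Turkish text. It is not the authors' code or corpus pipeline: the names `TURKISH_ALPHABET`, `turkish_lower`, `ngram_counts` and `relative_frequencies` are hypothetical, and the choice to count n-grams only within words (never across a space) is an assumption made here for simplicity.

```python
from collections import Counter

# The 29 letters of the modern Turkish alphabet, lowercase.
TURKISH_ALPHABET = "abcçdefgğhıijklmnoöprsştuüvyz"


def turkish_lower(text: str) -> str:
    """Lowercase with Turkish casing rules: I -> ı and İ -> i.

    Python's default str.lower() maps "I" to "i", which would merge
    the two distinct Turkish letters, so both are handled first.
    """
    return text.replace("I", "ı").replace("İ", "i").lower()


def ngram_counts(text: str, n: int) -> Counter:
    """Count n-grams of Turkish letters inside each whitespace-separated word.

    Assumption of this sketch: characters outside the alphabet
    (digits, punctuation, foreign letters) are simply dropped.
    """
    counts = Counter()
    for word in turkish_lower(text).split():
        letters = "".join(ch for ch in word if ch in TURKISH_ALPHABET)
        for i in range(len(letters) - n + 1):
            counts[letters[i:i + n]] += 1
    return counts


def relative_frequencies(counts: Counter) -> dict:
    """Convert raw counts to relative frequencies (empty dict if no data)."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: c / total for gram, c in counts.items()}


if __name__ == "__main__":
    sample = "Kitap okumak güzeldir. Kitaplar iyi dosttur."
    for n, label in ((1, "letters"), (2, "digrams"), (3, "trigrams")):
        print(label, ngram_counts(sample, n).most_common(5))
```

Against a mono-alphabetic substitution cipher, a profile of this kind would be computed once on a reference corpus and once on the ciphertext, and the two rankings compared to propose letter mappings. A short sample like the one above is far too small to give reliable frequencies.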