Zipf's law and Mandelbrot's constants for Turkish language using Turkish corpus (TurCo)

Dalkılıç G., Çebi Y.

ADVANCES IN INFORMATION SYSTEMS, PROCEEDINGS, vol.3261, pp.273-282, 2004 (SCI-Expanded)


Zipf's Law is an empirical law that has been applied to many kinds of observations. Many investigations have been carried out to determine how well different languages conform to it. This study examines how well Turkish conforms to Zipf's Law and determines the Mandelbrot constants (c and B) using a large-scale Turkish corpus (TurCo). To determine these constants, different c and B values were examined and compared using the coefficient of determination. As both languages show agglutinative characteristics, the most suitable B value for Turkish, as for Korean, was found to be smaller than 1, and the c value was found to be 0.27.
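
The search over c and B values scored by the coefficient of determination can be illustrated with a minimal sketch. The abstract does not state the exact functional form or fitting procedure, so the following are assumptions: the Zipf-Mandelbrot form f(r) = P / (r + c)^B, a coefficient of determination computed on log frequencies, and a plain grid search. The function names (`mandelbrot`, `r_squared`, `grid_search`) are hypothetical, and the frequencies are synthetic, not taken from TurCo.

```python
import math


def mandelbrot(rank, P, c, B):
    """Assumed Zipf-Mandelbrot form: f(r) = P / (r + c)**B."""
    return P / (rank + c) ** B


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def grid_search(freqs, c_values, B_values):
    """Try every (c, B) pair and keep the one with the highest R^2.

    freqs[i] is the frequency of the word with rank i + 1.
    The fit is scored in log space (an assumption of this sketch).
    Returns a tuple (best_r2, c, B, P).
    """
    ranks = range(1, len(freqs) + 1)
    log_f = [math.log(f) for f in freqs]
    best = None
    for c in c_values:
        log_rc = [math.log(r + c) for r in ranks]
        for B in B_values:
            # For fixed c and B, the least-squares value of log P is the
            # mean of log f(r) + B * log(r + c).
            log_P = sum(lf + B * x for lf, x in zip(log_f, log_rc)) / len(log_f)
            predicted = [log_P - B * x for x in log_rc]
            r2 = r_squared(log_f, predicted)
            if best is None or r2 > best[0]:
                best = (r2, c, B, math.exp(log_P))
    return best


# Synthetic rank-frequency data generated from known constants
# (illustrative values only, not corpus counts).
true_P, true_c, true_B = 1000.0, 0.27, 0.9
freqs = [mandelbrot(r, true_P, true_c, true_B) for r in range(1, 201)]

c_grid = [i / 100 for i in range(0, 51)]    # c from 0.00 to 0.50
B_grid = [i / 100 for i in range(70, 121)]  # B from 0.70 to 1.20

best_r2, best_c, best_B, best_P = grid_search(freqs, c_grid, B_grid)
print(best_r2, best_c, best_B, best_P)
```

With real corpus counts, `freqs` would be the word frequencies sorted in descending order; the grid ranges and step size here are arbitrary choices and would need to be adapted to the data.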