Three Dimensions of Pitched Instrument Onset Detection


Holzapfel A., Stylianou Y., Gedik A. C., Bozkurt B.

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, vol.18, no.6, pp.1517-1527, 2010 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 18 Issue: 6
  • Publication Date: 2010
  • Doi Number: 10.1109/tasl.2009.2036298
  • Journal Name: IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED)
  • Page Numbers: pp.1517-1527
  • Keywords: Automatic music transcription, group delay, music information retrieval, onset detection, SPEECH
  • Dokuz Eylül University Affiliated: No

Abstract

In this paper, we propose a novel group-delay-based method for onset detection in pitched instruments. We approach the problem by examining three dimensions separately: phase (i.e., group delay), magnitude, and pitch. The onset detectors for phase, pitch, and magnitude are evaluated on a new, publicly available, fully onset-annotated database of monophonic recordings. The database is balanced in terms of the instruments included and the number of onset samples per instrument, and it covers different performance styles. Results show that onset detection accuracy depends on the type of instrument as well as on the style of performance. Combining the information in the three dimensions by decision-level fusion improves onset detection by about 8% in F-measure compared to the best single dimension.
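
The abstract reports its results as an F-measure. As a rough illustration of how such a score is commonly computed for onset detection, the sketch below matches detected onsets to annotated onsets within a tolerance window and derives precision, recall, and F-measure. The function name `onset_f_measure`, the 50 ms default tolerance, and the greedy one-to-one matching are assumptions made for this example; the abstract does not state the evaluation protocol used in the paper.

```python
def onset_f_measure(detected, reference, tolerance=0.05):
    """Return (precision, recall, f_measure) for onset times in seconds.

    Assumption: a detection counts as correct if it lies within
    `tolerance` seconds of a not-yet-matched reference onset.
    """
    detected = sorted(detected)
    reference = sorted(reference)

    i = j = 0
    true_positives = 0
    # Walk both sorted lists once, pairing each onset at most one time.
    while i < len(detected) and j < len(reference):
        diff = detected[i] - reference[j]
        if abs(diff) <= tolerance:
            true_positives += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # detection is too early to match: false positive
        else:
            j += 1  # reference onset has no detection: missed onset

    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    # Hypothetical onset times, for illustration only.
    p, r, f = onset_f_measure([0.10, 0.52, 1.30], [0.12, 0.50, 0.90, 1.00])
    print(f"precision={p:.3f} recall={r:.3f} f_measure={f:.3f}")
```

Because the F-measure is the harmonic mean of precision and recall, a detector cannot score well by over-detecting (high recall, low precision) or by under-detecting (high precision, low recall).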