Network information improves cancer outcome prediction


Creative Commons License

Roy J., Winter C., IŞIK Z., Schroeder M.

BRIEFINGS IN BIOINFORMATICS, vol.15, no.4, pp.612-625, 2014 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 15 Issue: 4
  • Publication Date: 2014
  • DOI Number: 10.1093/bib/bbs083
  • Journal Name: BRIEFINGS IN BIOINFORMATICS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Pages: pp.612-625
  • Keywords: network-based, outcome prediction, gene expression, PageRank, cancer biomarker, GENE-EXPRESSION SIGNATURE, DYSREGULATED SUBNETWORKS, SQUAMOUS-CELL, GROWTH-FACTOR, METASTASIS, SURVIVAL, PROFILE, SP1, IDENTIFICATION, PROGRESSION
  • Affiliated with Dokuz Eylül University: No

Abstract

Disease progression in cancer can vary substantially between patients, yet patients often receive the same treatment. Recently, there has been much work on predicting disease progression and patient outcome variables from gene expression in order to personalize treatment options. Although the first diagnostic kits are on the market, open problems remain, such as the choice of random gene signatures and noisy expression data. One approach to these two problems employs protein-protein interaction networks and ranks genes using the random surfer model of Google's PageRank algorithm. In this work, we created a benchmark dataset collection comprising 25 cancer outcome prediction datasets from the literature and systematically evaluated the use of networks and a PageRank derivative, NetRank, for signature identification. We show that NetRank performs significantly better than classical methods such as fold change or t-test. Despite an order of magnitude difference in network size, a regulatory network and a protein-protein interaction network perform equally well. Experimental evaluation of cancer outcome prediction across all 25 underlying datasets suggests that the network-based methodology identifies highly overlapping signatures over all cancer types, in contrast to classical methods, which fail to identify highly common gene sets across the same cancer types. Integrating network information into gene expression analysis allows the identification of more reliable and accurate biomarkers and provides a deeper understanding of the processes occurring in cancer development and progression.
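
The abstract describes ranking genes with the random surfer model on an interaction network but does not give the NetRank update rule. The following is therefore only a minimal sketch of a generic personalized PageRank iteration, not the authors' implementation. It assumes an undirected network with no isolated genes, a non-negative per-gene score derived from expression data (for example, an absolute correlation with outcome), and a damping factor of 0.5. The function name `rank_genes`, the toy network, and the toy scores are all hypothetical.

```python
def rank_genes(adjacency, scores, damping=0.5, tol=1e-10, max_iter=1000):
    """Personalized PageRank-style gene ranking (illustrative sketch only).

    adjacency: dict mapping each gene to the set of its neighbours
               (assumed symmetric, i.e. an undirected network).
    scores:    dict mapping each gene to a non-negative expression-derived
               score (assumption: e.g. absolute correlation with outcome).
    damping:   weight given to the network term versus the gene's own score.
    """
    genes = sorted(adjacency)
    total = sum(scores[g] for g in genes)
    if total <= 0:
        raise ValueError("scores must contain at least one positive value")

    # Normalise the expression scores so that they sum to 1.
    prior = {g: scores[g] / total for g in genes}
    rank = dict(prior)

    for _ in range(max_iter):
        new_rank = {}
        for g in genes:
            # Each neighbour splits its current rank evenly among its edges.
            spread = sum(rank[n] / len(adjacency[n]) for n in adjacency[g])
            new_rank[g] = (1 - damping) * prior[g] + damping * spread
        delta = sum(abs(new_rank[g] - rank[g]) for g in genes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical toy network: hub gene H is linked to A, B and C.
toy_network = {
    "H": {"A", "B", "C"},
    "A": {"H"},
    "B": {"H"},
    "C": {"H"},
}
# Hypothetical scores: H itself shows almost no expression signal.
toy_scores = {"H": 0.1, "A": 1.0, "B": 1.0, "C": 1.0}

toy_rank = rank_genes(toy_network, toy_scores)
for gene in sorted(toy_rank, key=toy_rank.get, reverse=True):
    print(gene, round(toy_rank[gene], 4))
```

In this toy case the hub gene H ends up ranked first even though its own score is the lowest, because it inherits rank from three well-scoring neighbours. That is the general intuition behind combining network connectivity with expression data; the behaviour on real datasets depends on the network, the scoring function, and the damping factor.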