Comparison of PLS algorithms when number of objects is much larger than number of variables


Alın A.

STATISTICAL PAPERS, vol.50, no.4, pp.711-720, 2009 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 50 Issue: 4
  • Publication Date: 2009
  • Doi Number: 10.1007/s00362-009-0251-7
  • Journal Name: STATISTICAL PAPERS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.711-720
  • Keywords: High-dimensional data, Kernel matrix, Multicollinearity, Multiple linear regression, NIPALS, Partial least squares, SIMPLS, KERNEL ALGORITHM, CROSS-VALIDATION, REGRESSION
  • Dokuz Eylül University Affiliated: Yes

Abstract

NIPALS and SIMPLS are the most commonly used algorithms for partial least squares analysis. When the number of objects, N, is much larger than the number of explanatory variables, K, and/or response variables, M, the NIPALS algorithm can be time consuming. Although SIMPLS is less time consuming than NIPALS and can be preferred over it, kernel algorithms have been developed specifically for cases where N is much larger than the number of variables. In this study, NIPALS, SIMPLS and some kernel algorithms have been used to build partial least squares regression models. Their performance has been compared in terms of the total CPU time spent on the calculation of latent variables, leave-one-out cross-validation and bootstrap methods. According to the numerical results, one of the kernel algorithms suggested by Dayal and MacGregor (J Chemom 11: 73-85, 1997) is the fastest algorithm.
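
To illustrate why kernel algorithms scale well when N is much larger than K and M, the following is a minimal NumPy sketch written in the spirit of the kernel approach of Dayal and MacGregor. It is not the code used in the paper, and it is not necessarily the exact variant the study found fastest. The function name `kernel_pls`, its arguments, and the assumption that X and Y are already column-centered are illustrative choices. The point it shows is that the N-by-K matrix X is touched only once, to form the K-by-K matrix X'X and the K-by-M matrix X'Y; every later step works on those small matrices.

```python
import numpy as np


def kernel_pls(X, Y, n_components):
    """Sketch of a kernel-style PLS fit (assumes X and Y are column-centered).

    Returns the K-by-M matrix of regression coefficients.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]
    K, M = X.shape[1], Y.shape[1]

    # The only operations whose cost depends on N.
    XtX = X.T @ X  # K x K
    XtY = X.T @ Y  # K x M

    P = np.zeros((K, n_components))  # X loadings
    Q = np.zeros((M, n_components))  # Y loadings
    R = np.zeros((K, n_components))  # weights expressed in terms of the original X

    for a in range(n_components):
        if M == 1:
            w = XtY[:, 0].copy()
        else:
            # Dominant eigenvector of the small M x M matrix (X'Y)'(X'Y);
            # eigh returns eigenvalues in ascending order.
            _, vecs = np.linalg.eigh(XtY.T @ XtY)
            w = XtY @ vecs[:, -1]
        w /= np.linalg.norm(w)

        # Express the weight vector relative to the undeflated X.
        r = w.copy()
        for j in range(a):
            r -= (P[:, j] @ w) * R[:, j]

        tt = r @ XtX @ r       # squared norm of the score vector t = X r
        p = (XtX @ r) / tt     # X loading
        q = (XtY.T @ r) / tt   # Y loading

        # Deflate only the small cross-product matrix X'Y.
        XtY = XtY - tt * np.outer(p, q)

        P[:, a], Q[:, a], R[:, a] = p, q, r

    return R @ Q.T
```

In leave-one-out cross-validation or bootstrap resampling, a fit like this is repeated many times, which is why the per-fit cost matters for the total CPU time compared in the study. As a sanity check on the sketch, using as many components as there are explanatory variables should reproduce the ordinary least squares coefficients when X has full column rank.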