Keywords:
meta-analysis
partial least squares
penalized partial least squares
gradient directed path
squared error loss
GENE-EXPRESSION DATA
VENTRICULAR ASSIST DEVICE
TUMOR CLASSIFICATION
MECHANICAL SUPPORT
CANCER
LIKELIHOOD
CARDIOMYOPATHY
METAANALYSIS
REGRESSION
MODELS
Abstract:
With an increasing number of publicly available microarray datasets, it becomes attractive to borrow information from other relevant studies to obtain a more reliable and powerful analysis of a given dataset. Unlike meta-analysis, we do not assume that subjects in the current study and in the other relevant studies are drawn from the same population. In particular, the parameters of the current study may differ from those of the other studies. We consider sample classification based on gene expression profiles in this context. We propose two new methods, a weighted partial least squares (WPLS) method and a weighted penalized partial least squares (WPPLS) method, to build a classifier from multiple datasets combined. The methods can weight the individual datasets according to their relevance to the current study. A more standard approach is first to build a classifier on each individual dataset, and then to combine the outputs of the multiple classifiers by weighted voting. Using two quite different datasets on human heart failure, we show first that WPLS/WPPLS, by borrowing information from the other dataset, can improve on the performance of PLS/PPLS built on a single dataset only. Second, WPLS/WPPLS performs better than the standard approach of combining multiple classifiers. Third, WPPLS can improve over WPLS, just as PPLS does over PLS for a single dataset. (c) 2004 Elsevier Ltd. All rights reserved.
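
To make the baseline concrete, the "standard approach" mentioned in the abstract (one classifier per dataset, outputs combined by weighted voting) can be sketched as below. This is a minimal illustration and not the authors' implementation: the function name `weighted_vote`, the {-1, +1} label coding, the tie-breaking rule, and the example weights are all assumptions made here, and the abstract does not say how the weights are chosen.

```python
def weighted_vote(predictions, weights):
    """Combine binary labels in {-1, +1} from several classifiers.

    predictions: list of per-classifier label lists, all of equal length
                 (e.g. one classifier trained per dataset).
    weights:     one non-negative weight per classifier; hypothetical
                 relevance weights, not values taken from the paper.
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must label the same samples")

    combined = []
    for i in range(n_samples):
        # Weighted sum of the votes for sample i.
        score = sum(w * preds[i] for preds, w in zip(predictions, weights))
        # Assumption: ties (score == 0) are resolved in favour of +1.
        combined.append(1 if score >= 0 else -1)
    return combined


# Votes from a classifier built on the current study and one built on
# another study, for four test samples.
current_study = [1, -1, 1, -1]
other_study = [1, 1, -1, -1]

# Trusting the current study more reproduces its labels.
print(weighted_vote([current_study, other_study], [0.7, 0.3]))  # [1, -1, 1, -1]
```

With two classifiers the vote simply follows whichever one has the larger weight wherever they disagree. By contrast, WPLS/WPPLS as described in the abstract apply the dataset weights while a single classifier is being built.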