Abstract
This paper presents an original filter approach to effective feature selection in classification tasks with a very large number of input variables. The approach relies on a new information-theoretic selection criterion: the double input symmetrical relevance (DISR). The rationale behind the criterion is that a set of variables can convey more information about the output class than the sum of the information provided by each variable taken individually. This property is made explicit by defining a measure of variable complementarity. A feature selection filter based on the DISR criterion is compared, both theoretically and experimentally, with recently proposed information-theoretic criteria. Experimental results on a set of eleven microarray classification tasks show that the proposed technique is competitive with existing filter selection methods.
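The notion of variable complementarity can be illustrated with a minimal sketch (this is not the paper's DISR implementation): in the XOR relation, each input alone carries zero mutual information about the class, yet the pair determines it completely, so the joint information exceeds the sum of the individual ones. The function and variable names below are illustrative.

```python
# Sketch of variable complementarity: two features can jointly carry more
# class information than the sum of their individual mutual informations.
# The XOR relation y = x1 ^ x2 is the canonical example.
from collections import Counter
from itertools import product
from math import log2

def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) in bits from paired samples."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))   # joint counts
    px = Counter(xs)             # marginal counts of X
    py = Counter(ys)             # marginal counts of Y
    mi = 0.0
    for (x, y), c in pxy.items():
        p_joint = c / n
        mi += p_joint * log2(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi

# All four equiprobable (x1, x2) configurations with y = x1 XOR x2.
samples = [(x1, x2, x1 ^ x2) for x1, x2 in product([0, 1], repeat=2)]
x1s = [s[0] for s in samples]
x2s = [s[1] for s in samples]
ys = [s[2] for s in samples]

print(mutual_information(x1s, ys))                  # 0.0 bits: X1 alone is uninformative
print(mutual_information(x2s, ys))                  # 0.0 bits: X2 alone is uninformative
print(mutual_information(list(zip(x1s, x2s)), ys))  # 1.0 bit: jointly they determine Y
```

A filter criterion that scores variables only individually would discard both features here; a criterion that accounts for complementarity, as DISR does, can retain them as a pair.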
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Meyer, P.E., Bontempi, G. (2006). On the Use of Variable Complementarity for Feature Selection in Cancer Classification. In: Rothlauf, F., et al. Applications of Evolutionary Computing. EvoWorkshops 2006. Lecture Notes in Computer Science, vol 3907. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11732242_9
Print ISBN: 978-3-540-33237-4
Online ISBN: 978-3-540-33238-1
Keywords
- Feature Selection
- Mutual Information
- Feature Selection Algorithm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.