On the Use of Variable Complementarity for Feature Selection in Cancer Classification

  • Conference paper
Applications of Evolutionary Computing (EvoWorkshops 2006)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3907)

Abstract

The paper presents an original filter approach for effective feature selection in classification tasks with a very large number of input variables. The approach is based on a new information-theoretic selection criterion: the double input symmetrical relevance (DISR). The rationale of the criterion is that a set of variables can carry more information about the output class than the sum of the information carried by each variable taken individually. This property is made explicit by defining a measure of variable complementarity. A feature selection filter based on the DISR criterion is compared, in theoretical and experimental terms, to recently proposed information-theoretic criteria. Experimental results on a set of eleven microarray classification tasks show that the proposed technique is competitive with existing filter selection methods.
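
The complementarity idea in the abstract can be illustrated with a toy example. The sketch below is not the authors' code and does not implement DISR. It assumes a simple way to quantify complementarity, namely the joint mutual information minus the sum of the individual mutual informations, and uses an XOR output as hypothetical data. The helper names `entropy` and `mutual_information` are introduced here for illustration only.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Empirical Shannon entropy (in bits) of a list of hashable outcomes."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())


def mutual_information(xs, ys):
    """Empirical I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


# Hypothetical toy data: Y = X1 XOR X2 over the four equiprobable inputs.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
y = [a ^ b for a, b in zip(x1, x2)]

i1 = mutual_information(x1, y)                   # information in X1 alone
i2 = mutual_information(x2, y)                   # information in X2 alone
i12 = mutual_information(list(zip(x1, x2)), y)   # information in the pair

# Assumed complementarity measure: joint information minus sum of individuals.
complementarity = i12 - (i1 + i2)
print(i1, i2, i12, complementarity)
```

In this example each variable alone carries 0 bits about the class, while the pair carries 1 bit, so the assumed complementarity measure equals 1 bit. A filter that ranks variables only by their individual relevance would discard both inputs here, which is the situation the abstract's rationale addresses.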

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Meyer, P.E., Bontempi, G. (2006). On the Use of Variable Complementarity for Feature Selection in Cancer Classification. In: Rothlauf, F., et al. Applications of Evolutionary Computing. EvoWorkshops 2006. Lecture Notes in Computer Science, vol 3907. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11732242_9
