Discretizing environmental data for learning Bayesian-network classifiers

Publication date

2018-01-24

Authors

Fernandez Ropero, R.M. (ISNI 0000000520965907)
Renooij, Silja (ORCID 0000-0003-4339-8146; ISNI 0000000396172124)
van der Gaag, L.C. (ISNI 0000000117800715)

Document Type

Article

License

Taverne

Abstract

To predict the presence of different bird species in Andalusia from land-use data, we compare the performance of Bayesian-network classifiers and logistic-regression models. Our study uses both well-balanced and less balanced data sets, and models are learned both from the original continuous data and from the data after discretization. For the latter purpose, four discretization methods are compared: Equal Frequency, Equal Width, Chi-Merge and MDLP. The experimental results from our species data sets suggest that simple Naive Bayesian classifiers are preferable to logistic-regression models and that the relatively little-known Chi-Merge method is the preferred method for discretizing these environmental data.
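
As a rough illustration of the two unsupervised methods named in the abstract, the sketch below bins a single continuous variable by Equal Width and by Equal Frequency. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the choice of four bins and the sample values are hypothetical, and the supervised methods (Chi-Merge and MDLP), which also use the class labels, are not shown.

```python
from bisect import bisect_right

def equal_width_edges(values, k):
    """Return k-1 cut points that split [min, max] into k intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]

def equal_frequency_edges(values, k):
    """Return k-1 cut points so that each interval holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // k] for i in range(1, k)]

def discretize(values, edges):
    """Map each value to the index of its interval; a value equal to a cut point goes to the upper interval."""
    return [bisect_right(edges, v) for v in values]

# Hypothetical, skewed land-cover percentages (not data from the study).
cover = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 50.0, 100.0]

ew = discretize(cover, equal_width_edges(cover, 4))
ef = discretize(cover, equal_frequency_edges(cover, 4))

print("equal width:    ", ew)
print("equal frequency:", ef)
```

On skewed values such as these, Equal Width places most observations in the first interval and leaves one interval empty, whereas Equal Frequency spreads them evenly across the intervals.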

Keywords

Species distribution models, Bayesian-network classifiers, Logistic-regression models, Discretization methods, Taverne, SDG 15 - Life on Land

Citation

Fernandez Ropero, R M, Renooij, S & van der Gaag, L C 2018, 'Discretizing environmental data for learning Bayesian-network classifiers', Ecological Modelling, vol. 368, pp. 391-403. https://doi.org/10.1016/j.ecolmodel.2017.12.015