Evaluation of Bayesian network scoring functions in polychotomous data analysis

Xuejia Ke, Katherine Lisa Keenan, V. Anne Smith*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Bayesian networks (BNs) are probabilistic graphical models used to represent dependencies and independencies among variables. They have been widely applied in areas such as biology and medicine to untangle complex interrelationships, and are now finding wider use in areas such as social science, where the data have different features, for example highly polychotomous (multi-category) variables. To construct BNs, scoring functions guide selection of the most appropriate model. Among these, the BDe scoring function requires specifying hyperparameters that influence the priors on the network parameters. This study evaluates the performance of four scoring functions (AIC, BIC, BDe, and log-likelihood), particularly with highly polychotomous data. We assessed the overall performance of each scoring function and, for BDe, varied its hyperparameter to evaluate its impact. Performance of the scoring functions was significantly influenced by the number of nodes, network complexity, and sample size. BIC and BDe (with default hyperparameters) generally offered higher precision, especially with larger sample sizes, while log-likelihood tended to overfit, showing high recall but low precision. AIC and BDe required careful tuning based on the number of discrete levels and the sample size. Optimizing the hyperparameters in BDe was crucial for balancing model complexity and fit. We propose a simulation method for identifying the optimum hyperparameters for the BDe scoring function in real-world data applications. The study provides insights to enhance BN models’ robustness and accuracy, emphasizing the importance of considering sample size and the number of discrete levels when selecting and tuning scoring functions for BN structure learning.
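As a rough illustration of how the four scoring functions named above differ, the sketch below computes each score for a single discrete node given its parents. It is not taken from the article. It assumes the common "higher is better" convention, uses the uniform-prior variant of BDe (often called BDeu), and treats the BDe hyperparameter as an equivalent sample size. The function name `local_scores` and its parameters (`r_child`, `q_parents`, `ess`) are hypothetical names chosen for this example.

```python
import math
from collections import Counter


def local_scores(child, parents, r_child, q_parents, ess=1.0):
    """Score one node given its parents (illustrative sketch only).

    child     : list of observed child states
    parents   : list of tuples, the joint parent state for each observation
                (use the empty tuple () for a node with no parents)
    r_child   : number of discrete levels of the child
    q_parents : number of possible joint parent configurations
    ess       : assumed BDeu equivalent sample size (the hyperparameter)
    """
    n = len(child)
    n_jk = Counter(zip(parents, child))  # counts per (parent config, child state)
    n_j = Counter(parents)               # counts per parent config

    # Maximised log-likelihood; unobserved cells contribute nothing.
    loglik = sum(c * math.log(c / n_j[j]) for (j, _), c in n_jk.items())

    # Free parameters grow with both the child's levels and the parent configs,
    # which is why highly polychotomous data is penalised heavily.
    k = q_parents * (r_child - 1)

    # BDeu: log marginal likelihood under a uniform Dirichlet prior.
    a_j = ess / q_parents
    a_jk = ess / (q_parents * r_child)
    bde = sum(math.lgamma(a_j) - math.lgamma(a_j + c) for c in n_j.values())
    bde += sum(math.lgamma(a_jk + c) - math.lgamma(a_jk) for c in n_jk.values())

    return {
        "loglik": loglik,                        # no complexity penalty
        "aic": loglik - k,                       # fixed penalty per parameter
        "bic": loglik - 0.5 * k * math.log(n),   # penalty grows with sample size
        "bde": bde,                              # penalty implied by the prior
    }


# Hypothetical toy data: a binary node with no parents, four observations.
scores = local_scores([0, 1, 0, 1], [()] * 4, r_child=2, q_parents=1, ess=1.0)
print(scores)
```

Because the log-likelihood carries no penalty, adding parents can never lower it, which is consistent with the overfitting behaviour described in the abstract. AIC and BIC subtract a penalty that scales with the parameter count, and in BDe the effective penalty depends on the chosen equivalent sample size.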
Original language: English
Article number: 18
Pages (from-to): 1-21
Number of pages: 21
Journal: Discover Data
Volume: 3
Issue number: 1
Early online date: 19 May 2025
DOIs
Publication status: E-pub ahead of print - 19 May 2025

Keywords

  • Bayesian networks
  • Structure learning
  • Scoring functions
  • Polychotomous data
