Is Bayesian Network Learning any better than Stepwise?

Say you have limited domain knowledge available in a situation, e.g., a set of potential biomarkers whose relationships are unknown and which need to be prioritized for identification, as is common in omics studies.

The pitfalls of stepwise selection are well known; is Bayes net structure learning any better at all? I'm basically referring to things like bnlearn (conditional independence tests) and pgmpy's Hill Climb Search (pgmpy 0.1.15 documentation), etc.

Essentially, a DAG is learned from the data. Usually you do have to discretize the features for these methods, though. But is the general concept of any use? It looks very similar to stepwise (scoring based on AIC/BIC or chi-squared tests), only fancier.
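
For concreteness, here is a minimal sketch of the kind of procedure I mean, using pgmpy's hill-climb search with a BIC score. The data and variable names are made up, and exact signatures vary somewhat across pgmpy versions:

```python
import numpy as np
import pandas as pd
from pgmpy.estimators import BicScore, HillClimbSearch

# Toy, already-discretized "biomarker" data (real omics features would be
# continuous -- which is exactly the discretization concern raised below).
rng = np.random.default_rng(0)
n = 500
a = rng.integers(0, 3, n)                # biomarker A, 3 levels
b = (a + rng.integers(0, 2, n)) % 3      # biomarker B, depends on A
c = rng.integers(0, 2, n)                # biomarker C, independent noise
data = pd.DataFrame({"A": a, "B": b, "C": c})

# Greedy hill climbing over DAGs scored by BIC: add, delete, or reverse one
# edge at a time and keep the change if the score improves -- conceptually
# very close to stepwise selection.
hc = HillClimbSearch(data)
dag = hc.estimate(scoring_method=BicScore(data))
print(dag.edges())
```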

Any method that requires discretizing the data needs to be avoided. And any method that claims to learn everything you need to know from the data alone is doomed.


Yes, it does often require discretizing the data, though in R's bnlearn there is also a Gaussian version; I believe that one assumes linearity as well.

Otherwise I see these methods as hypothesis generating. I don't think they claim to learn everything, and people often add known edges beforehand to encode domain knowledge (see the sketch below). With enough data, I've heard these methods can recover a DAG up to its "equivalence class", but within that equivalence class you won't know the direction of some of the edges.
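
As an example of encoding such prior knowledge, pgmpy's hill-climb estimator accepts fixed and forbidden edges (a sketch under the same assumptions as above; the specific edges are hypothetical):

```python
from pgmpy.estimators import BicScore, HillClimbSearch

# Hypothetical domain knowledge, reusing `data` from the earlier sketch:
# - the edge A -> B is known and forced into every candidate DAG;
# - the edge C -> A is known to be impossible and is never proposed.
hc = HillClimbSearch(data)
dag = hc.estimate(
    scoring_method=BicScore(data),
    fixed_edges=[("A", "B")],
    black_list=[("C", "A")],
)
print(dag.edges())
```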

From the hypothesis-generating point of view, would you say it's less problematic? One could use these methods and then run targeted, designed experiments afterwards. The discretization or linear-Gaussian assumption would still be an issue, but if the goal is just a rough picture, perhaps it matters less.

Binning is highly problematic from every angle. It creates noise that can easily lead to generating wrong hypotheses. Continuous variables need to be respected.
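
To illustrate with a toy simulation of my own (not from the thread): dichotomizing a continuous predictor at its median throws away a sizeable chunk of its association with the outcome, and that lost signal is exactly the kind of noise that can redraw the edges a structure search finds.

```python
import numpy as np

# Toy illustration: how much association survives dichotomization?
rng = np.random.default_rng(1)
n = 100_000
x = rng.normal(size=n)            # continuous biomarker
y = x + rng.normal(size=n)        # outcome; true corr(x, y) = 1/sqrt(2) ~ 0.71

# Median split -- a common (and damaging) binning choice.
x_binned = (x > np.median(x)).astype(float)

print(np.corrcoef(x, y)[0, 1])         # ~ 0.71 with the continuous variable
print(np.corrcoef(x_binned, y)[0, 1])  # ~ 0.56: signal lost to binning
```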

Your view does bring into question these sorts of packages: Causal Inference with Bayesian Networks. Main Concepts and Methods — causalnex 0.11.0 documentation

" prediction & Inference** . The given structure and likelihoods can be used to make predictions, or perform observational and counterfactual inference. CausalNex supports structure learning from continuous data, and expert opinion. CausalNex supports likelihood estimation and prediction/inference from discrete data. A Discretiser class is provided to help discretising continuous data in a meaningful way."

So much of this causal-inference DAG tooling relies on discretizing the variables that it calls into question whether the result is truly "causal" at all. A lot of these fancy "causal" methods seem to ignore the basics…

There are two aspects of meaningful causal inference:

  • justifying the pathways and ensuring that all needed variables involved in the pathway are measured accurately and completely enough
  • having evidence (from the data) for effect > 0 conditional on the effect being causal

Suboptimal analysis choices such as binning especially hurt the second.