Feature selection in causal discovery

AriRoane · May 6, 2026, 3:53pm

Hi! I am using a DAG approach to better specify a research question about stone-free rate after kidney stone treatment. I have constructed an a priori model from a review of the relevant literature, and I am wondering whether it would be valuable to run some data-driven causal discovery algorithms on the variables in the dataset to check whether this a priori model is consistent with the data. I know there are both constraint-based and non-constraint-based (for example, score-based) methods, but I am wondering whether there is a more systematic framework I should be using for feature definition and selection. One big limitation is that many of the actual mediators/mechanisms in the DAG aren’t measured in the dataset, although many of the parents of those mechanisms are.
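
To make the "does the a priori model align with the dataset" part concrete, here is a minimal sketch of the kind of check I have in mind: list the conditional independencies the DAG implies (via d-separation) that involve only measured variables, since those are the only ones the data could contradict. The graph below is a hypothetical toy, not my actual DAG; the node names, the `measured` list, and the helper functions are all placeholders I made up for illustration.

```python
from itertools import combinations

# Hypothetical toy DAG, written as child -> list of parents.
# "mechanism" stands in for an unmeasured mediator.
parents = {
    "baseline": [],
    "stone_size": ["baseline"],
    "treatment": ["stone_size"],
    "mechanism": ["treatment", "stone_size"],
    "stone_free": ["mechanism"],
}
# Assumption: only these variables exist in the dataset.
measured = ["baseline", "stone_size", "treatment", "stone_free"]


def ancestors(nodes):
    """Return the given nodes together with all of their ancestors."""
    seen = set()
    stack = list(nodes)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def d_separated(x, y, z):
    """Check whether x and y are d-separated given the set z.

    Uses the moralized ancestral graph criterion: keep only ancestors
    of x, y, and z, connect co-parents, drop edge directions, remove z,
    and test whether x can still reach y.
    """
    keep = ancestors({x, y} | set(z))
    adj = {node: set() for node in keep}
    for child in keep:
        ps = parents[child]
        for p in ps:  # undirected parent-child edges
            adj[child].add(p)
            adj[p].add(child)
        for a, b in combinations(ps, 2):  # "marry" co-parents
            adj[a].add(b)
            adj[b].add(a)
    blocked = set(z)
    seen = {x}
    stack = [x]
    while stack:
        node = stack.pop()
        if node == y:
            return False
        for nxt in adj[node]:
            if nxt not in blocked and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return True


def testable_independencies():
    """List implied independencies that use measured variables only."""
    found = []
    for x, y in combinations(measured, 2):
        others = [v for v in measured if v not in (x, y)]
        for k in range(len(others) + 1):
            for z in combinations(others, k):
                if d_separated(x, y, z):
                    found.append((x, y, z))
    return found


if __name__ == "__main__":
    for x, y, z in testable_independencies():
        print(f"{x} _||_ {y} | {', '.join(z) if z else '(nothing)'}")
```

In this toy graph the only implications that come out involve `baseline`; the pairs around the unmeasured `mechanism` node yield nothing testable, which is exactly the limitation I am worried about. Each listed implication would still need an appropriate conditional independence test on the data, which this sketch does not do.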