Causal discovery from observational data is the task of learning the underlying causal relations, in the form of a directed acyclic graph (DAG), from the data. A well-known family of causal discovery methods is the constraint-based approach, which performs multiple conditional independence tests (CITs) in a principled order and combines their results to infer the causal structure [1,2]. The accuracy of the CITs is crucial in this approach, because the algorithm can produce a correct graph only if the tests it relies on are correct. Constraint-based methods assume an oracle CIT to guarantee their correctness, but such an oracle rarely, if ever, exists in the real world. In particular, high-order CITs, those with large conditioning sets, tend to have low statistical power. These underpowered tests contribute greatly to the instability and performance degradation of causal structure learning [3,4].
Several methods have been introduced to address the reliability of CITs, most of which are based on heuristics. More recent approaches use rules derived from the graphoid axioms [5,6] to construct a consistent causal structure from inconsistent CIT results [7,8]. The intuition behind these approaches is that the graphoid axioms allow some conditional independence (CI) statements to constrain others. However, their practical application is often limited by significant computational cost and by the uncertainty introduced by the heuristics that decide which CI statements to prefer.
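As an illustration of how one CI statement can constrain another, consider the following standard consequence of the contraction and decomposition axioms. It is given here only as an example; it is not claimed to be the specific rule set used in [7,8]. The symbols X, Y, Z (single variables) and W (a set of variables) are introduced for this illustration.

```latex
% Contraction, then decomposition:
(X \perp Y \mid W \cup \{Z\}) \;\wedge\; (X \perp Z \mid W)
  \;\Longrightarrow\; (X \perp \{Y, Z\} \mid W)
  \;\Longrightarrow\; (X \perp Y \mid W).

% Contrapositive, holding (X \perp Z \mid W) fixed:
(X \not\perp Y \mid W) \;\wedge\; (X \perp Z \mid W)
  \;\Longrightarrow\; (X \not\perp Y \mid W \cup \{Z\}).
```

Read in the second form, two statements with the smaller conditioning set W determine the answer to a statement with the larger conditioning set W ∪ {Z}.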
In this research, Prof. Lee’s team presents a simple, principled, and practical approach to causal discovery based on deductive reasoning over CI statements and the graphoid axioms. The proposed method, named DEDUCE-DEP, efficiently replaces the outcome of an unreliable CIT with one deduced from strictly lower-order CITs, that is, tests whose conditioning sets are smaller than that of the target CIT. Unlike previous approaches, it does not rely on complex routines that add computational burden and uncertainty. The method also works as a modular subroutine that can be integrated with various constraint-based algorithms, which makes it practical to adopt. Empirical evaluations demonstrate significant performance improvements, supporting the effectiveness of the proposed method.
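The idea of deducing a high-order result from lower-order tests can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes one particular deduction that follows from the graphoid axioms (contraction plus decomposition): if X and Y are dependent given W, and some variable Z is independent of X (or of Y) given W, then X and Y are also dependent given W ∪ {Z}. The names `deduce_dependence`, `robust_ci`, and `toy_test` are hypothetical, and `toy_test` is a hand-written stand-in for a real statistical test.

```python
from typing import Callable, FrozenSet, Hashable, Iterable

Var = Hashable
# Assumed interface: ci_test(x, y, cond) returns True when the test
# judges x and y to be independent given the variables in cond.
CITest = Callable[[Var, Var, FrozenSet[Var]], bool]


def deduce_dependence(x: Var, y: Var, cond: Iterable[Var], ci_test: CITest) -> bool:
    """Return True if 'x and y are dependent given cond' can be deduced
    from tests whose conditioning set is one variable smaller."""
    cond = frozenset(cond)
    for z in cond:
        rest = cond - {z}
        # Premise 1: x and y are dependent given the smaller set.
        if ci_test(x, y, rest):
            continue
        # Premise 2: z is independent of x or of y given the smaller set.
        if ci_test(x, z, rest) or ci_test(y, z, rest):
            return True
    return False


def robust_ci(x: Var, y: Var, cond: Iterable[Var], ci_test: CITest) -> bool:
    """Answer 'is x independent of y given cond?', preferring a deduced
    dependence over the direct (possibly underpowered) high-order test."""
    if deduce_dependence(x, y, cond, ci_test):
        return False
    return ci_test(x, y, frozenset(cond))


def toy_test(a: Var, b: Var, cond: FrozenSet[Var]) -> bool:
    """Toy stand-in for a test on the model X -> Y with Z isolated.
    It simulates low power by wrongly reporting X and Y as independent
    whenever the conditioning set is non-empty."""
    if "Z" in (a, b):
        return True  # Z is independent of everything in this toy model
    return len(cond) >= 1
```

With this toy setup, the direct test `toy_test("X", "Y", frozenset({"Z"}))` wrongly reports independence, while `robust_ci("X", "Y", {"Z"}, toy_test)` reports dependence, because the two order-0 tests it consults are reliable. Because `robust_ci` has the same call shape as the test it wraps, a constraint-based algorithm could in principle call it wherever it would call the original test.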