Causal Discovery with Deductive Reasoning: One Less Problem – Seoul National University Graduate School of Data Science

Causal discovery from observational data is the task of learning the underlying causal relations, represented as a directed acyclic graph (DAG), from observational data alone. A well-known family of causal discovery methods is the constraint-based approach, which performs multiple conditional independence tests (CITs) in a principled manner and combines the results to infer the causal structure [1,2]. The accuracy of the CITs is crucial, because the algorithm produces a correct structure only if the test results are correct. Constraint-based methods guarantee correctness under the assumption of an oracle CIT, but such an oracle rarely, if ever, exists in the real world. In practice, high-order CITs, which condition on many variables, have low statistical power, and these underpowered tests greatly contribute to the instability and performance degradation of causal structure learning [3,4].
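To make the loss of power concrete, the following minimal sketch uses the Fisher z-test for zero partial correlation, a common CIT for Gaussian data. It is an illustration only: the article does not say which CIT is used, and the function name and the example numbers are assumptions. The test statistic scales with the square root of n - |Z| - 3, so for a fixed sample size n the same partial correlation becomes harder to detect as the conditioning set Z grows.

```python
import math


def fisher_z_pvalue(partial_corr: float, n_samples: int, cond_size: int) -> float:
    """Two-sided p-value of the Fisher z-test for zero partial correlation.

    partial_corr: estimated partial correlation of X and Y given Z
    n_samples:    number of observations
    cond_size:    number of variables in the conditioning set Z
    """
    dof = n_samples - cond_size - 3
    if dof <= 0:
        raise ValueError("too few samples for this conditioning set size")
    # Fisher z-transform, scaled so that it is approximately N(0, 1)
    # under the null hypothesis of zero partial correlation.
    z = math.sqrt(dof) * math.atanh(partial_corr)
    # Two-sided tail probability of a standard normal variable.
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical numbers: the same partial correlation (0.3) and sample size (50),
# tested with conditioning sets of increasing size.
for k in (0, 5, 20, 40):
    print(k, fisher_z_pvalue(0.3, 50, k))
```

With these hypothetical numbers the p-value increases with the size of the conditioning set, so a dependence that is detected at low order can be missed at high order.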

Several methods have been introduced to address the unreliability of CITs, most of which are based on heuristics. More recent approaches use rules derived from the graphoid axioms [5,6] to construct a consistent causal structure from inconsistent CIT results [7,8]. The intuition is that the graphoid axioms allow some conditional independence (CI) statements to constrain others. However, the practical use of these methods is often limited by their significant computational cost and by the uncertainty introduced by the heuristics that decide which CI statements to prefer.
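For reference, the graphoid axioms can be written as follows for disjoint sets of variables X, Y, W, and Z, where the notation (X ⊥ Y | Z) for "X is independent of Y given Z" is an assumption of this sketch. The first four axioms (the semi-graphoid axioms) hold for every probability distribution; intersection additionally requires a strictly positive distribution.

```latex
\begin{align*}
\text{Symmetry:}      &\quad (X \perp Y \mid Z) \Rightarrow (Y \perp X \mid Z) \\
\text{Decomposition:} &\quad (X \perp Y \cup W \mid Z) \Rightarrow (X \perp Y \mid Z) \\
\text{Weak union:}    &\quad (X \perp Y \cup W \mid Z) \Rightarrow (X \perp Y \mid Z \cup W) \\
\text{Contraction:}   &\quad (X \perp Y \mid Z) \wedge (X \perp W \mid Z \cup Y) \Rightarrow (X \perp Y \cup W \mid Z) \\
\text{Intersection:}  &\quad (X \perp Y \mid Z \cup W) \wedge (X \perp W \mid Z \cup Y) \Rightarrow (X \perp Y \cup W \mid Z)
\end{align*}
```

Each axiom derives a new CI statement from existing ones, which is how a set of test results can be checked for consistency or used to infer a result that was never tested.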

In this research, Prof. Lee’s team presents a straightforward, principled, and practical approach to causal discovery based on deductive reasoning over CI statements and the graphoid axioms. The proposed method, coined DEDUCE-DEP, efficiently replaces an unreliable CIT with an outcome deduced from strictly lower-order CITs, that is, tests whose conditioning sets are smaller than that of the target CIT. In contrast to previous approaches, it does not rely on complex routines that incur computational burden and uncertainty. Furthermore, the method is a modular subroutine that can be integrated with various constraint-based methods, which highlights its practicality. Empirical evaluations demonstrate significant performance improvements, affirming the effectiveness of the proposed method.
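The idea of deducing a high-order result from lower-order tests can be sketched as follows. This is not the authors' implementation of DEDUCE-DEP, whose exact rule is not given in this article; it shows one deduction that follows from the contraction and decomposition axioms. If X and Y were independent given W ∪ {z} and X were independent of z given W, then X and Y would be independent given W. By contraposition, if X and Y are dependent given W while X (or, by symmetry, Y) is independent of z given W, then X and Y must be dependent given W ∪ {z}. All names below, including `ci_test` and the toy independence table, are hypothetical.

```python
from typing import Callable, FrozenSet

# A CI test takes (x, y, conditioning set) and returns True for "independent".
CITest = Callable[[str, str, FrozenSet[str]], bool]


def dependence_is_deducible(x: str, y: str, cond: FrozenSet[str], ci_test: CITest) -> bool:
    """Return True if 'x and y are dependent given cond' follows from
    CI tests whose conditioning sets have one variable fewer than cond."""
    for z in cond:
        rest = cond - {z}
        # Premise 1: x and y are dependent given the smaller set.
        if ci_test(x, y, rest):
            continue
        # Premise 2: z is independent of x (or of y) given the smaller set.
        if ci_test(x, z, rest) or ci_test(y, z, rest):
            return True
    return False


# Toy example (assumed ground truth): X and Z are marginally independent,
# and every other queried pair is dependent.
KNOWN_INDEPENDENCIES = {(frozenset({"X", "Z"}), frozenset())}


def toy_ci_test(a: str, b: str, cond: FrozenSet[str]) -> bool:
    return (frozenset({a, b}), cond) in KNOWN_INDEPENDENCIES


print(dependence_is_deducible("X", "Y", frozenset({"Z"}), toy_ci_test))
```

In this toy setting the dependence of X and Y given {Z} is concluded from two marginal tests, so the first-order test never has to be run. When no such deduction applies, the function returns False and a caller would fall back to the ordinary CIT.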

References

[1] Peter Spirtes, Clark N. Glymour, and Richard Scheines. Causation, Prediction, and Search. MIT Press, Cambridge, MA, 2nd edition, 2000.
[2] Christopher Meek. Causal inference and causal explanation with background knowledge. In P. Besnard and S. Hanks, editors, Uncertainty in Artificial Intelligence 11, pages 403–410. Morgan Kaufmann, San Francisco, 1995.
[3] Constantin F. Aliferis, Alexander Statnikov, Ioannis Tsamardinos, Subramani Mani, and Xenofon D. Koutsoukos. Local causal and Markov blanket induction for causal discovery and feature selection for classification Part II: Analysis and extensions. Journal of Machine Learning Research, 11(8):235–284, 2010.
[4] Angelos P. Armen and I. Tsamardinos. Estimation and control of the false discovery rate of Bayesian network skeleton identification. Technical Report FORTH-ICS / TR-441, 2014.
[5] Dan Geiger. Graphoids: A qualitative framework for probabilistic inference. PhD thesis, University of California, Los Angeles, Department of Computer Science, 1990.
[6] Judea Pearl and Azaria Paz. Graphoids: Graph-based Logic for Reasoning about Relevance Relations. In B. Duboulay, D. Hogg, and L. Steels, editors, Advances in Artificial Intelligence II, pages 357–363. North-Holland Publishing Co., 1987.
[7] Facundo Bromberg and Dimitris Margaritis. Improving the reliability of causal discovery from small data sets using argumentation. Journal of Machine Learning Research, 10(12):301–340, 2009.
[8] Pingchuan Ma, Zhenlan Ji, Peisen Yao, Shuai Wang, and Kui Ren. Enabling runtime verification of causal discovery algorithms with automated conditional independence reasoning. In 2024 IEEE/ACM 46th International Conference on Software Engineering (ICSE), pages 329–341. IEEE Computer Society, 2023.