CAIS

2026

L.A.K.E.: Logic Agent for Knowledge Extraction in Data Planning

Data lakes in modern enterprises are massive, heterogeneous, and noisy, which often prevents non-experts from extracting value from them. Bridging the semantic gap between ambiguous user intent and explicit data requires orchestrating multiple tools under an open-world assumption. However, reliably executing these compound AI workflows to solve complex knowledge extraction tasks, while also providing the transparency needed for evaluation and debugging, remains a significant bottleneck. We propose L.A.K.E. (Logic Agent for Knowledge Extraction), an agentic data planning framework designed to map natural language questions to executable workflows over diverse data sources. Rather than relying on a brittle “one-size-fits-all” approach, L.A.K.E. dynamically generates a declarative plan composed of modular operators, spanning relational and semantic functions, over heterogeneous data sources. Within this framework, we introduce and benchmark three planning regimes: Iterative Planning, Single-Shot Tree Planning, and Cascade Planning. We present an interactive demonstration platform that lets users visually compare the latency and robustness trade-offs of these planners. By rendering execution paths as interactive Directed Acyclic Graphs (DAGs) with step-level provenance, L.A.K.E. provides the observability needed to establish trust, debug failures, and optimize data planning for enterprise-scale lakes.
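
To make the idea of a declarative plan with step-level provenance concrete, the following is a minimal illustrative sketch, not the L.A.K.E. implementation. The names `Step`, `run_plan`, and the toy `ROWS` table are hypothetical, and the "semantic" operator is a plain keyword match standing in for whatever model-backed function a real system would call. The plan is a DAG of named operators, executed in topological order, with one provenance record emitted per step.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import Any, Callable


@dataclass
class Step:
    """One node of a hypothetical declarative plan (assumed structure)."""
    name: str
    op: Callable[..., Any]
    inputs: list[str] = field(default_factory=list)  # names of upstream steps


def run_plan(steps: list[Step]) -> tuple[dict[str, Any], list[dict[str, Any]]]:
    """Execute steps in dependency order and record per-step provenance."""
    by_name = {s.name: s for s in steps}
    # TopologicalSorter takes {node: predecessors}; it raises CycleError
    # if the plan is not a DAG.
    order = TopologicalSorter({s.name: s.inputs for s in steps}).static_order()
    results: dict[str, Any] = {}
    provenance: list[dict[str, Any]] = []
    for name in order:
        step = by_name[name]
        results[name] = step.op(*(results[i] for i in step.inputs))
        provenance.append(
            {"step": name, "inputs": list(step.inputs), "output": results[name]}
        )
    return results, provenance


# Toy data source (illustrative only).
ROWS = [
    {"id": 1, "region": "EU", "note": "late delivery complaint"},
    {"id": 2, "region": "US", "note": "praise for support"},
    {"id": 3, "region": "EU", "note": "complaint about billing"},
]

plan = [
    Step("scan", lambda: ROWS),
    # Relational operator: filter on a structured column.
    Step("filter_eu", lambda rows: [r for r in rows if r["region"] == "EU"], ["scan"]),
    # Stand-in for a semantic operator: keyword match instead of a model call.
    Step("is_complaint", lambda rows: [r for r in rows if "complaint" in r["note"]], ["filter_eu"]),
    Step("count", len, ["is_complaint"]),
]

results, provenance = run_plan(plan)
```

Here `results["count"]` answers a question such as "how many EU complaints are there?", while `provenance` lists each step with its inputs and output, which is the kind of record a DAG viewer could render per node.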