The text-to-SQL problem aims to translate natural
language questions into SQL statements to ease the interaction
between database systems and end users. Recently, Large Language Models (LLMs) have exhibited impressive capabilities in a
variety of tasks, including text-to-SQL. While prior works have
explored various strategies for prompting LLMs to generate
SQL statements, they still fall short of fully harnessing the
power of LLMs due to the lack of (1) high-quality contextual
information when constructing the prompts and (2) robust
feedback mechanisms to correct translation errors. To address
these challenges, we propose MageSQL, a text-to-SQL approach
based on in-context learning over LLMs. MageSQL explores a
suite of techniques that leverage the syntax and semantics of
SQL queries to identify relevant few-shot demonstrations as
context for prompting LLMs. In particular, we introduce a graph-based demonstration selection method, the first of its kind
for the text-to-SQL problem, that leverages graph contrastive
learning adapted with SQL-specific data augmentation strategies.
Furthermore, an error correction module is proposed to detect
and fix potential inaccuracies in the generated SQL query. We
conduct comprehensive evaluations on several benchmark
datasets. The results show that our proposed methods outperform
state-of-the-art approaches by a clear margin.