In this work, we propose an end-to-end framework named Starmie. Dataset discovery from data lakes is a critical way to utilize open-domain data within the enterprise. To overcome the issues stemming from data quality and incomplete metadata in data lakes, it is essential to support table union search: given a query table and a collection of data lake tables, find all tables that are unionable with the query table.
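To make the table union search problem statement concrete, here is a minimal toy sketch in Python. It is not Starmie's method: it assumes tables are plain dictionaries of column values and scores unionability by Jaccard overlap of cell values, a deliberately naive stand-in. The names `unionability`, `table_union_search`, and the `threshold` parameter are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Assumption: a table is a mapping from column name to its cell values.
Table = Dict[str, List[str]]


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two value sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def unionability(query: Table, candidate: Table) -> float:
    """Average, over query columns, of the best value overlap with any candidate column.

    Simplification: two query columns may match the same candidate column.
    """
    if not query or not candidate:
        return 0.0
    cand_sets = [set(values) for values in candidate.values()]
    best = [max(jaccard(set(q), c) for c in cand_sets) for q in query.values()]
    return sum(best) / len(best)


def table_union_search(query: Table, lake: Dict[str, Table],
                       threshold: float = 0.5) -> List[str]:
    """Return names of data lake tables scoring at or above the threshold, best first."""
    scored = [(unionability(query, table), name) for name, table in lake.items()]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]


# Made-up example data: column names differ, so metadata alone would not help.
query = {"city": ["Tokyo", "Osaka"], "country": ["Japan", "Japan"]}
lake = {
    "cities_a": {"town": ["Tokyo", "Kyoto", "Osaka"], "nation": ["Japan"]},
    "products": {"sku": ["A1", "B2"], "price": ["10", "20"]},
}
print(table_union_search(query, lake))
```

In this example the column names of `cities_a` do not match the query's, so the match comes from the cell values alone, which is the situation that incomplete metadata creates. Exact value overlap is brittle when values are dirty or disjoint, so treat this only as an illustration of the problem's input and output.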
The ACM SIGMOD conference is the leading forum for the principles, techniques, and applications of database management systems and data management technology. There were 26 sponsors for SIGMOD this year, and Megagon Labs was a Silver sponsor. The conference consisted of the research track, the industry track, the demonstration track, 11 tutorials, and 10 workshops.
At Megagon Labs, we are working on symbiotic models and systems (Figure 1) that take advantage of LLMs as well as structured (knowledge bases [KBs], knowledge graphs [KGs], databases [DBs], etc.) and unstructured (text) information in a continuous and (semi-)automated machine-learning paradigm. In this post, we describe Megagon KnowledgeHub and how our research and development benefits from it.
We shine a spotlight on three cutting-edge AI projects that have been making waves in the industry: ZETT, CoCoSum, and ESE. These initiatives offer a glimpse into the future of AI and its transformative impact across various domains.
At Megagon Labs, we see bringing on interns as more than just hiring a short-term helping hand. As we welcome spring and summer interns, we’d like to share with you how we foster an environment of growth and career development for both mentors and interns.