Knowledge Representation & Reasoning

Real-world AI applications, particularly in the Human Resources (HR) domain, demand fairness, factuality, controllability, consistency, interpretability, and reasoning. Our vision is to investigate and build symbiotic systems in which language models (LMs) and knowledge graphs (KGs) exploit each other in a continuous, (semi-)automated learning paradigm to address these challenges. We are tackling many technical problems: zero-shot methods for extracting knowledge, designing conceptual KGs, learning universal representations in KGs, exploiting symbolic, neural, and hybrid models, and fusing LMs with KGs.

We apply our research to build next-generation knowledge graphs for the HR domain that can drive many AI applications. These uses include matching candidates to jobs, developing a deep understanding of skills and occupations, identifying skills gaps, and recommending career opportunities.

Related Projects:

KnowledgeHub

In the KnowledgeHub (KH) project, we exploit knowledge from many different sources, both structured and unstructured, to address some current limitations of large language models (LLMs). Specifically, we envision KH as a symbiotic system that couples the knowledge represented in LLMs with structured data in knowledge graphs and relational databases, as well as unstructured data in text corpora.

ZETT: Zero-shot Triplet Extraction by Template Infilling

In this project, we hypothesize that relation triplet extraction can be reformulated so that it aligns with the pre-training objective of large pre-trained language models. This can enable the models to leverage knowledge acquired during pre-training and generalize better to unseen relations.
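To make the reformulation concrete, the following is a minimal sketch of how a relation could be expressed as a fill-in-the-blank template, assuming a T5-style span-infilling model that uses sentinel tokens such as `<extra_id_0>`. The relation names, templates, and helper functions are hypothetical illustrations, not the project's actual code, and no model is called here; the sketch only builds the model input and parses a model-style output.

```python
import re

# Hypothetical relation templates (assumption): <X> and <Y> mark the
# head-entity and tail-entity slots of the relation.
TEMPLATES = {
    "place_of_birth": "<X> was born in <Y>.",
    "employer": "<X> works for <Y>.",
}


def build_infilling_input(sentence, template):
    """Append the template to the sentence, with slots replaced by
    T5-style sentinel tokens so the model can fill them in."""
    masked = template.replace("<X>", "<extra_id_0>").replace("<Y>", "<extra_id_1>")
    return f"{sentence} {masked}"


def parse_infilled_output(output):
    """Parse a T5-style output such as
    '<extra_id_0> head <extra_id_1> tail <extra_id_2>' into (head, tail).
    Returns None when fewer than two spans were generated."""
    parts = re.split(r"<extra_id_\d+>", output)
    spans = [p.strip() for p in parts if p.strip()]
    if len(spans) < 2:
        return None
    return spans[0], spans[1]


def to_triplet(relation, output):
    """Combine the filled-in entities with the relation name."""
    entities = parse_infilled_output(output)
    if entities is None:
        return None
    head, tail = entities
    return (head, relation, tail)
```

In a zero-shot setting, an unseen relation would only need a new template entry. Choosing among several candidate relations, for example by comparing the model's scores for each filled template, is left out of this sketch.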

ESE: Low Resource Entity Set Expansion

Entity set expansion (ESE) is the task of expanding a given seed set of entities (e.g., ‘mini bar’, ‘tv unit’) for a concept (e.g., room features) using a textual corpus. The task is typically studied under low-resource settings since obtaining large-scale training data is expensive.
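The task setup can be illustrated with a simple baseline that needs no training data: rank candidate entities by how much their sentence-level context overlaps with the context of the seed entities. This is a minimal sketch under stated assumptions, not the method used in the project; the function names, the lowercase substring matching, and the overlap score are all illustrative choices.

```python
from collections import Counter


def context_profile(entity, corpus):
    """Count the words that co-occur with the entity in the same sentence.
    Assumes the entity is given in lowercase and matched as a substring."""
    profile = Counter()
    for sentence in corpus:
        lowered = sentence.lower()
        if entity in lowered:
            # Remove the entity mention itself so only its context is counted.
            remainder = lowered.replace(entity, " ")
            profile.update(remainder.split())
    return profile


def expand(seeds, candidates, corpus, top_k=2):
    """Return the top_k candidates whose contexts best overlap the seeds'."""
    seed_profile = Counter()
    for seed in seeds:
        seed_profile.update(context_profile(seed, corpus))

    scored = []
    for candidate in candidates:
        if candidate in seeds:
            continue
        profile = context_profile(candidate, corpus)
        # Overlap score: shared word counts between candidate and seed contexts.
        overlap = sum(min(count, seed_profile[word]) for word, count in profile.items())
        scored.append((overlap, candidate))

    # Highest overlap first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [candidate for _, candidate in scored[:top_k]]
```

For example, with the seeds 'mini bar' and 'tv unit' and a small corpus of hotel-room descriptions, a candidate such as 'coffee maker' that appears in similar sentences would rank above one that appears only in unrelated contexts.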