Human-Centered AI

Rapid advances in large language models (LLMs) have transformed machine learning (ML) tasks in and beyond natural language processing (NLP) by making complex capabilities accessible to a broader range of users. This accessibility, however, also presents significant challenges, particularly in effectively incorporating humans into the ML lifecycle. From model training to decision-making, meaningfully integrating human input remains critical for improving model performance, enhancing fairness, and aligning outputs with user intentions and ethical standards. It has therefore become increasingly important to build systems that thoughtfully bridge the gap between human expertise and AI-driven processes.

At Megagon Labs, the HAI team’s mission is to advance human-centered AI by facilitating seamless human-AI collaboration, with a particular emphasis on interactions between humans and LLMs. We develop innovative tools, workflows, and solutions that empower diverse stakeholders throughout the ML lifecycle to contribute their knowledge and preferences. We create conversational interfaces and interactive planning solutions that bridge the gap between users and intelligent agents in compound AI systems. Through these efforts, we aim to redefine how humans and AI systems work together, enabling more intuitive, transparent, and impactful collaborations in complex, real-world contexts.

Highlighted Projects

FactLens

FactLens is a benchmark for evaluating fine-grained fact verification, providing metrics and automated evaluators of sub-claim quality for complex claims.
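To make the task concrete, here is a minimal, self-contained sketch of the decompose-then-verify pattern that fine-grained fact verification evaluates. Everything below is a naive stand-in invented for illustration: a real pipeline would use an LLM to decompose claims and an entailment model to verify sub-claims, and none of this is FactLens's actual implementation.

```python
from dataclasses import dataclass

# Illustration only: decompose a complex claim into atomic sub-claims,
# verify each against evidence, then aggregate a claim-level score.
# `decompose` and `verify` are naive string heuristics standing in for
# an LLM decomposer and an entailment checker (not FactLens code).

@dataclass
class SubClaim:
    text: str
    supported: bool = False

def decompose(claim: str) -> list[SubClaim]:
    # Stand-in: split a conjunctive claim on "and".
    return [SubClaim(part.strip()) for part in claim.split(" and ")]

def verify(sub: SubClaim, evidence: list[str]) -> SubClaim:
    # Stand-in: treat word containment in an evidence sentence as support.
    words = set(sub.text.lower().split())
    sub.supported = any(words <= set(e.lower().split()) for e in evidence)
    return sub

claim = "the company was founded in 1999 and it is headquartered in tokyo"
evidence = ["the company was founded in 1999 by two engineers",
            "it is headquartered in tokyo"]
results = [verify(s, evidence) for s in decompose(claim)]
for r in results:
    print(f"{r.text!r}: {'supported' if r.supported else 'unverified'}")
# One possible claim-level metric: the fraction of supported sub-claims.
print("claim-level score:", sum(r.supported for r in results) / len(results))
```

Evaluating the quality of the decomposition itself (are sub-claims atomic, faithful, and complete?) is exactly where sub-claim-level metrics and automated evaluators come in.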

Tyrogue

Tyrogue is an active learning method that employs a hybrid sampling strategy to minimize labeling cost and acquisition latency while providing a framework for adapting to dataset diversity via user guidance.

MEGAnno

MEGAnno is an LLM-equipped, open-source data annotation framework that enables ML practitioners to bootstrap annotation tasks and manage the continual evolution of annotations through the machine learning lifecycle.
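As a rough illustration of the bootstrap-then-verify workflow such a framework supports, consider the sketch below. The `AnnotationService` class and its methods are hypothetical placeholders invented for this page, not MEGAnno's actual API; see the MEGAnno repository for the real interface.

```python
# Hypothetical sketch of an LLM-bootstrapped annotation workflow.
# All names here (AnnotationService, register_llm_agent, run_job, verify)
# are illustrative placeholders, NOT MEGAnno's actual API.

class AnnotationService:
    """Toy stand-in for an annotation back end holding records and labels."""
    def __init__(self, records):
        self.records, self.labels = records, {}

    def register_llm_agent(self, labeler):
        # `labeler` would wrap an LLM call in a real system.
        self.labeler = labeler

    def run_job(self):
        # The LLM agent bootstraps a label for every record.
        self.labels = {r: self.labeler(r) for r in self.records}

    def verify(self, record, corrected_label):
        # A human confirms or overrides an LLM label.
        self.labels[record] = corrected_label

svc = AnnotationService(["loved it", "never again"])
svc.register_llm_agent(lambda text: "positive" if "love" in text else "negative")
svc.run_job()                          # LLM pass over the corpus
svc.verify("never again", "negative")  # human spot-check of an LLM label
print(svc.labels)
```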


Lapras

Lapras is a multi-step human-LLM collaborative framework for effective and explainable annotation. The framework utilizes LLMs’ self-explanation capabilities to explain their labeling decisions and provide signals for human verification.

Related Publications

CHI 2024
Xinru Wang, Hannah Kim, Zhengjie Miao, Kushan Mitra, Sajjadur Rahman
Large language models (LLMs) have shown remarkable performance across various natural language processing (NLP) tasks, indicating their significant potential as data annotators. Although LLM-generated annotations are more cost-effective and efficient to obtain, they are often erroneous for complex or domain-specific tasks and may introduce bias when compared to human annotations. Therefore, instead of completely replacing human annotators with LLMs, we need to leverage the strengths of both LLMs and humans to ensure the accuracy and reliability of annotations. This paper presents a multi-step human-LLM collaborative approach where (1) LLMs generate labels and provide explanations, (2) a verifier assesses the quality of LLM-generated labels, and (3) human annotators re-annotate a subset of labels with lower verification scores. To facilitate human-LLM collaboration, we make use of the LLM’s ability to rationalize its decisions. LLM-generated explanations can provide additional information to the verifier model as well as help humans better understand LLM labels. We demonstrate that our verifier is able to identify potentially incorrect LLM labels for human re-annotation. Furthermore, we investigate the impact of presenting LLM labels and explanations on human re-annotation through crowdsourced studies.
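The control flow of this three-step workflow fits in a few lines. In the sketch below, `llm_label`, `verifier_score`, and `human_annotate` are hypothetical stand-ins for the paper's LLM labeler, verifier model, and human re-annotation step; the point is the routing: verify every LLM label, then send only the least trustworthy ones back to humans.

```python
import random

def llm_label(text: str) -> tuple[str, str]:
    """Stand-in for an LLM returning (label, free-text explanation)."""
    label = random.choice(["positive", "negative"])
    return label, f"The text reads as {label} because ..."

def verifier_score(text: str, label: str, explanation: str) -> float:
    """Stand-in for a verifier scoring label quality in [0, 1].
    The explanation is an extra input signal, as described above."""
    return random.random()

def human_annotate(text: str) -> str:
    """Stand-in for routing an example to a human annotator."""
    return "positive"

def annotate(corpus: list[str], budget: int) -> dict[str, str]:
    # Step 1: the LLM generates labels and explanations.
    drafts = [(t, *llm_label(t)) for t in corpus]
    # Step 2: the verifier assesses the quality of each LLM label.
    scored = sorted(drafts, key=lambda d: verifier_score(*d))
    # Step 3: humans re-annotate the `budget` lowest-scoring examples.
    final = {t: label for t, label, _ in scored}
    for t, _, _ in scored[:budget]:
        final[t] = human_annotate(t)
    return final

print(annotate(["great product", "arrived broken", "meh"], budget=1))
```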
EACL 2024 (Demonstrations)
Large language models (LLMs) can label data faster and cheaper than humans for various NLP tasks. Despite their prowess, LLMs may fall short in understanding complex, sociocultural, or domain-specific context, potentially leading to incorrect annotations. Therefore, we advocate a collaborative approach where humans and LLMs work together to produce reliable and high-quality labels. We present MEGAnno+, a human-LLM collaborative annotation system that offers effective LLM agent and annotation management, convenient and robust LLM annotation, and exploratory verification of LLM labels by humans.
Findings of EMNLP 2022
Seiji Maekawa, Dan Zhang, Hannah Kim, Sajjadur Rahman, and Estevam Hruschka
Recently, active learning (AL) methods have been used to effectively fine-tune pre-trained language models for various NLP tasks such as sentiment analysis and document classification. However, given the task of fine-tuning language models, understanding the impact of different aspects of AL methods, such as labeling cost, sample acquisition latency, and dataset diversity, requires deeper investigation. This paper examines the performance of existing AL methods within a low-resource, interactive labeling setting. We observe that existing methods often underperform in such a setting while exhibiting higher latency and a lack of generalizability. To overcome these challenges, we propose a novel active learning method, TYROGUE, that employs a hybrid sampling strategy to minimize labeling cost and acquisition latency while providing a framework for adapting to dataset diversity via user guidance. Through our experiments, we observe that compared to SOTA methods, TYROGUE reduces the labeling cost by up to 43% and the acquisition latency by as much as 11x, while achieving comparable accuracy. Finally, we discuss the strengths and weaknesses of TYROGUE by exploring the impact of dataset characteristics.
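For readers unfamiliar with hybrid sampling, the sketch below shows a generic active-learning loop that mixes an uncertainty score with a diversity score under a user-tunable weight `alpha`. This is an illustration of the general idea on synthetic data, not TYROGUE's actual strategy.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Generic hybrid-sampling active-learning loop (illustration only; the
# scoring is a textbook mix of uncertainty and diversity, not TYROGUE's).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)       # synthetic labeling task

# Seed with one example per class so the first model can be fit.
labeled = [int(np.flatnonzero(y == 0)[0]), int(np.flatnonzero(y == 1)[0])]
unlabeled = [i for i in range(len(X)) if i not in labeled]

for rnd in range(5):
    model = LogisticRegression().fit(X[labeled], y[labeled])
    proba = model.predict_proba(X[unlabeled])[:, 1]
    uncertainty = 1 - 2 * np.abs(proba - 0.5)       # high near the boundary
    # Diversity: distance to the nearest already-labeled point.
    dists = np.min(np.linalg.norm(
        X[unlabeled][:, None, :] - X[labeled][None, :, :], axis=2), axis=1)
    alpha = 0.5                                     # user-guided mix
    score = alpha * uncertainty + (1 - alpha) * dists / dists.max()
    batch = np.argsort(score)[-8:]                  # query the top-scoring points
    for j in sorted(batch.tolist(), reverse=True):
        labeled.append(unlabeled.pop(j))            # the "human" supplies y[i]
    print(f"round {rnd}: {len(labeled)} labels, accuracy {model.score(X, y):.3f}")
```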
Related Blog Posts
July 31, 2024
MEGAnno is a data annotation framework that combines the power of large language models (LLMs) with human expertise to streamline and enhance the data labeling process. Throughout this article, we’ll showcase MEGAnno’s capabilities with detailed code snippets.
May 8, 2024
Instead of completely replacing human annotators with LLMs, we need to leverage the strengths of both sides to obtain accurate and reliable annotations. This article will discuss how to effectively utilize LLMs as collaborators for data annotation.
March 14, 2024
We introduce our human-LLM collaborative annotation tool, MEGAnno+, addressing the challenges in LLM annotation by integrating human expertise with LLM capabilities.