Understanding user intent is critical to conversational assistants, especially those powered by AI agents. Modern AI systems rely on agentic architectures, which typically consist of a conversational interface that interacts with the user and a planner that interprets user intent and coordinates multiple agents to execute tasks in order to achieve user goals. However, errors in interpreting user intent can lead to flawed plans, reduced user satisfaction, and inefficient task execution.
In real-world conversations, users rarely state their intent clearly in a single, well-defined statement. Users change their requests mid-conversation. They add context later. They drift between related tasks. They introduce extra details that are not relevant to the final goal. All of this makes it difficult for planning systems to maintain a clear understanding of what the user actually wants.
Our paper “RECAP: Rewriting Conversations for Intent Understanding in Agentic Planning” explores a simple but powerful idea: rewrite the conversation into a clear representation of the user’s intent before planning.
The Core Idea Behind RECAP
Traditional intent classification maps user input into predefined categories. While effective in narrow domains, this approach struggles in open environments where user goals are flexible and constantly evolving. When a planner receives an unclear or outdated representation of the user’s intent, subsequent planning and execution inherit the error.
RECAP addresses this challenge by rewriting the multi-turn dialogue into a concise, unambiguous representation that distills relevant context, removes distractions, and aligns with the user’s most recent goals.
As a result, the planner receives cleaner and more consistent input. It no longer needs to parse every nuance of the conversation and can rely on rewrites that are concise, unambiguous, and well-aligned with the user’s objective.
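To make the rewriting step concrete, here is a minimal sketch of a prompt-based rewriter. Everything in it is an assumption made for illustration: the instruction text, the `build_rewrite_prompt` and `rewrite_intent` helpers, and the `fake_llm` stand-in are hypothetical and are not the prompts or models used in the paper.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (speaker, utterance); speaker is "user" or "agent"

# Hypothetical instruction text; the paper's actual prompts are not reproduced here.
REWRITE_INSTRUCTION = (
    "Rewrite the conversation below as one concise, unambiguous statement of "
    "what the user currently wants. Keep relevant context, drop distractions, "
    "and follow the user's most recent goal if it changed."
)


def build_rewrite_prompt(dialogue: List[Turn]) -> str:
    """Format a multi-turn dialogue into a single rewriting prompt."""
    transcript = "\n".join(f"{speaker.upper()}: {text}" for speaker, text in dialogue)
    return f"{REWRITE_INSTRUCTION}\n\nConversation:\n{transcript}\n\nRewritten intent:"


def rewrite_intent(dialogue: List[Turn], llm: Callable[[str], str]) -> str:
    """Ask the injected model for a rewrite and return it with whitespace trimmed."""
    return llm(build_rewrite_prompt(dialogue)).strip()


def fake_llm(prompt: str) -> str:
    # Canned reply standing in for a real model call (assumption for this sketch).
    return " Book a table for four at a Thai restaurant on Saturday evening. "


dialogue = [
    ("user", "Can you find me an Italian place for Friday?"),
    ("agent", "Sure, how many people?"),
    ("user", "Four. Actually, make it Thai, and Saturday evening instead."),
]
intent = rewrite_intent(dialogue, fake_llm)
```

In a real system, `fake_llm` would be replaced by a call to whichever model serves as the rewriter, and the planner would receive `intent` instead of the full transcript.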
A Benchmark for Complex Conversational Challenges
To study this problem, we created the RECAP benchmark, which comprises over 800 user-agent dialogues focused specifically on intent-rewriting challenges in agentic planning.
In contrast to existing datasets that focus on predefined intent categories, the RECAP benchmark targets open-ended conversational settings where user intent may evolve, shift, or span multiple goals across turns. It provides complex dialogue scenarios that enable models to infer and represent intent directly from context, rather than relying on fixed labels.
The RECAP benchmark captures situations that frequently arise in real-world interactions, such as underspecified requests, shifting intent, and multiple goals within a single conversation. Some instructions are inherently vague and require interpretation before execution. While these challenges are common in real AI assistants, they are rarely represented in traditional intent classification benchmarks. RECAP also includes conversations spanning multiple domains and of varying lengths.
What the Experiments Show
The experiments compare different approaches to intent rewriting. Prompt-based rewriting methods consistently outperform passing the raw dialogue to the planner unchanged (a ‘dummy’ rewriter). Basic summarization can lose information, whereas more advanced prompting techniques that preserve relevant information and context yield larger gains in planning effectiveness. When the rewriting model is further trained using Direct Preference Optimization (DPO), performance improves even further.
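DPO trains on preference pairs, each consisting of a prompt, a preferred response, and a rejected response. The sketch below shows one way such pairs could be assembled for a rewriter. It is an illustration under stated assumptions, not the paper's data pipeline: the `build_preference_pairs` helper, the `prefers_a` judge, and the toy judge are hypothetical, and the `prompt`/`chosen`/`rejected` field names follow a convention common in preference-tuning libraries.

```python
from typing import Callable, Dict, List


def build_preference_pairs(
    prompts: List[str],
    candidates_a: List[str],
    candidates_b: List[str],
    prefers_a: Callable[[str, str, str], bool],
) -> List[Dict[str, str]]:
    """Turn two candidate rewrites per prompt into chosen/rejected records."""
    pairs = []
    for prompt, a, b in zip(prompts, candidates_a, candidates_b):
        if a == b:
            continue  # identical rewrites carry no preference signal
        chosen, rejected = (a, b) if prefers_a(prompt, a, b) else (b, a)
        pairs.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return pairs


def toy_judge(prompt: str, a: str, b: str) -> bool:
    # Placeholder judge (assumption): prefers the rewrite that reflects the
    # user's latest request. A real judge would be a human or a model.
    return "Saturday" in a


prompts = ["USER: Friday dinner?\nUSER: Actually, Saturday."]
pairs = build_preference_pairs(
    prompts,
    candidates_a=["Book dinner on Friday."],
    candidates_b=["Book dinner on Saturday."],
    prefers_a=toy_judge,
)
```

Here the rewrite that tracks the user's updated request ends up as `chosen`, which is the kind of signal preference training needs in order to favor rewrites aligned with the most recent goal.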
The takeaway is clear: a more accurate representation of user intent consistently leads to better planning outcomes.
We also introduce a fine-tuned LLM-based evaluator that compares the planning utility of rewrites for a given user-agent dialogue.
Why This Matters for Agentic Systems
For researchers and developers building LLM agents, multi-agent workflows, or tool orchestration systems, these findings highlight a critical insight: many failures in agentic systems occur before planning even begins, as planners often work with incomplete or inaccurate interpretations of user intent.
By introducing an intent-rewriting layer, systems can transform complex, multi-turn conversations into clear, structured goals. This simple architectural addition, which can live inside the planning module itself if need be, not only improves coordination between conversational interfaces and planning modules but also reduces the risk of downstream agents acting on outdated or misunderstood instructions.
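As a rough sketch of where such a layer sits, the snippet below wraps an existing planner so that it always sees a rewritten intent rather than the raw transcript. All names here are hypothetical, and both the last-user-turn rewriter and the toy planner are deliberately naive placeholders, not RECAP's method.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (speaker, utterance)
Rewriter = Callable[[List[Turn]], str]
Planner = Callable[[str], List[str]]


def with_intent_rewriting(
    planner: Planner, rewriter: Rewriter
) -> Callable[[List[Turn]], List[str]]:
    """Wrap a planner so the dialogue is rewritten before planning."""
    def plan_from_dialogue(dialogue: List[Turn]) -> List[str]:
        intent = rewriter(dialogue)
        return planner(intent)
    return plan_from_dialogue


def passthrough_rewriter(dialogue: List[Turn]) -> str:
    # Baseline: hand the planner the raw transcript unchanged.
    return "\n".join(f"{speaker}: {text}" for speaker, text in dialogue)


def last_user_turn_rewriter(dialogue: List[Turn]) -> str:
    # Naive placeholder (assumption): keep only the latest user utterance.
    return [text for speaker, text in dialogue if speaker == "user"][-1]


def toy_planner(intent: str) -> List[str]:
    # Placeholder planner (assumption): one step per "and"-joined clause.
    return [step.strip() for step in intent.split(" and ")]


dialogue = [
    ("user", "Book a flight to Paris"),
    ("agent", "Which dates?"),
    ("user", "Find a hotel in Rome and rent a car"),
]
plan = with_intent_rewriting(toy_planner, last_user_turn_rewriter)(dialogue)
baseline_plan = with_intent_rewriting(toy_planner, passthrough_rewriter)(dialogue)
```

With the rewriter in place, the toy plan contains only the hotel and car steps, while the baseline plan's first step still carries the stale flight request. Because the wrapper only composes two callables, the same layer can sit between the interface and the planner or be folded into the planner itself.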
As agentic systems become more sophisticated, the focus often falls on improving reasoning, adding more tools, or building better planners. RECAP demonstrates that the interface between conversation and planning is just as important as the reasoning algorithms themselves. Starting with a clear representation of user intent allows the entire planning pipeline to operate more efficiently and effectively, improving task execution across agents.
Read the Paper
If you are building conversational agents, planning systems, or multi-agent LLM pipelines, this research offers an important insight: better intent representations can lead to better plans.
Written by Kushan Mitra and Megagon Labs