agents recommendation-systems user-preferences benchmark dialog-systems

Agents fail to teach users what they want

No interactive recommender agent exceeds 56% accuracy on CoShop, because the conversation does little to expand what users know about what they want. The bottleneck is preference formation, not item search.

Summary

If you're building agentic recommendation or preference-elicitation systems, this paper quantifies a hard constraint: clarifying questions alone don't work when users lack the domain knowledge to answer them. Your agent needs explicit teaching mechanisms (examples, explanations) so that users can actually specify the task.

Why it matters

Implementation verdict

This doesn't replace existing systems yet; it's a diagnostic. The CoShop benchmark shows that five-turn interactions with frontier models don't actually educate users about their own preferences. If you're shipping an agent that relies on users already knowing what they want, treat this as a warning: invest in knowledge-building dialog actions before optimizing search.
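One way to read "knowledge-building dialog actions" is as a policy that teaches before it asks and asks before it searches. The sketch below is illustrative only and is not the paper's method or part of CoShop. The names `UserModel`, `next_action`, and the attribute names in the usage note are hypothetical, and the sketch assumes your agent can track which product attributes a user has shown they understand.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class UserModel:
    """Hypothetical per-conversation state; not taken from the paper."""
    # Attributes the user has shown they understand.
    understood: Set[str] = field(default_factory=set)
    # Attribute name -> preference the user has stated.
    preferences: Dict[str, str] = field(default_factory=dict)


def next_action(user: UserModel, attributes: List[str]) -> Tuple[str, Optional[str]]:
    """Pick the next dialog action for the first unresolved attribute.

    Order of preference (an assumption of this sketch):
      1. "teach"   - explain an attribute the user does not yet understand
      2. "clarify" - ask for a preference on an attribute they do understand
      3. "search"  - every attribute has a stated preference
    """
    for attr in attributes:
        if attr in user.preferences:
            continue  # already resolved, move to the next attribute
        if attr not in user.understood:
            return ("teach", attr)
        return ("clarify", attr)
    return ("search", None)
```

With hypothetical attributes `["sensor_size", "budget"]` and an empty `UserModel`, the policy first returns a teach action for `sensor_size`, then a clarify action once that attribute is marked understood, and a search action only after both attributes have stated preferences. The point of the ordering is that a clarifying question is never asked about something the user has not been given the knowledge to answer.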

Sources

1. No agent exceeds 56% accuracy on CoShop despite five turns of interaction.
2. Failures stem not from agents' ability to find items, but from how little the interaction expands what users know about what they want.
3. Users often lack the domain knowledge to have completely specified preferences.

Dev Signal
