UCP Playground at 1,000+ Agent Sessions: What 16 Models and 97 Real Stores Reveal About AI Shopping

DEV Community·Benji Fisher·28 days ago

Two and a half months ago we published Why We Built UCP Playground, which closed on 114 agent sessions and an honest acknowledgement that the dataset was thin: most models had single-digit sample sizes, store coverage was uneven, and the headline rates moved meaningfully with every new run. A month later we crossed a different threshold: the first fully autonomous AI agent purchase through UCP, with a Gemini agent searching, adding to cart, linking identity, paying, and completing checkout at houseofparfum.nl without a human past the initial prompt.

Eighty days on from the first post, and roughly forty days after that autonomous purchase, the dataset is in a different shape:

- Over 1,000 agent shopping sessions captured end-to-end with full tool-call timelines and replayable event streams
- 16 frontier models: every major lab, plus a reasoning-tuned subset
- 97 distinct UCP-enabled stores across Shopify, WooCommerce, BigCommerce, Magento, PrestaShop, and custom stacks
- $96,032 of agent-driven cart value generated,…
