Beni x AI: How we use GPT to figure out what you’re shopping for with Beni
Post 3 of 3 (until we do more) breaking down the role of generative AI in helping Beni deliver on its mission to make resale accessible to everyone.
When you’re shopping online with Beni, we need to figure out what you’re shopping for so that we can show you relevant secondhand versions of it. When you’re on a merchant page (think Nike, Patagonia, etc.), we collect some product-level information from the page and use it to perform a search. We look at the product image, the product title, the price, the brand, and so on, and all of that is factored into our search and ranking algorithm (learn more about our ranking algorithm in our previous post).
Here’s a great visual example of the kinds of information that are relevant to us:
Our product catalog is structured much like how you might organize your closet: we keep all of our pants in one “drawer”, all of our shirts in another, etc. When you are shopping for pants, we are looking for matching pants (but secondhand and for half the price). We don’t want to search through our whole catalog of 200M+ listings to find that exact item, because it would take us way too long (and you probably won’t be waiting around). Instead, we want to go to our “pants drawer” and search there specifically. That makes determining the category of the product you are looking at quite important for us: when you’re shopping for pants, we don’t want to be searching where we store the shirts and showing you shirts.
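To make the drawer idea concrete, here’s a toy sketch in Python. It is a simplified illustration and not our production system: the sample listings, the field names, and the helpers build_drawers and search_drawer are all made up for this example, and the title matching is a stand-in for a real search.

```python
from collections import defaultdict

# Hypothetical toy catalog; real listings would carry images, prices, and more.
LISTINGS = [
    {"id": 1, "category": "pants", "title": "Levi's 501 jeans"},
    {"id": 2, "category": "shirts", "title": "Patagonia flannel shirt"},
    {"id": 3, "category": "pants", "title": "Nike running pants"},
]


def build_drawers(listings):
    """Group listings by category so a search only touches one 'drawer'."""
    drawers = defaultdict(list)
    for listing in listings:
        drawers[listing["category"]].append(listing)
    return drawers


def search_drawer(drawers, category, query):
    """Return listings in one category whose title contains every query word."""
    words = query.lower().split()
    return [
        listing
        for listing in drawers.get(category, [])
        if all(word in listing["title"].lower() for word in words)
    ]


drawers = build_drawers(LISTINGS)
print(search_drawer(drawers, "pants", "nike pants"))
```

The point is that search_drawer only ever scans the listings in one category, so the cost of a lookup depends on the size of the drawer rather than the size of the whole catalog.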
In most cases, we can figure out pretty easily that you’re looking at pants by performing a keyword search with our bank of keywords defining pants (pants, trousers, jeans, etc.) against the text descriptors that we can find on the page (breadcrumbs, product title, etc.). In other cases, the text descriptors that are provided are not actually super helpful in determining the product category.
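A keyword match like this can be sketched in a few lines of Python. Treat it as an illustration under assumptions: only the pants keywords (pants, trousers, jeans) come from this post, while the other keyword lists, the scoring rule, and the function name detect_category are invented for the example.

```python
import re

# Hypothetical keyword bank; only the pants entries come from the post.
CATEGORY_KEYWORDS = {
    "pants": {"pants", "trousers", "jeans"},
    "shirts": {"shirt", "shirts", "tee", "blouse"},
    "jackets": {"jacket", "jackets", "coat", "parka"},
}


def detect_category(*descriptors):
    """Return the category with the most keyword hits, or None if nothing matches."""
    tokens = set()
    for text in descriptors:
        # Lowercase and keep alphabetic runs only, so "Pants" and "pants," both match.
        tokens.update(re.findall(r"[a-z]+", text.lower()))
    best_category, best_hits = None, 0
    for category, keywords in CATEGORY_KEYWORDS.items():
        hits = len(tokens & keywords)
        if hits > best_hits:
            best_category, best_hits = category, hits
    return best_category


print(detect_category("Men's Slim Fit Jeans", "Men > Clothing > Pants"))
print(detect_category("Bernardo Hooded Quilted Puffer Walker", "Product Detail | A566808"))
```

The first call finds two pants keywords and returns a category. The second finds no keyword at all and returns None, which is exactly the kind of page where plain keyword matching runs out of road.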
Here’s an example. On QVC lives a product called the Bernardo Hooded Quilted Puffer Walker. The product page looks like this (also, check out that 50% off steal!):
In cases like this, we use a generative AI model, namely GPT, to help us extract additional product descriptors from the information that we have available. What is GPT? You might have heard of it in the context of ChatGPT, but it’s essentially a Large Language Model (LLM) that can generate new, coherent, and contextually relevant text based on the inputs it has received. It’s pre-trained on a large corpus of text data, encompassing diverse topics and domains, which enables the model to learn the general structure and nuance of human language.
Here’s how it works. We feed it the information we do have (you can see it’s not super useful):
Product Title: Bernardo Hooded Quilted Puffer Walker
Breadcrumbs: Product Detail | A566808
And we ask it to return some additional product tags. It responds with:
Apparel
Outerwear
Jackets
Puffer Jackets
Bernardo Hooded Quilted Puffer Walker
Aha! Now we have a lot more information about what the potential product category might be, and we can go searching through our catalog of jackets. And that’s how we use GPT to help us figure out what you’re shopping for with Beni!
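For the curious, the flow above can be sketched in Python roughly like this. It is a sketch under assumptions, not our actual integration: the prompt wording, the function names, and the ask_llm parameter are hypothetical, and fake_llm is a stand-in that simply returns the response quoted above instead of calling a real model.

```python
def build_prompt(title, breadcrumbs):
    """Assemble the page descriptors into a request for product tags."""
    return (
        "Suggest product category tags, one per line, from broad to specific.\n"
        f"Product Title: {title}\n"
        f"Breadcrumbs: {breadcrumbs}"
    )


def parse_tags(response):
    """Split a line-per-tag response into a clean list, dropping bullets and blanks."""
    tags = []
    for line in response.splitlines():
        tag = line.strip().lstrip("-*• ").strip()
        if tag:
            tags.append(tag)
    return tags


def tags_for_product(title, breadcrumbs, ask_llm):
    """ask_llm is any callable that takes a prompt string and returns text."""
    return parse_tags(ask_llm(build_prompt(title, breadcrumbs)))


# Stand-in for a real model call, returning the response quoted in the post.
def fake_llm(prompt):
    return (
        "Apparel\nOuterwear\nJackets\nPuffer Jackets\n"
        "Bernardo Hooded Quilted Puffer Walker"
    )


tags = tags_for_product(
    "Bernardo Hooded Quilted Puffer Walker", "Product Detail | A566808", fake_llm
)
print(tags)
```

Passing the model in as a plain callable keeps the sketch runnable without an API key. The returned tags can then go through the same keyword matching as any other text descriptor, which is how "Jackets" ends up pointing at the jackets drawer.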
Celine, your work is so great! Thanks Bryant for sharing the post with me.
If possible, I would like to ask Celine, with so many eCommerce sites and so many product categories, how do you organize them reasonably?
Awesome. Beni is definitely going to be a great product.
If I understand correctly, after the classification, the number of pictures that need to be compared in each bucket should not be too large. But I am still curious: could you say a little about the algorithm you use for picture similarity? Can any existing AI tools help with this?