By implementing product recognition in your store, you can improve both your store operations and your customer experience. Annual counts and cycle counts can be completed faster. Automated shelf monitoring can alert your staff when stock runs low. The omnichannel experience of your website and app customers can improve because you can show them product availability in real time.
A Deep Learning Pipeline for Visual Search
A visual search engine is at the heart of this system because every subsequent step depends on it. So the first step is to implement a visual search engine for products using a deep learning image processing pipeline, a vector database for image recognition and, if needed, an optical character recognition (OCR) and information extraction pipeline for fine-grained image recognition.
Deciding on Object Detection or Instance Segmentation
The deep learning pipeline can implement either object detection or instance segmentation to isolate a product from its surroundings in an image. Object detection marks each product with a rectangular bounding box, while instance segmentation outlines it with a pixel-level mask. Both are based on convolutional neural networks (CNNs) for visual pattern recognition, but each has inherent benefits and drawbacks that you should evaluate against the conditions in your retail environment.
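To make the difference between the two outputs concrete, here is a minimal sketch in Python. It assumes a toy grayscale image stored as a nested list, and the helper names `crop_to_box` and `apply_mask` are hypothetical, not part of any detection library. In a real pipeline the box or mask would come from a trained model; here the mask is simply derived from the pixel values for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]          # toy grayscale image: rows of pixel values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def crop_to_box(image: Image, box: Box) -> Image:
    """Object-detection style: keep every pixel inside the rectangle."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def apply_mask(image: Image, mask: List[List[bool]], fill: int = 0) -> Image:
    """Instance-segmentation style: keep only pixels the mask marks as product."""
    return [
        [pixel if keep else fill for pixel, keep in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# 4x4 toy image: 9 = product pixel, 1 = shelf background (made-up values)
image = [
    [1, 1, 1, 1],
    [1, 9, 1, 1],
    [1, 9, 9, 1],
    [1, 1, 1, 1],
]
box = (1, 1, 3, 3)

# Assumption: a perfect mask, derived here from the pixel values themselves.
mask = [[pixel == 9 for pixel in row] for row in image]

# The box crop still contains a background pixel in its top-right corner.
box_crop = crop_to_box(image, box)

# The masked crop replaces that background pixel with the fill value.
mask_crop = crop_to_box(apply_mask(image, mask), box)
```

The box crop is simpler to produce but carries some shelf background into the next stage, whereas the masked crop isolates the product more cleanly at the cost of a heavier model.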
Image Embeddings
Whichever model you use, it computes a numerical vector, called an image embedding, for each product. An image embedding is the result of the neural network finding the local and global image features that characterize a product and encoding them as a vector of numbers. Each SKU gets a distinctive embedding because each SKU has its own combination of shape, textures, colors, text, barcode, and other features. These embeddings are essential to our visual search, as we'll soon see.
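Embeddings are useful because images of the same product tend to produce vectors that lie close together. The sketch below illustrates this with cosine similarity, one common way to compare embeddings; the article does not say which metric its pipeline uses, so treat that choice as an assumption. The four-dimensional vectors are made-up values for illustration, since real embeddings typically have hundreds of dimensions.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical embeddings (illustrative values only, not model output)
cola_front = [0.9, 0.1, 0.3, 0.0]   # a cola can photographed head-on
cola_angled = [0.8, 0.2, 0.3, 0.1]  # the same can from a slight angle
chips = [0.1, 0.9, 0.0, 0.4]        # a different product

same_product = cosine_similarity(cola_front, cola_angled)
different_product = cosine_similarity(cola_front, chips)
```

With these values, the two cola vectors score much higher against each other than the cola and chips vectors do, which is the property a visual search relies on.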
Vector Database
A vector database stores image embeddings and lets us query them with the embedding of a new image to find the closest matches for object recognition. You need one so that when a product is shown to the system, it can check whether the same product, or a visually indistinguishable one, has already been registered and, if so, retrieve its product details.
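The register-then-look-up behavior described above can be sketched with a small in-memory index. This is a toy stand-in rather than a real vector database: the class `ToyVectorIndex`, its methods, the `min_similarity` threshold of 0.9, and the sample SKUs are all hypothetical. It scans every entry linearly, whereas production vector databases use approximate nearest-neighbor indexes to stay fast at scale.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence, Tuple


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class ToyVectorIndex:
    """Hypothetical in-memory index mapping embeddings to product details."""
    entries: List[Tuple[List[float], Dict[str, str]]] = field(default_factory=list)

    def register(self, embedding: Sequence[float], details: Dict[str, str]) -> None:
        """Store a product's embedding alongside its details."""
        self.entries.append((list(embedding), details))

    def search(
        self, query: Sequence[float], min_similarity: float = 0.9
    ) -> Optional[Tuple[float, Dict[str, str]]]:
        """Return (score, details) of the closest product, or None if no
        registered product is similar enough to count as a match."""
        best: Optional[Tuple[float, Dict[str, str]]] = None
        for embedding, details in self.entries:
            score = _cosine(query, embedding)
            if best is None or score > best[0]:
                best = (score, details)
        if best is None or best[0] < min_similarity:
            return None
        return best


index = ToyVectorIndex()
# Made-up embeddings and SKUs for illustration
index.register([0.9, 0.1, 0.3, 0.0], {"sku": "COLA-330", "name": "Cola 330 ml"})
index.register([0.1, 0.9, 0.0, 0.4], {"sku": "CHIPS-150", "name": "Salted chips 150 g"})

match = index.search([0.8, 0.2, 0.3, 0.1])    # close to the cola embedding
unknown = index.search([0.0, 0.0, 1.0, 0.0])  # not close to anything registered
```

The similarity threshold is what lets the system say "this product has not been registered" instead of always returning its nearest neighbor, however distant.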