Zomato used SBERT for text-based clustering of addresses, overcoming a key limitation of standalone word embeddings, which cannot capture the order in which words appear in a sentence.
SBERT made it possible to process addresses of varying lengths uniformly, producing embeddings of a consistent size that meaningfully represent whole sentences.
These fixed-length embedding vectors can then be clustered with DBSCAN, generating one final label for the many different address strings that customers enter for the same location.
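As a concrete illustration, here is a minimal sketch of that pipeline using the open-source sentence-transformers and scikit-learn libraries. The model name, the sample addresses, and the DBSCAN parameters (eps, min_samples) are illustrative assumptions, not Zomato's production setup.

```python
from sentence_transformers import SentenceTransformer
from sklearn.cluster import DBSCAN

# Differently written addresses, three of which refer to the same location.
addresses = [
    "Flat 4B, Sunrise Apartments, MG Road, Bengaluru",
    "4B Sunrise Apts, M.G. Road, Bangalore 560001",
    "Sunrise Apartments Flat no. 4-B, MG Rd, Bengaluru",
    "22 Lake View Street, Koramangala, Bengaluru",  # a different location
]

# Encode each address into a fixed-length sentence embedding,
# regardless of how long the address string is.
model = SentenceTransformer("all-MiniLM-L6-v2")  # hypothetical model choice
embeddings = model.encode(addresses, normalize_embeddings=True)

# Cluster the embeddings; cosine distance suits normalized sentence vectors.
# eps and min_samples are placeholders that would need tuning on real data.
clustering = DBSCAN(eps=0.35, min_samples=2, metric="cosine").fit(embeddings)

# Each cluster id acts as the single canonical label for that location.
for address, label in zip(addresses, clustering.labels_):
    print(label, address)
```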
SBERT is a model that, like BERT, learns words and their meanings, but it is additionally trained to judge whether two sentences mean the same thing: it summarizes an entire sentence in a single representation rather than producing only word-by-word outputs.
The Siamese network structure of SBERT, where two identical BERT models share weights, allows for a direct comparison between two input sentences or addresses.
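At inference time, the shared-weight encoder is simply applied to each input independently, and the resulting embeddings are compared, typically with cosine similarity. A hedged sketch, again assuming the sentence-transformers library and an arbitrary model and threshold:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # hypothetical model choice

# The same encoder (the shared-weight "tower") embeds both addresses.
a = model.encode("Flat 4B, Sunrise Apartments, MG Road, Bengaluru")
b = model.encode("4B Sunrise Apts, M.G. Road, Bangalore 560001")

# Cosine similarity between the two sentence embeddings; values near 1.0
# suggest the two addresses describe the same place.
score = util.cos_sim(a, b).item()
print(f"similarity: {score:.3f}")
if score > 0.8:  # hypothetical threshold, not a published value
    print("Likely the same location")
```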
Encoders in Transformers are used to analyze and encode input sequences into a rich, contextual representation, while decoders generate the output sequence step by step using information from the encoder.
Transformers process all input tokens simultaneously, making training faster and more efficient than traditional recurrent neural networks.
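To make the encoder side concrete, the following PyTorch sketch (an assumed stack, not anything tied to Zomato or SBERT specifically) passes a whole token sequence through a small Transformer encoder in one shot, yielding a contextual representation for every position simultaneously rather than step by step:

```python
import torch
import torch.nn as nn

vocab_size, d_model, seq_len = 1000, 64, 12  # illustrative sizes

embed = nn.Embedding(vocab_size, d_model)
encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                           batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)

# One batch of token ids; all 12 positions are processed in parallel.
tokens = torch.randint(0, vocab_size, (1, seq_len))
contextual = encoder(embed(tokens))

# Each position's vector now carries context from every other position.
print(contextual.shape)  # torch.Size([1, 12, 64])
```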
Zomato's use of SBERT enabled it to group distinct address strings, reducing discrepancies in delivery-cost calculations and cutting the time and resources wasted when orders to the same address are split across different delivery groups, a costly problem for any last-mile delivery aggregator.
Large Language Models, such as ChatGPT and BERT, are trained on vast amounts of data and used for a wide range of language-related tasks, including understanding and generating human-like text.