TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.


Large Language Models: SBERT — Sentence-BERT

Learn how siamese BERT networks accurately transform sentences into embeddings

8 min read · Sep 12, 2023


Introduction

It is no secret that transformers brought major progress to NLP, and many other machine learning models have been built on top of them. One of them is BERT, which consists primarily of several stacked transformer encoders. Besides being applied to a range of problems such as sentiment analysis and question answering, BERT became increasingly popular for constructing word embeddings: vectors of numbers that represent the semantic meanings of words.

Representing words as embeddings is a huge advantage: machine learning algorithms cannot work with raw text, but they can operate on vectors of numbers. This makes it possible to compare words by similarity using a standard metric such as Euclidean or cosine distance.
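To make the comparison concrete, here is a minimal sketch of both metrics in plain Python. The three-dimensional vectors and the words attached to them are made up for illustration; real BERT embeddings have hundreds of dimensions and come from the model, not from hand-picked numbers.

```python
import math


def euclidean_distance(u, v):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """Cosine distance, defined here as 1 minus cosine similarity."""
    return 1.0 - cosine_similarity(u, v)


# Hypothetical toy embeddings, chosen by hand for illustration only.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

for other in ("dog", "car"):
    e = euclidean_distance(embeddings["cat"], embeddings[other])
    c = cosine_distance(embeddings["cat"], embeddings[other])
    print(f"cat vs {other}: euclidean={e:.3f}, cosine distance={c:.3f}")
```

With these toy values, "cat" ends up closer to "dog" than to "car" under both metrics, which is the kind of behavior one hopes for from embeddings that capture meaning.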

The problem is that, in practice, we often need embeddings not for single words but for whole sentences. However, the basic version of BERT builds embeddings only at the word level. Because of this, several BERT-like approaches were later developed to solve the problem, and they are discussed in this article. By going through them one by one, we will arrive at the state-of-the-art model called…



Published in TDS Archive



Written by Vyacheslav Efimov

Senior ML Engineer 👨‍💻 | Passionate about Data Science ⭐️ | Content Creator ✍️