Introduction

Why Vector DB

Why build a specialized database instead of using an existing one?

Similarity/Semantic Search

Similarity/Semantic Search vs. Keyword Search

Hybrid Search

Dimensionality reduction

Dimensionality reduction is an extremely powerful technique because it lets us take almost any object and map it to a small, convenient vector in some space. This space is generally called latent because we don't necessarily have any prior notion of what its axes mean. What we care about is that similar objects end up close to each other.
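
As a rough illustration, here is a minimal sketch of one simple reduction technique, random projection. The notes do not name a specific method, so this choice is an assumption, and the function names (random_projection_matrix, project) are hypothetical. It multiplies a high-dimensional vector by a random Gaussian matrix, which tends to roughly preserve relative distances between points.

```python
import math
import random

def random_projection_matrix(in_dim, out_dim, seed=0):
    """Build an out_dim x in_dim matrix of Gaussian entries scaled by 1/sqrt(out_dim)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(out_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
            for _ in range(out_dim)]

def project(matrix, vector):
    """Map a high-dimensional vector to the low-dimensional space."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Example: reduce 1000-dimensional vectors to 32 dimensions.
rng = random.Random(1)
a = [rng.random() for _ in range(1000)]
b = [x + 0.01 for x in a]                 # a small perturbation of a
c = [rng.random() for _ in range(1000)]   # an unrelated vector
R = random_projection_matrix(1000, 32)
near = euclidean(project(R, a), project(R, b))
far = euclidean(project(R, a), project(R, c))
print(near, far)
```

Learned embeddings (for example from a neural network) play the same role in practice; the random matrix here only stands in for whatever model produces the vectors.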

Nearest Neighbour Search

For a set of points in some space (possibly with many dimensions), we want to quickly find the k points closest to a given query point.

The basic idea of vector models is to represent objects in a space where proximity means two items are similar.

Exhaustive Search
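
A minimal sketch of exhaustive (brute-force) search, assuming Euclidean distance and points stored as plain Python lists. The name exhaustive_knn is hypothetical. It computes the distance from the query to every point and keeps the k smallest, so each query costs time proportional to the number of points times the number of dimensions.

```python
import heapq
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def exhaustive_knn(points, query, k):
    """Return the indices of the k points closest to query, nearest first."""
    return heapq.nsmallest(
        k, range(len(points)), key=lambda i: euclidean(points[i], query)
    )

points = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.1, 0.0]]
print(exhaustive_knn(points, [0.0, 0.0], k=2))  # [0, 3]
```

The result is always exact, which makes this a useful baseline for measuring the accuracy of any faster method.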

Tradeoff between Speed and Accuracy

Approximate Algorithm

The whole idea behind approximate algorithms is that sacrificing a little bit of accuracy can give you enormous performance gains (orders of magnitude).

For instance, we could return a good-enough result after computing distances to only 1% of the points, which is roughly 100 times fewer distance computations than exhaustive search.
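
One way to scan only a fraction of the points is to partition them into cells ahead of time and search only the cells nearest the query. The sketch below is an illustration of that idea, not a method the notes prescribe. The names build_cells, approximate_knn and n_probe are hypothetical, and the cell centers are simply random points rather than trained centroids.

```python
import heapq
import math
import random

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_cells(points, n_cells, seed=0):
    """Pick n_cells random points as centers; assign each point to its nearest center."""
    rng = random.Random(seed)
    centers = [points[i] for i in rng.sample(range(len(points)), n_cells)]
    cells = [[] for _ in range(n_cells)]
    for idx, p in enumerate(points):
        nearest = min(range(n_cells), key=lambda c: euclidean(p, centers[c]))
        cells[nearest].append(idx)
    return centers, cells

def approximate_knn(points, centers, cells, query, k, n_probe=1):
    """Search only the n_probe cells whose centers are closest to query.

    Returns (indices of the best candidates found, number of points scanned).
    """
    probe = heapq.nsmallest(
        n_probe, range(len(centers)), key=lambda c: euclidean(query, centers[c])
    )
    candidates = [i for c in probe for i in cells[c]]
    top = heapq.nsmallest(k, candidates, key=lambda i: euclidean(points[i], query))
    return top, len(candidates)

rng = random.Random(7)
data = [[rng.random(), rng.random()] for _ in range(500)]
centers, cells = build_cells(data, n_cells=10)
found, scanned = approximate_knn(data, centers, cells, [0.5, 0.5], k=5, n_probe=1)
print(found, scanned)
```

Raising n_probe scans more points and recovers more of the true neighbors; probing every cell reduces to exhaustive search. That single parameter is the speed versus accuracy dial.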

Indexing

Indexes can be categorized by the data structure they use and the compression they apply.