Vector stores are specialized databases that store embeddings (numeric vectors that capture semantic meaning) and provide fast similarity search. In LangChain, vector stores are the backbone of Retrieval-Augmented Generation (RAG) workflows: we embed our documents, store them in a vector store, retrieve the semantically relevant chunks at query time and feed them to an LLM.

Key Terms

Embedding: A fixed-length numeric vector representing the semantic content of a text (or image/audio).
Vector store (vector DB / index): A system that stores vectors plus metadata (document ID, original text, any tags) and supports similarity search (k-NN, ANN).
Retriever: LangChain abstraction that wraps a vector store and returns the top-k similar documents for a query.
ANN vs exact search: Exact search compares the query against every stored vector (fully accurate but slow on large data), while Approximate Nearest Neighbor (ANN) search uses an index to skip most comparisons (much faster, usually at the cost of a small loss in recall).
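
To make the idea of exact search concrete, here is a minimal sketch in plain Python. It assumes cosine similarity as the distance measure and uses made-up 3-dimensional vectors; real embeddings have hundreds or thousands of dimensions, and the function names are illustrative rather than part of LangChain.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def exact_top_k(query, vectors, k):
    """Brute-force k-NN: score every stored vector, return the k best indices."""
    scored = [(cosine_similarity(query, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]

# Toy 3-dimensional "embeddings" (made-up values for illustration).
vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
]
print(exact_top_k([1.0, 0.05, 0.0], vectors, k=2))  # prints [0, 1]
```

Every query touches every vector, which is why this approach slows down as the collection grows and why ANN indexes exist.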

Importance of Vector Stores

Vector stores play a key role in the following areas:

Semantic Search: They find information based on meaning, not just exact keywords, so even if we phrase a question differently, we still get the right answer.
RAG (Retrieval-Augmented Generation): They supply the LLM with the most relevant context, helping it give accurate, fact-based answers instead of guesses.
Scalability and Speed: With indexing and ANN algorithms, vector stores can handle millions of records while keeping searches quick and efficient.
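
To get a feel for what "millions of records" means in practice, the following back-of-the-envelope sketch estimates the raw size of the vectors alone. It assumes 32-bit floats and a dimension of 1536 (the output size of some OpenAI embedding models); index structures and metadata add more on top, and the helper name is hypothetical.

```python
def index_size_gb(num_vectors, dim, bytes_per_value=4):
    """Raw size of the vectors alone, in gigabytes (10**9 bytes)."""
    return num_vectors * dim * bytes_per_value / 1e9

# 1 million vectors x 1536 dimensions x 4 bytes per float32 value
print(index_size_gb(1_000_000, 1536))  # prints 6.144
```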

Working of LangChain

LangChain makes it easy to connect our data with large language models (LLMs). The process usually goes like this:

Embeddings Model: Turns text into numeric vectors (embeddings) so the meaning of the text can be compared. LangChain supports many providers like OpenAI, Hugging Face, Cohere and Google.
Document Loader and Chunking: Loads our data (PDFs, text, websites, etc.) and breaks it into smaller chunks (usually 500–1,000 tokens) so it can be processed efficiently.
Vector Store: Stores these embeddings along with metadata. LangChain can connect to different vector stores like Chroma, FAISS, Pinecone, Weaviate, Qdrant and Milvus.
Retriever: Searches the vector store to find the most relevant chunks when we ask a question. These results are then passed to the LLM for generating a final answer.
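
The chunking step can be sketched without any library. The example below is a simplified illustration that splits on characters rather than tokens and uses a fixed overlap so that text cut at a boundary still appears whole in one chunk; the function name and the sizes are assumptions, and LangChain ships its own text splitters for production use.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character windows that overlap."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks

sample = "abcdefghij" * 5  # 50 characters
chunks = chunk_text(sample, chunk_size=20, overlap=5)
print(len(chunks), [len(c) for c in chunks])  # prints 3 [20, 20, 20]
```

Each chunk would then be embedded and stored as its own entry in the vector store.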

Implementation

Let's walk through an example implementation to see how vector stores work in LangChain.

Step 1: Install Dependencies and Packages

We will install the required packages for our system.

Python

!pip install langchain-community langchain chromadb openai

Step 2: Setup OpenAI API Key

We will set up our OpenAI API key.

To learn how to obtain an OpenAI API key, read How to find and Use API Key of OpenAI.

Python

import os
os.environ["OPENAI_API_KEY"] = "API_key_here"
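
Hard-coding the key is convenient in a notebook but easy to leak when the code is shared. As an alternative, the sketch below reads the key from an environment variable that was exported beforehand and fails early with a clear message if it is missing. The helper name `require_env` is hypothetical.

```python
import os

def require_env(name):
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the script")
    return value

# Usage (assumes the variable was exported in the shell):
# api_key = require_env("OPENAI_API_KEY")
```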

Note: We can use a Gemini API key as well. To learn how to obtain one, read How to Access and Use Google Gemini API Key.

Step 3: Import Libraries

We will import the classes we need: OpenAIEmbeddings, Chroma and Document.

Python

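from langchain_community.embeddings import OpenAIEmbeddings
# Import Chroma from langchain_community; the langchain.vectorstores path is deprecated.
from langchain_community.vectorstores import Chroma
# Document lives in langchain_core in current LangChain releases.
from langchain_core.documents import Document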

Step 4: Create OpenAI Embeddings

Initialize the embedding model using our OpenAI API Key.
This model will generate numerical vectors for text so they can be stored and compared in the vector database.

Python

emb = OpenAIEmbeddings(openai_api_key=os.getenv("OPENAI_API_KEY"))

Step 5: Prepare Data as Documents

Use Document objects from LangChain to structure our data. Each document has:

page_content: the actual text.
metadata: key-value pairs to store attributes (e.g., source, author).

Python

docs = [
    Document(page_content="LangChain connects LLMs with external tools.", metadata={"source": "intro"}),
    Document(page_content="Vector stores store embeddings and metadata.", metadata={"source": "concepts"}),
]

Step 6: Create and Persist Chroma Collection

Store the documents in a Chroma vector store along with their embeddings. Chroma.from_documents takes the following arguments:

Documents (with text + metadata).
Embedding function (OpenAI).
Collection name (identifier for our DB).
Directory to save/persist embeddings.

Python

vectordb = Chroma.from_documents(
    documents=docs,
    embedding=emb,
    collection_name="my_collection",
    persist_directory="./chroma_db"
)
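# Chroma 0.4+ writes to persist_directory automatically; this explicit call
# is only needed on older versions and may emit a deprecation warning.
vectordb.persist()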
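
Once the collection is on disk, a later session can reopen it instead of embedding the documents again. The sketch below assumes the same collection name and directory as above and that the same embedding model is passed in; the wrapper function `load_persisted_store` is a hypothetical helper, while the `Chroma` constructor arguments are those of the langchain_community class.

```python
def load_persisted_store(embedding_function,
                         collection_name="my_collection",
                         persist_directory="./chroma_db"):
    """Reopen an existing Chroma collection instead of re-embedding documents."""
    # Imported lazily so this module loads even where the package is absent.
    from langchain_community.vectorstores import Chroma

    return Chroma(
        collection_name=collection_name,
        embedding_function=embedding_function,
        persist_directory=persist_directory,
    )

# Usage (assumes `emb` from Step 4 and the directory written in Step 6):
# vectordb = load_persisted_store(emb)
```

The embedding function is still required at load time because incoming queries must be embedded with the same model as the stored documents.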

Step 7: Create a Retriever

A retriever is the component that searches embeddings for relevant documents.
Search type "similarity" finds documents whose embeddings are closest to the query.
k=2: return the top 2 most relevant results.

Python

retriever = vectordb.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 2}
)

Step 8: Query and Retrieve Results

Pass a natural language query.
Chroma compares the query embedding with stored embeddings and retrieves the most similar documents.
Print page content and metadata for clarity.

Python

# invoke() is the current retriever entry point; get_relevant_documents() is deprecated.
results = retriever.invoke("What stores vectors?")
for r in results:
    print(r.page_content, r.metadata)

Output:

Vector stores store embeddings and metadata. {'source': 'concepts'}
LangChain connects LLMs with external tools. {'source': 'intro'}
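
To show what happens behind the retriever, here is a toy in-memory store written in plain Python. It is only a conceptual sketch: `toy_embed` is a word-count stand-in for a real embedding model (so it matches shared words, not meaning), and `ToyVectorStore` is a hypothetical class, not Chroma's implementation.

```python
import math
import re
from collections import Counter

def toy_embed(text):
    """Stand-in embedding: lowercase word counts (NOT a real semantic model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class ToyVectorStore:
    """Keeps (vector, text, metadata) rows and ranks them by cosine similarity."""

    def __init__(self):
        self._rows = []

    def add(self, text, metadata):
        self._rows.append((toy_embed(text), text, metadata))

    def similarity_search(self, query, k=2):
        q = toy_embed(query)
        ranked = sorted(self._rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [(text, metadata) for _, text, metadata in ranked[:k]]

store = ToyVectorStore()
store.add("LangChain connects LLMs with external tools.", {"source": "intro"})
store.add("Vector stores store embeddings and metadata.", {"source": "concepts"})

for text, metadata in store.similarity_search("What stores vectors?", k=2):
    print(text, metadata)
```

With this toy embedding the "concepts" document ranks first because it shares the word "stores" with the query; a real embedding model would also match paraphrases that share no words at all.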

How to Choose a Vector Store

Use this checklist when selecting a store for a LangChain project:

1. Data Size:

Small projects or testing: Chroma or FAISS (easy setup, runs locally).
Large datasets (millions of vectors): Pinecone, Milvus, Qdrant or Weaviate (built for scale).

2. Speed and Reliability:

Managed services like Pinecone or Chroma Cloud handle scaling and availability for us and typically come with uptime guarantees.
Self-hosted options like FAISS, Milvus, Qdrant or Weaviate give us control but require setup and maintenance.

3. Features:

Need metadata filtering or hybrid search: Weaviate, Qdrant or Pinecone are strong choices.
Need advanced indexing for very large data: Milvus is designed for this.

4. Cost and Maintenance:

FAISS is free, but we must manage everything ourselves.
Pinecone and Chroma Cloud cost more but save us from maintenance.
Milvus, Qdrant and Weaviate are open-source (free to self-host) but need infrastructure management unless we use their cloud versions.

5. Integration and Support:

All six (Chroma, FAISS, Pinecone, Milvus, Qdrant, Weaviate) are officially supported in LangChain.
Chroma, Pinecone and FAISS have the richest documentation and examples.

Applications

Retrieval-Augmented Generation (RAG): Enhance LLMs by retrieving contextually relevant documents before generating an answer.
Semantic Search Engines: Build intelligent search systems that understand meaning rather than just keywords.
Chatbots and Virtual Assistants: Enable bots to answer questions based on organizational knowledge bases.
Document Q&A Systems: Ask questions over PDFs, websites or internal documentation using embeddings + retrieval.
Recommendation Systems: Recommend similar documents, articles or products based on semantic similarity.

Limitations

Embedding Dependence: Quality of retrieval depends heavily on the embeddings model used.
Reindexing Overhead: If the embedding model changes, we must re-embed and reindex our entire dataset, because vectors from different models are not comparable.
Storage Costs: Large vector datasets consume significant disk/memory; managed services can become expensive.
Latency Tradeoffs: Approximate nearest neighbor (ANN) indexing speeds up queries but may reduce recall.
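
The recall loss mentioned in the last point can be measured by comparing an ANN index against brute-force results for the same query. The sketch below shows one common way to do this (recall@k); the ids are hypothetical values chosen for illustration.

```python
def recall_at_k(exact_ids, approx_ids):
    """Fraction of the true top-k neighbours that the approximate search returned."""
    if not exact_ids:
        raise ValueError("exact_ids must not be empty")
    return len(set(exact_ids) & set(approx_ids)) / len(exact_ids)

exact = [12, 7, 33, 5, 18]     # hypothetical ids from brute-force search
approx = [12, 33, 7, 41, 18]   # hypothetical ids from an ANN index
print(recall_at_k(exact, approx))  # prints 0.8
```

Averaging this value over a sample of real queries gives a practical way to tune index parameters against latency.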
