LlamaIndex doesn’t store your data; it orchestrates how you access and query it, and vector databases are a primary way it does that.

Let’s see LlamaIndex in action with a common vector database, Chroma. Imagine you have a few markdown files and you want to ask questions about them.

First, you need to install LlamaIndex and Chroma:

pip install llama-index llama-index-vector-stores-chroma

Now, let’s set up Chroma and index your documents. We’ll use a simple in-memory Chroma instance for this example; in practice you’d typically persist the data, either with Chroma’s embedded PersistentClient or a standalone Chroma server.

import os
import chromadb
from llama_index.core import (
    Settings,
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
)
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.vector_stores.chroma import ChromaVectorStore

# Ensure you have your OpenAI API key set as an environment variable
# os.environ["OPENAI_API_KEY"] = "YOUR_API_KEY"

# Initialize OpenAI embeddings
Settings.embed_model = OpenAIEmbedding()

# Create a dummy directory and file for demonstration
if not os.path.exists("data"):
    os.makedirs("data")
with open("data/my_document.md", "w") as f:
    f.write("This is the first sentence of my document.\n")
    f.write("This document discusses the integration of vector databases with LlamaIndex.\n")
    f.write("Chroma is a popular choice for its ease of use.\n")

# Load documents from the directory
documents = SimpleDirectoryReader("data").load_data()

# Initialize ChromaDB client (in-memory for this example)
db = chromadb.Client()

# Create a Chroma collection
collection = db.create_collection("llama_index_docs")

# Wrap the collection in a ChromaVectorStore and hand it to LlamaIndex
# via a StorageContext
vector_store = ChromaVectorStore(chroma_collection=collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Build the index
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# Now you can query the index
query_engine = index.as_query_engine()

response = query_engine.query("What is this document about?")
print(response)

When you run this, you’ll see output similar to:

This document discusses the integration of vector databases with LlamaIndex.

This simple example shows the core loop: load data, tell LlamaIndex where to store its vector representations (the ChromaVectorStore), build an index from your documents using that store, and then query it.

Under the hood, when LlamaIndex builds the index, it passes each chunk of your documents through the configured embed_model (OpenAI’s OpenAIEmbedding here) to produce a numerical vector representation (an embedding), then stores those vectors, along with the original text or references to it, in the ChromaVectorStore. When you query, LlamaIndex embeds your question the same way, searches the ChromaVectorStore for the most similar document vectors, retrieves the corresponding text chunks, and uses a language model to synthesize an answer from them.
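To make that retrieval step concrete, here is a toy sketch of the similarity search in plain Python. This is not LlamaIndex’s actual implementation: the tiny hand-made three-dimensional vectors stand in for real embeddings so the example runs without any API calls.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Pretend these are the embeddings sitting in the vector store,
# keyed by the original text chunk.
stored = {
    "Chroma is a popular choice for its ease of use.": [0.9, 0.1, 0.2],
    "This document discusses vector databases.": [0.2, 0.8, 0.3],
}

# Pretend embedding of the user's question.
query_embedding = [0.25, 0.85, 0.2]

# Rank chunks by similarity to the query, highest first; the top chunks
# are what gets handed to the LLM for answer synthesis.
ranked = sorted(
    stored.items(),
    key=lambda item: cosine_similarity(query_embedding, item[1]),
    reverse=True,
)
best_chunk = ranked[0][0]
```

Real vector stores like Chroma use approximate nearest-neighbor indexes rather than this brute-force scan, but the ranking idea is the same.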

You have several levers to pull here. The embed_model is crucial – different models have different strengths and costs. The VectorStoreIndex itself has configuration options, and importantly, the vector_store you choose has its own set of parameters and capabilities. Chroma, for instance, can be run persistently, allowing you to index once and query many times without re-processing your documents. You can also specify metadata to be stored alongside your vectors, enabling more nuanced filtering during queries.

When you call VectorStoreIndex.from_documents, LlamaIndex automatically chunks your documents if they are too large for the embedding model. The default chunk size and overlap are often good starting points, but you can customize this behavior by passing a transformations argument, typically a SentenceSplitter or similar text splitter, to from_documents (or by setting it globally on Settings).

If you want to use a persistent Chroma instance to avoid re-indexing, you’d initialize it differently:

# Persistent ChromaDB client
chroma_client = chromadb.PersistentClient(path="./chroma_db")
collection = chroma_client.get_or_create_collection("llama_index_docs")
vector_store = ChromaVectorStore(chroma_collection=collection)
# Then proceed with VectorStoreIndex.from_documents as before

The most surprising thing is how seamlessly LlamaIndex abstracts away the complexities of different vector databases. You can swap Chroma for Pinecone, Weaviate, or others with minimal code changes, as long as you use the corresponding LlamaIndex integration package. The VectorStore interface is the key abstraction.

The next step is usually understanding how to manage and query larger datasets, which involves exploring different chunking strategies and advanced query techniques.
