Vector Database
A vector database is a specialized database designed to store, index, and search high-dimensional embedding vectors efficiently. Unlike traditional databases that match exact values or keywords, vector databases find the stored vectors most similar to a query vector. This enables semantic search, recommendation systems, and the retrieval component of RAG (retrieval-augmented generation) architectures.
How vector databases work
Vector databases store embedding vectors alongside metadata (like file paths, function names, or code snippets). When you search, your query is converted to an embedding vector, and the database uses approximate nearest neighbor (ANN) algorithms to quickly find the most similar stored vectors. These algorithms (like HNSW or IVF) trade a small amount of accuracy for massive speed gains, making it possible to search millions of vectors in milliseconds.
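To make the accuracy-for-speed trade-off concrete, here is a minimal sketch of the exact search that ANN indexes approximate: compare the query against every stored vector and keep the top matches. The names (`StoredVector`, `cosineSimilarity`, `exactSearch`) and the toy 3-dimensional vectors are hypothetical and chosen for illustration; real embeddings typically have hundreds or thousands of dimensions, and cosine similarity is only one of several common similarity measures.

```typescript
// Exact (brute-force) nearest-neighbor search over a tiny in-memory store.
// This is the baseline that ANN indexes such as HNSW or IVF approximate.
// All names and vectors here are hypothetical illustrations.

interface StoredVector {
  id: string;
  vector: number[];
  metadata: { file: string };
}

// Cosine similarity: 1 means same direction, 0 means orthogonal.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scores every stored vector against the query, so cost grows linearly
// with the number of stored vectors.
function exactSearch(
  query: number[],
  store: StoredVector[],
  k: number
): { id: string; score: number }[] {
  return store
    .map((item) => ({ id: item.id, score: cosineSimilarity(query, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const toyStore: StoredVector[] = [
  { id: "auth-1", vector: [0.9, 0.1, 0.0], metadata: { file: "src/auth.ts" } },
  { id: "auth-2", vector: [0.8, 0.2, 0.1], metadata: { file: "src/auth.ts" } },
  { id: "payment-1", vector: [0.1, 0.0, 0.9], metadata: { file: "src/payment.ts" } },
];

const topMatches = exactSearch([1.0, 0.0, 0.0], toyStore, 2);
console.log(topMatches.map((m) => m.id)); // [ 'auth-1', 'auth-2' ]
```

Because this loop touches every stored vector, it becomes slow at millions of entries. ANN indexes avoid the full scan by examining only a promising subset, which is where the small loss of accuracy comes from.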
Vector databases in AI coding tools
AI coding tools that index your codebase typically use a vector database under the hood. When a tool such as Cursor indexes your project, it creates embedding vectors for code chunks and stores them in a vector database (local or hosted, depending on the tool). When you ask a question, the tool converts your query to an embedding, searches the vector database for relevant code, and passes those code chunks to the LLM as context. This is the "retrieval" step in RAG that makes AI tools aware of your specific codebase.
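The last part of that flow, passing retrieved code chunks to the LLM as context, can be sketched as a plain string-building step. This is an illustrative assumption about how such a prompt might be assembled, not how any particular tool does it; `RetrievedChunk`, `buildPrompt`, and the prompt wording are all hypothetical.

```typescript
// Sketch: turn retrieved code chunks into context for an LLM prompt.
// The types, function name, and prompt wording are hypothetical.

interface RetrievedChunk {
  file: string;
  line: number;
  code: string;
}

// Keeps only the first `maxChunks` chunks (assumed to be sorted by
// relevance already) and labels each one with its source location.
function buildPrompt(
  question: string,
  chunks: RetrievedChunk[],
  maxChunks: number = 3
): string {
  const context = chunks
    .slice(0, maxChunks)
    .map((c) => `// ${c.file}:${c.line}\n${c.code}`)
    .join("\n\n");
  return (
    "Answer the question using the project code below.\n\n" +
    context +
    "\n\nQuestion: " +
    question
  );
}

const retrieved: RetrievedChunk[] = [
  { file: "src/auth.ts", line: 15, code: "function validateToken(jwt) { ... }" },
  { file: "src/auth.ts", line: 42, code: "function refreshSession(userId) { ... }" },
  { file: "src/payment.ts", line: 8, code: "function processPayment(amount, card) { ... }" },
];

const prompt = buildPrompt("how does user authentication work?", retrieved, 2);
console.log(prompt);
```

Capping the number of chunks is one simple way to keep the prompt within the model's context window; real tools may also rank, deduplicate, or trim chunks.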
Popular vector databases in 2026
- Pinecone: fully managed, serverless vector database optimized for production workloads
- Weaviate: open-source with built-in vectorization and hybrid search
- Chroma: lightweight, developer-friendly, popular for prototyping and small projects
- Qdrant: high-performance, open-source, with advanced filtering capabilities
- pgvector: PostgreSQL extension for teams that want vectors in their existing database
// Using a vector database for code search (Chroma example)
// Assumes a running Chroma server that the client can reach with its default
// settings, and an ES module context (the code uses top-level await).
import { ChromaClient } from "chromadb";

const client = new ChromaClient();
const collection = await client.createCollection({ name: "codebase" });

// Index code chunks; Chroma embeds the documents when they are added
await collection.add({
  ids: ["auth-1", "auth-2", "payment-1"],
  documents: [
    "function validateToken(jwt) { ... }",
    "function refreshSession(userId) { ... }",
    "function processPayment(amount, card) { ... }"
  ],
  metadatas: [
    { file: "src/auth.ts", line: 15 },
    { file: "src/auth.ts", line: 42 },
    { file: "src/payment.ts", line: 8 }
  ]
});

// Semantic search: finds auth-related code
const results = await collection.query({
  queryTexts: ["how does user authentication work?"],
  nResults: 5
});

If you are building AI-powered features on top of your codebase, start with Chroma for prototyping, or with pgvector if you already use PostgreSQL. Migrate to Pinecone or Qdrant when you need production-grade performance and scaling.
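The `collection.query` call in the Chroma example returns its matches grouped per query text, as parallel nested arrays. The sketch below flattens the matches for the first query into one object per hit. The `QueryResultLike` shape is an assumption modeled on Chroma's nested-array response, so check it against the client version you use; `Hit` and `hitsForFirstQuery` are hypothetical names, and the sample values are hand-written, not real query output.

```typescript
// Sketch: flatten a Chroma-style query result into one object per hit.
// QueryResultLike is an ASSUMED shape (one inner array per query text);
// verify it against your chromadb client version.

interface QueryResultLike {
  ids: string[][];
  documents: (string | null)[][];
  metadatas: (Record<string, unknown> | null)[][];
  distances?: (number | null)[][] | null;
}

interface Hit {
  id: string;
  document: string | null;
  metadata: Record<string, unknown> | null;
  distance: number | null;
}

// Reads the inner arrays at index 0, i.e. the matches for the first query text.
function hitsForFirstQuery(result: QueryResultLike): Hit[] {
  const ids = result.ids[0] ?? [];
  return ids.map((id, i) => ({
    id,
    document: result.documents[0]?.[i] ?? null,
    metadata: result.metadatas[0]?.[i] ?? null,
    distance: result.distances?.[0]?.[i] ?? null,
  }));
}

// Hand-written sample in the assumed shape; the distances are made up.
const sampleResult: QueryResultLike = {
  ids: [["auth-1", "auth-2"]],
  documents: [[
    "function validateToken(jwt) { ... }",
    "function refreshSession(userId) { ... }",
  ]],
  metadatas: [[
    { file: "src/auth.ts", line: 15 },
    { file: "src/auth.ts", line: 42 },
  ]],
  distances: [[0.32, 0.41]],
};

const hits = hitsForFirstQuery(sampleResult);
for (const hit of hits) {
  console.log(hit.id, hit.metadata?.file, hit.distance);
}
```

Carrying the metadata along with each hit is what lets a tool point back to the file and line where the matching code lives.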
Do I need a vector database to use AI coding tools?
How is a vector database different from a regular database?
Can I use PostgreSQL instead of a dedicated vector database?