Vector Database Architecture
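```python
import os

from langchain_community.vectorstores import PGVector
from langchain_openai import OpenAIEmbeddings

# Connection string read from the environment (variable name is an assumption),
# e.g. postgresql+psycopg2://user:password@host:5432/dbname
DATABASE_URL = os.environ["DATABASE_URL"]

# Production PGVector setup
embeddings = OpenAIEmbeddings(model="text-embedding-3-large")
vectorstore = PGVector(
    connection_string=DATABASE_URL,
    embedding_function=embeddings,
    collection_name="documents",
    pre_delete_collection=False,  # keep existing rows in the collection
    use_jsonb=True,  # assumed: JSONB metadata, needed for $-style operators such as $gte
)

# Similarity search with metadata filtering
results = vectorstore.similarity_search_with_score(
    query="patent infringement claims",
    k=10,
    filter={"category": "patents", "year": {"$gte": 2020}},
)
```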
When to Use Each Vector Database
| Use Case | Recommended | Reason |
|---|---|---|
| PostgreSQL shop | PGVector | Unified infrastructure |
| Scale + low ops | Pinecone | Fully managed |
| Complex filters | Qdrant | Advanced filtering |
| Prototyping | Chroma | Simple setup |
| Self-hosted | Qdrant/Milvus | Full control |
Frequently Asked Questions
What are vector databases?
Vector databases store and search high-dimensional embeddings (numerical representations of text, images, etc.). They enable semantic search, recommendation systems, and RAG applications by finding items that are similar in meaning, not just matching keywords.
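To make "similar in meaning" concrete, here is a minimal sketch of the core operation a vector database performs: ranking stored vectors by cosine similarity to a query vector. The three-dimensional vectors and document names are made up for illustration; real embeddings have hundreds or thousands of dimensions, and production systems use approximate indexes rather than the brute-force scan shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, items, k=2):
    """Return the k (name, score) pairs most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings" (hypothetical values, not produced by any real model)
docs = {
    "patent claim": [0.9, 0.1, 0.0],
    "trademark filing": [0.7, 0.3, 0.1],
    "pasta recipe": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]

for name, score in top_k(query, docs):
    print(f"{name}: {score:.3f}")
```

The two legal documents rank above the recipe because their vectors point in nearly the same direction as the query, even though no keywords are compared.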
How much does vector database implementation cost?
Vector database development typically costs $110-160 per hour. A basic implementation starts at around $10,000-20,000, while enterprise RAG systems with hybrid search, filtering, and multi-tenancy range from $40,000 to $100,000 or more. Database hosting costs are separate.
PGVector vs Pinecone vs Chroma: which should I choose?
Choose PGVector if you already run PostgreSQL, need ACID guarantees, or want to keep things simple. Choose Pinecone for managed scale, minimal ops, and pure vector search. Choose Chroma for local development and prototyping. I help you select based on your scale, requirements, and team.
How do you optimize vector search performance?
I implement appropriate index types (HNSW, IVF), proper dimension sizing, filtering optimization, query batching, and hybrid search (combining vector and keyword retrieval). A poorly configured setup can make vector search as much as 10x slower than an optimized one.
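As an illustration of the hybrid search idea mentioned above, one common way to combine a vector ranking with a keyword ranking is reciprocal rank fusion (RRF). The sketch below is a generic version of that technique, not necessarily the method used in any particular project; the document IDs and the constant `k=60` are assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one list.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked well by several retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results from a vector search and a keyword search
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

RRF only needs rank positions, so it avoids having to normalize cosine scores against keyword scores that live on a different scale.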
Can you migrate between vector databases?
Yes. Migration involves: embedding export/re-generation, schema mapping, index configuration, and testing retrieval quality. I’ve migrated from Pinecone to PGVector and vice versa depending on requirements and cost considerations.
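The "testing retrieval quality" step can be sketched as a simple overlap check: run the same queries against the old and new stores and measure how many of the top-k document IDs agree. The function names, query IDs, and result lists below are hypothetical; in practice the ID lists would come from each database's search API.

```python
def overlap_at_k(old_ids, new_ids, k=10):
    """Fraction of the old store's top-k IDs that also appear in the new store's top-k."""
    old_top = set(old_ids[:k])
    new_top = set(new_ids[:k])
    if not old_top:
        return 1.0 if not new_top else 0.0
    return len(old_top & new_top) / len(old_top)


def mean_overlap(old_results, new_results, k=10):
    """Average overlap@k across all queries present in old_results."""
    if not old_results:
        return 0.0
    total = sum(
        overlap_at_k(ids, new_results.get(query, []), k)
        for query, ids in old_results.items()
    )
    return total / len(old_results)


# Hypothetical top-3 results per query from each store
old_store = {"q1": ["a", "b", "c"], "q2": ["d", "e", "f"]}
new_store = {"q1": ["a", "c", "b"], "q2": ["d", "x", "e"]}

print(f"mean overlap@3: {mean_overlap(old_store, new_store, k=3):.2f}")
```

A low overlap does not always mean the migration is wrong (a different index type or embedding model will change results), but it flags queries worth inspecting by hand.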
Experience:
Case Studies: Enterprise RAG for Legal Documents | Agentic AI Knowledge Systems
Related Technologies: RAG Systems, LangChain, PostgreSQL, OpenAI