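A vector store stores embedded data and performs similarity search.
LangChain provides a unified interface for vector stores that lets you call:
addDocuments - add documents to the store.
delete - remove stored documents by ID.
similaritySearch - query for semantically similar documents.
This abstraction lets you switch between implementations without changing your application logic.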
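To illustrate why a shared interface makes stores interchangeable, here is a minimal conceptual sketch. It is not LangChain's implementation: the names Doc, SimpleVectorStore, ToyMemoryStore, toyEmbed, and demo are hypothetical, and the letter-count "embedding" stands in for a real embedding model.

```typescript
// Hypothetical types for illustration; these are not LangChain's own definitions.
interface Doc {
  id: string;
  pageContent: string;
}

interface SimpleVectorStore {
  addDocuments(docs: Doc[]): Promise<void>;
  delete(params: { ids: string[] }): Promise<void>;
  similaritySearch(query: string, k: number): Promise<Doc[]>;
}

// Toy embedding: a 26-dimensional vector of letter counts (a-z).
// A real store would call an embedding model instead.
function toyEmbed(text: string): number[] {
  const vec: number[] = new Array(26).fill(0);
  const lower = text.toLowerCase();
  for (let i = 0; i < lower.length; i++) {
    const idx = lower.charCodeAt(i) - 97;
    if (idx >= 0 && idx < 26) vec[idx] += 1;
  }
  return vec;
}

// Cosine similarity; returns 0 when either vector has zero length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

class ToyMemoryStore implements SimpleVectorStore {
  private rows = new Map<string, { doc: Doc; vector: number[] }>();

  async addDocuments(docs: Doc[]): Promise<void> {
    for (const doc of docs) {
      this.rows.set(doc.id, { doc, vector: toyEmbed(doc.pageContent) });
    }
  }

  async delete(params: { ids: string[] }): Promise<void> {
    for (const id of params.ids) this.rows.delete(id);
  }

  // Brute-force search: score every stored vector, then keep the top k.
  async similaritySearch(query: string, k: number): Promise<Doc[]> {
    const q = toyEmbed(query);
    return Array.from(this.rows.values())
      .map((row) => ({ doc: row.doc, score: cosine(q, row.vector) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, k)
      .map((x) => x.doc);
  }
}

// Application code depends only on the interface, so any store can be passed in.
async function demo(store: SimpleVectorStore): Promise<string[]> {
  await store.addDocuments([
    { id: "1", pageContent: "hello world" },
    { id: "2", pageContent: "zzz" },
  ]);
  const before = await store.similaritySearch("hello", 1);
  await store.delete({ ids: ["1"] });
  const after = await store.similaritySearch("hello", 1);
  return [before[0].id, after[0].id];
}
```

Because demo only depends on SimpleVectorStore, another class implementing the same three methods could be passed in without changing it. The real LangChain stores follow the same idea, with richer document and option types.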
Initialization
Most vector stores in LangChain accept an embedding model as an argument when they are initialized.
import { OpenAIEmbeddings } from "@langchain/openai";
import { MemoryVectorStore } from "@langchain/classic/vectorstores/memory";
const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small",
});
const vectorStore = new MemoryVectorStore(embeddings);
Adding documents
You can add documents to the vector store with the addDocuments function.
import { Document } from "@langchain/core/documents";
const document = new Document({
  pageContent: "Hello world",
});
await vectorStore.addDocuments([document]);
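Deleting documents
You can delete documents from the vector store with the delete function. The accepted parameters depend on the store: many accept a list of IDs, and some accept a filter, as in this example.
await vectorStore.delete({
  filter: {
    pageContent: "Hello world",
  },
});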
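Similarity search
Use similaritySearch to issue a semantic query and return the documents whose embeddings are closest to it:
const results = await vectorStore.similaritySearch("Hello world", 10);
Many vector stores support the following parameters:
k - the number of results to return
filter - conditional filtering based on metadata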
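Similarity metrics and indexing
Embedding similarity is typically computed with a metric such as cosine similarity, Euclidean distance, or dot product.
Efficient search usually relies on an indexing method such as HNSW (Hierarchical Navigable Small World), but the details depend on the vector store.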
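As a rough illustration of how such metrics work, the sketch below implements dot product, cosine similarity, and Euclidean distance over plain number arrays. The function names are hypothetical helpers, not part of LangChain, and a real vector store would normally compute these inside its index rather than in application code.

```typescript
// Throws if the two vectors cannot be compared element by element.
function assertSameLength(a: number[], b: number[]): void {
  if (a.length !== b.length) {
    throw new Error(`Vector length mismatch: ${a.length} vs ${b.length}`);
  }
}

// Dot product: larger means more similar (sensitive to vector magnitude).
function dotProduct(a: number[], b: number[]): number {
  assertSameLength(a, b);
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Cosine similarity: dot product of the normalized vectors, in [-1, 1].
// Returns 0 if either vector has zero magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  const normA = Math.sqrt(dotProduct(a, a));
  const normB = Math.sqrt(dotProduct(b, b));
  if (normA === 0 || normB === 0) return 0;
  return dotProduct(a, b) / (normA * normB);
}

// Euclidean distance: smaller means more similar.
function euclideanDistance(a: number[], b: number[]): number {
  assertSameLength(a, b);
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}
```

Note that the ordering differs between metrics: for cosine similarity and dot product a higher score is closer, while for Euclidean distance a lower value is closer. An index such as HNSW avoids comparing the query against every stored vector, trading exactness for speed.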
Metadata filtering
Filtering by metadata (for example, source or date) can narrow down search results:
vectorStore.similaritySearch("query", 2, { source: "tweets" });
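Filter semantics are store-specific, so the following is only a conceptual sketch of one common behavior: keep candidates whose metadata exactly matches every key in the filter, then return the top k by score. The types and the names matchesFilter and filteredTopK are hypothetical, and the scores are assumed to be precomputed similarity values.

```typescript
type Metadata = Record<string, unknown>;

interface ScoredDoc {
  pageContent: string;
  metadata: Metadata;
  score: number; // assumed: higher means more similar
}

// True when every key in the filter has a strictly equal value in the metadata.
function matchesFilter(metadata: Metadata, filter: Metadata): boolean {
  return Object.keys(filter).every((key) => metadata[key] === filter[key]);
}

// Apply the optional filter first, then rank the remaining candidates by score.
function filteredTopK(
  candidates: ScoredDoc[],
  k: number,
  filter?: Metadata
): ScoredDoc[] {
  const pool = filter
    ? candidates.filter((c) => matchesFilter(c.metadata, filter))
    : candidates;
  // slice() copies the array so sorting does not mutate the caller's input.
  return pool
    .slice()
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

Some stores accept richer filter expressions or a predicate function instead of a plain object, so check the documentation of the store you use before relying on exact-match behavior.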
Top integrations
Select an embedding model:
Install the @langchain/openai package.
Add the environment variable:
OPENAI_API_KEY=your-api-key
Instantiate the model:
import { OpenAIEmbeddings } from "@langchain/openai";
const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-large",
});
Install the @langchain/openai package.
Add the environment variables:
AZURE_OPENAI_API_INSTANCE_NAME=<YOUR_INSTANCE_NAME>
AZURE_OPENAI_API_KEY=<YOUR_KEY>
AZURE_OPENAI_API_VERSION="2024-02-01"
Instantiate the model:
import { AzureOpenAIEmbeddings } from "@langchain/openai";
const embeddings = new AzureOpenAIEmbeddings({
  azureOpenAIApiEmbeddingsDeploymentName: "text-embedding-ada-002",
});
Install the @langchain/aws package.
Add the environment variable:
BEDROCK_AWS_REGION=your-region
Instantiate the model:
import { BedrockEmbeddings } from "@langchain/aws";
const embeddings = new BedrockEmbeddings({
  model: "amazon.titan-embed-text-v1",
});
Install dependencies:
npm i @langchain/google-genai
Add the environment variable:
GOOGLE_API_KEY=your-api-key
Instantiate the model:
import { GoogleGenerativeAIEmbeddings } from "@langchain/google-genai";
const embeddings = new GoogleGenerativeAIEmbeddings({
  model: "text-embedding-004",
});
Install dependencies:
npm i @langchain/google-vertexai
Add the environment variable:
GOOGLE_APPLICATION_CREDENTIALS=credentials.json
Instantiate the model:
import { VertexAIEmbeddings } from "@langchain/google-vertexai";
const embeddings = new VertexAIEmbeddings({
  model: "gemini-embedding-001",
});
Install dependencies:
npm i @langchain/mistralai
Add the environment variable:
MISTRAL_API_KEY=your-api-key
Instantiate the model:
import { MistralAIEmbeddings } from "@langchain/mistralai";
const embeddings = new MistralAIEmbeddings({
  model: "mistral-embed",
});
Install the @langchain/cohere package.
Add the environment variable:
COHERE_API_KEY=your-api-key
Instantiate the model:
import { CohereEmbeddings } from "@langchain/cohere";
const embeddings = new CohereEmbeddings({
  model: "embed-english-v3.0",
});
Install the @langchain/ollama package.
Instantiate the model:
import { OllamaEmbeddings } from "@langchain/ollama";
const embeddings = new OllamaEmbeddings({
  model: "llama2",
  baseUrl: "http://localhost:11434", // default value
});
Select a vector store:
import { MemoryVectorStore } from "@langchain/classic/vectorstores/memory";
const vectorStore = new MemoryVectorStore(embeddings);
npm i @langchain/community
import { Chroma } from "@langchain/community/vectorstores/chroma";
const vectorStore = new Chroma(embeddings, {
  collectionName: "a-test-collection",
});
npm i @langchain/community
import { FaissStore } from "@langchain/community/vectorstores/faiss";
const vectorStore = new FaissStore(embeddings, {});
import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
import { MongoClient } from "mongodb";
const client = new MongoClient(process.env.MONGODB_ATLAS_URI || "");
const collection = client
  .db(process.env.MONGODB_ATLAS_DB_NAME)
  .collection(process.env.MONGODB_ATLAS_COLLECTION_NAME);
const vectorStore = new MongoDBAtlasVectorSearch(embeddings, {
  collection,
  indexName: "vector_index",
  textKey: "text",
  embeddingKey: "embedding",
});
npm i @langchain/community
import { PGVectorStore } from "@langchain/community/vectorstores/pgvector";
const vectorStore = await PGVectorStore.initialize(embeddings, {});
npm i @langchain/pinecone
import { PineconeStore } from "@langchain/pinecone";
import { Pinecone as PineconeClient } from "@pinecone-database/pinecone";
const pinecone = new PineconeClient();
// Placeholder index name (an assumption); replace it with your own index.
const pineconeIndex = pinecone.index("your-index-name");
const vectorStore = new PineconeStore(embeddings, {
  pineconeIndex,
  maxConcurrency: 5,
});
import { RedisVectorStore } from "@langchain/redis";
// `client` is assumed to be a connected Redis client created elsewhere.
const vectorStore = new RedisVectorStore(embeddings, {
  redisClient: client,
  indexName: "langchainjs-testing",
});
import { QdrantVectorStore } from "@langchain/qdrant";
const vectorStore = await QdrantVectorStore.fromExistingCollection(embeddings, {
  url: process.env.QDRANT_URL,
  collectionName: "langchainjs-testing",
});
npm i @oracle/langchain-oracledb @langchain/core
import oracledb from "oracledb";
import { OracleEmbeddings, OracleVS } from "@oracle/langchain-oracledb";
const connection = await oracledb.getConnection({
  user: process.env.ORACLE_USER,
  password: process.env.ORACLE_PASSWORD,
  connectionString: process.env.ORACLE_DSN,
});
const embeddings = new OracleEmbeddings(connection, {
  provider: "database",
  model: process.env.DEMO_ONNX_MODEL ?? "DEMO_MODEL",
});
const vectorStore = new OracleVS(embeddings, {
  client: connection,
  tableName: "DEMO_VECTORS",
  query: "Find support tickets mentioning service outages.",
  distanceStrategy: "DOT",
});
await vectorStore.initialize();
npm i @langchain/weaviate
import { WeaviateStore } from "@langchain/weaviate";
// `weaviateClient` is assumed to be a Weaviate client created elsewhere.
const vectorStore = new WeaviateStore(embeddings, {
  client: weaviateClient,
  indexName: "Langchainjs_test",
});
LangChain.js integrates with a wide range of vector stores. You can browse the full list below:
All vector stores
Azure Cosmos DB for NoSQL
Google Cloud SQL for PostgreSQL
Google Vertex AI Matching Engine
SAP HANA Cloud Vector Engine
Momento Vector Index (MVI)