Introduction to Qdrant

Qdrant (read: quadrant) is a vector similarity search engine and vector database that provides a production-ready service with a convenient API to store, search, and manage points—vectors with additional payload data.

Qdrant is written in Rust 🦀, which makes it fast and reliable even under high load. See benchmarks for performance comparisons.

What is Qdrant?

Qdrant is designed to turn embeddings or neural network encoders into full-fledged applications for matching, searching, recommending, and more. Its extended filtering support makes it useful for neural-network or semantic-based matching, faceted search, and other applications that combine vector similarity with structured conditions.

Key Features

Filtering & Payloads

Attach JSON payloads to vectors and filter based on values, including keyword matching, full-text filtering, numerical ranges, and geo-locations.
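As a rough illustration of how payload conditions combine with a vector query, the sketch below builds a JSON search body in the style of Qdrant's REST filters. The payload field names (city, price, location) and the helper function are hypothetical, and the exact request shape should be checked against the API reference for your Qdrant version.

```python
import json


def build_search_body(vector, city, max_price, center, radius_m, limit=5):
    """Build a REST-style body for a vector search restricted by payload filters."""
    lat, lon = center
    return {
        "vector": vector,
        "limit": limit,
        "filter": {
            # Every condition under "must" has to hold for a point to match
            "must": [
                {"key": "city", "match": {"value": city}},        # keyword match
                {"key": "price", "range": {"lte": max_price}},    # numerical range
                {
                    "key": "location",                            # geo-location
                    "geo_radius": {
                        "center": {"lat": lat, "lon": lon},
                        "radius": radius_m,
                    },
                },
            ]
        },
    }


body = build_search_body([0.1, 0.2, 0.3], "Berlin", 100.0, (52.52, 13.40), 5000.0)
print(json.dumps(body, indent=2))
```

The body is plain JSON, so it can be sent with any HTTP client or translated into the equivalent filter objects of a client library.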

Hybrid Search

Support for both dense and sparse vectors to combine semantic search with keyword-based matching (BM25/TF-IDF style).
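One common way to merge a dense (semantic) result list with a sparse (keyword) result list is reciprocal rank fusion. The text above does not say which fusion method is used, so treat the following as a generic sketch of the idea with hypothetical point IDs, not as a description of Qdrant's internals.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked ID lists (each ordered best-first) into one ranking."""
    scores = {}
    for ranking in rankings:
        for rank, point_id in enumerate(ranking, start=1):
            # A point earns more score the higher it ranks in each list
            scores[point_id] = scores.get(point_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so the output is deterministic
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


dense_hits = [3, 1, 2]   # hypothetical IDs returned by a dense-vector search
sparse_hits = [1, 4, 3]  # hypothetical IDs returned by a sparse-vector search
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Points that appear near the top of both lists rise to the front, while points found by only one of the searches are still retained further down.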

Vector Quantization

Reduce RAM usage by up to 97% with built-in vector quantization while maintaining search quality.
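To see where a figure of this size can come from, the arithmetic below compares raw vector storage at different bit widths. It assumes 32-bit float originals and that the largest saving corresponds to 1-bit (binary) quantization; the text above does not state which method the 97% refers to, and the estimate ignores index overhead and any originals kept on disk. The collection size and dimension are made up.

```python
def vector_memory_bytes(num_vectors, dim, bits_per_component):
    """Raw storage for the vector components only (no index, no payload)."""
    return num_vectors * dim * bits_per_component // 8


def ram_reduction(bits_quantized, bits_original=32):
    """Fraction of memory saved relative to the original representation."""
    return 1 - bits_quantized / bits_original


n, dim = 1_000_000, 768  # hypothetical collection
full = vector_memory_bytes(n, dim, 32)
int8 = vector_memory_bytes(n, dim, 8)
binary = vector_memory_bytes(n, dim, 1)

for name, size, bits in [("float32", full, 32), ("int8", int8, 8), ("1-bit", binary, 1)]:
    print(f"{name:8s} {size / 2**20:8.0f} MiB  saving {ram_reduction(bits):.1%}")
```

Under these assumptions, 8-bit quantization saves 75% and 1-bit quantization saves just under 97% of the vector memory.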

Distributed Deployment

Horizontal scaling through sharding and replication with zero-downtime rolling updates.

Core Capabilities

High-Performance Architecture

Qdrant leverages modern hardware and software optimizations:

SIMD Hardware Acceleration: Uses SIMD instructions on x86-64 and ARM NEON architectures for better performance
Async I/O: Uses io_uring on Linux to maximize disk throughput, even on network-attached storage
Write-Ahead Logging (WAL): Ensures data persistence with update confirmation, protecting against power outages
Query Planning: Leverages payload indexes to optimize query execution strategies

By default, each WAL segment has a capacity of 32 MB. The number of segments created ahead of time is configurable to tune write performance.
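The write-ahead logging idea can be illustrated with a toy append-only log: an operation is acknowledged only after it has been flushed to disk, and the log is replayed after a restart. This is a deliberately simplified sketch with hypothetical names (TinyWAL, toy.wal), not Qdrant's WAL format or implementation.

```python
import json
import os
import tempfile


class TinyWAL:
    """Toy append-only log: an update is acknowledged only after fsync."""

    def __init__(self, path):
        self.path = path

    def append(self, operation):
        # Write one JSON record per line, then force it to stable storage
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(operation) + "\n")
            f.flush()
            os.fsync(f.fileno())
        return True  # the caller sees the confirmation only after the fsync

    def replay(self):
        # After a crash or restart, re-read every logged operation in order
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]


wal_path = os.path.join(tempfile.mkdtemp(), "toy.wal")
wal = TinyWAL(wal_path)
wal.append({"op": "upsert", "id": 1})
wal.append({"op": "delete", "id": 1})

# A fresh instance stands in for the process restarting after a power loss
recovered = TinyWAL(wal_path).replay()
print(recovered)
```

A real WAL additionally splits the log into fixed-capacity segments and truncates entries once they are safely applied, which is what the capacity and segments-ahead settings relate to.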

Storage & Memory Management

Qdrant offers flexible storage options configured in config/config.yaml:

storage:
  # Where to store all the data
  storage_path: ./storage
  
  # Where to store snapshots
  snapshots_path: ./snapshots
  
  # If true - point payloads will not be stored in memory
  on_disk_payload: true
  
  # Write-ahead-log configuration
  wal:
    wal_capacity_mb: 32
    wal_segments_ahead: 0

HNSW Index Configuration

The Hierarchical Navigable Small World (HNSW) algorithm powers Qdrant’s fast approximate nearest neighbor search:

hnsw_index:
  # Number of edges per node (higher = more accurate, more space)
  m: 16
  
  # Number of neighbors during index building (higher = more accurate, slower build)
  ef_construct: 100
  
  # Threshold for full-scan vs HNSW (in KiloBytes)
  full_scan_threshold_kb: 10000
  
  # Store index on disk or in RAM
  on_disk: false
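To get a feel for the full-scan threshold, the sketch below estimates the raw size of a set of vectors in kilobytes and compares it with the configured value. It assumes the size is measured as uncompressed 32-bit float data and that sets strictly below the threshold are scanned directly; both are simplifying assumptions, and the helper names are hypothetical.

```python
def vectors_size_kb(num_vectors, dim, bytes_per_component=4):
    """Approximate raw size of the vectors in kilobytes (1 KB = 1024 bytes)."""
    return num_vectors * dim * bytes_per_component / 1024


def prefers_full_scan(num_vectors, dim, threshold_kb=10000):
    """True when the vectors are small enough that a plain scan is assumed cheaper."""
    return vectors_size_kb(num_vectors, dim) < threshold_kb


for count, dim in [(5_000, 256), (10_000, 256), (100_000, 768)]:
    size = vectors_size_kb(count, dim)
    print(f"{count:>7} x {dim:<4} = {size:>9.0f} KB  full scan: {prefers_full_scan(count, dim)}")
```

With 256-dimensional float vectors, each vector occupies exactly 1 KB, so the default of 10000 corresponds to roughly ten thousand such vectors.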

Use Cases

Semantic Search

Transform text embeddings into powerful search capabilities that understand meaning, not just keywords. Perfect for document search, knowledge bases, and content discovery.

Recommendation Systems

Build sophisticated recommendation engines by finding similar items based on user preferences, product attributes, or behavioral patterns.

Retrieval Augmented Generation (RAG)

Power AI applications by retrieving relevant context from vector databases to enhance LLM responses with accurate, domain-specific information.

RAG Integration Example

Qdrant integrates seamlessly with popular RAG frameworks:

LangChain: Use Qdrant as a memory backend
LlamaIndex: Vector store integration for document indexing
OpenAI ChatGPT: Retrieval plugin for context enhancement
Microsoft Semantic Kernel: Persistent memory integration
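Independently of any framework, the retrieval step of RAG can be sketched in a few lines. In the example below an in-memory list stands in for the vector database, the three-dimensional vectors are hand-written placeholders for real embeddings, and the prompt template is made up.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, documents, top_k=2):
    """Return the top_k documents most similar to the query vector."""
    ranked = sorted(documents, key=lambda d: cosine(query_vec, d["vector"]), reverse=True)
    return ranked[:top_k]


def build_prompt(question, context_docs):
    """Place the retrieved passages in front of the user's question."""
    context = "\n".join(f"- {d['text']}" for d in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


documents = [
    {"text": "Qdrant is written in Rust.", "vector": [0.9, 0.1, 0.0]},
    {"text": "The REST API listens on port 6333.", "vector": [0.1, 0.9, 0.0]},
    {"text": "gRPC is available on port 6334.", "vector": [0.0, 0.8, 0.2]},
]
query_vec = [0.0, 1.0, 0.0]  # stand-in for the embedded question
hits = retrieve(query_vec, documents)
prompt = build_prompt("Which port does the REST API use?", hits)
print(prompt)
```

In a real pipeline the retrieve step would be a search against a collection, and the resulting prompt would be passed to the language model.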

Additional Applications

Chat Bots: Context-aware conversational AI with memory
Matching Engines: Find similar users, products, or content
Anomaly Detection: Identify outliers in high-dimensional data
Image Search: Visual similarity search for photos and media
Extreme Classification: Multi-class, multi-label problems with millions of labels

Architecture Overview

Core Components

Collections

Named sets of points (vectors with payloads) that share the same vector configuration and indexing parameters.

Points

Individual entries consisting of a vector, unique ID, and optional JSON payload for metadata and filtering.
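The shape of a point can be shown as plain JSON. The sketch below assembles a REST-style upsert body; the helper name, the vectors, and the payload values are hypothetical, and the exact request format should be confirmed against the API reference.

```python
import json


def make_point(point_id, vector, payload=None):
    """Assemble one point: a unique ID, a vector, and an optional payload."""
    point = {"id": point_id, "vector": vector}
    if payload is not None:
        point["payload"] = payload  # arbitrary JSON used for metadata and filtering
    return point


points = [
    make_point(1, [0.05, 0.61, 0.76], {"city": "Berlin"}),
    make_point(2, [0.19, 0.81, 0.75]),  # the payload is optional
]
upsert_body = json.dumps({"points": points})
print(upsert_body)
```

All vectors written to one collection need the dimensionality that the collection was created with.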

Segments

Internal data structures that optimize storage and search. Qdrant automatically manages segment optimization.

Shards

Horizontal partitions of collections enabling distributed deployment and scaling.
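The idea of partitioning a collection can be illustrated by routing each point ID to a shard with a hash. This is a generic sketch using simple modulo hashing, not Qdrant's actual routing logic; production systems typically use schemes such as consistent hashing to limit data movement when the shard count changes.

```python
import hashlib
from collections import Counter


def shard_for(point_id, shard_count):
    """Map a point ID to a shard number in a stable, process-independent way."""
    digest = hashlib.sha256(str(point_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


shard_count = 3  # hypothetical cluster layout
placement = Counter(shard_for(pid, shard_count) for pid in range(1000))
print(dict(placement))
```

Because the hash is deterministic, every node computes the same shard for a given ID, so reads and writes for that point can be sent to the same place.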

Distance Metrics

Qdrant supports multiple distance metrics for vector similarity:

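Dot Product: Inner product of the two vectors; equivalent to cosine similarity when the vectors are normalized
Cosine: Measures the angle between vectors, independent of their magnitude
Euclidean: L2 (straight-line) distance for spatial proximity
Manhattan: L1 distance, the sum of absolute coordinate differences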
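The four metrics can be written out in plain Python to make their differences concrete. This is a reference sketch for small vectors, not how the engine computes them, and the sample vectors are arbitrary.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def normalize(v):
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]


a, b = [3.0, 4.0], [4.0, 3.0]
print("dot      ", dot(a, b))
print("cosine   ", cosine(a, b))
print("euclidean", euclidean(a, b))
print("manhattan", manhattan(a, b))
# On unit-length vectors the dot product and cosine similarity coincide
print("dot (normalized)", dot(normalize(a), normalize(b)))
```

Dot product and cosine are similarities (higher means closer), whereas Euclidean and Manhattan are distances (lower means closer).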

API Interfaces

REST API

Full-featured HTTP API with OpenAPI 3.0 specification:

Port: 6333 (default)
Documentation: https://api.qdrant.tech/
Format: JSON request/response

gRPC API

High-performance binary protocol for production workloads:

Port: 6334 (default, optional)
Use Case: Faster searches with lower latency
Features: Streaming support, efficient serialization

By default, Qdrant starts without authentication. Always enable api_key in production and use TLS encryption.
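A request against a secured instance might be prepared as follows, using only the Python standard library. The host name and key are placeholders, the helper is hypothetical, and the sketch assumes the key is passed in an api-key header to an HTTPS endpoint; the request is only built here, not sent.

```python
import json
import urllib.request


def build_request(base_url, path, api_key=None, body=None, method="GET"):
    """Prepare a JSON request, attaching the API key header when one is given."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["api-key"] = api_key
    data = json.dumps(body).encode("utf-8") if body is not None else None
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Placeholder host and key; https:// assumes TLS is enabled on the server
req = build_request("https://qdrant.example.com:6333", "/collections", api_key="my-secret-key")
print(req.get_method(), req.full_url)

# To send it against a running instance:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Keeping the key out of source code, for example by reading it from an environment variable, is advisable in real deployments.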

Integrations

Qdrant integrates with the most popular AI and ML frameworks:

Cohere: Embeddings integration
Haystack: Document store backend
LangChain: Vector memory backend
LlamaIndex: Vector store integration
OpenAI: ChatGPT retrieval plugin
Microsoft Semantic Kernel: Persistent memory

Getting Started

The fastest way to get started with Qdrant is using Docker:

docker run -p 6333:6333 qdrant/qdrant

Then connect with any client library:

from qdrant_client import QdrantClient

qdrant = QdrantClient("http://localhost:6333")

For local development and testing, the Python client offers an in-memory mode: QdrantClient(":memory:") or file-based mode: QdrantClient(path="path/to/db").

Official Client Libraries

Qdrant provides official client libraries for multiple programming languages.

Community clients are also available for Elixir, PHP, Ruby, and more.

Cloud Deployment

Qdrant is available as a fully managed cloud service at Qdrant Cloud, including a free tier for development and testing.

License

Qdrant is open-source software licensed under the Apache License, Version 2.0.

Next Steps

Quick Start

Get Qdrant running in minutes with a practical example

Installation

Explore all installation methods and deployment options
