Prerequisites
Recommended Server Specifications
| Workload | Vectors | RAM | Storage | RamNode Plan |
|---|---|---|---|---|
| Dev / Testing | < 100K | 2 GB | 20 GB NVMe | Cloud 2 GB |
| Small Production | 100K–1M | 4 GB | 40 GB NVMe | Cloud 4 GB |
| Medium Production | 1M–10M | 8–16 GB | 80–160 GB | Cloud 8–16 GB |
| Large / Clustered | 10M+ | 32 GB+ | 320 GB+ | Multiple nodes |
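The table gives no vector dimension, so it can help to check a plan against your own embedding size. The sketch below applies a common rule of thumb for float32 vectors held in RAM: vectors × dimensions × 4 bytes × 1.5 for index and metadata overhead. The 1.5 factor is an assumption rather than a guarantee, and `estimate_ram_mib` is a hypothetical helper name.

```shell
# Rough RAM estimate in MiB for float32 vectors kept fully in memory.
# Assumption: vectors * dimensions * 4 bytes, plus 50% overhead.
estimate_ram_mib() {
    vectors=$1
    dims=$2
    # Multiply by 3 and divide by 2 to apply the 1.5x factor with integer math.
    echo $(( vectors * dims * 4 * 3 / 2 / 1048576 ))
}
```

For example, `estimate_ram_mib 1000000 1536` prints 8789 (roughly 8.6 GiB), while `estimate_ram_mib 100000 384` prints 219. Large embeddings can therefore outgrow a tier sooner than the vector count alone suggests, unless on-disk storage or quantization is used.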
Software Requirements
- Ubuntu 22.04 LTS or 24.04 LTS (recommended)
- Docker Engine v24.0+ with Docker Compose v2
- Alternatively: standalone binary installation (covered below)
- Domain name (optional, for TLS/HTTPS access)
- UFW or iptables configured
Provision Your RamNode VPS
- Log in to the RamNode client area at ramnode.com
- Select a Cloud VPS plan that meets your workload requirements
- Choose your preferred data center location (New York, Atlanta, Los Angeles, Seattle, or Netherlands)
- Select Ubuntu 22.04 LTS or 24.04 LTS as your operating system
- Complete the order and note your server IP address
ssh root@YOUR_SERVER_IP
Initial Server Setup
apt update && apt upgrade -y
# jq is included because the backup and upgrade steps later in this guide use it
apt install -y curl wget gnupg2 software-properties-common \
  apt-transport-https ca-certificates ufw jq
Create a Dedicated User
adduser qdrant
usermod -aG sudo qdrant
su - qdrant
Configure the Firewall
Qdrant uses port 6333 for its REST API and port 6334 for gRPC.
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow ssh
sudo ufw allow 6333/tcp # Qdrant REST API
sudo ufw allow 6334/tcp # Qdrant gRPC
sudo ufw enable
Security Note: If your Qdrant instance will only be accessed by application servers on the same private network, restrict ports 6333 and 6334 to internal IPs only. For public-facing deployments, always enable API key authentication and TLS.
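One way to apply the restriction described in the note is sketched below. The function only prints the UFW commands so they can be reviewed before running; `qdrant_ufw_rules` is a hypothetical helper name, and the network `10.0.0.0/24` in the usage example is a placeholder for your own private range.

```shell
# Prints the ufw commands that replace the open 6333/6334 rules with
# rules limited to a single source network. Nothing is executed here.
qdrant_ufw_rules() {
    cidr=$1
    for port in 6333 6334; do
        echo "sudo ufw delete allow ${port}/tcp"
        echo "sudo ufw allow from ${cidr} to any port ${port} proto tcp"
    done
}
```

Running `qdrant_ufw_rules 10.0.0.0/24` prints four commands, two per port. Note that Docker publishes container ports through its own iptables rules, which UFW does not filter, so for the Docker deployment it is safer to also bind the published ports to a private or loopback address in the ports mapping.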
Install Docker
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker qdrant
newgrp docker
docker --version
docker compose version
Deploy Qdrant with Docker Compose
mkdir -p ~/qdrant && cd ~/qdrant
mkdir -p qdrant_storage qdrant_snapshots
Save the following as docker-compose.yml:
version: "3.8"
services:
  qdrant:
    image: qdrant/qdrant:latest
    container_name: qdrant
    restart: unless-stopped
    ports:
      - "6333:6333" # REST API
      - "6334:6334" # gRPC
    volumes:
      - ./qdrant_storage:/qdrant/storage
      - ./qdrant_snapshots:/qdrant/snapshots
      - ./config.yaml:/qdrant/config/production.yaml
    environment:
      - QDRANT__SERVICE__API_KEY=${QDRANT_API_KEY}
    deploy:
      resources:
        limits:
          memory: 4G # Adjust to your plan
Qdrant Configuration File
Save the following as config.yaml in the same directory:
storage:
  storage_path: /qdrant/storage
  snapshots_path: /qdrant/snapshots
  optimizers:
    default_segment_number: 2
    memmap_threshold_kb: 20000
    indexing_threshold_kb: 10000
service:
  host: 0.0.0.0
  http_port: 6333
  grpc_port: 6334
  enable_cors: true
log_level: INFO
Set API Key and Launch
# Generate a secure API key
export QDRANT_API_KEY=$(openssl rand -hex 32)
echo "QDRANT_API_KEY=$QDRANT_API_KEY" > .env
chmod 600 .env # the key is a secret; keep the file private
echo "Save this key securely: $QDRANT_API_KEY"
# Start Qdrant
docker compose up -d
# Follow the logs to confirm startup (press Ctrl+C to stop following)
docker compose logs -f qdrant
Verify the Deployment
# Check service health
curl -s -H "api-key: $QDRANT_API_KEY" \
  http://localhost:6333/healthz
# List collections
curl -s -H "api-key: $QDRANT_API_KEY" \
  http://localhost:6333/collections
Alternative: Standalone Binary Installation
If you prefer not to use Docker, Qdrant can be installed directly from its official releases.
# Download the latest release
QDRANT_VERSION=$(curl -s https://api.github.com/repos/qdrant/qdrant/releases/latest \
| grep tag_name | cut -d '"' -f 4)
wget https://github.com/qdrant/qdrant/releases/download/${QDRANT_VERSION}/qdrant-x86_64-unknown-linux-gnu.tar.gz
# Extract and install
tar -xzf qdrant-x86_64-unknown-linux-gnu.tar.gz
sudo mv qdrant /usr/local/bin/
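sudo chmod +x /usr/local/bin/qdrant
Create /etc/systemd/system/qdrant.service. The unit expects a config.yaml at /home/qdrant/config.yaml whose storage_path and snapshots_path point to host directories the qdrant user can write to; the /qdrant/... paths from the Docker setup exist only inside the container.
[Unit]
Description=Qdrant Vector Database
After=network.target
[Service]
Type=simple
User=qdrant
WorkingDirectory=/home/qdrant
ExecStart=/usr/local/bin/qdrant --config-path /home/qdrant/config.yaml
Restart=on-failure
RestartSec=5
LimitNOFILE=65535
[Install]
WantedBy=multi-user.target
sudo systemctl daemon-reload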
sudo systemctl enable qdrant
sudo systemctl start qdrant
sudo systemctl status qdrant
Enable TLS with Let's Encrypt
sudo apt install -y nginx certbot python3-certbot-nginx
Create /etc/nginx/sites-available/qdrant:
server {
    listen 80;
    server_name qdrant.yourdomain.com;
    location / {
        proxy_pass http://127.0.0.1:6333;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_read_timeout 300;
        proxy_send_timeout 300;
    }
}
sudo ln -s /etc/nginx/sites-available/qdrant /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx
# Obtain TLS certificate
sudo certbot --nginx -d qdrant.yourdomain.com
# Update firewall for HTTPS
sudo ufw allow 'Nginx Full'
Performance Tuning
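Kernel Parameters
Add these settings to /etc/sysctl.conf or to a .conf file under /etc/sysctl.d/, then reload them:
vm.max_map_count = 262144
vm.swappiness = 10
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
sudo sysctl --system
On-Disk Storage for Large Collections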
For collections larger than available RAM, enable on-disk storage. On RamNode's NVMe storage, mmap performance is excellent.
curl -X PUT http://localhost:6333/collections/my_collection \
-H "api-key: $QDRANT_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"vectors": {
"size": 1536,
"distance": "Cosine",
"on_disk": true
},
"optimizers_config": {
"memmap_threshold": 20000
}
}'
Scalar Quantization
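Scalar quantization stores each float32 vector component as an int8, cutting vector memory by up to 4x with minimal accuracy loss, so you can fit significantly more vectors on a smaller RamNode plan. The RAM saving applies when the original vectors are kept on disk (on_disk: true) and only the quantized copies stay in memory (always_ram: true).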
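The "up to 4x" figure follows from the per-component size: 4 bytes for float32 versus 1 byte for int8. A minimal sketch of that arithmetic, covering raw vector data only and ignoring index overhead (`vector_mib` is a hypothetical helper name):

```shell
# Raw vector storage in MiB for a given count, dimension,
# and number of bytes per vector component.
vector_mib() {
    echo $(( $1 * $2 * $3 / 1048576 ))
}
```

For 1M vectors of 1536 dimensions, `vector_mib 1000000 1536 4` prints 5859 and `vector_mib 1000000 1536 1` prints 1464, which is where the fourfold reduction comes from.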
# PUT returns an error if my_collection already exists; delete it first or pick another name
curl -X PUT http://localhost:6333/collections/my_collection \
  -H "api-key: $QDRANT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "vectors": {
      "size": 1536,
      "distance": "Cosine",
      "on_disk": true
    },
    "quantization_config": {
      "scalar": {
        "type": "int8",
        "always_ram": true
      }
    }
  }'
Backup & Recovery
Qdrant supports snapshot-based backups at the collection level. Automate nightly backups with a cron job.
Save the following as ~/qdrant/backup.sh:
#!/bin/bash
# Stop on the first error, on unset variables, and on failures inside pipelines
set -euo pipefail
# Read the API key written to .env during deployment
API_KEY=$(grep '^QDRANT_API_KEY=' /home/qdrant/qdrant/.env | cut -d= -f2)
BACKUP_DIR=/home/qdrant/backups/$(date +%Y-%m-%d)
mkdir -p "$BACKUP_DIR"
# Get all collection names
COLLECTIONS=$(curl -s -H "api-key: $API_KEY" \
  http://localhost:6333/collections | jq -r '.result.collections[].name')
# Snapshot each collection
for COLLECTION in $COLLECTIONS; do
  echo "Backing up: $COLLECTION"
  curl -s -X POST -H "api-key: $API_KEY" \
    "http://localhost:6333/collections/$COLLECTION/snapshots"
done
# Copy snapshots to backup directory (the trailing /. also works when the directory is empty)
cp -r /home/qdrant/qdrant/qdrant_snapshots/. "$BACKUP_DIR"/
# Retain only the last 7 days of dated backup directories
find /home/qdrant/backups -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} +
echo "Backup complete: $BACKUP_DIR"
chmod +x ~/qdrant/backup.sh
# Create a log file the qdrant user can write to
sudo touch /var/log/qdrant-backup.log
sudo chown qdrant:qdrant /var/log/qdrant-backup.log
# Add to crontab (runs daily at 2 AM)
# crontab -e, then add:
0 2 * * * /home/qdrant/qdrant/backup.sh >> /var/log/qdrant-backup.log 2>&1
Restore from Snapshot
curl -X PUT \
"http://localhost:6333/collections/my_collection/snapshots/recover" \
-H "api-key: $QDRANT_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"location": "file:///qdrant/snapshots/my_collection/snapshot_name.snapshot"
}'
Monitoring & Health Checks
Qdrant exposes a Prometheus-compatible metrics endpoint that integrates with standard monitoring stacks.
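Access Metrics
Metrics are served at /metrics on the REST port (6333) and need no extra configuration. Port 6335 is reserved for communication between cluster nodes, not for metrics. When an API key is set, pass it with the request:
curl -s -H "api-key: $QDRANT_API_KEY" http://localhost:6333/metrics
Key Metrics to Monitor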
| Metric | Description |
|---|---|
| app_info | Qdrant name and version (exposed as labels) |
| collections_total | Number of collections |
| collections_vector_total | Total vectors stored across all collections |
| rest_responses_duration_seconds | REST API response latency histogram (derive P50, P95, P99 in Prometheus) |
| grpc_responses_duration_seconds | gRPC response latency histogram |
| memory_resident_bytes | Resident memory used by the process |
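For a quick check without a full Prometheus stack, a single value can be pulled out of the metrics text. The sketch below assumes the standard Prometheus text exposition format on stdin; `metric_value` is a hypothetical helper, and it returns only the first sample when a metric has several label sets.

```shell
# Reads Prometheus text format on stdin and prints the value of the first
# sample whose metric name matches exactly (labels are ignored).
metric_value() {
    awk -v name="$1" '
        /^#/ { next }                  # skip HELP and TYPE comment lines
        {
            metric = $1
            sub(/\{.*/, "", metric)    # drop any {label="..."} suffix
            if (metric == name) { print $NF; exit }
        }
    '
}
```

Usage, with `METRICS_URL` standing in for your metrics endpoint: `curl -s -H "api-key: $QDRANT_API_KEY" "$METRICS_URL" | metric_value collections_total`.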
Quick Start: Your First Collection
This example uses 1536-dimensional vectors, matching OpenAI's text-embedding-ada-002 output.
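The vectors in the requests in this section are abbreviated with "...", which is not valid JSON, so they cannot be pasted as-is. For a smoke test, a full-length placeholder vector can be generated as sketched here. `make_vector` is a hypothetical helper, and random values are only useful for checking that the API works, not for meaningful search results.

```shell
# Prints a JSON array of N pseudo-random floats in [0, 1).
make_vector() {
    LC_ALL=C awk -v n="$1" 'BEGIN {
        srand()
        printf "["
        for (i = 1; i <= n; i++) {
            printf "%.4f%s", rand(), (i < n ? "," : "")
        }
        print "]"
    }'
}
```

The output can be spliced into a request body with double quotes, for example `-d "{\"query\": $(make_vector 1536), \"limit\": 5}"`.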
curl -X PUT http://localhost:6333/collections/documents \
-H "api-key: $QDRANT_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"vectors": {
"size": 1536,
"distance": "Cosine"
}
}'
# Upsert points ("..." marks truncated values; real vectors need all 1536 components)
curl -X PUT http://localhost:6333/collections/documents/points \
-H "api-key: $QDRANT_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"points": [
{
"id": 1,
"vector": [0.05, 0.61, 0.76, ...],
"payload": {"title": "Getting Started with RAG", "source": "blog"}
},
{
"id": 2,
"vector": [0.19, 0.81, 0.75, ...],
"payload": {"title": "Vector Search Explained", "source": "docs"}
}
]
}'
# Query for the 5 nearest points
curl -X POST http://localhost:6333/collections/documents/points/query \
-H "api-key: $QDRANT_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"query": [0.2, 0.1, 0.9, ...],
"limit": 5,
"with_payload": true
}'
Python SDK: For application integration, install the Qdrant Python client with pip install qdrant-client. The client supports both REST and gRPC transports; gRPC generally has lower per-request overhead, which matters most for high-throughput workloads.
Troubleshooting
| Issue | Solution |
|---|---|
| Port 6333 not reachable | Check UFW rules with sudo ufw status. Ensure Docker published ports match. Test with curl localhost:6333 from the server itself. |
| Out of memory errors | Enable on_disk vectors, configure quantization, or upgrade to a higher-memory RamNode plan. Reduce default_segment_number in config.yaml. |
| Slow search performance | Verify the HNSW index is built (segments smaller than indexing_threshold_kb are searched without an index, so lower the threshold if segments stay unindexed), and ensure vectors fit in RAM or mmap is enabled. |
| Container won't start | Check docker compose logs qdrant for errors. Verify volume permissions: ls -la qdrant_storage should be owned by UID 1000. |
| TLS certificate issues | Run sudo certbot renew --dry-run. Verify Nginx config with sudo nginx -t. Check DNS resolution. |
| Snapshot restore fails | Ensure the snapshot file path is accessible inside the container. Use the file:// prefix for local snapshot paths. |
Upgrading Qdrant
cd ~/qdrant
docker compose pull
docker compose down
docker compose up -d
# Verify the new version
curl -s -H "api-key: $QDRANT_API_KEY" http://localhost:6333 | jq .version
Always create a full snapshot backup before upgrading. Review the Qdrant release notes for any breaking changes or required migration steps between major versions.
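The upgrade steps can be wrapped with two small checks: whether the version change crosses a major version, and whether the service answers again after the restart. This is a minimal sketch. The helper names `major_of`, `is_major_upgrade`, and `wait_for_qdrant` are hypothetical, and the 30-second default timeout is an assumption.

```shell
# Prints the major component of a version string such as "1.9.2" or "v1.9.2".
major_of() {
    v=${1#v}
    echo "${v%%.*}"
}

# Succeeds when the two versions differ in their major component,
# which is the case where the release notes deserve a careful read.
is_major_upgrade() {
    [ "$(major_of "$1")" != "$(major_of "$2")" ]
}

# Polls a health URL once per second until it answers or the tries run out.
wait_for_qdrant() {
    url=${1:-http://localhost:6333/healthz}
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        if curl -fsS "$url" >/dev/null 2>&1; then
            return 0
        fi
        tries=$(( tries - 1 ))
        sleep 1
    done
    return 1
}
```

A typical flow records the running version with the `jq -r .version` query before `docker compose pull`, runs `wait_for_qdrant` after `docker compose up -d`, and then passes the old and new versions to `is_major_upgrade` to decide whether a closer look at the release notes is needed.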
Qdrant Deployed Successfully!
Your production-ready vector database is now running. Integrate with your AI applications using the REST API, gRPC, or Python SDK. Access the built-in web UI at http://your-server:6333/dashboard to visually manage collections and run test queries.
