Small Datum: Vector indexes, large server, dbpedia-openai dataset: MariaDB, Qdrant and pgvector

My previous post has results for MariaDB and pgvector on the dbpedia-openai dataset. This post adds results from Qdrant. This uses ann-benchmarks to compare MariaDB, Qdrant and Postgres (pgvector) with a larger dataset, dbpedia-openai at 500k rows. The dataset has 1536 dimensions and uses angular (cosine) as the distance metric. This work was done by Small Datum LLC and sponsored by the MariaDB Corporation.

tl;dr

I am new to Qdrant so the chance that I made a mistake are larger than for MariaDB or Postgres
If you already run MariaDB or Postgres then I suggest you also use them for vector indexes
MariaDB usually gets ~2X more QPS than pgvector and ~1.5 more than Qdrant

Editorial

I have a bias. I am skeptical that you should deploy a new DBMS to support but one datatype (vectors) unless either you have no other DBMS in production or your production DBMS does not support vector indexing.

Production is expensive -- you have to worry about security, backups, operational support
A new DBMS is expensive -- you have to spend time to learn how to use it

My initial response to Qdrant is that the new developer experience isn't very good. This can be fixed, but right now the product is complicated (has many features), configuration is complicated (also true for the DBMS I know, but I already paid that price), and the cognitive load is large. Just one example of the cognitive load is the need to learn the names that Qdrant uses for things that already have well-known names in the SQL DBMS world.

Deploying Qdrant

The more DBMS you include in one benchmark, the more likely you are to make a mistake because you lack expertise in all of those DBMS. I will soon learn whether I made a mistake here but I made a good faith effort to get good results from Qdrant.

I first tried to compile from source. But that failed. The docs state that The current list of required libraries can be found in the Dockerfile and while I was able to figure that out, I prefer that they just list the dependencies. Alas, my attempts to compile from source failed with error messages about problems with (gRPC) protocol definitions.

So I decided to try the Docker container they provide. I ended up not changing the Qdrant configuration provided in the Docker container. I spent some time doing performance debugging and didn't see anything to indicate that a config change was needed. For example, I didn't see disk IO during queries. But the performance debugging was harder because that Docker container image doesn't come with my favorite debug tools installed. Some of the tools were easy to install, others (perf) were not.

Benchmark

This post has much more detail about my approach in general. I ran the benchmark for 1 session. I use ann-benchmarks via my fork of a fork of a fork at this commit.

The ann-benchmarks config files are here for MariaDB, Postgres and Qdrant. For Postgres I use the values for M and ef_construction. But MariaDB doesn't support ef_construction so I only specify the M values. While pgvector requires ef_construction to be >= 2*M, I do not know whether Qdrant has a similar requirement. Regardless I only test cases where that constraint is true.

Some quantization was used

MariaDB uses 16-bit integers rather than float32
pgvector uses float32, pgvector halfvec uses float16
For Qdrant I used none (float32) and scalar (int8)

I used a larger server (Hetzner ax162-s) with 48 cores, 128G of RAM, Ubuntu 22.04 and HW RAID 10 using 2 NVMe devices. The database configuration files are here for MariaDB and for Postgres. Output from the benchmark is here.

I had ps and vmstat running during the benchmark and confirmed there weren't storage reads as the table and index were cached by MariaDB and Postgres.

The command lines to run the benchmark using my helper scripts are:

bash rall.batch.sh v1 dbpedia-openai-500k-angular c32r128

Results: QPS vs recall

These charts show the best QPS for a given recall. MariaDB gets more QPS than Qdrant and pgvector but that is harder to see as the recall approaches 1, so the next section has a table for best QPS per DBMS at a given recall.

Results: create index

The database configs for Postgres and MariaDB are shared above, and parallel index create is disabled by the config for Postgres and not supported yet by MariaDB. The summary is:

index sizes are similar between MariaDB and pgvector with halfvec
time to create the index varies a lot and it is better to consider this in the context of recall which is done in next section. But Qdrant creates indexes a lot faster than MariaDB or pgvector.
I did not find an accurate way to determine index size for Qdrant. There is a default method in ann-benchmarks that a DBMS can override. The default just compares process RSS before and after creating an index which isn't accurate for small indexes. The MariaDB and Postgres code override the default and query the data dictionary to get a more accurate estimate.

The max time to create an index for MariaDB and Postgres exceeds 10,000 seconds on this dataset when M and/or ef_construction are large. The max time for Qdrant was <= 400 seconds for no quantization and <= 300 seconds for scalar quantization. This is excellent. But I wonder if things Qdrant does (or doesn't do) to save time during create index contributes to making queries slower because MariaDB has much better QPS.

More details on index size and index create time for MariaDB and Postgres are in my previous post.

Results: best QPS for a given recall

Many benchmark results are marketed via peak performance (max throughput or min response time) but these are usually constrained optimization problems -- determine peak performance that satisfies some SLA. And the SLA might be response time or efficiency (cost).

With ann-benchmarks the constraint is recall. Below I share the best QPS for a given recall target along with the configuration parameters (M, ef_construction, ef_search) at which that occurs for each of the algorithms -- MariaDB, pgvector with float32 and float16/halfvec, Qdrant with no and scalar quantization.

Summary

Qdrant with scalar quantization does not get a result for recall=1.0 for the values of M, ef_construction and ef_search I used
MariaDB usually gets ~2X more QPS than pgvector and ~1.5 more than Qdrant
Index create time was much less for Qdrant (described above)

Legend:

recall, QPS - best QPS at that recall
rel2ma - (QPS for me / QPS for MariaDB)
m= is the value for M when creating the index
ef_cons= is the value for ef_construction when creating the index
ef_search= is the value for ef_search when running queries
quant= is the quantization used by Qdrant
dbms

MariaDB - MariaDB, there is no option for quantization
PGVector - Postgres with pgvector and float32
PGVector_halfvec - Postgres with pgvector and halfvec (float16)
Qdrant(..., quant=none) - Qdrant with no quantization
Qdrant(..., quant=scalar) - Qdrant with scalar quantization

MariaDB gets more QPS than a DBMS when rel2ma is less than 1.0 and when rel2ma is 0.5 then MariaDB gets 2X more QPS. Below, the rel2ma values are always much less than 1.0 except in the first group of results for recall = 1.0.

Best QPS with recall = 1.000

recall QPS rel2ma

1.000 18.3 1.00 MariaDB(m=32, ef_search=200)

1.000 49.4 2.70 PGVector(m=64, ef_construct=256, ef_search=400)

1.000 56.4 3.08 PGVector_halfvec(m=64, ef_construct=256, ef_search=400)

1.000 153.9 8.41 Qdrant(m=32, ef_construct=256, quant=none, hnsw_ef=400)

Best QPS with recall >= 0.99

recall QPS rel2ma

0.993 861 1.00 MariaDB(m=24, ef_search=10)

0.991 370 0.43 PGVector(m=16, ef_construct=256, ef_search=80)

0.990 422 0.49 PGVector_halfvec(m=16, ef_construct=192, ef_search=80)

0.990 572 0.66 Qdrant(m=32, ef_construct=256, quant=none, hnsw_ef=40)

0.990 764 0.89 Qdrant(m=48, ef_construct=192, quant=scalar, hnsw_ef=40)

Best QPS with recall >= 0.98

recall QPS rel2ma

0.983 1273 1.00 MariaDB(m=16, ef_search=10)

0.981 492 0.39 PGVector(m=32, ef_construct=192, ef_search=30)

0.982 545 0.43 PGVector_halfvec(m=32, ef_construct=192, ef_search=30)

0.981 713 0.56 Qdrant(m=16, ef_construct=192, quant=none, hnsw_ef=40)

0.980 895 0.70 Qdrant(m=16, ef_construct=256, quant=scalar, hnsw_ef=40)

Best QPS with recall >= 0.97

recall QPS rel2ma

0.983 1273 1.00 MariaDB(m=16, ef_search=10)

0.971 635 0.50 PGVector(m=32, ef_construct=192, ef_search=20)

0.971 724 0.57 PGVector_halfvec(m=32, ef_construct=192, ef_search=20)

0.972 782 0.61 Qdrant(m=16, ef_construct=192, quant=none, hnsw_ef=30)

0.970 982 0.77 Qdrant(m=16, ef_construct=192, quant=scalar, hnsw_ef=30)

Best QPS with recall >= 0.96

recall QPS rel2ma

0.969 1602 1.00 MariaDB(m=12, ef_search=10)

0.965 762 0.48 PGVector(m=16, ef_construct=192, ef_search=30)

0.964 835 0.52 PGVector_halfvec(m=16, ef_construct=192, ef_search=30)

0.963 811 0.51 Qdrant(m=16, ef_construct=96, quant=none, hnsw_ef=30)

0.961 996 0.62 Qdrant(m=16, ef_construct=96, quant=scalar, hnsw_ef=30)

Best QPS with recall >= 0.95

recall QPS rel2ma

0.969 1602 1.00 MariaDB(m=12, ef_search=10)

0.954 802 0.50 PGVector(m=16, ef_construct=96, ef_search=30)

0.955 880 0.55 PGVector_halfvec(m=16, ef_construct=96, ef_search=30)

0.954 869 0.54 Qdrant(m=8, ef_construct=256, quant=none, hnsw_ef=40)

0.950 1060 0.66 Qdrant(m=16, ef_construct=192, quant=scalar, hnsw_ef=20)

2 comments:

AnonymousMarch 1, 2025 at 4:40 AM
Are you interested in adding https://milvus.io into the comparison? Feel free to email me Jiang.chen@zilliz.com and I’m happy to help on the set up!

We also open sourced a benchmark for more production-level test cases (search while doing ingestion, up to billion vector scale, taking machine cost into consideration etc). Maybe you will find that useful: https://github.com/zilliztech/VectorDBBench

Small Datum

Saturday, February 15, 2025

Vector indexes, large server, dbpedia-openai dataset: MariaDB, Qdrant and pgvector

2 comments:

Postgres 18 beta2: large server, Insert Benchmark, part 2