Saturday, February 15, 2025

Vector indexes, large server, dbpedia-openai dataset: MariaDB, Qdrant and pgvector

My previous post has results for MariaDB and pgvector on the dbpedia-openai dataset. This post adds results from Qdrant. This uses ann-benchmarks to compare MariaDB, Qdrant and Postgres (pgvector) with a larger dataset, dbpedia-openai at 500k rows. The dataset has 1536 dimensions and uses angular (cosine) as the distance metric. This work was done by Small Datum LLC and sponsored by the MariaDB Corporation.

tl;dr

  • I am new to Qdrant so the chance that I made a mistake are larger than for MariaDB or Postgres
  • If you already run MariaDB or Postgres then I suggest you also use them for vector indexes
  • MariaDB usually gets ~2X more QPS than pgvector and ~1.5 more than Qdrant

Editorial

I have a bias. I am skeptical that you should deploy a new DBMS to support but one datatype (vectors) unless either you have no other DBMS in production or your production DBMS does not support vector indexing. 
  • Production is expensive -- you have to worry about security, backups, operational support
  • A new DBMS is expensive -- you have to spend time to learn how to use it
My initial response to Qdrant is that the new developer experience isn't very good. This can be fixed, but right now the product is complicated (has many features), configuration is complicated (also true for the DBMS I know, but I already paid that price), and the cognitive load is large. Just one example of the cognitive load is the need to learn the names that Qdrant uses for things that already have well-known names in the SQL DBMS world.

Deploying Qdrant

The more DBMS you include in one benchmark, the more likely you are to make a mistake because you lack expertise in all of those DBMS. I will soon learn whether I made a mistake here but I made a good faith effort to get good results from Qdrant.

I first tried to compile from source. But that failed. The docs state that The current list of required libraries can be found in the Dockerfile and while I was able to figure that out, I prefer that they just list the dependencies. Alas, my attempts to compile from source failed with error messages about problems with (gRPC) protocol definitions.

So I decided to try the Docker container they provide. I ended up not changing the Qdrant configuration provided in the Docker container. I spent some time doing performance debugging and didn't see anything to indicate that a config change was needed. For example, I didn't see disk IO during queries. But the performance debugging was harder because that Docker container image doesn't come with my favorite debug tools installed. Some of the tools were easy to install, others (perf) were not.

Benchmark

This post has much more detail about my approach in general. I ran the benchmark for 1 session. I use ann-benchmarks via my fork of a fork of a fork at this commit

The ann-benchmarks config files are here for MariaDB, Postgres and Qdrant. For Postgres I use the values for M and ef_construction. But MariaDB doesn't support ef_construction so I only specify the M values. While pgvector requires ef_construction to be >= 2*M, I do not know whether Qdrant has a similar requirement. Regardless I only test cases where that constraint is true.

Some quantization was used
  • MariaDB uses 16-bit integers rather than float32
  • pgvector uses float32, pgvector halfvec uses float16
  • For Qdrant I used none (float32) and scalar (int8)
I used a larger server (Hetzner ax162-s) with 48 cores, 128G of RAM, Ubuntu 22.04 and HW RAID 10 using 2 NVMe devices. The database configuration files are here for MariaDB and for Postgres. Output from the benchmark is here.

I had ps and vmstat running during the benchmark and confirmed there weren't storage reads as the table and index were cached by MariaDB and Postgres.

The command lines to run the benchmark using my helper scripts are:
    bash rall.batch.sh v1 dbpedia-openai-500k-angular c32r128

Results: QPS vs recall

These charts show the best QPS for a given recall. MariaDB gets more QPS than Qdrant and pgvector but that is harder to see as the recall approaches 1, so the next section has a table for best QPS per DBMS at a given recall.

Results: create index

The database configs for Postgres and MariaDB are shared above, and parallel index create is disabled by the config for Postgres and not supported yet by MariaDB. The summary is:
  • index sizes are similar between MariaDB and pgvector with halfvec
  • time to create the index varies a lot and it is better to consider this in the context of recall which is done in next section. But Qdrant creates indexes a lot faster than MariaDB or pgvector.
  • I did not find an accurate way to determine index size for Qdrant. There is a default method in ann-benchmarks that a DBMS can override. The default just compares process RSS before and after creating an index which isn't accurate for small indexes. The MariaDB and Postgres code override the default and query the data dictionary to get a more accurate estimate.
The max time to create an index for MariaDB and Postgres exceeds 10,000 seconds on this dataset when M and/or ef_construction are large. The max time for Qdrant was <= 400 seconds for no quantization and <= 300 seconds for scalar quantization. This is excellent. But I wonder if things Qdrant does (or doesn't do) to save time during create index contributes to making queries slower because MariaDB has much better QPS.

More details on index size and index create time for MariaDB and Postgres are in my previous post.

Results: best QPS for a given recall

Many benchmark results are marketed via peak performance (max throughput or min response time) but these are usually constrained optimization problems -- determine peak performance that satisfies some SLA. And the SLA might be response time or efficiency (cost).

With ann-benchmarks the constraint is recall. Below I share the best QPS for a given recall target along with the configuration parameters (M, ef_construction, ef_search) at which that occurs for each of the algorithms -- MariaDB, pgvector with float32 and float16/halfvec, Qdrant with no and scalar quantization.

Summary
  • Qdrant with scalar quantization does not get a result for recall=1.0 for the values of M, ef_construction and ef_search I used
  • MariaDB usually gets ~2X more QPS than pgvector and ~1.5 more than Qdrant
  • Index create time was much less for Qdrant (described above)
Legend:
  • recall, QPS - best QPS at that recall
  • rel2ma - (QPS for me / QPS for MariaDB)
  • m= is the value for M when creating the index
  • ef_cons= is the value for ef_construction when creating the index
  • ef_search= is the value for ef_search when running queries
  • quant= is the quantization used by Qdrant
  • dbms
    • MariaDB - MariaDB, there is no option for quantization
    • PGVector - Postgres with pgvector and float32
    • PGVector_halfvec - Postgres with pgvector and halfvec (float16)
    • Qdrant(..., quant=none) - Qdrant with no quantization
    • Qdrant(..., quant=scalar) - Qdrant with scalar quantization
MariaDB gets more QPS than a DBMS when rel2ma is less than 1.0 and when rel2ma is 0.5 then MariaDB gets 2X more QPS. Below, the rel2ma values are always much less than 1.0 except in the first group of results for recall = 1.0. 

Best QPS with recall = 1.000
recall  QPS     rel2ma
1.000   18.3    1.00    MariaDB(m=32, ef_search=200)
1.000   49.4    2.70    PGVector(m=64, ef_construct=256, ef_search=400)
1.000   56.4    3.08    PGVector_halfvec(m=64, ef_construct=256, ef_search=400)
1.000  153.9    8.41    Qdrant(m=32, ef_construct=256, quant=none, hnsw_ef=400)

Best QPS with recall >= 0.99
recall  QPS     rel2ma
0.993   861     1.00    MariaDB(m=24, ef_search=10)
0.991   370     0.43    PGVector(m=16, ef_construct=256, ef_search=80)
0.990   422     0.49    PGVector_halfvec(m=16, ef_construct=192, ef_search=80)
0.990   572     0.66    Qdrant(m=32, ef_construct=256, quant=none, hnsw_ef=40)
0.990   764     0.89    Qdrant(m=48, ef_construct=192, quant=scalar, hnsw_ef=40)

Best QPS with recall >= 0.98
recall  QPS     rel2ma
0.983  1273     1.00    MariaDB(m=16, ef_search=10)
0.981   492     0.39    PGVector(m=32, ef_construct=192, ef_search=30)
0.982   545     0.43    PGVector_halfvec(m=32, ef_construct=192, ef_search=30)
0.981   713     0.56    Qdrant(m=16, ef_construct=192, quant=none, hnsw_ef=40)
0.980   895     0.70    Qdrant(m=16, ef_construct=256, quant=scalar, hnsw_ef=40)

Best QPS with recall >= 0.97
recall  QPS     rel2ma
0.983  1273     1.00    MariaDB(m=16, ef_search=10)
0.971   635     0.50    PGVector(m=32, ef_construct=192, ef_search=20)
0.971   724     0.57    PGVector_halfvec(m=32, ef_construct=192, ef_search=20)
0.972   782     0.61    Qdrant(m=16, ef_construct=192, quant=none, hnsw_ef=30)
0.970   982     0.77    Qdrant(m=16, ef_construct=192, quant=scalar, hnsw_ef=30)

Best QPS with recall >= 0.96
recall  QPS     rel2ma
0.969  1602     1.00    MariaDB(m=12, ef_search=10)
0.965   762     0.48    PGVector(m=16, ef_construct=192, ef_search=30)
0.964   835     0.52    PGVector_halfvec(m=16, ef_construct=192, ef_search=30)
0.963   811     0.51    Qdrant(m=16, ef_construct=96, quant=none, hnsw_ef=30)
0.961   996     0.62    Qdrant(m=16, ef_construct=96, quant=scalar, hnsw_ef=30)

Best QPS with recall >= 0.95
recall  QPS     rel2ma
0.969  1602     1.00    MariaDB(m=12, ef_search=10)
0.954   802     0.50    PGVector(m=16, ef_construct=96, ef_search=30)
0.955   880     0.55    PGVector_halfvec(m=16, ef_construct=96, ef_search=30)
0.954   869     0.54    Qdrant(m=8, ef_construct=256, quant=none, hnsw_ef=40)
0.950  1060     0.66    Qdrant(m=16, ef_construct=192, quant=scalar, hnsw_ef=20)

No comments:

Post a Comment

My database communities

I have been working on databases since 1996. In some cases I just worked on the product (Oracle & Informix), in others I consider myself...