Small Datum: Vector indexes, MariaDB & pgvector, large server, large dataset: part 1

This post has results from ann-benchmarks to compare MariaDB and Postgres with a larger dataset, gist-960-euclidean. Previous posts (here and here) used fashion-mnist-784-euclidean which is a small dataset. By larger I mean by the standards of what is in ann-benchmarks. This dataset has 1M rows and 960 dimensions. The fashion-mnist-784-euclidean dataset has 60,000 rows and 784 dimensions. Both use Euclidean distance. This work was done by Small Datum LLC and sponsored by the MariaDB Corporation.

tl;dr

MariaDB gets between 2.5X and 3.9X more QPS than Postgres for recall >= 0.95

Benchmark

This post has much more detail about my approach in general. I ran the benchmark for 1 session. I use ann-benchmarks via my fork of a fork of a fork at this commit. The ann-benchmarks config files are here for MariaDB and for Postgres.

I used a larger server (Hetzner ax162-s) with 48 cores, 128G of RAM, Ubuntu 22.04 and HW RAID 10 using 2 NVMe devices. The database configuration files are here for MariaDB and for Postgres.

I had ps and vmstat running during the benchmark and confirmed there weren't storage reads as the table and index were cached by MariaDB and Postgres.

The command line to run the benchmark using my helper scripts is:
bash rall.batch.sh v1 gist-960-euclidean c32r128

Results: QPS vs recall

This chart shows the best QPS for a given recall. MariaDB gets ~1.5X more QPS than pgvector at low recall and between 2X and 4X more QPS at high recall.

Results: create index

The database configs for Postgres and MariaDB are shared above, and parallel index create is disabled by the config for Postgres and not supported yet by MariaDB. The summary is:

index sizes are similar between MariaDB and pgvector with halfvec
time to create the index varies a lot and it is better to consider this in the context of recall which is done in next section

Table size is 3906 MB for Postgres and 5292 MB for MariaDB.

Legend

M - value for M when creating the index
cons - value for ef_construction when creating the index
secs - time in seconds to create the index
size(MB) - index size in MB

-- pgvector -- -- pgvector --

-- float32 -- -- halfvec --

M cons secs size(MB) secs size(MB)

8 32 323 3870 292 2568

16 32 610 7811 551 2603

8 64 512 3865 466 2565

16 64 964 7684 869 2561

32 64 1958 7812 1773 2604

8 96 717 3863 646 2564

16 96 1330 7681 1187 2560

32 96 2640 7679 2368 2559

48 96 3990 7812 3606 2606

8 192 1265 3861 1142 2562

16 192 2295 7679 2036 2559

32 192 4361 7678 3880 2559

48 192 6281 7678 5581 2562

64 192 8589 7678 7612 3839

8 256 1607 3861 1448 2562

16 256 2882 7678 2560 2559

32 256 5260 7678 4611 2559

48 256 7678 7678 6713 2561

64 256 9962 7678 8851 3839

mariadb

M secs size(MB)

4 243 2316

5 313 2320

6 439 2316

8 775 2316

12 1878 2316

16 3547 2348

24 8690 2696

32 16172 2696

48 38732 2756

Results: best QPS for a given recall

Many benchmark results are marketed via peak performance (max throughput or min response time) but these are usually constrained optimization problems -- determine peak performance that satisfies some SLA. And the SLA might be response time or efficiency (cost).

With ann-benchmarks the constraint is recall. Below I share the best QPS for a given recall target along with the configuration parameters (M, ef_construction, ef_search) at which that occurs for each of the algorithms (MariaDB, pgvector with float32, pgvector with float16/halfvec).

Summary

Postgres does not get recall=1.0 for the values of M, ef_construction and ef_search I used
Index create time was less for MariaDB in all cases except the result for recall >= 0.96. However, if you care more about index size than peak QPS then it might be better to look at more results per recall level, as in the best 3 results per DBMS rather than the best as I do here.
For a given recall target, MariaDB gets between 2.5X and 3.9X more QPS than Postgres

Legend:

recall, QPS - best QPS at that recall
isecs - time to create the index in seconds
m= - value for M when creating the index
ef_cons= - value for ef_construction when creating the index
ef_search= - value for ef_search when running queries

Best QPS with recall >= 1.000, Postgres did not reach the recall target

recall QPS isecs

1.000 87.1 38732 MariaDB(m=48, ef_search=10)

Best QPS with recall >= 0.99, MariaDB gets 3.7X more QPS than Postgres

recall QPS isecs

0.990 81.1 9962 PGVector(m=64, ef_cons=256, ef_search=120)

0.990 85.4 8851 PGVector_halfvec(m=64, ef_cons=256, ef_search=120)

0.991 311.8 8690 MariaDB(m=24, ef_search=10)

Best QPS with recall >= 0.98,MariaDB gets 3.5X more QPS than Postgres

recall QPS isecs

0.984 101.1 6281 PGVector(m=48, ef_cons=192, ef_search=120)

0.984 109.7 5581 PGVector_halfvec(m=48, ef_cons=192, ef_search=120)

0.985 384.9 3547 MariaDB(m=16, ef_search=20)

Best QPS with recall >= 0.97, MariaDB gets 2.5X more QPS than Postgres

recall QPS isecs

0.973 138.0 5260 PGVector(m=32, ef_cons=256, ef_search=120)

0.971 152.7 3880 PGVector_halfvec(m=32, ef_cons=192, ef_search=120)

0.985 384.9 3547 MariaDB(m=16, ef_search=20)

Best QPS with recall >= 0.96, MariaDB gets 3.9X more QPS than Postgres

recall QPS isecs

0.966 139.8 6281 PGVector(m=48, ef_cons=192, ef_search=80)

0.964 155.4 2368 PGVector_halfvec(m=32, ef_cons=96, ef_search=120)

0.960 610.1 3547 MariaDB(m=16, ef_search=10)

Best QPS with recall >= 0.95, MariaDB gets 2.9X more QPS than Postgres

recall QPS isecs

0.951 190.8 5260 PGVector(m=32, ef_cons=256, ef_search=80)

0.951 208.7 4611 PGVector_halfvec(m=32, ef_cons=256, ef_search=80)

0.960 610.1 3547 MariaDB(m=16, ef_search=10)

Small Datum

Tuesday, January 28, 2025

Vector indexes, MariaDB & pgvector, large server, large dataset: part 1

No comments:

Post a Comment

Postgres 18 beta1: small server, IO-bound Insert Benchmark (v2)