Posts

Showing posts from July, 2020

Sustaining high insert rates despite secondary indexes

If you want to sustain high insert rates in the presence of secondary indexes then there are two approaches at a high level. The first is to buy enough RAM to cache the entire database, but that gets expensive. The second is to avoid storage reads during secondary index maintenance. So now the problem shifts to methods for avoiding storage reads during secondary index maintenance, and there are a few popular ways to do that:
- Partition the database so that only a few of the partitions get inserts at any point in time and then use local (per-partition) secondary indexes. If the partitions are small enough they will fit in memory and then traditional indexing (B-Tree) can be used, and the per-partition secondary indexes will fit in memory for the partitions that are getting inserts. TimescaleDB is an example of this.
- Use an LSM, or something like an LSM, that supports read-free index maintenance for non-unique secondary indexes. MyRocks is an example of that.
- Don't use secondary…
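The read-free case can be sketched in a few lines. This is a minimal model, not the MyRocks implementation: because the secondary index is non-unique, an insert can blindly append the composite entry (secondary key, primary key) without first reading the index, so no storage read is needed. All names here are hypothetical.

```python
# Sketch: read-free maintenance of a non-unique secondary index in an
# LSM-style store. A non-unique index needs no uniqueness check, so the
# insert path can do a blind write of the composite index entry
# (sec_key, pk) without reading existing index entries from storage.

class LSMTable:
    def __init__(self):
        self.memtable = {}        # pk -> row (primary index)
        self.sec_index = set()    # composite (sec_key, pk) entries

    def insert(self, pk, row, sec_key):
        self.memtable[pk] = row
        # Blind write: no read of sec_index is required because
        # duplicate values of sec_key are allowed.
        self.sec_index.add((sec_key, pk))

    def lookup_by_sec(self, sec_key):
        # A range scan over composite entries whose prefix is sec_key.
        return sorted(pk for (k, pk) in self.sec_index if k == sec_key)

t = LSMTable()
t.insert(1, "row1", "blue")
t.insert(2, "row2", "blue")
print(t.lookup_by_sec("blue"))  # [1, 2]
```

A unique index could not take this shortcut: enforcing uniqueness requires checking whether the key already exists, which is exactly the storage read this approach avoids.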

Postgres, vacuum and the insert benchmark

I have been running benchmarks with Postgres for a few months and am slowly learning enough about Postgres to trust the results that I get. When I first read about Postgres vacuum I didn't think it would be important for the insert benchmark because the workload is insert-only. This post explains why I was wrong. Vacuum arrived in the original POSTGRES storage system in the 1980s. Today it serves several purposes: 1) reclaim space for old versions of rows that are no longer visible, 2) update statistics for the query planner, 3) update bits in the visibility map to help index-only queries and 4) prevent transactionID wraparound. I didn't expect vacuum to help much with the insert benchmark because it doesn't do updates or deletes and there is no space for vacuum to reclaim. Also, query plans were OK and there weren't enough transactions for wraparound to be an issue. What I missed was the impact of the visibility map on the index-only range scans run duri…
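The visibility map effect can be illustrated with a simplified model (this is a sketch, not Postgres internals): an index-only scan can answer from the index alone only when the heap page's all-visible bit is set, and vacuum is what sets those bits, so even an insert-only table pays heap fetches until vacuum runs.

```python
# Simplified model of an index-only scan consulting a visibility map.
# If a heap page's all-visible bit is set, the scan returns the value
# straight from the index; otherwise it must fetch the heap page to
# check tuple visibility.

def index_only_scan(index_entries, all_visible, heap_fetch):
    heap_fetches = 0
    rows = []
    for key, page_no in index_entries:
        if all_visible[page_no]:
            rows.append(key)            # answered from the index alone
        else:
            heap_fetches += 1           # visibility check needs the heap
            rows.append(heap_fetch(page_no, key))
    return rows, heap_fetches

entries = [(i, i // 10) for i in range(100)]   # 10 keys per heap page
vm_before = [False] * 10   # before vacuum: no all-visible bits set
vm_after = [True] * 10     # after vacuum: every page is all-visible
fetch = lambda page, key: key

_, fetches_before = index_only_scan(entries, vm_before, fetch)
_, fetches_after = index_only_scan(entries, vm_after, fetch)
print(fetches_before, fetches_after)  # 100 0
```

Same query, same data: the only difference is whether vacuum has marked the pages all-visible.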

Review of -- TimescaleDB: SQL made scalable for time-series data

This is a short review of TimescaleDB: SQL made scalable for time-series data. My focus is on indexing and I ignore many interesting things described in that paper. My goal in reading it was to understand possible optimizations for workloads like the insert benchmark v3. I also read some of the online TimescaleDB docs. This blog post is truthy as I am not an expert on TimescaleDB. Hopefully I haven't made too many mistakes. The paper is several years old and TimescaleDB has improved a lot since then. The paper does a great job asserting what they know to be true about the workloads for which TimescaleDB is targeted. Assertion 1: Time-series data is largely immutable. New data continually arrives, typically corresponding to the latest time periods. In other words, writes primarily occur as new inserts, not as updates to existing rows. Further, while the database needs to be able to support backfill for delayed data, writes are made primarily to recent time intervals. Asser…

Indexing and write-heavy workloads

When I see impressive numbers for the insert rate that a DBMS can sustain I wonder what indexes exist and whether the inserts are in sequential or random order with respect to each index. One way to explain this is in terms of the number of points in the index at which the inserts occur, although I use streams rather than insert points in what follows. I am writing this in part so that I can reference this post in future performance reports when describing workloads. It isn't sufficient to state that inserts are in PK order. They can be in ascending or descending PK order. When ascending, the point at which the inserts are done can be at the right end of the index (inserted keys > existing keys) or somewhere in the middle of the index. When descending, the inserts can be done at the left end of the index (inserted keys < existing keys) or somewhere in the middle of the index. Explaining insert patterns: there are four attributes per index that can explain such inse…
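The distinction can be made concrete by counting distinct insert points for different key orders. This is an illustration under my own hypothetical classification, not code from the insert benchmark: an insert point is the right end, the left end, or the position after some existing key.

```python
# Count the distinct points in an ordered index at which inserts occur.
# Ascending-order inserts always land at the right end (one point),
# descending-order inserts at the left end (one point), while
# random-order inserts land all over the index (many points).

import bisect
import random

def count_insert_points(keys):
    index, points = [], set()
    for k in keys:
        if index:  # the first insert into an empty index has no point to classify
            pos = bisect.bisect_left(index, k)
            if pos == len(index):
                points.add("right end")                # key > all existing keys
            elif pos == 0:
                points.add("left end")                 # key < all existing keys
            else:
                points.add(("after", index[pos - 1]))  # middle of the index
        bisect.insort(index, k)
    return len(points)

ascending = list(range(1000))
descending = list(reversed(ascending))
random_order = ascending[:]
random.seed(42)
random.shuffle(random_order)

print(count_insert_points(ascending))     # 1 -- always the right end
print(count_insert_points(descending))    # 1 -- always the left end
print(count_insert_points(random_order))  # many distinct insert points
```

A secondary index on an inserted timestamp behaves like the ascending case even when the PK is random, which is why the question has to be asked per index.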

Review: The Design of the Postgres Storage System

This is a review of The Design of the Postgres Storage System. The paper was in VLDB 1987 although my review used a version of the paper that might differ from the one in VLDB (which is appropriate, given MVCC). This is also the first in a series of posts on MVCC GC. My standard disclaimer is that I don't review papers that aren't worth reading. The paper is about POSTGRES, not modern PostgreSQL. POSTGRES was using POSTQUEL at the time and the paper has sample POSTQUEL statements. Something I like very much about the paper is the use of simple estimates to explain the design decisions. 1 Introduction. The abstract and introduction provide four reasons for the research:
- Optionally archive all previous versions of rows on separate storage, as WORM optical disks were likely to arrive on the market.
- Don't use a WAL (redo log). The benefits include instant recovery, less code complexity and better support for archiving.
- Use multiple processes to benefit from multi-processor…

Updates for the insert benchmark

I continue to run and improve the insert benchmark. This is an update to the overview. Details on how I share performance results are here.

Per interval results

The first change is that the benchmark client displays per-second performance results including IPS, QPS and max response time for inserts and queries. This makes it easier to understand stalls. This is output from a load and the legend for the columns is:
i_sec - number of seconds for this interval
t_sec - cumulative number of seconds
i_ips, i_qps - average insert and query rate for that interval
t_ips, t_qps - average insert and query rate from test start until now
max_i, max_q - max insert and query response time for that interval, in microseconds
t_ins, t_query - total inserts and queries from test start until now

i_sec   t_sec   i_ips   t_ips   i_qps   t_qps   max_i   max_q   t_ins   t_query
1.0     1.0     61833   61833   0       0       2919    0       61900   0
1.0     2.0     63625   62729   0…
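The per-interval and cumulative columns can be derived from cumulative counters sampled once per interval. This is a hypothetical sketch of that arithmetic, not the benchmark client's code; only the column names follow the legend above.

```python
# Derive per-interval (i_*) and cumulative (t_*) rates from cumulative
# counter readings sampled at the end of each interval.

def interval_stats(samples):
    """samples: list of (t_sec, t_ins) cumulative readings."""
    rows, prev_t, prev_ins = [], 0.0, 0
    for t_sec, t_ins in samples:
        i_sec = t_sec - prev_t
        i_ips = (t_ins - prev_ins) / i_sec   # rate for this interval
        t_ips = t_ins / t_sec                # rate from test start
        rows.append((i_sec, t_sec, round(i_ips), round(t_ips), t_ins))
        prev_t, prev_ins = t_sec, t_ins
    return rows

# Two cumulative readings chosen so the derived rates match the first
# two intervals shown above: i_ips 61833 then 63625, t_ips 62729.
print(interval_stats([(1.0, 61833), (2.0, 125458)]))
```

The per-interval column is what exposes a stall: a one-second dip in i_ips is visible immediately, while t_ips barely moves.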

Something changed for the better in create index between MySQL 8.0.18 and 8.0.20

I include MongoDB, Postgres, MySQL/InnoDB and MySQL/RocksDB (MyRocks) in the things I test via Linkbench and the insert benchmark. Hopefully I add another test later this year. I have been using MySQL 8.0.18 this year and recently started to use 8.0.20. The new insert benchmark workflow is: create tables with PK indexes, load tables, create 3 secondary indexes on each table, continue the load and then run several rounds of reads+inserts. For this test there were 8 clients with a table per client and 10M rows/table after the initial insert. I think something changed for the better from MySQL 8.0.18 to 8.0.20 and I didn't spot the change in the release notes. I see:
- Create index in 8.0.20 is 10% to 20% faster (nice, but not why I wrote this)
- Create index in 8.0.18 uses ~7X more read IO with innodb_flush_method = O_DIRECT (14G vs 2G) and ~2X more read IO with it set to O_DIRECT_NO_FSYNC (4G vs 2G). The table was cached at the start of create index and the indexed table should…