Thursday, April 9, 2020

Reviving mstat

A long time ago I wrote mstat to collect iostat, vmstat and MySQL (show global status) performance counters. It samples everything at the same interval (every N seconds), computes rates for all numeric values and has some support for expressions so I can define a new counter as a function of other counters.
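Here is a minimal sketch of that sampling loop, assuming a get_counters function that returns a flat dict of samples; the counter names in the derived-counter example are hypothetical and this is not mstat's actual code:

    import time

    def sample_rates(get_counters, interval=10):
        # Take a sample every interval seconds and convert absolute
        # counters into per-second rates
        prev = get_counters()
        while True:
            time.sleep(interval)
            cur = get_counters()
            rates = {}
            for name, val in cur.items():
                if isinstance(val, (int, float)) and name in prev:
                    rates[name] = (val - prev[name]) / interval
            # A derived counter defined as a function of other counters;
            # the names 'rows_read' and 'questions' are made up here
            if rates.get('questions'):
                rates['rows_read_per_query'] = (
                    rates.get('rows_read', 0) / rates['questions'])
            prev = cur
            yield rates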

Eventually I stopped using it and the tool broke from neglect. Recently I fixed many of the problems. The tool now works with Python 3 and supports both old and new iostat output formats. This week I started to add support for Postgres and MongoDB. It gets data from pg_stat_bgwriter for Postgres and from db.serverStatus() for MongoDB. So the tool is useful to me again, but it won't be useful to others until I add docs.
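The new collectors amount to running one query or command and turning the result into a dict of counters. A sketch of that, using psycopg2 and pymongo; the connection details are assumptions and this is not mstat's actual code:

    import psycopg2
    from pymongo import MongoClient

    def postgres_counters(dsn='dbname=postgres'):
        # pg_stat_bgwriter has a single row of global counters
        with psycopg2.connect(dsn) as conn:
            with conn.cursor() as cur:
                cur.execute('SELECT * FROM pg_stat_bgwriter')
                cols = [d[0] for d in cur.description]
                return dict(zip(cols, cur.fetchone()))

    def mongodb_counters(uri='mongodb://localhost:27017'):
        # db.serverStatus() in the shell maps to the serverStatus command
        return MongoClient(uri).admin.command('serverStatus')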

The tool prints one line in CSV format per time interval. The line is likely to be long because it can contain hundreds of counters. The name and offset of each counter are printed at startup. I extract data using bash and awk as part of my benchmark workflow.
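I use bash and awk for that, but a Python equivalent sketch may be clearer here. It assumes the name-to-offset mapping was saved as a header row in the same file, which is an assumption; mstat prints the mapping at startup:

    import csv, sys

    def extract(path, counter):
        with open(path) as f:
            reader = csv.reader(f)
            header = next(reader)        # counter names in column order
            col = header.index(counter)  # the offset for this counter
            for row in reader:
                print(row[col])

    extract(sys.argv[1], sys.argv[2])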

Because the tool lists the offset per counter at startup, it cannot handle dynamic data yet. An example of dynamic data is per-database counters, such as the rows in pg_stat_database. That view has one row per database, and the database names (datname) are not fixed. Similar problems exist for the per-table and per-index monitoring tables and views provided by MySQL, for other stats views provided by Postgres, and for other monitoring data provided by MongoDB. So I need more time to figure out how to support them.
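One possible workaround, and this is only a sketch of an idea rather than something mstat does today, is to prefix each counter name with datname so per-database counters get explicit names within a run:

    import psycopg2

    def per_database_counters(dsn='dbname=postgres'):
        counters = {}
        with psycopg2.connect(dsn) as conn:
            with conn.cursor() as cur:
                cur.execute('SELECT * FROM pg_stat_database')
                cols = [d[0] for d in cur.description]
                for row in cur.fetchall():
                    vals = dict(zip(cols, row))
                    # Use the database name to make counter names unique
                    name = vals.pop('datname')
                    for key, val in vals.items():
                        counters['%s.%s' % (name, key)] = val
        return counters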

I have more work to do on mstat, but I am happy that I can use it again:
  • I need to figure out whether there are global counters in Postgres for things like queries and rows read
  • For MongoDB, mstat has a hack that replaces space, dot and parentheses with _ in key names (see the sketch after this list)
  • More work remains to make sure that counters are ignored unless their type is numeric or a date (also covered in that sketch)
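A sketch of the last two items above, assuming nested db.serverStatus() output; the helper names are mine and this is not mstat's actual code:

    import datetime

    def clean(name):
        # The hack: replace space, dot and parentheses with _
        for c in ' .()':
            name = name.replace(c, '_')
        return name

    def flatten(doc, prefix='', out=None):
        # Flatten nested serverStatus output into one level of keys and
        # drop values unless they are numeric or a date
        out = {} if out is None else out
        for key, val in doc.items():
            name = prefix + clean(key)
            if isinstance(val, dict):
                flatten(val, name + '_', out)
            elif isinstance(val, (int, float, datetime.datetime)):
                out[name] = val
        return out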

