Comments on Small Datum: Types of writes

It's new in the MySQL Server code base that av...

2014-04-23T22:41:14.864-07:00

It's new in the MySQL Server code base that avoiding reads before updates is available in the handler interface (added in MySQL 5.6, it's been in MySQL Cluster code base for a long time). On-line alter support was added in 5.6 as well in the MySQL Server code base and there are some things that InnoDB supports performing on-line.

I guess it's on our side to make it easier to ...

2014-04-22T09:43:03.652-07:00

I guess it's on our side to make it easier to use, to have less amount of hassles. This is the major focus of the next cluster version MySQL Cluster 7.4. I think also the development on hardware (more memory, more networking bandwidth and more CPU resources) and the MySQL Cluster product becoming more and more stable works for the product.

Is this new? I investigated this in MySQL 5.1 and ...

2014-04-22T09:38:12.316-07:00

Is this new? I investigated this in MySQL 5.1 and MySQL 5.5 without any success. Also, I recall a bunch of innovations NDB had (like online alter table, back in the day) were available in the MySQL Cluster code base and not the MySQL Server code base. Has that changed?

How do we bring the greatness of NDB/Cluster to mo...

2014-04-22T08:55:13.811-07:00

How do we bring the greatness of NDB/Cluster to more users. Maybe hosted NDB as a service is the solution?

The handler interface supports avoiding the reads ...

2014-04-22T08:15:04.107-07:00

The handler interface supports avoiding the reads in the below case:
update foo set b=100 where id=10
It's used by NDB currently but can be implemented by any storage engine that supports it.
It would even be possible to do blind writes of:
update foo set a=a+1 where id=10
but this requires a bit more mapping the calculation to something useful so isn't currently
supported in the handler interface. Naturally this would require support for calculations in
the storage engine.

Your first example is an example of a blind-write....

2014-04-16T18:05:42.666-07:00

Your first example is an example of a blind-write. I think we agree that writes don't have commute to be fast. My guess is that updating counts is the most common use case and maintaining last-N for a bag/set subject to a max size is another use case.

I don't think the write needs to be commutativ...

2014-04-16T17:52:23.801-07:00

I don't think the write needs to be commutative in order to be fast. We just need to know what keys will need modification in advance. Let's take the following SQL example:
create table foo (id int, a int, b int, primary key (id), key (a))

In this schema, the primary key is the sole clustering key. Suppose I did "update foo set b=100 where id=10;" This update can be done fast, we know the pk needs a modification, and we know which one (the one where it is equal to 10). We also know that secondary index 'a' needs no modification.

Now take the following two examples:
update foo set a=a+1 where id=10;
update foo set b=b+1 where a=10;

In each of these cases, the update cannot be done fast. In the first example, we know the secondary index 'a' will need to be modified, but we don't know for which value. So a query is necessary. Similarly, in the second case, we don't know which id in the primary key associated with a being 10 needs modification, so that requires a query as well.

Similarly, if we have multiple clustering indexes, many updates that used to be fast are no longer fast. If 'a' is a clustering key, the this update that used to be done fast can no longer be fast:
update foo set b=100 where id=10;
This is because the new 'b' value will need to reflect the row in the clustering key, and we don't know where that row is.

So, really, the common update that can theoretically be done fast for TokuMX, TokuDB, etc... is one where the where clause is on the PK, the modifications don't touch indexed columns, and we have no additional clustering keys. I'd be interested in hearing how common such a scenario is.