I used the
MongoDB sysbench client to measure the overhead from skipping N attributes in a document to access the N+1th attribute for N set to 1, 10, 100 and 1000. As expected there is overhead as N grows, for small N the overhead is less for TokuMX and for large N the overhead is less for MongoDB.
My test used 1 collection with 1M documents and 1 client thread. The sysbench client was changed to add extra attributes immediately after the _id attribute. The extra attributes used names of the form
x${K} for K in 1 to 1, 1 to 10, 1 to 100 and 1 to 1000. Each of these attributes were assigned the value of K (x1=1, x2=2, ...). Immediately after these another attribute was added with the name
y and assigned the same value as the
_id attribute. Each test query matches one document and returns the values of the _id and y attributes after matching the _id and y attributes. The purpose of this test is to determine the overhead from skipping the extra attributes to find the y attribute. For N=1 the results and test was kind-of similar to what I did for a
recent cached & read-only test.
This is data for the chart above. Note that TokuMX does much better for small N but MongoDB does much better for large N. I repeated the tests with the y attribute before the x${K} attributes and the results are about the same as below so my assumption was wrong about the overhead for attribute searches and the real overhead is from BSON parsing. Note that I ran extra tests for 200, 400, 500 and 750 attributes to understand the rate at which QPS decreases.
queries per second by #attributes
1 10 100 200 400 500 750 1000 number of x${K} attributes
14326 13643 10317 8638 6543 5832 4515 3306 tokumx141
9866 9398 8473 8305 7242 6903 6164 4954 mongo249
9119 9530 8547 8613 7890 7577 6602 5289 mongo260
Overhead for 1000 attributes
I looked at top when the tests were running and for all cases mongod was using ~90% of 1 CPU core and Java was using ~10 of another. So MongoDB was using CPU at the same rate as TokuMX while serving many more QPS. I then used the Linux perf utility to understand what was using CPU in mongod. The CPU overhead for TokuMX is from calls to free & malloc. If I had an easy way to get hierarchical profiles I would include that, alas I don't have that. I do have the -g option with perf but the output isn't very interesting. It would be nice if TokuMX figured out how to get more debug symbols into their binary without expecting users to download/install a separate debug package.
This is the output from "perf -g" for TokuMX 1.4.1
14.56% mongod mongod [.] free
|
--- free
|
|--91.43%-- 0x7fe27c1c09d8
| (nil)
|
|--7.66%-- (nil)
--0.91%-- [...]
12.71% mongod mongod [.] malloc
|
--- malloc
|
|--76.17%-- mongo::ElementMatcher::~ElementMatcher()
| (nil)
|
|--16.90%-- operator new(unsigned long)
| 0x100000000
| 0x3031317473657462
|
|--4.22%-- (nil)
| |
| --100.00%-- 0xffb
|
|--1.37%-- 0xffffffff00000000
| 0x3031317473657462
|
--1.34%-- mongo::BSONObj::init
And this is the non-hierarchical output from Linux perf during a 5 second interval of the test.
TokuMX 141
15.83% mongod mongod [.] free
11.64% mongod mongod [.] malloc
11.23% mongod mongod [.] mongo::Projection::transform
6.42% mongod libstdc++.so.6.0.17 [.] _ZNSs4_Rep10_M_disposeERKSaIcE.part.13
6.40% mongod mongod [.] std::_Rb_tree<>::find
4.81% mongod libc-2.5.so [.] strlen
4.01% mongod libstdc++.so.6.0.17 [.] std::basic_string::basic_string
3.49% mongod libstdc++.so.6.0.17 [.] char* std::string::_S_construct
2.43% mongod libc-2.5.so [.] memcmp
2.38% mongod libc-2.5.so [.] memcpy
2.00% mongod libstdc++.so.6.0.17 [.] operator new(unsigned long)
1.93% mongod mongod [.] mongo::Projection::append
1.27% mongod libstdc++.so.6.0.17 [.] std::string::_Rep::_S_create
MongoDB 2.4.9
13.78% mongod mongod [.] mongo::Projection::transform
11.75% mongod mongod [.] mongo::UnorderedFastKeyTable::Area::find
7.95% mongod libc-2.13.so [.] __strlen_sse42
7.28% mongod mongod [.] mongo::Projection::append
4.21% mongod mongod [.] (anonymous namespace)::cpp_alloc
3.86% mongod mongod [.] mongo::BSONElement::size()
2.07% mongod mongod [.] _ZN12_GLOBAL__N_121do_free_with_callback
1.18% mongod mongod [.] mongo::KeyV1::toBson()
0.92% mongod mongod [.] mongo::BucketBasics::KeyNode::KeyNode
0.86% mongod mongod [.] boost::detail::shared_count::~shared_count()
0.71% mongod libpthread-2.13.so [.] pthread_mutex_lock
0.70% mongod libstdc++.so.6.0.16 [.] std::basic_string::~basic_string()
0.69% mongod mongod [.] tc_malloc
0.68% mongod libc-2.13.so [.] __memcpy_ssse3
MongoDB 2.6.0
14.28% mongod mongod [.] mongo::ProjectionStage::transform
10.03% mongod mongod [.] MurmurHash3_x64_128
4.45% mongod mongod [.] mongo::BSONElement::size()
3.93% mongod libc-2.13.so [.] __strlen_sse42
2.83% mongod mongod [.] _ZNKSt3tr110_HashtableIN5mongo10StringDataES2_SaIS2_ESt9_IdentityIS2_ESt8equal_toIS2_ENS2_6HasherENS_8__detail18_Mod_range_hashingENS9_20_Default_ranged_hashENS9_20_Prime_rehash_policyELb0ELb1ELb1EE12_M_find_nodeEPNS9_10_Hash_nodeIS2_Lb0EEERKS2_m.isra.162.constprop.255
2.65% mongod mongod [.] operator new(unsigned long)
1.98% mongod libc-2.13.so [.] __memcmp_sse4_1
1.74% mongod mongod [.] mongo::StringData::Hasher::operator()
1.62% mongod mongod [.] operator delete(void*)
1.46% mongod mongod [.] mongo::KeyV1::toBson()
1.20% mongod mongod [.] mongo::BucketBasics::KeyNode::KeyNode
1.06% mongod libc-2.13.so [.] __memcpy_ssse3
0.83% mongod mongod [.] tc_malloc
0.82% mongod mongod [.] mongo::ps::Rolling::access
0.78% mongod libpthread-2.13.so [.] pthread_mutex_lock
Overhead for 10 attributes
This lists the CPU profile from Linux perf during 5 seconds of the test. Unlike the 1000 attribute result above, here malloc/free or their equivalent are the top two sources for TokuMX and MongoDB.
TokuMX 1.4.1
4.53% mongod mongod [.] malloc
4.12% mongod mongod [.] free
2.89% mongod libtokufractaltree.so [.] _ZNK4toku3omtIP13klpair_structS2_Lb0EE24find_internal_plus_arrayIR9ft_searchXa
dL_Z15wrappy_fun_findIS6_XadL_ZL23heaviside_from_search_tRK10__toku_db
1.80% mongod libc-2.5.so [.] strlen
1.53% mongod mongod [.] mongo::storage::KeyV1::woCompare
1.14% mongod libpthread-2.5.so [.] pthread_mutex_lock
1.02% mongod libc-2.5.so [.] memcpy
0.99% mongod mongod [.] mongo::Projection::transform
0.96% mongod libstdc++.so.6.0.17 [.] _ZNSs4_Rep10_M_dispose
0.87% mongod libstdc++.so.6.0.17 [.] operator new(unsigned long)
0.86% mongod mongod [.] mongo::Projection::init
0.85% mongod libtokufractaltree.so [.] _Z26toku_ft_search_which_childP17__toku_descriptorPFiP9__toku_dbPK10__toku_dbt
S5_EP6ftnodeP9ft_search
0.85% mongod libc-2.5.so [.] memcmp
MongoDB 2.4.9
7.47% mongod mongod [.] cpp_alloc
3.94% mongod mongod [.] _ZN12_GLOBAL__N_121do_free_with_callback
2.24% mongod mongod [.] mongo::KeyV1::toBson()
2.21% mongod libc-2.13.so [.] __strlen_sse42
1.58% mongod mongod [.] boost::detail::shared_count::~shared_count()
1.51% mongod mongod [.] mongo::BSONElement::size()
1.48% mongod libc-2.13.so [.] __memcpy_ssse3
1.40% mongod mongod [.] mongo::BucketBasics::KeyNode::KeyNode
1.39% mongod libstdc++.so.6.0.16 [.] _ZNSs4_Rep10_M_disposeERKSaIcE
1.22% mongod libstdc++.so.6.0.16 [.] std::basic_string::~basic_string()
1.22% mongod libc-2.13.so [.] __memcmp_sse4_1
1.19% mongod mongod [.] tc_malloc
0.99% mongod mongod [.] operator new(unsigned long)
0.95% mongod mongod [.] boost::intrusive_ptr::~intrusive_ptr()
0.94% mongod libpthread-2.13.so [.] pthread_mutex_lock
0.89% mongod libstdc++.so.6.0.16 [.] std::basic_string::basic_string
0.77% mongod libstdc++.so.6.0.16 [.] std::basic_string::basic_string
0.69% mongod mongod [.] mongo::BtreeBucket::customBSONCmp
0.68% mongod mongod [.] mongo::UnorderedFastKeyTable::Area::find
0.64% mongod [ip_tables] [k] ipt_do_table
0.64% mongod mongod [.] CoveredIndexMatcher::matches
MongoDB 2.6.0
4.26% mongod mongod [.] operator new(unsigned long)
2.86% mongod mongod [.] operator delete(void*)
2.70% mongod mongod [.] mongo::KeyV1::toBson() const
1.62% mongod mongod [.] mongo::BucketBasics::KeyNode::KeyNode
1.57% mongod libc-2.13.so [.] __memcpy_ssse3
1.56% mongod libc-2.13.so [.] __strlen_sse42
1.14% mongod libc-2.13.so [.] vfprintf
1.13% mongod libpthread-2.13.so [.] pthread_mutex_lock
1.13% mongod libc-2.13.so [.] __memcmp_sse4_1
1.06% mongod mongod [.] mongo::BSONElement::size()
0.92% mongod libstdc++.so.6.0.16 [.] _ZNSs4_Rep10_M_dispose
0.87% mongod mongod [.] tc_malloc
0.84% mongod libstdc++.so.6.0.16 [.] std::basic_string::~basic_string()