For the past few years, data centers have been gradually replacing their hard drives with faster flash storage. Meanwhile, database storage engines such as MyISAM and InnoDB continue to rely on forty-year-old indexing algorithms for managing read and write operations. Tokutek has developed a new kind of data structure called a Fractal Tree that is purported to achieve insertion rates five to twenty times faster in large databases, meaning 100 million rows and above. Moreover, these benefits have been shown to continue right on up to fifty billion rows!
But what’s in it for us smaller users? Are databases of “only” a few million rows too small to benefit from TokuDB’s Fractal Tree indexing? I could find plenty of good studies showing how TokuDB benefited data centers requiring high insertion workloads, fast response times, and the ability to simultaneously store and query large volumes of rapidly arriving data. In fact, Profile Technology is using TokuDB for its Advanced Search of Facebook. Other sizeable clients include Evidenzia (Peer-to-Peer Monitoring), Limelight Networks (Agile Cloud Storage), Intent Media (Online Advertising), Jawa (Mobile Gaming), and many others. There was a lot less information on smaller players, which led me to set out and find out for myself.
What I found was very encouraging...very encouraging indeed.
How TokuDB Works
While traditional B-tree indexing works quite well with newly written data, which is laid out sequentially on disk, write times grow considerably as the data becomes scattered across different disk sectors over time. That’s where TokuDB’s Fractal Tree indexing shines: it outperforms both B-trees and Log-Structured Merge trees (LSMs) by using a better method for writing data to disk, which in turn improves how quickly data can be read.
Platforms and Licensing
The latest version (6.5) is optimized for flash and is available for Linux only. As a Windows user, I was initially surprised at its exclusion, until I came across a comment from one of my readers who asked “Why would you run MySQL on Windows?” Clearly, Linux is by far the more popular MySQL platform.
For now, TokuDB is meant to fill the gap between the open source MySQL server and the über-expensive enterprise offerings of the big players like Oracle, IBM, and Microsoft. TokuDB is licensed at a modest $2,500 / 100GB / year based on usage capacity rounded to the nearest 100GB for production deployments larger than 50GB.
You can download the patched MySQL source and Fractal Tree Index library and compile MySQL yourself or you can download the fully functional binaries. I used the latter, which was as easy to set up as a standard MySQL server.
On disk, a table using the TokuDB storage engine is different from both InnoDB and MyISAM. In addition to having a .frm file, as all MySQL tables do, there is a directory containing a main.tokudb file with the data, a status.tokudb file, and a key-KEYNAME.tokudb file for each index.
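For example, the user table used later in this article, with a hypothetical secondary index on last_name, would produce an on-disk layout along these lines (names shown for illustration):

```
user.frm
user/
    main.tokudb
    status.tokudb
    key-last_name.tokudb
```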
The Test Environment
Being a Windows user did not prevent me from testing the Linux-based installation, thanks to virtual machine software. I went with Oracle VirtualBox because it accepts Open Virtual Appliance (.ova) files. That’s a package containing the files that describe a virtual machine: an .OVF descriptor file, optional manifest (.MF) and certificate files, and other related files, saved in a single archive using .TAR packaging. When I told the folks at Tokutek about my article, they were good enough to send me a pristine testing environment, which included the Ubuntu OS with a basically configured empty TokuDB database.
The TokuDB build also includes the usual MySQL storage engines, so InnoDB and MyISAM tables are available within the server. By default, TokuDB grabs fifty percent of available RAM on start-up for caching, so I had to make sure not to set innodb_buffer_pool_size to a value that would cause the server to overallocate memory and swap, which would degrade performance. I was also constrained by the 10 GB of virtual disk allocated to the virtual host, so I couldn’t go overboard on table size. Finally, we must keep in mind that the virtualization software stands between the database and the physical disk, so I/O performance is partly determined by the virtualization software and the host OS.
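As a sketch, the relevant my.cnf settings on a machine like this might look as follows. The values are illustrative only, not tuned recommendations; if tokudb_cache_size is left unset, TokuDB takes its default of fifty percent of RAM:

```ini
# Cap TokuDB's cache rather than letting it claim 50% of RAM by default
tokudb_cache_size = 2G

# Keep the InnoDB buffer pool small enough that the two engines
# together do not overcommit physical memory and force swapping
innodb_buffer_pool_size = 1G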
I used MySQL Workbench to communicate with the database.
The following results should not be taken as gospel, since my tests were far from rigorous and lacked the quality control standards that would be required of stringent official QA testing. They do, however, provide some idea of what an average user can expect to achieve “out of the box”.
I added one table to each of my test databases called user, with the following structure:
Table user
==========
user_id      int(11)      PK (autonumber)
first_name   varchar(45)
last_name    varchar(45)
sex          char(1)
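In DDL terms, the only difference between the two test tables is the ENGINE clause. A sketch of the TokuDB version follows; the InnoDB copy is identical apart from ENGINE=InnoDB:

```sql
CREATE TABLE user (
  user_id    INT(11)     NOT NULL AUTO_INCREMENT,
  first_name VARCHAR(45),
  last_name  VARCHAR(45),
  sex        CHAR(1),
  PRIMARY KEY (user_id)
) ENGINE=TokuDB;
```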
Test #1: Import one million records into the user table.
To insert the records, I created a stored procedure that set each field to random strings. These would also be used for pattern matching later:
DELIMITER $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `populate_user_table`()
BEGIN
  DECLARE a INT DEFAULT 0;
  WHILE a < 1000000 DO
    SET a = a + 1;
    INSERT INTO test.user SET
      `first_name` = SUBSTRING(MD5(RAND()) FROM 1 FOR FLOOR(5 + RAND() * 45)),
      `last_name`  = SUBSTRING(MD5(RAND()) FROM 1 FOR FLOOR(5 + RAND() * 45)),
      `sex`        = CASE WHEN RAND() <= 0.5 THEN 'm' ELSE 'f' END;
  END WHILE;
END $$
DELIMITER ;
Here are the results of the insert test:
It took TokuDB just over three hours to insert all one million records into the user table. Keep in mind that in both cases:
- The creation of the random strings probably took the bulk of the processing time.
- Inserting one record at a time is a lot less efficient than bulk inserts, as noted by the MySQL docs:
If you are inserting many rows from the same client at the same time, use INSERT statements with multiple VALUES lists to insert several rows at a time. This is considerably faster (many times faster in some cases) than using separate single-row INSERT statements.
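Batching rows into a single statement, as the docs suggest, looks like this (the values shown are placeholders):

```sql
INSERT INTO test.user (first_name, last_name, sex) VALUES
  ('a1b2c3', 'd4e5f6', 'm'),
  ('0f9e8d', '7c6b5a', 'f'),
  ('123abc', '456def', 'm');
```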
On the MySQL (InnoDB) run, after ten hours, the DBMS connection keep-alive interval kicked in and closed the connection:
Action:           call `test`.`populate_user_table`
Message:          Error Code: 2013. Lost connection to MySQL server during query
Duration / Fetch: 36001.099 sec
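If you need a long-running statement like this to survive, the timeouts can be raised. The read timeout that bit me here is a MySQL Workbench client preference (under the SQL Editor settings), but the server-side session timeouts can also be lengthened; the values below are a sketch, not recommendations:

```sql
-- Raise session-level timeouts before issuing a long-running call
SET SESSION wait_timeout = 86400;
SET SESSION net_write_timeout = 3600;
```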
I certainly did not have to rerun this test to see that TokuDB greatly outperformed MySQL.
Winner: TokuDB, roughly 3X faster.
I inserted a second million later on, which took only two hours and forty-one minutes on TokuDB, and well over ten hours in MySQL.
Test #2: Query the Test Table Using a Table Scan.
String matching on non-indexed fields is one of the slowest searches you can perform, because the engine must perform a table scan, reading every row in the table. To avoid a situation where one database happened to find a lot more records than the other, I performed three variations on a theme and averaged the results. Here are the statements that I used to test our competitors’ reading prowess, along with their results:
TokuDB:

select * from user where last_name like '%aa%';
79910 rows in set (0.76 sec)

select * from user where last_name like '%bb%';
79381 rows in set (0.74 sec)

select * from user where last_name like '%cc%';
79874 rows in set (0.73 sec)

Average execution time: 0.7433 sec
Average number of rows returned: 79721.6667
I made a couple of observations at this point:
- Each query, while returning about the same number of rows, executed progressively faster.
- It took far longer to display all of the records on the screen than to fetch them!
MySQL (InnoDB):

select * from user where last_name like '%aa%';
Fetched 78539 records in: 0.853 sec

select * from user where last_name like '%bb%';
Fetched 79752 records in: 0.868 sec

select * from user where last_name like '%cc%';
Fetched 78930 records in: 0.857 sec

Average execution time: 0.8593 sec
Average number of rows returned: 79073.6667
Winner: TokuDB with an improvement of 13.5%
For the next test, I updated rows based on another LIKE in the WHERE clause. The purpose here was to test lookup speed as well as write speed.
TokuDB:

mysql> update user set first_name = 'updated' where last_name like '%dd%';
Query OK, 80129 rows affected (6.72 sec)
Rows matched: 80129  Changed: 80129  Warnings: 0

MySQL (InnoDB):

mysql> update user set first_name = 'updated' where last_name like '%dd%';
Query OK, 79107 rows affected (7.973 sec)
Rows matched: 79107  Changed: 79107  Warnings: 0
Winner: TokuDB with an improvement of 15.72%
The final test touched on an area where TokuDB really goes into high gear: column and index management. Adding a column to the TokuDB user table was virtually instantaneous:
TokuDB:

mysql> alter table user add email varchar(60);
Query OK, 0 rows affected (0.22 sec)

MySQL (InnoDB):

mysql> alter table user add email varchar(60);
Query OK, 0 rows affected (58.136 sec)
Winner: TokuDB, over 100X faster.
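Index management benefits in the same way: TokuDB advertises “hot” index creation, so a secondary index can be added with ordinary DDL while the table stays available. A hypothetical example on the test table:

```sql
CREATE INDEX idx_last_name ON user (last_name);
```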
Even at a relatively small size of a few million rows, TokuDB for MySQL shows notable performance improvements in both reads and writes over the stock MySQL RDBMS with the InnoDB engine. In that light, TokuDB’s appeal may be wider than Tokutek proposes. TokuDB for MySQL is free for development, evaluation, and experimentation, so I urge you to run your own experiments.