It's become a challenge to whittle down the speed of database transactions to achieve the real-time requirements of modern applications. Getting both SELECT and INSERT operations to execute within hundredths of a second has required administrators and developers of traditional relational databases to adopt the mindset of an elite athlete who desires to trim fractions of a second off his or her times. A developer's efforts to minimize execution times are further hampered by continually increasing datasets generated in the processing and analyzing of Big Data.
In this article, I will present a couple of ideas for achieving better INSERT speeds in MySQL. Note that these are Best Practices; your results will be somewhat dependent on your particular topology, technologies, and usage patterns.
LOAD DATA INFILE
Importing a file using the LOAD DATA INFILE statement is the preferred solution when looking for raw performance over a single connection. It does require you to prepare a properly formatted file, so if you have to dynamically generate this file first, be sure to take that into account when measuring INSERT speed.
It's meant to take a data file's content and import them into a database table. It has a fairly developed syntax, but at its core, it can be expressed as:
LOAD DATA INFILE 'FILE_PATH' INTO TABLE TABLE_NAME;
LOAD DATA INFILE 'data.txt' INTO TABLE my_db.contacts;
In its basic form, LOAD DATA assumes that fields are enclosed within single quotes and delimited by the TAB character (\t). Each line is assumed to contain a single row of data. Here's some sample lines from the above data.txt file:
'John' 'Doe' '120 jefferson st.' 'Riverside' 'NJ' '08075' 'Jack' 'McGinnis' '220 hobo Av.' 'Phily' 'PA' '09119' 'John "Da Man"' 'Repici' '120 Jefferson St.' 'Riverside' 'NJ' '08075'
The LOAD DATA INFILE statement lends itself extremely well to Comma-Separated-Value (CSV) files. To import those, you need to include the FIELDS TERMINATED BY and LINES TERMINATED BY clauses.
Here's the updated statement:
LOAD DATA INFILE 'filename.csv' INTO TABLE my_db.contacts FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';
The FIELDS TERMINATED BY and LINES TERMINATED BY clauses make TEXT fields equally straight-forward to populate from a file; just use custom field and line delimiters.
I recently employed the LOAD DATA INFILE statement to import HTML markup into my database:
-F-https://pnbcommunity.bank/rates.html-F-#content > div.container.container-sm-F-<div class="container container-sm"> <h3>Savings Accounts</h3> <table class="rates rtable201 stacktable large-only hidden-xs"> <caption class="sr-only">Comparison table of Savings Accounts.</caption> <thead> <tr> <th class="product">Product</th> <th class="min">Min. Opening Deposit</th> <th class="rate">Int<span class="visible-xs-inline">.</span><span class="hidden-xs">erest</span> Rate</th> <th class="apy">APY*</th> </tr> </thead> <tbody> <tr> <th class="product" scope="row">Savings </th> <td class="min">$100<span class="hidden-xs">.00</span></td> <td class="rate">0.100%</td> <td class="apy">0.100%</td> </tr> </tbody> </table> </div>-L- -F-https://www.psbt.com/-F-body > section-F-<div class="subsection-content"> <h2>Checking Accounts<sup>1</sup></h2> <table class="Table-Product" style="width: 100%; opacity: 1;"> <thead> <tr><th>Account</th><th>Balance</th><th>Interest Rate</th><th>APY</th></tr> </thead> <tbody> <tr> <td data-title="Account">NOW Checking</td> <td data-title="Balance">$1,000 or more</td> <td data-title="Interest Rate">0.05%</td> <td data-title="APY">0.05%</td> </tr> <tr> <td data-title="Account">Peoples Package Checking</td> <td data-title="Balance">Eligible combined balances of $5,000 or more</td> <td data-title="Interest Rate">0.10%</td> <td data-title="APY">0.10% </td> </tr> <tr> <td data-title="Account">Business Interest Checking</td> <td data-title="Balance">$2,500.00 or more</td> <td data-title="Interest Rate">0.05%</td> <td data-title="APY">0.05%</td> </tr> </tbody> </table>
Here I used "-L-" and "-F-" Line and Field delimiters:
LOAD DATA LOCAL INFILE 'C:\\My Documents\\selectors.txt' INTO TABLE page_content CHARACTER SET utf8 FIELDS TERMINATED BY '-F-' LINES TERMINATED BY '-L-';
Multiple-row INSERT Statement
Multiple-row inserts are the next best option, if for some reason, you can't or don't want to generate a temporary text file. It will give you around 65% of the LOAD DATA INFILE throughput, which is still quite a bit faster than using individual inserts. Moreover, it doesn't matter whether you're working localhost or going over a network; grouping several inserts in a single query always yields better performance.
To insert multiple records with one statement, rather than write this:
INSERT INTO my_db.contacts VALUES ('John' 'Doe' '120 jefferson st.' 'Riverside' 'NJ' '08075');
We can write this, where each set of field values is separated by a comma:
INSERT INTO my_db.contacts VALUES ('John', 'Doe', '120 jefferson st.', 'Riverside', 'NJ', '08075'), ('Jack', 'McGinnis', '220 hobo Av.', 'Phily', 'PA', '09119'), ('John "Da Man"', 'Repici', '120 Jefferson St.', 'Riverside', 'NJ', '08075');
If you decide to give multiple inserts a try, make sure that you test out a sample of your data using a few different inserts-per-query configurations in order to ascertain which number works best for you. As you increase the number of inserts per query, you may have to allocate more memory on the client-side as well as increase the max_allowed_packet setting on your MySQL server(s).
And finally, if you are generating your insert statements on the application side, be sure to pass field values via Query Parameters in order to escape characters like quotes and prevent SQL injection attacks.
# # #