With the proliferation of mobile devices being at an all-time high, there has never been more demand for “Find near me” type apps. As a result, just when you thought you were finally free of trigonometric functions like sine and cosine, you find out that they were useful after all! In fact, it turns out that they can be instrumental in calculating the distance between two points. We’ll be using them today to fetch a list of nearby restaurants sorted by distance using nothing but the power of MySQL!
The Haversine Formula
The Greek mathematician Pythagoras developed his theorem for calculating the shortest distance between points way back in 5th-century B.C.
distance = sqrt((X2 - X1)^2 + (Y2 - Y1)^2)
That formula would work perfectly for distance calculations as well, if only the Earth were flat (some say it is)! Due to the curvature of the Earth, Pythagoras’s Theorem does not lend itself so well to geolocation calculations.
Image source: Wikipedia
A more precise result may be obtained using the Haversine Formula. It takes the radius of the Earth into account. It’s the R variable below:
dlon = lon2 - lon1 dlat = lat2 - lat1 a = (sin(dlat/2))^2 + cos(lat1) * cos(lat2) * (sin(dlon/2))^2 c = 2 * atan2( sqrt(a), sqrt(1-a) ) distance = R * c (where R is the radius of the Earth)
R = 6367 km OR 3956 mi
Since the earth is not a perfect sphere, using a fixed value for the Radius is still not ideal, but good enough for most applications. The Earth is actually an oblate spheroid and not perfectly spherical (it bulges at the equator), so the above formula will tend to overestimate trans-polar distances and underestimate trans-equatorial distances.
Applying the Haversine Formula to MySQL
Calculating the distance between two locations using the Haversine Formula in MySQL requires us to call upon several of MySQL’s built-in Math functions, including cos(), sin(), acos(), and radians(). In case you were wondering like I was, the radians() function converts the value of a number from degrees to radians where pi radians equals 180 degrees.
3959 * acos( cos( radians(lat1) ) * cos( radians(lat2) ) * cos( radians(lon2) - radians(lon1)) + sin(radians(lat1)) * sin( radians(lat2) )
Incorporating the Haversine Formula into a Stored Procedure
For ease of use, I would recommend that you place the above formula in a stored procedure (a.k.a. a proc). Our sample proc searches a WordPress database for restaurants that are located within a given distance from the user. For the latitude and longitude parameters, I chose to use the Decimal type with six digits of precision. You may choose to adjust this for your needs. In addition to the latitude and longitude values, this proc accepts input parameters for the units of measurement (miles or kilometers), the maximum distance to search, and the maximum number of results to return.
CREATE DEFINER=`root`@`localhost` PROCEDURE `closest_restaurants` (IN units varchar(5), IN lat Decimal(9,6), IN lon Decimal(9,6), IN max_distance SMALLINT, IN limit_rows MEDIUMINT) BEGIN SELECT pm1.post_id, p.post_title, ROUND((CASE units WHEN 'miles' THEN 3959 ELSE 6371 END * acos( cos( radians(lat) ) * cos( radians(pm1.meta_value) ) * cos( radians(pm2.meta_value) - radians(lon)) + sin(radians(lat)) * sin( radians(pm1.meta_value))) ), 3) AS distance FROM goodfood_wp_md20m_postmeta AS pm1, goodfood_wp_md20m_postmeta AS pm2, goodfood_wp_md20m_posts AS p WHERE pm1.meta_key = 'latitude' AND pm2.meta_key = 'longitude' AND pm1.post_id = pm2.post_id AND pm1.post_id = p.id AND p.post_status = 'publish' HAVING distance <= max_distance ORDER BY distance ASC LIMIT limit_rows; END
Calling the Proc
Here is a search for the ten closest restaurants to the center of London, UK, measured in miles, within a hundred mile radius:
CALL `restaurants`.`closest_restaurants`('miles', 51.5112139, -0.119824, 100, 10);
post_id |
post_title |
distance |
2103 |
Pret A Manger - 135 Strand |
0.041 |
2288 |
Pret A Manger - 87-88 Strand |
0.093 |
2093 |
Pret A Manger - 421/422 Strand |
0.182 |
2095 |
Pret A Manger - 29-33 Kingsway |
0.212 |
2139 |
Pret A Manger - 65 Long Acre |
0.230 |
7075 |
Nando's - Covent Garden |
0.236 |
2146 |
Pret A Manger - 182 Strand |
0.239 |
1324 |
Carluccio's - Covent Garden |
0.256 |
2088 |
Pret A Manger - 25 Villiers Street |
0.267 |
2305 |
Pret A Manger - 37 Shelton Street |
0.291 |
As evidenced by the proc output, the center of London is saturated by fresh food chains like Pret A Manger, Nando’s Chicken, and Carluccio’s. Just look at the Pret A Manger shops that come up in Google Maps:
Optimizing for Speed
There are those who feel that the database is not the best place to put these types of intensive calculations. I have no strong opinions either way, but I have to admit that with an average execution time of about 2.6 seconds on a local database, the search is not as fast as I would like.
One way to speed things up considerably is to define a box around the user’s coordinates whose size is equal to the max_distance parameter. Each degree of latitude is approximately 69 miles (111 kilometers) apart. Here’s why: the circumference of the earth along the equator is roughly 24,901.92 miles and there are 360 degrees in a circle. Therefore, if we divide 24,901.92 miles by 360 degrees, we get 69.172 miles per degree. There is also a formula for extrapolating one degree of longitude based on the latitude:
1° of latitude ~= 69 miles
1° of longitude ~= cos(latitude)*69
Thus, to calculate the longitude and latitude for the rectangle in MySQL we would add the following to our proc:
DECLARE lon1, lon2, lat1, lat2 float; SET lon1 = lon-max_distance/abs(cos(radians(lat))*ONE_DEGREE_CONSTANT); SET lon2 = lon+max_distance/abs(cos(radians(lat))*ONE_DEGREE_CONSTANT); SET lat1 = lat-(max_distance/ONE_DEGREE_CONSTANT); SET lat2 = lat+(max_distance/ONE_DEGREE_CONSTANT);
Then, we would replace the “HAVING distance <= max_distance” in the WHERE clause with the following two lines:
AND pm2.meta_value between lon1 and lon2 AND pm1.meta_value between lat1 and lat2
Here is the full proc:
CREATE DEFINER=`root`@`localhost` PROCEDURE `closest_restaurants_optimized` (IN units varchar(5), IN lat Decimal(9,6), IN lon Decimal(9,6), IN max_distance SMALLINT, IN limit_rows MEDIUMINT) BEGIN DECLARE ONE_DEGREE_CONSTANT TINYINT; DECLARE EARTH_RADIUS_CONSTANT SMALLINT; DECLARE lon1, lon2, lat1, lat2 float;
IF units = 'miles' THEN SET ONE_DEGREE_CONSTANT = 69; SET EARTH_RADIUS_CONSTANT = 3959; ELSE -- default to kilometers SET ONE_DEGREE_CONSTANT = 111; SET EARTH_RADIUS_CONSTANT = 6371; END IF;
SET lon1 = lon-max_distance/abs(cos(radians(lat))*ONE_DEGREE_CONSTANT); SET lon2 = lon+max_distance/abs(cos(radians(lat))*ONE_DEGREE_CONSTANT); SET lat1 = lat-(max_distance/ONE_DEGREE_CONSTANT); SET lat2 = lat+(max_distance/ONE_DEGREE_CONSTANT);
SELECT pm1.post_id, p.post_title, ROUND((EARTH_RADIUS_CONSTANT * acos( cos( radians(lat) ) * cos( radians(pm1.meta_value) ) * cos( radians(pm2.meta_value) - radians(lon)) + sin(radians(lat)) * sin( radians(pm1.meta_value))) ), 3) AS distance FROM goodfood_wp_md20m_postmeta AS pm1, goodfood_wp_md20m_postmeta AS pm2, goodfood_wp_md20m_posts AS p WHERE pm1.meta_key = 'latitude' AND pm2.meta_key = 'longitude' AND pm1.post_id = pm2.post_id AND pm1.post_id = p.id AND p.post_status = 'publish' AND pm2.meta_value between lon1 and lon2 AND pm1.meta_value between lat1 and lat2 ORDER BY distance ASC LIMIT limit_rows; END
The optimized version ran at a far more respectable 0.078 seconds on average.
Conclusion
The part of your systems to which you assign calculation duties depends on your particular environment. If you’re using MySQL as the backend of a multi-tier enterprise system, perhaps the programming code can perform the calculations faster. As we saw here today, MySQL is more than capable of performing complex numeric calculations, provided that you take the time to optimize them.