For those of you interested in Geolocation applications who read my MySQL: Calculating Distance Based on Latitude and Longitude tutorial, I have some good news: the latest development releases include some new Geo-Spatial functions. These were implemented using the excellent Boost.Geometry geometric engine, which is known for its strong performance and reliability, and active and vibrant status within the development community. In today’s article, we’ll be modifying the closest_restaurants procedure that we wrote in the MySQL: Calculating Distance Based on Latitude and Longitude tutorial to utilize some of the new MySQL Geo-Spatial Functions, including ST_AsText(), Point(), ST_Distance_Sphere(), ST_Contains(), and ST_MakeEnvelope().
Downloading the Latest MySQL Development Release
The new Geo-Spatial features were originally included in the 5.7.5 DMR, but you might as well install the latest and greatest realease (5.7 at the time of this writing). That will get you all of the refactored GIS* functions, as well as the R-tree based InnoDB Spatial Indexes.
*A Geographic Information System (GIS) is a system designed to capture, store, manipulate, analyze, manage, and present all types of spatial or geographical data.
Converting Latitude and Longitude Values into Points
For the most part, MySQL Geo-Spatial functions accept coordinate Points as arguments. The reason is that all calculations are performed assuming Euclidean (planar) geometry as opposed to the geocentric system (coordinates on the Earth's surface). As explained in the first tutorial, due to the curvature of the Earth, the distance on the planar coordinate system and the geocentric system are two different things. That being said, the ST_Distance_Sphere() function does accept an optional radius parameter and we can still account for the Earth’s curvature ourselves when necessary.
The Point(x, y) function constructs a Point from a pair of x, y coordinates, that you can pass directly to other functions. In the case of latitude and longitude, the order is crucial! The longitude is the first parameter and the latitude is the second.
The closest_restaurants_gis Procedure Explained
Unlike most tutorials, where I present and explain each part of a code block, I’d like to unveil the entire procedure right here and then go through significant parts.
CREATE DEFINER=`root`@`localhost` PROCEDURE `closest_restaurants_gis` (IN units varchar(5), IN lat Decimal(9,6), IN lon Decimal(9,6), IN max_distance SMALLINT, IN limit_rows MEDIUMINT) BEGIN DECLARE avg_deg smallint DEFAULT 111; DECLARE kms_to_miles_conversion Decimal(6,5) DEFAULT 0.62137; IF units = 'miles' THEN SET avg_deg = avg_deg * kms_to_miles_conversion; SET max_distance = max_distance * kms_to_miles_conversion; END IF; SELECT pm1.post_id, p.post_title as restaurant_name_and_address, ST_AsText(Point(pm2.meta_value, pm1.meta_value)) as lon_lat, CONCAT( ROUND( ST_Distance_Sphere(Point(lon, lat), Point(pm2.meta_value, pm1.meta_value)) / 1000 * (CASE units WHEN 'miles' THEN kms_to_miles_conversion ELSE 1 END) , 3) , (CASE units WHEN ' miles' THEN ' miles' ELSE ' kms' END) ) as distance FROM goodfood_wp_md20m_postmeta AS pm1, goodfood_wp_md20m_postmeta AS pm2, goodfood_wp_md20m_posts AS p WHERE pm1.meta_key = 'latitude' AND pm2.meta_key = 'longitude' AND pm1.post_id = pm2.post_id AND pm1.post_id = p.id AND p.post_status = 'publish' AND ST_Contains( ST_MakeEnvelope( Point((lon+(max_distance/avg_deg)), (lat+(max_distance/avg_deg))), Point((lon-(max_distance/avg_deg)), (lat-(max_distance/avg_deg))) ), Point(pm2.meta_value, pm1.meta_value) ) ORDER BY distance ASC LIMIT limit_rows; END
Creating an Envelope using the Average Distance between Longitude and Latitude Degrees
The very first variable declaration sets avg_deg to 111. That number is the average distance between longitude and latitude degrees, which is 111km. It’s fairly accurate for latitude, but much more variant for longitude, due to the bulging of the Earth at the equator.
We use that distance to create the envelope (or bounding box) around our location.
Take a look at the call to ST_MakeEnvelope() in the WHERE clause and you’ll see that the max_distance is divided by the avg_deg before being added to or subtracted from our location. Adding to and subtracting from both the latitude and longitude has the effect of creating a diagonal, which results in ST_MakeEnvelope() constructing a polygon using pt1 and pt2 as diagonal points.
While serviceable for small bounding boxes, larger areas will lose precision due to the variations in longitude. For such cases, you can employ the following formulas to calculate the bounding box longitude coordinates:
lon1 = lon + (distance/abs(cos(radians(lat))*111)) lon2 = lon - (distance/abs(cos(radians(lat))*111))
The resulting envelope is then passed to the ST_Contains() function to determine whether or not the coordinates of each restaurant falls within the bounding box.
Using ST_Distance_Sphere() to Calculate the Distance between Two Points
In addition to finding the closest restaurants within a given range, the above procedure also returns the distance of each restaurant from our location. We can now replace the Haversine formula with ST_Distance_Sphere(). It returns the minimum distance in meters between two lon/lat coordinates using a spherical earth and radius of 6,370,986 meters.
I included some additional calculations for converting from meters to kilometers (and to miles if necessary), rounding, and appending the unit type to each value.
Calling the Procedure
Here is the same search for the ten closest restaurants to the center of London, UK, measured in kilometers, within a hundred kilometer radius:
CALL `restaurants`.`closest_restaurants_gis`('kms', 51.5112139, -0.119824, 100, 10);
post_id |
restaurant_name_and_address |
lon_lat |
distance |
2103 |
Pret A Manger - 135 Strand |
POINT(-0.12143076204578 51.510322681011) |
0.066 kms |
2288 |
Pret A Manger - 87-88 Strand |
POINT(-0.123385 51.509794) |
0.149 kms |
2093 |
Pret A Manger - 421/422 Strand |
POINT(-0.118399 51.514158) |
0.293 kms |
2095 |
Pret A Manger - 29-33 Kingsway |
POINT(-0.122915 51.513926) |
0.342 kms |
2139 |
Pret A Manger - 65 Long Acre |
POINT(-0.1251 51.5103) |
0.370 kms |
7075 |
Nando's - Covent Garden |
POINT(-0.114694 51.512557) |
0.379 kms |
2146 |
Pret A Manger - 182 Strand |
POINT(-0.125737 51.511575) |
0.385 kms |
1324 |
Carluccio's - Covent Garden |
POINT(-0.12373537807585 51.5082070105091) |
0.411 kms |
2088 |
Pret A Manger - 25 Villiers Street |
POINT(-0.124849 51.514033) |
0.430 kms |
2305 |
Pret A Manger - 37 Shelton Street |
POINT(-0.118865231606 51.5112310396) |
0.468 kms |
Rounded to three decimal places, the distances obtained using the MySQL 5.7 Geo-Spatial functions are exactly the same as those obtained in the last tutorial. Just to get a second opinion, I ran the coordinates through an online calculator that does use the Haversine formula. Again, the numbers match. For instance, here are the results for Pret A Manger - 87-88 Strand:
Conclusion
Just because these functions will soon be available in an official MySQL release doesn’t implicitly mean that you need to use them. Just think of them as another tool at your disposal. The important thing is that you are able to generate the correct results for your particular application.