In a system level programming language like C, C++ or D, what is the best type/encoding for storing latitude and longitude?
The options I see are:
IEEE-754 FP as degrees or radians
degrees or radians stored as a fixed point value in an 32 or 64 bit int
mapping of an integer range onto the degree range: deg = (360/2^32) * val
degrees, minutes, seconds and fractional seconds stored as bit fields in an int
a struct of some kind.
The easy solution (FP) has the major downside of highly non-uniform resolution (somewhere in England it can measure in microns; over in Japan, it can't). It also carries all the usual issues of FP comparison and the like. The other options require extra effort at different parts of the data's life cycle (generation, presentation, calculations, etc.).
One interesting option is a variable-precision type where, as the latitude increases, it gets more bits and the longitude gets fewer (since meridians converge toward the poles).
Related questions that don't quite cover this:
What is the ideal data type to use when storing latitude / longitudes in a MySQL
Working with latitude/longitude values in Java
BTW: 32 bits gives you an E/W resolution at the equator of about 0.3 in. This is close to the scale that high grade GPS setups can work at (IIRC they can get down to about 0.5 in in some modes).
OTOH if the 32 bits is uniformly distributed over the earth's surface, you can index squares of about 344m on a side, 5 Bytes give 21m, 6B->1.3m and 8B->5mm.
I don't have a specific use in mind right now but have worked with this kind of thing before and expect to again, at some point.
The easiest way is just to store it as a float/double in degrees: positive for N and E, negative for S and W. Just remember that minutes and seconds are out of 60 (so 31° 45' N is 31.75). It's easy to understand the values by looking at them and, where necessary, conversion to radians is trivial.
Calculations on latitudes and longitudes, such as the great-circle distance between two coordinates, rely heavily on trigonometric functions, which typically use doubles. Any other format is going to rely on another implementation of sine, cosine, atan2 and square root, at a minimum. Arbitrary-precision numbers (e.g. BigDecimal in Java) won't work for this. Something like the int where 2^32 is spread uniformly is going to have similar issues.
The point of uniformity has come up in several comments. On this I shall simply note that the Earth, with respect to longitude, isn't uniform. One arc-second longitude at the Arctic Circle is a shorter distance than at the Equator. Double precision floats give sub-millimetre precision anywhere on Earth. Is this not sufficient? If not, why not?
It'd also be worth noting what you want to do with that information as the types of calculations you require will have an impact on what storage format you use.
Longitudes and latitudes are not generally known to any greater precision than a 32-bit float. So if you're concerned about storage space, you can use floats. But in general it's more convenient to work with numbers as doubles.
Radians are more convenient for theoretical math. (For example, the derivative of sine is cosine only when you use radians.) But degrees are typically more familiar and easier for people to interpret, so you might want to stick with degrees.
A decimal representation with a precision of 8 should be more than enough, according to the Wikipedia article on Decimal degrees:
0 decimal places, 1.0 = 111 km
...
7 decimal places, 0.0000001 = 1.11 cm
8 decimal places, 0.00000001 = 1.11 mm
http://www.esri.com/news/arcuser/0400/wdside.html
At the equator, an arc-second of longitude approximately equals an arc-second of latitude, which is 1/60th of a nautical mile (or 101.27 feet or 30.87 meters).
A 32-bit float contains 23 explicit mantissa bits.
180 * 3600 arc-seconds requires log2(648000) ≈ 19.31 bits. Note that the sign bit is stored separately, so we need to account only for 180 degrees.
If you normalize the value 648000 to some power of 2, then the following calculation applies.
After subtracting those ≈ 19.31 bits from the 23 available, we have ≈ 3.69 extra bits left for sub-second data.
That is 2^3.69 ≈ 12.95 parts per second.
Therefore a float data type can have 30.87 / 12.95 ≈ 2.38 metres of precision at the equator.
The calculation above is precise only if you normalize the 180-degree range to a power of 2. Otherwise, assuming sub-degree precision is stored after the decimal point, the floating-point representation will effectively spend all 8 bits on the degrees part. That leaves 15 bits for sub-degree precision: 15 - log2(3600) ≈ 3.19 bits for sub-second data, or 30.87 / 2^3.19 ≈ 3.39 metres of precision at the equator. That is about a metre worse than normalization would have provided.
Great question!
I know this question is 9 years old now, and I only know a part of the answer you were seeking, but I just came here having a similar question, and many things have changed since that question was asked, such as hardware and GPSes available. I work with this subject frequently in firmware dealing with different kinds of GPSes in different kinds of applications, and have lost count of the hours (and days) I have spent working out "the best design" for different applications that I have worked with or developed.
As always, different solutions are going to provide benefits and costs, and ultimately, a "best design" is always going to be a "best fit" of the benefits and costs against system requirements. Here are some things that I have to consider when I ask the same question:
CPU Time Cost
If the CPU does not have a built-in floating-point co-processor (as is the case with many microcontrollers), then dealing with 'float', 'double', and 'long double' can be extremely costly. For example, with one 16-bit microcontroller I work with regularly, a multiplication using 'double' values costs 326 CPU clock cycles, and a division costs 1193 clock cycles. Very expensive!
Accuracy Trade-Off
At the equator, for a 'float' (IEEE-754 32-bit floating-point value) representing a signed degree value, and assuming 7 "clean" significant decimal digits, a change of one least-significant decimal digit (e.g. from 179.9999 to 180.0000) represents a distance of about 11.12 meters. This may or may not meet hard system accuracy requirements. A 'double' (with 15 "clean" significant decimal digits, thus a change from 179.999999999999 to 180.000000000000) represents about 0.00011 mm.
Input Accuracy Limitations
If you're dealing with input from a GPS, how many digits of real accuracy are you getting, and how many do you need to preserve?
Development Time Costs
An IEEE-754 64-bit double-precision value ('double') and 32-bit single-precision value ('float') are VERY convenient to deal with in the C language since math libraries for both come with virtually every C compiler, and are usually very reliable. If your CPU comes with a hardware floating-point processor, this is an easy choice.
RAM and Storage Costs
If you have to keep a large number of these values in RAM (or storage, e.g. MySQL), available RAM (and storage space) might have an impact on the workability of the solution.
Available Data vs Required Data
One example I'm dealing with at this writing (the reason I came here to this question) is a u-blox M8 GPS which is able to give me binary GPS information (saving the CPU overhead of translating ASCII NMEA sentences). In this binary format (called the "UBX Protocol"), latitude and longitude are represented as signed 32-bit integers, a representation accurate (at the equator) down to about 1.11 cm. For example, -105.0269805 degrees longitude is represented as -1050269805 (using all 32 bits), and one LSb of change represents about 1.11 cm of latitude anywhere, and 1.11 cm of longitude at the equator (less at higher latitudes, in proportion to the cosine of the latitude).
The application this GPS is in does navigation tasks, which (as already-existing and well-tested code) require 'double' data types. Unfortunately, converting this integer to an IEEE-754 64-bit 'double' cannot be done simply by moving the base-2 bits of the integer into the internal representation bits of the 'double', since the shift to be performed is a base-10 decimal shift. Were it a power-of-2 shift instead, the integer's bits could be moved into the bit-fields of the 'double' with very little translation. But alas, this is not the case with the signed integer I have. So it is going to cost me a multiplication on a CPU that doesn't have a hardware floating-point processor: 326 CPU clock cycles.
double ldLatitude;
int32_t li32LatFromGps;
ldLatitude = (double)li32LatFromGps * 0.0000001;
Note this multiplication was chosen over this:
ldLatitude = (double)li32LatFromGps / 10000000.0;
because 'double' multiplication is about 3.6X faster than 'double' division on the CPU that I'm dealing with. Such is life in the microcontroller world. :-)
What would have been BRILLIANT (and may still happen if I can spare the time on weekends) is if the navigation tasks could be done directly with the 32-bit signed integer! Then no conversion would be needed... But would it cost more to do the navigation tasks with such an integer? In CPU cost, it would probably be much more efficient. In development-time cost? That's another question, especially with a well-tested system already in place that uses IEEE-754 64-bit 'double' values! Plus there is already-existing software that provides map data (using 'double' degree values), which would have to be converted to use the signed integer as well -- not an overnight task!
One VERY interesting option is to directly (without translation) represent intersections between approximations of "rectangles" (actually trapezoids, which become triangles at the poles) using the raw latitude/longitude integers. At the equator these rectangles would have dimensions of approximately 1.11 cm east-west by 1.11 cm north-south, whereas at a latitude of say London, England, the dimensions would be approximately 0.69 cm east-west by 1.11 cm north-south. That may or may not be easy to deal with, depending on what the application needs.
Anyway, I hope these thoughts and discussion help others who are looking at this topic for "the best design" for their system.
Kind regards, Vic
What encoding is "best" really depends on your goals/requirements.
If you are performing arithmetic, floating-point latitude,longitude is often quite convenient. Other times cartesian coordinates (i.e. x, y, z) can be more convenient. For example, if you only cared about points on the surface of the earth, you could use an n-vector.
As for longer term storage, IEEE floating point will waste bits for ranges you don't care about (for lat/lon) or for precision you may not care about in the case of cartesian coordinates (unless you want very good precision at the origin for whatever reason). You can of course map either type of coordinates to ints of your preferred size, such that the entire range of said ints covers the range you are interested in at the resolution you care about.
There are of course other things to think about than merely not wasting bits in the encoding. For example, [Geohashes](https://en.wikipedia.org/wiki/Geohash) have the nice property that it is easy to find other geohashes in the same area. (Most will have the same prefix, and you can compute the prefix the others will have.) Unfortunately, they maintain the same precision in degrees of longitude near the poles as near the equator. I'm currently using 64-bit geohashes for storage, which gives about 3 m resolution at the equator.
The Maidenhead Locator System has some similar characteristics, but seems more optimized for communicating locations between humans rather than storing on a computer. (Storing MLS strings would waste a lot of bits for some rather trivial error detection.)
The one system I found that does handle the poles differently is the Military Grid Reference System, although it too seems more human-communications oriented. (And it seems like a pain to convert from or to lat/lon.)
Depending on what you want exactly, you could use something similar to the Universal Polar Stereographic coordinate system near the poles, along with something more computationally sane than UTM for the rest of the world, and use at most one bit to indicate which of the two systems you're using. I say at most one bit, because it's unlikely most of the points you care about would be near the poles. For example, you could use "half a bit" by saying 11 indicates use of the polar system, while 00, 01, and 10 indicate use of the other system and are part of the representation.
Sorry this is a bit long, but I wanted to save what I had learned recently. Sadly I have not found any standard, sane, and efficient way to represent a point on earth with uniform precision.
Edit: I found another approach which looks a lot more like what you wanted, since it more directly takes advantage of the lower precision needed for longitude closer to the poles. It turns out there is a lot of research on storing normal vectors. Encoding Normal Vectors using Optimized Spherical Coordinates describes such a system for encoding normal vectors while maintaining a minimum level of accuracy, but it could just as well be used for geographical coordinates.
Might the problems you mentioned with floating point values become an issue? If the answer is no, I'd suggest just using the radians value in double precision - you'll need it if you'll be doing trigonometric calculations anyway.
If there might be an issue with precision loss when using doubles, or you won't be doing trigonometry, I'd suggest your solution of mapping to an integer range - this will give you the best resolution, can easily be converted to whatever display format your locale will be using and - after choosing an appropriate zero meridian - can be used to convert to floating-point values of high precision.
PS: I've always wondered why there seems to be no one who uses geocentric spherical coordinates - they should be reasonably close to the geographical coordinates, and won't require all this fancy math on spheroids to do computations; for fun, I wanted to convert Gauss-Krüger coordinates (which are in use by the German Katasteramt, the land registry) to GPS coordinates - let me tell you, that was ugly: one uses the Bessel ellipsoid, the other WGS84, and the Gauss-Krüger mapping itself is pretty crazy on its own...
0.3 inch resolution is getting down to the point where earthquakes over a few years make a difference. You may want to reconsider why you believe you need such fine resolution worldwide.
Some of the spreading centres in the Pacific Ocean change by as much as 15 cm/year.
A Java program for computing the max rounding error in meters from casting lat/long values into Float/Double:
import java.util.*;
import java.lang.*;
import com.javadocmd.simplelatlng.*;
import com.javadocmd.simplelatlng.util.*;

public class MaxError {
    public static void main(String[] args) {
        Float flng = 180f;
        Float flat = 0f;
        LatLng fpos = new LatLng(flat, flng);
        double flatprime = Float.intBitsToFloat(Float.floatToIntBits(flat) ^ 1);
        double flngprime = Float.intBitsToFloat(Float.floatToIntBits(flng) ^ 1);
        LatLng fposprime = new LatLng(flatprime, flngprime);
        double fdistanceM = LatLngTool.distance(fpos, fposprime, LengthUnit.METER);
        System.out.println("Float max error (meters): " + fdistanceM);

        Double dlng = 180d;
        Double dlat = 0d;
        LatLng dpos = new LatLng(dlat, dlng);
        double dlatprime = Double.longBitsToDouble(Double.doubleToLongBits(dlat) ^ 1);
        double dlngprime = Double.longBitsToDouble(Double.doubleToLongBits(dlng) ^ 1);
        LatLng dposprime = new LatLng(dlatprime, dlngprime);
        double ddistanceM = LatLngTool.distance(dpos, dposprime, LengthUnit.METER);
        System.out.println("Double max error (meters): " + ddistanceM);
    }
}
Output:
Float max error (meters): 1.7791213425235692
Double max error (meters): 0.11119508289500799
As @Roland Pihlakas already pointed out, it depends on the precision at which you are going to use your coordinates.
I'd just suggest an alternative point of view:
Earth's equatorial circumference (perimeter) is 40,000 km;
this equals 40 million metres, or 4 billion centimetres;
a 32-bit variable holds 2^32, or ~4.3 billion, different values, which is a bit more than the number of centimetres in the mentioned circumference.
That means that if we choose 32-bit integer values for latitude and longitude, we can address a point on Earth with a precision of less than 1 centimetre.
With float values:
float32 contains 23 significant bits => ~4.7 metres precision
float64 contains 52 significant bits => < 1 mm precision
If by "storing" you mean "holding in memory", the real question is: what are you going to do with them?
I suspect that before these coordinates do anything interesting, they will have been funnelled as radians through the functions in math.h. Unless you plan on implementing quite a few transcendental functions that operate on Deg/Min/Secs packed into a bit field.
So why not keep things simple and just store them in IEEE-754 degrees or radians at the precision of your requirements?
The following code losslessly packs WGS84 coordinates into a long (i.e. into 8 bytes):
using System;
using System.Collections.Generic;
using System.Text;

namespace Utils
{
    /// <summary>
    /// Lossless conversion of OSM coordinates to a simple long.
    /// </summary>
    unsafe class CoordinateStore
    {
        private readonly double _lat, _lon;
        private readonly long _encoded;

        public CoordinateStore(double lon, double lat)
        {
            // Ensure valid lat/lon
            if (lon < -180.0) lon = 180.0 + (lon + 180.0); else if (lon > 180.0) lon = -180.0 + (lon - 180.0);
            if (lat < -90.0) lat = 90.0 + (lat + 90.0); else if (lat > 90.0) lat = -90.0 + (lat - 90.0);
            _lon = lon; _lat = lat;

            // Move to 0..(180/90)
            var dlon = (decimal)lon + 180m;
            var dlat = (decimal)lat + 90m;

            // Calculate grid
            var grid = (((int)dlat) * 360) + ((int)dlon);

            // Get local offset
            var ilon = (uint)((dlon - (int)(dlon)) * 10000000m);
            var ilat = (uint)((dlat - (int)(dlat)) * 10000000m);

            var encoded = new byte[8];
            fixed (byte* pEncoded = &encoded[0])
            {
                ((ushort*)pEncoded)[0] = (ushort)grid;
                ((ushort*)pEncoded)[1] = (ushort)(ilon & 0xFFFF);
                ((ushort*)pEncoded)[2] = (ushort)(ilat & 0xFFFF);
                pEncoded[6] = (byte)((ilon >> 16) & 0xFF);
                pEncoded[7] = (byte)((ilat >> 16) & 0xFF);
                _encoded = ((long*)pEncoded)[0];
            }
        }

        public CoordinateStore(long source)
        {
            // Extract grid and local offset
            int grid;
            decimal ilon, ilat;
            var encoded = new byte[8];
            fixed (byte* pEncoded = &encoded[0])
            {
                ((long*)pEncoded)[0] = source;
                grid = ((ushort*)pEncoded)[0];
                ilon = ((ushort*)pEncoded)[1] + (((uint)pEncoded[6]) << 16);
                ilat = ((ushort*)pEncoded)[2] + (((uint)pEncoded[7]) << 16);
            }

            // Recalculate 0..(180/90) coordinates
            var dlon = (uint)(grid % 360) + (ilon / 10000000m);
            var dlat = (uint)(grid / 360) + (ilat / 10000000m);

            // Return to WGS84
            _lon = (double)(dlon - 180m);
            _lat = (double)(dlat - 90m);
        }

        public double Lon { get { return _lon; } }
        public double Lat { get { return _lat; } }
        public long Encoded { get { return _encoded; } }

        public static long PackCoord(double lon, double lat)
        {
            return (new CoordinateStore(lon, lat)).Encoded;
        }

        public static KeyValuePair<double, double> UnPackCoord(long coord)
        {
            var tmp = new CoordinateStore(coord);
            return new KeyValuePair<double, double>(tmp.Lat, tmp.Lon);
        }
    }
}
Source: http://www.dupuis.me/node/35
The best precision at the smallest size is an int32.
Storing a longitude to 7 decimal places (1.11 cm error) gives you a number in the range ±1,800,000,000, which fits perfectly in an int32; you only have to multiply the double by 10 million, like
int32_t lng = (int32_t)(double_lng * 10000000);
Explanation (Wikipedia):
The equator is divided into 360 degrees of longitude, so each degree at the equator represents 111,319.5 m (111.32 km). As one moves away from the equator towards a pole, however, one degree of longitude is multiplied by the cosine of the latitude, decreasing the distance, approaching zero at the pole. The number of decimal places required for 1 cm precision at the equator is 7. If you need to store 180° with 7 decimal places in an integer, the result will be 1,800,000,000, which is in the range of a 32-bit integer.
As you can see in Google Maps, when you click on any place Google gives you a float number with 6 decimal places, which fits in a 32-bit integer.
Comparison:
vs double: half the size.
vs float: float doesn't have enough precision.
vs the 24-bit suggestion: 24 bits are not addressable by any 32- or 64-bit processor; you have to fetch the three bytes and then convert to int32 or double before operating on them - a lot of cycles lost and many lines of code.
After coming across this question after searching for an answer myself, here is another possible scheme based on some precedent.
The Network Working Group, in RFC 3825, proposed a coordinate-based geographic location option for DHCP (i.e. the system that hands out IP addresses on a network). See https://tools.ietf.org/rfc/rfc3825.txt
In this scheme, latitude and longitude are encoded in degrees with fixed-point values where the first 9 bits are the signed degrees, 25 bits are fractional degrees, and 6 bits are used for the accuracy. The value of the accuracy bits indicates the number of the 25 fractional bits that are considered to be accurate (e.g. coordinates collected via a consumer GPS vs a high-precision surveyor's GPS). Using WGS84, the accuracy is 8 decimal digits which is good to about a millimeter regardless of where you are on the globe.
As a couple of others have posted, floating-point encoding really isn't good for this type of thing. Yes, it can represent a very large number of decimal places, but the accuracy is either ignored or has to be dealt with somewhere else. For example, printing a float or a double at full floating-point precision produces decimal digits that are very unlikely to be remotely accurate. Likewise, simply outputting a float or a double with 8 or 10 decimal digits of precision may not be a true representation of the source values, given how floating-point numbers are computed (e.g. why 1.2 - 1.0 does not equal 0.2 in floating-point arithmetic).
For a humorous example of why you should care about coordinate-system precision, see https://xkcd.com/2170/.
Granted, the 40-bit encoding used in RFC 3825 is hardly convenient in a 32- or 64-bit world, but this style can easily be extended to a 64-bit number where 9 bits are used for the signed degrees and 6 bits for the accuracy, leaving 49 bits for the fractional portion. This yields 15 decimal digits of precision, which is more than basically anyone will ever need (see the humorous example).
You can use the decimal datatype:
CREATE TABLE IF NOT EXISTS `map` (
`latitude` decimal(18,15) DEFAULT NULL,
`longitude` decimal(18,15) DEFAULT NULL
);